Patent MT and Retrieval Prototype Beta

Title: Patent MT and Retrieval Prototype Beta
Publication Type: Deliverable
Authors: Chechev, M, Enache, R, España-Bonet, C, Gonzàlez, M, Màrquez, L, Popov, B, Ranta, A
Accession Number: 071
Publisher: The MOLTO Consortium
Year of Publication: 2012
Number: D 7.1
Date Published: 01/2012

This document is the written report of the first deliverable corresponding to WP7, Case Study: Patents. It describes the preliminary prototype for patent translation and retrieval.

First, there is a general overview of the workpackage and we briefly summarize the scenarios considered within the prototype. Then, we give the general layout of the prototype architecture, the demonstrator interface and the technologies integrated in the prototype. Finally, we summarise the current status of the workpackage and the future directions for the final prototype.

Keywords: Patents case study, WP7
Type of Work: Project Deliverable


D71_final.pdf (1.05 MB)


New revision

Happy new year to you all!

I have uploaded the latest revision of the Deliverable with all your contributions and comments added. In case you need the LaTeX sources, they are also available on request.

Please send me any further comments as soon as possible, so we can change the status to "final" in the next few days.

Cheers! Meritxell.

feedback on the deliverable

Hi everyone,

After reading the deliverable I have the following comments:

  • regarding the part where I should write more about the "characteristics of the query grammar", I would like some more direction about what exactly I should refer to, because from my point of view there is not much to say. The interesting part is how you ended up with those examples (probably from relations in the ontology), because I only made the smallest grammar that covered all the positive examples along with their generalizations, modulo the ontology.

  • about the answer produced in natural language, I'm afraid I don't really follow, since there is no grammar that verbalizes the patent ontology. From what I saw in the demo, you just show the text of the patent and highlight certain concepts, but do not do any further manipulation of the result.

  • also I am not sure about the future number of languages that should be covered by the query grammar/translation system, since the work plan mentioned having minimal functionality for all 14 languages. Did you take this into account?

  • the text needs some spell checking, because it has some spelling inconsistencies; for example, in the abstract "summarise" appears along with "summarize", and "bio-medical" is later spelled "biomedical".

  • a large part of your introductory paragraph is taken from the report that I wrote with Adam, which should also be cited as a reference.

If you need any contribution from me, I would be happy to help as long as you make your request more concrete. I was not aware of all the progress on the information retrieval part, since I only worked with the grammars.

Best regards, Ramona

answers in natural language

Right! We have to stress in the document that this is not the actual implementation. Instead, Section 2 describes the use cases in the case study. It includes the "wishes" for the prototype (e.g. response generation in NL) and also other scenarios that don't necessarily have to be integrated in the prototype (e.g. online translation).




Please send me the reference to the report you mention; I will be happy to cite it. Regarding the two sentences taken from there, I think I took them from a very preliminary version of the paper that we sent to the MT Summit, which I cite in D71. My fault for not realizing that the original source was the report.

In general, please tell me about any other references that are missing.


contributions to D71

IMHO, the Deliverable is the description of the work done collaboratively by all the partners involved in the prototype. The proposed document is just a preliminary draft. All of you, as contributors and authors, can propose any change, add or remove sections, and explain your work. Nobody can explain your work better than you can. So feel free to add your work to the Deliverable in the way you prefer.

In particular, from my point of view, the query grammar is an important part of the system since it handles the input to the system. Processing the input takes two steps: first, parsing the user input and generating an abstract representation, and then converting this representation into SPARQL. So the performance and capabilities of the rest of the system depend to some extent on how this NL input is processed. That's why I think it would be interesting for any future reader (and for us) to have these two processes (parsing + translation) described in some detail (indeed, the same applies to the rest of the modules ;) )
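
To make the two steps (parsing + translation) concrete, here is a minimal Python sketch. It is only an illustration and not the actual query grammar: the regular-expression patterns, the Query class, the ex: prefix and the relation names "mentions" and "applicant" are all invented for the example.

```python
import re
from dataclasses import dataclass


# Hypothetical abstract representation: one relation applied to one argument.
@dataclass(frozen=True)
class Query:
    relation: str  # e.g. "mentions" (invented relation name)
    argument: str  # e.g. "aspirin"


# Toy patterns standing in for a real grammar; not taken from the prototype.
PATTERNS = [
    (re.compile(r"^patents that mention (.+)$", re.IGNORECASE), "mentions"),
    (re.compile(r"^patents filed by (.+)$", re.IGNORECASE), "applicant"),
]


def parse(text: str) -> Query:
    """Step 1: map the user input to the abstract representation."""
    cleaned = text.strip().rstrip("?.")
    for pattern, relation in PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return Query(relation, match.group(1).strip())
    raise ValueError(f"query not covered by the toy grammar: {text!r}")


def to_sparql(query: Query) -> str:
    """Step 2: convert the abstract representation into a SPARQL string."""
    # Escape backslashes and double quotes so the literal stays well-formed.
    literal = query.argument.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX ex: <http://example.org/patent#>\n"  # assumed namespace
        "SELECT ?patent WHERE {\n"
        f'  ?patent ex:{query.relation} "{literal}" .\n'
        "}"
    )


if __name__ == "__main__":
    print(to_sparql(parse("Patents that mention aspirin?")))
```

The point of the intermediate Query object is that anything the parser cannot cover fails at step 1 with an explicit error, so the SPARQL side only ever sees well-formed abstract representations.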



Just to add to the spell-checking issues: In Table 2, all accents in the French column are missing.

--Jordi Saludes

About the languages involved

For everything related to patents I think we only promised 3 languages (English, French and German now), not the full 14. This was including the queries, right?

The workplan

The workplan says 3 languages. I have no idea who was the original author of this work package description.