Agenda
Participants: Aarne Ranta (UGOT), Adam Slaski (UGOT), Mariana Damova (Ontotext), Yasen Kiprov (Ontotext), Milen Chechev (Ontotext), Lluis Màrquez (UPC), Cristina España (UPC), Meritxell Gonzàlez (UPC).
The patent retrieval system requires an ontology for the specific domain. Retrieval will be based on the knowledge representation of the claims. MD reported that Ontotext has an ontology for the biomedical domain but lacks one for the specific WP7 domain. UPC will provide Ontotext with patent examples from the current corpus (Domain IPC A61P: Specific therapeutic activity of chemical compounds or medicinal preparations). Ontotext also has an ontology capturing the structure of patent documents. It consists of several modules, including some FDA terms, drugs and measurement-related models.
GF processes the patent claims and produces a representation in GF abstract syntax. GF can also translate chunks and align texts. These tasks will be used to build the hybrid MT system. Patents can be translated in a batch process, similar to the retrieval pre-processing of the documents, but as a separate process.
Online (on-the-fly) translation happens at the query level. The user writes a query in any of the available languages (in principle English, French and German). This query will be translated into GF abstract syntax. The SPARQL query will then be built from the abstract representation provided by GF.
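The query pipeline described above (natural-language query, then GF abstract syntax, then SPARQL) could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract functions (`FindPatents`, `And`, `Treats`, `Contains`), the `ex:` namespace and the property names are hypothetical and were not agreed in the meeting. The GF parsing step itself is not shown, so the abstract tree is written by hand as nested tuples.

```python
# Sketch: build a SPARQL query from an abstract syntax tree.
# Assumption: trees are nested tuples (function_name, *children); the
# function names and the ex: vocabulary below are invented for illustration.

EX = "http://example.org/patents#"  # placeholder namespace, not a real ontology


def condition_to_patterns(cond, subject):
    """Turn one condition node into a list of SPARQL triple patterns."""
    fun, *args = cond
    if fun == "Treats":
        return [f"{subject} ex:treats ex:{args[0]} ."]
    if fun == "Contains":
        return [f"{subject} ex:contains ex:{args[0]} ."]
    if fun == "And":
        # A conjunction contributes the patterns of all its children.
        return [p for child in args for p in condition_to_patterns(child, subject)]
    raise ValueError(f"unknown abstract function: {fun}")


def tree_to_sparql(tree):
    """Build a SELECT query from a (hypothetical) FindPatents tree."""
    fun, cond = tree
    if fun != "FindPatents":
        raise ValueError(f"unsupported query type: {fun}")
    patterns = ["?patent a ex:Patent ."] + condition_to_patterns(cond, "?patent")
    body = "\n".join("  " + p for p in patterns)
    return f"PREFIX ex: <{EX}>\nSELECT ?patent WHERE {{\n{body}\n}}"


# Hand-written tree standing in for the output of the GF parser.
tree = ("FindPatents", ("And", ("Treats", "Asthma"), ("Contains", "Salbutamol")))
query = tree_to_sparql(tree)
print(query)
```

Because the abstract tree is language-independent, the same mapping would serve queries written in English, French or German, provided the concrete grammars parse them to the same tree.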
AR reported on the negotiation with the EPO regarding the terms of use of the patent corpus (namely, non-commercial use of the applications). We will work towards having a demo hosted by the EPO.
Attachment | Size |
---|---|
Minutes-WP7-11-June-2011.pdf | 65.54 KB |