WP7: Case Study Patents

The work will start with the provision of user requirements (WP9) and the preparation of a parallel patent corpus (EPO) to fuel the training of statistical MT (UPC). In parallel, UGOT will work on grammars covering the domain and subsequently, together with UPC, apply the hybrid MT (WP2, WP5) to abstracts and claims. Ontotext will provide the semantic infrastructure, loaded with existing structured data sets (WP4) from the patent domain (IPC, patent ontology, and biomedical and pharmaceutical knowledge bases, e.g. LLD). Based on the use case requirements, Ontotext will build a prototype (D7.1, D7.2) exposing multiple cross-lingual retrieval paradigms and MT of patent sections. Accuracy will be evaluated regularly by both automatic (e.g. BLEU scoring) and human-based (e.g. TAUS) means (WP9).
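As an illustration of the automatic evaluation mentioned above, the following is a minimal sketch of sentence-level BLEU against a single reference, without smoothing. It is not the evaluation setup of the project: the function names `ngrams` and `bleu` and the example sentences are hypothetical, and a real evaluation would more likely compute corpus-level BLEU with an established tool.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, unsmoothed (illustrative only)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one empty n-gram order zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Hypothetical patent-style sentences, whitespace-tokenised.
    reference = "the device comprises a housing and a cover"
    hypothesis = "the device comprises a housing and a lid"
    print(round(bleu(hypothesis, reference), 3))
```

The score lies between 0 and 1, and an exact match with the reference yields 1.0. Because the sketch is unsmoothed, short segments with no matching 4-gram score 0, which is one reason BLEU is usually reported over a whole test set rather than per sentence.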

Task List

The work package is split into 9 major tasks as follows:

  • Task 7.1 User Requirements and Scenarios (Task Lead: UPC)
  • Task 7.2 Patent corpora (Task Lead: UPC)
  • Task 7.3 Grammars for the patent domain (Task Lead: UGOT)
  • Task 7.4 Ontologies and document indexation (Task Lead: Ontotext)
  • Task 7.5 Prototype (Task Lead: Ontotext)
  • Task 7.6 SMT and Hybrid MT (Task Lead: UPC)
  • Task 7.7 Prototype (user interface) (Task Lead: Ontotext)
  • Task 7.8 Human evaluation (Task Lead: TBD)
  • Task 7.9 Patent Case Study: Final Report (Task Lead: UPC)


Month 10-15 plan

  • Task 7.2 starts in M10 and is due to provide a first set of corpora at the end of M16. Final revision depends on the availability of the EPO data.
  • Task 7.3 starts in M10 and is due to provide a preliminary report at the end of M16.

Month 16-21 plan

  • Task 7.1 starts at M15 and is due to provide a preliminary version at the beginning of M17.
  • Task 7.3 will produce a more complete report by the beginning of M19.
  • Task 7.4 starts at M16 and is due to provide a description of the type of queries at the end of M16.
  • Task 7.5 starts at M16 and is due to provide a description of the Prototype architecture at the end of M16.
  • Task 7.6 starts along with WP5 and will produce an SMT baseline for the Patents prototype.
  • D7.1 deadline is M21.