Case Study: Patents
Use of resources
Node | Budgeted | Period 1 | Period 2 (est) | Period 3 (est) |
---|---|---|---|---|
UGOT | 12 | 0 | RE:2.4, AS:2.5 | X |
UPC | 15 | 0 | 7.5 | X |
Ontotext | 15 | 0 | X | X |
Objectives
The objectives are to
- (i) create a commercially viable prototype of a system for Machine Translation (MT) and Retrieval of patents in the bio-medical and pharmaceutical domains,
- (ii) allow translation of patent abstracts and claims in at least 3 languages, and
- (iii) expose several cross-language retrieval paradigms on top of them.
Description of work
The work will start with the provision of user requirements (WP9) and the preparation of a parallel patent corpus (EPO) to fuel the training of the statistical MT (UPC). In parallel, UGOT will work on grammars covering the domain and subsequently, together with UPC, apply the hybrid MT (WP2, WP5) to abstracts and claims. Ontotext will provide the semantic infrastructure, loaded with existing structured data sets (WP4) from the patent domain (IPC, patent ontology, bio-medical and pharmaceutical knowledge bases, e.g. LLD). Based on the use case requirements, Ontotext will build a prototype (D7.1, D7.2) exposing multiple cross-lingual retrieval paradigms and MT of patent sections. Accuracy will be evaluated regularly through both automatic (e.g. BLEU scoring) and human-based (e.g. TAUS) means (WP9).
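As an illustration of the automatic evaluation mentioned above, the following is a minimal sketch of corpus-level BLEU. It is not the project's evaluation tooling: the function names `ngrams` and `corpus_bleu` are hypothetical, and the sketch assumes pre-tokenized input, one reference per hypothesis, and no smoothing.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per hypothesis, no smoothing (assumption)."""
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than their references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


# Illustrative, made-up sentences (not taken from the patent corpus).
reference = "the composition comprises an active compound".split()
hypothesis = "the composition comprises an active".split()
score = corpus_bleu([hypothesis], [reference])
```

Here every n-gram of the hypothesis also occurs in the reference, so the score is determined by the brevity penalty alone. Human-based evaluation remains necessary because n-gram overlap does not capture adequacy or fluency directly.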
Tasks
ID | Task | Status | Timeframe |
---|---|---|---|
7.1 | User Requirements | Completed | May 2011 - Oct 2011 |
7.2 | Patent Corpora | Completed | Jun 2011 - Oct 2012 |
7.3 | Grammars for the patent domain | Ongoing | Jan 2011 - Nov 2012 |
7.4 | Ontologies and Document Indexation | Ongoing | Jun 2011 - Oct 2012 |
7.5 | Patents Retrieval System | Completed | Jun 2011 - Dec 2012 |
7.6 | Machine Translation Systems | Completed | Jan 2012 - Dec 2012 |
7.7 | Prototype (User Interface) | Completed | Jun 2011 - Sep 2012 |
7.8 | Evaluations | Planned | |