Machine Translation Systems

Start: 22 Mar 2010

Timezone: Europe/Vienna

ID:

7.6

Workpackage:

Case Study: Patents

Assignees:

aarne.ranta
cristina.españa
lluis.marquez
meritxell.gonzalez
ramona.enache

Relevant Deliverables:

Patent Case Study Final Report
Patent MT and Retrieval Prototype
Patent MT and Retrieval Prototype Beta

Dependencies:

Grammars for the patent domain

Status:

Completed

Timeframe:

Jan 2012 - Dec 2012

Completed on:

11 January, 2013 (All day)

Contact @UPC: Lluis and Cristina

DEPENDENCIES:

TASK 2, 3
WP5. A baseline of the WP5 system will be integrated in the prototype.

Patent abstracts and claims are translated using the baseline of the hybrid system.

Comments

Input Encoding

Submitted by meritxell.gonzalez on 25 January, 2013 - 17:05.

After the translation was completed, we detected a subtle bug in the data that affects the translation quality of compounds, which is one of the strong points of our system.

The solution was to encode all the text as UTF-8 and retrain the baseline system.
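A minimal sketch of such a re-encoding step, in Python. The candidate source encodings and the function names are assumptions made for illustration; this page does not say which encodings the patent data actually contained or which tool was used.

```python
from pathlib import Path

# Assumed candidate encodings, tried in order. The real corpus may have
# used others; adjust this tuple to the data at hand.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252", "latin-1")


def decode_bytes(raw: bytes) -> str:
    """Decode raw bytes with the first candidate encoding that succeeds."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 accepts every byte, so this is only reached if the
    # candidate tuple is changed to exclude it.
    raise ValueError("no candidate encoding matched")


def to_utf8(raw: bytes) -> bytes:
    """Return the same text re-encoded as UTF-8."""
    return decode_bytes(raw).encode("utf-8")


def convert_file(source: Path, target: Path) -> None:
    """Rewrite one corpus file as UTF-8 (hypothetical helper)."""
    target.write_bytes(to_utf8(source.read_bytes()))
```

Trying UTF-8 first leaves files that are already valid UTF-8 unchanged. Guessing among single-byte encodings is not reliable in general, so the output should be spot-checked before retraining.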

Utrecht meeting notes

Submitted by meritxell.gonzalez on 1 October, 2012 - 11:55.

The patent documents were translated using the SMT baseline system. These documents were later annotated to be added to the retrieval databases.

To improve retrieval accuracy, we have changed the annotation approach: the documents are now annotated first and then translated, keeping the semantic annotations in the target language.

Also, a subset of the documents will be translated with the best version of the hybrid system.
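The annotate-then-translate order described above can be sketched as follows. This is only an illustration under stated assumptions: the inline `<ann type="...">` markup is a hypothetical annotation format, and `translate` is a stand-in for whatever MT system is called; neither is taken from the project.

```python
import re
from typing import Callable

# Hypothetical inline annotation format: <ann type="LABEL">text</ann>.
ANNOTATION = re.compile(r'<ann type="([^"]+)">(.*?)</ann>')


def translate_keeping_annotations(text: str, translate: Callable[[str], str]) -> str:
    """Translate text segment by segment, re-wrapping annotated spans."""
    parts = []
    position = 0
    for match in ANNOTATION.finditer(text):
        # Translate the plain text that precedes this annotation.
        plain = text[position:match.start()]
        if plain:
            parts.append(translate(plain))
        # Translate the annotated span and keep its label in the output.
        label, span = match.group(1), match.group(2)
        parts.append(f'<ann type="{label}">{translate(span)}</ann>')
        position = match.end()
    # Translate whatever follows the last annotation.
    tail = text[position:]
    if tail:
        parts.append(translate(tail))
    return "".join(parts)
```

Translating segments in isolation loses sentence context, so a real system would more likely pass the markup through the decoder or use word alignments to project the annotations; the sketch only shows that the annotation labels survive into the target text.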
