Robust and statistical translation methods

From the Corpora-List "Release: 23M German-English parallel sentences from patent text"

Institut für Computerlinguistik -- Universität Heidelberg

We are happy to announce the release of a parallel corpus of patent text for the German-English language pair. The corpus has been constructed from EPO, WIPO and USPTO patent documents extracted from the MAREC collection and contains 23 million sentence pairs from all patent text sections.

All sentences are labeled with metadata: patent document id, patent family, patent classification and publication date.

The corpus is distributed under a Creative Commons License.

Ramona Enache is visiting UPC

Ramona Enache from UGOT is on a research visit at UPC to work with the local team on hybrid methods for robust statistical translation. She is one of the expert developers of GF, so do not miss the chance to talk to her if you are interested in the current research at Chalmers on grammar-based machine translation.

Manual evaluation of patents (and MOLTO at MTSummit)

MT Summit 2011 took place this week, including a workshop dedicated to patent translation. MOLTO was presented in a talk at the workshop.

There were presentations by the most important patent offices and, as expected, all of them evaluate their translations manually. We think it would be worthwhile to use criteria similar to theirs in our own evaluation.

EPO gives OK for the patent corpus

We have now received the OK from the EPO to go ahead and obtain a better license for the patent corpus we need for our work in WP5 and WP7. This means that we can publish the results more freely than under the previous, personal license.
