2. Points Illustrated

We consider both the end-user perspective and the content producer perspective.

From the user perspective

  • Interlingua-based translation: we translate meanings, rather than words
  • Incremental parsing: the user is at every point guided by the list of possible next words
  • Mixed input modalities: selection of words ("fridge magnets") combined with text input
  • Quasi-incremental translation: since many basic types can also be used as phrases, one can translate both single words and complete sentences, and get intermediate results along the way
  • Disambiguation, esp. of politeness distinctions: if a phrase has many translations, each of them is shown and given an explanation (currently just in English, later in any source language)
  • Fallback to statistical translation: currently just a link to Google Translate (forthcoming: tailor-made statistical models)
  • Feedback from users: users are welcome to send comments, bug reports, and suggestions for better translations
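
The first two points can be illustrated with a minimal sketch. It is not the GF implementation: the meaning names, the phrases, and the functions next_words and translate are all hypothetical, and the "grammar" is just a table of complete phrases. It only shows the idea that translation goes through a shared meaning and that the user can be offered the possible next words at each step.

```python
# Toy interlingua: each abstract meaning has one phrase per language.
# All meanings and phrases here are invented for illustration.
LEXICON = {
    "ThisWineIsGood": {"Eng": "this wine is good", "Fre": "ce vin est bon"},
    "ThisWineIsCold": {"Eng": "this wine is cold", "Fre": "ce vin est froid"},
    "ThisPizzaIsGood": {"Eng": "this pizza is good", "Fre": "cette pizza est bonne"},
}


def next_words(prefix, lang):
    """Return the words that can follow `prefix` in some phrase of `lang`."""
    tokens = prefix.split()
    options = set()
    for phrases in LEXICON.values():
        words = phrases[lang].split()
        # Keep phrases that start with the prefix and have a word after it.
        if words[:len(tokens)] == tokens and len(words) > len(tokens):
            options.add(words[len(tokens)])
    return sorted(options)


def translate(sentence, source, target):
    """Translate via the abstract meaning, not word by word."""
    return [
        phrases[target]
        for phrases in LEXICON.values()
        if phrases[source] == sentence
    ]
```

In this sketch, next_words("this", "Eng") offers "pizza" and "wine", and translating "this pizza is good" into French yields "cette pizza est bonne". A word-by-word mapping of "this" could not choose between "ce" and "cette", which is the point of translating meanings rather than words.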

From the programmer's perspective

  • The use of resource grammars and functors: the translator was implemented on top of an earlier linguistic knowledge base, the GF Resource Grammar Library
  • Example-based grammar writing and grammar induction from statistical models (Google Translate): many of the grammars were created semi-automatically by generalization from examples
  • Compile-time transfer, especially in Action in Words: the structural differences between languages are treated at compile time, for maximal run-time efficiency
  • The level of skills involved in grammar development: testing different configurations (see table below)
  • Grammar testing: use of treebanks with guided random generation for initial evaluation and regression testing
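
The grammar-testing point can be sketched as follows. This is a toy model under stated assumptions, not the GF tooling: the tree shape, the ENG table, and the functions generate_trees, build_treebank and regressions are hypothetical, and "guided" generation is modelled simply as restricting the pool of items that trees are sampled from.

```python
import random

# Hypothetical toy abstract syntax: a tree is a pair (Item, Quality).
ITEMS = ["Wine", "Pizza"]
QUALITIES = ["Good", "Cold"]

# Hypothetical toy English linearization table.
ENG = {"Wine": "wine", "Pizza": "pizza", "Good": "good", "Cold": "cold"}


def linearize(tree):
    """Turn an abstract tree into an English string."""
    item, quality = tree
    return f"this {ENG[item]} is {ENG[quality]}"


def generate_trees(n, seed, items=ITEMS):
    """Guided random generation: sample trees only from the given items.

    A fixed seed makes the sample reproducible across test runs.
    """
    rng = random.Random(seed)
    return [(rng.choice(items), rng.choice(QUALITIES)) for _ in range(n)]


def build_treebank(trees):
    """Store each tree with its current linearization as the gold standard."""
    return {tree: linearize(tree) for tree in trees}


def regressions(treebank, linearize_fn):
    """Return the trees whose linearization no longer matches the treebank."""
    return [tree for tree, gold in treebank.items() if linearize_fn(tree) != gold]
```

A treebank built once from generate_trees can be reviewed by hand for the initial evaluation and then re-checked with regressions after every grammar change. An empty result means no stored linearization has changed.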