Frequently Asked Questions

These are the questions we have been asked about MOLTO, with our answers. If you do not find what you are looking for, you are welcome to contact us.

  • What is MOLTO's goal, in one sentence?

    We want to develop tools for automatically translating documents on the web, with high quality and between many languages (up to 15 simultaneously).

  • How does MOLTO differ from existing translation tools on the web?

    Tools like Systran (Babelfish) and Google Translate are designed for consumers of information, but we will mainly serve the producers of information. We want the quality to be good enough so that, for instance, an e-commerce site can translate their web pages automatically without the fear that the message will change. With other tools, a potential customer can, for instance, read an e-commerce page written in French and translate it into Swedish just to find out whether the shop has something of interest for her.

  • Isn't this too good to be true?

    There is a price we have to pay, of course: we will not be able to translate just anything. We can only translate things that we have customized the system to translate. This follows from a well-known trade-off in machine translation: one cannot at the same time reach full coverage and full precision. In this trade-off, Systran and Google have opted for coverage whereas MOLTO opts for precision.

  • What kind of things will you be able to translate?

    MOLTO translators are specialized to different domains, which use language in uniform and well-understood ways. In MOLTO itself, we will build systems for three such domains: mathematical exercises, biomedical patents, and museum object descriptions. But these domains are just examples, which help us to develop and evaluate the tools; we expect the tools to be applicable to new domains by other people. Examples of such domains could be e-commerce sites, Wikipedia articles, contracts, business letters, user manuals, and software localization.

  • Where else can I find MOLTO news?

    The MOLTO Twitter feed publishes short news items, interesting facts on MOLTO technologies and people, and occasionally photos of events. You can follow us at http://twitter.com/moltoproject.

  • How do I acknowledge MOLTO funding?

    All publications shall include the following statement to indicate that foreground was generated with the assistance of financial support from the European Union:

    The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° FP7-ICT-247914.

  • How do I connect to NEF (the participant portal of the EU)?

    The Participant Portal hosts services that facilitate the monitoring and the management of the projects. It is also a secure Internet site that ensures adequate authentication and confidentiality mechanisms, based on the European Commission Authentication Service (ECAS).

    The access to NEF is via the Participant Portal: http://ec.europa.eu/research/participants/portal/

    As a participant, you can access NEF via the participant portal to enter data to be submitted by the coordinator.

  • What does the open access -- special clause 39 -- entail exactly?

    The FP7 open access pilot is based on self-archiving open access of peer-reviewed publications. The Commission also provides the opportunity for fully open access publishing in FP7 by reimbursing costs for open access publishing in subscription-based journals.

    Authors shall deposit the peer-reviewed manuscripts of their articles in repositories (also called open archives) at the time of publication.

  • How do self-archiving and open access work at UGOT?

    The system for self-archiving at the University of Gothenburg is under development. Self-archiving will be done in GUP (Göteborgs Universitets Publikationer) and it will be possible to set a time for when the publication should be openly available. At the moment self-archiving can be done in GUPEA (Göteborgs Universitets Publikationer - Elektroniskt Arkiv), but you will need help with this.

  • Is there a style for MOLTO deliverables?

    No, not officially, since every group uses its favorite editor. Ideally the deliverables are prepared collaboratively online in the wiki as "living deliverables". This allows for archiving comments and for versioning (as done by our CMS, Drupal). If you choose to do your edits online then, once the deliverable is available under

    http://www.molto-project.eu/wiki/living-deliverables

    you may generate a frozen version to be time-stamped as delivered by using the printer-friendly version followed by Print. In most cases this produces page numbers and a good A4 format.

  • Will you be able to translate newspaper texts?

    No. "Newspaper text" is not a well-defined domain in MOLTO's sense, at least not in the light of the knowledge we have today. So we leave it to other tools to translate newspapers, novels, and random web pages.

  • Is it a huge effort to build quality translation systems for new domains?

    This is exactly what we want to make easier. Traditionally, it has been an effort of years to build a translation system of any reasonable size. We want to bring this down to months, in some cases even to days. And we want it to be doable for persons without special training in MOLTO, in linguistics, or in programming. Read the "Technology" section to find out how we believe we can do this.

  • Will you make human translators unemployed?

    No. Firstly because we cannot translate outside well-defined domains. Secondly, and more interestingly, we will provide new working modes for human translators: instead of translating similar documents in the same domain over and over again, they will be able to work on customizing the translation systems. The systems will learn from a few well-chosen examples, translated by humans, how to translate other texts within the same domain. This will raise the translator's work to a higher level.

  • Will the quality match human translators?

    Human translators will always be better than MOLTO at making intelligent decisions about style, and hence produce more elegant text. On the other hand, MOLTO will be good at terminologies and idiomatic usages in specialized domains, for which human translators might lack training.

  • What languages are there in MOLTO?

    MOLTO is committed to dealing with 15 languages, which include 12 official languages of the European Union - Bulgarian, Danish, Dutch, English, Finnish, French, German, Italian, Polish, Romanian, Spanish, and Swedish - and 3 other languages - Catalan, Norwegian, and Russian. During the project, other languages are likely to be added, since they are provided by other on-going projects.

  • How can I add my language?

    The main thing we use for each language in MOLTO is a resource grammar, which is a software library that defines the grammatical rules of the language: its word inflection and syntactic structures. Writing a resource grammar for a new language requires an effort of 3 to 6 months from a reasonably skilled programmer with good theoretical and practical knowledge of the language.
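
    As a rough illustration of what "word inflection and syntactic structures" means in software terms, here is a toy sketch in Python. It is not GF code and not the API of any resource grammar; the function names plural and noun_phrase and the simplified English rules are assumptions made up for this example.

```python
# Toy sketch only: hypothetical names, heavily simplified English rules.
VOWELS = "aeiou"


def plural(noun: str) -> str:
    """Inflection rule: build the regular plural form of an English noun."""
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if len(noun) > 1 and noun.endswith("y") and noun[-2] not in VOWELS:
        return noun[:-1] + "ies"
    return noun + "s"


def noun_phrase(noun: str, number: str) -> str:
    """Syntactic rule: the determiner must agree in number with the noun."""
    if number == "sg":
        return f"this {noun}"
    if number == "pl":
        return f"these {plural(noun)}"
    raise ValueError(f"unknown number: {number!r}")


if __name__ == "__main__":
    for word in ["museum", "box", "city", "day"]:
        print(noun_phrase(word, "sg"), "/", noun_phrase(word, "pl"))
```

    A real resource grammar has to cover much more than this sketch does, for instance irregular words, all word classes, and full sentence structure.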

  • Which are the most likely next languages?

    There is on-going work on at least Arabic, Farsi, Hebrew, Hindi/Urdu, Icelandic, Japanese, Latvian, Maltese, Portuguese, Swahili, Tswana, and Turkish. The EU languages that still lack developers are Czech, Estonian, Greek, Hungarian, Irish, Lithuanian, Slovak, and Slovene. You are most welcome to contribute to any of these languages!

  • When will MOLTO be available for use?

    We will release the first prototype of the MOLTO web service in June 2010. This prototype will be constantly updated, and more mature tools will be released during 2011. The case studies will be finished in late 2012. But you can already get an idea of the underlying technology by trying out a fridge magnet demo or a text input demo.

  • Your translator has errors - where's the quality?

    We will receive feedback from users continuously, and fix all errors as soon as possible. One advantage with MOLTO technology is that it is highly programmable: we can locate errors in translations with high precision, and produce a fixed version of the system quickly without breaking anything else.

  • Which people are there in MOLTO?

    We are three universities and two private companies, from five EU countries. About 25 persons will be actively involved in MOLTO.

  • What are your backgrounds and competences?

    MOLTO has people with backgrounds in computer science, linguistics, and mathematics. There are university professors, PhD students, engineers, and translators.

  • How will the EU money be used?

    The total budget is just below 3,000,000 EUR, of which the EC contribution is 2,375,000 EUR. This will pay for 390 person months of work, divided among engineers, PhD students, translators, a project manager, and partial salaries of faculty members. More than 90% of the budget is for salaries and salary-related costs; the rest is mainly for travel, both for internal meetings between the sites and for participation in conferences to disseminate the results. From another perspective, 86% is for research and development, 10% for dissemination and exploitation, and 5% for management.

  • Who may exploit the results?

    Almost all software will be publicly available as open-source free software released under the GNU LGPL license. The LGPL license implies that anyone may use MOLTO tools for anything, both for research and for commercial purposes. Unlike with the GPL license, third-party applications need not be released as open source in turn. But of course we expect much of the derived work also to be released as open source back to the community.

  • Will there be commercial applications?

    Yes, our company partners will evaluate the commercial use during MOLTO.

  • What are the main ideas behind MOLTO?

    The main idea is to use interlinguas based on domain semantics and equipped with reversible generation functions. Thus translation is a composition of parsing the source language and generating the target language. An implementation of this technology is provided by GF, Grammatical Framework, grammaticalframework.org. In MOLTO, GF is complemented by the use of ontologies, such as those used in the semantic web. We will also use methods of statistical machine translation (SMT) for improving robustness and extracting grammars from data.
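
    To make "translation is a composition of parsing and generating" concrete, here is a toy sketch in Python. It is only an illustration under stated assumptions: real MOLTO grammars are written in GF, and the tiny two-language domain, the names Item, linearize, parse, and translate, and the brute-force parser are all made up for this example.

```python
# Toy sketch only: real MOLTO grammars are written in GF, not Python.
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Item:
    """Interlingua (abstract syntax): a kind of thing with one quality."""
    kind: str
    quality: str


KINDS = ["Wine", "Cheese"]
QUALITIES = ["Italian", "Fresh"]

# Hypothetical concrete syntaxes: a word table and a word order per language.
WORDS = {
    "eng": {"Wine": "wine", "Cheese": "cheese",
            "Italian": "Italian", "Fresh": "fresh"},
    "fre": {"Wine": "vin", "Cheese": "fromage",
            "Italian": "italien", "Fresh": "frais"},
}
ADJECTIVE_FIRST = {"eng": True, "fre": False}


def linearize(tree: Item, lang: str) -> str:
    """Generation function: turn an abstract tree into a phrase."""
    noun = WORDS[lang][tree.kind]
    adjective = WORDS[lang][tree.quality]
    if ADJECTIVE_FIRST[lang]:
        return f"{adjective} {noun}"
    return f"{noun} {adjective}"


def parse(text: str, lang: str) -> Item:
    """Reverse of linearize, found here by searching all trees in the domain."""
    for kind, quality in product(KINDS, QUALITIES):
        tree = Item(kind, quality)
        if linearize(tree, lang) == text:
            return tree
    raise ValueError(f"outside the domain: {text!r}")


def translate(text: str, source: str, target: str) -> str:
    """Translation = parse the source language, then generate the target."""
    return linearize(parse(text, source), target)


if __name__ == "__main__":
    print(translate("Italian wine", "eng", "fre"))
    print(translate("fromage frais", "fre", "eng"))
```

    The sketch also shows the coverage versus precision trade-off in miniature: a phrase inside the domain is translated exactly, while anything outside it is rejected with an error instead of being guessed at.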

  • What is GF?

    GF is a framework for defining multilingual grammars, each based on a common abstract syntax. The abstract syntax is defined by using type theory, in the same way as in logical frameworks. The natural language generation part is called concrete syntax, which is a feature-based grammar formalism equivalent to PMCFG (Parallel Multiple Context-Free Grammars) and has polynomial parsing behaviour. GF uses PMCFG as its "machine language", which is compiled from

  • Why do you believe these ideas will work?

    GF has been developed for 12 years now, and multilingual GF-based translation has been tested in numerous applications, ranging from mathematics via software specifications to spoken dialogue systems (see the GF homepage). We also believe there are lots of interesting domain translation tasks out there, even if we cannot provide a competitor to open-domain systems like Google Translate.

  • Isn't interlingua an unrealistic dream?

    Yes, it is, if we want to have a universal interlingua working for everything. This is why we don't believe we can ever translate newspapers with MOLTO techniques. However, domain-specific interlinguas have proved quite feasible. Notice that this move is similar to what has happened in ontologies: they have moved from universal ontologies to domain ontologies.

  • Are there any scientific challenges?

    The first challenge is to scale up the size of applications. Not so much the number of languages, which we know how to manage already, but the lexicon size - from hundreds to thousands of words. We need techniques to build such translation lexica manually and to extract them automatically. This leads to the second challenge, which is to minimize the development effort, in terms of skills and time: to make GF available for people with no special training, as a part of their normal work flows.