Contract No.: | FP7-ICT-247914 |
---|---|
Project full title: | MOLTO - Multilingual Online Translation |
Deliverable: | D9.2 MOLTO evaluation and assessment report |
Security (distribution level): | Public |
Contractual date of delivery: | M36 |
Actual date of delivery: | March 2013 |
Type: | Report |
Status & version: | Draft |
Author(s): | Jussi Rautio, Maarit Koponen |
Task responsible: | UHEL |
Other contributors: | UPC |
Abstract
The impact of MOLTO is not just about individual use cases. During the three years of the project, we have developed methods for efficient grammar writing that divide the work so that grammar experts and domain experts each get to do what they do best. These guidelines are documented in D2.3, Best practices.
We have conducted a grammar evaluation survey among people who have written grammars. The results of the survey and an overview of the practices are documented in Part 1.
We have also recorded the time and measures needed to correct grammars. Since the release of the first MOLTO demo (D10.2, tourist phrasebook), we have collected feedback and bug reports, and corrected the bugs. Part 2 describes these bugs and the effort that has been needed to fix them.
The Best practices document was published in October 2012, but many of the grammars were written before that. Below is first an overview of the best practices, followed by a look at whether the grammars were written accordingly.
(This summary is copied from the Best practices document.)
The following tools are standard and well-tested in MOLTO’s and other applications:
It has two modules: Sentences, which contains phrases that can be defined by a functor over the resource grammar API, and Words, which contains the phrases that are likely to have different implementations.
Semantic validity is handled with a simple, restrictive abstract syntax. For example, an abstract syntax function such as
HowFarBy : Place -> ByTransport -> Question
guarantees that we can say "How far is the church by taxi" but not "How far is John by beer": the arguments need to be a place and a transport.
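The effect of this type restriction can be sketched with a minimal, self-contained GF grammar. This is only an illustration under stated assumptions: the module names `HowFar` and `HowFarEng`, the categories `Person` and `Drink`, the constants `Church`, `Taxi`, `John` and `Beer`, and the plain string linearizations are all hypothetical and much simpler than the actual Phrasebook, which builds its phrases with the resource grammar library.

```gf
-- File HowFar.gf (hypothetical module; GF expects one module per file)
abstract HowFar = {
  flags startcat = Question ;
  cat
    Question ; Place ; ByTransport ;
    Person ; Drink ;              -- assumed categories, only used for the counterexample
  fun
    HowFarBy : Place -> ByTransport -> Question ;  -- signature quoted in the text
    Church : Place ;
    Taxi   : ByTransport ;
    John   : Person ;
    Beer   : Drink ;
}

-- File HowFarEng.gf (hypothetical string-based concrete syntax, no RGL)
concrete HowFarEng of HowFar = {
  lincat
    Question, Place, ByTransport, Person, Drink = {s : Str} ;
  lin
    HowFarBy place transport =
      {s = "how" ++ "far" ++ "is" ++ place.s ++ transport.s} ;
    Church = {s = "the" ++ "church"} ;
    Taxi   = {s = "by" ++ "taxi"} ;
    John   = {s = "John"} ;
    Beer   = {s = "by" ++ "beer"} ;
}
```

With `HowFarEng.gf` loaded in the GF shell, `parse "how far is the church by taxi"` should return the tree `HowFarBy Church Taxi`. The tree `HowFarBy John Beer` cannot be built at all, because `John` is not a `Place` and `Beer` is not a `ByTransport`, so the sentence "how far is John by beer" is neither parsed nor generated.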
Module structure: Common constructions with a functor
The starting point for the grammar was a test corpus of sentences that we wanted to express in the grammar. These sentences are used as documentation for the abstract syntax:
AHasAge : Person -> Number -> Action ; -- I am seventy years
AHasChildren : Person -> Number -> Action ; -- I have six children
AHasName : Person -> Name -> Action ; -- my name is Bond
ACE-GF: based on Attempto Controlled English. (ACE is ____.)
AceWiki works on ACE (the AceWiki subset), with grammars for Cat, Dan, Dut, Eng (not ACE), Est, Fin, Fre, Ger, Ita, Lav, Nor, Pol, Ron, Rus, Spa, Swe, Urd (https://github.com/Attempto/ACE-in-GF/tree/master/grammars/acewiki_aceowl).
Grammar modules: ACE base, plus domain lexicons (Geography).
(AceWiki also has normal, non-ACE grammars, but these are unrelated to the ACE grammar.)
Questionnaire
Basic information:
Use of development tools:
Diagnostic tools
Compilation diagnostics:
Grammar display modes:
Testing
Tools for generation and testing:
RGL
Resource grammar tools:
Grammar writing
Starting point for your grammar:
Basic unit of the grammar:
Semantic control:
Module structure:
Concrete syntax:
Analysis of answers: ....
Some answers given under "Other" that are not in Best practices(?):
Other method for treebanks: Haskell code to store, edit and show differences in treebanks.
Other development tool: Haskell and shell scripts generating grammars
Examples of grammar modification
Case study: Phrasebook
The Phrasebook was published as deliverable 10.2 in June 2010, the third month of MOLTO. Initially it translated between 14 European languages (now 20 languages) and was written by 8 authors. The authors have varied GF skills, ranging from participants in a 2-day GF course to major developers of GF. Some of the language versions were written by people with no skills in the language at all, using example-based grammar writing (see the report for more information).
During the 2.5 years since then, we have received feedback and bug reports. The issues can be divided into Phrasebook errors and resource grammar library (RGL) errors. Both of course show up as errors in the application grammar, but they need to be fixed at different levels. The time spent fixing the problem and the expertise required of the grammar writer also differ between the two error types.
Feedback has been given in various ways. The demo has a feedback button for anonymous feedback; this has gone to ____ (WHERE) and has been assigned to ____ (WHO). The Phrasebook demo has also been shown in various presentations, and sometimes an audience member or the presenter has noticed a problem during the presentation. The problem has then either been fixed by the presenter or, where the presenter lacked the time, language skills or GF skills to fix the bug, passed on to someone with the necessary skills and time.
Initially there was no project-wide reporting system, but in autumn 2012 UHEL set one up at http://tfs.cc/trac. Each application grammar has an owner who is notified of new tickets and can either fix the bug or assign the job to someone else.
Crowdsourcing is another possible means of bug detection. However, to profit from it we would need a large number of people browsing the site and our apps, which is not realistic. Most of the bug reports come from people already involved in MOLTO.
Here I list issues that I know of. This is not necessarily a complete list.
The difference between an application grammar issue and an RGL issue can be unclear; for instance, incorrect morphology in the application grammar may result from using the wrong RGL functions or from there not being a correct RGL function in the first place. Where a correct RGL function exists but the user has chosen a wrong one, I have classified the error as an application grammar issue, as the fix has been made in the application grammar.
Spanish:
1) HowFar, HowFarFrom, HowFarBy and HowFarFromBy
2) Plane
   mkN "avión" masculine.
3) Fish
   fish_N, and its meaning is live fish. mkN "pescado".
4) Adjectives ending in consonant inflect wrong
   mkA. With smart paradigms this means choosing the right number of arguments, which in this case is 5 as opposed to 1. Applied to 8 adjectives in the application grammar.
Catalan:
1) HowFar, HowFarFrom, HowFarBy and HowFarFromBy
Finnish:
1) Locative cases for geographical names
Spanish and Catalan:
1) Negative imperatives
   ImpNeg function in Spanish and Catalan RGL and used it in the application grammar.
2) Adjectives ending in consonant inflect wrong
French:
1) Wrong agreement in French superlative forms
   DetNP, which only produces masculine versions. DetNPFem for all Romance languages, have the application grammar a construction based on the gender of the noun.
Finnish:
1) Vowel harmony of possessive suffixes
2) Wrong word forms in Finnish genitive+possessive suffix http://tfs.cc/trac/ticket/34
3) Pronoun problems with the modal verb "must" in Finnish http://tfs.cc/trac/ticket/23
4) Incorrect plural stem for "children" in Finnish http://tfs.cc/trac/ticket/27
5) Translation of modal verb + a location not working for Finnish. Modal verb problems also in Italian, Catalan and Russian. http://tfs.cc/trac/ticket/15