X.2 Grammar modification

Examples of grammar modification

Case study: Phrasebook

Phrasebook was published as deliverable 10.2 in June 2010, the third month of MOLTO. Initially it translated between 14 European languages (now 20) and was written by 8 authors. Their GF skills varied, from participants of a 2-day GF course to major developers of GF. Some of the language versions were written by people with no skills in the language, using example-based grammar writing (see the report for more information).

During the 2.5 years since then, we have received feedback and bug reports. The issues can be divided into Phrasebook errors and resource grammar library (RGL) errors. Both show up as errors in the application grammar, but they need to be fixed at different levels. The time spent on the fix and the expertise required of the grammar writer also differ between the two error types.

Feedback

Feedback has been given in various ways. There is a feedback button in the demo for anonymous feedback; this has gone to ____ (WHERE) and has been assigned to ____ (WHO). The Phrasebook demo has been shown in various presentations, and sometimes an audience member or the presenter has noticed a problem during the presentation. The presenter has either fixed the problem or, when lacking the time, language skills or GF skills to do so, passed it on to someone with the necessary skills and time.

Initially there was no project-wide reporting system, but since autumn 2012 there has been one, set up by UHEL, at http://tfs.cc/trac. Each application grammar has an owner who is notified about new tickets and can either fix the bug or assign the job to someone else.

Crowdsourcing is another possible source for bug detection. However, in order to profit from that we would need a large number of people browsing the site and our apps, which is not realistic. Most of the bug reports come from people already involved in MOLTO.

List of grammar issues

Here I list issues that I know of. This is not necessarily a complete list.

The difference between an application grammar issue and an RGL issue can be unclear. For instance, incorrect morphology in the application grammar may result from using the wrong RGL function, or from there being no correct RGL function in the first place. Where a correct RGL function exists but the user has chosen a wrong one, I have classified the error as an application grammar issue, because the fix was made in the application grammar.

Application grammar issues

Spanish:

1) HowFar, HowFarFrom, HowFarBy and HowFarFromBy

Error: Structure of "How far" questions. The questions initially had a structure that is more common in Latin America and sounded odd to speakers in Spain.
Fix: Copied the structure from French into the application grammar.
Time: < 30 minutes
Skills: Medium GF skills (have made a mini resource and some application grammars)

2) Plane

Error: The word for plane (avión) had the wrong gender. The word had been defined in the application grammar and not in the resource grammar.
Fix: Changing the gender in the application grammar, mkN "avión" masculine.
Time: < 5 minutes
Skills: Medium GF skills

3) Fish

Error: The word for fish was a word that means live fish, whereas the context in Phrasebook needs a word for fish as a food. The word was taken from the RGL lexicon, which has only one fish_N, and its meaning is live fish.
Fix: Defining the word in the application grammar, mkN "pescado".
Time: < 5 minutes
Skills: Medium GF skills

4) Adjectives ending in consonant inflect wrong

Error: Wrong paradigm chosen from the RGL functions.
Initial fix: Choose the right paradigm of mkA. With smart paradigms this means choosing the right number of arguments, which in this case is 5 as opposed to 1. Applied to 8 adjectives in the application grammar.
Time: < 30 minutes
Skills: Medium GF skills
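
The idea of selecting a paradigm by the number of arguments can be illustrated outside GF. The following Python sketch is not RGL code: the function mk_adj, its naive one-argument rule, the order of the five forms and the example adjectives are all assumptions made for illustration. It only shows why giving all five forms avoids a wrong guess for an adjective ending in a consonant.

```python
from typing import Dict

# Assumed order of the five forms (hypothetical, for illustration only).
FORM_NAMES = ("masc_sg", "fem_sg", "masc_pl", "fem_pl", "adverb")


def mk_adj(*forms: str) -> Dict[str, str]:
    """Hypothetical smart paradigm: the number of arguments selects the rule."""
    if len(forms) == 1:
        # Naive one-argument rule (an assumption, not the RGL rule):
        # treat every adjective like a regular -o adjective such as "barato".
        stem = forms[0][:-1]
        guessed = (forms[0], stem + "a", stem + "os", stem + "as", stem + "amente")
        return dict(zip(FORM_NAMES, guessed))
    if len(forms) == 5:
        # Worst-case paradigm: every form is given explicitly, nothing is guessed.
        return dict(zip(FORM_NAMES, forms))
    raise ValueError("expected 1 or 5 forms, got %d" % len(forms))


regular = mk_adj("barato")   # the guess works for a regular -o adjective
wrong = mk_adj("fácil")      # the guess fails for a consonant-final adjective
right = mk_adj("fácil", "fácil", "fáciles", "fáciles", "fácilmente")
```

With the one-argument call the consonant-final adjective gets invented forms, whereas the five-argument call returns exactly the forms it was given.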

Catalan:

1) HowFar, HowFarFrom, HowFarBy and HowFarFromBy

Same error as in Spanish, because the Catalan grammar was copied from the Spanish one. Same fix.

Finnish:

1) Locative cases for geographical names

Error: All geographical names have the same locative case, which is wrong for some of them.
Fix: Added parameters for the data structure of geographical names, so that the right locative case can be chosen.
Time: < 30 minutes?
Skills: Advanced GF skills, native speaker of Finnish
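
As a rough illustration of the fix, the record for a geographical name can carry a parameter that says which locative case to use. The Python sketch below is not the Phrasebook code: the type PlaceName, its fields and the function in_place are hypothetical, and the stems and vowel harmony classes are entered by hand rather than computed.

```python
from dataclasses import dataclass
from enum import Enum


class LocCase(Enum):
    INNER = "inessive"  # ending -ssa / -ssä
    OUTER = "adessive"  # ending -lla / -llä


@dataclass(frozen=True)
class PlaceName:
    nominative: str
    stem: str           # stem that the case ending attaches to (given by hand)
    loc_case: LocCase   # which locative case expresses "in this place"
    back_vowels: bool   # vowel harmony class (given by hand)


def in_place(place: PlaceName) -> str:
    """Build the 'in <place>' form using the case stored with the name."""
    ending = {LocCase.INNER: "ss", LocCase.OUTER: "ll"}[place.loc_case]
    vowel = "a" if place.back_vowels else "ä"
    return place.stem + ending + vowel


suomi = PlaceName("Suomi", "Suome", LocCase.INNER, True)
venaja = PlaceName("Venäjä", "Venäjä", LocCase.OUTER, False)
```

Here in_place(suomi) gives "Suomessa" and in_place(venaja) gives "Venäjällä"; with a single hard-coded case one of the two would come out wrong.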

RGL errors

Spanish and Catalan:

1) Negative imperatives

Error: Negative imperatives were formed by taking the positive imperative and adding a negation particle. The correct form uses the subjunctive mood with a negation particle.
Fix: Created an ImpNeg function in Spanish and Catalan RGL and used it in the application grammar.
Time: ~1 hour
Skills: Medium GF skills, fluent non-native Spanish & Catalan
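
The difference between the faulty and the corrected construction can be sketched with a small table of verb forms. This Python snippet is an illustration only and does not reflect how ImpNeg is implemented in the RGL; the table VERB_FORMS and both function names are hypothetical, and only second person singular forms are covered.

```python
# Hand-written second person singular forms for two Spanish verbs.
VERB_FORMS = {
    "hablar": {"imperative_sg2": "habla", "subjunctive_sg2": "hables"},
    "comer": {"imperative_sg2": "come", "subjunctive_sg2": "comas"},
}


def buggy_negative_imperative(verb: str) -> str:
    """The faulty construction: negation particle + positive imperative."""
    return "no " + VERB_FORMS[verb]["imperative_sg2"]


def imperative(verb: str, negative: bool = False) -> str:
    """The corrected construction: negation selects the subjunctive form."""
    forms = VERB_FORMS[verb]
    if negative:
        return "no " + forms["subjunctive_sg2"]
    return forms["imperative_sg2"]
```

For "hablar" the faulty version yields "no habla", while the corrected version yields "no hables".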

2) Adjectives ending in consonant inflect wrong

Error: The same error as in the application grammar issue of the same name: wrong paradigm chosen in the RGL functions.
Fix: Make a new smart paradigm for these adjectives that takes only 2 forms. In Catalan, a more thorough revision of the smart paradigm system.
Time: ~1 hour in Spanish, half a day in Catalan
Skills: Medium GF skills, fluent non-native Spanish & Catalan

French:

1) Wrong agreement in French superlative forms

Error: The superlative is formed with DetNP, which only produces masculine forms.
Fix: Make DetNPFem for all Romance languages, and have the application grammar choose the construction based on the gender of the noun.
Time: ~1 hour
Skills: Medium GF skills
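
The effect of choosing the construction by gender can be sketched as follows. This is a simplified Python illustration, not the RGL implementation of DetNP or DetNPFem: the function names and the hand-written adjective forms are hypothetical, and only the singular is handled.

```python
# Hand-written singular forms of one French adjective, keyed by gender.
CHER = {"masc": "cher", "fem": "chère"}


def superlative_masc_only(adj_forms: dict) -> str:
    """Mimics the faulty behaviour: always the masculine variant."""
    return "le plus " + adj_forms["masc"]


def superlative(adj_forms: dict, gender: str) -> str:
    """Chooses article and adjective form from the gender of the noun."""
    article = {"masc": "le", "fem": "la"}[gender]
    return article + " plus " + adj_forms[gender]
```

For a feminine noun, superlative(CHER, "fem") gives "la plus chère", whereas the masculine-only version gives "le plus cher" regardless of the noun.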

Finnish:

1) Vowel harmony of possessive suffixes

Error: Vowel harmony of possessive suffixes does not work; all words get the back vowel variant.
Fix: Implement a new parameter for vowel harmony in the Finnish resource grammar, change the category definitions for nouns and determiners, and change the functions that handle them.
Time: ~1 day (counting the first attempt, which turned out to be too slow, and the redesign)
Skills: Medium GF skills
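
The rule that the suffix has to follow can be sketched in a few lines. The Python snippet below is a simplified illustration and not the resource grammar implementation: the function names are hypothetical, the harmony class is guessed from the letters of the stem, and compounds and loanwords are ignored.

```python
BACK_VOWELS = set("aou")


def has_back_harmony(word: str) -> bool:
    """Simplified rule: a word with a, o or u takes back vowel suffixes."""
    return any(ch in BACK_VOWELS for ch in word.lower())


def third_person_possessive(stem: str) -> str:
    """Attach the third person possessive suffix -nsa / -nsä."""
    suffix = "nsa" if has_back_harmony(stem) else "nsä"
    return stem + suffix
```

With this rule "talo" becomes "talonsa" and "kylä" becomes "kylänsä"; the error described above corresponds to always choosing the suffix "nsa".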

2) Wrong word forms in Finnish genitive+possessive suffix (http://tfs.cc/trac/ticket/34)

3) Pronoun problems with the modal verb "must" in Finnish (http://tfs.cc/trac/ticket/23)

4) Incorrect plural stem for "children" in Finnish (http://tfs.cc/trac/ticket/27)

5) Translation of modal verb + a location not working for Finnish. Modal verb problems also in Italian, Catalan and Russian. (http://tfs.cc/trac/ticket/15)

All of these were corrected by a user with advanced GF skills; the total time taken was around half a day.