4. Defining evaluation criteria

A useful list of quality dimensions for the MOLTO evaluation can be derived from the DoW, which links the main objectives of the project to the tasks in the work packages (WPs):

  1. adaptability of translation systems: WP2
  2. user friendliness and integration in workflows: WP3
  3. integration with semantic web technology: WP4
  4. usefulness on different domains: WP6, WP7, WP8
  5. scaling up towards more open text: WP5, WP7
  6. quality of translation: WP9
  7. wide user adaptation and exploitability: WP10
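Since the criteria are detailed per WP, it can help to read the list above in the opposite direction, from work package to quality dimension. The following is a minimal sketch of that inversion, not part of the MOLTO tooling; the names `DIMENSION_TO_WPS` and `dimensions_by_wp` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

# Quality dimensions and their linked work packages, copied from the list above.
DIMENSION_TO_WPS = {
    "adaptability of translation systems": ["WP2"],
    "user friendliness and integration in workflows": ["WP3"],
    "integration with semantic web technology": ["WP4"],
    "usefulness on different domains": ["WP6", "WP7", "WP8"],
    "scaling up towards more open text": ["WP5", "WP7"],
    "quality of translation": ["WP9"],
    "wide user adaptation and exploitability": ["WP10"],
}


def dimensions_by_wp(mapping):
    """Invert a dimension -> WPs mapping into a WP -> dimensions mapping."""
    inverted = defaultdict(list)
    for dimension, wps in mapping.items():
        for wp in wps:
            inverted[wp].append(dimension)
    return dict(inverted)


if __name__ == "__main__":
    # Print the dimensions each work package contributes to.
    for wp, dimensions in sorted(dimensions_by_wp(DIMENSION_TO_WPS).items()):
        print(wp, "->", "; ".join(dimensions))
```

Read this way, WP7 is the only work package linked to two dimensions (usefulness on different domains and scaling up towards more open text).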

The following are measurable expected outcomes. Most of them can be applied directly as testable quantitative evaluation measures. How many test rounds can be run is a separate question, since each round needs fresh test subjects.

| Feature | Current | Projected | Remarks |
|---|---|---|---|
| Languages | up to 7 | up to 15 | languages treated simultaneously |
| Domain size | 100’s of words | 1000’s of words | 4 domains with substantial applications (“substantial” not quantified here) |
| Robustness | none | open-text capability | translation quality: “complete” or “useful” on the TAUS scale (Translation Automation Users Society) |
| Development per domain | months | days | |
| Development per language | days | hours | |
| Learning (grammarians) | weeks | days | |
| Learning (authors) | days | hours | source authoring: the MOLTO tool for writing translatable controlled text can be learned in less than one hour, the speed of writing translatable controlled text is in the same order of magnitude as writing unlimited plain text |
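The duration-type rows of the table can be checked mechanically once a measurement is available. The sketch below is only an illustration under stated assumptions: it treats the units as an ordinal scale (hours < days < weeks < months) and counts a measurement as meeting the target when its unit is no coarser than the projected one. The names `DurationTarget`, `meets_target` and `report`, and the example measurements, are hypothetical and not taken from the project.

```python
from dataclasses import dataclass

# Assumed ordinal scale: only the coarsest unit of a measurement is compared.
DURATION_ORDER = ["hours", "days", "weeks", "months"]


@dataclass(frozen=True)
class DurationTarget:
    feature: str
    current: str
    projected: str


# Current and projected values copied from the table above.
TARGETS = [
    DurationTarget("Development per domain", "months", "days"),
    DurationTarget("Development per language", "days", "hours"),
    DurationTarget("Learning (grammarians)", "weeks", "days"),
    DurationTarget("Learning (authors)", "days", "hours"),
]


def meets_target(target, measured_unit):
    """True if the measured unit is no coarser than the projected unit."""
    measured_rank = DURATION_ORDER.index(measured_unit)
    projected_rank = DURATION_ORDER.index(target.projected)
    return measured_rank <= projected_rank


def report(measurements):
    """Map each measured feature to whether it meets its projected target."""
    return {
        target.feature: meets_target(target, measurements[target.feature])
        for target in TARGETS
        if target.feature in measurements
    }


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    example = {"Development per domain": "weeks", "Learning (authors)": "hours"}
    for feature, ok in report(example).items():
        print(f"{feature}: {'met' if ok else 'not met'}")
```

The remaining rows (languages, domain size, robustness) need their own comparisons, for example a numeric threshold or a TAUS scale rating, and are left out of this sketch.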

The figure of 18 grammar library languages is the minimum number of languages we expect to be available at the end of MOLTO. The range of 3 to 15 refers to the number of languages actually implemented in MOLTO’s domain grammars (3 in WP7, 15 in WP6 and WP8).

The measurements of all these features are performed within WP9 in connection with the project milestones. The advisory group will confirm the adequacy and accuracy of the measurements.

The objects of evaluation – even the translated texts – vary considerably per WP. We detail some criteria per WP below. Evaluation criteria and methods have been collected on the UHEL MOLTO website (esp. https://kitwiki.csc.fi/twiki/bin/view/MOLTO/EvaluationCookbook).