WP2, WP3, WP5, WP6, WP7, WP10 Meeting 2012.04.04

Participants: OC, MA, RE, CE, MG

Topic: how to exploit the current MOLTO results and tools in the general panorama of e-science and WDML, and CLARIN and .....

Minutes

Malin and Ramona have been developing a Swedish parser based on a wide-coverage grammar, Ramona and Txell are handling the patent language, and Jordi the mathematical fragments; in addition, we have the KRI from Ontotext. It seems that all this should come together now :)

So how hard is the following challenge:

  • to harvest meaningful data that has appeared in the scientific literature? (For Cristina, that would be astrophysics.) Note that this aligns well with current efforts in the Semantic Web, the World Digital Mathematics Library, and the ICSU World Data System (http://www.icsu-wds.org/), and fits the general idea of an eScience workbench.

In discussing the above we stumbled on a number of questions:

  • is that really valuable when we all write and publish science in English? We are losing the possibility of speaking science in our natural language if we lose the specialized language.
  • who makes the specialized terms? how do domain specific terms make it into the dictionary? who translates them? we realized we did not know! Tasks assigned below.
  • UPC publishes a terminology booklet for lecturers; this seems like a good use case for example-based GF generation. Investigate further.
  • how difficult is it to extract a parallel corpus from abstracts that were produced by experts (to be published in national societies' journals) in English and, say, Catalan? It seems like good news for research.

A number of tasks have been volunteered :) and are listed below. No specific deadline, except maybe a funding one.

TODO log:

[4/4/12 11:24:51 AM] Olga Caprotti: http://eudml.org/home.action;jsessionid=4E3743F7FAF048E7CDA31E9ACD53E054
[4/4/12 11:39:35 AM] Olga Caprotti: OC: look for swedish math paper to send to malin
[4/4/12 11:39:46 AM] Olga Caprotti: OC: look for funding
[4/4/12 11:40:19 AM] Cristina: CE: look for lexicon catalan-spanish-english
[4/4/12 11:41:11 AM] Olga Caprotti: RE: scenario of example based generation, check if existing swedish domain-specific language guide for instructors at uni
[4/4/12 11:41:25 AM] Cristina: MG: look for the terminology unit at upc
[4/4/12 11:41:59 AM] Olga Caprotti: OC: scenario for exploitation planning of MOLTO result: e-science multilingual data harvest
[4/4/12 11:43:32 AM] Olga Caprotti: MA: ask at språkbanken how specific domain terms make it into the dictionary

Comments

Terminology Service at UPC

Hi all!

Find below some links to the terminology service at UPC. The web page is in Catalan, let me know if you need some help to navigate it.

And also,

  • The UN multilingual terminology database (you probably already know, but just in case): http://unterm.un.org/

Cheers! Meritxell.

ToDo

[4/4/12 11:40:19 AM] Cristina: CE: look for lexicon catalan-spanish-english

The digital versions of the lexicons are here:

http://www.ub.edu/sl/ca/acollida/vobasics.php

There are also links to the terminology databases of several Catalan universities.