Submitted by olga.caprotti on 18 April, 2013 - 11:41.
Submitted by olga.caprotti on 24 September, 2012 - 13:45.
Submitted by olga.caprotti on 13 July, 2012 - 16:54.
Submitted by olga.caprotti on 20 June, 2012 - 11:21.
I have been looking at ways of extracting marked-up text and mathematics from scans of old mathematics books, in particular from "A synopsis of elementary results in pure mathematics: containing propositions, formulæ, and methods of analysis, with abridged demonstrations. Supplemented by an index to the papers on pure mathematics which are to be found in the principal journals and transactions of learned societies, both English and foreign, of the present century" (1886), available at http://archive.org/details/synopsisofelemen00carrrich.
When I try a simple cut&paste from the PDF of the book (the DjVu version is still to be downloaded), say of page 27, I get:
INDEX TO TROPOSITIONS OF EUCLID REFERRED TO IX THIS WORK.
Tho references to Euclid are made in Koinan and ^Vrabic numerals ; e.g. (VI. 19).
BOOK T.
I. 4.—Triaui^'los arc equal and similar if two sides and the included an<^le of each are equal each to each.
I. 5.—The angles at the base of an isosceles triangle are equal.
1. 0.—The converse of 5.
I. 8.—Triangles are equal and similar if tlie tliroe sides of eacli arc
ecjual each to each.
I. IT). —The exterior angle of a triangle is grojiter than the interior
and opposite.
I. 20.—Twosidesofatrianglearegreaterthanthethird.
I. 26.—Triangles are equal and similar if two angles and one corres-
ponding side of each are equal each to each.
I. 27.—Two straight lines are parallel if tlicy make equal alternate
angles with a third line. I. 29.—Theconverseof27.
I wonder how much of this could be corrected automatically.
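As a first experiment, some of the garbled words above can be repaired by undoing typical OCR confusions and checking the result against a word list. The sketch below is only an illustration under loud assumptions: the tiny `VOCABULARY`, the `CONFUSIONS` table and the function names are all made up for this example, and a real system would need a full lexicon and a confusion model learned from data.

```python
import re

# Hypothetical mini-lexicon; a real run would load a full English word list.
VOCABULARY = {
    "the", "three", "sides", "of", "each", "are", "equal", "to",
    "they", "if", "triangles", "and", "similar",
}

# Assumed OCR confusions (misread -> intended), guessed from the sample above.
CONFUSIONS = [("li", "h"), ("cj", "q"), ("o", "e"), ("c", "e")]

def candidates(word, max_depth=2):
    """Yield variants of word obtained by undoing up to max_depth confusions."""
    seen = {word}
    frontier = [word]
    for _ in range(max_depth):
        next_frontier = []
        for current in frontier:
            for wrong, right in CONFUSIONS:
                start = current.find(wrong)
                while start != -1:
                    variant = current[:start] + right + current[start + len(wrong):]
                    if variant not in seen:
                        seen.add(variant)
                        next_frontier.append(variant)
                        yield variant
                    start = current.find(wrong, start + 1)
        frontier = next_frontier

def correct_word(word):
    """Return the first candidate found in the vocabulary, else the word unchanged."""
    lower = word.lower()
    if lower in VOCABULARY:
        return word
    for variant in candidates(lower):
        if variant in VOCABULARY:
            return variant
    return word

def correct_line(line):
    """Correct every alphabetic token in a line, leaving punctuation untouched."""
    return re.sub(r"[A-Za-z]+", lambda m: correct_word(m.group(0)), line)

# "arc" is only repaired because the toy vocabulary does not contain it.
print(correct_line("if tlie tliroe sides of eacli arc ecjual each to each."))
# prints: if the three sides of each are equal each to each.
```

Note that "arc" is itself an English word, so with a realistic lexicon it would pass unnoticed; catching such errors needs context, which is where a grammar could help.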
Quickly reading through the tech report
WA Woods…, Semantics and quantification in natural language question answering, Advances in computers, 1978 (PDF at stanford.edu)
The history of communication between man and machines has followed a path of increasing provision for the convenience and ease of communication on the part of the human. From raw binary and octal numeric machine languages, through various symbolic assembly ...
led me to the final section on Loose Ends. It seems worth a second expert look to decide whether the MOLTO tools can tackle such issues, if they are still open.
Any volunteers? I believe this is an enjoyable read.
Just now, in private correspondence between Jordi and myself, I was pointing out the word problem mentioned in the Wikipedia page http://en.wikipedia.org/wiki/Brahmagupta:
In chapter twelve of his Brahmasphutasiddhanta, Brahmagupta finds Pythagorean triples,
12.39. The height of a mountain multiplied by a given multiplier
is the distance to a city; it is not erased. When it is divided
by the multiplier increased by two it is the leap of one of the
two who make the same journey.[9]
or in other words, for a given length m and an arbitrary multiplier x,
let a = mx and b = m + mx/(x + 2).
Then m, a, and b form a Pythagorean triple.[9]
Jordi noticed that for m = 3 and x = 1 we get a = 3 and b = 3 + 3/3 = 4, but (m, a, b) = (3, 3, 4) is not a Pythagorean triple.
The symbolic transliteration from the English (we assume) is incorrect. Will it ever be possible to set up such a system of equations automatically? What did the original say? Is it easier to comprehend and convert to a system of equations?
Asked Shafqat and Prasad to try to find the original.
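Jordi's check is easy to automate. The sketch below uses exact rational arithmetic to test the claim exactly as worded in the quoted passage, and then one alternative reading of the verse. The alternative is only an assumption made for this illustration, not a confirmed reading of the original: if both travellers cover the same total distance, the legs of the right triangle would be a and b, and the hypotenuse would be m + a - mx/(x + 2). All function names are invented here.

```python
from fractions import Fraction

def brahmagupta_values(m, x):
    """Return (a, b, leap) for height m and multiplier x, as in the quoted passage."""
    m, x = Fraction(m), Fraction(x)
    a = m * x                  # distance to the city
    leap = m * x / (x + 2)     # the "leap" of the second traveller
    b = m + leap
    return a, b, leap

def is_right_triangle(leg1, leg2, hypotenuse):
    """Exact test of leg1^2 + leg2^2 == hypotenuse^2."""
    return leg1 ** 2 + leg2 ** 2 == hypotenuse ** 2

def claim_as_worded(m, x):
    """Do m, a and b themselves form a Pythagorean triple?"""
    a, b, _ = brahmagupta_values(m, x)
    return is_right_triangle(*sorted([Fraction(m), a, b]))

def alternative_reading(m, x):
    """Assumed reading: legs a and b, hypotenuse m + a - leap."""
    a, b, leap = brahmagupta_values(m, x)
    return is_right_triangle(a, b, Fraction(m) + a - leap)

print(claim_as_worded(3, 1))      # False: 3, 3, 4 is Jordi's counterexample
print(alternative_reading(3, 1))  # True: 3, 4, 5
```

Under the assumed reading, m = 3 and x = 1 give the familiar 3, 4, 5 triangle, which suggests the error lies in the transliteration rather than in the verse, but only the original text can settle that.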
Submitted by olga.caprotti on 24 April, 2012 - 12:17.
Comments
Robust parsing of syntax
While working on the Geography assignment, I spotted a sentence with no tense in one of the leading encyclopedias in Italy.
A web service detecting this kind of problem could be used in many places: we often think more quickly than we type, and some parts of the sentence get trapped by the keyboard.
About detection of inconsistencies in natural language
The article "Reality checker: How to cut nonsense from the net" shows an interesting application field where advanced language technologies could be applied.
Read the full article: Reality checker: How to cut nonsense from the net - tech - 19 September 2012 - New Scientist.
Venice like Amsterdam or Stockholm
They belong to the world (are they UNESCO heritage sites?), and when city planners decide to make a major change to their landscape, the world should be informed. I read a news item about this today in the Italian papers. It made me look for the planners' renderings of the building intervention that is causing the political debate. I have not found them yet, but I did find a nice interface to browse past, present and future city works; alas, it is not multilingual.
See, Venice is a fat boot-shape, like Italy but shorter.
I think this fits very well in the SRA of Meta-NET. It is a perfect example of a local situation which might enrage people beyond the national borders.
Semantic equivalence of multilingual washing instructions
I was about to hand wash my fancy dress with metal fiber fabric so I decided to check the washing instructions.
It is surprising to note that the Spanish version has a negative description, mentioning a compound that should be avoided, while the others just say what to use. It is hard to imagine what the abstract grammar for that would be; my guess is that it needs quite a bit of context to work. AceWiki? :)
BTW, I would love to know if you spot similar examples in those fancy new sportswear fabrics.
About the multilingual web
How are translations of web sites contributed these days?
Check e.g. http://www.librarything.com/about-translation.php; they seem to have a few translations going.
Especially check the inspiration links, for instance Google in your Language
Improving OCR software output
I have been looking at ways of extracting marked-up text and mathematics from scans of old mathematics books, in particular from "A synopsis of elementary results in pure mathematics: containing propositions, formulæ, and methods of analysis, with abridged demonstrations. Supplemented by an index to the papers on pure mathematics which are to be found in the principal journals and transactions of learned societies, both English and foreign, of the present century" (1886), available at http://archive.org/details/synopsisofelemen00carrrich.
When I try a simple cut&paste from the PDF of the book (the DjVu version is still to be downloaded), say of page 27, I get heavily garbled text, for example "Tho references to Euclid are made in Koinan and ^Vrabic numerals".
I wonder how much of this could be corrected automatically.
Loose ends from LUNAR
Quickly reading through the tech report by WA Woods…, Semantics and quantification in natural language question answering (Advances in computers, 1978), led me to the final section on Loose Ends. It seems worth a second expert look to decide whether the MOLTO tools can tackle such issues, if they are still open.
Any volunteers? I believe this is an enjoyable read.
Firefox plugin for predictive typing
I just installed the Austrian German dictionary for Firefox. It spell-checks what I type, e.g. a comment on Facebook.
Along the same lines, shouldn't we be able to build a predictive typing plugin too?
https://developer.mozilla.org/En/Firefox_addons_dev_guide see Chapter 5.
Bootstrapping semantic frame alignments with GF?
See the inspiring talk by Dekai at EAMT2012 on doing this with ITG and LTG analysis for English-Chinese.
Also see
http://www.cs.ust.hk/~dekai/ssst
Asked Dana.
Ambitious checking of correctness of Wikipedia pages
Just now, in private correspondence between Jordi and myself, I was pointing out the word problem mentioned in the Wikipedia page http://en.wikipedia.org/wiki/Brahmagupta, which states that for a given length m and an arbitrary multiplier x, with a = mx and b = m + mx/(x + 2), the numbers m, a, and b form a Pythagorean triple.
Jordi noticed that for m = 3 and x = 1 we get a = 3 and b = 3 + 3/3 = 4, but (m, a, b) = (3, 3, 4) is not a Pythagorean triple.
The symbolic transliteration from the English (we assume) is incorrect. Will it ever be possible to set up such a system of equations automatically? What did the original say? Is it easier to comprehend and convert to a system of equations?
Asked Shafqat and Prasad to try to find the original.
Correcting the use of tenses in subordinate sentences
Languages like Italian have strict rules on the use of the proper conjunctive tense form in subordinate sentences. These rules are very frequently broken, maybe because of poor grammar education in primary school, maybe because the language is evolving towards an easier, simplified but less elegant form, maybe because it is now being spoken by non-native speakers. Nevertheless, seeing blatant errors published online by newspapers contributes to the degrading quality of the language. Could a generic GF parser (such as the one written by Malin for Swedish) be useful for checking tense agreement?
Below is a screenshot of the example that prompted this comment.
Are there any other languages with such features? Is Swedish one of them?
Computable Document Format (CDF) documents made multilingual
Would that be possible?
eScience user group, eLearning.
See more at
http://www.wolfram.com/training/special-event/wolfram-cdf-virtual-worksh...
Correct short scientific text in online applets
Over and over one happens to be reading grammatically incorrect English. It just happened now with the otherwise very nice applet http://htwins.net/scale2/scale2.swf?bordercolor=white.
It needs some playing with to check the kind of language used in these short popular-science descriptions.
Translating its language file would turn this applet into a nice multilingual learning resource.
Generate flashcards
I believe it would be easy to use GF grammars to generate flashcards, the kind used by students to review a subject area, or a language, or definitions in a subject. I can imagine also using the KRI to make more complicated flash cards just by enumerating the questions, printing them on one side, and printing the answers on the reverse side.
There must be mobile apps available that take some standard input format for flashcards.
Devices like game consoles are another option (I am thinking of the Nintendo DS and similar ones that already have language learning applications, see e.g. the My Coach series).
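A minimal sketch of the enumeration idea, assuming the question/answer pairs have already been produced (for instance by linearizing GF trees in two languages; that step is not shown). The sample entries and the function name `write_flashcards` are invented for this illustration, and tab-separated text is only an assumed interchange format; check what a given flashcard app actually imports.

```python
import csv
import io

# Hypothetical (front, back) pairs standing in for generated questions and answers.
CARDS = [
    ("der Hund", "the dog"),
    ("die Katze", "the cat"),
    ("das Haus", "the house"),
]

def write_flashcards(cards, stream):
    """Write one card per line: front side, a tab, then the reverse side."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    for front, back in cards:
        writer.writerow([front, back])

buffer = io.StringIO()
write_flashcards(CARDS, buffer)
print(buffer.getvalue(), end="")
```

Writing to a stream rather than a fixed file name keeps the same function usable for a file on disk, a web response, or a test.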
Sample code
There is some sample code for flashcards
https://github.com/aliok/flashcards
http://bit.ly/deFlashcardsSource
It is a modular application, and possibly we need only look into
flashcards-data-service: Data provider for the application. Python + Google App Engine project.
The service responds with a specified number of words (currently 100), along with their articles and English translations, and a user key. If the same user key is given to the service the next time, the service returns the next set. If no user key is passed, it returns the first set.
The dictionary is static Python code. The flashcards-word-set-generator module generates that static Python dictionary code, see https://github.com/aliok/flashcards/blob/master/flashcards-data-service/...
Turning it into an online service just requires changing the function getEntrySetForUser in https://github.com/aliok/flashcards/blob/master/flashcards-data-service/... to fetch randomly generated entries from our lexicon resources (this could include more complicated sentence patterns than what the current dictionary supports).
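The paging behaviour described above can be sketched as follows. This is not the code from the repository: only the name `getEntrySetForUser` comes from the source, while `EntryService`, `entry_source` and the in-memory bookkeeping are assumptions made for illustration. The entry source is passed in as a function, so a static dictionary could later be swapped for randomly generated entries.

```python
import uuid

class EntryService:
    """Toy stand-in for the data service: hands out entries in fixed-size sets."""

    def __init__(self, entry_source, count=100):
        self.entry_source = entry_source  # callable: (offset, count) -> list of entries
        self.count = count
        self.next_offset = {}             # user key -> offset of the next set

    def getEntrySetForUser(self, user_key=None):
        """Return (entries, user_key); a missing or unknown key starts at the first set."""
        if user_key is None or user_key not in self.next_offset:
            user_key = user_key or uuid.uuid4().hex
            self.next_offset[user_key] = 0
        offset = self.next_offset[user_key]
        entries = self.entry_source(offset, self.count)
        self.next_offset[user_key] = offset + len(entries)
        return entries, user_key

# Hypothetical static dictionary: (article, word, English translation).
WORDS = [("der", "Hund", "dog"), ("die", "Katze", "cat"), ("das", "Haus", "house")]

def static_source(offset, count):
    """Slice the static dictionary; a generated source would have the same signature."""
    return WORDS[offset:offset + count]

service = EntryService(static_source, count=2)
first, key = service.getEntrySetForUser()      # no key: first set plus a new key
second, _ = service.getEntrySetForUser(key)    # same key: next set
print(first)
print(second)
```

Replacing `static_source` with a function that draws entries from a lexicon would leave the paging logic untouched, which is the kind of change the paragraph above has in mind.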