Adding voice output to gfsage

Description

OS X has voice output built in, usable from the shell through the say command. Several English voices are available, and you can download more for other languages.
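To get a feeling for say before involving gfsage, a small wrapper like the following may help. It is only a sketch: speak is a hypothetical helper name, not part of gfsage, and the fallback branch (printing the arguments when say is missing) is my own assumption so that the snippet also runs on systems other than OS X.

```shell
# speak: hypothetical helper, not part of gfsage.
# Uses `say` when it is available (OS X); otherwise prints its
# arguments so the snippet still runs elsewhere.
speak() {
  if command -v say >/dev/null 2>&1; then
    say "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Default system voice.
speak "The factorial of 5 is 120."

# Explicit voice through say's -v option (Alex is one of the English voices).
speak -v Alex "The factorial of 5 is 120."
```

Everything after the function name is passed unchanged to say, so any option that say accepts can be used here as well.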

Usage

You must build the system on mgl/sage as described previously.
Check that you have at least one voice for each of your preferred languages: go to System Preferences > Speech and click on System Voice.
Check that the voices you need are listed. If not, click Customize in the pop-up.
Select the ones you want and click OK. When the download finishes, you can run the tool.
You can call gfsage in 3 different ways, but for voiced output you must use the one with OPTIONS:

     gfsage                                        Use english
     gfsage LANGUAGE                               Use this language
     gfsage [OPTIONS]                              where OPTIONS are:
       -h          --help                          print this page
       -i INPUT    --input-lang=INPUT              Make queries in LANGUAGE
       -o OUTPUT   --output-lang=OUTPUT            Give answers in LANGUAGE
       -v[VOICE]   --voice[=VOICE]                 use voice output. To list voices use ? as VOICE.
       -F          --with-feedback                 Restate the query when answering.

The options relevant here are -v and -F. Use the first to select voice output. With no argument, it picks the first available voice for the selected OUTPUT language:

./gfsage -i english -v
Voiced by Agnes

... It will use Agnes as the English voice. Notice that if you do not give a -o option, the OUTPUT language is assumed to be the same as the INPUT language.

To list the available voices use:

./gfsage -i english -v?
Agnes, Albert, Alex, Bahh, Bells, Boing, Bruce, Bubbles, Cellos, Daniel, Deranged, Fred, Hysterical, Junior, Kathy, Princess, Ralph, Trinoids, Vicki, Victoria, Whisper, Zarvox

This lists the English voices. To use a specific voice, write:

./gfsage -i german -vYannick
Voiced by Yannick
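If you want to check outside gfsage which installed voices belong to a language, the output of say -v '?' can be filtered. The sketch below assumes that each line has the layout NAME, then a locale such as en_US, then # and a sample sentence; voices_for is a hypothetical helper name, and the layout may differ between OS X versions.

```shell
# voices_for: hypothetical helper, not part of gfsage.
# Reads lines in the assumed `say -v '?'` layout
#   NAME   LOCALE   # sample sentence
# on stdin and prints the names whose locale starts with $1.
voices_for() {
  awk -v lang="$1" '
    {
      sub(/[ \t]*#.*/, "")            # drop the sample sentence
      locale = $NF                    # last remaining field is the locale
      if (index(locale, lang) == 1) {
        sub(/[ \t]+[^ \t]+$/, "")     # drop the locale, keep the name
        print                         # names may contain spaces
      }
    }'
}

# Only meaningful where `say` exists (OS X); skipped elsewhere.
if command -v say >/dev/null 2>&1; then
  say -v '?' | voices_for de
fi
```

A name printed here should be usable as the VOICE argument of -v, provided gfsage takes its voices from the same list, which I have not verified.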

The -F option makes the system restate your query when answering. First, get a simple answer:

./gfsage -i english
Login into localhost at port 9000
Session ID is df7ad7c769f2faac68b6bb9489bb97e2
waiting... EmptyBlock 3
sage> compute the factorial of 5.
(4) 120
answer: it is 120 .

... and now the same with paraphrasing:

./gfsage -i english -F
Login into localhost at port 9000
Session ID is 88549994a28940fe0657eb9e506a5e84
waiting... EmptyBlock 3
sage> compute the factorial of 5.
(4) 120
answer: the factorial of 5 is 120 .

So, to experience voice output in its full glory you have to use both -v and -F.

Experiences with Google voice

Following a suggestion from Aarne, I found a Google service for speech input, but the experiments are not encouraging:

I recorded "Compute this" into an MPEG-4 audio file (compute.m4a) using QuickTime Player on the Mac.
Converted it to FLAC at a 16 kHz sample rate using:

sox compute.m4a compute.flac rate 16k
And sent it to the service with:

curl -H "Content-Type:audio/x-flac; rate=16000" "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US" -F "myfile=@compute.flac"

But got:

 `{"status":0,"id":"56bdb158dd66b25fc2e221364004e620-1","hypotheses":[{"utterance":"coffee lol","confidence":0.46219563}]}`
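When trying several recordings, it is handy to pull just the recognized text and its confidence out of such a reply. This is a rough sketch using sed: hypothesis_of is a hypothetical helper name, and the pattern assumes a single hypothesis per reply with no escaped quotes in the utterance, as in the response above. A real JSON parser would be more robust.

```shell
# hypothesis_of: hypothetical helper for replies shaped like the one above.
# Prints "utterance|confidence". Assumes exactly one hypothesis per line
# and no escaped double quotes inside the utterance.
hypothesis_of() {
  sed -n 's/.*"utterance":"\([^"]*\)".*"confidence":\([0-9.]*\).*/\1|\2/p'
}

response='{"status":0,"id":"56bdb158dd66b25fc2e221364004e620-1","hypotheses":[{"utterance":"coffee lol","confidence":0.46219563}]}'
printf '%s\n' "$response" | hypothesis_of
# prints: coffee lol|0.46219563
```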

Other examples:

"I like pickles" ⇒ "I like turtles"
"The determinant of x" ⇒ "new york" (with confidence 0.88!)
"Compute this" ⇒ "coffee lol"

Admittedly, I am not a native English speaker, but I expected better performance.