TODO

Gmail mode - uses all browser commands too

more commands for launching different apps

refine spoken feedback - only give spoken feedback when visual feedback is not immediate, such as when switching grammars. Or always give feedback when switching grammars, even if acknowledgements are off.

test on Windows - think about a way of managing and configuring Windows- and Linux-specific grammars
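
One possible way to manage platform-specific grammars is to keep a shared directory plus one directory per platform, and pick the extra directory from the `os.name` system property. A rough sketch only: the class name, method names, and the `grammars/...` directory layout are all made up for illustration, not part of the current code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: decides which grammar directories to load on this platform.
final class PlatformGrammars {

    // Maps an os.name value (e.g. "Windows XP", "Linux") to a short platform key.
    static String platformKey(String osName) {
        String os = osName == null ? "" : osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return "windows";
        }
        if (os.startsWith("linux")) {
            return "linux";
        }
        return "other";
    }

    // Shared grammars always load; platform grammars load only on a known platform.
    // The directory names are assumptions, not an existing layout.
    static List<String> grammarDirs(String osName) {
        List<String> dirs = new ArrayList<>();
        dirs.add("grammars/common");
        String key = platformKey(osName);
        if (!key.equals("other")) {
            dirs.add("grammars/" + key);
        }
        return dirs;
    }
}
```

At startup this would be called as `PlatformGrammars.grammarDirs(System.getProperty("os.name"))`.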

see if accuracy can be improved by tweaking the Sphinx configuration file and microphone settings.

dictation mode - try creating a language model from sent emails and IM logs

commands for time and date status - probably spoken status is best
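
A spoken status could be as simple as building one sentence and handing it to whatever speech output is already in use. A minimal sketch, assuming `java.time` is available; the class name and the exact wording of the sentence are placeholders.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper: builds the sentence to be spoken for a time/date status command.
final class SpokenStatus {
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("h:mm a", Locale.US);
    private static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("EEEE, MMMM d", Locale.US);

    // Returns text such as "It is <time> on <weekday>, <month> <day>".
    static String timeAndDate(LocalDateTime now) {
        return "It is " + now.format(TIME) + " on " + now.format(DATE);
    }
}
```

The command handler would call `SpokenStatus.timeAndDate(LocalDateTime.now())` and pass the result to the speech synthesizer.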

Google search mode - want to be able to speak a list of words; the language model is not very important

numbers mode - "one hundred two", etc.
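
Once the recognizer returns the number words, they still have to be turned into digits before being typed. A minimal sketch of that conversion, assuming English number words up to the millions; the class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: converts recognized number words into a numeric value.
final class SpokenNumbers {
    private static final Map<String, Integer> SMALL = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six",
            "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) {
            SMALL.put(units[i], i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty",
            "seventy", "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            SMALL.put(tens[i], (i + 2) * 10);
        }
    }

    // Parses phrases such as "one hundred two" or "twenty-one".
    static long parse(String phrase) {
        long total = 0;    // completed thousand/million groups
        long current = 0;  // group currently being built (0..999)
        for (String word : phrase.trim().toLowerCase(Locale.ROOT).split("[\\s-]+")) {
            if (word.equals("and")) {
                continue;  // "one hundred and two"
            }
            Integer small = SMALL.get(word);
            if (small != null) {
                current += small;
            } else if (word.equals("hundred")) {
                current = (current == 0 ? 1 : current) * 100;
            } else if (word.equals("thousand")) {
                total += (current == 0 ? 1 : current) * 1000L;
                current = 0;
            } else if (word.equals("million")) {
                total += (current == 0 ? 1 : current) * 1000000L;
                current = 0;
            } else {
                throw new IllegalArgumentException("not a number word: " + word);
            }
        }
        return total + current;
    }
}
```

With this, `SpokenNumbers.parse("one hundred two")` yields 102. Digit-by-digit dictation ("one zero two") is not handled and would need its own rule.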

automatic loading of app-specific grammar based on the currently focused window. Look into GNOME AT-SPI (Assistive Technology Service Provider Interface), which also has a CORBA interface that may be accessible via Java CORBA bindings, and GOK (GNOME On-screen Keyboard). If JNI is needed to access some C/C++ code, it looks like SWIG can generate the JNI code.
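
However the focused window ends up being detected, the grammar lookup itself can stay independent of AT-SPI. A rough sketch of that lookup, assuming the focus listener can supply an application or window name as a string; the class, the method names, and the grammar names in the usage line are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: picks a grammar name for the currently focused application.
// Focus detection itself (AT-SPI or otherwise) is deliberately left out.
final class GrammarSelector {
    private final Map<String, String> grammarByApp = new LinkedHashMap<>();
    private final String defaultGrammar;

    GrammarSelector(String defaultGrammar) {
        this.defaultGrammar = defaultGrammar;
    }

    // Registers a case-insensitive fragment of the application name.
    GrammarSelector register(String appNameFragment, String grammarName) {
        grammarByApp.put(appNameFragment.toLowerCase(Locale.ROOT), grammarName);
        return this;
    }

    // Returns the first registered grammar whose fragment occurs in the name,
    // or the default grammar when nothing matches.
    String grammarFor(String focusedAppName) {
        if (focusedAppName == null) {
            return defaultGrammar;
        }
        String name = focusedAppName.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : grammarByApp.entrySet()) {
            if (name.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return defaultGrammar;
    }
}
```

For example, `new GrammarSelector("desktop").register("firefox", "browser")` would return "browser" for a window named "Mozilla Firefox" and "desktop" for anything unregistered.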

switch to a new version control system - bazaar-ng, arch, darcs, subversion, or CVS

addenda dictionary for custom words -- look into voxapl usage in the dictionary section of conf/sphinx.xml - only supported in the CVS version of S4... don't want to require that just yet.

define a word/command - say 'define word' and you are prompted to spell the word, which is basically a key sequence. Once the spelling is correct, you say 'all done'. It then prompts for the 'sound' of the word, and you say the word 3 times. Using the pronunciation code that Will Walker has, it generates the phone sequence for the word. That phone sequence is added to the custom dictionary along with an ASCII version of the word (since non-printing keys may be involved if it is a command). A corresponding entry in the command grammar is created with the exact sequence of keys. This could be used to add proper names or shortcut commands to the dictionary on the fly. In the case of proper names, hopefully there is a way to add them to the dictation language model, at least in a crude way, even if the probabilities aren't there.
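
The dictionary step could look roughly like the sketch below. It assumes a CMU-style text dictionary (one entry per line: the word, whitespace, then space-separated phones) and takes the phones as input, since the pronunciation code is not shown here. The class name, method names, and the example phones are illustrative only; whether the token should be upper or lower case depends on the dictionary in use.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: formats one custom-dictionary entry for a newly defined word.
final class CustomWordEntry {

    // Reduces the spoken name to a plain ASCII token, replacing runs of
    // anything other than letters, digits, or apostrophes with "_".
    static String asciiToken(String spokenName) {
        String token = spokenName.trim().toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9']+", "_");
        return token.replaceAll("^_+|_+$", "");
    }

    // Builds a CMU-style dictionary line: token, whitespace, phones.
    static String dictionaryLine(String spokenName, List<String> phones) {
        if (phones.isEmpty()) {
            throw new IllegalArgumentException("no phones for " + spokenName);
        }
        return asciiToken(spokenName) + "  " + String.join(" ", phones);
    }
}
```

The matching command-grammar entry, which pairs the token with the exact key sequence, is left out because its format depends on how the grammars are stored.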

gui?