For several months I have wanted a cross-browser voice recognition system that doesn’t rely on a server, browser plugins or extensions, or external programs like Flash: something that could continually listen for keywords and trigger a function when one is detected. I looked into the webkitSpeechRecognition() object in Chrome, but unfortunately that relies on Google servers and is only available in Chrome. I looked into building extensions and plugins for Firefox and Chrome that package CMU Sphinx, but that is not native code. I even got voice recognition working in Flash, but wasn’t happy because it didn’t work on my Android device.
After months of looking, I have found one that fits the bill completely and is really awesome: PocketSphinx.js.
It gets its audio input from the navigator.getUserMedia() function, which gives a developer access to the webcam and microphone. PocketSphinx.js then processes speech the same way native PocketSphinx does, comparing captured audio to the phonemes listed for each word in its dictionary.
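To make the audio capture step concrete, here is a minimal sketch of requesting microphone access. The helper name getMicrophoneStream and its injectable nav parameter are my own inventions for illustration and are not part of PocketSphinx.js; the sketch tries the promise-based navigator.mediaDevices.getUserMedia() first and falls back to the older callback-style and vendor-prefixed functions.

```javascript
// Hypothetical helper: resolves with a microphone MediaStream, or rejects
// when no getUserMedia implementation can be found.
// `nav` is injectable so the function can be exercised outside a browser.
function getMicrophoneStream(nav) {
  nav = nav || (typeof navigator !== 'undefined' ? navigator : {});

  // Newer, promise-based API.
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
    return nav.mediaDevices.getUserMedia({ audio: true });
  }

  // Older, callback-based API, including vendor-prefixed variants.
  var legacy = nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia;
  if (typeof legacy === 'function') {
    return new Promise(function (resolve, reject) {
      legacy.call(nav, { audio: true }, resolve, reject);
    });
  }

  return Promise.reject(new Error('getUserMedia is not supported in this browser'));
}
```

In a browser, the resolved stream would typically be passed to the Web Audio API (for example through createMediaStreamSource() on an AudioContext) so that the samples can be handed to the recognizer.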
The process for setting it up is pretty simple; it only took me a few hours to learn about phonemes and get the basics working. There were some speed bumps along the way, but overall the system seems to be working pretty well.
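For anyone new to phonemes, here is a rough sketch of what a small keyword dictionary might look like. It assumes each entry is a [word, pronunciation] pair written with the stress-free phoneme symbols from the CMU Pronouncing Dictionary; the exact format PocketSphinx.js expects should be checked against its documentation. The example words and the findInvalidEntries helper are made up for illustration.

```javascript
// The 39 stress-free phoneme symbols used by the CMU Pronouncing Dictionary.
var CMU_PHONEMES = new Set((
  'AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG ' +
  'OW OY P R S SH T TH UH UW V W Y Z ZH'
).split(' '));

// Illustrative keyword list: each entry is [WORD, "space-separated phonemes"].
var words = [
  ['HELLO', 'HH AH L OW'],
  ['COMPUTER', 'K AH M P Y UW T ER'],
  ['STOP', 'S T AA P']
];

// Hypothetical helper: returns the entries whose pronunciation contains
// a symbol that is not in the phoneme set above.
function findInvalidEntries(entries) {
  return entries.filter(function (entry) {
    return !entry[1].trim().split(/\s+/).every(function (phoneme) {
      return CMU_PHONEMES.has(phoneme);
    });
  });
}
```

Running a check like this before loading the words can catch typos in pronunciations early, which is one of the easier speed bumps to avoid.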
Because this library relies on the navigator.getUserMedia() function, it is only supported on Chrome, Firefox, and Opera. There has yet to be word from Apple about when they will support it, and Microsoft is trying to push its own version of the
For now, I am using PocketSphinx.js to do hotword detection for a natural language question/answer system, but I can see using it on larger projects to allow people to navigate my sites by voice commands alone.
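As a rough illustration of hotword detection, the sketch below maps keywords to handler functions and fires a handler whenever its keyword newly appears. It assumes the recognizer reports a cumulative hypothesis string that grows as more words are recognized; that assumption, along with the createHotwordDispatcher name and the sample hotwords, is mine and not taken from the library.

```javascript
// Hypothetical dispatcher: maps upper-case hotwords to handler functions and
// fires a handler each time its hotword newly appears in the hypothesis.
function createHotwordDispatcher(handlers) {
  var seenCount = 0; // number of hypothesis words already processed

  return function onHypothesis(hypothesis) {
    var tokens = hypothesis.trim().toUpperCase().split(/\s+/).filter(Boolean);

    // A shorter hypothesis is treated as a sign that recognition restarted.
    if (tokens.length < seenCount) seenCount = 0;

    var fired = [];
    tokens.slice(seenCount).forEach(function (word) {
      if (Object.prototype.hasOwnProperty.call(handlers, word)) {
        handlers[word]();
        fired.push(word);
      }
    });
    seenCount = tokens.length;
    return fired; // the hotwords triggered by this update
  };
}
```

A dispatcher created with, say, { COMPUTER: startListening, STOP: stopListening } (both handlers hypothetical) could then be called every time the recognizer reports a new hypothesis, and the same pattern would extend to site navigation commands.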