About a month ago, I got into a brief Twitter conversation with Tal Ater, the brilliant fellow who wrote Annyang. Sadly, Twitter does not do a great job of threading conversations, but some excellent points were made, and I wanted to share them with the Mojo Lingo audience.
You can read the whole exchange below, but I think there are a few key takeaways:
- Browser-based speech recognition is important, and will become even more important
- Browser-based speech recognition still has a way to go before it is universally available and as useful as it should be to developers (and, ultimately, to users)
- This area is getting a lot of attention, especially from developers
- We need browser makers to get this done!
- As speech recognition continues to get cheaper, more accurate, and more widely available, some really exciting products will become reality
Here is the entire conversation:
@bklang @benlangfeld Just watched a couple of your talks from @adhearsion. Very interesting. It was great seeing #annyang demoed on stage.
@bklang @benlangfeld If you've used it anywhere, I'd love to see it, as well as hear your feedback.
@TalAter @benlangfeld Thanks! Annyang is pretty cool. We've demoed it several times, but haven't deployed. Main problem: browser support
@bklang @benlangfeld Yes, browsers are slow to catch on... But I've seen resourceful hackers using it on anything from Arduino to AR Drones.
@TalAter @benlangfeld In this case, it's not just "catching on". Major problems to adoption: need a speech recognizer (and they aren't free)
@TalAter @benlangfeld Other problems: the spec isn't complete/ratified; Chrome implementation does not allow selecting alternate recognizer
@TalAter @benlangfeld I did find a partial @Firefox implementation from GSoC, but I don't think it was merged, probably for those reasons
@bklang @benlangfeld Major boons to adoption: Browsers are built by companies with deep pockets. Also, client side recognition is an option.
@TalAter @benlangfeld I wish I were that optimistic. Client-side would be poor quality and expensive to build/maintain: different API per OS
@TalAter @benlangfeld Server-side would need to be licensed, not-cheap at scale. ASR market has too few competitors esp for open-ended recog
@bklang @benlangfeld My hope is that once people get used to SR as a basic feature in their car, Android, Siri, Google Glass, they won't
@bklang @benlangfeld settle for anything less in their browser. It won't be just a fancy feature you can ignore, but a basic requirement.
@TalAter @benlangfeld Me too! I think good/cheap (enough) speech recog WILL eventually happen...just hope it's sooner rather than later
@bklang @benlangfeld and I do wish Chrome implemented grammar. I'm tired of it calling me Tall, and annyang not knowing its own name.
Thanks, Tal, for your insight, and thank you for Annyang. I look forward to continuing the conversation and working together for a better, more speech-enabled world.
Don't forget to follow both Tal (@TalAter) and me (@bklang) and join in the conversation!
The post A conversation on browser-based speech recognition appeared first on Mojo Lingo.