SoundHound AI Inc. introduced Dynamic Interaction, a category-level breakthrough in conversational AI for human-computer interaction that not only recognizes and understands speech, but also responds and acts in real time.
Where existing voice technology requires wake words and relies on turn-taking to process requests, Dynamic Interaction combines two technologies: fragment parsing, which breaks speech down into partial utterances and processes them in real time, and full-duplex audio-visual integration. Together they create an instantaneous, next-generation experience.
In customer service settings, like ordering food at a restaurant, this means that customers won’t have to speak in a slow or unnatural way in order to be understood. They can communicate just as if they were talking to a human, receive instant responses, and customize and edit a food order “live” as they go.
Other features include:
- Completely ignores off-topic speech, responding only to domain-specific topics such as the items on a menu.
- Confirms requests through continuous, multimodal feedback, via audio and visuals, “live” as the customer engages with a device or service, giving firm reassurance that an order or request has been understood accurately.
- Makes proactive suggestions based on a real-time interpretation of the user's speech, like a dessert menu popping up onscreen when a customer says “for dessert I’ll have…”
- Accepts input via voice and touch interface interchangeably and simultaneously.
- Responds with audio and visual output, and intelligently decides when to speak to the user versus simply updating the visual output.
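To make the idea of fragment parsing concrete, here is a minimal, illustrative sketch of an order that is updated as each partial utterance arrives, with off-topic words ignored. It is not SoundHound's implementation: the menu, the keyword sets, and the `LiveOrder` class are all hypothetical names invented for this example, and simple keyword matching stands in for real speech understanding.

```python
from dataclasses import dataclass, field

# Hypothetical domain vocabulary (assumption, not from the announcement).
MENU = {"burger", "fries", "cola", "sundae"}
NUMBERS = {"a": 1, "one": 1, "two": 2, "three": 3}
REMOVE_WORDS = {"remove", "cancel"}


@dataclass
class LiveOrder:
    """Order state that is updated after every speech fragment."""
    items: dict = field(default_factory=dict)
    pending_qty: int = 1      # quantity heard but not yet attached to an item
    removing: bool = False    # True after a removal keyword is heard

    def feed(self, fragment: str) -> dict:
        """Consume one partial utterance and return a snapshot of the order."""
        for raw in fragment.lower().split():
            word = raw.strip(".,!?")
            if word in NUMBERS:
                self.pending_qty = NUMBERS[word]
            elif word in REMOVE_WORDS:
                self.removing = True
            else:
                item = self._match_item(word)
                if item is None:
                    continue  # off-topic or filler word: ignore it
                if self.removing:
                    self.items.pop(item, None)
                else:
                    self.items[item] = self.items.get(item, 0) + self.pending_qty
                # Reset modifiers once they have been applied to an item.
                self.pending_qty, self.removing = 1, False
        return dict(self.items)

    @staticmethod
    def _match_item(word: str):
        """Return the menu item for a word, allowing a simple plural 's'."""
        if word in MENU:
            return word
        if word.endswith("s") and word[:-1] in MENU:
            return word[:-1]
        return None


order = LiveOrder()
order.feed("um can I get two")           # quantity is held until an item arrives
order.feed("burgers and a cola")         # order now has 2 burgers and 1 cola
order.feed("nice weather today")         # off-topic: order is unchanged
order.feed("actually remove the cola")   # live edit: cola is removed
```

In this sketch the order state can be rendered on screen after every call to `feed`, which mirrors the "live" visual confirmation described above; a production system would work from streaming speech recognition output rather than whitespace-split text.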
This new technology has broad applicability to many industries, especially across customer service and employee productivity use cases. For the restaurant industry, which is facing unprecedented staffing challenges, the need to automate and gain efficiencies is especially pressing. As its first test ground, Dynamic Interaction will offer restaurants smart, accurate support for voice ordering at drive-thrus, kiosks, and via smartphone, tablet, and desktop ordering platforms.
“...This technology is incredibly user-friendly and precise. Consumers won’t have to modify how they speak to the voice assistant to get a useful response – they can just speak as naturally as they would to a human. As an added bonus they’ll also have the means to instantly know and edit registered requests,” said Keyvan Mohajer, Co-Founder and CEO of SoundHound. “In our 17-year history of developing cutting-edge voice AI, this is perhaps the most important technical leap forward. We believe, just like how Apple's multi-touch technology leapfrogged touch interfaces in 2009, this is a significant disruption in human-computer interfaces.”