We’ve been hearing for a while now that voice interfaces are set to transform our daily lives. But for those of us who are sceptical of Siri or annoyed with Alexa, that’s kind of hard to believe.
What are the voice interfaces everyone’s talking about, and are they really worth the buzz?
While Gen Z might be ready to embrace voice commands, my generation is used to old-fashioned typing — a habit that will be hard to shift. But however cynical I’ve been in the past, it’s impossible to ignore the potential of voice control. So let’s give in and weigh up the pros and cons.
We all have something we consider “normal,” and this state of normality changes very fast. A year ago, queuing in a store was totally “normal”; today it’s borderline unacceptable. Five years ago, a person yelling “show me the weather” at a phone or a speaker seemed crazy. Today, it’s us.
The same goes for controlling your devices with voice commands. For some of us, normal life doesn’t include voice control, but is it going to stay this way? Probably not. Generations Z and Alpha will likely be the last to type their messages.
Andrew Ruegger from Marketing Land said in his recent column that the next generation will rely on voice control alone. It’s a shift that’s speeding up by the day, with searches like “Hey Google” starting to show up more frequently in search engine stats.
Technology makes an appearance
Getting to the point where we can casually ask our Apple Watches for nearby dinner recommendations is no small feat. It requires integrating AI-driven natural language processing, speech recognition, computing horsepower, and wireless networking, to name just a few building blocks. It really is rocket science.
And yet, we’re just starting to see the potential of these technologies. Voice is the ultimate user interface because it’s less a UI than an essential part of our daily lives and how we communicate.
Voice-enabled machines learn to adapt to our natural behaviors and not the other way around. Kids love joking with Siri — nobody clowns around with a keyboard.
So why is adoption speeding up now? The answer is plain and simple: COVID-19. The pandemic was the turning point from “gradual” to “sudden” voice adoption as people began to stay away from poking screens and buttons.
Not only that, but as people found themselves stuck at home during quarantine, they had more time to play around with readily available smartphones, apps, and cloud-based technology.
Add to that the desire for human connection beyond text, plus fatigue from too many video conference calls, and you’ve got the perfect conditions for greater adoption of voice technology.
Statistics show that lockdown has dramatically boosted smart speaker usage among US smartphone users. Since the outbreak, 35% of smart device owners say they’re listening to more news and information through their device, and 36% say they’ve increased their consumption of music and entertainment. These figures are even higher when looking at the 18-34 age group.
There are different types of voice technology:
- Automated speech recognition (ASR): essentially, turning speech into digital data, like text. Siri and Alexa are both based on this technological approach.
- Speech synthesis (text-to-speech): converting text into voice. That’s what people on TikTok are obsessed with.
- Voice recognition: verifying your identity based on a voice signal.
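To make that last item a little more concrete, here is a minimal sketch of how voice verification is often framed: the system keeps a numeric “voiceprint” from when you enrolled and checks how closely a new sample matches it. Everything here is an assumption for illustration. The three-number vectors, the function names, and the 0.8 threshold are made up, and a real system would derive much longer vectors from audio using a trained model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no voice information
    return dot / (norm_a * norm_b)


def is_same_speaker(enrolled, sample, threshold=0.8):
    """Accept the sample if it is close enough to the enrolled voiceprint.

    The threshold is a hypothetical value chosen for this example.
    """
    return cosine_similarity(enrolled, sample) >= threshold


# Hypothetical voiceprints (real ones have hundreds of dimensions).
enrolled = [0.9, 0.1, 0.4]
caller = [0.85, 0.15, 0.38]   # sounds like the enrolled customer
impostor = [0.1, 0.9, 0.2]    # sounds like someone else

print(is_same_speaker(enrolled, caller))
print(is_same_speaker(enrolled, impostor))
```

Where the threshold sits is a trade-off: set it higher and fewer impostors get through, but more genuine callers are asked to repeat themselves.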
So how’s it used?
A voice interface is a software feature that frees up your hands and eyes and makes life easier, for example when you get the urge to find out how old Kanye West is while driving a car :)
Voice technology is being actively adopted in media, finance, production, education, and other sectors. For example, in personal banking, voice recognition is used to validate a client’s identity and reduce the risk of phone fraud.
The movie industry is using speech recognition to save time creating subtitles. Voice control gives blind users and people with physical limitations tools to use smartphones more effectively.
Voice is also becoming a core element of websites from 2020 on, and supporting it will set high-tech websites apart from all the others.
What are the voice interface characteristics, and how is it different from the standard visual version?
Specialists from the Nielsen Norman Group point out five main features of voice user interfaces.
- Voice input: we all know how this works. “Okay Google” and “Hey Siri” are the prime examples.
- Natural language: it’s not just voice control but also communicating with a device the way we usually do in daily life. Especially useful in social isolation!
- Voice output: information is spoken aloud instead of being displayed on the screen.
- Smart interpretation: to understand the user better, the technology considers the current environment and past searches and requests.
- Collaboration: the voice interface will act without us even having to ask for it. Basically, the devices will finally read our minds. The future is here.
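The “smart interpretation” idea is easier to picture with a toy example. The sketch below is not how any real assistant works; the `Context` class, the `interpret` function, and the keyword matching are all hypothetical. It only shows the principle: when a request leaves something out, the gap is filled from past requests first and from the current environment second.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    """What the assistant already knows (hypothetical fields)."""
    device_city: str                  # current environment
    last_city: Optional[str] = None   # taken from past requests


def interpret(utterance, ctx):
    """Turn a transcribed utterance into a simple intent dictionary."""
    text = utterance.lower().strip(" ?!.")
    if "weather" not in text:
        return {"intent": "unknown"}
    if " in " in text:
        # The user named a place explicitly; remember it for later.
        city = text.split(" in ", 1)[1].title()
        ctx.last_city = city
    else:
        # Nothing named: fall back to the last city, then the device location.
        city = ctx.last_city or ctx.device_city
    return {"intent": "get_weather", "city": city}


ctx = Context(device_city="Berlin")
print(interpret("Show me the weather", ctx))
print(interpret("What's the weather in new york?", ctx))
print(interpret("And the weather tomorrow?", ctx))
```

The third request never mentions a city, yet it resolves to New York because that was the last place asked about. Production assistants replace the keyword matching with trained language models, but the fallback logic is the same in spirit.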
What does the future hold?
Nearly 40% of users now use voice technology daily. You might not be an early adopter on the innovation curve, but it looks like we’re all getting there. Safe to say, voice control will be taking over nearly all areas of business and life.
Ultimately, voice technology isn’t a single industry. It’s a transformative technology that disrupts all aspects of our lives, just as smartphones and the internet did before.
The voice and speech recognition market is set to grow at 17.2% to reach $26.8 billion by 2025. So it goes without saying that voice UX is becoming a real pragmatic innovation. Even wearables are launching with voice assistants on board, not to mention cars and smart TVs.
Just as companies needed an internet strategy in the ’90s, a search strategy in the 2000s, and a mobile strategy in the 2010s, they now need a voice strategy. It’s no longer optional.