Whether you’ve asked Alexa to play your favorite song or questioned Siri about tomorrow’s weather, chances are high you’ve used a voice assistant at least once in your life. In fact, statistics from Mindshare say an estimated 600 million people worldwide use voice assistants at least once a week.
Voice assistants are personal helpers that reside in smartphones, computers, and other internet-connected devices, ready to help you find information or complete a variety of tasks. Just wake up the assistant by saying its keyword, and it uses natural language processing to decipher what you said and then jumps into action.
Tasks can include finding all types of information on the internet, playing music from an accessible music library, setting appointments and reminders on your calendar, and controlling smart home appliances. All of this can be accomplished using voice controls.
While many of us turn to voice assistants without a second thought (unless they mess up our request), I tend to be curious about the way human voices are used in any project. This especially holds true as voice assistants become more advanced and natural sounding, prompting folks to speak to machines the same way they would speak to another human being.
Voice Assistant Options
- Google: Google Assistant is the straightforward name of Google’s voice assistant, available on Android phones, Android Wear wearable devices, and in homes as part of the Google Home smart speaker system. It responds to your voice as well as typed-in commands.
- Amazon: Alexa is the name of Amazon’s voice assistant, and you can interact with her through an Android app on smartphones and tablets as well as with Amazon Echo and Amazon Echo Dot devices. Amazon Echo combines the main control unit with a cylindrical speaker, while Amazon Echo Dot is just the main unit without the speaker. Each can work independently of the other.
- Apple: Apple’s voice assistant is named Siri, and she’s available on iPhones and the HomePod speaker system for household use.
The Voices They Use
The default voice for voice assistants has long been female, although the choices have since increased to both male and female options.
- Google Assistant lets you choose from a total of eight different male and female voices, with the voice of singer John Legend planned to be added to the mix down the line.
- Alexa has her default female voice, and Amazon recently introduced eight different male and female voices developers can use to change the default. If you’re a developer or savvy enough with technology to change the voice, the total here is nine.
- Siri got a revamped default female voice in September 2017, and you can also choose from a number of male and female options with American, British or Australian accents.
How They Choose the Voices
Using a female voice as the default voice for voice assistants may irk some who feel it stereotypes women as “mere assistants,” but the truth is that people generally tend to trust female voices over male voices.
As I discussed in a previous blog post, there are several reasons why. The higher pitch of a female voice instills more confidence in listeners than the lower-pitched male voice. Female voices are also perceived as more soothing and comforting, as well as helpful rather than commanding.
It’s also been shown that the human brain is wired to like female voices, and perhaps even to prefer them over male voices. Studies suggest this preference can be traced back to the womb, with unborn babies reacting to the sound of their mother’s voice.
Of course, the perfect-sounding voice for one person may not be equally pleasant for another, which is probably why companies have expanded their selections to include a wider variety of male and female voices.
Additional criteria that come into play for Apple when choosing voices for Siri include ensuring the voice is “perceived as being compatible with the Siri personality.” The Nordic edition of Business Insider notes the Siri personality is known to be neutral, professional, and somewhat restrained (even when delivering the occasional joke if you provide the right prompt).
Google’s use of John Legend’s voice suggests that celebrity status may also play a role in the voices it picks for its assistant.
How an Actor’s Voice Becomes a Machine’s Voice
Once an actor is chosen to provide the voice of the assistant, recording the actor’s voice comes next. This process can last anywhere from 10 to 20 hours to get a solid sample of the actor’s voice. The recorded materials can include any number of different voice over scripts, including things like navigation instructions, short question responses, audiobooks, and yes, even a joke or two.
While 20 hours of recording provides a good sample of a voice, it’s certainly not enough to create every single possible response the virtual assistant may need to make. That’s where technology comes in. After the speech is chopped up into blocks of components, technology such as WaveNet can arrange those blocks into new words and phrases as needed.
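For the technically curious, the idea of chopping recordings into blocks and rearranging them can be sketched in a few lines of Python. This is only a toy illustration, not how WaveNet or any real assistant works: the names `unit_bank`, `crossfade`, and `synthesize` are made up for this example, the "recordings" are short lists of numbers standing in for audio, and the blocks here are whole words, whereas real systems work with much smaller pieces of speech and far more sophisticated models.

```python
# Toy sketch of building new phrases from recorded blocks.
# Every name and value below is hypothetical and for illustration only.

def crossfade(left, right, overlap):
    """Join two sample lists, blending `overlap` samples linearly at the seam."""
    overlap = min(overlap, len(left), len(right))
    if overlap == 0:
        return left + right
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight given to the incoming block
        blended.append((1 - w) * left[len(left) - overlap + i] + w * right[i])
    return left[:len(left) - overlap] + blended + right[overlap:]


def synthesize(phrase, unit_bank, overlap=2):
    """Assemble a phrase from recorded blocks; fail if a block was never recorded."""
    output = []
    for word in phrase.lower().split():
        if word not in unit_bank:
            raise KeyError(f"no recorded block for {word!r}")
        output = crossfade(output, unit_bank[word], overlap)
    return output


# Stand-in "recordings": short lists of numbers instead of real audio samples.
unit_bank = {
    "turn": [0.1, 0.2, 0.3, 0.2],
    "left": [0.5, 0.4, 0.3, 0.1],
    "right": [0.6, 0.7, 0.2, 0.0],
}

# The actor recorded "turn", "left", and "right" separately;
# the phrase "Turn right" is assembled from those blocks.
new_phrase = synthesize("Turn right", unit_bank)
```

The blend at each seam hints at why the joins matter: if blocks were simply butted together, the jump between them would be one of the things that makes the result sound robotic.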
The key is ensuring those new words and phrases sound more human than synthetic, as if you’re speaking to another person instead of a robot. Artificial intelligence techniques can help, and Apple uses them to make Siri sound more human.
As human as voice assistants may sound, they’re still machines. While they can help with many mundane tasks, or even deliver an occasional joke, they can’t take the place of real human connections. And since they rely on human voice recordings to develop their own voices, they evidently can’t take the place of real human voice over actors, either.