Technology is moving so fast it can make our heads spin, especially in the world of text to speech (TTS). As voice over actors, we’re certainly aware of TTS – and some of us may even fear the technology is advancing us right out of our careers. But it’s really not. Despite the rapid advances in the field, TTS remains unable to replace the real deal. Keep reading to find out why.
How TTS Has Advanced
Text to speech (TTS) is a system that converts the written word into the spoken word. Simple enough, right? But it gets more complex from there. TTS systems store speech units that can include phones, diphones, words and entire sentences. The system then puts those speech units together in specific combinations to create synthetic speech that says anything – all using the voice that initially recorded those speech units.
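To make the idea of stored speech units more concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular TTS product works: the unit names, the `UNIT_INVENTORY` table and the `synthesize` function are all hypothetical, and the sample values are placeholders standing in for real recorded audio.

```python
# Hypothetical unit inventory: each stored speech unit maps to a short
# list of audio samples. A real system would hold recorded waveforms;
# these numbers are placeholders for illustration only.
UNIT_INVENTORY = {
    "h-e": [0.1, 0.2, 0.3],
    "e-l": [0.3, 0.1],
    "l-o": [0.0, -0.2, -0.1],
}


def synthesize(unit_sequence, inventory):
    """Concatenate stored units, in order, into one list of samples."""
    samples = []
    for unit in unit_sequence:
        if unit not in inventory:
            # The voice can only "say" what was recorded into the inventory.
            raise KeyError(f"no recording stored for unit {unit!r}")
        samples.extend(inventory[unit])
    return samples


# Putting three stored units together to form one longer utterance.
audio = synthesize(["h-e", "e-l", "l-o"], UNIT_INVENTORY)
```

The sketch skips everything that makes real synthesis sound natural, such as smoothing the joins between units, but it shows why the result always carries the voice of whoever recorded the units.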
While the first talking machine was introduced back in 1939, advances in the world of TTS over the past several years have been more rapid and dramatic than those of the previous 75. Some of these advances include the ability to:
- Incorporate a model of the vocal tract and other human voice characteristics to sound more human.
- Correct synthetic speech mispronunciations, adjust regional pronunciations, add emphasis, and other tricks through Speech Synthesis Markup Language (SSML).
- Produce robo calls that stop and ask “Can you hear me?” or wait for a reply, like a human would, before continuing their spiel.
- Copy lip-movements for dubbing.
- Fix small errors in voice over recordings with synthetic edits.
- Create a model, or “voice bank,” of a real person’s voice for later use as synthetic speech.
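For readers curious what the SSML item in the list above looks like in practice, here is a small Python sketch that builds a snippet of SSML text. The `<speak>`, `<emphasis>`, `<phoneme>` and `<break>` elements come from the W3C SSML standard; the helper functions `emphasize`, `pronounce` and `speak`, and the sample sentence, are hypothetical names made up for this example. Individual TTS engines differ in which tags they honor and may expect extra attributes on `<speak>`, so treat this as an assumption-laden sketch rather than a recipe for any specific product.

```python
from xml.sax.saxutils import escape, quoteattr


def emphasize(text, level="strong"):
    """Wrap text in an SSML <emphasis> element."""
    return f"<emphasis level={quoteattr(level)}>{escape(text)}</emphasis>"


def pronounce(text, ipa):
    """Spell out a pronunciation with an SSML <phoneme> element (IPA)."""
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(text)}</phoneme>'


def speak(*parts):
    """Join already-marked-up parts inside a bare <speak> root element."""
    return "<speak>" + "".join(parts) + "</speak>"


# A regional pronunciation, a short pause, and an emphasized word.
ssml = speak(
    escape("I say "),
    pronounce("tomato", "təˈmɑːtəʊ"),
    '<break time="300ms"/>',
    emphasize("not"),
    escape(" tomato."),
)
```

The resulting string is what would be handed to a speech engine in place of plain text, which is how a producer can adjust a pronunciation or add stress without re-recording anything.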
Once TTS began to converge with machine learning, big data and artificial intelligence (AI), it became smarter, more realistic and, as mentioned earlier, a perceived threat to some in the voice over industry.
Potential TTS Threats to the VO Industry
There is no doubt the advances of TTS have aroused a number of concerns across the voice over industry, with some of the most common outlined below.
Losing Ongoing Royalties
The royalty structure keeps giving us a steady flow of money each time our voice is used, regularly paying us even though we’ve already done the work. If we are recording into a voice bank, are we going to get royalties every time our voice is used to create synthetic speech? Probably not. While we can likely expect to be paid a large amount for the initial recording session, we may lose out on royalties each time our voice is used down the line. After all, how can we be paid royalties for a future recording that uses our voice but we didn’t technically record?
No Control Where Your Voice is Used
Since technology allows for a pre-recorded voice to be used to create any type of message or project down the line, voice over artists may fear they won’t have a say in the type of work that will be attached to their voice. Some work may be unacceptable, but we may have no control or say over the matter.
Being Prohibited from Future Spots
If we offer buyouts on our voice banks, we could be limiting our careers without realizing it. For instance, let’s say our voice is used for a car company. We would then potentially be prohibited from doing all spots for all other car companies in the future – even though we didn’t know we’d be associated with a car company at the time of the buyout.
Continuously Declining TTS Rates
Recording sessions for TTS are no longer in the $50K range. As the technology advances, the rates continue to decrease. Methods of capturing and synthesizing voice take far less recording time, which means far less pay for the voice over talent.
Why Voice Over Actors Don’t Need to Fret
While TTS concerns may feel valid for us voice over artists, we don’t have to lose sleep over them for several reasons. For starters, TTS still harbors many limitations – like the inability to spontaneously generate the infinite human range of emotions and vocal techniques.
Creating synthetic speech by simply typing in the words you want it to say is also not yet possible. And synthetic speech, no matter how advanced or finely tuned, has still not shown it can match the multiple nuances and components associated with a real human voice.
Ongoing payments may still even exist. In addition to a recording fee, we could arrange licensing agreements that outline when and where our voices can be used down the line. Turning our TTS fears into the framework for a clear-cut contract can help ensure we have all bases covered – and continue to thrive in our profession.
Hi Debbie!
For younger voice actors who are just starting out, do you see this still being a viable career path, or will we be losing our jobs to synthetic voices in 10-15 years?
Natalie, in response to your question about the viability of a VO career in 10-15 years with the potential competition of synthetic voices…I truly have no idea. Certainly, there are people who will prognosticate on plenty of things, and some are more reliable than others. For me, the truth about the future is…that no one really knows. For now, do what makes you happy, as long as it’s not hurting anyone, and the future will be what it will be. The VO world has changed and advanced so much just in the 25 or so years that I’ve been in it, I can only imagine the trajectory of change will continue even steeper in the next 10 – 15. But don’t do something, or choose NOT to do something, because of some unknowable future. That’s my take.
Does this apply to medical narration services also?
Interesting and informative article! I learned a lot. But I will disagree with you about the severity of the threat posed by TTS and AI.
My very best client in the past decade has booked me to narrate many, many projects for Fortune 500 companies. Recently, we had a conversation about the business and how it is evolving. (My own business with them has been declining steadily, and I am one of their top “go to” narrators). He said: “It seems we are moving more and more into VR and AR (“Augmented Reality”), and also using text to speech/AI voices. Some clients like ****** are also having their internal classroom trainers record audio for courses. But most of the big clients like (mentions 3 Fortune 500 firms) still prefer professional VO artists.”
I do not worry about AI replacing me for any sort of commercial work, or work that requires acting skills. But for corporate work that is seen simply as a “cost of doing business,” such as compliance materials especially, but arguably some training, employee communications and other non-critical projects that do NOT directly make money for the company – that sort of work will be done as cheaply as possible. “Good enough” will prevail, and if a company can save significant amounts of money by buying a subscription to TTS/AI software (rather than paying a guy like me $425 for the first hour), you can bet that most will go the cheap route.
Yes, AI cannot “act,” so commercial work should be safe for the foreseeable future. But rates for Radio/TV VO have been plummeting. I’m sorry to say I think the future is rather bleak for most VO talent.
That being said, I have some specific strategies to help my own business adapt to these new developments. For example, one strategy will be to get back into on-camera work, since that can’t be replaced by AI any time soon. The Golden Years of VO are long gone, and we’ll have to hustle even more and be very creative to keep up a viable income.
Tom Test
Hi Tom,
Thanks for reading and commenting here.
Certainly, like any business, the world of VO, advertising and entertainment is constantly in flux. Sounds like you’ve been working in the biz a long time, like I have, so you have seen considerable changes.
We’ve gone through the doubts about VO being replaced by AI, and about the business being flooded with new folks who will work for less, thereby eroding the status quo and the ability of those who use this as their livelihood to continue to make a good living.
As in many entrepreneurial ventures, we realize that each job and each connection can be something that blossoms into a long-term relationship, or can be a one-hit-play. Certainly the goal for any of us in this for the long haul is to cultivate those relationships that continue to be prosperous for us, and thrive. But there is always change, and sometimes those big/best clients move on, without apology, to the latest trend, or just a change in voice. I feel that shift too, and it’s part of the deal as a VO-preneur…or any actor, really.
My theory has always been to be as diverse as possible, to create as many avenues of VO income from as many clients as possible, so that even if one of the BIG ones goes away, there is another to replace it, or many more to supplement. I find that the experience and professionalism one can offer, in comparison to an “internal classroom trainer/narrator” or someone who is NOT a professional VO talent, is often worth the extra expense (usually minimal in the big budget picture), given the loss of time when working with sub-par talent. And then the end result suffers as well. Domino effect.
But we all, as independent business owners, have to assess our own particular situation, and do what we feel is necessary to keep the money rolling in. I know it can be challenging for many in this field, and it requires talent in many ways other than just a great voice and ability to handle copy well.
Looks like you possess many skills to keep you in that successful category. I’ll be looking for you on TV. Not sure if you’re in the Midwest still (I’m in Michigan, just over the IN state line), but for me, moving here from So Cal made my on-camera work something that went back burner. Also, just being a woman, the jobs for O/C can be few and far between as we age beyond the median age. I’m not ready to play grandmas yet, but I’m old enough to if I wanted! A little different perspective from this gender on that front. The same is true even in the VO field, where a fair portion of the jobs at hand still go to male voices. I’m not complaining. I’ve been able to sustain a very lucrative career over a few decades, but I am also not blind to what is the norm and how many jobs I lose to a male voice.
But we persist! 🙂
I wish you all the best in 2020~
Debbie
Tom, I believe you’re right on the money with your whole assessment. To your last sentence…I started as voice over talent, then weaved into a broadcast media buying service. One hand plays off the other: media buying helps land V.O. and V.O. helps land media buys.