How to control text-to-speech

Here you will learn how to control text-to-speech by typing your text in different ways.

SlideTalk also has a special set of commands to control the text-to-speech properties (voice, reading speed, silences and timbre). These commands always start with the character @, followed by the command itself.

Improving pronunciation of words and acronyms

To improve the pronunciation of non-standard words such as names, foreign words and place names, you can rewrite the word in a way that gives a better result. This is a trial-and-error method, so you might need to experiment with text-to-speech for a while before getting used to it. Use the "Listen to the text" feature when editing a slidetalk to hear how the text sounds.

For instance, if you need to use the word "mlearning", you can type "m-learning" instead and get a better result.

Text-to-speech can guess how to pronounce some of the most common acronyms correctly, such as mp3. For less common acronyms, you need to help text-to-speech by typing the acronym differently. For instance, if you are writing about an association called FOCAD (Friends Of Cats And Dogs), type "Friends Of Cats And Dogs" if you want the TTS to pronounce the complete name, or "F O C A D" if you want it to spell out the acronym.
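If you prepare narration text outside SlideTalk, rewrites like these can be applied automatically before pasting the text in. The following is a minimal Python sketch, not part of SlideTalk: the function names spell_out and rewrite_words are hypothetical, and the replacement table is one you would build yourself by trial and error.

```python
import re


def spell_out(acronym: str) -> str:
    """Put a space between the letters so each one is read separately."""
    return " ".join(acronym)


def rewrite_words(text: str, replacements: dict[str, str]) -> str:
    """Replace whole words in the text with a spelling that reads better aloud.

    The replacements table is an assumption for illustration; it is not
    something SlideTalk provides.
    """
    for word, spoken in replacements.items():
        # \b keeps the match to whole words, so "FOCADs" is left untouched.
        text = re.sub(rf"\b{re.escape(word)}\b", spoken, text)
    return text


replacements = {
    "FOCAD": spell_out("FOCAD"),   # "F O C A D"
    "mlearning": "m-learning",
}
print(rewrite_words("FOCAD supports mlearning", replacements))
# prints: F O C A D supports m-learning
```

You would still use the "Listen to the text" feature to check each rewrite by ear.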

Improving phrasing

Text-to-speech has built-in intelligence to guess phrasing based on the grammatical properties of a sentence. However, some manual intervention might be needed to fine-tune the phrasing.

It is usually enough to play with punctuation. For instance, add a comma (",") wherever you want a short pause, even where a comma would normally be left out according to the rules of the language. Also make sure that each sentence ends with a period ("."), a question mark ("?"), or an exclamation mark ("!") to control the phrase ending.

When typing text to be used by text-to-speech, the finer your control of punctuation, the better. Play with it while using the "Listen to the text" feature to learn how the text-to-speech reacts to different ways of writing a sentence.

Alternative acoustic rendering of words

Sometimes you might want to get an alternative acoustic rendering of a word. The text-to-speech always selects the rendering that is most likely to be correct, but you can ask for an alternative rendering of a word by typing @alt1, @alt2 or @alt3 before the word in question, as in this example:

What a @alt2 strange world.

In this case, we are instructing the text-to-speech to use an alternative rendering for the word "strange".

The effects of the alternative rendering are often subtle and difficult to predict in advance, so you need to use a trial-and-error method.

Changing the acoustic rendering of a word will also affect the neighbouring words and the overall phrasing of the sentence.

Switching voice and/or language

To switch voice, type @ followed by the voice name. For instance, @Sharon will switch to the voice Sharon.

Suppose you have chosen Rod as your default voice and you type the following text:

"Hello my name is Rod, @Sharon and I am Sharon. @Rod Today we are to present..."

You will first hear Rod saying "Hello my name is Rod", then Sharon saying "and I am Sharon", and then Rod once again saying "Today we are to present...".

You can use as many voices as you like and even change language within a sentence, as in the following example: "@Rod The cat is on the table. @Antonio El gato está sobre la mesa." You will first hear Rod pronouncing the sentence in English, and then Antonio pronouncing it in Spanish.
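To check in advance which voice will read which part of a text, the switching rule described above can be sketched in a few lines of Python. This is only an illustration, not SlideTalk's own code: the function name split_by_voice is hypothetical, and the list of voice names must be supplied by you, since the full set of voices is not listed here.

```python
import re


def split_by_voice(text: str, default_voice: str, voices: list[str]) -> list[tuple[str, str]]:
    """Return (voice, text) pairs in the order they would be spoken.

    Assumption: a voice stays active until the next @Voice command.
    """
    # Match "@" followed by one of the known voice names as a whole word.
    pattern = re.compile(r"@(" + "|".join(map(re.escape, voices)) + r")\b")
    segments = []
    current = default_voice
    position = 0
    for match in pattern.finditer(text):
        chunk = text[position:match.start()].strip()
        if chunk:
            segments.append((current, chunk))
        current = match.group(1)
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append((current, tail))
    return segments


text = "Hello my name is Rod, @Sharon and I am Sharon. @Rod Today we are to present..."
for voice, chunk in split_by_voice(text, "Rod", ["Rod", "Sharon", "Antonio"]):
    print(f"{voice}: {chunk}")
# prints:
# Rod: Hello my name is Rod,
# Sharon: and I am Sharon.
# Rod: Today we are to present...
```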

Adding silences in the narration

There are 3 types of silences that you can add in your narration:

  • @breath -> adds a silence of 0.5 sec
  • @pause -> adds a silence of 1 sec
  • @silence -> adds a silence of 3 sec

So if you type:

"We looked around @silence then we decided to move on"

there will be a 3-second pause between "We looked around" and "then we decided to move on".

You can also specify pauses of any length by writing @p followed by the desired silence length in milliseconds, as in this example:

"Did you really check @p2000 the keys?"

A silence of 2 seconds (2000 msecs) will be inserted between "Did you really check" and "the keys".
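When timing a narration, it can help to add up the silences a text contains. The Python sketch below lists the length of each silence command in milliseconds, using the three presets and the @p form described above. It is an illustration only: the name silences_ms is hypothetical, and the matching rules (for example, that a command ends at the next space or punctuation mark) are an assumption.

```python
import re

# Preset lengths taken from the list above, converted to milliseconds.
PRESET_SILENCES_MS = {"breath": 500, "pause": 1000, "silence": 3000}

# Matches @breath, @pause, @silence, or @p followed by digits (e.g. @p2000).
_SILENCE_COMMAND = re.compile(r"@(breath|pause|silence|p(\d+))\b")


def silences_ms(text: str) -> list[int]:
    """Return the length in milliseconds of every silence command in the text."""
    lengths = []
    for match in _SILENCE_COMMAND.finditer(text):
        if match.group(2) is not None:
            lengths.append(int(match.group(2)))       # custom @p<milliseconds>
        else:
            lengths.append(PRESET_SILENCES_MS[match.group(1)])
    return lengths


print(silences_ms("We looked around @silence then we decided to move on"))
# prints: [3000]
print(silences_ms("Did you really check @p2000 the keys?"))
# prints: [2000]
```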

Changing the reading speed

There are 5 settings for the reading speed:

  • @slow -> quite slow reading (25% slower than default)
  • @moderate -> slightly slower reading (10% slower than default)
  • @normal -> sets all parameters back to default (including reading speed)
  • @fast -> slightly faster reading (10% faster than default)
  • @faster -> quite fast reading (30% faster than default)

You can change the reading speed anywhere in the text. For instance, if you type:

"I am speaking normally, @faster now I speak much faster, @slow and now I speak slowly"

you will hear the reading speed changing as the sentence is spoken.

You can also set the reading speed freely by using @s followed by the speed expressed as a percentage of the default speed. For instance, @s150 will set a reading speed 50% faster than default, while @s60 will set the reading speed to 60% of the default speed.
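The presets and the @s form can be put on the same scale. The Python sketch below converts a speed command into a percentage of the default speed. It is an illustration, not SlideTalk's implementation: the name speed_percent is hypothetical, and it assumes that "25% slower than default" corresponds to 75 on the @s scale, and likewise for the other presets.

```python
import re
from typing import Optional

# Assumed equivalents of the presets on the @s scale (100 = default speed).
PRESET_SPEEDS = {"slow": 75, "moderate": 90, "normal": 100, "fast": 110, "faster": 130}


def speed_percent(command: str) -> Optional[int]:
    """Return the speed as a percentage of default, or None if not a speed command."""
    name = command.lstrip("@")
    if name in PRESET_SPEEDS:
        return PRESET_SPEEDS[name]
    # fullmatch ensures that "@silence" is not mistaken for "@s" plus a number.
    match = re.fullmatch(r"s(\d+)", name)
    return int(match.group(1)) if match else None


print(speed_percent("@faster"))  # prints: 130
print(speed_percent("@s60"))     # prints: 60
```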

Changing the timbre of the voice

The timbre (or colour) of a voice can be changed by using a parameter which is also referred to as "vocal tract". It simulates different shapes for the vocal tract resulting in different timbres.

We have the following presets:

  • @lighter -> sets a lighter voice, corresponding to a voice 15% lighter than default
  • @normal -> sets all parameters back to default (including timbre)
  • @darker -> sets a darker voice, corresponding to a voice 15% darker than default

You can change the timbre anywhere in the text. For instance, if you type:

"I speak normally @darker and suddenly I have a darker voice, @lighter or a lighter one"

you will hear the timbre changing accordingly.

You can also set the timbre freely by using @v followed by the timbre expressed as a percentage of the default. Values below 100 give a darker timbre, while values above 100 give a lighter one. For instance, @v120 will set a timbre 20% lighter than default, while @v70 will set a timbre 30% darker than default.
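Because it is easy to mix up which direction is lighter and which is darker, the rule above can be written out as a small Python sketch that describes a timbre command in words. The name describe_timbre is hypothetical, and mapping @lighter to 115 and @darker to 85 on the @v scale is an assumption based on the "15%" figures in the preset list.

```python
import re
from typing import Optional

# Assumed equivalents of the presets on the @v scale (100 = default timbre).
PRESET_TIMBRES = {"lighter": 115, "normal": 100, "darker": 85}


def describe_timbre(command: str) -> Optional[str]:
    """Describe a timbre command in words, or return None if it is not one."""
    name = command.lstrip("@")
    if name in PRESET_TIMBRES:
        percent = PRESET_TIMBRES[name]
    else:
        match = re.fullmatch(r"v(\d+)", name)
        if match is None:
            return None
        percent = int(match.group(1))
    if percent == 100:
        return "default timbre"
    # Above 100 is lighter, below 100 is darker.
    direction = "lighter" if percent > 100 else "darker"
    return f"{abs(percent - 100)}% {direction} than default"


print(describe_timbre("@v120"))  # prints: 20% lighter than default
print(describe_timbre("@v70"))   # prints: 30% darker than default
```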

We suggest that you experiment with this parameter to gain confidence, starting with the presets or with values close to 100. Extreme timbre values may be used for comical effect but will completely distort the quality of the voice.