Here is a short session using Festival's command interpreter.
Start Festival with no arguments
$ festival
Festival Speech Synthesis System 1.4.3:release Dec 2002
Copyright (C) University of Edinburgh, 1996-2002. All rights reserved.
For details type `(festival_warranty)'
festival>
Festival uses a command line editor based on editline for terminal input, so command line editing may be done with Emacs commands. Festival also supports history as well as function, variable name, and file name completion via the <TAB> key.
Typing help will give you more information, that is, help without any parentheses. (It is actually a variable name whose value is a string containing help.)
Festival offers what is called a read-eval-print loop, because it reads an s-expression (atom or list), evaluates it and prints the result. As Festival includes the SIOD Scheme interpreter, most standard Scheme commands work
festival> (car '(a d))
a
festival> (+ 34 52)
86
In addition to standard Scheme commands a number of commands specific to speech synthesis are included. Although, as we will see, there are simpler methods for getting Festival to speak, here are the basic underlying explicit functions used in synthesizing an utterance.
Utterances can consist of various types (see Utterance types), but the simplest form is plain text. We can create an utterance and save it in a variable
festival> (set! utt1 (Utterance Text "Hello world"))
#<Utterance 1d08a0>
festival>
The (hex) number in the return value may be different for your installation. That is the print form for utterances. Their internal structure can be very large so only a token form is printed.
Although this creates an utterance it doesn't do anything else. To get a waveform you must synthesize it.
festival> (utt.synth utt1)
#<Utterance 1d08a0>
festival>
This calls various modules, including tokenizing, duration, intonation etc. Which modules are called is defined with respect to the type of the utterance, in this case Text. It is possible to call the modules individually by hand, but you just wanted it to talk, didn't you? So
festival> (utt.play utt1)
#<Utterance 1d08a0>
festival>
will send the synthesized waveform to your audio device. You should
hear "Hello world" from your machine.
To make this all easier a small function doing these three steps exists. SayText simply takes a string of text, synthesizes it and sends it to the audio device.
festival> (SayText "Good morning, welcome to Festival")
#<Utterance 1d8fd0>
festival>
Of course, as history and command line editing are supported, <c-p> or up-arrow will allow you to edit the above to whatever you wish.
Festival may also synthesize from files rather than simply text.
festival> (tts "myfile" nil)
nil
festival>
The end of file character <c-d> will exit from Festival and return you to the shell; alternatively, the command quit may be called (don't forget the parentheses).
Rather than starting the command interpreter, Festival may synthesize files specified on the command line
unix$ festival --tts myfile
unix$
Sometimes a simple waveform is required from text that is to be kept and played at some later time. The simplest way to do this with Festival is by using the text2wave program. This is a Festival script that will take a file (or text from standard input) and produce a single waveform.
text2wave myfile.txt -o myfile.wav
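Because text2wave also accepts text on standard input, a string can be piped in directly rather than saved to a file first. The following is a minimal sketch, assuming text2wave is on your PATH; the function name speak_to_file and the file name greeting.wav are hypothetical and chosen only for illustration.

```shell
# speak_to_file TEXT OUTFILE
# Hypothetical helper: feeds TEXT to text2wave on standard input and
# writes the resulting waveform to OUTFILE.
speak_to_file() {
    printf '%s\n' "$1" | text2wave -o "$2"
}

# Example call (requires Festival's text2wave on the PATH):
# speak_to_file "Good morning, welcome to Festival" greeting.wav
```

Wrapping the pipeline in a function keeps the quoting of the text in one place, which helps when the text contains spaces or punctuation.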
Options exist to specify the waveform file type, for example if Sun audio format is required
text2wave myfile.txt -otype snd -o myfile.wav
Use -h on text2wave to see all options.
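When several text files need converting, the single-file invocation above can be repeated in a loop. This is only a sketch under stated assumptions: text2wave is on your PATH, input files end in .txt, and the helper names wav_name_for and convert_all are hypothetical, not part of Festival.

```shell
# wav_name_for FILE
# Hypothetical helper: maps notes.txt to notes.wav (a name without a
# .txt suffix simply gets .wav appended).
wav_name_for() {
    printf '%s\n' "${1%.txt}.wav"
}

# convert_all FILE...
# Runs text2wave once per input file, stopping at the first failure.
convert_all() {
    for f in "$@"; do
        text2wave "$f" -o "$(wav_name_for "$f")" || return 1
    done
}

# Example call (requires Festival's text2wave on the PATH):
# convert_all chapter1.txt chapter2.txt
```

Each input produces its own waveform file, so a failed conversion affects only the file being processed when the loop stops.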