Simple command driven session - Festival Speech Synthesis System

7.2 Sample command driven session

Here is a short session using Festival's command interpreter.

Start Festival with no arguments:

     $ festival
     Festival Speech Synthesis System 1.4.3:release Dec 2002
     Copyright (C) University of Edinburgh, 1996-2002. All rights reserved.
     For details type `(festival_warranty)'
     festival>

Festival uses a command line editor based on editline for terminal input, so command line editing may be done with Emacs commands. Festival also supports history as well as function, variable name, and file name completion via the <TAB> key.

Typing help (without any parentheses) will give you more information. (It is actually a variable name whose value is a string containing help.)

Festival offers what is called a read-eval-print loop: it reads an s-expression (atom or list), evaluates it, and prints the result. As Festival includes the SIOD Scheme interpreter, most standard Scheme commands work:

     festival> (car '(a d))
     a
     festival> (+ 34 52)
     86
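
User-defined functions work in the same way. As a small sketch (the name double is made up for this illustration and is not part of Festival):

     festival> (define (double x) (* 2 x))
     festival> (double 43)

The second expression should evaluate to 86. The value printed after the define is omitted here because its exact form may vary between versions.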

In addition to standard Scheme commands a number of commands specific to speech synthesis are included. Although, as we will see, there are simpler methods for getting Festival to speak, here are the basic underlying explicit functions used in synthesizing an utterance.

Utterances can be of various types (see Utterance types), but the simplest form is plain text. We can create an utterance and save it in a variable:

     festival> (set! utt1 (Utterance Text "Hello world"))
     #<Utterance 1d08a0>
     festival>

The (hex) number in the return value may be different for your installation. That is the print form for utterances. Their internal structure can be very large so only a token form is printed.

Although this creates an utterance it doesn't do anything else. To get a waveform you must synthesize it.

     festival> (utt.synth utt1)
     #<Utterance 1d08a0>
     festival>

This calls various modules, including tokenizing, duration, intonation, etc. Which modules are called depends on the type of the utterance, in this case Text. It is possible to call the modules individually by hand, but you just wanted it to talk, didn't you? So

     festival> (utt.play utt1)
     #<Utterance 1d08a0>
     festival>

will send the synthesized waveform to your audio device. You should
hear "Hello world" from your machine.
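
For the curious, calling the modules by hand might look roughly like the following sketch. It assumes the module sequence that a stock installation defines for Text utterances; the exact list can differ between versions and voices, so treat it as illustrative rather than definitive. The variable utt2 is introduced only for this example, and the printed return values are omitted.

     festival> (set! utt2 (Utterance Text "Hello world"))
     festival> (Initialize utt2)
     festival> (Text utt2)
     festival> (Token_POS utt2)
     festival> (Token utt2)
     festival> (POS utt2)
     festival> (Phrasify utt2)
     festival> (Word utt2)
     festival> (Pauses utt2)
     festival> (Intonation utt2)
     festival> (PostLex utt2)
     festival> (Duration utt2)
     festival> (Int_Targets utt2)
     festival> (Wave_Synth utt2)
     festival> (utt.play utt2)

If the assumption holds, the final call should produce the same audio as playing an utterance processed with utt.synth.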

To make this all easier, a small function doing these three steps exists. SayText simply takes a string of text, synthesizes it, and sends it to the audio device.

     festival> (SayText "Good morning, welcome to Festival")
     #<Utterance 1d8fd0>
     festival>
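
As a rough sketch of how such a function could be written by hand (my_say_text is a hypothetical name used for illustration, and this is not necessarily how SayText is defined in your installation):

     (define (my_say_text text)
       ;; Utterance takes its arguments unevaluated (note the unquoted
       ;; Text in the earlier example), so build the form
       ;; (Utterance Text "...") as a list and evaluate it explicitly.
       ;; Then synthesize the utterance and send it to the audio device.
       (utt.play (utt.synth (eval (list 'Utterance 'Text text)))))

It would then be called in the same way, for example (my_say_text "Good morning").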

Of course, as history and command line editing are supported, <c-p> or up-arrow will allow you to edit the above to whatever you wish.

Festival may also synthesize from files rather than simply text:

     festival> (tts "myfile" nil)
     nil
     festival>

The end of file character <c-d> will exit from Festival and return you to the shell. Alternatively, the command quit may be called (don't forget the parentheses).

Rather than starting the command interpreter, Festival may synthesize files specified on the command line:

     unix$ festival --tts myfile
     unix$

Sometimes a simple waveform is required from text, to be kept and played at some later time. The simplest way to do this with Festival is by using the text2wave program. This is a Festival script that will take a file (or text from standard input) and produce a single waveform.

An example use is:

     text2wave myfile.txt -o myfile.wav

Options exist to specify the waveform file type. For example, if Sun audio format is required:

     text2wave myfile.txt -otype snd -o myfile.wav

Use -h on text2wave to see all options.
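
A waveform synthesized inside the command interpreter can be kept in a similar way. The following sketch assumes that the utt.save.wave function is available in your version and that utt1 has already been synthesized with utt.synth as shown earlier; hello.wav is an arbitrary file name chosen for the example, and 'riff is assumed to select the RIFF (wav) file type.

     festival> (utt.save.wave utt1 "hello.wav" 'riff)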