The source code - Festival Speech Synthesis System

27.1 The source code

The ultimate authority on what happens in the system lies in the source code itself. No matter how hard we try, and how automatic we make it, the source code will always be ahead of the documentation. Thus if you are going to be using Festival in a serious way, familiarity with the source is essential.

The lowest level functions are catered for in the Edinburgh Speech Tools, a separate library distributed with Festival. The Edinburgh Speech Tools Library offers the basic utterance structure, waveform file access, and various other useful low-level functions which we share between different speech systems in our work. See Overview.

The directory structure for the Festival distribution reflects the conceptual split in the code.

./bin/: The user-level executable binaries and scripts that are part of the festival system. These are either simple symbolic links to the binaries or, if the system is compiled with shared libraries, small wrapper shell scripts that set LD_LIBRARY_PATH appropriately.
./doc/: This contains the texinfo documentation for the whole system. The Makefile constructs the info and/or html version as desired. Note that the festival binary itself is used to generate the lists of functions and variables used within the system, so it must be compiled and in place before a new version of the documentation can be generated.
./examples/: This contains various examples. Some are explained within this manual, others are there just as examples.
./lib/: The basic Scheme parts of the system, including init.scm, the first file loaded by festival at start-up time. Depending on your installation, this directory may also contain subdirectories containing lexicons, voices and databases. This directory and its sub-directories are used by Festival at run-time.
./lib/etc/: Executables for Festival's internal use. A subdirectory containing at least the audio spooler will be automatically created (one for each different architecture the system is compiled on). Scripts are added to this top level directory itself.
./lib/voices/: By default this contains the voices used by Festival including their basic Scheme set up functions as well as the diphone databases.
./lib/dicts/: This contains various lexicon files distributed as part of the system.
./config/: This contains the basic Makefile configuration files for compiling the system (run-time configuration is handled by Scheme in the lib/ directory). The file config/config, created as a copy of the standard config/config-dist, holds the installation-specific configuration. In most cases a simple copy of the distribution file will be sufficient.
./src/: The main C++/C source for the system.
./src/lib/: Where libFestival.a is built.
./src/include/: Where include files shared between various parts of the system live. The file festival.h provides access to most of the parts of the system.
./src/main/: Contains the top level C++ files for the actual executables. This is the directory where the executable binary festival is created.
./src/arch/: The main core of the Festival system. At present everything is held in a single sub-directory ./src/arch/festival/. This contains the basic core of the synthesis system itself: Lisp front ends to the core utterance architecture and phonesets, basic tools such as client/server support and ngram support, and an audio spooler.
./src/modules/: In contrast to the arch/ directory this contains the non-core parts of the system. A set of basic example modules is included with the standard distribution. These are the parts that do the synthesis; the other parts are just there to make module writing easier.
./src/modules/base/: This contains some basic simple modules that weren't quite big enough to deserve their own directory. Most importantly it includes the Initialize module called by many synthesis methods which sets up an utterance structure and loads in initial values. This directory also contains phrasing, part of speech, and word (syllable and phone construction from words) modules.
./src/modules/Lexicon/: This is not really a module in the true sense (the Word module is the main user of this). This contains functions to construct, compile, and access lexicons (entries of words, part of speech and pronunciations). This also contains a letter-to-sound rule system.
./src/modules/Intonation/: This contains various intonation systems, from the very simple to quite complex parameter driven intonation systems.
./src/modules/Duration/: This contains various duration prediction systems, from the very simple (fixed duration) to quite complex parameter driven duration systems.
./src/modules/UniSyn/: A basic diphone synthesizer system, supporting a simple database format (which can be grouped into a more efficient binary representation). It is multi-lingual, and allows multiple databases to be loaded at once. It offers a choice of concatenation methods for diphones: residual excited LPC or PSOLA (TM) (which is not distributed).
./src/modules/Text/: Various text analysis functions, particularly the tokenizer and utterance segmenter (from arbitrary files). This directory also contains the support for text modes and SGML.
./src/modules/donovan/: An LPC based diphone synthesizer. Very small and neat.
./src/modules/rxp/: The Festival/Scheme front end to an XML parser written by Richard Tobin from the University of Edinburgh's Language Technology Group. rxp is now part of the speech tools rather than just Festival.
./src/modules/parser: A simple interface to the Stochastic Context Free Grammar parser in the speech tools library.
./src/modules/diphone: An optional module containing the previously used diphone synthesizer.
./src/modules/clunits: A partial implementation of a cluster unit selection algorithm as described in black97c.
./src/modules/Database rjc_synthesis: This consists of a new set of modules for doing waveform synthesis. They are intended to be independent of unit size (e.g. diphone, phone, non-uniform unit). Selection, prosodic modification, joining and signal processing are also defined separately. Unfortunately this code has not been exercised enough to be considered stable enough for use in the default synthesis method, but those working on new synthesis techniques may be interested in integrating with these new modules. They may be updated before the next full release of Festival.
./src/modules/*: Other optional directories containing various research modules not yet part of the standard distribution may appear here. See below for descriptions of how to add modules to the basic system.
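
The ./bin/ entry above mentions small wrapper scripts that set LD_LIBRARY_PATH when the system is built with shared libraries. As a rough illustration only, such a wrapper might look like the sketch below. The FESTIVAL_TOP variable, its default value, the function names, and the assumption that the shared libraries sit in src/lib/ are all hypothetical; the scripts generated by an actual build may differ.

```shell
#!/bin/sh
# Sketch of a bin/ wrapper for a shared-library build.
# FESTIVAL_TOP, its default, and the two sub-paths are assumptions made
# for illustration; a real build records its own locations.

# Prepend directory $1 to the colon-separated search path $2.
# An empty $2 yields just $1, which avoids a zero-length path component
# (the glibc loader treats that as the current working directory).
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Export the adjusted LD_LIBRARY_PATH, then replace this shell with the
# real binary, passing all arguments through unchanged.
festival_wrapper_main() {
    top=${FESTIVAL_TOP:-/usr/local/festival}
    LD_LIBRARY_PATH=$(prepend_path "$top/src/lib" "${LD_LIBRARY_PATH:-}")
    export LD_LIBRARY_PATH
    exec "$top/src/main/festival" "$@"
}
```

A wrapper built this way would end with the call festival_wrapper_main "$@". Using exec means no extra shell process lingers between the caller and the festival binary.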

One intended use of Festival is to offer a software system where new modules may be easily tested in a stable environment. We have tried to make the addition of new modules easy, without requiring complex modifications to the rest of the system.

All of the basic modules should really be considered merely as example modules. Without much effort all of them could be improved.