Site initialization - Festival Speech Synthesis System

6.3 Site initialization

Once compiled, Festival may be further customized for particular sites. At start-up, Festival loads the file init.scm from its library directory. This file in turn loads other necessary files such as phoneset descriptions, duration parameters, intonation parameters, definitions of voices, etc. It will also load the files sitevars.scm and siteinit.scm if they exist. sitevars.scm is loaded after the basic Scheme library functions are loaded but before any of the Festival-related functions are loaded. This file is intended to set various path names before the subsystems that use them are loaded. Typically, variables such as lexdir (the directory where the lexicons are held) and voices_dir (pointing to voice directories) should be reset here if necessary.
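
As a rough sketch, a sitevars.scm that resets these two variables might look like the following. The directory names are hypothetical, and the use of set! with a trailing slash on each path is an assumption about how these variables are consumed; check how init.scm uses them in your version before relying on this.

```scheme
;; sitevars.scm: loaded before the Festival-specific subsystems.
;; Hypothetical locations; substitute the directories used at your site.
(set! lexdir "/projects/festival/lib/dicts/")
(set! voices_dir "/projects/festival/lib/voices/")
```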

The default installation will try to find its lexicons and voices automatically based on the value of load-path (this is derived from FESTIVAL_HOME at compilation time or from the --libdir option at run time). If the voices and lexicons have been unpacked into subdirectories of the library directory (the default), then no site-specific initialization of the above path names will be necessary.

The second site-specific file is siteinit.scm. Typical examples of local initialization are as follows. The default audio output method is NCD's NAS system if that is supported, as that's what we normally use in CSTR. If it is not supported, any hardware-specific mode is the default (e.g. sun16audio, freebsd16audio, linux16audio or mplayeraudio). But that default is just a setting in init.scm. If, for example, you want the default audio output method in your environment to be 8k mulaw through /dev/audio, you should add the following line to your siteinit.scm file:

     (Parameter.set 'Audio_Method 'sunaudio)

Note the use of Parameter.set rather than Parameter.def: the second function will not reset the value if it is already set. Remember that you may use the audio methods sun16audio, linux16audio or freebsd16audio only if NATIVE_AUDIO was selected in speech_tools/config/config and you are on such a machine. The Festival variable *modules* contains a list of all supported functions/modules in a particular installation, including audio support. Check the value of that variable if things aren't what you expect.
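
The difference between the two functions, and the check on *modules*, can be tried interactively. The following is a minimal sketch intended to be typed at the Festival prompt; it assumes Parameter.get is available to read a parameter back, and the contents of *modules* will vary between installations.

```scheme
;; Parameter.def only supplies a value when none has been set yet ...
(Parameter.def 'Audio_Method 'sunaudio)
;; ... whereas Parameter.set always overrides the current value.
(Parameter.set 'Audio_Method 'sunaudio)
;; Read the parameter back to confirm which method is in effect.
(Parameter.get 'Audio_Method)
;; List the functions/modules compiled into this installation.
(print *modules*)
```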

If you are installing on a machine whose audio is not directly supported by the speech tools library, an external command may be executed to play a waveform. The following example is for an imaginary machine that can play audio files through a program called adplay with arguments for sample rate and file type. When playing waveforms, Festival by default outputs an unheadered waveform in native byte order. In this example you would set up the default audio playing mechanism in siteinit.scm as follows:

     (Parameter.set 'Audio_Method 'Audio_Command)
     (Parameter.set 'Audio_Command "adplay -raw -r $SR $FILE")

For the Audio_Command method of playing waveforms, Festival supports two additional audio parameters. Audio_Required_Rate allows you to use Festival's internal sample rate conversion function to convert to any desired rate. Note that this may not be as good as playing the waveform at the sample rate at which it was originally created, but as some hardware devices are restrictive in the sample rates they support, or have naive resampling functions, this could be the best option. The second additional audio parameter is Audio_Required_Format, which can be used to specify the desired output format of the file. The default is unheadered raw, but this may be any of the values supported by the speech tools (including nist, esps, snd, riff, aiff, audlab, raw and, if you really want it, ascii).
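
As an illustration, the two parameters might be combined with the imaginary adplay program as sketched below. The 16000 Hz rate and the choice of riff are arbitrary, and the assumption that adplay accepts a headered file without its -raw flag is hypothetical.

```scheme
(Parameter.set 'Audio_Method 'Audio_Command)
;; Ask Festival to resample every waveform to 16 kHz before playing.
(Parameter.set 'Audio_Required_Rate 16000)
;; Write the temporary file with a riff header instead of unheadered raw.
(Parameter.set 'Audio_Required_Format 'riff)
;; Hypothetical invocation: adplay is assumed to read the header itself.
(Parameter.set 'Audio_Command "adplay -r $SR $FILE")
```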

For example, suppose you run Festival on a remote machine, are not running any network audio system, and want Festival to copy files back to your local machine and simply cat them to /dev/audio. The following would do that (assuming permissions for rsh are allowed):

     (Parameter.set 'Audio_Method 'Audio_Command)
     ;; Make output file ulaw 8k (format ulaw implies 8k)
     (Parameter.set 'Audio_Required_Format 'ulaw)
     (Parameter.set 'Audio_Command
      "userhost=`echo $DISPLAY | sed 's/:.*$//'`; rcp $FILE $userhost:$FILE; \
       rsh $userhost \"cat $FILE >/dev/audio\" ; rsh $userhost \"rm $FILE\"")

Note that there are limits on how complex a command you will want to put in the Audio_Command string directly, as the quoting can get very confusing. Once you get past a certain complexity, it is therefore recommended that you write a simple shell script and call it from the Audio_Command string.
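
With that approach the Scheme side stays short. In the sketch below, /usr/local/bin/festival_remote_play is a hypothetical script that would hold the copy, play and clean-up steps; only $SR and $FILE are passed to it.

```scheme
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Required_Format 'ulaw)
;; Hypothetical wrapper script; all shell quoting now lives in the script,
;; which receives the sample rate and the waveform file as arguments.
(Parameter.set 'Audio_Command "/usr/local/bin/festival_remote_play $SR $FILE")
```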

A second typical customization is setting the default speaker. Speakers depend on many things, but due to various licence (and resource) restrictions you may only have some diphone/nphone databases available in your installation. The function name that is the value of voice_default is called immediately after siteinit.scm is loaded, offering the opportunity for you to change it. In the standard distribution no change should be required. If you download all the distributed voices, voice_rab_diphone is the default voice. You may change this for a site by adding the following to siteinit.scm, or per person by changing your .festivalrc. For example, if you wish to change the default voice to the American one, voice_ked_diphone:

     (set! voice_default 'voice_ked_diphone)

Note the single quote, and note that, unlike in early versions, voice_default is not a function you can call directly.

A second level of customization is on a per-user basis. After loading init.scm, which in turn loads sitevars.scm and siteinit.scm for the local installation, Festival loads the file .festivalrc from the user's home directory (if it exists). This file may contain arbitrary Festival commands.
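
A personal .festivalrc might, for instance, look like the following sketch. It only reuses settings described in this section; the particular choices are illustrative and assume the voice and audio method named are available in your installation.

```scheme
;; ~/.festivalrc: per-user settings, loaded after the site files.
;; Prefer the American voice over the site default.
(set! voice_default 'voice_ked_diphone)
;; Play 8k mulaw through /dev/audio regardless of the site setting.
(Parameter.set 'Audio_Method 'sunaudio)
```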