For Festival to use a database, it is most useful to build an utterance structure for each utterance in the database. As discussed earlier, utterance structures contain relations of items. Given such a structure for each utterance, we can easily read in the utterance representation, access it, and dump information in a normalised form, which makes it easy to build and test models.
Of course, the level of labelling that exists for a particular database, or that you are willing to do by hand or with some automatic tool, will vary. For many purposes you will need at least phonetic labelling. Hand-labelled data is still better than auto-labelled data, though that could change. The size and consistency of the data are important too.
For this discussion we will assume labels for segments, syllables, words, phrases, intonation events and pitch targets. Some of these can be derived; others need to be labelled. The process described here would not fail with less labelling, but of course you would not be able to extract as much information from the result.
In our databases these labels are in Entropic's Xlabel format, though it is fairly easy to convert from any reasonable format.
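As an illustration of such a conversion, the following is a minimal sketch, not part of Festival. It assumes a hypothetical input format with one "start end label" triple per line, and it assumes the usual Xlabel layout: a short header terminated by a line containing only #, followed by lines giving end time, a colour field and the label. The function name to_xlabel and the colour value 26 are arbitrary choices made for this example.

```shell
#!/bin/sh
# to_xlabel: read "start end label" lines on stdin and write
# Xlabel-style text on stdout (hypothetical helper, not part of Festival).
to_xlabel() {
    # Minimal header; the line containing only "#" ends the header.
    printf 'separator ;\nnfields 1\n#\n'
    # Each label line: end time, colour field (26 is arbitrary), label.
    awk 'NF >= 3 { printf "%.4f 26 %s\n", $2, $3 }'
}
```

A file could then be converted with something like to_xlabel < utt001.lab > utt001.Segment, where both file names are hypothetical.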
The script festival/examples/make_utts is an example Festival script which automatically builds the utterance files from the above labelled files.
By default, the script assumes a hierarchy in a database directory of the following form: under a directory festival/, where all Festival-specific database information can be kept, a directory relations/ contains a subdirectory for each basic relation (e.g. Segment/, Syllable/, etc.), each of which contains the basic label files for that relation.
The following command will build a set of utterance structures (including building the relations that link between these basic relations):
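The layout described above could be set up along the following lines. This is only a sketch: the variable DB_ROOT is hypothetical, and the relation names other than Segment and Syllable are assumptions chosen to match the label types listed earlier.

```shell
#!/bin/sh
# Sketch: create the label-file hierarchy described in the text.
# DB_ROOT is a hypothetical variable naming the database directory.
DB_ROOT="${DB_ROOT:-.}"

# Segment and Syllable come from the text; the other names are assumed.
for rel in Segment Syllable Word Phrase IntEvent Target; do
    mkdir -p "$DB_ROOT/festival/relations/$rel"
done
```

The label files for each relation would then be placed in the matching subdirectory.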
make_utts -phoneset radio festival/relations/Segment/*.Segment
This will create utterances in festival/utts/. make_utts has a number of options; use -h to list them. The -eval option allows extra Scheme code to be loaded, which may be called by the utterance building process. The function make_utts_user_function will be called on every utterance created. Redefining it in database-specific loaded code allows database-specific fixes to be applied to each utterance.
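A database-specific fix file might be prepared as in the sketch below. The file name db_fixes.scm and the particular fix (renaming segments labelled sil to pau) are hypothetical, and the sketch assumes that make_utts_user_function receives the utterance as its single argument and returns it; check the make_utts script itself before relying on this.

```shell
#!/bin/sh
# Write a hypothetical Scheme file of database-specific fixes.
# The quoted 'EOF' stops the shell from expanding anything inside.
cat > db_fixes.scm <<'EOF'
;; Assumption: called with each utterance, and expected to return it.
(define (make_utts_user_function utt)
  ;; Hypothetical fix: rename segments labelled "sil" to "pau".
  (mapcar
   (lambda (seg)
     (if (string-equal (item.name seg) "sil")
         (item.set_name seg "pau")))
   (utt.relation.items utt 'Segment))
  utt)
EOF
```

The file could then be given to make_utts with -eval db_fixes.scm alongside the other arguments.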