The appropriate diphone is selected based on the name of the phone identified in the segment stream. For better diphone synthesis, however, it is useful to augment the diphone database with diphones beyond those taken directly from the phoneme set, for example dark and light l's, or consonants distinguished by whether they occur in a consonant cluster or in isolation. There are two methods for identifying this modification from the basic name.
The first method uses a hook. When the diphone module is called, the
hook diphone_module_hooks is applied. This is a function or list of
functions which will be applied to the utterance. Its main purpose is
to allow the conversion of the basic name into an augmented one, for
example converting a basic l into a dark l, denoted by ll. The
functions given in diphone_module_hooks may set the feature
diphone_phone_name, which, if set, will be used rather than the name
of the segment.
For example, suppose we wish to use a dark l (ll) rather than a normal
l for all l's that appear in the coda of a syllable. First we would
define a function which identifies this condition and adds the
additional feature diphone_phone_name to identify the name change.
The following function would achieve this:
(define (fix_dark_ls utt)
"(fix_dark_ls UTT)
Identify ls in coda position and relabel them as ll."
  (mapcar
   (lambda (seg)
     (if (and (string-equal "l" (item.name seg))
              (string-equal "+" (item.feat seg "p.ph_vc"))
              (item.relation.prev seg "SylStructure"))
         (item.set_feat seg "diphone_phone_name" "ll")))
   (utt.relation.items utt 'Segment))
  utt)
Then when we wish to use this for a particular voice we need to add
(set! diphone_module_hooks (list fix_dark_ls))
in the voice selection function.
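As a rough illustration of where that line might sit, the fragment
below sketches a voice selection function and a quick check of the
result. The voice name voice_my_diphone and the test sentence are
hypothetical and are not part of any distributed voice. The sketch
assumes fix_dark_ls has been defined as above and that the voice's
diphone database contains, or has an alternate for, the ll diphones.

```scheme
;; Hypothetical voice selection function (illustrative name only).
(define (voice_my_diphone)
  "(voice_my_diphone)
Select a diphone voice that relabels coda l's as ll."
  ;; ... the voice's usual phoneset, lexicon and synthesizer set-up ...
  ;; Install the hook; it is applied whenever the diphone module is called.
  (set! diphone_module_hooks (list fix_dark_ls))
  'my_diphone)

;; After selecting the voice, synthesize an utterance and list each
;; segment's name next to its diphone_phone_name feature.
(voice_my_diphone)
(set! utt1 (utt.synth (Utterance Text "A full bell.")))
(mapcar
 (lambda (seg)
   (list (item.name seg) (item.feat seg "diphone_phone_name")))
 (utt.relation.items utt1 'Segment))
```

Because diphone_module_hooks is a global variable, a voice that does
not want these mappings would presumably need to reset it in its own
selection function.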
For a more complex example, including consonant cluster identification,
see the American English voice ked in
festival/lib/voices/english/ked/festvox/kd_diphone.scm. The function
ked_diphone_fix_phone_name carries out a number of mappings.
The second method for changing a name takes place during the actual
lookup of a diphone in the database. The list of alternates is
specified through the Diphone_Init function. These are used when the
specified diphone can't be found. For example, we often allow mappings
of dark l, ll, to l, as sometimes the dark l diphone doesn't actually
exist in the database.