14.6 Features

In previous versions items had a number of predefined features. This is no longer the case and all features are optional. In particular, the start and end features are no longer fixed, though those names are still used in the relations where they are appropriate. Specific functions are provided for the name feature, but they are just shorthand for normal feature access. Simple features directly access the features in the underlying EST_Feature class in an item.

In addition to simple features there is a mechanism for relating functions to names, so accessing a feature may actually call a function. For example, the feature num_syls is defined as a feature function which counts the number of syllables in the given word, rather than simply accessing a pre-existing feature. Feature functions usually depend on the particular relation the item is in, e.g. some feature functions are only appropriate for items in the Word relation, or only for those in the IntEvent relation.

The third aspect of feature names is a path component. These are parts of the name (prefixes, each ending in .) that indicate some traversal of the utterance structure. For example, the feature name will access the name feature on the given item. The feature n.name will return the name feature on the next item (in that item's relation). A number of basic direction operators are defined.

n.
next
p.
previous
nn.
next next
pp.
previous previous
parent.
parent
daughter1.
first daughter
daughter2.
second daughter
daughtern.
last daughter
first.
most previous item
last.
most next item
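
As a rough illustration of the direction operators above, the following sketch reads features along paths from the first item in the Word relation. The example text and the variable names utt1 and w are assumptions made for illustration, and the values returned depend on the voice and lexicon in use.

```scheme
;; Build and synthesize an utterance so that the Word relation exists.
;; The text is an arbitrary example.
(set! utt1 (utt.synth (Utterance Text "The quick brown fox.")))

;; Take the first item in the Word relation.
(set! w (utt.relation.first utt1 'Word))

(item.feat w "name")       ; name of this word
(item.feat w "n.name")     ; name of the next word
(item.feat w "nn.name")    ; name of the word after the next one
(item.feat w "last.name")  ; name of the last word in the relation
```
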

You may also specify traversal to another relation, through the R:<relationname>. operator. For example, given an item in the Syllable relation, R:SylStructure.parent.name would give the name of the word the syllable is in.

Some more complex examples are as follows, assuming we are starting from an item in the Syllable relation.

stress
This item's lexical stress
n.stress
The next syllable's lexical stress
p.stress
The previous syllable's lexical stress
R:SylStructure.parent.name
The word this syllable is in
R:SylStructure.parent.R:Word.n.name
The word next to the word this syllable is in
n.R:SylStructure.parent.name
The word the next syllable is in
R:SylStructure.daughtern.ph_vc
The phonetic feature vc of the final segment in this syllable.
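
The paths listed above can be tried interactively. The sketch below walks every item in the Syllable relation and prints three of them. The example text and the name utt1 are assumptions, and %l is used in the format string so that each value prints whatever type it comes back as.

```scheme
;; Synthesize an arbitrary example utterance.
(set! utt1 (utt.synth (Utterance Text "Hello world.")))

;; For each syllable print: the word it is in, its lexical stress,
;; and the word the next syllable is in.
(mapcar
 (lambda (syl)
   (format t "%l %l %l\n"
           (item.feat syl "R:SylStructure.parent.name")
           (item.feat syl "stress")
           (item.feat syl "n.R:SylStructure.parent.name")))
 (utt.relation.items utt1 'Syllable))
```

For the final syllable the third path has no next item to follow, so the "no valid path" value is printed there instead of a word name.
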

A list of all feature functions is given in an appendix of this document. See Feature functions. New functions may also be added in Lisp.
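
One way a feature function might be added in Lisp is sketched below. It assumes the convention that a feature name beginning with lisp_ calls the Scheme function named by the rest of the feature name, passing it the item; check this against your Festival version. The function utt_final_word is hypothetical and defined here purely for illustration.

```scheme
;; Hypothetical feature function: returns 1 if WORD has no following
;; item in its relation, and 0 otherwise.
(define (utt_final_word word)
  (if (item.next word) 0 1))

(set! utt1 (utt.synth (Utterance Text "Hello world.")))
(set! w (utt.relation.first utt1 'Word))

;; Assumed convention: the lisp_ prefix dispatches to the Scheme
;; function utt_final_word defined above.
(item.feat w "lisp_utt_final_word")
```

If the convention holds, the same name can be combined with path components, e.g. n.lisp_utt_final_word.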

In C++ feature values are of class EST_Val, which may be a string, an int, or a float (or any arbitrary object). In Scheme this distinction cannot always be made, and sometimes when you expect an int you actually get a string. Care should be taken to ensure the right matching functions are used in Scheme. It is recommended you use string-append or string-match as they will always work.
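
A defensive way of handling this in Scheme is sketched below. It compares the value as a string and converts it explicitly when a number is needed. The use of string-equal and parse-number here is a choice made for this example rather than something the text above prescribes, and the example text and variable names are assumptions.

```scheme
(set! utt1 (utt.synth (Utterance Text "Hello world.")))
(set! syl (utt.relation.first utt1 'Syllable))

;; Compare as strings, so the test does not depend on whether the
;; value came back as an int or as a string.
(if (string-equal (item.feat syl "stress") "1")
    (format t "first syllable is stressed\n")
    (format t "first syllable is unstressed\n"))

;; Convert explicitly before doing arithmetic on the value.
(set! stress-value (parse-number (item.feat syl "stress")))
(+ stress-value 1)
```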

If a path name does not identify a valid path for the particular item (e.g. there is no next item), "0" is returned.
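
The following sketch shows one way to test for this case. The first word of an utterance has no previous item, so p.name is expected to give "0". The example text and variable names are assumptions.

```scheme
(set! utt1 (utt.synth (Utterance Text "Hello world.")))
(set! w (utt.relation.first utt1 'Word))

;; The first word has no previous item, so the path p.name is invalid.
(if (string-equal (item.feat w "p.name") "0")
    (format t "no previous word\n")
    (format t "previous word: %l\n" (item.feat w "p.name")))
```

Note that this test cannot tell a missing item apart from an item whose feature value really is "0". Where that matters, checking the structure directly, e.g. with (item.prev w), avoids the ambiguity.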

When collecting data from speech databases it is often useful to collect a whole set of features from all utterances in a database. These features can then be used for building various models (both CART tree models and linear regression models use these feature names).

A number of functions exist to help in this task. For example

     (utt.features utt1 'Word '(name pos p.pos n.pos))

will return, for each word in the utterance, a list containing the word's name and its part-of-speech context (its own part of speech and those of the previous and next words).
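
As a small usage sketch, the result can be walked with mapcar to print one line per word. The example text and the name utt1 are assumptions, and the actual values depend on the voice and part-of-speech tagger in use.

```scheme
(set! utt1 (utt.synth (Utterance Text "The cat sat on the mat.")))

;; Each element of the result is a list of the four requested feature
;; values for one word, in the order the feature names were given.
(mapcar
 (lambda (feats) (format t "%l\n" feats))
 (utt.features utt1 'Word '(name pos p.pos n.pos)))
```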

See Extracting features, for an example of extracting sets of features from a database for use in building stochastic models.