One of the basic tools available with Festival is a system for building and using Classification and Regression Trees (breiman84). This standard statistical method can be used to predict both categorical and continuous data from a set of feature vectors.
The tree itself contains yes/no questions about features and ultimately provides either a probability distribution, when predicting categorical values (classification tree), or a mean and standard deviation, when predicting continuous values (regression tree). Well-defined techniques can be used to construct an optimal tree from a set of training data. The program wagon, developed in conjunction with Festival and distributed with the speech tools, provides a basic but increasingly powerful method for constructing trees.
A tree need not be automatically constructed. CART trees have the advantage over some other automatic training methods, such as neural networks and linear regression, in that their output is more readable and often understandable by humans. Importantly, this makes it possible to modify them. CART trees may also be fully hand-constructed. This is done, for example, in generating some duration models for languages for which we do not yet have full databases to train from.
A CART tree has the following syntax:

    CART ::= QUESTION-NODE || ANSWER-NODE
    QUESTION-NODE ::= ( QUESTION YES-NODE NO-NODE )
    YES-NODE ::= CART
    NO-NODE ::= CART
    QUESTION ::= ( FEATURE in LIST )
    QUESTION ::= ( FEATURE is STRVALUE )
    QUESTION ::= ( FEATURE = NUMVALUE )
    QUESTION ::= ( FEATURE > NUMVALUE )
    QUESTION ::= ( FEATURE < NUMVALUE )
    QUESTION ::= ( FEATURE matches REGEX )
    ANSWER-NODE ::= CLASS-ANSWER || REGRESS-ANSWER
    CLASS-ANSWER ::= ( (VALUE0 PROB) (VALUE1 PROB) ... MOST-PROB-VALUE )
    REGRESS-ANSWER ::= ( ( STANDARD-DEVIATION MEAN ) )
Note that answer nodes are distinguished by their car not being atomic.
The interpretation of a tree is with respect to a Stream_Item. The FEATURE in a tree is a standard feature (see Features).
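The walk through a tree can be sketched outside Festival as follows. This is a minimal Python illustration, not Festival's own implementation. The item is stood in for by a plain dictionary of feature values, a leaf is assumed to be a one-element list wrapping the answer, `matches` is assumed to require a full-string match, and the names `ask` and `predict`, the feature names, and the sample tree are all hypothetical.

```python
import re


def ask(question, features):
    """Evaluate one (FEATURE OP VALUE) question against a feature dictionary."""
    feature, op, value = question
    actual = features[feature]
    if op == "is":
        return str(actual) == str(value)
    if op == "in":
        return actual in value
    if op == "=":
        return float(actual) == float(value)
    if op == ">":
        return float(actual) > float(value)
    if op == "<":
        return float(actual) < float(value)
    if op == "matches":
        # Assumption: the regular expression must match the whole value.
        return re.fullmatch(value, str(actual)) is not None
    raise ValueError(f"unknown operator: {op!r}")


def predict(tree, features):
    """Follow yes/no branches until a leaf is reached; return its answer."""
    while len(tree) == 3:              # (QUESTION YES-NODE NO-NODE)
        question, yes_node, no_node = tree
        tree = yes_node if ask(question, features) else no_node
    return tree[0]                     # assumed leaf shape: (ANSWER,)


# Hypothetical regression tree; each leaf holds (STANDARD-DEVIATION, MEAN).
tree = (
    ("stress", "is", "1"),
    ((0.2, 1.5),),
    (("syl_break", ">", 1),
     ((0.3, 1.2),),
     ((0.1, 1.0),)),
)

answer = predict(tree, {"stress": 0, "syl_break": 2})
mean = answer[-1]                      # the mean is the last element of the answer
```

With these inputs the first question fails and the second succeeds, so the walk ends at the leaf whose mean is 1.2.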
The following example tree is used in one of the Spanish voices to predict variations from average durations.
    (set! spanish_dur_tree
     '
      ((R:SylStructure.parent.R:Syllable.p.syl_break > 1 ) ;; clause initial
       ((R:SylStructure.parent.stress is 1)
        ((1.5))
        ((1.2)))
       ((R:SylStructure.parent.syl_break > 1) ;; clause final
        ((R:SylStructure.parent.stress is 1)
         ((2.0))
         ((1.5)))
        ((R:SylStructure.parent.stress is 1)
         ((1.2))
         ((1.0))))))
It is applied to the segment stream to give a factor to multiply the average by.
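To make the arithmetic concrete, the sketch below mirrors the tree above as nested Python tuples and applies it to one segment. It is only an illustration under stated assumptions: the feature values for the segment and the average duration of 0.080 seconds are invented, the helper `duration_factor` is hypothetical, and only the two operators used by this tree (`>` and `is`) are handled.

```python
# The Spanish duration tree from the text, rewritten as nested tuples.
SPANISH_DUR_TREE = (
    ("R:SylStructure.parent.R:Syllable.p.syl_break", ">", 1),    # clause initial
    (("R:SylStructure.parent.stress", "is", 1), ((1.5,),), ((1.2,),)),
    (("R:SylStructure.parent.syl_break", ">", 1),                # clause final
     (("R:SylStructure.parent.stress", "is", 1), ((2.0,),), ((1.5,),)),
     (("R:SylStructure.parent.stress", "is", 1), ((1.2,),), ((1.0,),))),
)


def duration_factor(tree, features):
    """Walk the tree and return the factor stored in the leaf that is reached."""
    while len(tree) == 3:                        # question node
        (feature, op, value), yes_node, no_node = tree
        actual = features[feature]
        holds = actual > value if op == ">" else actual == value
        tree = yes_node if holds else no_node
    return tree[0][-1]                           # leaf: ((FACTOR,),)


# Invented feature values: a stressed, clause-final syllable.
segment = {
    "R:SylStructure.parent.R:Syllable.p.syl_break": 0,
    "R:SylStructure.parent.syl_break": 3,
    "R:SylStructure.parent.stress": 1,
}

average_duration = 0.080                         # seconds, invented value
factor = duration_factor(SPANISH_DUR_TREE, segment)
duration = average_duration * factor
```

Here the segment is not clause initial, is clause final, and is stressed, so the factor is 2.0 and the predicted duration is twice the average.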
wagon is constantly improving and, with version 1.2 of the speech tools, may now be considered fairly stable for its basic operations. Experimental features are described in the help it gives. See the Speech Tools manual for a more comprehensive discussion of using wagon.
However, the above tree format is similar to those produced by many other systems, and hence it is reasonable to translate their formats into one which Festival can use.
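As a rough illustration of such a translation, the sketch below converts a regression tree held in an invented nested-dictionary form into the parenthesised syntax described earlier. The input layout (keys `feature`, `op`, `value`, `yes`, `no`, `stddev`, `mean`) and the function name `to_festival` are assumptions made for this example and do not correspond to any particular tool's output.

```python
def to_festival(node):
    """Render a nested-dictionary regression tree in the CART syntax above."""
    if "mean" in node:
        # Leaf: REGRESS-ANSWER ::= ( ( STANDARD-DEVIATION MEAN ) )
        return f"(({node['stddev']} {node['mean']}))"
    question = f"({node['feature']} {node['op']} {node['value']})"
    yes_text = to_festival(node["yes"])
    no_text = to_festival(node["no"])
    return f"({question} {yes_text} {no_text})"


# Hypothetical tree as another system might represent it.
foreign_tree = {
    "feature": "stress", "op": "is", "value": 1,
    "yes": {"stddev": 0.2, "mean": 1.5},
    "no": {"stddev": 0.1, "mean": 1.0},
}

festival_text = to_festival(foreign_tree)
```

The resulting string could then be quoted and bound with set! in the same way as the Spanish duration tree.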