Edinburgh Speech Tools  2.1-release
wagon_test

Test CART models

Synopsis

wagon_test is used to test CART models on feature data.

A detailed description of the CART model can be found in the Overview section.

Options

  • -desc: ifile Field description file
  • -data: ifile Datafile, one vector per line
  • -tree: ifile File containing CART tree
  • -track: ifile track for vertex indices
  • -predict: Predict for each vector returning full vector
  • -predict_val: Predict for each vector returning just value
  • -predictee: string name of field to predict (default is first field)
  • -heap: int {210000} Set size of Lisp heap; should not normally need to be changed from its default
  • -o: ofile File to save output in

Testing trees

Decision trees generated by wagon (or otherwise) can be applied to and tested against data sets using this program. It requires a data set in the same format that wagon (and other programs) require. It also needs a dataset description file naming the fields and giving their types (see wagon for a description of the actual format).

wagon_test -data feats.data -desc feats.desc -tree feats.tree

This simply applies the tree to each sample in the data file, compares the predicted value with the actual value, and produces a summary of the results. For categorical predictees, a percentage correct and a confusion matrix are generated. For continuous predictees, the root mean squared error (RMSE) and the correlation between the predicted and actual values are given.
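The summary statistics named above can be illustrated with a minimal Python sketch. It is not wagon_test's own code: the function names are hypothetical, and the use of the Pearson coefficient for "correlation" is an assumption, since the text does not say which correlation measure is reported.

```python
import math
from collections import Counter

def rmse(actual, predicted):
    # Root mean squared error between actual and predicted continuous values.
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def correlation(actual, predicted):
    # Pearson correlation (an assumption; the manual only says "correlation").
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def percent_correct(actual, predicted):
    # Percentage of samples whose predicted class equals the actual class.
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * hits / len(actual)

def confusion_matrix(actual, predicted):
    # Counts of (actual class, predicted class) pairs.
    return Counter(zip(actual, predicted))
```

For a categorical predictee one would look at percent_correct and confusion_matrix; for a continuous one, rmse and correlation.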

By default the predictee is the first field, but another field may be named on the command line with -predictee. The dataset may contain features which are not used by the tree.

This program can also be used to generate output values for sampled data. In this case the sample data must still contain a "value" for the predictee, even if it is a dummy. The -predict option outputs each sample vector with the predicted value in place of the predictee, and the -predict_val option outputs just the predicted value.
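The difference between the two output modes can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the helper name is hypothetical, fields are assumed to be whitespace-separated, the predictee is assumed to be the first field, and the exact formatting wagon_test writes may differ.

```python
def format_prediction(line, predicted, predictee_index=0, full_vector=True):
    # Hypothetical helper, not part of the speech tools.
    # line: one data vector, assumed whitespace-separated.
    # predicted: the value the tree predicted for this vector.
    if not full_vector:
        # Analogous to -predict_val: only the predicted value.
        return str(predicted)
    # Analogous to -predict: the vector with the (possibly dummy)
    # predictee value replaced by the prediction.
    fields = line.split()
    fields[predictee_index] = str(predicted)
    return " ".join(fields)
```

Here a dummy predictee such as 0 in the input vector is simply overwritten in the full-vector mode and ignored in the value-only mode.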

This program is designed specifically for testing, although it can also be used purely for prediction. For prediction alone, it is probably more efficient to use the Lisp function wagon or the underlying C++ function wagon_predict().