Edinburgh Speech Tools  2.1-release
EST_SCFG_traintest Class Reference

A class used to train (and test) SCFGs; an extension of EST_SCFG.

#include <include/EST_SCFG.h>


Public Member Functions

 EST_SCFG_traintest ()
 
 ~EST_SCFG_traintest ()
 
void test_corpus ()
 
void test_crossbrackets ()
 
void load_corpus (const EST_String &filename)
 
void train_inout (int passes, int startpass, int checkpoint, int spread, const EST_String &outfile)
 
Public Member Functions inherited from EST_SCFG
 EST_SCFG ()
 
 EST_SCFG (LISP rules)
 Initialize from a set of rules.
 
 ~EST_SCFG ()
 
EST_read_status load (const EST_String &filename)
 Load grammar from named file.
 
EST_write_status save (const EST_String &filename)
 Save current grammar to named file.
 
void set_rules (LISP rules)
 Set (or reset) rules from external source after construction.
 
LISP get_rules ()
 Return rules as LISP list.
 
int distinguished_symbol () const
 
void find_terms_nonterms (EST_StrList &nt, EST_StrList &t, LISP rules)
 
EST_String nonterminal (int p) const
 Convert nonterminal index to string form.
 
EST_String terminal (int m) const
 Convert terminal index to string form.
 
int nonterminal (const EST_String &p) const
 Convert nonterminal string to index.
 
int terminal (const EST_String &m) const
 Convert terminal string to index.
 
int num_nonterminals () const
 Number of nonterminals.
 
int num_terminals () const
 Number of terminals.
 
double prob_B (int p, int q, int r) const
 The rule probability of the given binary rule.
 
double prob_U (int p, int m) const
 The rule probability of the given unary rule.
 
void set_rule_prob_cache ()
 (Re-)set rule probability caches.
 

Additional Inherited Members

Public Attributes inherited from EST_SCFG
SCFGRuleList rules
 The rules themselves.
 

Detailed Description

A class used to train (and test) SCFGs; an extension of EST_SCFG.

This offers an implementation of Pereira and Schabes, "Inside-Outside Reestimation from Partially Bracketed Corpora", ACL 1992.

An SCFG may be trained from a corpus, optionally containing brackets, over a series of passes, with the grammar probabilities reestimated after each pass. This class extends EST_SCFG, adding support for a bracketed corpus and various indexes for efficient use of the grammar.

Definition at line 259 of file EST_SCFG.h.

Constructor & Destructor Documentation

EST_SCFG_traintest::EST_SCFG_traintest ( void  )

Definition at line 196 of file EST_SCFG_inout.cc.

EST_SCFG_traintest::~EST_SCFG_traintest ( void  )

Definition at line 204 of file EST_SCFG_inout.cc.

Member Function Documentation

void EST_SCFG_traintest::test_corpus ( )

Test the current grammar against the current corpus and print a summary.

Only the cross-entropy measure is given.
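As an illustration of the kind of figure reported, the sketch below computes a per-word cross entropy from sentence probabilities. The normalization (by total word count), the natural-log base, and the handling of unparsable sentences are assumptions, since this page does not state them; SentenceScore, CrossEntropyResult and cross_entropy are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-sentence input: not a library type.
struct SentenceScore {
    double probability;  // probability the grammar assigns to the sentence
    std::size_t length;  // number of words in the sentence
};

struct CrossEntropyResult {
    double cross_entropy;  // negative log probability per word, in nats
    std::size_t failed;    // sentences the grammar gave zero probability
};

// Assumed definition: -(sum of log P(sentence)) / (total words), taken
// over the sentences that received a non-zero probability.
CrossEntropyResult cross_entropy(const std::vector<SentenceScore> &scores) {
    double log_prob_sum = 0.0;
    std::size_t word_count = 0;
    std::size_t failed = 0;
    for (const SentenceScore &s : scores) {
        if (s.probability <= 0.0) {
            ++failed;  // cannot take the log; count it separately
            continue;
        }
        log_prob_sum += std::log(s.probability);
        word_count += s.length;
    }
    const double h =
        word_count == 0 ? 0.0 : -log_prob_sum / static_cast<double>(word_count);
    return {h, failed};
}
```

Lower values indicate that the grammar assigns higher probability to the corpus, so the figure is useful for comparing grammars saved at different training passes.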

Definition at line 561 of file EST_SCFG_inout.cc.

void EST_SCFG_traintest::test_crossbrackets ( )

Test the current grammar against the current corpus.

The summary includes the cross-bracketing accuracy (as a percentage) and the percentage of fully correct parses.
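A cross-bracketing measure of this kind can be sketched as follows. The definition used here is an assumption (the percentage of brackets in the parse that do not cross any bracket in the reference); Span, spans_cross and percent_non_crossing are hypothetical names and not part of the library.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// A bracket over words, as a half-open range [first, second).
using Span = std::pair<std::size_t, std::size_t>;

// Two brackets cross when they overlap but neither contains the other.
bool spans_cross(const Span &a, const Span &b) {
    return (a.first < b.first && b.first < a.second && a.second < b.second) ||
           (b.first < a.first && a.first < b.second && b.second < a.second);
}

// Percentage of parsed brackets that cross no reference bracket.
double percent_non_crossing(const std::vector<Span> &parsed,
                            const std::vector<Span> &reference) {
    if (parsed.empty()) return 100.0;
    std::size_t ok = 0;
    for (const Span &p : parsed) {
        bool crosses = false;
        for (const Span &r : reference) {
            if (spans_cross(p, r)) {
                crosses = true;
                break;
            }
        }
        if (!crosses) ++ok;
    }
    return 100.0 * static_cast<double>(ok) / static_cast<double>(parsed.size());
}
```

Nested and adjacent brackets do not count as crossing, so a parse that is merely less detailed than the reference is not penalized by this measure.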

Definition at line 519 of file EST_SCFG_Chart.cc.

void EST_SCFG_traintest::load_corpus ( const EST_String & filename )

Load a corpus from the given file.

Each sentence in the corpus should be contained in parentheses. Additional parentheses may be used to denote phrasing within a sentence. The corpus is read using the LISP reader, so LISP conventions apply; notably, single quotes should appear within double quotes.
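To make the expected layout concrete, the sketch below checks a corpus string for the shape described above: one top-level parenthesized group per sentence, optional nested groups for phrasing, and double-quoted tokens treated as opaque. It is a hypothetical pre-flight check, not the LISP reader; CorpusShape and inspect_corpus are invented names, and escaped quotes inside strings are not handled.

```cpp
#include <cstddef>
#include <string>

// Summary of a corpus text; hypothetical, not a library type.
struct CorpusShape {
    std::size_t sentences = 0;  // top-level parenthesized groups
    std::size_t max_depth = 0;  // deepest nesting seen
    bool balanced = true;       // false if parentheses or quotes do not match
};

CorpusShape inspect_corpus(const std::string &text) {
    CorpusShape shape;
    std::size_t depth = 0;
    bool in_string = false;
    for (char c : text) {
        if (c == '"') {  // toggle on each double quote (no escape handling)
            in_string = !in_string;
            continue;
        }
        if (in_string) continue;  // quoted tokens may hold ' ( or )
        if (c == '(') {
            ++depth;
            if (depth > shape.max_depth) shape.max_depth = depth;
        } else if (c == ')') {
            if (depth == 0) {  // close without a matching open
                shape.balanced = false;
                return shape;
            }
            if (--depth == 0) ++shape.sentences;  // a sentence just closed
        }
    }
    if (depth != 0 || in_string) shape.balanced = false;
    return shape;
}
```

For example, the two-line text `((the cat) (sat))` followed by `(it ("'s" raining))` would be reported as two sentences with a maximum depth of two; the second line shows a token containing a single quote wrapped in double quotes.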

Definition at line 209 of file EST_SCFG_inout.cc.

void EST_SCFG_traintest::train_inout ( int  passes,
int  startpass,
int  checkpoint,
int  spread,
const EST_String & outfile 
)

Train a grammar using the loaded corpus.

Parameters
    passes      The number of training passes desired.
    startpass   The pass from which to start.
    checkpoint  Save the grammar every n passes.
    spread      Percentage of the corpus to use on each pass; the selection cycles through the corpus from pass to pass.
    outfile     Output file name.
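One way to picture how these parameters interact is the schedule sketch below. The selection rule (a window of `spread` sentences out of every 100 that shifts with the pass number), the treatment of a non-positive `spread` as "use everything", and the checkpoint rule are all assumptions made for illustration; sentences_for_pass and checkpoint_passes are hypothetical helpers, not library functions.

```cpp
#include <cstddef>
#include <vector>

// Indexes of the corpus sentences visited on a given pass.
// Assumption: spread is a percentage in 1..100, and spread <= 0 means
// "use every sentence". pass is assumed to be non-negative.
std::vector<std::size_t> sentences_for_pass(std::size_t corpus_size, int pass, int spread) {
    std::vector<std::size_t> chosen;
    for (std::size_t i = 0; i < corpus_size; ++i) {
        if (spread > 0) {
            const long long offset =
                static_cast<long long>(i) + static_cast<long long>(pass) * spread;
            if (offset % 100 >= spread) continue;  // outside this pass's window
        }
        chosen.push_back(i);
    }
    return chosen;
}

// Passes after which the grammar would be saved, assuming a save once
// every `checkpoint` passes and no saves when checkpoint <= 0.
std::vector<int> checkpoint_passes(int passes, int startpass, int checkpoint) {
    std::vector<int> saved;
    if (checkpoint <= 0) return saved;
    for (int pass = startpass; pass < passes; ++pass)
        if (pass % checkpoint == checkpoint - 1) saved.push_back(pass);
    return saved;
}
```

Under these assumptions, a spread of 25 visits a quarter of the corpus per pass and returns to the same quarter every four passes, trading per-pass accuracy of the reestimates for faster passes.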

Definition at line 484 of file EST_SCFG_inout.cc.


The documentation for this class was generated from the following files: