A class used to train (and test) SCFGs; it is an extension of EST_SCFG.
#include <include/EST_SCFG.h>
Public Member Functions | |
EST_SCFG_traintest () | |
~EST_SCFG_traintest () | |
void | test_corpus () |
void | test_crossbrackets () |
void | load_corpus (const EST_String &filename) |
void | train_inout (int passes, int startpass, int checkpoint, int spread, const EST_String &outfile) |
Public Member Functions inherited from EST_SCFG | |
EST_SCFG () | |
EST_SCFG (LISP rules) | |
Initialize from a set of rules. | |
~EST_SCFG () | |
EST_read_status | load (const EST_String &filename) |
Load grammar from named file. | |
EST_write_status | save (const EST_String &filename) |
Save current grammar to named file. | |
void | set_rules (LISP rules) |
Set (or reset) rules from external source after construction. | |
LISP | get_rules () |
Return rules as LISP list. | |
int | distinguished_symbol () const |
void | find_terms_nonterms (EST_StrList &nt, EST_StrList &t, LISP rules) |
EST_String | nonterminal (int p) const |
Convert nonterminal index to string form. | |
EST_String | terminal (int m) const |
Convert terminal index to string form. | |
int | nonterminal (const EST_String &p) const |
Convert nonterminal string to index. | |
int | terminal (const EST_String &m) const |
Convert terminal string to index. | |
int | num_nonterminals () const |
Number of nonterminals. | |
int | num_terminals () const |
Number of terminals. | |
double | prob_B (int p, int q, int r) const |
The rule probability of given binary rule. | |
double | prob_U (int p, int m) const |
The rule probability of given unary rule. | |
void | set_rule_prob_cache () |
(Re-)set the rule probability caches. | |
Additional Inherited Members | |
Public Attributes inherited from EST_SCFG | |
SCFGRuleList | rules |
The rules themselves. | |
A class used to train (and test) SCFGs; it is an extension of EST_SCFG.
This offers an implementation of Pereira and Schabes, "Inside-Outside Reestimation from Partially Bracketed Corpora", ACL 1992.
An SCFG may be trained from a corpus (optionally containing brackets) over a series of passes, with the grammar probabilities reestimated after each pass. The class extends EST_SCFG with support for a bracketed corpus and various indexes for efficient use of the grammar.
Definition at line 259 of file EST_SCFG.h.
EST_SCFG_traintest::EST_SCFG_traintest | ( | void | ) |
Definition at line 196 of file EST_SCFG_inout.cc.
EST_SCFG_traintest::~EST_SCFG_traintest | ( | void | ) |
Definition at line 204 of file EST_SCFG_inout.cc.
void EST_SCFG_traintest::test_corpus | ( | ) |
Test the current grammar against the current corpus and print a summary.
Only the cross-entropy measure is reported.
Definition at line 561 of file EST_SCFG_inout.cc.
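To make the reported measure concrete, the following is a minimal sketch of a per-word cross-entropy computation. It is not taken from EST_SCFG_inout.cc: the function name `cross_entropy_per_word`, its inputs, and the choice of natural logarithms and per-word normalisation are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT part of EST_SCFG_traintest.
// sentence_log_probs[i]: natural-log probability the grammar assigns to sentence i.
// sentence_lengths[i]:   number of words in sentence i.
// Returns the negative total log probability divided by the total word count
// (an assumed normalisation; the library may normalise differently).
double cross_entropy_per_word(const std::vector<double>& sentence_log_probs,
                              const std::vector<int>& sentence_lengths)
{
    double total_log_prob = 0.0;
    double total_words = 0.0;
    for (std::size_t i = 0;
         i < sentence_log_probs.size() && i < sentence_lengths.size(); ++i) {
        total_log_prob += sentence_log_probs[i];
        total_words += sentence_lengths[i];
    }
    if (total_words == 0.0)
        return 0.0;  // empty corpus: nothing to measure
    return -(total_log_prob / total_words);
}
```

Under these assumptions, a lower value means the grammar assigns higher probability to the corpus, so the figure can be compared across training passes on the same corpus.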
void EST_SCFG_traintest::test_crossbrackets | ( | ) |
Test the current grammar against the current corpus.
The summary includes the cross-bracketing accuracy, as a percentage, and the percentage of fully correct parses.
Definition at line 519 of file EST_SCFG_Chart.cc.
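As an illustration of what a crossing bracket is, the sketch below counts the brackets in a parse that cross some bracket in the reference bracketing. It follows the usual definition (two spans cross when they overlap without one containing the other) and is not taken from EST_SCFG_Chart.cc; the names `Span`, `spans_cross`, and `count_crossing` are hypothetical, and the exact accuracy formula used by the library is not assumed here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical types and helpers, NOT part of EST_SCFG_traintest.
// A span is a half-open range [first, second) of word positions.
typedef std::pair<int, int> Span;

// Two spans cross when they overlap but neither contains the other.
bool spans_cross(const Span& a, const Span& b)
{
    return (a.first < b.first && b.first < a.second && a.second < b.second) ||
           (b.first < a.first && a.first < b.second && b.second < a.second);
}

// Number of parsed spans that cross at least one reference span.
int count_crossing(const std::vector<Span>& parsed,
                   const std::vector<Span>& reference)
{
    int crossing = 0;
    for (std::size_t i = 0; i < parsed.size(); ++i) {
        for (std::size_t j = 0; j < reference.size(); ++j) {
            if (spans_cross(parsed[i], reference[j])) {
                ++crossing;
                break;  // count each parsed span at most once
            }
        }
    }
    return crossing;
}
```

A parse with no crossing brackets is consistent with the reference bracketing even if it contains extra brackets, which is why this measure suits partially bracketed corpora.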
void EST_SCFG_traintest::load_corpus | ( | const EST_String & | filename | ) |
Load a corpus from the given file.
Each sentence in the corpus should be contained in parentheses. Additional parentheses may be used to denote phrasing within a sentence. The corpus is read using the LISP reader, so LISP conventions apply; notably, single quotes should appear within double quotes.
Definition at line 209 of file EST_SCFG_inout.cc.
void EST_SCFG_traintest::train_inout | ( | int | passes, |
int | startpass, | ||
int | checkpoint, | ||
int | spread, | ||
const EST_String & | outfile | ||
) |
Train a grammar using the loaded corpus.
passes | The number of training passes desired. |
startpass | The pass from which to start. |
checkpoint | Save the grammar every n passes. |
spread | Percentage of the corpus to use on each pass; the selection cycles through the corpus from pass to pass. |
outfile | Output file name. |
Definition at line 484 of file EST_SCFG_inout.cc.
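The interaction of `spread` and `checkpoint` can be sketched as follows. This is one plausible reading of the parameter descriptions above, not the code in EST_SCFG_inout.cc: the helpers `sentences_for_pass` and `is_checkpoint_pass`, the modulo-100 selection rule, and the treatment of non-positive values as "disabled" are all assumptions.

```cpp
#include <vector>

// Hypothetical helpers, NOT part of EST_SCFG_traintest.

// Indices of the corpus sentences used on a given pass.
// Assumption: spread is a percentage; values <= 0 or >= 100 mean "use everything".
// The window of selected sentences shifts by `spread` on every pass, so the
// whole corpus is covered after enough passes.
std::vector<int> sentences_for_pass(int corpus_size, int pass, int spread)
{
    std::vector<int> selected;
    for (int c = 0; c < corpus_size; ++c) {
        if (spread > 0 && spread < 100 &&
            ((c + pass * spread) % 100) >= spread)
            continue;  // sentence skipped on this pass
        selected.push_back(c);
    }
    return selected;
}

// True when the grammar would be saved after this pass.
// Assumption: passes are numbered from 0 and a non-positive checkpoint
// disables checkpointing.
bool is_checkpoint_pass(int pass, int checkpoint)
{
    return checkpoint > 0 && (pass % checkpoint) == checkpoint - 1;
}
```

With a 200-sentence corpus and a `spread` of 25, this sketch selects 50 sentences per pass and visits every sentence exactly once over four consecutive passes, trading per-pass accuracy of the reestimation for faster passes.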