Edinburgh Speech Tools  2.1-release
EST_TokenStream Class Reference

#include <include/EST_Token.h>

Public Member Functions

 EST_TokenStream ()
 
 ~EST_TokenStream ()
 will close file if appropriate for type
 
EST_Token get_upto (const EST_String &s)
 get up to s in stream as a single token.
 
EST_Token get_upto_eoln (void)
 get up to the end of line as a single token.
 
EST_Token & peek (void)
 peek at next token
 
int fread (void *buff, int size, int nitems) EST_WARN_UNUSED_RESULT
 Reading binary data, (don't use peek() immediately beforehand)
 
int open (const EST_String &filename)
 open an EST_TokenStream for a file.
 
int open (FILE *ofp, int close_when_finished)
 open an EST_TokenStream for an already opened file
 
int open (std::istream &newis)
 open an EST_TokenStream for an already open istream
 
int open_string (const EST_String &newbuffer)
 open an EST_TokenStream for string rather than a file
 
void close (void)
 Close stream.
 
stream access functions
EST_TokenStream & get (EST_Token &t)
 get next token in stream
 
EST_Token & get ()
 get next token in stream
 
get the next token which must be the argument.
EST_Token & must_get (EST_String expected, bool *ok)
 
EST_Token & must_get (EST_String expected, bool &ok)
 
EST_Token & must_get (EST_String expected)
 
stream initialization functions
void set_WhiteSpaceChars (const EST_String &ws)
 set which characters are to be treated as whitespace
 
void set_SingleCharSymbols (const EST_String &sc)
 set which characters are to be treated as single character symbols
 
void set_PunctuationSymbols (const EST_String &ps)
 set which characters are to be treated as (post) punctuation
 
void set_PrePunctuationSymbols (const EST_String &ps)
 set which characters are to be treated as (pre) punctuation
 
void set_quotes (char q, char e)
 set characters to be used as quotes and escape, and set quote mode
 
int quoted_mode (void)
 query quote mode
 

miscellaneous

int linenum (void) const
 returns line number of EST_TokenStream
 
int eof ()
 end of file
 
int eoln ()
 end of line
 
EST_FilePos filepos (void) const
 current file position in EST_TokenStream
 
EST_FilePos tell (void) const
 tell, synonym for filepos
 
int seek (int position)
 seek, reposition file pointer
 
int seek_end ()
 
int restart (void)
 Reset to start of file/string.
 
const EST_String pos_description ()
 A string describing current position, suitable for error messages.
 
const EST_String filename () const
 The originating filename (if there is one)
 
FILE * filedescriptor ()
 For those who need the actual file descriptor (if possible)
 
EST_TokenStream & operator>> (EST_Token &p)
 
EST_TokenStream & operator>> (EST_String &p)
 
std::ostream & operator<< (std::ostream &s, EST_TokenStream &p)
 

Detailed Description

A class for reading EST_Token objects from a file stream, pipe or string. It automatically tokenizes the input based on user-definable whitespace and punctuation.

The definitions of whitespace and punctuation are user definable. Single character symbols are also supported: they are always treated as individual tokens, irrespective of their whitespace context. A quote mode can also be used to read quoted tokens.
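The splitting rules described above can be illustrated with a small standalone sketch. It does not use or reproduce the EST implementation: ToyToken, ToyConfig and toy_tokenize are hypothetical names, and the handling of chunks made up entirely of punctuation is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration only: none of these names belong to EST.
struct ToyToken {
    std::string prepunc;  // leading punctuation split off the token
    std::string name;     // the token text itself
    std::string punc;     // trailing punctuation split off the token
};

struct ToyConfig {
    std::string whitespace = " \t\n\r";
    std::string single_char_symbols;
    std::string prepunctuation;
    std::string punctuation;
};

static bool in_set(const std::string &set, char c) {
    return set.find(c) != std::string::npos;
}

// Split one whitespace-delimited chunk into prepunctuation, name, punctuation.
static ToyToken split_punctuation(const std::string &raw, const ToyConfig &cfg) {
    std::size_t begin = 0, end = raw.size();
    while (begin < end && in_set(cfg.prepunctuation, raw[begin])) ++begin;
    while (end > begin && in_set(cfg.punctuation, raw[end - 1])) --end;
    // Assumption of this sketch: a chunk that is all punctuation is kept as the name.
    if (begin == end) return ToyToken{"", raw, ""};
    return ToyToken{raw.substr(0, begin), raw.substr(begin, end - begin), raw.substr(end)};
}

std::vector<ToyToken> toy_tokenize(const std::string &text, const ToyConfig &cfg) {
    std::vector<ToyToken> out;
    std::string raw;
    auto flush = [&]() {
        if (!raw.empty()) {
            out.push_back(split_punctuation(raw, cfg));
            raw.clear();
        }
    };
    for (char c : text) {
        if (in_set(cfg.whitespace, c)) {
            flush();  // whitespace ends the current chunk
        } else if (in_set(cfg.single_char_symbols, c)) {
            flush();  // a single character symbol is a token on its own,
            out.push_back(ToyToken{"", std::string(1, c), ""});  // whatever surrounds it
        } else {
            raw += c;
        }
    }
    flush();
    return out;
}
```

With "(" and ")" as single character symbols, the input f(x) yields four tokens even though it contains no whitespace, which is the behaviour the paragraph above describes for single character symbols.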

The setting of whitespace, pre and post punctuation, single character symbols and quote mode must be done (immediately) after opening the stream.

There is no unget but peek provides look ahead of one token.

Note there is an interesting issue about what to do with the last whitespace in the file: should it be ignored, or should it be attached to a token with a name string of length zero? In unquoted mode, eof() will return TRUE if the next token name is empty (the mythical last token). In quoted mode the last token must be returned, so eof will not be raised.
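The one-token look-ahead and the unquoted-mode eof rule can be modelled with a short standalone sketch. ToyPeekStream is a hypothetical class over whitespace-separated words, not the EST implementation; it only shows how a single buffered token can provide peek without unget, and how an empty next token can be reported as end of file.

```cpp
#include <sstream>
#include <string>

// Hypothetical illustration only: not part of EST.
class ToyPeekStream {
public:
    explicit ToyPeekStream(const std::string &text) : in_(text) {}

    // Look at the next token without consuming it.
    const std::string &peek() {
        fill();
        return lookahead_;  // empty once the input is exhausted
    }

    // Consume and return the next token.
    std::string get() {
        fill();
        std::string t = lookahead_;
        have_ = false;  // the buffered token has been handed out
        return t;
    }

    // Mirrors the unquoted-mode rule: an empty next token means end of file.
    bool eof() {
        fill();
        return lookahead_.empty();
    }

private:
    // Read one token ahead, at most once per consumed token.
    void fill() {
        if (have_) return;
        lookahead_.clear();
        in_ >> lookahead_;  // skips whitespace; leaves lookahead_ empty at end of input
        have_ = true;
    }

    std::istringstream in_;
    std::string lookahead_;
    bool have_ = false;
};
```

In this model, trailing whitespace after the last word produces an empty next token, so eof() reports true rather than returning a zero-length token. Note also that a peeked token has already been read from the underlying input in a design like this, which is one plausible reason for the warning on fread().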

Author
Alan W Black (awb@cstr.ed.ac.uk): April 1996

Definition at line 239 of file EST_Token.h.

Constructor & Destructor Documentation

EST_TokenStream::EST_TokenStream ( )

Definition at line 120 of file EST_Token.cc.

EST_TokenStream::~EST_TokenStream ( )

will close file if appropriate for type

Definition at line 180 of file EST_Token.cc.

Member Function Documentation

int EST_TokenStream::open ( const EST_String &  filename)

open an EST_TokenStream for a file.

Definition at line 213 of file EST_Token.cc.

int EST_TokenStream::open ( FILE *  ofp,
int  close_when_finished 
)

open an EST_TokenStream for an already opened file

Definition at line 231 of file EST_Token.cc.

int EST_TokenStream::open ( std::istream &  newis)

open an EST_TokenStream for an already open istream

Definition at line 251 of file EST_Token.cc.

int EST_TokenStream::open_string ( const EST_String &  newbuffer)

open an EST_TokenStream for string rather than a file

Definition at line 264 of file EST_Token.cc.

void EST_TokenStream::close ( void  )

Close stream.

Definition at line 419 of file EST_Token.cc.

EST_TokenStream & EST_TokenStream::get ( EST_Token &  t)

get next token in stream

Definition at line 499 of file EST_Token.cc.

EST_Token & EST_TokenStream::get ( void  )

get next token in stream

Definition at line 723 of file EST_Token.cc.

EST_Token & EST_TokenStream::must_get ( EST_String  expected,
bool *  ok 
)

Definition at line 574 of file EST_Token.cc.

EST_Token& EST_TokenStream::must_get ( EST_String  expected,
bool &  ok 
)
inline

Definition at line 322 of file EST_Token.h.

EST_Token& EST_TokenStream::must_get ( EST_String  expected)
inline

Definition at line 324 of file EST_Token.h.
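The must_get overloads carry no description here beyond "get the next token which must be the argument". A minimal standalone sketch of that pattern follows. ToyStream and toy_must_get are hypothetical names, and the behaviour when no ok flag is supplied (throwing an exception) is an assumption of the sketch, not a statement of what EST does in that case.

```cpp
#include <deque>
#include <stdexcept>
#include <string>

// Hypothetical illustration only: not part of EST.
struct ToyStream {
    std::deque<std::string> tokens;  // remaining tokens, front is next
};

// Consume the next token and check that it equals `expected`.
// With a non-null `ok`, the outcome is reported through the flag.
// With a null `ok`, a mismatch is treated as a hard error (assumption).
std::string toy_must_get(ToyStream &ts, const std::string &expected, bool *ok) {
    std::string tok;
    if (!ts.tokens.empty()) {
        tok = ts.tokens.front();
        ts.tokens.pop_front();
    }
    const bool matched = (tok == expected);
    if (ok != nullptr) {
        *ok = matched;
    } else if (!matched) {
        throw std::runtime_error("expected '" + expected + "' got '" + tok + "'");
    }
    return tok;
}
```

The token is consumed whether or not it matches, so a caller that wants to recover from a mismatch has to work with the returned token rather than re-read it.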

EST_Token EST_TokenStream::get_upto ( const EST_String &  s)

get up to s in stream as a single token.

Definition at line 505 of file EST_Token.cc.
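One way to read "get up to s in stream as a single token" is sketched below in standalone form. ToySpacedToken and toy_get_upto are hypothetical names. The sketch assumes that the terminating token is consumed but left out of the result, and that the whitespace preceding each token is kept; neither detail is stated on this page.

```cpp
#include <deque>
#include <string>

// Hypothetical illustration only: not part of EST.
struct ToySpacedToken {
    std::string whitespace;  // whitespace that preceded the token
    std::string name;        // the token text
};

// Concatenate tokens until one equal to `s` is found.
// Assumption: the terminator is consumed and not added to the result.
std::string toy_get_upto(std::deque<ToySpacedToken> &toks, const std::string &s) {
    std::string result;
    while (!toks.empty()) {
        ToySpacedToken t = toks.front();
        toks.pop_front();
        if (t.name == s) break;  // terminator found: stop here
        result += t.whitespace + t.name;
    }
    return result;  // if `s` never appears, everything that was left
}
```

Given the tokens a, b, ;, c and the terminator ";", this sketch returns "a b" and leaves c as the next token in the stream.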

EST_Token EST_TokenStream::get_upto_eoln ( void  )

get up to the end of line as a single token.

Definition at line 529 of file EST_Token.cc.

EST_Token& EST_TokenStream::peek ( void  )
inline

peek at next token

Definition at line 332 of file EST_Token.h.

int EST_TokenStream::fread ( void *  buff,
int  size,
int  nitems 
)

Reading binary data, (don't use peek() immediately beforehand)

Definition at line 368 of file EST_Token.cc.

void EST_TokenStream::set_WhiteSpaceChars ( const EST_String &  ws)
inline

set which characters are to be treated as whitespace

Definition at line 341 of file EST_Token.h.

void EST_TokenStream::set_SingleCharSymbols ( const EST_String &  sc)
inline

set which characters are to be treated as single character symbols

Definition at line 344 of file EST_Token.h.

void EST_TokenStream::set_PunctuationSymbols ( const EST_String &  ps)
inline

set which characters are to be treated as (post) punctuation

Definition at line 347 of file EST_Token.h.

void EST_TokenStream::set_PrePunctuationSymbols ( const EST_String &  ps)
inline

set which characters are to be treated as (pre) punctuation

Definition at line 350 of file EST_Token.h.

void EST_TokenStream::set_quotes ( char  q,
char  e 
)
inline

set characters to be used as quotes and escape, and set quote mode

Definition at line 353 of file EST_Token.h.

int EST_TokenStream::quoted_mode ( void  )
inline

query quote mode

Definition at line 355 of file EST_Token.h.

int EST_TokenStream::linenum ( void  ) const
inline

returns line number of EST_TokenStream

Definition at line 360 of file EST_Token.h.

int EST_TokenStream::eof ( )
inline

end of file

Definition at line 362 of file EST_Token.h.

int EST_TokenStream::eoln ( void  )

end of line

Definition at line 832 of file EST_Token.cc.

EST_FilePos EST_TokenStream::filepos ( void  ) const
inline

current file position in EST_TokenStream

Definition at line 367 of file EST_Token.h.

EST_FilePos EST_TokenStream::tell ( void  ) const
inline

tell, synonym for filepos

Definition at line 369 of file EST_Token.h.

int EST_TokenStream::seek ( int  position)

seek, reposition file pointer

Definition at line 318 of file EST_Token.cc.

int EST_TokenStream::seek_end ( )

Definition at line 282 of file EST_Token.cc.

int EST_TokenStream::restart ( void  )

Reset to start of file/string.

Definition at line 450 of file EST_Token.cc.

const EST_String EST_TokenStream::pos_description ( )

A string describing current position, suitable for error messages.

Definition at line 882 of file EST_Token.cc.

const EST_String EST_TokenStream::filename ( void  ) const
inline

The originating filename (if there is one)

Definition at line 378 of file EST_Token.h.

FILE* EST_TokenStream::filedescriptor ( )
inline

For those who need the actual file descriptor (if possible)

Definition at line 380 of file EST_Token.h.

EST_TokenStream & EST_TokenStream::operator>> ( EST_Token &  p)

Definition at line 485 of file EST_Token.cc.

EST_TokenStream & EST_TokenStream::operator>> ( EST_String &  p)

Definition at line 490 of file EST_Token.cc.

Friends And Related Function Documentation

std::ostream& operator<< ( std::ostream &  s,
EST_TokenStream &  p 
)
friend

The documentation for this class was generated from the following files: