Functions | |
void | sig2coef (EST_Wave &sig, EST_Track &a, EST_String type, float factor=2.0, EST_WindowFunc *wf=EST_Window::creator(DEFAULT_WINDOW_NAME)) |
void | sigpr_base (EST_Wave &sig, EST_Track &fv, EST_Features &op, const EST_StrList &slist) |
void | power (EST_Wave &sig, EST_Track &a, float factor) |
void | energy (EST_Wave &sig, EST_Track &a, float factor) |
void | fbank (EST_Wave &sig, EST_Track &fbank, const float factor, EST_WindowFunc *wf=EST_Window::creator(DEFAULT_WINDOW_NAME), const bool up=false, const bool take_log=true) |
void | melcep (EST_Wave &sig, EST_Track &mfcc_track, float factor, int fbank_order, float liftering_parameter, EST_WindowFunc *wf=EST_Window::creator(DEFAULT_WINDOW_NAME), const bool include_c0=false, const bool up=false) |
In the following functions, the input is an EST_Wave waveform and the output is a (usually multi-channel) EST_Track. The track must be set up appropriately beforehand: it must be resized to the correct number of frames and channels.
The positions of the frames are found by examination of the time array in the EST_Track, which must be filled prior to the function call. The usual requirement is for fixed frame analysis, where each analysis frame is, say, 10ms after the previous one.
A common alternative is to perform pitch-synchronous analysis where the time shift is related to the local pitch period.
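As a rough illustration of the fixed-frame case, the sketch below generates the frame times that would go into a track's time array. It uses plain standard C++ rather than the library's own classes; `fixed_frame_times` is a hypothetical helper, and placing the first frame one shift after time zero is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): frame times for
// fixed-frame analysis, spaced `shift` seconds apart.
// Assumption: the first frame sits at `shift`, so the times are
// shift, 2*shift, 3*shift, ... up to and including `duration`.
std::vector<float> fixed_frame_times(float duration, float shift)
{
    assert(shift > 0.0f);
    std::vector<float> times;
    // Multiply rather than accumulate so rounding error does not build up.
    for (std::size_t i = 1; static_cast<float>(i) * shift <= duration; ++i)
        times.push_back(static_cast<float>(i) * shift);
    return times;
}
```

For pitch-synchronous analysis the same array would instead hold one time per pitch mark, so the spacing between entries varies with the local pitch period.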
void sig2coef | ( | EST_Wave & | sig, |
EST_Track & | a, | ||
EST_String | type, | ||
float | factor = 2.0 , |
||
EST_WindowFunc * | wf = EST_Window::creator(DEFAULT_WINDOW_NAME) |
||
) |
Produce a single set of coefficients from a waveform. The type of coefficient required is given in the argument type.
type | Possible types are:
The order of the analysis is calculated from the number of channels in a. The positions of the analysis windows must be given by filling in the track's time array.
This function windows the waveform at the intervals given by the track time array. The length of each window is factor * the local time shift. The windowing function is given by wf.
sig | input waveform |
a | output coefficients. These have been pre-allocated and the number of channels in a indicates the order of the analysis. |
type | the types of coefficients to be produced: "lpc", "cep", etc. |
factor | the frame length factor, i.e. the analysis frame length will be this times the local pitch period. |
wf | function for windowing. See Windowing mechanisms |
Definition at line 399 of file sigpr_utt.cc.
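The relation between the time array, factor and the window length can be sketched as follows. This is a standalone illustration, not the library's code: `Window` and `frame_windows` are hypothetical names, and two things are assumed here, namely that the local time shift of a frame is the gap to the previous frame time (or to time zero for the first frame) and that each window is centred on its frame time.

```cpp
#include <cmath>
#include <vector>

// Sample range of one analysis window.
struct Window {
    long start;   // index of the first sample (may fall outside the signal)
    long length;  // number of samples
};

// Hypothetical helper: one window per frame time, with
// length = factor * local time shift, converted to samples.
// Assumptions: local shift = times[i] - times[i-1] (times[0] for the
// first frame), and the window is centred on the frame time.
std::vector<Window> frame_windows(const std::vector<float> &times,
                                  float factor, int sample_rate)
{
    std::vector<Window> out;
    float prev = 0.0f;
    for (float t : times) {
        float shift = t - prev;
        long length = std::lround(factor * shift * sample_rate);
        long centre = std::lround(t * sample_rate);
        out.push_back({centre - length / 2, length});
        prev = t;
    }
    return out;
}
```

With a 10 ms shift, a factor of 2.0 and 16 kHz audio, each window spans 320 samples and successive windows overlap by half. A caller would still need to clip or zero-pad windows that extend past either end of the waveform.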
void sigpr_base | ( | EST_Wave & | sig, |
EST_Track & | fv, | ||
EST_Features & | op, | ||
const EST_StrList & | slist | ||
) |
Produce multiple coefficients from a waveform by repeated calls to sig2coef.
sig | input waveform |
fv | output coefficients. These have been pre-allocated and the number of channels in fv indicates the order of the analysis. |
op | Features structure containing options for analysis order, frame shift etc. |
slist | list of types of coefficients required, from the set of possible types that sig2coef can take. |
Definition at line 138 of file sigpr_utt.cc.
void power | ( | EST_Wave & | sig, |
EST_Track & | a, | ||
float | factor | ||
) |
Calculate the power for each frame of the waveform.
sig | input waveform |
a | output power track |
factor | the frame length factor, i.e. the analysis frame length will be this times the local pitch period. |
Definition at line 422 of file sigpr_utt.cc.
void energy | ( | EST_Wave & | sig, |
EST_Track & | a, | ||
float | factor | ||
) |
Calculate the rms energy for each frame of the waveform.
This function calls sig2energy.
sig | input waveform |
a | output coefficients |
factor | optional: the frame length factor, i.e. the analysis frame length will be this times the local pitch period. |
Definition at line 445 of file sigpr_utt.cc.
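To make the per-frame quantities concrete, here is a minimal sketch of power and rms energy for a single frame of samples. It takes power to be the mean squared amplitude and rms energy to be its square root; these are common textbook definitions, and the library's own routines may scale or normalise differently. Both function names are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Mean squared amplitude of one frame (assumed definition of power).
double frame_power(const std::vector<short> &frame)
{
    if (frame.empty())
        return 0.0;
    double sum = 0.0;
    for (short s : frame)
        sum += static_cast<double>(s) * s;
    return sum / static_cast<double>(frame.size());
}

// Root-mean-square energy: square root of the mean squared amplitude.
double frame_rms_energy(const std::vector<short> &frame)
{
    return std::sqrt(frame_power(frame));
}
```

Applied once per analysis window, either function gives one value per frame, which is why the output track for these functions needs only a single channel.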
void fbank | ( | EST_Wave & | sig, |
EST_Track & | fbank, | ||
const float | factor, | ||
EST_WindowFunc * | wf = EST_Window::creator(DEFAULT_WINDOW_NAME) , |
||
const bool | up = false , |
||
const bool | take_log = true |
||
) |
Mel scale filter bank analysis. The Mel scale triangular filters are computed via an FFT (see fastFFT). This routine is required for Mel cepstral analysis (see melcep). The analysis of each frame is done by sig2fbank.
A typical filter bank analysis for speech recognition might use log energy outputs from 20 filters.
sig | input waveform |
fbank | the output. The number of filters is determined from the number of channels of this track. |
factor | the frame length factor, i.e. the analysis frame length will be this times the local pitch period |
wf | function for windowing. See Windowing mechanisms |
up | whether the filterbank analysis should use power rather than energy. |
take_log | whether to take logs of the filter outputs |
Definition at line 496 of file sigpr_utt.cc.
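The spacing of Mel scale filters can be illustrated with a short standalone sketch. It uses the widely quoted conversion mel = 2595 log10(1 + f/700), which is an assumption here and may not be the exact mapping the library uses; `hz_to_mel`, `mel_to_hz` and `mel_centres` are hypothetical names.

```cpp
#include <cmath>
#include <vector>

// Common Hz <-> Mel conversion (assumed; other variants exist).
double hz_to_mel(double hz)
{
    return 2595.0 * std::log10(1.0 + hz / 700.0);
}

double mel_to_hz(double mel)
{
    return 700.0 * (std::pow(10.0, mel / 2595.0) - 1.0);
}

// Centre frequencies in Hz of n_filters triangular filters spaced
// evenly on the Mel scale between lo_hz and hi_hz (exclusive).
std::vector<double> mel_centres(int n_filters, double lo_hz, double hi_hz)
{
    std::vector<double> centres;
    const double lo = hz_to_mel(lo_hz);
    const double hi = hz_to_mel(hi_hz);
    const double step = (hi - lo) / (n_filters + 1);
    for (int i = 1; i <= n_filters; ++i)
        centres.push_back(mel_to_hz(lo + i * step));
    return centres;
}
```

Calling `mel_centres(20, 0.0, 8000.0)` gives centres that are closely packed at low frequencies and increasingly far apart towards 8 kHz, which is the property the triangular filters rely on.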
void melcep | ( | EST_Wave & | sig, |
EST_Track & | mfcc_track, | ||
float | factor, | ||
int | fbank_order, | ||
float | liftering_parameter, | ||
EST_WindowFunc * | wf = EST_Window::creator(DEFAULT_WINDOW_NAME) , |
||
const bool | include_c0 = false , |
||
const bool | up = false |
||
) |
Mel scale cepstral analysis via filter bank analysis. Cepstral parameters are computed for each frame of speech. The analysis requires fbank. The cepstral analysis of the filterbank outputs is performed by fbank2melcep.
A typical Mel cepstral coefficient (MFCC) analysis for speech recognition might use 12 cepstral coefficients computed from a 20 channel filterbank.
sig | input waveform |
mfcc_track | the output |
factor | the frame length factor, i.e. the analysis frame length will be this times the local pitch period |
fbank_order | the number of Mel scale filters used for the analysis |
liftering_parameter | for filtering in the cepstral domain. See fbank2melcep |
wf | function for windowing. See Windowing mechanisms |
include_c0 | whether the zeroth cepstral coefficient is to be included |
up | whether the filterbank analysis should use power rather than energy. |
Definition at line 540 of file sigpr_utt.cc.
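The step from log filter bank outputs to cepstral coefficients is typically a discrete cosine transform followed by liftering. The sketch below shows one common formulation (a DCT-II with sqrt(2/N) scaling and a sinusoidal lifter, as popularised by HTK). It is an assumption that this resembles what fbank2melcep does; the function name `fbank_to_cepstra` and its exact scaling are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: cepstra from the log filter bank outputs of one frame.
// Returns coefficients 1..n_ceps, preceded by c0 when include_c0 is true.
// A lifter value of 0 disables liftering.
std::vector<double> fbank_to_cepstra(const std::vector<double> &log_fbank,
                                     int n_ceps, double lifter,
                                     bool include_c0)
{
    std::vector<double> ceps;
    if (log_fbank.empty())
        return ceps;
    const double pi = std::acos(-1.0);
    const double n = static_cast<double>(log_fbank.size());
    for (int k = include_c0 ? 0 : 1; k <= n_ceps; ++k) {
        // DCT-II over the filter bank channels.
        double sum = 0.0;
        for (std::size_t j = 0; j < log_fbank.size(); ++j)
            sum += log_fbank[j] * std::cos(pi * k * (j + 0.5) / n);
        double c = std::sqrt(2.0 / n) * sum;
        // Sinusoidal lifter: re-weights the coefficients by index
        // (the weight is exactly 1 for k = 0).
        if (lifter > 0.0)
            c *= 1.0 + 0.5 * lifter * std::sin(pi * k / lifter);
        ceps.push_back(c);
    }
    return ceps;
}
```

For the typical setup mentioned above, `log_fbank` would hold 20 values and `n_ceps` would be 12, giving 12 coefficients per frame, or 13 when the zeroth coefficient is included.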