3. The SPro tools

3.1 File formats    Waveform and feature file formats

3.2 Common options    Tools common options

3.3 I/O via stdin and stdout    Standard input, standard output and pipes

3.4 Extracting features    Feature extraction with SPro

3.5 Manipulating feature streams    The scopy utility for manipulating feature streams

3.1 File formats

3.1.1 Waveform streams    Supported input waveform file formats

3.1.2 Feature streams    Output feature file format

This section describes the file formats manipulated by SPro. Most SPro tools input signal from a waveform stream and output feature vectors to a feature stream.

3.1.1 Waveform streams

Waveform streams are files which contain the signal samples, either in raw PCM format or in an encoded format to save disk space. Currently, SPro supports raw, mono, 16 bits/sample files as well as WAVE and, optionally, SPHERE(4) files. The SPHERE format is only supported if SPro has been compiled with the SPHERE library (`--with-sphere' in configure). Raw format (i.e. with no header) with an 8 kHz sample rate is the default assumed by SPro unless otherwise specified.

Waveforms are considered as streams by SPro and are read via an input buffer, which means they can be of arbitrary (even infinite) length. Even file formats for which the number of samples is known in advance from the header are not loaded entirely into memory. In particular, this mechanism makes it possible to read waveforms from the standard input even though the number of samples is not known in advance. One particularly interesting consequence is the possibility of piping the output of an external command into the input of an SPro command. For example, a pipe can be used to handle file formats which are not supported by SPro. The following line

madplay --left --output=raw:- foo.mp3 | sfbcep -f 11025 - foo.mfcc
shows how to decode the left channel of an MP3 encoded file (`foo.mp3') into a raw, mono, 16 bits/sample file which is then piped into the sfbcep tool, assuming the sample rate of the MP3 file is 11,025 Hz.

3.1.2 Feature streams

A feature stream is a file containing feature vectors. The format used to store the feature vectors is specific to SPro and consists of a header followed by data. The header itself is divided into two parts, an optional variable length header and a compulsory fixed length header.

To avoid byte-order problems, binary parts of the feature streams, such as the fixed length header and the feature vectors, are always stored in little-endian format (Intel-like processor) and therefore must be swapped if read on a big-endian (Motorola-like processor) machine. Byte swapping is automatically taken care of when using the library functions to read SPro streams. See section 4. The SPro library, for details on SPro stream I/O functions.

The variable length header is an optional ASCII header containing `attribute = value' statements, starting with a `<header>' tag and ending with `</header>'. The following is a sample variable length header:

<header>
a_field = an arbitrary value; # a comment
date = Wed Jul 23 14:59:12 CEST 2003; # this is the date
snr = 20 dB; # SNR
</header>
Both the `attribute' and `value' strings are arbitrary. Note that as of now, none of the SPro tools output variable length headers. However, such headers are supported and can be added using the cat or bcat command. For example, the command

bcat header.txt foo.mfcc > bar.mfcc
could be used to add the variable length header contained in file `header.txt' to the output of an SPro command, `foo.mfcc', the resulting file being `bar.mfcc'. The header file `header.txt' is a regular text file containing text such as given in the example above, where the last line of the file must consist of the `</header>' tag, possibly followed by a carriage return.
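
As a rough illustration of the header syntax, the following Python sketch separates an optional variable length header from the rest of a stream and collects its `attribute = value' statements. The function name `split_variable_header` is hypothetical, and the parsing rules (statements end with `;', `#' starts a comment running to the end of the line) are inferred from the sample header, not taken from the SPro sources.

```python
def split_variable_header(data: bytes):
    """Return (attributes, remaining_bytes) for a feature stream.

    Assumption: statements end with ';' and '#' starts a comment that
    runs to the end of the line, as the sample header suggests.
    """
    start_tag, end_tag = b"<header>", b"</header>"
    if not data.startswith(start_tag):
        return {}, data  # no variable length header present
    end = data.find(end_tag)
    if end < 0:
        raise ValueError("unterminated <header> block")
    body = data[len(start_tag):end].decode("ascii")
    attributes = {}
    for line in body.splitlines():
        line = line.split("#", 1)[0]  # drop the trailing comment
        for statement in line.split(";"):
            if "=" in statement:
                name, value = statement.split("=", 1)
                attributes[name.strip()] = value.strip()
    return attributes, data[end + len(end_tag):]


sample = (b"<header>\n"
          b"a_field = an arbitrary value; # a comment\n"
          b"date = Wed Jul 23 14:59:12 CEST 2003; # this is the date\n"
          b"snr = 20 dB; # SNR\n"
          b"</header>\n")
attrs, rest = split_variable_header(sample + b"BINARY")
print(attrs)
```

The sketch deliberately leaves any carriage return following `</header>' in the remaining bytes; whether it belongs to the header is not settled here.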

The compulsory fixed length header is a 10 byte binary header containing the feature vector dimension(5) (unsigned short = 2 bytes), a flag describing the content of the feature vector (long = 4 bytes) and the frame rate in Hz (float = 4 bytes). The feature stream description flag is actually a field of bits with the following meaning

bit  letter  description

1    `E'     feature vector contains log-energy
2    `Z'     mean has been removed
3    `N'     static log-energy has been suppressed (always with `E' and `D')
4    `D'     feature vector contains delta coefficients
5    `A'     feature vector contains delta-delta coefficients (always with `D')
6    `R'     variance has been normalized (always with `Z')
The letter in the second column corresponds to the letter used in all the SPro tools to modify or visualize the feature description flags.
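
The fixed length header layout can be sketched in Python with the standard `struct` module. This is only an illustration of the 2 + 4 + 4 byte layout described above, not the SPro library code. The helper names `parse_fixed_header` and `flag_to_letters` are hypothetical, and the sketch assumes that bit 1 is the least significant bit and that the six letters occupy six consecutive bits (E, Z, N, D, A, R).

```python
import struct

# Assumption: bit 1 is the least significant bit, letters in bit order.
FLAG_LETTERS = "EZNDAR"


def parse_fixed_header(raw: bytes):
    """Decode the 10-byte fixed header: dimension, flag, frame rate."""
    if len(raw) != 10:
        raise ValueError("fixed header must be exactly 10 bytes")
    # '<' selects little-endian without padding: H = 2, l = 4, f = 4 bytes.
    dim, flag, frame_rate = struct.unpack("<Hlf", raw)
    return dim, flag, frame_rate


def flag_to_letters(flag: int) -> str:
    """Return the letters whose bits are set, in bit order."""
    return "".join(letter for i, letter in enumerate(FLAG_LETTERS)
                   if flag & (1 << i))


# Build a header for 13-dimensional vectors with E and D set, at 100 Hz.
raw = struct.pack("<Hlf", 13, 0b001001, 100.0)
dim, flag, frame_rate = parse_fixed_header(raw)
print(dim, flag_to_letters(flag), frame_rate)
```

Using an explicit `<' prefix makes the decoding independent of the host byte order, which mirrors the byte swapping the library functions are said to perform.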

Feature vectors, or data, are stored after the header in chronological order. A feature vector is a binary vector of floats, as illustrated in the following example

+-----------------+---+-----------------+----+-----------------+---+
|     static      | E |      delta      | dE |   delta delta   |ddE|
+-----------------+---+-----------------+----+-----------------+---+
with the static coefficients first, optionally followed by the log-energy, the delta and the delta-delta features, as indicated by the feature description flag.
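
To make the layout concrete, here is a minimal Python sketch that lists the blocks of a feature vector and its total dimension from the number of static coefficients and the flag letters. The function names are hypothetical, and the sketch assumes that an energy derivative (dE, ddE) is present whenever `E' is set together with `D' or `A', as the diagram suggests.

```python
def vector_layout(n_static: int, letters: str):
    """Return [(block_name, length), ...] in storage order."""
    has_energy = "E" in letters
    blocks = [("static", n_static)]
    if has_energy and "N" not in letters:
        blocks.append(("E", 1))            # static log-energy, unless suppressed
    if "D" in letters:
        blocks.append(("delta", n_static))
        if has_energy:
            blocks.append(("dE", 1))
    if "A" in letters:
        blocks.append(("delta delta", n_static))
        if has_energy:
            blocks.append(("ddE", 1))
    return blocks


def vector_dimension(n_static: int, letters: str) -> int:
    """Total number of floats in one feature vector."""
    return sum(length for _, length in vector_layout(n_static, letters))


print(vector_layout(12, "EDA"), vector_dimension(12, "EDA"))
```

With 12 static coefficients, `EDA' gives 39 components under these assumptions, while `EDN' gives 25 because the static log-energy is dropped.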

3.2 Common options

Here is a list of options common to all (or most of) the tools. The options of the scopy feature manipulation tool differ slightly from this list, since most of the options below concern waveform processing.

3.2.1 I/O options    Common I/O options

3.2.2 Waveform framing options    Common frame blocking options

3.2.3 Feature vector options    Common feature vector extraction options

3.2.4 Miscellaneous options    More common options

3.2.1 I/O options

The following options are used to control the waveform and feature I/Os:

-F, --format=str
: Specify the input waveform file format. The format string str is one of `PCM16', `wave' or `sphere', the latter being possible only if SPro was compiled with the SPHERE library. Argument is case insensitive. Default value is `PCM16'.
-f, --sample-rate=f
: Set input waveform sample rate to f Hz for `PCM16' waveform files. This option is ignored for waveform file formats for which the sample rate is specified in the header. Default value is 8,000 Hz.
-x, --channel=n
: For multiple channel waveform files, set the channel to consider for feature extraction. Default value is 1.
-B, --swap
: Swap the byte order of the input waveform samples. This is particularly useful for waveform files generated on a machine with a different byte order. Default is not to swap.
-I, --input-bufsize=n
: Set the input buffer size to n kbytes. The smaller the input buffer, the more disk accesses are needed and therefore the slower the program is, so you will have to choose between speed and memory! Default is 10 Mbytes.
-O, --output-bufsize=n
: Set the output buffer size to n kbytes. Again, you need a compromise between speed and memory requirements. However, one important point is that global processing such as mean subtraction, energy normalization and delta computation is done on a per-buffer basis (i.e. such processing is done only when the buffer is full or at the end of the stream, whichever comes first), which introduces some inconsistencies at the buffer boundaries(6). Using a small output buffer can therefore result in many boundary problems, and it is recommended not to reduce the output buffer size below a couple of thousand frames. Default is 10 Mbytes.

3.2.2 Waveform framing options

Waveform framing is driven by the following options:

-k, --pre-emphasis=f
: Set the pre-emphasis coefficient to f. Default is 0.95.
-l, --length=f
: Set the analysis frame length to f ms. Default is 20.0 ms.
-d, --shift=f
: Set the interval between two consecutive frames to f ms. Default is 10.0 ms.
-w, --window=str
: Specify the waveform weighting window. The window is one of `Hamming', `Hanning', `Blackman' or `none'. If the argument is `none', no window is applied. Argument is case insensitive. Default is `Hamming'.
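
The relation between the framing options and sample counts can be sketched as follows. The function name is hypothetical, and rounding to the nearest sample is an assumption; how SPro itself converts milliseconds to samples is not stated here.

```python
def framing_in_samples(sample_rate_hz=8000.0, length_ms=20.0, shift_ms=10.0):
    """Convert frame length and shift from ms to samples.

    Also returns the resulting frame rate in Hz (frames per second).
    Assumption: durations are rounded to the nearest sample.
    """
    frame_length = round(sample_rate_hz * length_ms / 1000.0)
    frame_shift = round(sample_rate_hz * shift_ms / 1000.0)
    frame_rate_hz = 1000.0 / shift_ms
    return frame_length, frame_shift, frame_rate_hz


print(framing_in_samples())
```

With the default values (8 kHz, 20 ms frames, 10 ms shift), this gives 160-sample frames taken every 80 samples, i.e. 100 frames per second.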

3.2.3 Feature vector options

The following options are used to control the content of the output feature vectors, enabling global normalizations and dynamic feature computation:

-Z, --cms
: Perform mean normalization.
-R, --normalize
: Perform variance normalization. Variance normalization is only possible if `--cms' is also specified. Otherwise, an error is generated.
-L, --segment-length=n
: Set normalization and energy scaling segment length. If this option is specified, mean, variance or max calculation is performed using a sliding window of `n' frames. Default is to calculate mean, variance or max globally when flushing the output buffer. This argument is ignored if neither `--cms' nor `--normalize' are specified.
-D, --delta
: Add first order derivatives to the feature vector.
-A, --acceleration
: Add second order derivatives to the feature vector. This is only possible if `--delta' is also specified. Otherwise, an error is generated.
-N, --no-static-energy
: Remove static log-energy from the feature vector. This is only possible if `--delta' is also specified. Otherwise, an error is generated.
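
As an illustration of what the global `--cms' and `--normalize' normalizations amount to, here is a minimal Python sketch operating on a list of frames. It is not the SPro implementation: the function name is hypothetical, only the global case (no `--segment-length') is covered, and dividing by the number of frames when computing the variance is an assumption.

```python
import math


def normalize(frames, variance=False):
    """Remove the per-coefficient mean; optionally scale to unit variance."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[i] for frame in frames) / n for i in range(dim)]
    out = [[frame[i] - means[i] for i in range(dim)] for frame in frames]
    if variance:
        for i in range(dim):
            # Assumption: population variance (divide by n, not n - 1).
            std = math.sqrt(sum(row[i] ** 2 for row in out) / n)
            if std > 0.0:
                for row in out:
                    row[i] /= std
    return out


frames = [[1.0, 10.0], [3.0, 30.0]]
print(normalize(frames))
print(normalize(frames, variance=True))
```

Variance scaling is applied to the mean-removed values, which reflects the requirement that `--normalize' be used together with `--cms'.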

3.2.4 Miscellaneous options

Last but not least, here are some very practical options (especially the second one):

-v, --verbose
: Turn on verbose mode.
-h, --help
: Print a help message for the tool and exit.
-V, --version
: Print version information and exit.

3.3 I/O via stdin and stdout

Every SPro command requires that input and output files be explicitly specified. However, in keeping with the Unix philosophy, the special symbol `-' (dash) can be used as the input file to specify that input is to be read from stdin, or as the output file to specify that output should be directed to stdout.

The use of standard input and output makes it possible to pipe the SPro commands one after the other or even with external programs. The example

sfbcep foo.lin - | scopy -o ascii - -
illustrates the use of pipes to list the feature vectors directly from the waveform file `foo.lin'. Another particularly useful example of pipes with SPro commands is given in 3.1.1 Waveform streams.

3.4 Extracting features

3.4.1 Filter-bank analysis tools    Tools for filter-bank derived features

3.4.2 LPC analysis tools    Tools for linear prediction derived features

3.4.1 Filter-bank analysis tools

The tools sfbank and sfbcep are dedicated to filter-bank based speech analysis.

Filter-bank log-magnitude features    All about sfbank

Filter-bank cepstral features    All about sfbcep

Options    sfbank and sfbcep options

Filter-bank log-magnitude features

The first filter-bank analysis tool, sfbank, takes as input a waveform and outputs filter-bank log-magnitude features. For each frame, the FFT is performed on the windowed signal, possibly after zero padding, and the magnitude is computed before being integrated using a triangular filter-bank. See section 2.3 Filter-bank analysis, for mathematical details. To avoid numerical problems, a threshold is used to keep channel log-magnitudes non-negative. The signal bandwidth may be artificially limited by specifying lower and upper frequencies using the `--freq-min' and `--freq-max' options respectively. In this case, the central frequencies of the filter-bank channels are evenly spaced within the specified band. Even if frequency warping is used, the lower and upper frequencies are specified in the linear frequency domain, though, of course, the central frequencies of the filters are evenly spaced in the transformed domain. Both MEL and bilinear frequency warping are possible with sfbank.

First and second order derivatives can be appended to the filter-bank log-magnitude features using `--delta' and `--acceleration' respectively.

Filter-bank cepstral features

The second filter-bank analysis tool, sfbcep, takes as input a waveform and outputs filter-bank derived cepstral features. The filter-bank processing is similar to what is done in sfbank (see previous section). The cepstral coefficients are computed by applying a DCT to the filter-bank log-magnitudes and are optionally liftered.

Optionally, the log-energy can be added to the feature vector. In sfbcep, the frame energy is calculated as the sum of the squared waveform samples after windowing. As for the magnitudes in the filter-bank, the log-energies are thresholded to keep them non-negative. The log-energies may be scaled to avoid differences between recordings.

Mean and variance normalization of the static cepstral coefficients can be specified with the global `--cms' and `--normalize' options but do not apply to log-energies. The normalizations can be global (default) or based on a sliding window whose length is specified with `--segment-length'.

Finally, first and second order derivatives of the cepstral coefficients and of the log-energies can be appended to the feature vectors. When using delta features, the absolute log-energy can be suppressed using the `--no-static-energy' option.

Options

The following options are available for both sfbank and sfbcep.

-n, --num-filters=n
: Specify the number of channels in the filter bank. Default is 24.
-a, --alpha=f
: Use bilinear frequency warping and set the warping parameter a to f (f must be between 0 and 1). This option is incompatible with `--mel' and will be overridden by the latter. Default is no warping.
-m, --mel
: Use MEL frequency warping. This option overrides `--alpha', as the two are incompatible. Default is no warping.
-i, --freq-min=f
: Specify band limiting and set the lower frequency bound to f Hz. Default is no band limiting.
-u, --freq-max=f
: Specify band limiting and set the upper frequency bound to f Hz. Default is no band limiting.
-b, --fft-length=n
: Set FFT length to n samples. The FFT length must be a power of two and greater than or equal to the number of samples in a frame. If FFT length is greater, the windowed frame samples are padded with zeroes before running the Fourier transform.
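
As a small illustration of the FFT length constraint, the following Python sketch finds the smallest valid FFT length for a given frame size and zero-pads a frame accordingly. Both function names are hypothetical and the code is independent of the SPro sources.

```python
def min_fft_length(frame_samples: int) -> int:
    """Smallest power of two that is >= the number of samples in a frame."""
    n = 1
    while n < frame_samples:
        n *= 2
    return n


def zero_pad(frame, fft_length: int):
    """Pad a windowed frame with zeroes up to fft_length samples."""
    is_power_of_two = fft_length > 0 and fft_length & (fft_length - 1) == 0
    if not is_power_of_two or fft_length < len(frame):
        raise ValueError("FFT length must be a power of two >= frame length")
    return list(frame) + [0.0] * (fft_length - len(frame))


frame = [1.0] * 160                     # 20 ms at 8 kHz
padded = zero_pad(frame, min_fft_length(len(frame)))
print(min_fft_length(len(frame)), len(padded))
```

For the default 160-sample frame, the smallest admissible FFT length is 256, so 96 zeroes are appended.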

The following options are also available for sfbcep.

-p, --num-ceps=n
: Set the number of output cepstral coefficients to n. n must be less than or equal to the number of channels in the filter bank. Default is 12.
-r, --lifter=n
: Set liftering parameter L to n. Default is no liftering.
-e, --energy
: Add log-energy to the feature vector.
-s, --scale-energy=f
: Scale energy according to e_t = 1 + f (e_t - max_t(e_t)). The way the maximum energy value is computed depends on whether `--segment-length' is specified or not.

sfbank supports the `--delta' and `--acceleration' options. In addition, sfbcep also supports the `--cms' and `--normalize' options. See section 3.2 Common options, for a description of these options and for additional ones.
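
The energy scaling formula given for `--scale-energy' can be written out as a short Python sketch. The function name is hypothetical and only the global maximum is considered; the sliding-window case selected by `--segment-length' is not covered.

```python
def scale_energy(log_energies, f):
    """Apply e_t = 1 + f * (e_t - max_t(e_t)) to a sequence of log-energies."""
    peak = max(log_energies)
    return [1.0 + f * (e - peak) for e in log_energies]


scaled = scale_energy([2.0, 5.0, 3.0], 0.1)
print(scaled)
```

After scaling, the frame with the highest log-energy maps to 1 and all other frames fall below it, which is what removes level differences between recordings.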

3.4.2 LPC analysis tools

SPro provides two different tools, slpc and slpcep, for linear predictive analysis of speech signals.

Linear prediction coefficients    All about slpc

Linear prediction cepstrum    All about slpcep

Options    slpc and slpcep options

Linear prediction coefficients

The tool slpc takes as input a waveform and outputs linear prediction derived features. For each frame, the signal is windowed after pre-emphasis and the generalized correlation is computed and then used to estimate the reflection and prediction coefficients which can, in turn, be transformed into log area ratios or line spectrum frequencies. See section 4.7.1 Linear prediction, for mathematical details. The default is to output the linear prediction coefficients; however, reflection coefficients can be obtained with the `--parcor' option, log area ratios with the `--lar' option and line spectrum pairs with the `--lsp' option.

Optionally, the log-energy can be added to the feature vector. In slpc, the log-energy is taken as the linear prediction filter gain, which is also the variance of the prediction error, and thresholded to be non-negative. The log-energies may be scaled to avoid differences between recordings using the `--scale-energy' option.

Linear prediction cepstrum

The tool slpcep takes as input a waveform and outputs cepstral coefficients derived from the linear prediction filter coefficients. The linear prediction processing steps are as in slpc (see previous section) and cepstral coefficients are computed from the linear prediction coefficients using the recursion previously described. The required number of cepstral coefficients must be less than or equal to the prediction order.

As for slpc, the log-energy, taken as the gain of the linear prediction filter, can be added to the feature vectors.

Options

The following options are available for both slpc and slpcep.

-n, --order=n
: Specify the linear prediction analysis order. Default is 24.
-a, --alpha=f
: Use bilinear frequency warping and set the warping parameter a to f (f must be between 0 and 1). Default is no warping.
-r, --parcor
: Output reflection coefficients rather than linear prediction coefficients.
-g, --lar
: Output log area ratios rather than linear prediction coefficients.
-p, --lsp
: Output line spectrum pairs rather than linear prediction coefficients.
-e, --energy
: Add log-energy to the feature vector.
-s, --scale-energy=f
: Scale energy according to e_t = 1 + f (e_t - max_t(e_t)). The way the maximum energy value is computed depends on whether `--segment-length' is specified or not.

The following options are also available for slpcep.

-p, --num-ceps=n
: Set the number of output cepstral coefficients to n. n must be less than or equal to the prediction order. Default is 12.
-r, --lifter=n
: Set liftering parameter L to n. Default is no liftering.

Also, slpcep supports the `--cms' and `--normalize' normalization options as well as `--delta' and `--acceleration'. See section 3.2 Common options, for a description of these options and for additional ones.

3.5 Manipulating feature streams

SPro provides a tool, scopy, for manipulating feature streams. More than a mere copy tool, scopy can also normalize features, add dynamic features, scale the features, apply a linear transformation to the feature vectors and extract some components of the feature vector. All of these operations are detailed below. In addition, scopy can import feature files from previous SPro releases, export files to alien formats such as HTK, or view the content of an SPro feature file in text format.

3.5.1 Operations on feature streams    Manipulating feature streams with scopy

3.5.2 Exporting features    Exporting features to alien formats with scopy

3.5.3 Importing from a previous SPro release    Compatibility questions

3.5.4 Copy options    scopy options

3.5.1 Operations on feature streams

As mentioned in the introduction, scopy may be used for

mean and variance normalization,
dynamic features computation,
multiplicative scaling,
linear transformation, and
components extraction.

The first two transformations, i.e. normalization and dynamic feature computation, are actually done at once when loading the input features. If normalization is specified, the static coefficients, not including energy, are normalized before delta and acceleration features are computed. If dynamic features are used, the static log-energy can be discarded using `--no-static-energy'. As in all the feature extraction tools, normalization is either global or based on a sliding window, depending on whether `--segment-length' was specified or not.

Multiplicative scaling is a simple operation which consists in multiplying every component of every feature vector by a scaling factor. This is sometimes used to reduce the variance of features with a high dynamic range in order to avoid numerical problems when computing a linear transformation for those features or when doing some modeling.

A linear transformation matrix can be specified using `--transform' to project the input feature vectors according to y'(t) = A z(t), where y'(t) is the transformed vector for frame t and z(t) is a column vector containing the input feature frame y(t) plus possibly some context frames(7). For example, assuming a context size k, z(t) will be the concatenation of input feature vectors y(t-k) to y(t+k). If m is the input feature dimension, possibly after adding the dynamic features if requested, and n the output dimension, the transformation matrix will have nrows = n rows and ncols = (2 k + 1) * m columns. The matrix A is stored in a text file with the following syntax

nrows ncols nsplice
A[1][1] A[1][2] ......... A[1][ncols]
.........
A[nrows][1] ......... A[nrows][ncols]
where nsplice is the context size.
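
The matrix file syntax and the projection y'(t) = A z(t) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the file is assumed to be whitespace-separated with rows stored one after the other, and repeating the first and last frames to fill the context at the stream boundaries is an assumption, not documented SPro behaviour.

```python
def parse_transform(text: str):
    """Read 'nrows ncols nsplice' followed by nrows * ncols matrix values."""
    tokens = text.split()
    nrows, ncols, nsplice = (int(tok) for tok in tokens[:3])
    values = [float(tok) for tok in tokens[3:]]
    if len(values) != nrows * ncols:
        raise ValueError("matrix size does not match nrows * ncols")
    matrix = [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return matrix, nsplice


def apply_transform(matrix, nsplice, frames):
    """Compute y'(t) = A z(t), z(t) being frames t-nsplice .. t+nsplice."""
    last = len(frames) - 1
    output = []
    for t in range(len(frames)):
        z = []
        for k in range(t - nsplice, t + nsplice + 1):
            # Assumption: edge frames are repeated at the boundaries.
            z.extend(frames[min(max(k, 0), last)])
        if len(z) != len(matrix[0]):
            raise ValueError("ncols must equal (2 * nsplice + 1) * dimension")
        output.append([sum(a * x for a, x in zip(row, z)) for row in matrix])
    return output


matrix, nsplice = parse_transform("1 3 1\n1.0 1.0 1.0\n")
print(apply_transform(matrix, nsplice, [[1.0], [2.0], [4.0]]))
```

In this toy case the 1 x 3 matrix sums each one-dimensional frame with its two neighbours, so the output dimension is n = 1 for an input dimension m = 1 and a context size k = 1.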

Component extraction keeps only some components of the feature vectors. The extraction pattern is specified using the `--extract=str' option where str is a comma separated list of components to keep. The latter are specified either as a single component index or as an index range using a dash (`-'). Component indices start at 1. For example, the command

scopy --extract=1-12,25-36 foo.prm bar.prm
could be used to extract components 1 to 12 and 25 to 36 from `foo.prm' into `bar.prm', which, one can imagine, would correspond to keeping the 12 static features and the 12 acceleration features, thus discarding the delta features.
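
The extraction pattern syntax can be illustrated with a short Python sketch that turns a pattern into zero-based indices and applies it to one vector. The function names are hypothetical and the error handling is an assumption; scopy may validate patterns differently.

```python
def parse_extract(pattern: str):
    """Convert e.g. '1-3,7' into zero-based indices [0, 1, 2, 6]."""
    indices = []
    for item in pattern.split(","):
        first, _, last = item.partition("-")
        start = int(first)
        stop = int(last) if last else start
        if start < 1 or stop < start:
            raise ValueError("invalid component range: %r" % item)
        indices.extend(range(start - 1, stop))  # pattern indices start at 1
    return indices


def extract(vector, pattern: str):
    """Keep only the components of vector selected by pattern."""
    return [vector[i] for i in parse_extract(pattern)]


vector = list(range(1, 37))             # 36 components numbered 1..36
kept = extract(vector, "1-12,25-36")
print(len(kept), kept[:3], kept[12:15])
```

For a 36-dimensional vector, the pattern `1-12,25-36' keeps 24 components and skips components 13 to 24.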

When performing either linear transformation or component extraction, the content of the resulting feature vector can no longer be described using a feature description flag. Indeed, specifying whether a vector has delta features after a linear transformation makes no sense. For this reason, the output stream description flag is arbitrarily set to zero if at least one of these operations is specified.

If several operations are specified, they are applied in the order in which they are listed above. Therefore, delta coefficients are computed before the linear transformation if both are specified. For now, there is unfortunately no direct and easy way to change the order of these operations. In particular, it is not possible to add delta coefficients after a linear transformation, although this is a perfectly reasonable thing to do. The easiest, though CPU consuming, way to change the processing order is to run scopy several times, possibly with pipes. For example, the line

scopy --transform=pca.mat foo.prm - | scopy -ZD - bar.prm
will apply the linear transformation stored in file `pca.mat' to the feature vectors in `foo.prm' (first scopy) and then remove the mean of the static features before adding the delta features and store the result in `bar.prm' (second scopy).

3.5.2 Exporting features

Exporting feature streams to alien formats is also possible with scopy. Currently, three alien formats are supported, namely HTK(8), Sirocco(9) and ASCII text format.

Export to HTK and Sirocco file formats is only possible on seekable streams, i.e. regular files on which the C function fseek works. The reason for this constraint is that those formats include the number of frames in the header. Since the number of frames is not in the SPro header, scopy uses fseek to seek to the end of the input feature stream in order to determine the number of frames. As a consequence, it is not possible to export to one of these alien formats when reading from a pipe. On the other hand, no seek in the output file is necessary, so the output of scopy can be piped into another command. This is particularly useful with HTK, where setting the environment variable HPARMFILTER to `scopy -o HTK $ -' makes it possible to read SPro files directly with HTK. See section "Input/Output via Pipes and Networks" in the HTK 3.2 book for details.
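
The reason a seekable input is needed can be illustrated with the arithmetic involved: once the total size of the stream is known, the number of frames follows from the header size and the vector dimension. The Python sketch below is not scopy's code; the function name is hypothetical and it assumes 4-byte floats and the 10-byte fixed header described in 3.1.2 Feature streams.

```python
FIXED_HEADER_BYTES = 10   # dimension (2) + flag (4) + frame rate (4)
BYTES_PER_FLOAT = 4       # assumption: feature values are 4-byte floats


def count_frames(total_bytes: int, dim: int, variable_header_bytes: int = 0):
    """Number of feature vectors in a stream of total_bytes bytes."""
    payload = total_bytes - variable_header_bytes - FIXED_HEADER_BYTES
    frame_bytes = dim * BYTES_PER_FLOAT
    if payload < 0 or payload % frame_bytes != 0:
        raise ValueError("stream size is inconsistent with the dimension")
    return payload // frame_bytes


# A stream holding 300 vectors of dimension 13 and no variable header.
print(count_frames(10 + 300 * 13 * 4, 13))
```

For a regular file the total size is available up front (for instance via `os.path.getsize`), whereas for a pipe it is only known once the whole stream has been read.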

Export to ASCII is useful to list, in an (almost) human-readable way, the content of a feature stream. It is particularly useful in combination with the `--info' option, which gives information about the content of the stream. This option is also useful to visualize the different operations performed on the input feature stream and their order. For example, the command

scopy -i -ZDA -t xxx.mat -x 1-3,7 -z foo.prm -
will produce the following output

sample_rate = 100.000000
input: dim=12 (<nil>)
convert: dim=36 (ZDA)
transform: dim=10 (xxx.mat)
extract: dim=4 (1-3,7)
In the above example, the input feature dimension of 12 is first increased to 36 by adding the dynamic coefficients (`-ZDA'), then reduced to 10 using the linear transform in `xxx.mat', and finally reduced to 4 by extracting components 1 to 3 and 7 of the resulting feature vectors.

As mentioned in 3.1 File formats, SPro feature files are always in little-endian byte order. In contrast, exported files are written in the machine's natural byte order. As both HTK and Sirocco expect files in big-endian byte order(10), the option `--swap' can be used to swap the byte order before writing the file in alien file formats. This option is ignored if the output file format is ASCII (obviously) or SPro.
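
What byte swapping means for 4-byte values can be shown with a few lines of Python. This is a generic sketch, not the code behind `--swap'; the function name is hypothetical.

```python
import struct


def swap_4byte_words(data: bytes) -> bytes:
    """Reverse the byte order of every 4-byte value in data."""
    if len(data) % 4 != 0:
        raise ValueError("data length must be a multiple of 4")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


little = struct.pack("<2f", 1.0, -2.5)   # little-endian floats
big = swap_4byte_words(little)           # same values, big-endian layout
print(struct.unpack(">2f", big))
```

Swapping twice restores the original bytes, so the same routine converts in either direction.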

3.5.3 Importing from a previous SPro release

The option `--compatibility' is provided for backward compatibility and makes it possible to read feature files from previous versions of SPro. When this option is used, the entire feature file is loaded into memory at once, as was the case in previous versions. Using this option with large files may therefore be quite memory consuming (and slow as well). All the processing capabilities (normalization, dynamic features, linear transform, etc.) remain available when importing files from previous SPro versions.

3.5.4 Copy options

The following options are available in scopy:

-c, --compatibility
: Turn on compatibility and set the input file format to former SPro format. Default is SPro 4.0 format.
-I, --bufsize=n
: Set the I/O buffer size in kbytes. Default is 10 Mbytes. If `--compatibility' is specified, the specified buffer size applies only to the output buffer, the entire input data being loaded into memory.
-i, --info
: Print stream information.
-z, --suppress
: Suppress data output. If this option is turned on, no output is created. This option is provided mainly for use with `--info' in order to print the stream description flag or for diagnosis purposes.
-B, --swap
: Swap byte order before writing new file. Byte swapping is only possible if the output format is either HTK or Sirocco (see `--output-format' below). Default is to use the machine's natural byte-order.
-o, --output-format=str
: Set the output format, where str is one of ascii, htk or sirocco. Default is the native SPro format.
-m, --scale=f
: Scale features, multiplying them by the scaling factor f.
-t, --transform=str
: Apply the linear transformation whose matrix is specified in file str.
-x, --extract=str
: Extract the specified components of the feature vector. The argument str is a comma separated list of components to extract, where the components are specified either as a single index or a range of indices specified using a dash (`-'). The index of the first component is 1.
-s, --start=n
: Start copying frames at frame index n. Frame numbers start with zero. Default is 0.
-e, --end=n
: End copying at frame index n (included). Frame numbers start with zero. Default is to copy to the end of stream.

This document was generated by Guillaume Gravier on March, 5 2004 using texi2html
