Mircea Giurgiu * Results on Automatic Speech Recognition in Romanian
2.1. Time-domain analysis
Fig. 1. - Silence/speech/silence (word cinci (five)).
Fig. 2. - Endpointed speech.
2.2. Short-time spectrum analysis
Short-time spectrum analysis is a central step in speech
processing. It is carried out here using both FFT and LPC
analyses [3], applied to frames of 256 speech samples weighted
by a Hamming window; the predictor coefficients are computed
with the Levinson-Durbin recursive algorithm. From each segment
of speech a 12-component vector is obtained by averaging the LPC
spectrum (Figs. 3 and 4). These vectors are then vector-quantized
in order to compress the speech data and to reduce the computation
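time. The predictor can be written, in the standard linear-prediction form, as
$$\hat{s}(n) = \sum_{k=1}^{p} a_k \, s(n-k)$$
and the transfer function of the vocal tract as
$$H(z) = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}},$$
where $a_k$ ($k = 1, 2, \ldots, p$) are the predictor coefficients, $G$ is the gain, and
$$\left| H(e^{j\omega}) \right|^2 = \frac{G^2}{\left| 1 - \sum_{k=1}^{p} a_k e^{-j\omega k} \right|^2}$$
is the LPC spectrum.
Fig. 3. - A speech frame.
Fig. 4. - LPC and FFT spectra.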
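The analysis chain described in this section (Hamming windowing of a 256-sample frame, autocorrelation, Levinson-Durbin recursion, evaluation of the LPC spectrum) can be sketched in plain Python as follows. This is only a minimal illustration, not the implementation behind the reported results. The function names, the predictor order p = 12, the gain taken as the square root of the residual energy, and the sign convention s_hat(n) = sum_k a_k s(n-k) are all assumptions made for the example.

```python
import cmath
import math


def hamming(n_samples):
    """Return the Hamming window coefficients for a frame of n_samples."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of a frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations recursively.

    Assumed convention: s_hat(n) = sum_k a_k * s(n - k).
    Returns (a, err), where a[k - 1] holds a_k and err is the residual energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (for example pure silence)
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient of stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc_spectrum(a, gain, n_points):
    """Evaluate |H(e^{jw})| at n_points frequencies in [0, pi)."""
    spectrum = []
    for m in range(n_points):
        w = math.pi * m / n_points
        denom = 1.0 - sum(a_k * cmath.exp(-1j * w * (k + 1))
                          for k, a_k in enumerate(a))
        spectrum.append(gain / abs(denom))
    return spectrum


def analyse_frame(frame, order=12):
    """Window one frame and return (predictor coefficients, gain).

    order=12 is an assumption; the text does not state the value of p.
    """
    windowed = [s * w for s, w in zip(frame, hamming(len(frame)))]
    r = autocorrelation(windowed, order)
    a, err = levinson_durbin(r, order)
    gain = math.sqrt(max(err, 0.0))
    return a, gain


if __name__ == "__main__":
    # Synthetic 256-sample frame: two sinusoids, used only as a stand-in
    # for a real speech frame.
    frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n + 0.4)
             for n in range(256)]
    coeffs, gain = analyse_frame(frame)
    magnitudes = lpc_spectrum(coeffs, gain, 128)
    print(len(coeffs), len(magnitudes))
```

The 12-component vector mentioned in the text would then be derived from `magnitudes` by some averaging scheme. That step is left out here because the text does not specify how the averaging is performed.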