Maria-Mirela Petrea, Dan Cristea * Dealing with Prosody. A Computer-Assisted Language Learning Approach
The pitch movements were classified into five ranges of slope values: steep rise, sweet rise, level, steep fall and sweet fall. For instance, figure 6 shows some parameters of the utterance "Can you manage?" pronounced by a female speaker: the intonational contour falls sweetly on the first word, stays level on "you", rises sharply and then falls on the first syllable of "manage", and ends with a sweet final rise (as expected, since the utterance is a yes/no question). With the prosody characterized in this way, it is easy to decide where the main accent of the sentence or word lies. In figure 6 the main accent is on "manage", more specifically on the first syllable of the word.
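The five-way classification can be illustrated with a minimal Python sketch. The threshold values, the Hz-per-second unit and the function names are hypothetical; the text does not state the slope boundaries actually used.

```python
# Hypothetical thresholds in Hz per second; the paper does not give the real values.
LEVEL_MAX = 20.0    # slopes with magnitude up to this value count as "level"
STEEP_MIN = 100.0   # slopes with magnitude from this value up count as "steep"


def segment_slope(t_start: float, f_start: float, t_end: float, f_end: float) -> float:
    """Slope of a pitch segment approximated by a straight line."""
    return (f_end - f_start) / (t_end - t_start)


def classify_slope(slope: float) -> str:
    """Map a slope value to one of the five classes named in the text."""
    magnitude = abs(slope)
    if magnitude <= LEVEL_MAX:
        return "level"
    kind = "steep" if magnitude >= STEEP_MIN else "sweet"
    direction = "rise" if slope > 0 else "fall"
    return f"{kind} {direction}"
```

Under these assumed thresholds, a segment rising from 200 Hz to 250 Hz over 0.5 s has a slope of 100 Hz/s and would be labelled a steep rise.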
3.8. Alignment of MASTER and STUDENT signals
The prosody feature extraction module is applied to the MASTER signal as well as to the STUDENT signal. To permit a comparison between the prosodic melody of the student's utterance and that of the master, the two signals must be aligned (see also [7]). The intention, in fact, is to obtain an alignment of the triad in figure 7.
Fig. 7. - The alignment triad.
PROSODICS offers two approaches to this problem: either a semi-automatic MASTER to TEXT alignment (by means of segmentation followed by an editing step in which the full text is entered) combined with a fully automatic MASTER to STUDENT alignment, or a fully automatic SIGNAL to TEXT alignment, that is, an alignment of the signal to its phonemic transcription. In the first case, the signal-to-signal alignment is performed using pitch traces; in the end, this alignment also yields the STUDENT to TEXT alignment (as suggested in figure 8). In the second case, each signal (MASTER or STUDENT) is aligned to the same phonemic transcription, which directly yields the MASTER to STUDENT alignment (see figure 9).
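The second case can be sketched as follows: once both signals are aligned to the same phonemic transcription, pairing the time intervals phoneme by phoneme gives the signal-to-signal alignment. The data layout (one time interval per phoneme) and the function name are assumptions made for illustration only.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds of one phoneme in a signal


def compose_alignments(
    master: List[Interval], student: List[Interval]
) -> List[Tuple[Interval, Interval]]:
    """Pair MASTER and STUDENT intervals that belong to the same phoneme.

    Both lists are assumed to follow the order of the shared transcription,
    so the k-th entries of each list refer to the same phoneme.
    """
    if len(master) != len(student):
        raise ValueError("both signals must be aligned to the same transcription")
    return list(zip(master, student))
```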
3.8.1. Pitch-to-Pitch alignment
The fundamental alignment procedure is independent of the data processed: in principle it can produce alignment markers and an alignment score for any two vectors of data. The results of any speech processing step could, at least theoretically, be used as input to this phase.
Fig. 8. - Pitch-to-Pitch alignment.
The alignment process is implemented as a branch-and-bound search algorithm in a state space whose nodes are pairs of T0 trace segments belonging to the two signals. More precisely, a node in the state space is uniquely characterized by the pair of MASTER and STUDENT time points up to which the alignment has been done. First, the whole state space is generated, so that all legal combinations of trace segments can be inspected. As a path develops in this space, a new node is appended to the existing path if it obtains the best score among the possible candidates. Several criteria contribute to this score: the resemblance of the silence gaps between two pitch segments, the match of the segments' lengths, the number of inner traces, etc.
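The path-extension step can be illustrated with a simplified sketch. It is a greedy approximation, not the branch-and-bound search itself: at each step it appends the candidate node with the lowest cost and never backtracks. The cost terms mirror the criteria listed above, but their equal weighting, the constants `MAX_GROUP` and `INNER_PENALTY`, and all function names are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) of one pitch-trace segment, in seconds

MAX_GROUP = 2         # hypothetical: at most this many traces merged into one node
INNER_PENALTY = 0.05  # hypothetical cost for each extra trace merged into a node


def span(segs: List[Segment], i: int, n: int) -> float:
    """Duration covered by segments i .. i+n-1, inner gaps included."""
    return segs[i + n - 1][1] - segs[i][0]


def gap_before(segs: List[Segment], i: int) -> float:
    """Silence between segment i-1 and segment i (0 for the first segment)."""
    return segs[i][0] - segs[i - 1][1] if i > 0 else 0.0


def step_cost(master, student, i, j, n, m) -> float:
    """Cost of pairing n MASTER traces from i with m STUDENT traces from j."""
    length_term = abs(span(master, i, n) - span(student, j, m))
    gap_term = abs(gap_before(master, i) - gap_before(student, j))
    inner_term = INNER_PENALTY * ((n - 1) + (m - 1))
    return length_term + gap_term + inner_term


def greedy_align(master: List[Segment], student: List[Segment]):
    """Return (path, total_cost); each path node pairs two time intervals."""
    i = j = 0
    path, total = [], 0.0
    while i < len(master) and j < len(student):
        candidates = [
            (step_cost(master, student, i, j, n, m), n, m)
            for n in range(1, MAX_GROUP + 1)
            for m in range(1, MAX_GROUP + 1)
            if i + n <= len(master) and j + m <= len(student)
        ]
        cost, n, m = min(candidates)  # lowest-cost candidate wins
        path.append(((master[i][0], master[i + n - 1][1]),
                     (student[j][0], student[j + m - 1][1])))
        total += cost
        i, j = i + n, j + m
    return path, total
```

In this sketch, trailing segments left over in one signal when the other is exhausted remain unaligned. A branch-and-bound search would instead keep alternative paths and prune those whose accumulated cost already exceeds the best complete alignment found.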