Romanian Language Technology

Maria-Mirela Petrea, Dan Cristea * Dealing with Prosody. A Computer-Assisted Language Learning Approach

As highlighted above, the pitch contour directly mirrors the contour of the fundamental period. To save computation time, we simply compute the mirror image, about the horizontal axis, of the slope of T₀(t), instead of calculating F₀(t) as 1/T₀(t).
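The reasoning behind this shortcut can be made explicit with a one-line derivation. It is a sketch added for clarity, assuming only that T₀(t) is positive and differentiable over a voiced segment:

```latex
F_0(t) = \frac{1}{T_0(t)}
\quad\Longrightarrow\quad
\frac{dF_0}{dt} = -\frac{1}{T_0(t)^{2}}\,\frac{dT_0}{dt},
\qquad\text{hence}\qquad
\operatorname{sign}\!\left(\frac{dF_0}{dt}\right)
  = -\operatorname{sign}\!\left(\frac{dT_0}{dt}\right).
```

The negated T₀ slope therefore always has the same direction as the F₀ slope. Its magnitude differs by the factor 1/T₀(t)², so any thresholds applied to it are thresholds on the period slope, not on the pitch slope itself.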

The pitch movements were classified according to five domains of slope values: steep rise, sweet rise, level, steep fall and sweet fall. For instance, figure 6 shows some parameters of the utterance "Can you manage?" pronounced by a female speaker: the intonational contour falls sweetly over the first word, follows a level pattern in "you", rises steeply and then falls in the first syllable of "manage", and ends with a sweet final rise (as expected, since the utterance is a yes/no question). With the prosody characterized in this way, it is easy to decide where the main accent of the sentence or word lies. In figure 6 the main accent is in "manage", more specifically in the first syllable of the word.
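The five-way classification can be illustrated with a minimal sketch. The threshold values, the function names and the least-squares slope estimate below are assumptions made for illustration; the paper does not state the actual slope domains or how the slope is estimated.

```python
from typing import Sequence

# Hypothetical thresholds on the mirrored T0 slope (seconds of period per
# second of time). The paper does not give the actual slope domains.
GENTLE = 0.002
STEEP = 0.02


def segment_slope(times: Sequence[float], periods: Sequence[float]) -> float:
    """Least-squares slope of T0 over one voiced segment (an assumed estimator)."""
    n = len(times)
    if n < 2:
        return 0.0
    mean_t = sum(times) / n
    mean_p = sum(periods) / n
    num = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, periods))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den if den else 0.0


def classify_slope(t0_slope: float, gentle: float = GENTLE, steep: float = STEEP) -> str:
    """Map a T0 slope to one of the five pitch-movement classes."""
    pitch_trend = -t0_slope  # mirror: a falling period means a rising pitch
    if pitch_trend >= steep:
        return "steep rise"
    if pitch_trend >= gentle:
        return "sweet rise"
    if pitch_trend > -gentle:
        return "level"
    if pitch_trend > -steep:
        return "sweet fall"
    return "steep fall"
```

A segment whose period shrinks from 5 ms to 4 ms over 0.2 s has a T₀ slope of about -0.005, which this sketch labels a sweet rise under the assumed thresholds.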

3.8. Alignment of MASTER and STUDENT signals

The prosody feature extraction module is applied to the MASTER signal as well as to the STUDENT signal. To permit a comparison between the prosodic melodies of the student's utterance and the master's, the two signals must be aligned (see also [7]). The intention, in fact, is to obtain an alignment of the triad in figure 7.

Fig. 7. - The alignment triad.

PROSODICS offers two approaches to this problem: either a semi-automatic MASTER to TEXT alignment (segmentation followed by entering and editing the full text) combined with a fully automatic MASTER to STUDENT alignment, or a fully automatic SIGNAL to TEXT alignment, that is, an alignment of the signal to its phonemic transcription. In the first case, the signal-to-signal alignment is performed using pitch traces; in the end, this alignment also yields the STUDENT to TEXT alignment (as suggested in figure 8). In the second case, each signal (MASTER or STUDENT) is aligned to the same phonemic transcription, which directly yields the MASTER-STUDENT alignment (see figure 9).
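The second approach rests on a simple composition: once both signals are aligned to the same transcription, matching intervals can be paired position by position. The sketch below illustrates only that composition step; the data layout (a list of phoneme labels with time intervals) and the function name are assumptions, not the representation used in PROSODICS.

```python
from typing import List, Tuple

Interval = Tuple[float, float]           # (start, end) in seconds
TextAlignment = List[Tuple[str, Interval]]  # one entry per phoneme, in order


def align_via_text(
    master: TextAlignment, student: TextAlignment
) -> List[Tuple[str, Interval, Interval]]:
    """Pair MASTER and STUDENT intervals that carry the same phoneme position."""
    if [p for p, _ in master] != [p for p, _ in student]:
        raise ValueError("both signals must be aligned to the same transcription")
    return [(p, m_iv, s_iv) for (p, m_iv), (_, s_iv) in zip(master, student)]


# Hypothetical phoneme boundaries for a two-phoneme fragment.
master_alignment = [("k", (0.00, 0.08)), ("a", (0.08, 0.21))]
student_alignment = [("k", (0.00, 0.11)), ("a", (0.11, 0.30))]
pairs = align_via_text(master_alignment, student_alignment)
```

Each element of `pairs` holds a phoneme label together with its MASTER and STUDENT intervals, which is the information needed to compare the two prosodic contours segment by segment.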

3.8.1. Pitch-to-Pitch alignment

The basic alignment procedure is independent of the data processed: in principle, it can produce alignment markers and an alignment score for any two data vectors. The results of any speech processing step could therefore, at least theoretically, be used as input to this phase.

Fig. 8. - Pitch-to-Pitch alignment.

The alignment process is implemented as a branch-and-bound search algorithm in a state space whose nodes are pairs of T₀ trace segments belonging to the two signals. More precisely, a node in the state space is uniquely characterized by the pair of MASTER and STUDENT time points up to which the alignment has been done. First, the whole state space is generated, so all legal combinations of trace segments are inspected. As a path develops in this space, a new node is appended to the existing path if it obtains the best score among the possible candidates. Several criteria contribute to this score: best resemblance of the silence gaps between two pitch segments, best match of the segments' lengths, number of inner traces, etc.
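The path-extension step described above can be sketched as follows. This is a simplified greedy variant written for illustration, not the branch-and-bound search itself: it keeps only the best candidate at each step and never backtracks. The cost weights, the `max_group` limit and all function names are assumptions; the paper does not specify how the criteria are combined.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) of one voiced T0 trace segment, in seconds


def span(segs: List[Segment], lo: int, hi: int) -> float:
    """Duration covered by segments lo..hi-1, inner gaps included."""
    return segs[hi - 1][1] - segs[lo][0]


def gap_after(segs: List[Segment], k: int) -> float:
    """Silence gap between segment k-1 and segment k (0 at the end of the signal)."""
    return segs[k][0] - segs[k - 1][1] if k < len(segs) else 0.0


def step_cost(master, student, i, j, a, b, w_len=1.0, w_gap=1.0, w_inner=0.05):
    """Cost of matching a MASTER segments from i with b STUDENT segments from j."""
    length_diff = abs(span(master, i, i + a) - span(student, j, j + b))
    gap_diff = abs(gap_after(master, i + a) - gap_after(student, j + b))
    inner = (a - 1) + (b - 1)  # traces merged inside the matched groups
    return w_len * length_diff + w_gap * gap_diff + w_inner * inner


def greedy_align(master: List[Segment], student: List[Segment], max_group: int = 2):
    """Extend the path with the cheapest candidate node until one signal is exhausted."""
    i = j = 0
    path = [(0, 0)]  # each node: (MASTER segments consumed, STUDENT segments consumed)
    total = 0.0
    while i < len(master) and j < len(student):
        candidates = [
            (step_cost(master, student, i, j, a, b), i + a, j + b)
            for a in range(1, max_group + 1) if i + a <= len(master)
            for b in range(1, max_group + 1) if j + b <= len(student)
        ]
        cost, i, j = min(candidates)
        total += cost
        path.append((i, j))
    return path, total
```

In this sketch a lower cost plays the role of a better score. If the student's pitch trace breaks one master segment into two, the cheapest step is the one that groups both student segments against the single master segment, at the price of a small inner-trace penalty. A full branch-and-bound search would additionally keep alternative paths and prune those whose accumulated cost exceeds the best complete path found so far.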
