Dealing with Prosody. A Computer-Assisted Language Learning Approach

Maria-Mirela Petrea, Dan Cristea

1. Introduction

PROSODICS (see also [3, 4, 10]) is a stand-alone application designed to address the problem of audio-active comparison in computer-assisted foreign language learning. It currently runs on Macintosh platforms and is extensible through Apple events; because it is accessible by means of high-level events, its main procedures may be called from any other application (see [10]). Although the main goal of the application is to serve as a computer-aided language learning environment, PROSODICS also incorporates features that make it useful as a research tool, mainly for prosody annotation.

Before describing the application, and to make clearer how it works, it is worth looking at the motivation that led to its development. Systems for computer-aided language learning usually incorporate exercises intended to help students improve their pronunciation of phonemes, syllables, words or longer utterances. The student may be trained in the correct perception of utterances by repeatedly listening to speech signals pronounced with high accuracy by native speakers, and in the correct production of utterances by means of coarticulation rules, indications about controlling the position of the vocal tract articulators, and so on. But when it comes to evaluating the student's performance, that is, when the student herself is speaking, no feedback comes from the automatic tutor, and it is left to the student to judge whether her performance is good or not.

PROSODICS is a system that reacts to the student's actions, providing the needed feedback. The general scenario the application stages is as follows: a master signal (pronounced by a native speaker), optionally labeled with phonological information, is supplied to the student; in turn, the student records and listens to her own voice, comparing it to the master's, and then waits for the system's reply to see whether her pronunciation was good or not; the reply consists of a visual aid and a written diagnosis, together with indications on how to correct the mistake.

PROSODICS compares the two signals (named master-signal and student-signal), assigns a score to the student and, if the score is poor, tells her where the mistake is and what it consists of. The parameters on which the comparison focuses are obtained from time-domain analysis and concern energy, voicing and F₀ estimation. Energy is used for segmenting the signals, while accurate F₀ estimation is essential for approximating the intonational contour and for deciding where the main accent of the utterance lies. The final diagnosis relies on rhythm and duration, the overall intonational curve, lexical and sentential stress, and the prosody pattern.
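To make the time-domain parameters more concrete, the following minimal sketch computes a frame energy and an F₀ estimate with a voicing decision. It assumes a plain autocorrelation pitch estimator, which is a common textbook technique and not necessarily the method used in PROSODICS; the function names, the 60-400 Hz search range and the voicing threshold of 0.3 are illustrative assumptions.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame (a simple time-domain energy)."""
    return sum(x * x for x in frame) / len(frame)


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0,
                voicing_threshold=0.3):
    """Return an F0 estimate in Hz, or None if the frame looks unvoiced.

    Assumption: plain autocorrelation peak picking; the search range and
    the voicing threshold are illustrative values, not taken from the paper.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return None  # pure silence: no pitch to estimate
    min_lag = int(sample_rate / f0_max)
    max_lag = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalised by the frame energy.
        corr = sum(frame[i] * frame[i + lag]
                   for i in range(len(frame) - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    if best_lag == 0 or best_corr < voicing_threshold:
        return None  # weak periodicity: treat the frame as unvoiced
    return sample_rate / best_lag


if __name__ == "__main__":
    rate = 8000
    # Synthetic 50 ms "voiced" frame: a 200 Hz sine wave.
    voiced = [math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]
    print(frame_energy(voiced), estimate_f0(voiced, rate))
```

Applied frame by frame, the energy values could drive segmentation and the sequence of F₀ values would approximate the intonational contour, with `None` marking unvoiced or silent frames.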

To carry out the task described, dedicated procedures have been implemented for silence detection, fricative detection, F₀ estimation, noise elimination, detection of the boundaries delimiting speech units, prosody computation, signal-to-signal alignment, and signal-to-text alignment.
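As an illustration of what signal-to-signal alignment involves, the sketch below aligns two per-frame contours (for instance, energy or F₀ values of the master and student signals) by dynamic time warping. DTW is assumed here only as a standard alignment technique; the text above does not state which algorithm PROSODICS uses, and the function name `dtw_align` is hypothetical.

```python
def dtw_align(master, student):
    """Align two numeric sequences with dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs
    (master_index, student_index). Assumption: absolute difference as
    the local distance and the three classic step directions.
    """
    n, m = len(master), len(student)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of master[:i] with student[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(master[i - 1] - student[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # master frame repeated
                                 cost[i][j - 1],      # student frame repeated
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    # The student holds the second value one frame longer than the master.
    total, path = dtw_align([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
    print(total, path)
```

The warping path indicates which student frames correspond to which master frames, so differences in duration and rhythm can be separated from differences in the contour values themselves.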

The paper is organized as follows: Section 2 gives a short description of how the application behaves during a working session, Section 3 gives details of the technical realization, and Section 4 contains some concluding remarks.
