The efforts towards building human-computer
voice interfaces have now reached a stage where, for a few languages,
existing commercial products can automatically recognise
speech from vocabularies of thousands to tens of thousands of words,
uttered with short pauses between them, in a speaker-dependent
or speaker-adaptive manner, and synthesise speech from a wide range of
input texts.
The speed at which this area is evolving leads us to
believe that speech technology must be a priority for Romanian
research, and the projects undertaken by our group¹, briefly described
in what follows, are attempts to act on this belief.
A first and essential step towards
spoken language computer interfaces is the design and collection
of appropriate linguistic resources (text archives, pronunciation
dictionaries, and speech databases) that support both fundamental
research in phonetics and phonology and application development.
Section 2 gives some details of our work in this area. Text-to-speech
conversion, described in Section 3, is an essential component
of a conversational computer system, together with speech recognition,
which is the topic of Section 4. The last section draws
some conclusions from the work so far and outlines future
plans.
2. Linguistic resources
To explore the use of statistical modelling methods (hidden Markov
models [2]) in speech recognition [3] and the impact of
speech-processing methods on their performance [4], the first
linguistic resource we collected was a small speech database [5].
It consists of 300 signal files containing isolated utterances in
Romanian of the digits 0 to 9, repeated three times by each of
100 speakers (50 men and 50 women), plus a file recording, for each
speaker, personal data on aspects that had or could have had an
influence on their voice quality: sex, height, weight, age, mother
tongue, smoking habits, etc. The database is organised into a
training set of 68 speakers and a test set of 32 speakers, and is
partly hand-labelled at the word level. It has been used for speech
recognition studies, but its use is by no means limited to this,
and we can make it available to other interested laboratories.
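
As an illustration only, the organisation of the digit database described in this section can be sketched in Python. The sketch assumes that each of the 300 signal files holds one repetition of the ten digits by one speaker, which is consistent with the counts given (100 speakers, three repetitions each). The `SignalFile` record, the speaker numbering, and the random assignment of speakers to the two sets are hypothetical and are not taken from the database itself.

```python
import random
from dataclasses import dataclass

N_SPEAKERS = 100      # 50 men and 50 women, as stated in the text
N_REPETITIONS = 3     # each speaker repeats the digit list three times
N_TRAIN_SPEAKERS = 68 # the remaining 32 speakers form the test set


@dataclass(frozen=True)
class SignalFile:
    """Hypothetical index record for one signal file."""
    speaker_id: int
    repetition: int
    sex: str


def build_index():
    """Enumerate one record per (speaker, repetition) pair."""
    files = []
    for speaker_id in range(N_SPEAKERS):
        # Assumption: speakers 0-49 are men, 50-99 are women.
        sex = "M" if speaker_id < N_SPEAKERS // 2 else "F"
        for repetition in range(N_REPETITIONS):
            files.append(SignalFile(speaker_id, repetition, sex))
    return files


def split_by_speaker(files, n_train=N_TRAIN_SPEAKERS, seed=0):
    """Split so that no speaker appears in both sets.

    Assumption: speakers are assigned at random; the source does not
    say how the 68/32 partition was actually chosen.
    """
    speakers = sorted({f.speaker_id for f in files})
    random.Random(seed).shuffle(speakers)
    train_ids = set(speakers[:n_train])
    train = [f for f in files if f.speaker_id in train_ids]
    test = [f for f in files if f.speaker_id not in train_ids]
    return train, test


files = build_index()
train, test = split_by_speaker(files)
print(len(files), len(train), len(test))  # 300 204 96
```

Splitting by speaker rather than by file keeps every recording of a given speaker on one side of the partition, so that results on the test set reflect performance on unseen voices.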
¹ This paper is based in part on activities funded by the
European Commission through Contract Copernicus 1304/1994 and the Romanian Ministry
of Education through Contracts 4004/1995 and 5004/1996, implementing NURC Grants
56/1995, 354/1996 and 355/1996.