Corneliu Burileanu & al * Text-to-Speech Synthesis for Romanian Language
Two main applications of speech
synthesis may be distinguished [3]:
A. The synthesis of messages
made up of previously established isolated words or sentences
Typical applications are public
announcement systems and the transmission of standard messages
by phone. The vocabulary is small, and each word or sentence
is processed once and then stored. Despite the currently low
price of memory circuits, it must be observed that one second of
speech requires 64-128 kb with simple PCM coding and 2.4 kb with
linear predictive coding (LPC). The essential
issues for these systems are the naturalness of the concatenation
of words or word groups and a suitable phrase intonation.
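The storage figures quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that "kb" means kilobits with 1 kbit = 1000 bits, and that the 64 and 128 kb figures correspond to 8 kHz sampling at 8 and 16 bits per sample; the sampling parameters and the function names are illustrative assumptions, not values given in the text.

```python
# Bit rates quoted in the text, in bits per second.
# Assumption: "kb" means kilobits and 1 kbit = 1000 bits.
PCM_LOW_BPS = 64_000
PCM_HIGH_BPS = 128_000
LPC_BPS = 2_400


def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of plain (uncompressed, mono) PCM in bits per second."""
    return sample_rate_hz * bits_per_sample


def storage_bytes(duration_s: float, bit_rate_bps: int) -> float:
    """Memory needed to store duration_s seconds of speech, in bytes."""
    return duration_s * bit_rate_bps / 8


if __name__ == "__main__":
    # Hypothetical 10-second stored announcement.
    duration_s = 10
    for name, rate in [("PCM (low)", PCM_LOW_BPS),
                       ("PCM (high)", PCM_HIGH_BPS),
                       ("LPC", LPC_BPS)]:
        print(f"{name}: {storage_bytes(duration_s, rate):.0f} bytes")
```

Under these assumptions, LPC needs roughly 27 to 53 times less memory than PCM for the same message, which is why the coding choice matters even for a small, fixed vocabulary.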
B. The automatic conversion
of stored text to synthetic speech (Text-to-Speech)
Terminal-based applications of
TTS technology include talking terminals and training devices,
warning and alarm systems, talking aids for the hearing-impaired
and vocally handicapped, and reading aids for blind people.
Audiotext services allow users to retrieve information from public
or private databases using a telephone as a terminal. While some
of this information could be provided using stored human speech,
TTS systems are appropriate when services access a large or frequently
changing database. TTS systems also reduce storage needs to a
few hundred bits for an equivalent text sentence.
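To make the "few hundred bits" figure concrete, the sketch below compares the size of a sentence stored as text with the size of the same sentence stored as coded speech. It assumes 8 bits per character and a hypothetical speaking time of 3 seconds; the example sentence, the duration and the function names are illustrative and do not come from the text.

```python
def text_bits(sentence: str, bits_per_char: int = 8) -> int:
    """Bits needed to store a sentence as plain text.

    Assumption: a fixed-width character code (8 bits per character by default).
    """
    return len(sentence) * bits_per_char


def speech_bits(duration_s: float, bit_rate_bps: int) -> int:
    """Bits needed to store the same sentence as coded speech."""
    return int(duration_s * bit_rate_bps)


if __name__ == "__main__":
    sentence = "The train to Bucharest leaves at nine."  # hypothetical message
    duration_s = 3  # assumed speaking time in seconds

    as_text = text_bits(sentence)
    as_pcm = speech_bits(duration_s, 64_000)  # simple PCM at 64 kbit/s
    as_lpc = speech_bits(duration_s, 2_400)   # LPC at 2.4 kbit/s

    print(f"text: {as_text} bits")
    print(f"PCM:  {as_pcm} bits ({as_pcm / as_text:.0f}x the text size)")
    print(f"LPC:  {as_lpc} bits ({as_lpc / as_text:.0f}x the text size)")
```

Even against LPC-coded speech, the text form is smaller by more than an order of magnitude under these assumptions, which is the storage argument for TTS when the database is large or changes often.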
The vocabulary is theoretically
unlimited, and speech must be as fluent, intelligible
and natural as possible. An appropriate number of segmental synthesis symbols
must be found for the synthesis of unrestricted text.
Following the features and historical
evolution of the applications listed above, automatic speech
synthesis methods are generally divided into three types:
The principles of these three
methods are presented in Figure 1, and a few details of each method
are discussed below.
Fig. 1. - Basic
principles of three speech synthesis methods.