Corneliu Burileanu & al *
Text-to-Speech Synthesis for Romanian Language
4.2.2. The choice of the speaker
We adopted the following criteria
for this choice:
- he must be a person with a clear voice and a pleasant timbre, without excessive rises and falls in intonation;
- he must be able to read any imposed text fluently;
- he must also co-operate in reading the same words or sentences many times, or in different ways;
- finally, he must be able to keep the same rhythm and loudness over a period of time.
For several reasons, we adopted the principle of storing separately recorded syllables, even though this yields segmental units that are longer than their equivalents extracted from words.
4.2.3. Syllable decomposition
After a detailed linguistic and phonetic study of syllable structure in the Romanian language, we selected two basic rules and a set of exceptions for our purpose [14, 15].
The fundamental rules are:
- Each syllable must contain at least one vowel.
- The syllable boundary is placed between two vowels, at a position that depends on the number of consonants between them.
Some of the exceptions used are described next.
- If possible, the beginning of the word must be analyzed first, to separate the "stable" prefixes.
- Special attention is required for the groups ce, ci, ge, gi, che, chi, ghe, ghi, which represent a single sound, but with a context-dependent phonetic value.
- In most cases, the vowel "i" at the end of a word represents a semivowel (as in blo-curi, bu-căi, but also in ciu-perci, în-tregi). However, if the word ends in "consonant"-"r"-"i" or "consonant"-"l"-"i" (as in a-cri, in-te-gri), "i" is a vowel and forms a syllable with the two preceding consonants.
- Groups that contain several vowels need to be analyzed separately, depending on their position in the word (at the beginning, at the end, or in the middle).
For example, for two adjacent vowels, as in aer, real, întreagă, chior, șașiu, etc., there is no fixed rule, but we can mention some special situations:
- At the beginning of the word, the groups ia, ui, ie, iu, ăi, io are always diphthongs (as in iar-nă, ui-tătură, ie-șire, io-bagi), so they belong to the same syllable.
- At the end of the word, the groups ea, au, ău, eo, eu, ei, oi, îu, ou, îi, ăi, ui, iu, io are also diphthongs (as in ca-lea, a-rau, călău).
- In many situations, the groups ea, ai, oa, îi, ia are diphthongs in the middle of the word.
- The groups ea and ia are very unstable and in many cases they form a hiatus, so the syllable boundary falls between the two vowels.
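The positional lists above lend themselves to a simple table lookup. The following Python sketch is only an illustration, not the implementation described in this paper: the function name `classify_vowel_pair`, the set names and the returned labels are assumptions introduced here, and checking the "unstable" groups before the medial list is just one possible reading of the text.

```python
import unicodedata

# Vowel pairs listed in the text, grouped by their position in the word.
WORD_INITIAL = {"ia", "ui", "ie", "iu", "ăi", "io"}
WORD_FINAL = {"ea", "au", "ău", "eo", "eu", "ei", "oi", "îu", "ou",
              "îi", "ăi", "ui", "iu", "io"}
WORD_MEDIAL = {"ea", "ai", "oa", "îi", "ia"}
UNSTABLE = {"ea", "ia"}  # often hiatus when they occur inside the word


def classify_vowel_pair(word, start):
    """Classify the two-letter vowel group beginning at index `start`.

    The labels ("diphthong", "likely diphthong", "unstable", "undecided")
    are hypothetical names chosen for this sketch.
    """
    # Normalize so that letters such as "ă" are single code points.
    word = unicodedata.normalize("NFC", word).lower()
    pair = word[start:start + 2]
    if start == 0:
        return "diphthong" if pair in WORD_INITIAL else "undecided"
    if start + 2 == len(word):
        return "diphthong" if pair in WORD_FINAL else "undecided"
    if pair in UNSTABLE:
        return "unstable"
    if pair in WORD_MEDIAL:
        return "likely diphthong"
    return "undecided"
```

A call such as `classify_vowel_pair("iarnă", 0)` consults the word-initial list, while pairs that appear in none of the lists (for instance "ae" in aer) are reported as undecided, which reflects the remark that there is no fixed rule for them.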
- For the group VC...CV, the decomposition rules depend on the consonants between the vowels. For example:
- The -VCV- case: in all situations the decomposition rule is V-CV (ca-ta-log, si-la-bă, ca-targ, etc.).
- The -VCCV- case: usually, the decomposition rule is VC-CV (ac-tual, ar-mată, al-bit, etc.).
All these situations and other exceptions were interpreted, encoded and built into the algorithm that performs the automatic syllable decomposition.
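As an illustration of how the V-CV and VC-CV rules and the word-final "i" exception might be combined, here is a minimal Python sketch. It is not the algorithm used in this work. All function names are hypothetical, and several simplifications are assumptions made here: adjacent vowels are always treated as one nucleus (no hiatus), clusters of three or more consonants are split after the first consonant, and neither the "stable" prefixes nor the ce/ci/che/chi groups receive special handling.

```python
import unicodedata

VOWELS = set("aeiouăâî")


def final_i_is_vowel(word):
    """Word-final "i" is a full vowel only after consonant + "r" or "l"."""
    return (len(word) >= 3 and word[-1] == "i"
            and word[-2] in "rl" and word[-3] not in VOWELS)


def nuclei(word):
    """Return (start, end) spans of the vowel runs that act as nuclei."""
    spans, i, n = [], 0, len(word)
    while i < n:
        if word[i] in VOWELS:
            j = i
            while j < n and word[j] in VOWELS:
                j += 1
            spans.append((i, j))
            i = j
        else:
            i += 1
    # A lone word-final "i" after a consonant is usually a semivowel,
    # so it does not start a syllable of its own.
    if (len(spans) > 1 and spans[-1] == (n - 1, n) and word[-1] == "i"
            and not final_i_is_vowel(word)):
        spans.pop()
    return spans


def syllabify(word):
    """Split a word into syllables with the simplified rules above."""
    word = unicodedata.normalize("NFC", word).lower()
    spans = nuclei(word)
    if len(spans) < 2:
        return [word]
    cuts = []
    for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
        cluster = right_start - left_end  # consonants between two nuclei
        if right_start == len(word) - 1 and final_i_is_vowel(word):
            cut = right_start - 2         # a-cri: "i" takes both consonants
        elif cluster == 1:
            cut = left_end                # V-CV
        else:
            cut = left_end + 1            # VC-CV (assumed also for 3+ consonants)
        cuts.append(cut)
    bounds = [0] + cuts + [len(word)]
    return [word[a:b] for a, b in zip(bounds, bounds[1:])]
```

With these assumptions, `syllabify("catalog")` yields ca-ta-log and `syllabify("integri")` yields in-te-gri. Words whose adjacent vowels form a hiatus would be split incorrectly, which is why a real system needs the additional exception tables described above.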
4.2.4. The acoustic data-base structure
The acoustic data-base currently contains more than 4000 digitally stored syllables. They are kept in a single data file accompanied by several index files, so that searching is very fast. A syllable is found in two steps: first, the index file that refers to the syllable is identified; then the position of the syllable in the data file is found from the information provided by that index file. This search algorithm was implemented in the C++ programming language and uses eight index files.
The data-base structure is organized according to the number of phonemes that make up each syllable.
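The two-step search can be pictured with a small in-memory model. The Python sketch below is an illustration under stated assumptions, not the C++ implementation mentioned above: the class `SyllableStore` and its methods are hypothetical, a byte array stands in for the single data file, dictionaries stand in for the index files, and selecting the index by the number of letters in the syllable is only a stand-in for the phoneme-count organization, whose exact layout the text does not give.

```python
from collections import defaultdict


class SyllableStore:
    """Toy model: one data blob plus several index tables (all hypothetical)."""

    def __init__(self):
        self.data = bytearray()           # stands in for the single data file
        self.indexes = defaultdict(dict)  # stands in for the index files

    @staticmethod
    def index_key(syllable):
        # Assumption: one index per syllable length in letters.
        return len(syllable)

    def add(self, syllable, samples):
        """Append the samples to the data blob and record where they are."""
        offset = len(self.data)
        self.data += samples
        key = self.index_key(syllable)
        self.indexes[key][syllable] = (offset, len(samples))

    def find(self, syllable):
        """Step 1: pick the index. Step 2: read offset and size from it."""
        index = self.indexes.get(self.index_key(syllable))
        if index is None or syllable not in index:
            return None
        offset, size = index[syllable]
        return bytes(self.data[offset:offset + size])


store = SyllableStore()
store.add("ca", b"\x01\x02")
store.add("log", b"\x03\x04\x05")
```

Here `store.find("log")` first selects the index for three-letter syllables and then reads three bytes at the recorded offset, while a syllable that was never stored returns `None`. Splitting the index this way keeps each table small, which is one plausible reason for using several index files.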