Romanian Language Technology

Corneliu Burileanu & al * Text-to-Speech Synthesis for Romanian Language

4.2.2. The choice of the speaker

We adopted the following criteria for this choice:

he must be a person with a clear voice, a pleasant timbre and no excessive rises and falls in intonation;
he must be able to read any given text fluently;
he must also be willing to read the same words or sentences many times, or in different ways;
finally, he must be able to keep the same rhythm and loudness over a period of time.

For several reasons, we adopted the principle of storing syllables that were recorded separately, even though the resulting segmental units are longer than their equivalents extracted from words.

4.2.3. Syllable decomposition

After a thorough linguistic and phonetic study of syllable structure in the Romanian language, we selected two basic rules and a set of exceptions for our purpose [14, 15].

The fundamental rules are:

Each syllable must contain at least one vowel.
Syllables are separated between two vowels, at a point that depends on the number of consonants between them.

Some of the exceptions used are described next.

If possible, the beginning of the word must be analyzed first, in order to separate the "stable" prefixes.
Special attention must be paid to the groups ce, ci, ge, gi, che, chi, ghe, ghi, which represent a single sound, but one whose phonetic value depends on the context.
In most cases, the vowel "i" at the end of a word is a semivowel (as in blo-curi, bu-căi, but also in ciu-perci, în-tregi); however, if the word ends in "consonant"-"r"-"i" or "consonant"-"l"-"i" (as in a-cri, in-te-gri), "i" is a vowel and forms a syllable with the two preceding consonants.
Groups that contain several vowels must be analyzed separately, depending on their position in the word (at the beginning, at the end, or in the middle).
For example, for two adjacent vowels, as in aer, real, întreagă, chior, șașiu, etc., there is no fixed rule, but some special situations can be noted:
- At the beginning of a word, the groups ia, ui, ie, iu, ăi, io are always diphthongs (as in iar-nă, ui-tătură, ie-șire, io-bagi), so they belong to the same syllable.
- At the end of a word, the groups ea, au, ău, eo, eu, ei, oi, îu, ou, îi, ăi, ui, iu, io are also diphthongs (as in ca-lea, a-rau, călău).
- In many situations, the groups ea, ai, oa, îi, ia in the middle of a word are diphthongs.
- The groups ea and ia are very unstable and in many cases form a hiatus, so the syllable boundary falls between the two vowels.
For the group VC...CV, the decomposition rules depend on the consonants between the two vowels. For example:
- The -VCV- case: in all situations the decomposition rule is V-CV (ca-ta-log, si-la-bă, ca-targ, etc.).
- The -VCCV- case: usually, the decomposition rule is VC-CV (ac-tual, ar-mată, al-bit, etc.).
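
The two consonant-count cases above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names are hypothetical, adjacent vowel letters are kept in the same syllable as a simplification, the exceptions that the word "usually" allows for in the -VCCV- case are ignored, and the handling of three or more consonants (split after the first one) is an assumption not stated in the text.

```python
VOWELS = set("aeiouăâî")  # Romanian vowel letters


def vowel_runs(word):
    """Return (start, end) index pairs for each run of adjacent vowel letters."""
    runs, i = [], 0
    while i < len(word):
        if word[i] in VOWELS:
            j = i
            while j < len(word) and word[j] in VOWELS:
                j += 1
            runs.append((i, j))
            i = j
        else:
            i += 1
    return runs


def split_by_consonant_count(word):
    """Hyphenate a word using only the V-CV and VC-CV rules."""
    word = word.lower()
    runs = vowel_runs(word)
    cuts = []
    for (_, end_prev), (start_next, _) in zip(runs, runs[1:]):
        n_consonants = start_next - end_prev
        if n_consonants == 1:
            cuts.append(end_prev)      # -VCV-  becomes V-CV
        else:
            # -VCCV- becomes VC-CV.
            # ASSUMPTION: longer clusters are also cut after the first consonant.
            cuts.append(end_prev + 1)
    pieces, start = [], 0
    for cut in cuts:
        pieces.append(word[start:cut])
        start = cut
    pieces.append(word[start:])
    return "-".join(pieces)


for w in ["catalog", "silabă", "catarg", "actual", "albit"]:
    print(w, "->", split_by_consonant_count(w))
```

On the example words from the text, the sketch reproduces the splits ca-ta-log, si-la-bă, ca-targ, ac-tual and al-bit. A complete implementation would also need the prefix, diphthong and hiatus exceptions listed earlier.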

All these situations, together with other exceptions, were interpreted, encoded and built into the algorithm that performs the automatic syllable decomposition.
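
As an illustration of how one such exception might be encoded, the word-final "i" rule from this section can be written as a small predicate. This is only a sketch under stated assumptions: the function name is hypothetical, it covers just the "consonant"-"r"-"i" and "consonant"-"l"-"i" pattern, and it treats every other final "i" as a semivowel, which the text says holds only "in most cases".

```python
VOWELS = set("aeiouăâî")  # Romanian vowel letters


def final_i_is_vowel(word):
    """Return True if a word-final "i" should be treated as a full vowel.

    Encodes only the stated exception: the word ends in
    consonant + "r" + "i" or consonant + "l" + "i".
    """
    word = word.lower()
    if len(word) < 3 or not word.endswith("i"):
        return False
    liquid, before = word[-2], word[-3]
    return liquid in "rl" and before not in VOWELS


for w in ["acri", "integri", "blocuri", "bucăi", "ciuperci", "întregi"]:
    print(w, "->", "vowel" if final_i_is_vowel(w) else "semivowel")
```

For the examples given in the text, the predicate marks acri and integri as ending in a vowel, and blocuri, bucăi, ciuperci and întregi as ending in a semivowel.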

4.2.4. The acoustic data-base structure

The acoustic data-base currently contains more than 4000 digitally stored syllables. They are kept in a single data file, accompanied by several index files, so that searching is very fast. A syllable is found in two steps: first, the index file referring to the syllable is identified; then the position of the syllable in the data file is obtained from the information provided by that index file. This search algorithm was implemented in C++, using eight index files.

The data-base is organized according to the number of phonemes that make up each syllable.
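
The two-step search described in this section can be sketched as follows. The original was written in C++ and its file formats are not given here, so this Python version is only an illustration under stated assumptions: the function names are hypothetical, in-memory objects stand in for the data file and the index files, and the index is selected by the number of letters in the syllable as a stand-in for its phoneme count.

```python
import io


def build_database(recordings):
    """Pack recordings (syllable -> bytes) into one data file plus several indexes."""
    data = io.BytesIO()   # stands in for the single data file
    indexes = {}          # ASSUMPTION: one index per syllable length in letters
    for syllable, samples in recordings.items():
        offset = data.tell()
        data.write(samples)
        indexes.setdefault(len(syllable), {})[syllable] = (offset, len(samples))
    return data, indexes


def lookup(data, indexes, syllable):
    """Return the stored samples for a syllable, or None if it is not stored."""
    # Step 1: find the index that refers to this syllable.
    index = indexes.get(len(syllable))
    if index is None or syllable not in index:
        return None
    # Step 2: use the index entry to locate the syllable in the data file.
    offset, length = index[syllable]
    data.seek(offset)
    return data.read(length)


# Toy data: the byte strings are placeholders, not real audio.
recordings = {"ca": b"\x01\x02", "ta": b"\x03", "log": b"\x04\x05\x06"}
data, indexes = build_database(recordings)
print(lookup(data, indexes, "log"))
```

Splitting the index into several smaller tables keeps each search short, while the single data file is only accessed once the exact offset is known.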
