Felicia Şerban & al. * Database of the Romanian Language Phonetics and Phonology




5. The utility of the database

5.1. Scientific applications

We mention some of the possible scientific applications of the database:

5.2. Practical applications

5.2.1. 'ORTOGRAF': spell checking for the Romanian language

'ORTOGRAF' is a spell-checking program. It can be regarded as an independent unit that can be integrated into different text processors. It is the first component of a complete linguistic package, which may later be extended with software for syntactic analysis and a dictionary of synonyms. It will then be similar to the linguistic packages published by several companies for other languages, which are intended to improve the quality of texts written with text processors. The module includes a special function for automatic hyphenation. The software relies on its own dictionary of over 60,000 roots, which covers more than 2,000,000 word forms because Romanian is a highly inflected language.

The spell-checking module contains two important functions:

5.2.1.1. The spell-checking function takes as input a word from the text under analysis and returns a verdict on whether the word is correct. Its recognition algorithm works by identifying the grammatical components of the word: it checks whether the root exists in the dictionary and whether the other components (prefixes, suffixes, and endings) exist in its internal schemes[15,16,17]. The final answer is determined by checking that these components combine coherently according to the grammar rules of Romanian. For a wrong or unknown word, the module offers (where possible) a list of suggested correct forms, also taking possible typing errors into account.

The correction procedure also handles reasonably well the hyphenated constructions that are specific to Romanian.
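The recognition and suggestion steps described above can be illustrated with a minimal sketch. It is not the ORTOGRAF algorithm: the roots, prefixes, endings, and the coherence table are hypothetical toy data, and the suggestion step simply tries every form that is one typing error away, which is an assumption about how "possible typing errors" might be handled.

```python
# Toy data: every root, prefix, and ending below is a hypothetical example,
# not an entry from the ORTOGRAF dictionary.
ROOTS = {"cas": "noun_f", "lucr": "verb_1"}
PREFIXES = {"", "re"}
ENDINGS = {
    "noun_f": {"ă", "e", "a", "ele"},
    "verb_1": {"a", "ez", "ezi", "ează", "ăm", "ați"},
}
# Coherence rule (assumed): which prefixes each root class may combine with.
ALLOWED_PREFIXES = {"noun_f": {""}, "verb_1": {"", "re"}}
ALPHABET = "abcdefghijklmnopqrstuvwxyzăâîșț"


def analyses(word):
    """Yield every (prefix, root, ending) split whose parts combine coherently."""
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for i in range(1, len(rest) + 1):
            root, ending = rest[:i], rest[i:]
            cls = ROOTS.get(root)
            if cls is None:
                continue  # root not in the dictionary
            if prefix in ALLOWED_PREFIXES[cls] and ending in ENDINGS[cls]:
                yield prefix, root, ending


def is_correct(word):
    """A word is accepted if at least one coherent analysis exists."""
    return next(analyses(word), None) is not None


def suggest(word):
    """Return accepted forms one typing error away from the given word."""
    candidates = set()
    for i in range(len(word) + 1):
        left, right = word[:i], word[i:]
        if right:
            candidates.add(left + right[1:])  # one letter deleted
            candidates.update(left + c + right[1:] for c in ALPHABET)  # replaced
        if len(right) > 1:
            candidates.add(left + right[1] + right[0] + right[2:])  # two swapped
        candidates.update(left + c + right for c in ALPHABET)  # one inserted
    return sorted(c for c in candidates if is_correct(c))
```

With this toy data, `is_correct("relucrez")` holds because the prefix, root, and ending are all known and compatible, while `is_correct("recasă")` fails at the coherence check even though each part exists on its own.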

Orthographic analysis can also be controlled by means of user dictionaries. The most common one enriches the vocabulary with uninflected words (proper names, denominations, neologisms, etc.). Another dictionary helps the module recognize inflected words that are currently absent from its own dictionaries, on the basis of grammatical similarities with words that already exist. The dictionary of suggestions serves to improve the default corrections. Words that are correct but unwanted in the current editing context can be placed in an exclusion dictionary. These user dictionaries are plain ASCII text files that can be edited with any text editor, the only condition being that the file remains plain ASCII.
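As a rough illustration of how such user dictionaries could interact with the built-in check, the sketch below parses two word lists and applies them in a fixed order. The file layout (one entry per line, `#` for comments), the function names, and the precedence of the exclusion list over the additions are all assumptions; the source only states that the files are plain ASCII text.

```python
def parse_word_list(text):
    """Parse a user dictionary: one entry per line (assumed layout).

    Blank lines and lines starting with '#' are ignored (assumed convention).
    Non-ASCII content is rejected, matching the stated ASCII-only condition.
    """
    if not text.isascii():
        raise ValueError("user dictionaries must be plain ASCII")
    words = set()
    for line in text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):
            words.add(entry)
    return words


def check_with_user_dictionaries(word, builtin_check, additions, exclusions):
    """Combine the built-in verdict with the user dictionaries.

    Assumed precedence: exclusions win, then additions, then the built-in check.
    """
    if word in exclusions:
        return False  # correct in general, but unwanted in this context
    if word in additions:
        return True   # uninflected word supplied by the user
    return builtin_check(word)
```

In practice the text would come from a file, for example `Path(name).read_text(encoding="ascii")`, and `builtin_check` would be the spell-checking function of the module.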

5.2.1.2. The hyphenation function takes as input the word to be hyphenated and returns the sequence of syllables determined by the hyphenation rules. This function can also be extended by means of a user dictionary that lists the exceptions to the usual hyphenation rules.
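The interplay between rules and an exception dictionary can be sketched as follows. The rules used here are a deliberately simplified subset chosen for illustration (a single consonant between vowels goes to the next syllable; a longer cluster is split after its first consonant, unless it is a two-letter cluster such as "bl" or "cr", which stays together); adjacent vowels are never separated. These are assumptions, not the rules implemented in ORTOGRAF, and the exception entry is only an example.

```python
VOWELS = set("aeiouăâî")
BINDING = set("bcdfghptv")  # consonants that stay attached to a following l or r
LIQUIDS = set("lr")


def hyphenate(word, exceptions=None):
    """Split a word into syllables; an exceptions dict overrides the rules.

    `exceptions` maps a word to its hyphenated form, e.g. {"inegal": "in-e-gal"}
    (assumed layout for the user dictionary).
    """
    if exceptions and word in exceptions:
        return exceptions[word].split("-")
    w = word.lower()
    n = len(w)
    syllables, start, i = [], 0, 0
    while True:
        while i < n and w[i] not in VOWELS:  # skip the syllable onset
            i += 1
        while i < n and w[i] in VOWELS:      # skip the vowel group
            i += 1
        j = i
        while j < n and w[j] not in VOWELS:  # measure the consonant cluster
            j += 1
        if j == n:                           # no further vowel: last syllable
            break
        cluster = w[i:j]
        if len(cluster) == 1:
            cut = i                          # V-CV
        elif len(cluster) == 2 and cluster[0] in BINDING and cluster[1] in LIQUIDS:
            cut = i                          # V-CCV for clusters like "bl", "cr"
        else:
            cut = i + 1                      # VC-CV
        syllables.append(word[start:cut])
        start, i = cut, j
    syllables.append(word[start:])
    return syllables
```

For instance, `hyphenate("carte")` splits the cluster "rt" while `hyphenate("tablă")` keeps "bl" together; passing `{"inegal": "in-e-gal"}` as the exceptions replaces the rule-based result for that word.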

'ORTOGRAF' is independent of the fonts used, but it implicitly assumes the character codes of Windows code page 1250 (CP1250), which is the accepted standard for the languages of Eastern Europe.
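Why the code page matters can be shown with a short sketch using Python's standard `cp1250` codec. The sample word is our own choice; it shows that the same bytes read back correctly under CP1250 but turn into different letters when a Western code page is assumed instead.

```python
# "ţară" written with the cedilla letter that CP1250 contains (U+0163).
word = "\u0163ar\u0103"

encoded = word.encode("cp1250")            # one byte per character
decoded_right = encoded.decode("cp1250")   # same code page: text is preserved
decoded_wrong = encoded.decode("latin-1")  # Western code page: letters change

print(list(encoded))    # byte values stored in the file
print(decoded_right)
print(decoded_wrong)
```

A checker that compares byte codes against its dictionary would therefore fail to recognize Romanian letters in text stored under a different code page, regardless of the font used to display it.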

At present, the product is integrated into the word processors of the Microsoft Office package: it runs as a 16-bit version with Word 6.0 and as a 32-bit version with Word 7.0. Once installed, the product signals its presence by adding two options to the list of selectable languages: ROMANIAN for the new orthography recommended by the Romanian Academy and ROMANIAN (OLD) for the previous orthography. The selected language must be assigned to the section of text chosen for analysis. The analysis is launched with the editor's own spell-checking command and starts from the current insertion point in the text. For unknown words, a dialogue box appears showing the information specific to the text editor. The analysis can be restricted to small portions of the text, the smallest unit being a single word. Any errors flagged in the text can be corrected interactively by activating the editing area, and the analysis can then be resumed from the current cursor position.

The product can be integrated into any editing system. All that is required are the specific interfaces and the installation procedures through which the software becomes an internal function of the text processor.

5.2.2. Other applications

Facilities for teaching Romanian as a foreign language can be created in the form of electronic textbooks at different levels (beginner and several advanced levels). The vocabulary may vary in size, with translations of the meanings into different languages and with automatic inflection of all the words, accompanied by the correct pronunciation (spoken, written, and in phonetic transcription).


