Luciana Peev, Lidia Bibolar, Jodal Endre * A Formalization Model of the Romanian Morphology
A product was also created that could serve as the basis for a
computer-assisted system for learning Romanian inflection.
The information currently available is suitable and sufficient for
other special applications, such as automatic indexing and
interactive bilingual dictionaries, and it could support a
successful approach to computer-assisted translation.
The primary basis will be further extended with the attributes
needed for syntactic analysis, and it remains open to any
attributes required in other areas of linguistic interest.
ORTOGRAF is a spell-checking program. It can be regarded
as an independent unit that can be integrated into different text
processors. It is the first component of a complete linguistic
package, which will be extended with a syntactic analysis program
and a dictionary of synonyms. The package will thus resemble
the linguistic packages published by other firms for other
languages, which are intended to improve the quality of texts
written with text processors. The module includes a special function
for automatic hyphenation. The software relies on its own dictionary
of over 60,000 roots, which covers more than 2,000,000 word forms
because Romanian is a strongly inflected language.
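
The relation between a small root dictionary and a much larger set
of recognized word forms can be illustrated with a minimal sketch.
It is not ORTOGRAF's actual data model: the names PARADIGMS, ROOTS,
expand and is_known, the paradigm label and the two sample roots are
all assumptions made for illustration.

```python
# Hypothetical paradigm table: paradigm name -> endings attached to a root.
PARADIGMS = {
    "noun_f_a": ["ă", "e", "a", "ei", "ele", "elor"],
}

# Hypothetical root dictionary: root -> paradigm name.
ROOTS = {
    "cas": "noun_f_a",    # casă, "house"
    "camer": "noun_f_a",  # cameră, "room"
}


def expand(root):
    """List every word form generated by one root."""
    return [root + ending for ending in PARADIGMS[ROOTS[root]]]


def is_known(word):
    """Accept a word if some split into root + ending is licensed."""
    for i in range(1, len(word) + 1):
        root, ending = word[:i], word[i:]
        paradigm = ROOTS.get(root)
        if paradigm is not None and ending in PARADIGMS[paradigm]:
            return True
    return False
```

Here each root stands for six forms, so expand("cas") yields casă,
case, casa, casei, casele and caselor without any of them being
stored individually.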
The spell-checking module contains two important functions:
The correction mechanism copes reasonably well with hyphenated
constructions, which are specific to the Romanian language.
Orthographic analysis can also be controlled by means of user
dictionaries. The most common one enriches the vocabulary with
uninflected words (proper names, denominations, neologisms, etc.).
Another dictionary helps to recognize inflected words that are
currently absent from the program's own dictionaries, based on
their grammatical similarity to words that already exist there.
The dictionary of suggestions serves to improve the default
correction. Words that are otherwise correct but unwanted in the
current editing context can be placed in an exclusion dictionary.
These user dictionaries are plain ASCII text files that can be
edited with any editor, provided that the file is saved as ASCII
text.
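
How such user dictionaries might interact during a check can be
sketched as follows. The file layout is not specified above, so the
sketch assumes one entry per line for word lists and "wrong = right"
lines for the dictionary of suggestions; the function names are
hypothetical, and the similarity dictionary is left out for brevity.

```python
def parse_word_list(text):
    """One entry per line; blank lines are ignored (assumed format)."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def parse_pairs(text):
    """Lines of the assumed form 'wrong = right'."""
    pairs = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            pairs[key.strip()] = value.strip()
    return pairs


def check(word, main_words, extra_words, excluded, suggestions):
    """Return (accepted, suggestion) for a single word."""
    # The exclusion dictionary overrides every other dictionary.
    if word in excluded:
        return False, suggestions.get(word)
    if word in main_words or word in extra_words:
        return True, None
    # Unknown word: offer the user's preferred correction, if any.
    return False, suggestions.get(word)
```

Checking the exclusion dictionary first is a design choice of the
sketch: it lets a word that the main dictionary accepts still be
flagged in the current editing context.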
ORTOGRAF is independent of the fonts used, but it presupposes
the character codes of Windows code page 1250 (CP1250), which
is the accepted standard for Central and Eastern European languages.
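
What this code page dependence means for Romanian text can be shown
with a short sketch that uses Python's standard cp1250 codec. It is
an illustration only, not part of ORTOGRAF, and the helper name
to_cp1250 is an assumption. Code page 1250 contains the cedilla
forms of s and t but not the comma-below forms, so the sketch maps
the latter to the former before encoding.

```python
# Comma-below letters (not in CP1250) -> cedilla letters (in CP1250).
COMMA_TO_CEDILLA = str.maketrans({
    "\u0219": "\u015f",  # ș -> ş
    "\u021b": "\u0163",  # ț -> ţ
    "\u0218": "\u015e",  # Ș -> Ş
    "\u021a": "\u0162",  # Ț -> Ţ
})


def to_cp1250(text):
    """Encode text as CP1250 bytes after normalizing s and t."""
    return text.translate(COMMA_TO_CEDILLA).encode("cp1250")
```

Without the mapping step, encoding a comma-below letter with this
codec raises UnicodeEncodeError.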
At present, the product is integrated into the text processors
of the Microsoft Office package, running as a 16-bit version
for Word 6.0 and as a 32-bit version for Word 7.0. After
installation, the product signals its presence through two
additional options in the list of selectable languages:
ROMANIAN for the new spelling rules recommended by the Romanian
Academy and ROMANIAN (OLD) for the previous rules. The selected
language should be assigned to the section of text chosen for
analysis. The analysis itself is started with the editor's own
command and begins at the current insertion point in the text.
For unknown words, a dialogue box appears showing the editor's
usual information. Analysis can be restricted to small portions
of the text, the smallest unit being a single word. Any errors
flagged in the text can be corrected interactively by activating
the editing area, and the analysis can then be continued from
the current cursor position.
The product can be integrated into any editing system. All that
is required are the specific interfaces and the installation
procedures through which the software becomes an internal
function of the processor. It requires about 1 MB of disk space.