Luciana Peev, Lidia Bibolar, Jodal Endre * A Formalization Model of the Romanian Morphology




Applying the codification system to the irregular verb "a lua", we notice that the verb has three roots, "lu"/"ia"/"ie", so the alternation affects the entire root form. The method of variable-value letters, used in defining the alternations, is extended here: what varies is no longer a single letter but the root itself. The root can therefore be treated as a variable, "l0", for which we define, as presented above, the following values:

Once the root is defined, the corresponding generic flexionary classes can be established. The first level establishes the group: the verb "a lua" belongs to the first conjugation, that is, group A, and, since the Gerund (luând) divides group A into two, the corresponding flexionary group is A2. The second level of classification concerns the forms of the verb in the Present Indicative and Subjunctive, third person singular, which place it in subclass A8. From this point on it no longer matters whether the verb is irregular: the algorithm that handles flexion will interpret the root according to the grammatical categories (mood, tense, person, number) and will select the flective from the set belonging to class A1 and subclass A8, according to the same categories.
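The selection step described above can be illustrated with a minimal sketch for the Present Indicative of "a lua". The three root values come from the text; the table that maps person and number to a root value, the list of flectives, and all names (ROOT_VALUES, ROOT_INDEX, FLECTIVES, inflect) are assumptions made for illustration and are not the thesaurus' actual encoding.

```python
# Hypothetical sketch: the names and tables below are illustrative assumptions.
# The root variable "l0" takes one of the three values named in the text.
ROOT_VALUES = {"l0": ("lu", "ia", "ie")}

# Assumed selection table for the Present Indicative:
# (person, number) -> index into ROOT_VALUES["l0"].
ROOT_INDEX = {
    (1, "sg"): 1, (2, "sg"): 2, (3, "sg"): 1,
    (1, "pl"): 0, (2, "pl"): 0, (3, "pl"): 1,
}

# Assumed flectives for the same tense; this table stands in for the
# lookup by flexionary class and subclass.
FLECTIVES = {
    (1, "sg"): "u", (2, "sg"): "i", (3, "sg"): "",
    (1, "pl"): "ăm", (2, "pl"): "ați", (3, "pl"): "u",
}


def inflect(root_var, person, number):
    """Pick the root value and the flective for the requested categories."""
    root = ROOT_VALUES[root_var][ROOT_INDEX[(person, number)]]
    return root + FLECTIVES[(person, number)]


forms = [inflect("l0", p, n) for n in ("sg", "pl") for p in (1, 2, 3)]
print(forms)  # ['iau', 'iei', 'ia', 'luăm', 'luați', 'iau']
```

Once the root is a variable, the irregular verb goes through the same lookup as a regular one; only the tables differ.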

Note: both the root and the flectives are established according to the classical theory, and we consider that the lexico-morphological thesaurus of the Romanian language that we created can serve as the basis for various applications dedicated to the theoretical study of Romanian morphology.

3.5. Determining the data structures

The structure of an article is specific to the part of speech it represents. In describing the structures, we mention only those grammatical attributes that are useful in solving the problems of morphological analysis and generation. The primary structure of each article contains the identifying elements: the lemma, the coded root and the generic flexionary classes.

Noun representation: {lemma, root, ctg, clsflex, gender, ...}
Verb representation: {lemma, root, ctg, clsflex, ...}
Inflexible parts representation: {lemma, root, ctg, ...}
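As a rough illustration, the three representations could be modelled as record types sharing the identifying fields. The field names follow the listing above; the class names, the types and the sample values are assumptions, and the attributes elided by "..." are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Article:
    """Fields shared by every article: the identifying elements."""
    lemma: str
    root: str   # coded root, possibly a variable such as "l0"
    ctg: str    # part-of-speech category


@dataclass
class VerbArticle(Article):
    clsflex: Tuple[str, ...] = ()   # generic flexionary classes


@dataclass
class NounArticle(Article):
    clsflex: Tuple[str, ...] = ()
    gender: Optional[str] = None


@dataclass
class InflexibleArticle(Article):
    """Inflexible parts of speech carry no flexionary class."""


# Illustrative values only; the category labels are assumed.
verb = VerbArticle(lemma="a lua", root="l0", ctg="verb", clsflex=("A2", "A8"))
conj = InflexibleArticle(lemma="și", root="și", ctg="conjunction")
print(verb)
print(conj)
```

Putting the shared fields in one base type reflects the statement that every article carries the same identifying elements, while each part of speech adds only the attributes it needs.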

4. Dictionary generation

An interactive environment was designed to create the lexical thesaurus on the basis of the principles stated above. This environment integrates all the functions needed to enter and verify the grammatical attributes, to exploit and preserve the primary database, and to study it by means of query and browse functions. The basic dictionary of the resulting thesaurus does not take part directly in any application. To build a linguistic application, one needs to define the problem and the necessary resources; the dictionaries specific to the application can then be generated accordingly. This is done by the dictionary generators integrated in the working environment of the computerized thesaurus of the Romanian language developed by our team.

For example, for a recognition and morphological analysis application for Romanian, a specific dictionary is created in which, for the verb "a absorbi", the roots and their corresponding flexionary classes are generated:

absorb x1
absoarb x2

In the case of translation from another language into Romanian, where a dictionary of correspondences between the two languages must be created, the verb "a absorbi" appears instead in the following form:

abso0rb y1

The flexionary classes x1 and x2 in the first case and y1 in the second are generated from the morphological data in the thesaurus. The general diagram for obtaining specialized dictionaries, according to the application, is the following:

                                                                APPLICATIONS
lexical thesaurus ---> dictionary generator ---> optimizer ---> spelling checker
                                                                morphological analyzer
                                                                automatic translation
                                                                ............
                                                                other applications
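The flow in the diagram can be sketched as one thesaurus entry feeding two application-specific generators. Everything here is hypothetical: the entry layout, the values assumed for the variable letter "o0" ("o" and "oa"), the function names, and the trivial optimizer, which only removes duplicate lines. The class labels x1, x2 and y1 are copied from the example above as placeholders; the sketch does not derive them from morphological data.

```python
# Hypothetical thesaurus entry; the values of the variable letter "o0"
# are an assumption made for this sketch.
THESAURUS = [
    {"lemma": "a absorbi", "root": "abso0rb", "values": {"o0": ("o", "oa")}},
]


def expand_roots(entry):
    """Substitute every value of each variable letter into the coded root."""
    roots = [entry["root"]]
    for var, values in entry["values"].items():
        roots = [r.replace(var, v) for r in roots for v in values]
    return roots


def analyzer_dictionary(thesaurus):
    """One line per concrete root, each with a placeholder class label."""
    lines = []
    for entry in thesaurus:
        for i, root in enumerate(expand_roots(entry), start=1):
            lines.append(f"{root} x{i}")
    return lines


def translation_dictionary(thesaurus):
    """One line per lemma, keeping the coded root with its variable letter."""
    return [f"{entry['root']} y1" for entry in thesaurus]


GENERATORS = {
    "morphological analyzer": analyzer_dictionary,
    "translation": translation_dictionary,
}


def build(application, thesaurus):
    """Generator followed by a stand-in optimizer (order-preserving dedup)."""
    lines = GENERATORS[application](thesaurus)
    return list(dict.fromkeys(lines))


print(build("morphological analyzer", THESAURUS))  # ['absorb x1', 'absoarb x2']
print(build("translation", THESAURUS))             # ['abso0rb y1']
```

Keeping the generators in a table keyed by application mirrors the idea that the basic dictionary is never used directly: each application asks for the derived dictionary it needs.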


