A Formalization Model of the Romanian Morphology

Luciana Peev, Lidia Bibolar, Jodal Endre


1. Introduction

Our group has been working in the field of artificial programming languages and, since the 1990s, has focused on the practical use of natural languages. Our first efforts were directed towards software for editing Romanian texts, more specifically a spelling checker. This work led us to accumulate a great deal of linguistic knowledge which, if properly systematized, can lead to a computerized model of morphology.

2. A few theoretical linguistic aspects

We first recall some linguistic aspects that formed the basis of our definition of the morphological attributes and of the flexionary classes in the Romanian language thesaurus we have defined.

Non-flexible words have a single form and are easy to analyze from a lexical point of view. Flexible words change their form in different syntactic contexts, i.e. they can be declined or conjugated. The sum of all the flexionary forms of a word represents its paradigm:

P:: + {Root + Flective}

The Root is compulsory for any flexionary unit and is also the element that carries the content of the unit. Within the paradigm, the root may vary because of phonetic alternations or irregular forms.

The Flective is the differentiating element within the paradigm. Unlike the root, it can be zero. The flective varies according to the grammatical categories of the part of speech: for nouns, the variation corresponds to number, case, gender and determination; for verbs, it corresponds to mood, tense, number and person.
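The decomposition of a paradigm into a root and a set of flectives can be sketched in Python. This is only an illustration under our own assumptions: the verb "a aduna" (to gather), the restriction to the present indicative, and the names ROOT, FLECTIVES and paradigm are chosen for the example and do not come from the thesaurus described here.

```python
# Illustrative data (an assumption, not taken from the thesaurus):
# present indicative of the Romanian verb "a aduna" (to gather).
ROOT = "adun"

# Each flective is keyed by the grammatical attributes it expresses,
# here (number, person). The first entry shows a zero flective.
FLECTIVES = {
    ("singular", 1): "",
    ("singular", 2): "i",
    ("singular", 3): "ă",
    ("plural", 1): "ăm",
    ("plural", 2): "ați",
    ("plural", 3): "ă",
}


def paradigm(root, flectives):
    """Build every form as root + flective, keyed by its attributes."""
    return {attrs: root + flective for attrs, flective in flectives.items()}


forms = paradigm(ROOT, FLECTIVES)
for attrs, form in forms.items():
    print(attrs, form)
```

The sketch ignores root variation: a verb whose root undergoes phonetic alternation would need more than one root entry.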

3. Ways of determining the structure of the computerized Romanian lexicon

The computerized Romanian lexical thesaurus is intended to be the fundamental theoretical basis for building automatic tools for further work on the linguistic information specific to our language. This information is conceived and structured so that it allows the development of a wide range of applications. So far we have focused on developing spelling-checker software (which both corrects and generates forms), and the base contains only the specific morphological attributes. The attributes introduced so far have allowed us to create a first version of a morphological spelling checker, and they also make it possible to continue the work towards a complete morphological analyzer (which is currently only at the stage of the automatic conjugation of verbs). The thesaurus is a primary basis, which can be enriched with new attributes specific to the different fields of language. To build application-specific tools, we must use specific dictionary generators. The purpose of such a generator is to obtain better dictionaries containing only the minimal information necessary for a given application.

3.1. A definition of the flexionary classes

The variable part of a flexionary form is called the flective. Within the same part of speech, the flectives corresponding to different grammatical categories can have different forms. This is the criterion for grouping words that belong to the same part of speech into flexionary classes. The flectives which characterize a flexionary class are the same for all the words which belong to that class. The flexionary class associated with a part of speech determines the set of flectives which characterize the class and, implicitly, the set of attributes corresponding to the grammatical categories accepted by classical grammar.
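A flexionary class shared by several words can be sketched as follows. The class identifier "V_EZ", the two verbs, the lexicon layout and the functions generate and is_known_form are hypothetical names introduced for illustration; the actual encoding used in the thesaurus is not described here.

```python
from dataclasses import dataclass


@dataclass
class FlexionaryClass:
    """A part of speech plus the flectives shared by all its members."""
    part_of_speech: str
    flectives: dict  # (number, person) -> flective


# Hypothetical class: present indicative flectives shared by
# "a lucra" (to work) and "a dansa" (to dance).
CLASSES = {
    "V_EZ": FlexionaryClass("verb", {
        ("singular", 1): "ez",
        ("singular", 2): "ezi",
        ("singular", 3): "ează",
        ("plural", 1): "ăm",
        ("plural", 2): "ați",
        ("plural", 3): "ează",
    }),
}

# Hypothetical lexicon: each entry stores only its root and class id.
LEXICON = {
    "a lucra": ("lucr", "V_EZ"),
    "a dansa": ("dans", "V_EZ"),
}


def generate(lemma):
    """Generate the forms of a lemma from its root and flexionary class."""
    root, class_id = LEXICON[lemma]
    flectives = CLASSES[class_id].flectives
    return {attrs: root + flective for attrs, flective in flectives.items()}


def is_known_form(word):
    """Naive spelling check: is the word a generated form of any lemma?"""
    return any(word in generate(lemma).values() for lemma in LEXICON)
```

Because the flectives are stored once per class, adding a word to the lexicon only requires its root and the identifier of its class, which is one way to keep a dictionary minimal.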

