Project summary
The goal of the ADAMo project is to create a text classifier that can identify texts produced by artificial intelligence (AI). Although language-specific features will be employed, the resulting approach will be language-independent: where similar resources exist for other languages, new neural models can be trained to detect AI-generated texts in those languages. We will use the representative corpus of the contemporary Romanian language (CoRoLa), containing more than 1 billion words (in written and spoken texts), as the original data for training the classifier. To account for the language varieties used in Romania and Moldova, CoRoLa will be enriched with at least 15M tokens of high-quality texts from Moldova, with cleared intellectual property rights, following the design principles of the original corpus for data collection, metadata construction, and levels of annotation. The whole corpus will also be syntactically parsed, so as to capture similarities and differences at several linguistic levels and thus to serve as a valuable resource for the study of the two varieties.
Partners
I.C.I.A.
The Research Institute for Artificial Intelligence “Mihai Drăgănescu” (ICIA) of the Romanian Academy
U.T.M.
Technical University of Moldova
Results
Automatic Detection of AI-Generated Texts from Moldova and Romania (ADAMo), A Project Presentation
in the 20th International Conference on Linguistic Resources and Tools for Natural Language Processing, Bucharest, 8-10 Oct, 2025.
conference presentation: Verginica Barbu Mititelu, Victoria Bobicev, Victoria Alexei, Rodica Braniște, Olesea Caftanatov, Maria Mitrofan, Radu Ion, Elena Irimia, Daniela Istrati, Ludmila Malahov, Sergiu Nisioi and Alexandr Parahonco
in E. Irimia et al. (eds.), Proceedings of the 20th International Conference Linguistic Resources and Tools for Natural Language Processing, Publishing House of the “Alexandru Ioan Cuza” University of Iași, 2025, pp. 249-264.
published in the proceedings of a scientific conference: Verginica Barbu Mititelu, Victoria Bobicev, Victoria Alexei, Rodica Braniște, Olesea Caftanatov, Maria Mitrofan, Radu Ion, Elena Irimia, Daniela Istrati, Ludmila Malahov, Sergiu Nisioi, Alexandr Parahonco and Vasile Păiș
Cognitive impact:
The project has a major cognitive impact in the domains of Artificial Intelligence and Romanian linguistics, as it brings together in a single corpus texts from the two varieties of Romanian, with two major goals: comparing them linguistically by automatic means (supported by manual analysis) and assessing how the presence of two language varieties affects the detection of AI-generated texts. The results will contribute original empirical data to the existing literature of both domains, thus offering a sound foundation for further developments. We have already established synergy with the project “Defending against deep fake news with large language and image models”, in which the data collected in ADAMo will be used for deep fake news detection. Furthermore, the project results (the corpus) will serve as a high-quality source of information for the description of the Romanian language in both its varieties.
The cognitive impact is also supported through the development of the research competencies of the involved team, particularly in areas such as corpus collection, text preprocessing and processing, metadata-based description, and the development of classifiers for detecting linguistic varieties and AI-generated texts. The knowledge generated can be further transferred to the academic and educational environment through the integration of the results into training activities and scientific dissemination.
Socio-economic impact:
From a socio-economic perspective, the project addresses a major and highly topical challenge, namely the ability to distinguish between original texts and those generated by Artificial Intelligence, as well as between true information and false content. By collecting a corpus from the Republic of Moldova and conducting experiments involving prompt design and text generation, we will obtain linguistic material for developing classifiers capable of automatically distinguishing between original and AI-generated texts, as well as between the two varieties of the language. Through the linguistic resources and tools that will be made available to the community, the project will contribute to the development of tools for the automatic detection of AI-generated texts and, ultimately, of potentially false news.
Team
I.C.I.A. Team Members

Dr. Verginica BARBU MITITELU
Principal Investigator, Scientific Researcher II

Dr. Vasile Florian PĂIȘ
Scientific Researcher II

Dr. Elena IRIMIA
Scientific Researcher III

Dr. Radu ION
Scientific Researcher II

Dr. Tiberiu BOROȘ
Scientific Researcher II

Dr. Maria CARP
Scientific Researcher II

Sergiu Nisioi
Associate Professor

Eric CUREA
Scientific Researcher III
U.T.M. Team Members

Dr. Victoria Bobicev
Partner Leader, University Lecturer (tenured), Scientific Researcher (cumulative)

Olesea Caftanatov
Laboratory chief

Dr. Daniela Istrati
University Lecturer (tenured)

PhD student Rodica Braniște
University Lecturer

Alexandr Parahonco
-

Victoria Alexei
Technical University of Moldova

Ludmila Malahov
-