Research Interests

I’m a researcher at the Research Institute for Artificial Intelligence (RACAI) working on several projects, mainly in the fields of Statistical Machine Translation and Question Answering.
I have a Ph.D. in Computer Science (May 2009), completed under the supervision of Mr. Dan Tufis, with the thesis "Statistical Machine Translation for Romanian as Source Language". I also have an M.Sc. in Computational Linguistics.

Statistical Machine Translation

During my Ph.D. program I developed two factored SMT systems for the English-Romanian language pair using Moses. The systems use three factors (lemma, word form and morpho-syntactic description) to train the translation and language models. As of May 2009, the systems obtained better BLEU scores than Google Translate. See http://www.racai.ro/webservices/FactoredTranslation.aspx.
The systems were trained on the Romanian and English documents of the JRC-Acquis corpus. The corpus was sentence- and word-aligned using tools developed at RACAI.
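
As a rough illustration of what a factored representation looks like, the sketch below writes each token as its factors joined by "|", which is the separator Moses expects in factored training data. The factor order (word form, lemma, description), the tag values and the names Token and to_factored are assumptions made for this example, not details of the actual RACAI systems.

```python
from typing import Iterable, NamedTuple


class Token(NamedTuple):
    """One annotated token; the field order here is an illustrative choice."""
    word: str
    lemma: str
    msd: str  # morpho-syntactic description tag (example values only)


def to_factored(tokens: Iterable[Token]) -> str:
    """Join the factors of each token with '|' and the tokens with spaces."""
    parts = []
    for tok in tokens:
        for factor in tok:
            # '|' and whitespace would break the factored format, so reject them.
            if "|" in factor or any(ch.isspace() for ch in factor) or not factor:
                raise ValueError(f"invalid factor: {factor!r}")
        parts.append("|".join(tok))
    return " ".join(parts)


sentence = [
    Token("houses", "house", "Ncnp"),
    Token("are", "be", "Vmip3p"),
    Token("old", "old", "Afp"),
]
print(to_factored(sentence))
```

Each line of a factored corpus carries all three factors per token in this way, so the translation and language models can be trained over any of them.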

Question Answering

I am a member of the RACAI team that participated, with very good results, in the last three CLEF QA tasks. The RACAI QA system placed first in the Romanian ResPubliQA track of the CLEF 2009 competition. In this project I am responsible for the indexing part of the QA engine, terminology identification and paragraph classification. The prototype of the 2010 system is available at http://www.racai.ro/SearchNlp.

Part-of-Speech Tagging

I have developed text processing tools (tokenization, sentence splitting, morpho-syntactic tagging, lemmatization, diacritics recovery, etc.) for English and Romanian, with tentative support for French and German, mainly using statistical classifiers (MaxEnt and SVM). See http://www.racai.ro/webservices/TextProcessing.aspx for a client for the morpho-syntactic annotation web service and http://www.racai.ro/diac for a Microsoft Word plugin for diacritics recovery.

Lexical Ontologies

Since 2005 I have contributed to the development of the Romanian Wordnet. The Romanian Wordnet now has more than 56,000 synsets. Statistics and a browser for the Romanian Wordnet are available at http://www.racai.ro/wnbrowser