COST Action 21167
COST Action 23147
SAROJ
USPDATRO
ENRICH4ALL
ELG
ELE
COST Action 19102
CURLICAT
Project summary
The aim of the project is to compile curated datasets in the seven languages of the consortium (Bulgarian, Croatian, Hungarian, Polish, Romanian, Slovak and Slovenian) in domains of relevance to European Digital Service Infrastructures (DSIs), with a view to enhancing the eTranslation automated translation system. The data will come primarily from the national corpora of these languages and will cover domains relevant for CEF DSIs, such as eHealth, Europeana or eGovernment. The corpus will contain at least 14 million sentences (an estimated 185 million words) from domains including culture, education, health and science.
The data will be technically and legally cleaned. Terms from the IATE database will be identified and annotated so that the language models built with the help of these corpora cover not only single words but also multi-word expressions. Since the quality of the language model is an important aspect of today’s neural machine translation technology, the seven envisaged corpora, although monolingual datasets in themselves, will have an impact on the quality of eTranslation through the enhanced language models built with them.
The project Translation Automation Services (TAS) for EU Council (January 2019 – June 2020) focuses on the multilingual challenges of the EU Council Presidencies, particularly addressing the needs of the Romanian (2019), Finnish (2019) and Croatian (2020) presidencies. The Action fundamentally aims to make the European Commission’s eTranslation platform available to users in EU Member States by integrating the platform into an EU Presidency TAS, by enabling a variety of audiences to use the eTranslation platform in their everyday work, and by extending the eTranslation platform with a set of custom MT systems tailored for the EU Presidency domain.
It aims to make language technology solutions (the eTranslation online service) available to a wider range of individuals in order to meet their multilingual needs, thus helping to establish the infrastructure for the multilingual Digital Single Market. The Action also leverages the best of language technology to extend eTranslation to meet the specific needs of EU Member States, by developing high-quality custom MT systems for the EU Presidency domain and the DSI domains at the focal points of the EU Council Presidency.
The project will achieve this outcome by providing a translation service that lowers language barriers and enables international journalists, EU delegates, and policymakers to access precise and correct translated information about the hosting countries of the EU Council Presidency in 2019-2020.
In addition, the EU Presidency TAS will include MT engines specially tailored for the DSIs at the focal point of the EU Council Presidencies in 2019-2020, including Cybersecurity (in Romania), eJustice (in Croatia), and a DSI chosen by Finland.
COST Action CA 18209
MARCELL
DRuKoLA
COST Action IS1312
IC1207 COST Action, PARSEME
COST Action IS1310
MUMIA COST Action IC1002
METANET4U
- Enhancing the European Linguistic Infrastructure (ICT PSP Objective identifier: 6.1 Open linguistic infrastructure) (Grant Agreement No 270893)
- Coordinator: University of Lisbon
- Duration: 2010 – 2013
ACCURAT
- Analysis and evaluation of Comparable Corpora for Under Resourced Areas of machine Translation (FP7/2007-2013, Grant Agreement no. 248347)
- Coordinator: Zemanta d.o.o., Ljubljana, Slovenia
- Duration: 2010 – 2012
MULTILINGUAL Web
- Standards and best practices for the Multilingual Web (ICT PSP Grant Agreement No. 250500, part of the Competitiveness and Innovation Framework Programme)
- Coordinator: ERCIM/W3C
- Duration: 2010 – 2013
CLARIN
- Common Language Resources and Technology Infrastructure (INFRA-2007-2.2-01, 212230)
- Coordinator: Utrecht University, The Netherlands
- Duration: 2008 – 2010
FLaReNet
- Fostering Language Resources Network (TN, eContentplus 617001)
- Coordinator: University of Pisa, Italy
- Duration: 2008 – 2010
COST A31
- Stability and Adaptation of Classification Systems in a Cross-Cultural Perspective
- Coordinator: CRLAO, Paris, France
- Duration: 2007 – 2010
eSDI-Net
- European Network on Geographic Information Enrichment and Reuse
- Coordinator: Technical University of Darmstadt, Germany
- Duration: 2007 – 2010
SEE-ERA.NET
- Building Language Resources and Translation Models for Machine Translation focused on South Slavic and Balkan Languages (ICT 10503 RP)
- Duration: 2007 – 2008
WISE
- An Electronic Marketplace to Support Pairs of Less Widely Studied European Languages (BSEC 009 / 05.2007)
- Duration: 2007 – 2008
Collocations en contexte
- Extraction and contrastive analysis (AUF 2091RR703)
- Duration: 2006 – 2008
TOWNTOLOGY COST Action C21
- Urban Ontologies for an improved communication in UCE projects
- Coordinator: LEMA – Université de Liège
- Duration: 2005 – 2009
RomNetEra
- ROManian Inventory and NETworking for Integration in ERA (SSA FP6-510475)
- Duration: 2004 – 2007
ProLearn
- PROfessional LEARNing (NoE FP6-507310)
- Duration: 2004 – 2007
KNOWLEDGE-WEB
- (NoE FP6-507482)
- Duration: 2004 – 2007
ARS-ROCOCO
- Acquiring Reading Skills in Romanian by Comparable Corpora (bilateral project with the British Academy)
- Duration: 2004 – 2005
DICO-EAST
- Dictionary consultation for research and education (bilateral project with the University of Geneva)
- Duration: 2003 – 2004
FF-POIROT
- Financial Fraud Prevention-Oriented Information Resources using Ontology Technology (STREP IST-2001-38248)
- Duration: 2002 – 2005
BalkaNet
- Design and Development of a Multilingual Balkan WordNet (FP5, IST-2000-29388)
- Duration: 2001 – 2004
KATEDRAST
- Joint project of the Research Institute for Artificial Intelligence, George Mason University and the Physics Laboratory of the Noetic Institute (Orinda, California), exploring the principles and foundations of science
- Duration: 2001 – 2005
ELSNET
- European Network of Excellence in Human Language Technologies (FP5, IST-1999-12127)
- Duration: 2000 – 2002
LARFLAST
- Learning Foreign Language Scientific Terminology (INCO-Copernicus 977074)
- Duration: 1999 – 2002
ELAN and TELRI-II
- ELAN (European Language Activity Network, 1998-1999) and TELRI-II (Trans-European Language Resources Infrastructure, 1999-2001): complementary COPERNICUS projects with 28 partners representing all the national languages of Europe (plus an associated partner from China)
- Duration: 1998 – 2001
CADBFR
- Construction Automatique de Dictionnaire Bilingue Franco-Roumain (AUPELF-UREF)
- Duration: 1999 – 2000
HEADGEN
- Head-Driven Generator for Unification Grammars (NATO research project, with LIMSI Paris)
- Duration: 1998
CONCEDE
- Consortium for Central European Dictionary Encoding (INCO-Copernicus PL961142)
- Duration: 1997 – 1999
AUPELF-UREF
- Financed by the French Government; experiments with the automatic extraction of reversible bilingual (Romanian-French) dictionaries from a parallel corpus
- Duration: 1996 – 1998
ILPNET2
- Network of Excellence in Inductive Logic Programming (pan-European scientific network)
- Duration: 1998 – 2002
Using Multistrategy Learning as a Framework for Building Knowledge-Based Systems
- Bilateral project with George Mason University
- Duration: 1998 – 1999
AMASE
- Agent-based Mobile Access to Information Services (ESPRIT/ACTS project with partners from Sony, Motorola, Siemens, Space-Hellas, Tecsi, Aachen University of Technology, University College London and the Research Institute for Artificial Intelligence)
- Duration: 1998 – 1999
ROMANIAN INTERNET ACADEMY
- European Network of Excellence in Human Language Technologies (bilateral project with the International Centre for Advanced Studies in Information Technology, Washington, USA)
- Duration: 1995 – 1999
ELSNET GO EAST
- European Network of Excellence in Human Language Technologies (COPERNICUS Concerted Action)
- Duration: 1995 – 1998
LIMSI/CNRS
- Financed by NATO; fundamental research on generating cognitive models of natural language
- Duration: 1996 – 1998
TELRI-I
- Trans-European Language Resources Infrastructure (EC Concerted Action)
- Duration: 1995 – 1997
MULTEXT-EAST
- Multilingual Text Tools and Corpora for Central and Eastern European Languages (EC COPERNICUS 106)
- Duration: 1995 – 1997
PAIL
- General AI Educational Environment (bilateral project with IDSIA Lugano, Switzerland)
- Duration: 1993 – 1995
EGLU
- Unification-based Platform for Parsing, Transfer and Generation of Natural Language (bilateral project with the ISSCO Institute, Geneva, Switzerland)
- Duration: 1993 – 1994
KRIL
- Knowledge-based Interlingual Generator (NSF FLUENT-2 project with George Mason University-Fairfax and Massachusetts Institute of Technology-Boston, USA)
- Duration: 1993 – 1994
ELSNET
- European Network of Excellence in Human Language Technologies (CEC ESPRIT)
- Duration: 1991 – 1995