Multilingual activities should
always be based on cooperation, in particular cooperation
with native speakers from other research institutes and from other
countries.
Moreover, the history of past
and ongoing MT projects shows that data sharing is a very efficient
means of cooperation and of fruitful competition. A good example
is data collection in speech recognition, where corpora exist
for the following domains:
- ATIS,
- Wall Street Journal,
- Conference Booking,
- Appointment Scheduling.
In machine interpreting, meanwhile,
the "appointment scheduling" scenario is used by research
groups in Germany (Verbmobil), Korea, Spain, the USA, Italy, and Japan.
Two of the most important activities in the long run therefore
are:
Another way to cooperate is to rely on common external evaluators
drawn from other groups.
10. Organisational requirements
What we learned from successful projects can be summarized as
follows:
- Define clear technical aims, both for the funding institution and
internally: aims whose achievement you can demonstrate afterwards,
- Define clear deliverables and deadlines
  - when applying, and
  - when employing researchers,
- Agree on intermediate presentations and publications
with the project team,
- Define a test set (and a training
set, if stochastic processes are involved) to assess your progress,
and run regular benchmarks every six months,
- Aim for an early implementation
and keep it operational throughout the run time of the project,
- Start, or maintain, open and friendly
competition with other international groups,
- Keep in touch with other groups
in your field and send them reports and concrete results regularly.
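
The recommendation to define a test set (and a training set) and to
benchmark regularly can be illustrated with a minimal sketch. It assumes
that every corpus item (for example a recorded dialogue) has a stable
identifier; the function names, the identifier format, and the 10 percent
test fraction are hypothetical choices made for illustration and are not
taken from any of the projects mentioned above.

```python
import hashlib


def assign_split(item_id: str, test_fraction: float = 0.1) -> str:
    """Assign an item to 'test' or 'train' using only its identifier.

    Hypothetical helper: hashing the ID means the assignment never
    changes, even when new data is added to the corpus later.
    """
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


def split_corpus(item_ids, test_fraction: float = 0.1):
    """Return (train_ids, test_ids) for an iterable of item identifiers."""
    train, test = [], []
    for item_id in item_ids:
        if assign_split(item_id, test_fraction) == "test":
            test.append(item_id)
        else:
            train.append(item_id)
    return train, test


if __name__ == "__main__":
    # Illustrative identifiers only; a real corpus would supply its own.
    ids = [f"dialogue-{i:04d}" for i in range(1000)]
    train_ids, test_ids = split_corpus(ids)
    print(len(train_ids), "training items,", len(test_ids), "test items")
```

Because the assignment depends only on the identifier, an item that is in
the test set at one benchmark remains there at the next, so results from
successive six-monthly benchmarks stay comparable as the corpus grows.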