Walter von Hahn * Machine Translation
6. Challenges for research / Future aims
- linguistics -
One of the most urgent needs of complex natural language systems
is the integration of formalisms, i.e. the integration of syntactic
(HPSG, ...), semantic (STUFF, ...) and discourse (DRS, ...) formalisms.
Another challenge is the integration
of rule-based processing and stochastic methods (stochastically driven
chart parsing, see Weber). Many stochastic methods used
in information systems and document processing are unrelated
to linguistic approaches. These methods must be evaluated as components
of linguistic processing.
There is not enough research on
the needs and demands for the different types of machine translation
(see Carbonell). Similarly, there is not enough formal and operational
research on human translation / interpreting.
Standards and benchmarks must be defined to overcome the difficulty
of measuring quality in machine translation.
All translators, especially technical
translators, stress that more than linguistic knowledge
is necessary for an adequate translation (Schmitt 1992). The role
of knowledge representation and knowledge engineering must be
redefined for MT and MAT.
Example: Japanese to English [4]
As an example, let us inspect some
features of Japanese that cause difficulties in translation
from or into other languages, e.g. English (see Uszkoreit 1995):
- Japanese is a head-final language.
Translating Japanese sentences therefore always means moving constituents
from the left of the verb to its right, and vice versa when translating
from English.
- Japanese often leaves pronouns unexpressed (zero pronouns).
In translation from Japanese, the necessary pronouns must be "invented". This
often means determining the referent artificially, although it is sometimes
left ambiguous in the source text. In translation from English, pronouns
must often be omitted and the referent left to be inferred implicitly.
- Japanese has no particles
in time and spatial expressions, which requires similar techniques
of reconstruction.
- Japanese has phrase-final
particles. They must be replaced by a completely different linguistic
class: punctuation.
- Japanese has very flexible
nominalizing techniques. English syntax can reproduce them
only in some cases. Translations into Japanese can use this feature
more often.
- Japanese does not distinguish
between complement and adjunct. English, on the other hand, needs this
distinction for lexical choice.
- The most prominent feature
of Japanese is the use of honorifics. Lexical choice is determined,
e.g., by social rank, sex and age of the conversing partners.
Missing information about the listeners/readers of written texts
may make lexical choice rather complicated. In translations into
English several words and grammatical classes collapse into one.
It is often difficult to decide whether honorifics have to be
translated by other means (in diplomatic talks, political negotiations,
etc.).
- Japanese speakers like to
insert metacommunicative comments, which is unusual (to this extent)
in English. To read fluently, translations into Japanese may therefore
need such meta-level utterances inserted.
- Japanese allows free topicalization.
This can cause problems in languages with a stricter grammatical
word order.
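The first two features in the list (head-final order and zero pronouns) can be illustrated with a minimal sketch. It is a toy rule, not a description of any actual MT system: the role labels, the function name `reorder_head_final_to_svo` and the placeholder `<PRO>` are assumptions made for this illustration, and the constituents are given as English glosses. A real system would obtain the roles from a parser and resolve the placeholder through discourse analysis.

```python
from typing import List, Tuple

# (role, text) pairs; the role labels are hypothetical and would normally
# come from a syntactic analysis of the Japanese source sentence.
Constituent = Tuple[str, str]


def reorder_head_final_to_svo(constituents: List[Constituent]) -> List[Constituent]:
    """Toy rule: move a clause-final verb between the subject and the rest.

    If no subject is expressed (zero pronoun), a placeholder subject is
    inserted to mark that a pronoun still has to be "invented".
    """
    items = list(constituents)
    if not items or items[-1][0] != "verb":
        return items  # not verb-final: leave the order unchanged
    verb = items.pop()
    subjects = [c for c in items if c[0] == "subject"]
    others = [c for c in items if c[0] != "subject"]
    if not subjects:
        # Assumption: "<PRO>" marks an unresolved referent.
        subjects = [("subject", "<PRO>")]
    return subjects + [verb] + others


if __name__ == "__main__":
    clause = [("subject", "Taro"), ("object", "a book"), ("verb", "read")]
    print(" ".join(text for _, text in reorder_head_final_to_svo(clause)))
```

The placeholder makes the difficulty visible rather than solving it: choosing between "he", "she", "I" or "they" requires exactly the referent determination described in the list.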
- computational concepts -
Research in this field centers around the notion of translation
strategies:
- What is the search space for translation equivalents:
linguistic correctness, communicative adequacy, a rough
content paraphrase, or a combination of these?
- Which strategies are applied in cases of translation difficulties?
Compare the following internal or external situations:
- one component (e.g. syntax)
of the system cannot achieve a consistent result,
- the input text is ambiguous,
- the input text is structurally
incorrect, or
- the input text is factually
wrong.
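One way to make the question concrete is to treat each of the four situations as a detectable condition and to map it to a fallback. The sketch below only shows that structure. The four conditions come from the list above; the fallback names and the function `choose_strategies` are hypothetical and do not claim to describe which strategies any existing system applies.

```python
from enum import Enum, auto
from typing import Iterable, List


class Difficulty(Enum):
    """The four situations named in the list above."""
    COMPONENT_FAILURE = auto()      # e.g. syntax finds no consistent result
    AMBIGUOUS_INPUT = auto()
    ILL_FORMED_INPUT = auto()       # structurally incorrect input
    FACTUALLY_WRONG_INPUT = auto()


# Hypothetical policy: the fallback names are illustrative assumptions only.
FALLBACK = {
    Difficulty.COMPONENT_FAILURE: "rough_content_paraphrase",
    Difficulty.AMBIGUOUS_INPUT: "preserve_ambiguity_or_ask_user",
    Difficulty.ILL_FORMED_INPUT: "robust_partial_analysis",
    Difficulty.FACTUALLY_WRONG_INPUT: "translate_literally_and_flag",
}


def choose_strategies(difficulties: Iterable[Difficulty]) -> List[str]:
    """Return one fallback per detected difficulty, in a fixed order."""
    found = sorted(set(difficulties), key=lambda d: d.value)
    if not found:
        return ["full_translation"]  # no difficulty detected
    return [FALLBACK[d] for d in found]


if __name__ == "__main__":
    print(choose_strategies([Difficulty.AMBIGUOUS_INPUT,
                             Difficulty.COMPONENT_FAILURE]))
```

Keeping the policy in a table separates the research question (which strategy fits which situation) from the control flow, so alternative policies can be compared without changing the code.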
[4] Material from Hans Uszkoreit