Looking more closely at the functional
types of translation, Carbonell distinguishes two categories:
"dissemination" and "assimilation". The quality requirements are
high only for dissemination, because:
- in most cases the reader is unknown to the writer, and
- the text contains exactly the information to be shared with the reader,
whereas in assimilation
the amount of information to be extracted depends on the
specific interest of the reader, which may range from "to
know what it is about" to "what exactly does the author
mean?".
Technological Types and Dimensions
Another view of today's translation technology is by overall
functionality and processing type, which can be categorized
along three dimensions:
- Functionality (What does the system do for its user?)
  - Machine aided translation
  - Machine translation
  - Machine interpreting
- Processing depth (Which steps between source and target?)
  - Shallow matching
  - Transfer
  - Interlingua
  - Variable depth
- Processing paradigm (Which basic principles?)
  - knowledge-based
  - rule-based
  - stochastic
  - connectionist
  - example-based
  - semantic parsing
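The three dimensions above can be read as independent axes along which a given system is placed. The following is a minimal sketch of that idea in Python. The class names, the `SystemProfile` record, and the example system "ExampleMT" are hypothetical and introduced only for illustration; allowing several paradigms per system is likewise an assumption, not something the taxonomy states.

```python
from dataclasses import dataclass
from enum import Enum


class Functionality(Enum):
    MACHINE_AIDED_TRANSLATION = "machine aided translation"
    MACHINE_TRANSLATION = "machine translation"
    MACHINE_INTERPRETING = "machine interpreting"


class ProcessingDepth(Enum):
    SHALLOW_MATCHING = "shallow matching"
    TRANSFER = "transfer"
    INTERLINGUA = "interlingua"
    VARIABLE_DEPTH = "variable depth"


class Paradigm(Enum):
    KNOWLEDGE_BASED = "knowledge-based"
    RULE_BASED = "rule-based"
    STOCHASTIC = "stochastic"
    CONNECTIONIST = "connectionist"
    EXAMPLE_BASED = "example-based"
    SEMANTIC_PARSING = "semantic parsing"


@dataclass(frozen=True)
class SystemProfile:
    """Position of one translation system along the three dimensions."""
    name: str
    functionality: Functionality
    depth: ProcessingDepth
    # Assumption: a system may combine several paradigms.
    paradigms: frozenset


def describe(profile: SystemProfile) -> str:
    """Render a profile as a one-line summary."""
    paradigms = ", ".join(sorted(p.value for p in profile.paradigms))
    return (f"{profile.name}: {profile.functionality.value}, "
            f"{profile.depth.value}, paradigms: {paradigms}")


# Hypothetical system, invented purely to show how the axes combine.
example = SystemProfile(
    name="ExampleMT",
    functionality=Functionality.MACHINE_TRANSLATION,
    depth=ProcessingDepth.TRANSFER,
    paradigms=frozenset({Paradigm.RULE_BASED, Paradigm.STOCHASTIC}),
)
summary = describe(example)
```

Keeping each dimension as its own enumeration mirrors the point that functionality, depth, and paradigm are chosen independently of one another.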
4. State of the Art - Research [2]
The following statements characterize
the (very heterogeneous) state of research in MT. However, they
emphasize the changes in interest and paradigms in the
MT community:
- rule-based paradigm is less abstract than that of the "indirect" models,
- syntactic analysis is restricted to surface constituency/dependency
  relations (roles and cases); single mono-stratal representation
  (unification/constraint-based analysis),
- semantic analysis is limited mainly to identification of sentence and
  clause roles (agents, patients, etc.),
- lexical information derived primarily from standard dictionary sources
  ("crude" syntactic categories and semantic features),
- lexical/structural transfer rules (constraint-based) operating on
  shallow representations,
- example translations (aligned bilingual corpora),
- statistical data about lexical collocations and vocabulary
  frequencies (monolingual),
- probabilities of lexical transfer (bilingual),
- domain-specific knowledge bases (both linguistic and subject knowledge),
- feedback ("learning") for grammar/lexicon improvement (connectionist?),
- greater emphasis on discourse and text-stylistic aspects,
- integration into documentation processing and publishing systems.
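To make the item "probabilities of lexical transfer (bilingual)" concrete, the sketch below estimates such probabilities by simple relative frequency over word-aligned pairs. This is only one possible reading of the item, not a method taken from the source: the function name `transfer_probabilities` is hypothetical, the toy German-English pairs are invented for illustration, and the word alignment itself is assumed to be given.

```python
from collections import Counter, defaultdict


def transfer_probabilities(aligned_pairs):
    """Estimate P(target word | source word) by relative frequency.

    aligned_pairs: iterable of (source_word, target_word) tuples,
    assumed to come from an already word-aligned bilingual corpus.
    """
    counts = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1

    probabilities = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        probabilities[source] = {
            target: n / total for target, n in target_counts.items()
        }
    return probabilities


# Hypothetical toy alignments (German -> English), invented for illustration.
pairs = [
    ("Bank", "bank"),
    ("Bank", "bank"),
    ("Bank", "bench"),
    ("Haus", "house"),
]
probs = transfer_probabilities(pairs)
```

With these toy pairs, "Bank" is rendered as "bank" in two of its three occurrences and as "bench" in one, so the estimate simply reflects those proportions. A realistic system would need far more data and some form of smoothing.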
1 see Carbonell's Saarbrücken Material
2 this table is a quotation from Hutchins' lecture at
Tzigov Summer School