Machine translation (MT) has a long history of ambitious goals and unfulfilled promises. Early work in automatic, or “mechanical” translation, as it was known at the time, goes back at least to the 1940s. Its progress has, in many ways, followed and been fueled by advances in computer science and artificial intelligence, despite a few stumbling blocks like the ALPAC report in the United States (Hutchins, 2003).
Availability of greater computing power has made access to and usage of MT more straightforward. Machine translation has also gained wider public exposure through several dedicated services, typically offered by search engines. Most internet users will be familiar with at least one of Babel Fish, Google Language Tools, or Windows Live Translator. Most of these services used to be powered by the rule-based system developed by Systran. However, some of them (e.g., Google and Microsoft) now use statistical approaches, at least in part.
In this introduction, translation is defined as the task of transforming an existing text written in a source language into an equivalent text in a different language, the target language. Traditional MT (which, in the context of this primer, we take to mean “prestatistical”) relied on various levels of linguistic analysis on the source side and language generation on the target side.
The first statistical approach to MT was pioneered by a group of researchers from IBM in the late 1980s (Brown et al., 1990). This may in fact be seen as part of a general move in computational linguistics: Within about a decade, statistical approaches became overwhelmingly dominant in the field, as shown, for example, in the proceedings of the annual conference of the Association for Computational Linguistics (ACL).
The general setting of statistical machine translation is to learn how to translate from a large corpus of pairs of equivalent source and target sentences. This is typically a machine learning framework: we have an input (the source sentence), an output (the target sentence), and a model trying to produce the correct output for each given input.
There are a number of key issues, however, some of them specific to the MT application. One crucial issue is the evaluation of translation quality. Machine learning techniques typically rely on some kind of cost optimization in order to learn relationships between the input and output data. However, automatically evaluating the quality of a translation, or the cost associated with a given MT output, is a very hard problem. It may be subsumed in the broader issue of language understanding, and will therefore, in all likelihood, remain unresolved for some time.
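To make the difficulty concrete, here is a minimal sketch of one crude automatic measure: clipped unigram precision, the fraction of words in an MT output that also occur in a human reference translation. The choice of metric, the function name, and the example sentences are all assumptions made for illustration; the primer itself does not prescribe any particular measure.

```python
from collections import Counter


def clipped_unigram_precision(hypothesis, reference):
    """Fraction of hypothesis tokens that also occur in the reference.

    Each reference token can be matched at most as many times as it
    occurs in the reference ("clipping"), so repeating a common word
    does not inflate the score.
    """
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    hyp_counts = Counter(hyp_tokens)
    ref_counts = Counter(reference.lower().split())
    matched = sum(min(count, ref_counts[token])
                  for token, count in hyp_counts.items())
    return matched / len(hyp_tokens)


# Hypothetical example sentences, invented for illustration only.
reference = "the cat is on the mat"
fluent = "the cat sat on the mat"
scrambled = "mat the on sat cat the"

print(clipped_unigram_precision(fluent, reference))
print(clipped_unigram_precision(scrambled, reference))
```

Both hypotheses share five of their six words with the reference, so they receive the same score even though the second is not a usable sentence. Word overlap alone therefore says little about grammaticality or meaning, which is one way to see why an automatic cost for translation quality is hard to define.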
The early approach to SMT advocated by the IBM group relies on the source-channel approach. This is essentially a framework for combining two models: a word-based translation model and a language model. The translation model ensures that the system produces target hypotheses that correspond to the source sentence, while the language model ensures that the output is as grammatical and fluent as possible. Some progress was made with word-based translation models. However, a significant breakthrough was obtained by switching to log-linear models and phrase-based translation. Although the early SMT models essentially ignored linguistic aspects, a number of efforts have attempted to reintroduce linguistic considerations into either the translation or the language models.
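The source-channel combination described above can be sketched in a few lines. In this framework, the chosen translation is the target sentence e that maximizes P(f | e) × P(e) for a source sentence f, where P(f | e) comes from the translation model and P(e) from the language model. The sketch below assumes a fixed list of candidate targets and uses made-up probability tables; all names and numbers are hypothetical, and a real system would estimate both models from a corpus and search a far larger hypothesis space.

```python
import math

# Hypothetical toy values, invented purely for illustration.
# TRANSLATION_MODEL[(source, target)] stands in for P(source | target).
TRANSLATION_MODEL = {
    ("la maison bleue", "the blue house"): 0.4,
    ("la maison bleue", "the house blue"): 0.4,
    ("la maison bleue", "blue the house"): 0.2,
}

# LANGUAGE_MODEL[target] stands in for P(target), i.e. target fluency.
LANGUAGE_MODEL = {
    "the blue house": 0.05,
    "the house blue": 0.001,
    "blue the house": 0.0005,
}


def source_channel_score(source, target):
    """Return log P(source | target) + log P(target)."""
    adequacy = math.log(TRANSLATION_MODEL[(source, target)])
    fluency = math.log(LANGUAGE_MODEL[target])
    return adequacy + fluency


def decode(source, candidates):
    """Pick the candidate target with the highest combined score."""
    return max(candidates,
               key=lambda target: source_channel_score(source, target))


candidates = ["the blue house", "the house blue", "blue the house"]
best = decode("la maison bleue", candidates)
print(best)  # prints "the blue house"
```

In this toy setup the translation model cannot separate the first two candidates, since a word-based model of this kind is largely insensitive to target word order. The language model breaks the tie in favor of the fluent one, which is the division of labor the source-channel approach relies on.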
