Table 1 presents some French-English translation examples taken from out-of-domain sentences in the test set of the NAACL 2006 WMT shared task. These examples show the effect of semisupervised learning for model adaptation to a novel domain.
Table 2 presents some Chinese-English translation examples of the baseline and the semisupervised system using a phrase table learned on 5000 additional Chinese sentences. All examples are taken from the NIST portion of the 2006 test corpus. Except for the last example, which is taken from newswire, they come from newsgroup posts. The examples show that the semisupervised system outperforms the baseline system in terms of both adequacy and fluency.
Semisupervised learning has been previously applied to improve word alignments. In Callison-Burch et al. (2004), a generative model for word alignment is trained using unsupervised learning on parallel text. In addition, another model is trained on a small amount of hand-annotated word-alignment data. A mixture model provides a probability for word alignment. Experiments showed that putting a large weight on the model trained on labeled data performs best.
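The mixture idea can be illustrated with a small sketch, assuming a simple linear interpolation of two alignment-link probabilities. The function name, the weight, and the toy probabilities below are hypothetical and are not taken from Callison-Burch et al. (2004).

```python
def mixture_prob(p_labeled: float, p_unlabeled: float, weight_labeled: float) -> float:
    """Linearly interpolate two model probabilities for the same alignment link.

    Assumption: the mixture is a plain convex combination; the cited work
    may parameterize it differently.
    """
    if not 0.0 <= weight_labeled <= 1.0:
        raise ValueError("weight_labeled must lie in [0, 1]")
    return weight_labeled * p_labeled + (1.0 - weight_labeled) * p_unlabeled


# Toy values: a large weight on the model trained on hand-annotated data
# keeps the mixture close to that model's estimate.
link_prob = mixture_prob(p_labeled=0.8, p_unlabeled=0.2, weight_labeled=0.9)
```

With a weight close to 1, the model trained on unlabeled data acts mainly as a fallback for links the labeled model has little evidence for.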
Along similar lines, Fraser and Marcu (2006) combine a generative model of word alignment with a log-linear discriminative model trained on a small set of hand-aligned sentences. The word alignments are used to train a standard phrase-based SMT system, resulting in increased translation quality. In Callison-Burch (2002), co-training is applied to MT. This approach requires several source languages which are sentence-aligned with each other and all translate into the same target language. One language pair creates data for another language pair and can be naturally used in a Blum and Mitchell (1998)-style co-training algorithm. Experiments on the Europarl corpus show a decrease in WER. However, the selection algorithm applied there is actually supervised because it takes the reference translation into account.
Self-training for SMT was proposed in Ueffing et al. (2007a), where the test data was repeatedly translated and phrase pairs from the translated test set were used to improve overall translation quality. In the work presented here, the additional monolingual source data is drawn from the same domain as the test set. In particular, we filter the monolingual source-language sentences based on their similarity to the development set.
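One way to picture such a filter is the sketch below, which scores each monolingual sentence by the fraction of its n-grams that also occur in the development set. The similarity measure, the threshold, and all function names are assumptions made for illustration; the text above does not specify them.

```python
from typing import List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], max_n: int = 3) -> Set[NGram]:
    """All distinct n-grams of order 1..max_n in a token list."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def dev_ngram_pool(dev_sentences: List[str], max_n: int = 3) -> Set[NGram]:
    """Union of the n-grams of all development-set sentences."""
    pool: Set[NGram] = set()
    for sentence in dev_sentences:
        pool |= ngrams(sentence.split(), max_n)
    return pool


def coverage(sentence: str, pool: Set[NGram], max_n: int = 3) -> float:
    """Fraction of the sentence's n-grams that also occur in the pool."""
    grams = ngrams(sentence.split(), max_n)
    return len(grams & pool) / len(grams) if grams else 0.0


def filter_by_similarity(candidates: List[str], dev_sentences: List[str],
                         threshold: float = 0.5, max_n: int = 3) -> List[str]:
    """Keep candidates whose n-gram coverage reaches the (assumed) threshold."""
    pool = dev_ngram_pool(dev_sentences, max_n)
    return [s for s in candidates if coverage(s, pool, max_n) >= threshold]


# Toy usage with whitespace-tokenized sentences.
kept = filter_by_similarity(
    ["the two sides met", "stock prices fell sharply"],
    ["the talks between the two sides"],
)
```

In this toy case only the first candidate shares enough n-grams with the development sentence to be kept.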
Table 1. Translations from the out-of-domain sentences in the NAACL 2006 French-English test corpus. The semisupervised examples are taken from sampling-based sentence selection with normalized scores. Lowercased output, punctuation marks tokenized.
| Example | System | Translation |
|---|---|---|
| 1 | baseline | the so-called ‘ grandfather bulldozer become the most of the israelis and the final asset of western diplomacy and for the americans , surprisingly , for europeans too . |
| 1 | semisupervised | ‘bulldozer’ became the grandfather of most of the israelis and the final asset of western diplomacy and for the americans , surprisingly, for europeans too . |
| 1 | reference | the „bulldozer” had become the grandfather of most israelis and the last card of western diplomacy , for americans and , surprisingly, for europeans , too . |
| 2 | baseline | these are not all their exceptional periods which create bonaparte, which is probably better because leaders exceptional can provide the illusion that all problems have their solution , which is far from true . |
| 2 | semisupervised | these are not all periods which create their exceptional bonaparte, which is probably better because leaders exceptional can provide the illusion that all problems have their solution which is far from being true. |
| 2 | reference | not all exceptional periods create their bonapartes , and this is probably a good thing , for exceptional leaders may give the illusion that all problems have solutions , which is far from true . |
| 3 | baseline | in both cases , must be anchored in good faith moving from one to another . |
| 3 | semisupervised | in both cases , it must be established for faith moving from one to another . |
| 3 | reference | in both cases , it takes a lot of blind faith to go from one to the other . |
| 4 | baseline | given an initial period to experiment with growth and innovation on these fronts may prove strong paying subsequently. |
| 4 | semisupervised | enjoy an initial period of growth and innovation to experiment with on these fronts may prove heavily paying subsequently. |
| 4 | reference | using an initial period of growth to experiment and innovate on these fronts can pay high dividends later on. |
Table 2. Translation examples from the Chinese–English eval-06 corpus, NIST section. Lowercased output, punctuation marks tokenized.
| Example | System | Translation |
|---|---|---|
| 1 | baseline | you will continue to be arrested and beaten by villagers. |
| 1 | semisupervised | you continue to arrest, beat villagers, |
| 1 | reference | you have continued to arrest and beat villagers. |
| 2 | baseline | after all , family planning is a problem for chinese characteristics . |
| 2 | semisupervised | after all , family planning is a difficult problem with chinese characteristics. |
| 2 | reference | after all, family planning is a difficult topic with chinese characteristics. |
| 3 | baseline | i am very disappointed in recognition of the chinese people do not deserve to enjoy democracy! ! ! |
| 3 | semisupervised | i am very disappointed to admit that the chinese nation do not deserve democracy ! ! ! |
| 3 | reference | i am very disappointed to admit that the chinese people do not deserve democracy! |
| 4 | baseline | china has refused to talk to both sides to comment . |
| 4 | semisupervised | the chinese side refused to comment on both sides of the talks . |
| 4 | reference | china has refused to comment on the talks between the two sides . |
| 5 | baseline | reports said that there has been speculation that might trigger a computer in possession by the former metropolitan police chief steve vincent jazz yangguang survey of confidential information . |
| 5 | semisupervised | reports said that the theft triggered speculation that the computer may be in the possession of the metropolitan police chief stevenson jazz led investigation of confidential information . |
| 5 | reference | the report pointed out that the theft triggered speculation that the computers may contain confidential information of the probe led by former metropolitan police commissioner lord stevens . |
We presented a semisupervised learning algorithm for SMT which makes use of monolingual source-language data. The relevant parts of this data are identified, and the SMT system is then used to translate them. The reliable translations are automatically determined and used to retrain the SMT system and adapt it to a domain or style. It is not intuitively clear why an SMT system can learn from its own output and improve through semisupervised learning. There are two main reasons for this improvement:
First, the selection step provides important feedback for the system. The confidence estimation, for example, discards translations with low language model scores or posterior probabilities. The selection step discards bad machine translations and reinforces phrases of high quality. As a result, the probabilities of low-quality phrase pairs, such as noise in the table or overly confident singletons, decrease. The selection step is therefore important for the performance of semisupervised learning for SMT.
Second, our algorithm constitutes a way of adapting the SMT system to a new domain or style without requiring bilingual training or development data. Those phrases in the existing phrase tables which are relevant for translating the new data are reinforced. The probability distribution over the phrase pairs thus gets more focused on the (reliable) parts which are relevant for the test data.
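Both effects can be illustrated with a toy sketch, under stated assumptions: each machine translation carries a single confidence score, selection is a fixed threshold on that score, and the phrase pairs of selected translations are simply added to the existing counts before relative-frequency re-estimation. The names, the threshold, and the count-merging step are hypothetical; the actual system may combine the original and the new phrase tables differently.

```python
from collections import Counter
from typing import Dict, List, Tuple

PhrasePair = Tuple[str, str]  # (source phrase, target phrase)


def select_reliable(translations: List[Tuple[float, List[PhrasePair]]],
                    threshold: float) -> List[PhrasePair]:
    """Collect phrase pairs from translations whose confidence reaches the threshold."""
    selected: List[PhrasePair] = []
    for confidence, pairs in translations:
        if confidence >= threshold:
            selected.extend(pairs)
    return selected


def relative_frequencies(counts: Counter) -> Dict[PhrasePair, float]:
    """Estimate p(target | source) by relative frequency over phrase-pair counts."""
    source_totals: Counter = Counter()
    for (src, _tgt), count in counts.items():
        source_totals[src] += count
    return {(src, tgt): count / source_totals[src]
            for (src, tgt), count in counts.items()}


# Toy phrase-pair counts from the original bilingual training data.
base_counts = Counter({("maison", "house"): 3, ("maison", "home"): 3})

# Toy machine translations of in-domain source text: (confidence, phrase pairs used).
machine_translated = [
    (0.9, [("maison", "house"), ("maison", "house")]),
    (0.2, [("maison", "home")]),  # low confidence, so it is discarded
]

# Assumption: selected pairs are merged into the counts (illustration only).
adapted_counts = base_counts + Counter(select_reliable(machine_translated, threshold=0.5))

before = relative_frequencies(base_counts)
after = relative_frequencies(adapted_counts)
```

In this toy setting the probability of the reinforced pair rises from 0.5 to 0.625, while the competing pair drops to 0.375, which mirrors the focusing of the distribution described above.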
One key component of our approach is that translations must be proposed for the sentences in the unlabeled set (which comes from the same domain as the test set), and from those translations we want to select the ones that help improve performance in this domain. In future work, we plan to explore alternatives to the selection methods presented here. When translating a source sentence f, the difficulty in assessing the quality of a translation e′ into the target language comes from the fact that we do not have a reference translation e. However, if e′ is a good translation, then we should probably be able to reconstruct the input sentence f from it. We can therefore judge the quality of e′ based on its back-translation f′: the BLEU score of f′ can be computed because we do have access to f, and we already have the translation probability tables for this reverse translation direction. This approach is an attractive alternative for selecting good translations in our algorithm.
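The round-trip idea can be sketched as follows. The code is a minimal illustration under assumptions: `translate_back` stands in for a hypothetical target-to-source translation function, and the score is a simplified sentence-level BLEU with add-one smoothing rather than any particular standard implementation.

```python
import math
from collections import Counter
from typing import Callable, List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Counts of all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference: List[str], hypothesis: List[str], max_n: int = 4) -> float:
    """Simplified sentence-level BLEU with add-one smoothed n-gram precisions."""
    if not hypothesis:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp = ngram_counts(hypothesis, n)
        ref = ngram_counts(reference, n)
        matches = sum(min(count, ref[gram]) for gram, count in hyp.items())
        total = sum(hyp.values())
        log_precision += math.log((matches + 1) / (total + 1)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the reference.
    brevity = min(1.0, math.exp(1.0 - len(reference) / len(hypothesis)))
    return brevity * math.exp(log_precision)


def round_trip_score(source: str, translation: str,
                     translate_back: Callable[[str], str]) -> float:
    """Score translation e' by comparing its back-translation f' with the source f."""
    back_translation = translate_back(translation)
    return sentence_bleu(source.split(), back_translation.split())


# Toy usage: the lambdas are stand-ins for a real reverse translation system.
good = round_trip_score("la maison est grande", "the house is big",
                        lambda e: "la maison est grande")
poor = round_trip_score("la maison est grande", "the car",
                        lambda e: "la voiture")
```

Candidate translations could then be ranked or thresholded by this score in the same way as by any other confidence estimate.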
In addition to this, it would be interesting to study the proposed methods further, using more refined filter functions, e.g., methods applied in information retrieval.
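As one example of an information-retrieval-style filter function, the sketch below ranks monolingual sentences by TF-IDF cosine similarity to the development set. This is an assumed instantiation for illustration only: the weighting scheme, the function names, and the toy sentences are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def idf_weights(documents: List[List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency for every term in the documents."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def tfidf(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse TF-IDF vector (raw term frequency times IDF)."""
    return {term: count * idf.get(term, 0.0)
            for term, count in Counter(tokens).items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def rank_by_dev_similarity(candidates: List[str],
                           dev_sentences: List[str]) -> List[Tuple[float, str]]:
    """Rank candidate sentences by similarity to the pooled development set."""
    docs = [sentence.split() for sentence in candidates]
    dev_doc = [token for sentence in dev_sentences for token in sentence.split()]
    idf = idf_weights(docs + [dev_doc])
    dev_vec = tfidf(dev_doc, idf)
    scored = [(cosine(tfidf(doc, idf), dev_vec), sentence)
              for doc, sentence in zip(docs, candidates)]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Toy usage with whitespace-tokenized sentences.
ranked = rank_by_dev_similarity(
    ["stock prices fell sharply", "police arrest villagers"],
    ["villagers beaten by police"],
)
```

The top-ranked sentences would be the ones passed on to the translation and selection steps.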
