How does statistical machine translation work?
Jun 18th, 2007 by Julia
Statistical machine translation (SMT) is a machine translation paradigm in which translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The statistical approach contrasts with rule-based approaches to machine translation. The first ideas of statistical machine translation were introduced by Warren Weaver in 1949. Statistical machine translation was re-introduced in 1991 by researchers at IBM’s Thomas J. Watson Research Center and has contributed to the significant resurgence of interest in machine translation in recent years. As of 2006, it is by far the most widely studied machine translation paradigm.
The most frequently cited benefits of statistical machine translation over traditional paradigms are the following:
Better use of resources
- There is a great deal of natural language in machine-readable format.
- Generally, SMT systems are not tailored to any specific pair of languages.
- Rule-based translation systems require the manual development of linguistic rules, which can be costly, and which often do not generalize to other languages.
More natural translations
The ideas behind statistical machine translation come out of information theory. Essentially, a document is translated according to the probability p(e | f) that a string e in the native language (for example, English) is the translation of a string f in the foreign language (for example, French). Generally, these probabilities are estimated from the bilingual corpus using parameter estimation techniques.
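To make p(e | f) concrete, here is a minimal Python sketch. It assumes we already have a list of aligned (foreign, native) phrase pairs, estimates p(e | f) by simple relative frequency (one basic form of parameter estimation), and picks the most probable e for a given f. The toy data and the function names `estimate_translation_probs` and `best_translation` are made up for illustration; real systems use far richer models.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (foreign phrase, native phrase) pairs as they might be
# observed in an aligned bilingual corpus. Not taken from any real corpus.
observed_pairs = [
    ("maison", "house"),
    ("maison", "house"),
    ("maison", "home"),
    ("chat", "cat"),
]

def estimate_translation_probs(pairs):
    """Relative-frequency estimate: p(e | f) = count(f, e) / count(f)."""
    pair_counts = Counter(pairs)
    foreign_counts = Counter(f for f, _ in pairs)
    probs = defaultdict(dict)
    for (f, e), n in pair_counts.items():
        probs[f][e] = n / foreign_counts[f]
    return probs

def best_translation(probs, f):
    """Return the e that maximizes p(e | f), or None if f was never seen."""
    candidates = probs.get(f)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

probs = estimate_translation_probs(observed_pairs)
print(probs["maison"])                    # probabilities for each candidate e
print(best_translation(probs, "maison"))  # the most probable candidate
```

In this toy table "house" was seen twice and "home" once for "maison", so "house" gets the higher probability and is the one selected.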
Statistical machine translation tries to generate translations using statistical methods based on bilingual text corpora. Google has implemented statistical translation in the beta version of Google Translate for certain languages (Arabic, Russian and Asian languages). The system was trained using United Nations documents as a corpus of 200 billion words. It uses existing source and target language translations (done by human translators at the U.N.) to find patterns, which it then uses to build rules for translating between those languages. The accuracy of the translations has improved.
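How a system can "find patterns" in existing translations can be illustrated with a small sketch. The code below is not Google's method, which is not described here. It is a simplified version of IBM Model 1, a classic word-based model from the IBM line of work, trained with expectation-maximization on a made-up, sentence-aligned toy corpus. The corpus, the function name `train_model1`, and the omission of the usual NULL word are all assumptions made for brevity.

```python
from collections import defaultdict

# Hypothetical sentence-aligned toy corpus: (foreign sentence, native sentence).
corpus = [
    ("la maison".split(), "the house".split()),
    ("la fleur".split(), "the flower".split()),
    ("maison bleue".split(), "blue house".split()),
]

def train_model1(corpus, iterations=10):
    """Estimate word translation probabilities t(e | f) with EM (no NULL word)."""
    native_vocab = {e for _, es in corpus for e in es}
    # Start from a uniform distribution over native words.
    t = defaultdict(lambda: 1.0 / len(native_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (e, f) pair
        total = defaultdict(float)  # expected count of each f
        for fs, es in corpus:
            for e in es:
                # E-step: share each native word among the foreign words
                # of the same sentence, in proportion to the current t.
                norm = sum(t[(e, f)] for f in fs)
                for f in fs:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalize the expected counts into probabilities.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    # Keep only word pairs that actually co-occurred in some sentence pair.
    return dict(t)

t = train_model1(corpus)
for (e, f), p in sorted(t.items(), key=lambda item: (item[0][1], -item[1])):
    print(f"t({e} | {f}) = {p:.3f}")
```

Nobody tells the model that "maison" means "house". It infers this because "house" is the only native word that appears in both sentences containing "maison", while "the" is better explained by "la".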
Source: Wikipedia (read the entire article). This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article “Statistical machine translation”.


I have found sites and blogs with a bar of flags that you can click to get an automatic translation. How is that done?
Please give me some advice here or at arachesostufo@libero.it