Oleg Zabluda's blog
Tuesday, September 27, 2016
A Neural Network for Machine Translation, at Production Scale
Ten years ago, we announced the launch of Google Translate, together with the use of Phrase-Based Machine Translation as the key algorithm behind this service. [...] Today we announce the Google Neural Machine Translation system (GNMT), [...] A few years ago we started using Recurrent Neural Networks (RNNs) to directly learn the mapping from an input sequence (e.g. a sentence in one language) to an output sequence (that same sentence in another language) [2]. Whereas Phrase-Based Machine Translation (PBMT) breaks an input sentence into words and phrases to be translated largely independently, Neural Machine Translation (NMT) considers the entire input sentence as a unit for translation. [...] Since then, researchers have proposed many techniques to improve NMT, including work on handling rare words by mimicking an external alignment model [3], using attention to align input words and output words [4], and breaking words into smaller units to cope with rare words [5,6]. Despite these improvements, NMT wasn't fast or accurate enough to be used in a production system such as Google Translate. Our new paper [1] describes how we overcame the many challenges to make NMT work on very large data sets and built a system that is fast and accurate enough to provide better translations for Google’s users and services.
[...] compared to the previous phrase-based production system. GNMT reduces translation errors by more than 55%-85% on several major language pairs, measured on sampled sentences from Wikipedia and news websites with the help of bilingual human raters. [...] GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page.
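
To make the idea of "breaking words into smaller units to cope with rare words" concrete, here is a minimal sketch of one simple way to do it: greedy longest-match segmentation against a fixed vocabulary of sub-word units. This is only an illustration under stated assumptions. The function name `segment` and the tiny `TOY_VOCAB` are hypothetical, and the excerpt does not say that GNMT or the cited papers [5,6] segment words this way.

```python
def segment(word, vocab):
    """Greedy longest-match-first split of `word` into units from `vocab`.

    Single characters that are not in the vocabulary are emitted as-is,
    so every word can be segmented, however rare it is.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate until it is in the vocabulary
        # or only one character is left.
        while end > start + 1 and word[start:end] not in vocab:
            end -= 1
        pieces.append(word[start:end])
        start = end
    return pieces


# Hypothetical toy vocabulary; a real system would learn its units from data.
TOY_VOCAB = {"trans", "lat", "ion", "s", "un", "able"}

print(segment("translations", TOY_VOCAB))    # ['trans', 'lat', 'ion', 's']
print(segment("untranslatable", TOY_VOCAB))  # ['un', 'trans', 'lat', 'able']
```

A word that never appeared in training, such as "untranslatable" here, still maps onto units the model has seen, which is the point of the technique.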

[1] Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (2016) Yonghui Wu, [...] Quoc V. Le, [...] Oriol Vinyals, Greg Corrado, [...] Jeffrey Dean.

[2] Sequence to Sequence Learning with Neural Networks (2014) Ilya Sutskever, Oriol Vinyals, Quoc V. Le.

