Oleg Zabluda's blog
Thursday, December 01, 2016
“sentence compression algorithms” just went live on [Google]. [...] “You need to use neural networks—or at least that is the only way we have found to do it,” Google research product manager David Orr says of the company’s sentence compression work. [...] Google trains these neural networks using data handcrafted by a massive team of PhD linguists it calls Pygmalion. In effect, Google’s machines learn how to extract relevant answers from long strings of text by watching humans do it—over and over again. [...] To train Google’s artificial Q&A brain, Orr and company also use old news stories, where machines start to see how headlines serve as short summaries of the longer articles that follow. But for now, the company still needs its team of PhD linguists. They not only demonstrate sentence compression, but actually label parts of speech in ways that help neural nets understand how human language works. Spanning about 100 PhD linguists across the globe, the Pygmalion team produces what Orr calls “the gold data,” while the news stories are the “silver.” The silver data is still useful, because there’s so much of it. But the gold data is essential. Linne Ha, who oversees Pygmalion, says the team will continue to grow in the years to come. This kind of human-assisted AI is called “supervised learning,” [...] Right now, Orr says, the team spans between 20 and 30 languages. But the hope is that companies like Google can eventually move to a more automated form of AI called “unsupervised learning.”
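The setup described above (supervised learning from human-labeled "gold" compression pairs) can be illustrated with a toy sketch. This is not Google's method: Google uses neural networks, while the example below stands in a trivial per-word keep-frequency model so it stays self-contained. The training pairs are invented; sentence compression is framed as token deletion, i.e. deciding which words of the original sentence survive.

```python
# Toy sketch of supervised sentence compression as token deletion.
# NOT Google's approach (they train neural nets); this just shows the
# shape of learning from (sentence, compression) pairs like the
# hand-labeled "gold data" the Pygmalion team produces.

from collections import defaultdict

def train_keep_model(pairs):
    """Estimate P(keep | word) from (sentence, compression) pairs."""
    seen = defaultdict(int)
    kept = defaultdict(int)
    for sentence, compression in pairs:
        kept_words = set(compression.lower().split())
        for word in sentence.lower().split():
            seen[word] += 1
            if word in kept_words:
                kept[word] += 1
    return {w: kept[w] / seen[w] for w in seen}

def compress(sentence, model, threshold=0.5):
    """Keep words whose learned keep-probability clears the threshold;
    words never seen in training are kept by default."""
    return " ".join(w for w in sentence.split()
                    if model.get(w.lower(), 1.0) >= threshold)

# Invented "gold"-style training pairs (human-labeled compressions).
pairs = [
    ("the stock market fell sharply on monday", "stock market fell monday"),
    ("the committee will meet again on friday", "committee will meet friday"),
]
model = train_keep_model(pairs)
print(compress("the market will meet on monday", model))
# -> market will meet monday
```

The article's "silver" data (headline/article pairs) would plug into the same training loop as additional, noisier pairs; the point of the gold/silver split is that the hand-labeled pairs are trusted while the mined ones are merely abundant.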

