A neural network of computer processors, fed millions of YouTube videos, taught itself to recognize cats, a feat of significance for fields like speech recognition.
Inside Google’s secretive X laboratory, known for inventing self-driving cars and augmented reality glasses, a small group of researchers began working several years ago on a simulation of the human brain.
At that lab, Google scientists created one of the largest neural networks for machine learning by connecting 16,000 computer processors, and they turned it loose on the Internet to learn on its own.
Presented with 10 million digital images found in YouTube videos, what did Google’s brain do? What millions of humans do with YouTube: looked for cats.
The neural network taught itself to recognize cats, which is actually no frivolous activity. This week the researchers will present the results of their work at a conference in Edinburgh, Scotland. The Google scientists and programmers will note that while it is hardly news that the Internet is full of cat videos, the simulation nevertheless surprised them. It performed far better than any previous effort by roughly doubling its accuracy in recognizing objects in a challenging list of 20,000 distinct items.
The research is representative of a new generation of computer science that is exploiting the falling cost of computing and the availability of huge clusters of computers in giant data centers. It is leading to significant advances in areas as diverse as machine vision and perception, speech recognition and language translation.
Although some of the computer science ideas that the researchers are using are not new, the sheer scale of the software simulations is leading to learning systems that were not previously possible. And Google researchers are not alone in exploiting the techniques, which are referred to as “deep learning” models. Last year Microsoft scientists presented research showing that the techniques could be applied equally well to build computer systems to understand human speech.
“This is the hottest thing in the speech recognition field these days,” said Yann LeCun, a computer scientist who specializes in machine learning at the Courant Institute of Mathematical Sciences at New York University.
And then, of course, there are the cats.
To find them, the Google research team, led by the Stanford University computer scientist Andrew Y. Ng and the Google fellow Jeff Dean, used an array of 16,000 processors to create a neural network with more than one billion connections. They then fed it random thumbnails of images, one each extracted from 10 million YouTube videos.
See on www.nytimes.com