This Guy Beat Google’s Super-Smart AI—But It Wasn’t Easy
Andrej Karpathy knows what it’s like to compete with artificial intelligence.
He first went head-to-head with an artificial intelligence algorithm in 2011. A team of Stanford University researchers had just built the world’s most effective image-recognition software, and he wanted to see how well his very real brain stacked up against their digital creation on what was, at the time, a standard image recognition test.
The Stanford software analyzed a pool of about 50,000 images, slotting each into one of 10 categories, such as “dogs,” “horses,” and “trucks.” It was right about 80 percent of the time. Karpathy took the same test and completely smoked the AI code, scoring 94 percent. Karpathy, himself a graduate student at Stanford, thought humans would beat machines on this type of test for a long time. “[I]t will be hard to go above 80 percent,” he wrote in a blog post, referring to AI algorithms, “but I suspect improvements might be possible up to range of about 85-90 percent.”
Boy, was he wrong.
Last year, a system built by researchers at Google aced another, more complex, image recognition test, called ImageNet, scoring 93.4 percent accuracy. Again, Karpathy, with some colleagues at Stanford, went head-to-head with the system. But this time, they bombed what was a much more complex test, initially scoring about 85 percent. Setting the 2011 test beside ImageNet isn’t exactly an apples-to-apples comparison, but here’s the point: Humans were easily beating AI software in 2011; now that’s not the case. Not by a long shot.
The story goes a long way toward explaining the excitement surrounding current artificial intelligence, which spans companies from Google to Facebook to IBM to Baidu. All of these giants are pouring big money into a burgeoning field called deep learning. Loosely modeled on the way the brain itself is able to accumulate knowledge, deep learning algorithms have been winning the ImageNet competition since 2013, and lately they’ve yielded remarkable results in speech recognition, video recognition, and even financial analysis.
This is causing a shake-up in the field of AI, as problems that have been considered unsolvable for a very long time are suddenly being solved, says Stuart Russell, a University of California, Berkeley, professor and artificial intelligence expert.
That said, computers still have a lot to learn.
AI Boot Camp
One reason Karpathy and his colleagues bombed against the Google systems was the way that ImageNet handles things like dogs. When he took that 2011 test, it had just one category for dogs. But in 2014, the test expected you to discern an artificial-mind-blowing 200 breeds.
That meant Karpathy had to know the difference between, say, Rhodesian ridgebacks and Hungarian pointers. “When I saw all these dogs come up, I was like: ‘Oh no. [The machine is] going to get this image, and I’m just struggling and sweating to label this precise breed of dog.’”
So Karpathy entered his own kind of AI boot camp, teaching himself the categories of images that the ImageNet test expected him to know, and becoming a minor authority on dog breeds in the process. Two weeks later, and after about 50 hours of training and testing himself by clicking on random pictures, he bested the machines. He was right 94.9 percent of the time, a margin of 1.7 percentage points over Google’s work. Chalk one up for humanity, but it wasn’t easy.
“It was a bit draining, but I felt that it was very important to get the human accuracy,” he says.
Abstract Thought
At the same time, Karpathy and his colleagues want AI to improve. They’re working to determine how the flaws in digital systems can be eliminated. “We’re trying to see if the computers perform at a human level, but we’re also trying to analyze their mistakes,” he says.
On the test, Karpathy could typically beat machines when he was presented with the image of something abstract. He could instantly spot, for example, a drawing of a bow. He could read the words “salt shaker” on a cone and understand what he was looking at. “Computers are not very good at abstract things,” he says.
They’re also not good at figuring out images in three-dimensional reality. A computer might be able to spot a Jack Russell terrier. But reckoning its size, or figuring out how it is positioned relative to some other object in the same room? That’s another matter. It’s also one that the Googles of the world are, no doubt, trying to solve as they dream of computers that can interpret images with the depth and subtlety of humans.
“Image recognition is important,” Karpathy says, “but it’s just a small piece of the puzzle.”
 
 
 
 