Caption: “A person riding a motorcycle on a dirt road.” Source: io9
On top of that, a computer must be able to sort out the salient features of that object and identify what it is, that is, what category it belongs to. Even more difficult is explaining the relationships between objects: what's going on in the scene. Finally, in order to create a caption for an image, the computer also needs to be able to translate its understanding into natural-sounding language.
Caption: “Two pizzas sitting on top of a stove top oven.” Source: io9
The human's answer is better because he or she recognized that there were three different kinds of pizza, and that the pizzas were resting on a stove, not a "stove top oven."
At this stage, computers don't always get the captions right, and it's fascinating to see how they get it wrong. For example, the computer mistakenly believed the child in the knitted hat was blowing bubbles.
The problem all along with developing computer vision was that programmers were trying to solve it top-down, by telling the computer exactly what it needed to do. Part of the solution has been a bottom-up approach: deep learning, in which the computer learns from large numbers of labeled examples and rapidly improves its performance.
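To make the top-down versus bottom-up contrast concrete, here is a minimal sketch in Python. It is only a toy: the two made-up features, the data, and every function name are assumptions invented for illustration, and the learner is a single perceptron, which is far simpler than the deep networks behind real image captioning. The first function is a rule a programmer writes by hand; the second finds its own rule from labeled examples.

```python
# Toy illustration only: features, data, and names are hypothetical.
# Each sample is (feature_1, feature_2); label 1 = "object", 0 = "background".
SAMPLES = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0), (3.5, 4.5),
           (0.5, 0.5), (1.0, 0.2), (0.2, 1.0), (1.2, 0.8)]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def hand_written_rule(features):
    """Top-down: the programmer guesses a threshold and hard-codes it."""
    _, feature_2 = features
    return 1 if feature_2 > 2.0 else 0


def predict(weights, bias, features):
    """Score a sample with the learned weights and threshold at zero."""
    score = weights[0] * features[0] + weights[1] * features[1] + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, max_epochs=1000):
    """Bottom-up: nudge the weights every time an example is misclassified."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in zip(samples, labels):
            error = label - predict(weights, bias, features)
            if error != 0:
                weights[0] += error * features[0]
                weights[1] += error * features[1]
                bias += error
                mistakes += 1
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(SAMPLES, LABELS)
learned = [predict(weights, bias, s) for s in SAMPLES]
print("learned rule matches labels:", learned == LABELS)
```

The hand-written rule works only as long as the programmer's guess is right. The perceptron is never told the rule; it is only corrected when it is wrong. Deep learning applies that same idea with many layers and millions of examples.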
Computer vision presents us with some immediate potential benefits: artificial systems will be able to help blind people, assist in manufacturing, and drive us around safely in cars.
But artificial intelligence, in its darker potential manifestations, poses an existential threat to humans, as outlined in the current New Yorker article "The Doomsday Invention" and in this TED talk (link to YouTube).
------
Computer vision on Wikipedia
Great Post Mr. Gi. (Nice job w the book-club report too.)
The potential benefits are enormous, especially if you consider a real partnership of a cyborg/human alliance. The benefit to humans arising out of a God-like benefactor/caretaker achieving for us what we cannot do for ourselves or are unwilling to wait/work to achieve for ourselves (especially that bit about ‘Human-values’).
The potential/inevitable devastation is immense when you consider the current threat of cyber-warfare and how well we have mastered that ‘Human-values’ thing. I hope our Franken-Borg will at least keep our litter-box clean, and the Soylent Green palatable. -RQ
Thanks for this article. Machine learning is an interest of mine, and I've been spending quite a bit of time learning some of this stuff. The fact that they made TensorFlow open source a few months ago is huge; it opens the possibility for all sorts of research. They're smart and understand that research and incorporation is being done in these fields all over the world, and they need as many breakthroughs as possible, as the field is still pretty new. I did an AI workshop about a month ago where they explained how they interpreted all those street view images and addresses for Google Maps; it's quite fascinating because they do use image recognition embedded with geographical data and machine learning. Andrew Ng is a huge hero in that field; I'm a huge huge fan, as well as of the Machine Learning department at Stanford. Ironically writing and reading your article from a Google workshop/HQ lol.
Resistance is futile.....