Friday, July 26, 2013

Google's research on machine perception

Google is making huge strides in artificial (or machine) vision. Until now, most machine vision systems have been deployed on robotic assembly lines and in other highly controlled environments.

One of Google's goals is to bring machine vision into the real world and see how well it does at recognizing objects. Above are images of both specific and general objects, gathered into sets based on features that the computer is trained to look for. Extracting these features from a digital image—or from a live video capture—can be immensely complex and subjective, especially when an object is seen from various angles and under varying lighting.
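To make "features" concrete: many vision pipelines start from something as simple as intensity gradients, which mark where edges lie in an image. The sketch below is purely illustrative (it is not Google's pipeline); it applies the standard Sobel kernels to a tiny synthetic image with one vertical edge, and all the names in it are made up for this example.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a kernel over a grayscale image (valid mode, no padding)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# Sobel kernels approximate the horizontal and vertical intensity gradients.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

# A toy 5x5 "image": dark left half, bright right half -> one vertical edge.
img = np.array([[0, 0, 0, 255, 255]] * 5, dtype=float)

gx = convolve2d(img, sobel_x)   # responds to the vertical edge
gy = convolve2d(img, sobel_y)   # zero here: no horizontal edges
magnitude = np.sqrt(gx**2 + gy**2)  # edge strength at each pixel
```

Real systems layer many such filters (and learned ones) to build the feature sets the post describes, but the angle- and lighting-sensitivity problem is already visible here: rotate or darken the image and the filter responses change.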

As it improves, machine perception will help with driverless cars, so that the car can read street signs, look for garage sales, or distinguish black ice from a wet street. It will also be valuable for Google Street View, which could then know what it is looking at, and for Google Glass, which could help users recognize things at superhuman levels: for example, knowing the make, model, and year of every car in your visual field.

Machine vision already surpasses ours in many ways, and it won't be long before it does so in others. One recent study demonstrated a machine system that could extract 100,000 features from a scene, while a comparable human could extract only 10,000. Part of the computer's advantage comes from GPS location data, which gives it context for such observations.

To make progress in this field, Google has been supporting scientific research in visual perception and hosting experts in the field to give lectures, which you can watch online. This one (video link), by Dennis Proffitt, brings out the point that non-human creatures definitely don't see the way we humans see. The frog, for example, is predisposed to see moving edges, moving dots (read: insects), and changes in illumination.

Later in the talk (at 20:00), Proffitt describes a study demonstrating that a right-handed person perceives their right arm to be longer than their left arm, and at 36:00 he shows that a person's estimate of the walking distance to visible objects is directly correlated with their physiological health. At around 48:00, he suggests that our perceptual system compresses vertical dimensions depending on the size and position of the screen or canvas.
A list of Google's online resources about machine perception


Quan M. Chu said...

Google Images right now can't help much. The results frequently come back as a vast mass of content that doesn't really relate to the keywords. They have an ocean's breadth of images, but their depth is like a swimming pool's. I sometimes struggled to find good photos on Google Images for school projects, and later had to go out and shoot my own reference.
In my personal opinion, this will be really helpful in the future for a lot of artists, especially illustrators and realist painters, who won't have to spend so much time hunting for reference and related images that satisfy their needs.

Robert J. Simone said...

Oh, so it's not my fault I tend to blow proportions by making things a little "squatty". That's a relief!