Friday, January 7, 2011

Predictive Coding

A new study from Duke University revises our idea of how the visual system works. 

Old idea: bottom up
According to the older idea, images are constructed in our minds in a hierarchical fashion starting at the bottom. Data arriving at the retina is sorted into basic features, such as horizontal and vertical lines. These elements gradually resolve as they pass up through the layers of neural organization. Eventually they form into complete images that we recognize as particular objects. The brain’s higher level inferences, according to this model, develop only after this bottom-up process is completed.

 
New idea: top-down
Experiments using modern brain-imaging data show that the process actually runs largely in reverse. The new idea, called “predictive coding,” proposes that the brain develops predictions about what we’re about to see, and tests those predictions in a top-down, rather than merely a bottom-up, mode. The information that we take in is edited to fit the conception. This all happens within milliseconds and is largely unconscious.
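For the computationally curious, predictive coding is often described as an error-correction loop. Here is a toy sketch in Python (purely illustrative, not taken from the Duke study): a top-down prediction is nudged toward the bottom-up signal in proportion to the prediction error, and the error itself dies down as the belief is revised.

```python
# Toy illustration of a predictive-coding loop (not a neural model).
# The "prediction" is revised toward the sensory signal in proportion
# to the prediction error at each step.

def predictive_update(prediction, signal, learning_rate=0.5, steps=8):
    """Iteratively revise a prediction toward an incoming signal.

    Returns the final prediction and the size of the prediction
    error at each step.
    """
    errors = []
    for _ in range(steps):
        error = signal - prediction          # bottom-up correction
        prediction += learning_rate * error  # top-down belief revised
        errors.append(abs(error))
    return prediction, errors

# Expecting a "stick" (0.0) while the signal says "snake" (1.0):
final, errors = predictive_update(prediction=0.0, signal=1.0)
print(final)   # converges toward the signal (close to 1.0)
print(errors)  # the burst of error shrinks step by step
```

The big initial error followed by rapid settling is a loose analogue of the “neural storm” described below: the stronger the mismatch between expectation and input, the bigger the correction.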


Neural storm
Whenever low-level input contradicts the predictions and forces us to change our reading of an object, there is a storm of neural activity. This neural storm is particularly strong with optical illusions, such as the one that began this post. You may have experienced this if you ever, in a split second, mistook a stick on the ground for a snake. The top-down expectations get quickly overwhelmed by bottom-up corrections.

This illustration from Dinotopia: The World Beneath shows this kind of mental process in action. An ambiguous cave formation, left, can be seen either as a skull or as a woman with babies. Depending on how the mind wants to perceive the form, the details are marshaled to match the perception.

This tendency to conjure faces and other meaningful patterns in apparently random visual data is also called “pareidolia,” covered in an earlier post.

Here are three suggestions for how this new theory may affect us as artists (and I'm sure you'll think of other implications):
1. We mostly see what we expect to see. Viewers come to your pictures unconsciously preloaded in various ways.
2. Much information is lost because of this automatic editing process. It never reaches our conscious minds because it's edited out. Therefore we just don’t see a huge amount (and maybe that’s a good thing at times).
3. Having a wrong search image can actually make us blind to what we’re looking for. For example, if you thought the book you were looking for had a blue spine, you might not even see the correct book with the red spine.
-----
Duke University news release
Another report on predictive coding
First optical illusion from Planet Perplex
Related GJ Post on Pareidolia and Apophenia:
Thanks, Rob Wood and Brad

22 comments:

Kyle V Thomas said...

That's really cool. Is it akin to seeing images in clouds?
How can we know that the brain predicts? How is that measurable?

Barbara said...

This rings true. I've spent a lot of time pondering how I can recognize people at a distance, even in a crowd, because I'm very nearsighted and can't make out detail at all.

taylor said...

It is fascinating to me how modern science plays over ancient philosophy. "Predictive Coding" and the subsequent corrective details could easily be tech speak for Kant's metaphysics, and the argument between top-down and bottom-up understanding is as old as Plato and Aristotle (respectively). I love that we are immersed in these age-old discussions by doggedly pursuing our craft. I imagine that venerable minds such as Plato would be shocked to find artists (of whom he was not over fond) discussing the legacy of his epistemology in terms of concrete observable data. Thanks Jim for such fascinating topics!

Kessie said...

This is one of those things I've known unconsciously for ages, and I have to fight my own brain sometimes, like you said, when looking for a book with a blue cover when maybe it really does have a red cover.

I made use of this for years as a kid, sorting through Lego bricks for a certain shape or color. I found I could actually "set" my brain to look only at a certain color, and that color would actually appear brighter to me. It's very interesting to me to see this phenomenon explained.

Sean said...

I had a friend who simply could not see anything odd about this image for the longest time...

Bobby Chiu image at his deviantArt gallery

Bobby Chiu used that predictive coding to full advantage with the colours & composition. We're already conditioned to predict that green = vegetation, a pair of dark circles close together = eyes, and two white pointy-up things = animal ears. No problem, right? Reinforced by the prediction that low horizon = ground, it's a very effective illusion. And funny to boot, if you like that sort of thing :)

renatabarillipainter said...

Yes I do agree with all this.
Actually I didn't meet it with art, but before with yoga, when you do special exercises.
But what would be really interesting: if your brain already has images that drive your eyes in seeing, does this reveal what kind of person you are? I mean, if you don't have a lot of choice, are you a material person? Or if you see angels, are you a religious person? Or if you see many figures in it, are you a fanciful one?
And so on. Probably artists have got plenty of predictions.

Kate Higgins said...

Many times I have caught myself looking for the "blue spine." So I began to force myself to reject the preconceived notion, and usually I find the book immediately. I thought I was the only one who did that...another preconceived notion.

Tim Fehr said...

We do this when we read as well. It allows us to recognize the same words in many different fonts almost instantly - even in fonts we have never seen before.

Further, we pay more attention to, or gather more information from, the top of the letter forms than the bottoms.

Just take a piece of paper and cover the bottom half of the 'Leave your comment' line above on the right - line the paper up just at the height of the center of the cross stroke in the 'e's. It's pretty easy to still read the line, but cover the top half and look only at the bottom, and it's much harder to read. Try this with almost any line of printed text.

This allows clever text artists like Scott Kim to create ambigrams which read the same right side up as upside down. While it's simple with a name like 'otto', Kim does words like Christmas, horizon, etc. Google the term 'ambigram' to see lots of examples.

How we process letter forms is as interesting as how we process other images.

Thanks for this entry, Jim.

I got 'Color and Light' for Christmas and even my non-artistic friends are finding it fun to read.

Chibi Janine said...

I was thinking of that most delightful word pareidolia the other day. My little boy, who is 3, is at that age where he will get freaked out by patterns on the wall as his brain processes new information, and a few air bubbles in paint can resemble faces. Couple that with his wonderful growing imagination (which I hope he never loses) and it leads to some interesting conversations.

Chris Jouan said...

Reminds me of the psychology experiment where subjects are shown a video of a basketball game and asked to count passes by a certain team. Very few people notice the player in the gorilla suit until they are shown the video and told to look for him/her.

Our perception, and sometimes lack of it, fascinates me.

Roberto said...

Now I’m no neuroscientist, but…
I recently finished two excellent series of lectures. One taught by Professor Jeanette Norden, PH.D., from the Vanderbilt University School of Medicine, entitled: ‘Understanding the Brain,’ and the second one: ‘Philosophy of Mind: Brains, Consciousness, and Thinking Machines’ by Professor Patrick Grim, B.Phil., Ph.D., from the State University of N.Y. at Stony Brook.
(Both from The Teaching Company, www.TEACH12.com).

One of the amazing things I got from them was that not only does our brain process (visual) information in both of these hierarchical ‘top-down’ and ‘bottom-up’ ways, but simultaneously upside-down and inside-out as well! (to complete the metaphor)
Each eye sends info to both hemispheres, each by way of two parallel pathways: One dorsal for spatial (where) info, and one ventral for ‘what’ info, (color vision is in the ventral stream). There are over thirty separate visual cortical areas in our brains(!)
In addition, the info is not just linear, but it is propagated thru cascading-cytoarchitecture (whateverthehelthatis?!). And this is just what I remember!!
I think the bottom line is that ‘seeing’ is a construct of our 'sentient-piece-o-meat', and that our ‘experience’ of seeing is quite separate from the process of looking. Not only do we leave things out but we also add stuff, like filling in the blind spot at the optic disc and maintaining color constancy.
Whenever I start to struggle with a painting I chant to myself: ‘Simplify’… ‘Simplify’ … ‘Simplify’ … I guess I shoulda done that with this post. -RQ

@ Tim: Thanx for the ambigrams.
@ Jim: Thanx for the Journey.

Pati said...

This post reminds me of the line drawing that can look like either an old lady with a hook nose, or, a beautiful young women with a feathered hat. Personal growth guru Stephen Covey used the picture in one of his early books, as his example of someone having a paradigm shift. Some see the old lady and some would see the young women, but then, eventually you would be able to identify both women in the same line drawing. We do "see what we want to see....". Thanks for the post.
Pati
http://paintingsbypati.blogspot.com/

António Araújo said...

>proposes that the brain develops predictions about what we’re about to see. It tests those predictions (...)

There's an old saying "let the data speak for itself". But the data can never speak for itself, you have to come up with a framework for the data and then test against it; without the framework, the data has no voice.

The question arises, how do we come up with the frameworks? Evolution, and surviving of the fittest is a really powerful force. If your framework doesn't distinguish the snake from the stick you won't be around for long...

Eliza said...

Is this idea really that new? I thought this was the whole point of the Rorschach inkblots. If they use the same inkblot cards for everyone and everyone's brains use the bottom-up method, why would some people see a butterfly and others see a monster or whatever? It seems to me that we've had the top-down concept for quite a long time, though whether it makes any sense to draw conclusions about someone's psychological state with it is up for debate.

Suciô Sanchez said...

Love the images.
You might like Rorschach Redemption's blog.

James Gurney said...

Kyle, yes, I think it's exactly like seeing things in clouds. Don't know the answers to your other questions, but maybe someone will find out.

Barbara--I think you're on to something important. Face recognition is immediate and unconscious, and having to revise the first guess (when it turns out your friend is really a stranger) is pretty disturbing.

Taylor--Good point--it sort of is old wine in new bottles, but all the data coming from fMRI studies gives us such an incredible new and verifiable way of looking at how the mind works.

Sean--I love that Bobby Chiu image, and he's perfectly tuned it so that it reads just in that way.

James Gurney said...

Renata--yes, the unique way each of us sees says volumes about each of us. I wonder if it applies to cats, who seem to "see" bugs and birds in moving shadows.

Kate--remember that news story about the DC sniper/shooter a few years ago? These were professional detectives, but they were missing data because they were looking for that white van.

Tim--never tried that reading trick before. Thanks. And glad you're enjoying the book.

Chibi and Chris--great examples. I've seen that gorilla video, and it's unbelievable.

Roberto--thanks for adding those nuances. I meant to say that both top-down and bottom-up mechanisms were working simultaneously. Didn't know about the inside out mode. Wow!

Antonio--I think some of that framework can be programmed in, and it depends on the species. I read an interesting story in a book called "Illuminations in the Flatwoods" about a biologist who raised turkeys from the moment they hatched. One time while walking through the forest with the group of turkeys, he mistook a stick for a snake, and the turkeys seemed to understand his problem and look disgusted at him. http://www.amazon.com/Illumination-Flatwoods-Season-Wild-Turkey/dp/1599211971

Eliza--You're right: it's an old idea, but I think what's new about it is that it's testable and verifiable by new research methods.

Roberto said...

Well, that inside-out-mode might just be my brain ;p

But seriously, my biggest concern with all of this is its impact on ‘eye-witness’ testimony (as has played out in Texas recently), and the moral implications of the Death Penalty. Just wondering :( -RQ

António Araújo said...

James, more or less apropos, I think you are going to like this one, if you haven't seen it yet:

http://visionlab.harvard.edu/silencing/

It is pretty amazing. Apparently we really can't trust our lying eyes at all.

James Gurney said...

Thank you, Antonio--those are amazing videos, and the findings are news to me.

KB said...

You might like http://www.flickr.com/groups/pareidolia/pool/

I've heard that most of the neurons in the optic nerve of an infant decay and disappear during the first years of life -- possibly because infants start bottom-up on their way to learning to see top-down. Non-contributing neurons are culled.

dzart said...

Wow, your illustration is fantastically creepy.