Gurney Journey: Google Cloud Vision

Permissions

All images and text are copyright 2020 James Gurney and/or their respective owners. Dinotopia is a registered trademark of James Gurney. For use of text or images in traditional print media or for any commercial licensing rights, please email me for permission.

However, you can quote images or text without asking permission on your educational or non-commercial blog, website, or Facebook page as long as you give me credit and provide a link back. Students and teachers can also quote images or text for their non-commercial school activity. It's also OK to do an artistic copy of my paintings as a study exercise without asking permission.

Monday, November 8, 2021

Google Cloud Vision

Google Cloud Vision is a service, with a free online demo, that lets you harness the power of machine learning to analyze images.

You can upload any picture. The algorithm, which was trained on a vast collection of labeled pictures, then makes its best guess about what objects it sees.
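If you'd rather send a picture from a script than through the web page, here is a minimal sketch of the request body that the Cloud Vision REST endpoint (`images:annotate`) accepts. The helper name `build_annotate_request` and the placeholder image bytes are my own assumptions for illustration, and the sketch stops short of actually sending anything, since that step needs your own API key or credentials.

```python
import base64
import json

# Public REST endpoint for image annotation. Sending a request to it
# also requires an API key or OAuth token, which is not shown here.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, max_results=10):
    """Build the JSON body asking for object and label guesses on one image.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # The API expects the raw image bytes as a base64-encoded string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [
                    # Objects it thinks it sees (the "person" and "cat" guesses).
                    {"type": "OBJECT_LOCALIZATION", "maxResults": max_results},
                    # General qualities of the whole image.
                    {"type": "LABEL_DETECTION", "maxResults": max_results},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real picture file.
    body = build_annotate_request(b"not a real image", max_results=5)
    print(json.dumps(body, indent=2))
```

In practice you would read the bytes from an image file and POST the body as JSON to `VISION_ENDPOINT`.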

In this Tom Lovell illustration, GCV is very certain that it sees a single cat, and it's relatively certain that it sees a person. It makes no mention of the other cat, the knitting, the blue chair, or the white sweater.

What happens if you give it a fantasy image that doesn't exist in the real world, such as a renegade warrior astride a Styracosaurus with a T. rex-tooth helmet holding a saber-tooth cat skull on a staff? In this Dinotopia image it recognizes two generalized objects: "a person and an animal."

Clicking on the "labels" tab, you can see that it identifies general qualities of the image with decreasing certainty. It's wrong about hunting and it's wrong about a working animal, but it knows that it's an illustration of an extinct animal.
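The "decreasing certainty" ordering can be reproduced from the raw API response, where each label arrives with a `description` and a `score` between 0 and 1. The sketch below assumes that response shape; the function name `ranked_labels` is hypothetical, and the scores in the sample are invented placeholders, not the numbers the service actually returned for this painting.

```python
def ranked_labels(response):
    """Return label strings sorted from most to least certain."""
    # A response holds one entry per image; each may carry label annotations.
    annotations = response["responses"][0].get("labelAnnotations", [])
    ordered = sorted(annotations, key=lambda a: a["score"], reverse=True)
    return [f"{a['description']}: {a['score']:.0%}" for a in ordered]


# Invented sample data for illustration only. The label names echo the ones
# mentioned in the post; the scores are made up.
SAMPLE_RESPONSE = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Working animal", "score": 0.62},
                {"description": "Illustration", "score": 0.91},
                {"description": "Hunting", "score": 0.55},
            ]
        }
    ]
}

if __name__ == "__main__":
    for line in ranked_labels(SAMPLE_RESPONSE):
        print(line)
```

A high score only means the model is confident, not that it is right, which is why a label like "Hunting" can still show up on the list.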

What happens if you input an image that has no analog in the real world because the image was itself generated by a machine-learning algorithm? Can it find something in the DNA of the image that could help it identify the word prompt that generated the image?

This picture was created by VQGAN+CLIP with the prompt "Constructionist Typography." The properties that it finds are more general than that, but it's in the ballpark.

Try Google Cloud Vision yourself and let me know in the comments what you discover.

1 comment:

Drake Gomez said (November 8, 2021 at 6:17 PM):

Well, I uploaded a photo of Duchamp's Bicycle Wheel. Google interprets it as a stool (84% likelihood) or a tire (51%), but not art or a sculpture. I suppose depending on your feelings about Duchamp's work, Google Cloud Vision is either not so smart, or very (artificially) intelligent indeed.
