In life as well as in computer vision, context is critical.
Scroll through this post slowly and don’t cheat.
What do you see in the image below?
Now what do you see in this next picture?
How about this picture, can you now identify the images above?
How did you identify the car? “Because it was on a road”
How did you identify the lines? “Because there is a car on it”
Having the proper context of the full image lets you properly asses and identify what the entire scene is, as well as the individual elements.
Last question. What direction was the car traveling in?
How did you guess the direction? Was it based on assumptions about what lanes cars drive in? Does that change based on your country?
Until we can develop the algorithms for determining proper context, computer vision is at a significant disadvantage to humans. We need to join computer vision and AI to achieve the next generation of scene recognition. In the past I have often given the example of a person standing in the street waiting for a bus. To a human we can tell the person is not going to dart into the street as they are just waiting for a bus; for a self-driving car it just looks like an obstacle in the street. The context is missing.
This is just a quick post based on a part of a talk given by Dr. Takeo Kanade, one of the worlds foremost computer vision experts. It is an interesting post to watch, the information above is presented around 23:40 minute mark.
This was just a quick post but when watching that clip I thought it was a good illustration that I wanted to share. I hope you also enjoyed it.
Main image By Metropolitan Transportation Authority of the State of New York. Cropped and color-corrected prior to upload by Daniel Case – Metro-North Shuttle Buses, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=38214438