Wednesday, August 8, 2018

Object Segmentation

What is object segmentation?
Object segmentation is computer vision speak for “draw an outline around an object”. What it really means is separating the background from the foreground (the object): deciding, pixel by pixel, what belongs to the object and what doesn’t.
If you’re new to computer vision, you’re going to think this is the easiest problem in the world. I mean, it’s pretty trivial for us to tell where one object ends and another begins. In fact, there’s good reason to believe our ability to quickly segment a scene into objects is the result of natural selection. If it takes you too long to identify the tiger hiding in the woods, you’re not going to survive long enough to tell the story.
So why is image segmentation so hard for computers? The main reason is that there are no rules. Objects have varying texture, size, colors and shapes. It’s practically impossible to hard-code a set of rules to identify where an object “ends”. It gets much easier if the computer has prior knowledge.
How to solve segmentation?
Prior knowledge can take the form of domain-specific knowledge. For example, if you’re focusing on detecting screws in images taken on a factory floor, you can make some assumptions that simplify the problem. You can also train a neural net to identify specific classes of objects.
A little less common is the idea of using motion. If an object is moving, segmenting it becomes much easier. Think about watching cars on a highway. Segmentation becomes image subtraction: look at two consecutive images, and the only thing changing is the car (because the highway looks the same). If you simply subtract the first image from the second, the pixels that changed give you a good segmentation of the moving car!
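Here is a minimal sketch of the image subtraction idea, assuming grayscale frames stored as NumPy arrays and a fixed camera. The function names (motion_mask, bounding_box), the threshold of 25, and the tiny synthetic “highway” are all made up for illustration; this is not taken from any particular system.

```python
import numpy as np

def motion_mask(frame_a, frame_b, threshold=25):
    """Boolean mask of pixels whose intensity changed by more than `threshold`."""
    # Cast to a signed type first so the uint8 subtraction cannot wrap around.
    diff = np.abs(frame_b.astype(np.int16) - frame_a.astype(np.int16))
    return diff > threshold

def bounding_box(mask):
    """(top, left, bottom, right) of the True pixels, or None if nothing changed."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])

# Hypothetical scene: a flat gray road with a bright 2x3 "car" that moves right.
background = np.full((8, 12), 100, dtype=np.uint8)
frame_a = background.copy()
frame_a[3:5, 1:4] = 200   # car in the first frame
frame_b = background.copy()
frame_b[3:5, 5:8] = 200   # car in the second frame

mask = motion_mask(frame_a, frame_b)
box = bounding_box(mask)
```

One thing the sketch makes visible: subtracting two consecutive frames lights up both where the car was and where it is now. If you have an image of the empty road, motion_mask(background, frame_b) picks out just the car in its current position. Real footage also has noise and lighting changes, which is why the threshold is there.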
You can also take this idea one step further. What if segmentation isn’t a passive task? What if we have a robot that can create motion? Well, in that case image segmentation becomes much easier. This is a great application of Interactive Perception.
Children learning segmentation
What I find most fascinating here is that a robot could bootstrap object segmentation through interaction. Eventually, it will have seen enough examples to learn to do segmentation even without poking objects. And if this sounds familiar, it’s probably because this is how children learn visual tasks.
You can read more about this type of work on my websites: Dov Katz and Dubi Katz.
