Monday, August 6, 2018

Interactive Perception / Dov Katz

Interactive Perception

My name is Dov Katz. I am a roboticist by training. My research and work focus on the intersection of machine learning, computer vision and robotics. I also like to write about these topics.
Here are some thoughts about Interactive Perception, as I originally published them on Medium:

Introduction

I’ve been fascinated with robots forever. The idea of a machine that can think like a person but is physically far superior is intriguing. As I started doing research in robotics, I quickly realized a surprising fact: building a brilliant machine is easy. Designing a machine that can do everything we do effortlessly is hard.
Here’s the truth about how far we’ve gotten in about 50 years of robotics: machines can beat almost every living person at chess, and they can drive cars and navigate on Mars. And yet, they suck at opening a drawer, taking out some forks and placing them on the dinner table. This seems… weird. Why would robots excel at things so far removed from what any 3-year-old cares about, and be so clueless at what every 3-year-old can do effortlessly?
I believe the answer is that robots don’t get to grow up. They’ve never had to build from the ground up, and they’ve never developed curiosity. Human children in the first three years of life are consumed by a desire to explore and experiment with objects. They are fascinated by causal relations between objects, and quite systematically explore the way one object can influence another. They persistently explore the properties of objects using all their senses. For example, a child might gently tap a new toy car against the floor, listening to the sounds it makes, then try banging it loudly, and then try banging it against the soft sofa. This kind of playing around with the world, while observing the outcome of actions, is more than just play. It actually contributes to babies’ ability to solve the big, deep problems of disappearance, causality, and categorization.

Action and Perception

This explanatory drive tightly couples action and perception. This coupling was described in the 1970s by the psychologist James J. Gibson. Gibson’s research views perception as an active process, highly coupled with motor activities. Motor activities are necessary to perform perception, and perception is geared towards detecting opportunities for motor activities. Gibson called these opportunities “affordances”. In my research in robotics, I referred to this process as Interactive Perception.
Perceiving the world, making decisions, and acting to change the state of the world seem to be three independent processes. This is exactly why most people consider action and perception as separate. However, an “enactive” approach to perception may be essential for surviving in a high-dimensional and uncertain world. Interactive Perception provides a straightforward way to formulate theories about the state of the world and directly test these theories through interactions.
For example, think about the first time a child encounters a pair of scissors. She has no sense of what this object does or how it works. Yes, she could spend some time looking at it and making educated guesses. But what the child is most likely going to do is poke and probe it. This interaction will create motion, and this motion will make it easy to determine what scissors can do.
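To make the scissors example a bit more concrete, here is a toy sketch of how motion caused by a poke can separate an object into parts. It is only an illustration under made-up assumptions, not a description of any real system: the function name `split_by_motion`, the noise `threshold`, and the point coordinates are all hypothetical. It assumes we can track the same few points on the object before and after the interaction.

```python
from math import hypot


def split_by_motion(before, after, threshold=0.5):
    """Split tracked points into those that moved and those that stayed put.

    before, after: lists of (x, y) positions of the same points, observed
    before and after a poke. threshold is a hypothetical noise tolerance.
    Returns two lists of point indices: (moved, static).
    """
    moved, static = [], []
    for i, ((x0, y0), (x1, y1)) in enumerate(zip(before, after)):
        # Displacement of point i caused by the interaction.
        if hypot(x1 - x0, y1 - y0) > threshold:
            moved.append(i)
        else:
            static.append(i)
    return moved, static


# Toy data: points 0 and 1 sit on the blade being held still,
# points 2 and 3 sit on the blade being pushed.
before = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (3.0, 1.0)]
after = [(0.0, 0.0), (1.0, 0.1), (2.0, 2.0), (3.0, 3.0)]

moved, static = split_by_motion(before, after)
print("moved:", moved, "static:", static)
```

Before the poke, nothing in a single image says that points 2 and 3 belong to a different part than points 0 and 1. After the poke, the split falls out of a simple comparison, which is the sense in which interaction makes perception easier.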
Interactive Perception imposes structure. It limits what needs to be perceived and explained. If we have any hope of building robots that can do what toddlers do, I believe making them curious and letting them interact with the world to learn about it is essential.

And, maybe once they are expert toddlers, we can start thinking about sending them to preschool :-)

