Robots have traditionally performed tasks by executing explicit, pre-programmed instructions. A new system instead lets a robot watch videos, learn from them and act accordingly.
Researchers at the University of Maryland have developed a new approach that enables robots to learn to reproduce simple tasks by watching videos.
The researchers present a system that learns manipulation action plans by
processing unconstrained videos from the World Wide Web.
Team leader Professor Yiannis Aloimonos said, "There exists a gargantuan amount of video information on the Internet that we can capitalize on so that our robots can learn."
The paper focuses on visual processing: building a robot that watches a human performing a task in a video, understands what the human is doing and how, and finally replicates those actions on its own.
To perform a manipulation action, the robot needs to learn which tool to grasp and which object to act on. The proposed system applies Convolutional Neural Network (CNN)-based recognition modules to recognize the objects and tools in the video.
The system has two visual recognition modules: one for classifying grasping types and the other for recognizing objects. In both modules the researchers used convolutional neural networks as classifiers.
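The division of labor between the two modules can be sketched as follows. This is an illustrative sketch only, not the authors' code: the stub classifiers, label names and the Command structure are assumptions standing in for the trained CNN modules described above.

```python
# Illustrative sketch (not the authors' code): outputs from two
# recognition modules are combined into one atomic action command.
from dataclasses import dataclass

@dataclass
class Command:
    grasp_type: str  # e.g. a small power grasp for a knife handle
    tool: str
    action: str
    obj: str

def stub_grasp_classifier(frame):
    # Stands in for the CNN module that classifies grasping types.
    return "power-small"

def stub_object_classifier(frame):
    # Stands in for the CNN module that recognizes tools and objects.
    return ("knife", "tomato")

def parse_frame(frame, action="cut"):
    # Fuse both modules' predictions into a single manipulation command.
    grasp = stub_grasp_classifier(frame)
    tool, obj = stub_object_classifier(frame)
    return Command(grasp, tool, action, obj)

cmd = parse_frame(frame=None)
print(cmd)
# Command(grasp_type='power-small', tool='knife', action='cut', obj='tomato')
```

In the real system each stub would be a trained CNN running on video frames; the point here is only that the two classifiers' outputs are merged into one structured command the robot can execute.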
Currently the videos are fed electronically to the robot.
For their experiments, the researchers considered the ten most common actions in cooking scenarios: cut, pour, transfer, spread, grip, stir, sprinkle, chop, peel and mix.
The system was able to robustly extract visual sentences with high accuracy and to learn atomic action commands with few errors.
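A "visual sentence" of this kind can be pictured as a small structured rendering of the recognized grasp, tool, action and object. The notation below is a hypothetical illustration, not the paper's exact grammar:

```python
# Hedged illustration: render a recognized (grasp, tool, action, object)
# tuple as a textual "visual sentence"; the notation is assumed, not
# taken from the paper.
def visual_sentence(grasp, tool, action, obj):
    return f"Grasp_{grasp}({tool}) {action.capitalize()}({tool}, {obj})"

print(visual_sentence("power-small", "knife", "cut", "tomato"))
# Grasp_power-small(knife) Cut(knife, tomato)
```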
“Our ultimate goal is to build a self-learning robot that is able to enrich its knowledge about fine-grained manipulation actions by ‘watching’ demo videos,” the researchers wrote.