
Spotlight June 2022 – Teaching Robots


PUBLISHED BY FII Institute

June 27, 2022

ROBOTS LEARN TO LEARN – A LITTLE LIKE THE WAY OUR KIDS DO

Can machines learn to walk and act in the real world the way children do? Starting from scratch, watching, copying, trying, failing, retrying? Learning to move is not as easy as learning to think – but the robots are getting closer.

 

THE ISSUE AT STAKE

MYON WOULD BE A 6TH GRADER BY NOW. It was built in 2011 by Manfred Hild, a professor of neurorobotics in Berlin.1 A humanoid-looking robot, Myon was specifically designed for – nothing. It had no special skills, but with 200 sensors, 50 motors and lots of limbs and joints, the machine was supposed to develop itself through observation of its environment.

“Myon has no goal,” said its creator Hild. “But we do have a goal: to understand things.” In this case, to understand learning. “Intelligence does not come overnight. Not in children, and not in robots either. How long does it take for children to stand, to walk, to talk?”

So, instead of a program, the new robot was simply equipped with some rules to follow. For example, follow conspicuous signals. Or, once you’ve decided on something, stick with it for a while. And it was given a rather childlike design, with one oversized eye in its head. The cuddle factor was supposed to lower the communication barrier and increase the patience of interacting humans. Look, it’s a child; it’s still learning.

Its moment of glory came in the summer of 2015: Myon became an actor at the opera. At the Komische Oper Berlin, the Robaby played a leading role in the show “My Square Lady.” A combination of video recordings and stage action allowed the audience to participate in the project and witness Myon’s first successes, as well as its setbacks.

Mainly the latter. The “follow the signal” rule, for example, caused Myon to turn its head to the loudspeakers, not to the singer. We know that a soprano’s voice belongs to the person currently moving on stage, but the robot attributed it to the place the sound was coming from.

And Myon’s performance hasn’t improved much since then, as Benjamin Panreck, one of its co-creators, admits.4 The robot needs intensive training to learn specific actions, and it shows little sign of being able to transfer a once-learned lesson to a slightly altered situation. Today Myon is often used to teach students at Professor Hild’s neurorobotics lab how to train robots, but its own learning curve remains shallow at best.

ROBOTS CAN LEARN THROUGH INCENTIVES

But wait a minute, don’t we live in the Age of Machine Learning? Yes, we do. It started 25 years ago, when Jürgen Schmidhuber and Sepp Hochreiter published their groundbreaking work on Long Short-Term Memory (LSTM; see box page 3).5 They found a way for neural networks to “forget” certain outcomes and focus on the right ones; a toy sketch below shows where that forgetting happens. With sufficient training data, these recurrent neural networks can produce astonishing results for tasks like speech recognition or machine translation. And they can do it almost autonomously, with relatively little human training effort involved.

Autonomously learning neural networks have become quite common. Autonomous robots, such as self-driving cars, are already a familiar concept. But autonomously learning robots remain only an aspiration. Robots that learn how to move and to act in the real world still rely heavily on human intervention.
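For the technically curious, here is a minimal NumPy sketch of a single LSTM step. It is only an illustration with placeholder weights (the names lstm_step, W and b are invented here, not taken from the original paper), but it shows where the “forget gate” sits.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. The forget gate f decides how much of the old
    cell state c_prev survives into the new cell state."""
    z = np.concatenate([h_prev, x])          # previous hidden state + current input
    f = sigmoid(W["f"] @ z + b["f"])         # forget gate: 0 = discard, 1 = keep
    i = sigmoid(W["i"] @ z + b["i"])         # input gate: how much new content to write
    o = sigmoid(W["o"] @ z + b["o"])         # output gate: how much of the cell to expose
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate cell content
    c = f * c_prev + i * c_tilde             # forgetting and writing happen here
    h = o * np.tanh(c)                       # new hidden state
    return h, c

# Toy usage with random placeholder weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(4, 7)) for k in "fioc"}
b = {k: np.zeros(4) for k in "fioc"}
h, c = lstm_step(rng.normal(size=3), np.zeros(4), np.zeros(4), W, b)
```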

So how do robots learn? The main robot education principle is one they share with children and babies: learning through incentives. The robot first behaves in random ways and then evaluates how these behaviors have worked. That can be done via feedback from the instructor, who tells the robot whether its actions were effective or not. The robot chooses the behavior that offers it the highest reward, and then turns to the next iteration. It applies a number of random variations to the chosen behavior and determines by trial and error which of the new behaviors is now the most successful, and so on.
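That loop can be written down in a few lines. The sketch below is a toy illustration in Python, not any lab’s actual training code: a “behavior” is just a list of numbers, the instructor’s feedback is a reward function, and the robot keeps whichever random variation earns the highest reward (names such as learn_by_incentive are invented here).

```python
import random

def learn_by_incentive(evaluate, initial_behavior, n_iterations=100, n_variants=8, noise=0.1):
    """Toy incentive loop: try random variations of the current behavior and
    keep whichever one the reward function (the 'instructor') scores highest."""
    best = list(initial_behavior)
    best_reward = evaluate(best)
    for _ in range(n_iterations):
        # apply a number of random variations to the chosen behavior
        variants = [[p + random.gauss(0.0, noise) for p in best] for _ in range(n_variants)]
        for candidate in variants:
            reward = evaluate(candidate)     # feedback: how well did this behavior work?
            if reward > best_reward:         # keep the behavior with the highest reward
                best, best_reward = candidate, reward
    return best, best_reward

# Hypothetical usage: the "behavior" is two gait parameters, and the reward is
# higher the closer they come to an (invented) target gait.
target = [0.7, 0.3]
reward_fn = lambda b: -sum((p - t) ** 2 for p, t in zip(b, target))
best, reward = learn_by_incentive(reward_fn, initial_behavior=[0.0, 0.0])
print(best, reward)
```

In real systems the reward comes from sensor readings or a human instructor rather than a made-up formula, but the cycle of vary, evaluate and keep is the same idea.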

This method is called “reinforcement learning,”7 and as long as you stay in the pure world of data, it’s not that different from the “forget gate” designed by Schmidhuber and Hochreiter. But robots by definition come in touch with a physical world beyond pure data, and then things become different – and more complex. Take a robot that is learning to walk. In many cases the outcome is falling, usually an undesired outcome. And then the robot should not simply forget the trial, but remember it so as not to do it again – like a child that touches a hot plate for the first time. So robo-learning through incentives can go both ways, reward and punishment.

And speaking of falling, that’s costly. Every time the robot falls down or walks out of its training environment, it needs someone to pick it up and set it back on track. That’s a lot of manpower, because robots need a lot of training. And it also poses a risk of damaging the robot. You may try to construct a robust robot, but to be able to walk, it also has to be flexible, with a lot of moving parts – joints and motors and sensors. And even if the damage risk is low for a single event, robots learning how to walk will produce a lot of such events during their training.
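To make that concrete, the toy loop from above can be extended with a punishment term and a reset counter. Everything here (train_walker, fake_rollout, fall_penalty) is invented for illustration: a fall earns a large negative reward, and every fall also costs one human reset.

```python
import random

def train_walker(rollout, initial_params, n_trials=200, noise=0.05, fall_penalty=-10.0):
    """Toy loop with reward and punishment: a fall earns a large negative reward,
    and every fall also costs one human reset, which we count."""
    best, best_score = list(initial_params), float("-inf")
    resets = 0
    for _ in range(n_trials):
        candidate = [p + random.gauss(0.0, noise) for p in best]
        distance, fell = rollout(candidate)                 # one physical (here: simulated) trial
        score = distance + (fall_penalty if fell else 0.0)  # punishment for falling
        if fell:
            resets += 1                                     # someone has to pick the robot up
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, resets

# Hypothetical stand-in for a physical trial: longer steps walk farther but fall more often.
def fake_rollout(params):
    step = params[0]
    fell = random.random() < min(0.9, abs(step))            # bigger steps, higher fall risk
    return (0.0 if fell else step * 10.0), fell

best, score, resets = train_walker(fake_rollout, initial_params=[0.1])
print(best, score, resets)
```

Even in this toy version the resets counter grows quickly, which is exactly the manpower and damage problem described above.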

 

 

 

