Robots that Learn : Reinforcement Learning in Robotics
By: USQRD in AI
Written by Akarsh Gopal
The convergence of Robotics and Artificial Intelligence (AI) has always seemed inevitable. Optimally controlling mechanisms with many degrees of freedom has long been a hard problem for roboticists. Using AI algorithms, researchers have been able to get robots to learn surprisingly organic movement and behaviour. Here are examples of recent achievements in robotics using AI:
Robot learning how to ‘stand up’
Robot learning dexterity
The algorithms behind these robots use a specific approach called Reinforcement Learning (RL). RL is a sub-field of AI that deals with situations where an ‘agent’ acts in an ‘environment’. The agent receives feedback from the environment in the form of a reward signal and has a particular goal to achieve within the environment. The objective of RL is to develop methods that allow the agent to learn to achieve that goal: the agent is given an objective, placed in the environment, and acts so as to maximise its cumulative reward, thereby achieving the objective.
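This agent–environment loop can be made concrete with a minimal sketch. The toy ‘corridor’ environment, reward values, and learning constants below are illustrative assumptions, not from any real robot; the update rule is standard tabular Q-learning, one of the simplest RL methods:

```python
import random

# A toy 1-D "corridor" environment: the agent starts at cell 0 and must
# reach cell 4. The reward signal is +1 at the goal, 0 everywhere else.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # move left or move right

def step(state, action):
    """The environment: apply an action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done

# The agent's knowledge: Q[state][a] estimates future reward for each action.
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

random.seed(0)
for episode in range(200):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit what was learnt, sometimes explore.
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = max(range(2), key=lambda i: Q[state][i])
        next_state, reward, done = step(state, ACTIONS[a])
        # Q-learning update: nudge the estimate towards reward + discounted future.
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

# The greedy policy the agent ends up with (1 == "move right").
policy = [max(range(2), key=lambda i: Q[s][i]) for s in range(GOAL)]
print(policy)
```

After training, the agent has learnt to head right towards the goal in every state, purely from the reward signal, with no hand-coded knowledge of where the goal is.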
Deep Learning can be leveraged in RL to learn complex control ‘policies’, since deep neural networks excel at approximating complicated functions through their layered structure. Deep RL has proved successful at learning to play complex video games such as Dota 2 (OpenAI Five) and StarCraft II (DeepMind’s AlphaStar), and DeepMind’s AlphaZero mastered chess through self-play. This approach is also showing promise in the control of physical robots.
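In Deep RL, the ‘policy’ is typically a neural network mapping observations to action probabilities. The sketch below shows only that mapping; the layer sizes are arbitrary assumptions, and in a real system the weights would be trained (e.g. by policy-gradient methods) to maximise reward:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer policy network with illustrative, untrained weights.
OBS_DIM, HIDDEN, N_ACTIONS = 4, 16, 2
W1 = rng.normal(0, 0.1, (HIDDEN, OBS_DIM))
W2 = rng.normal(0, 0.1, (N_ACTIONS, HIDDEN))

def policy(obs):
    """Map one observation vector to a probability distribution over actions."""
    h = np.tanh(W1 @ obs)               # hidden layer adds nonlinearity
    logits = W2 @ h
    exp = np.exp(logits - logits.max()) # numerically stable softmax
    return exp / exp.sum()

probs = policy(np.array([0.1, -0.2, 0.0, 0.5]))
print(probs)  # one probability per action, summing to 1
```

Replacing the Q-table of the previous sketch with a network like this is what lets RL scale from five corridor cells to the high-dimensional observations of a video game or a robot.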
Deep RL, however, requires large amounts of ‘training’ data, since an agent that starts fresh has no prior knowledge about the environment. Collecting training data for physical-world applications can be prohibitively expensive or practically impossible. In such cases, an approach called Transfer Learning can reduce the amount of data that must be collected: learning on a new problem is improved by reusing a neural network that has already been trained on a similar dataset to solve a similar problem.
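One common form of transfer learning keeps the early layers of a pretrained network and re-initialises only the task-specific output layer. The sketch below illustrates this with hypothetical weight matrices (the layer names and sizes are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical network pretrained on a "source" task.
pretrained = {
    "layer1": rng.normal(size=(16, 8)),   # low-level features: worth reusing
    "layer2": rng.normal(size=(16, 16)),  # mid-level features: worth reusing
    "head":   rng.normal(size=(4, 16)),   # source-task output: replaced
}

def transfer(source_params, n_target_outputs):
    """Copy the shared layers; re-initialise only the output head."""
    target = {k: v.copy() for k, v in source_params.items() if k != "head"}
    hidden = source_params["head"].shape[1]
    target["head"] = rng.normal(0, 0.01, (n_target_outputs, hidden))
    return target

target_params = transfer(pretrained, n_target_outputs=6)
print(np.array_equal(target_params["layer1"], pretrained["layer1"]))  # True
print(target_params["head"].shape)  # (6, 16)
```

Only the small new head (and perhaps a light fine-tuning of the copied layers) then needs training on the target task, which is why far less fresh data is required.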
ANYbotics, a Switzerland-based robotics company, recently demonstrated its quadruped robot ANYmal learning to ‘stand up’ and recover from falls. The researchers trained the control ‘policy’ with RL in a simulated version of ANYmal and then successfully transferred it to the physical robot. This is evidence of how RL can help robots learn behaviour that would be prohibitively difficult to hand-engineer.
OpenAI, a non-profit AI research company co-founded by Elon Musk and Sam Altman, built a system called Dactyl, published in their work ‘Learning Dexterity’. The researchers used a modified version of the algorithm behind OpenAI Five to teach a robotic hand to manipulate a cube into target orientations. The learnt behaviour is uncannily human-like in its dexterity. Coding this behaviour with hand-written control logic would take hundreds, if not thousands, of hours of work. Using Deep RL in a simulation of the setup, however, the researchers taught the robotic hand in a far shorter time, and then transferred the learning to the physical world without any significant changes to the learnt parameters of the algorithm. This is a truly remarkable achievement in AI and robotics research.
For many researchers, the long-term goal of RL is Artificial General Intelligence (AGI): a system able to complete tasks and adapt to different environmental constraints the way humans do. With developments like these, we are gradually progressing towards AI capable of tackling tasks of increasing complexity.