Glen Berseth

I am an assistant professor at the Université de Montréal and Mila. My research explores how to use deep learning and reinforcement learning to develop generalist robots.


I am an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a CIFAR AI Chair, and co-director of the Robotics and Embodied AI Lab (REAL). Previously, I was a postdoctoral researcher at Berkeley Artificial Intelligence Research (BAIR), working with Sergey Levine. My research focuses on solving sequential decision-making problems for real-world autonomous learning systems (robots), covering the areas of reinforcement, continual, meta, and hierarchical learning, as well as human-robot collaboration. I have published at top venues across the disciplines of robotics, machine learning, and computer animation. Currently, I teach a course on robot learning at the Université de Montréal and Mila that covers the most recent research on machine learning techniques for creating generalist robots.

To see a more formal biography, click here.

Interested in joining the lab?

Are you interested in the practical and theoretical challenges of creating generalist problem-solving robots? Please see this page to apply, as I may not respond to emails.

News



Representative Publications

  • General task learning by inferring rewards from example data

    Can we use reinforcement learning to learn general-purpose policies that can perform a wide range of different tasks, resulting in flexible and reusable skills? Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity. Categorical contexts preclude generalization to entirely new tasks. Goal-conditioned policies may enable some generalization, but cannot capture all tasks that might be desired. In this paper, we propose goal distributions as a general and broadly applicable task representation suitable for contextual policies. Goal distributions are general in the sense that they can represent any state-based reward function when equipped with an appropriate distribution class, while the particular choice of distribution class allows us to trade off expressivity and learnability. We develop an off-policy algorithm called distribution-conditioned reinforcement learning (DisCo) to efficiently learn these policies. We evaluate DisCo on a variety of robot manipulation tasks and find that it significantly outperforms prior methods on tasks that require generalization to new goal distributions.
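
Since the abstract compresses the method, here is a minimal, hedged PyTorch sketch of the central idea: condition the policy on the parameters of a goal distribution (here a Gaussian with diagonal covariance) rather than on a single goal state, and score reached states by their log-likelihood under that distribution. Everything below is an illustrative assumption, not the authors' DisCo implementation; the class and function names are hypothetical.

```python
# Sketch of a distribution-conditioned policy; names are hypothetical,
# not the DisCo codebase. Assumes a Gaussian goal distribution with
# diagonal covariance, passed to the policy as (mean, log-variance).
import torch
import torch.nn as nn


class DistributionConditionedPolicy(nn.Module):
    """Policy conditioned on the parameters of a goal distribution
    instead of a single goal state."""

    def __init__(self, state_dim, goal_dim, action_dim, hidden=256):
        super().__init__()
        context_dim = 2 * goal_dim  # mean + log-variance of the goal distribution
        self.net = nn.Sequential(
            nn.Linear(state_dim + context_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, action_dim),
            nn.Tanh(),  # actions in [-1, 1]
        )

    def forward(self, state, goal_mean, goal_log_var):
        context = torch.cat([goal_mean, goal_log_var], dim=-1)
        return self.net(torch.cat([state, context], dim=-1))


def goal_distribution_reward(state, goal_mean, goal_log_var):
    """Unnormalized Gaussian log-likelihood of the reached state under
    the goal distribution; loose-variance dimensions are penalized less."""
    var = torch.exp(goal_log_var)
    return -0.5 * torch.sum((state - goal_mean) ** 2 / var + goal_log_var, dim=-1)


# Example: require x near 1.0 while y can be anywhere (loose variance),
# a task a single goal state cannot express.
policy = DistributionConditionedPolicy(state_dim=2, goal_dim=2, action_dim=2)
state = torch.zeros(1, 2)
goal_mean = torch.tensor([[1.0, 0.0]])
goal_log_var = torch.tensor([[-4.0, 4.0]])  # tight in x, loose in y
action = policy(state, goal_mean, goal_log_var)
reward = goal_distribution_reward(state, goal_mean, goal_log_var)
```

Varying the distribution parameters is what gives the trade-off the abstract mentions: a near-zero variance recovers an ordinary goal-conditioned task, while a broad variance in some dimensions specifies a whole family of acceptable outcomes.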