Glen Berseth

I am an assistant professor at the Université de Montréal and Mila. My research explores how to use deep learning and reinforcement learning to develop generalist robots.

Publications


International Conference on Learning Representations 2021 (Oral, top 1.8% of submissions)

SMiRL: Surprise Minimizing RL in Unstable Environments

Glen Berseth, Daniel Geng, Coline Devin, Nicholas Rhinehart, Chelsea Finn, Dinesh Jayaraman, Sergey Levine

All living organisms carve out environmental niches within which they can maintain relative predictability amidst the ever-increasing entropy around them [schneider1994, friston2009]. Humans, for example, go to great lengths to shield themselves from surprise --- we band together in millions to build cities with homes, supplying water, food, gas, and electricity to control the deterioration of our bodies and living spaces amidst heat and cold, wind and storm. The need to discover and maintain such surprise-free equilibria has driven great resourcefulness and skill in organisms across very diverse natural habitats. Motivated by this, we ask: could the motive of preserving order amidst chaos guide the automatic acquisition of useful behaviors in artificial agents?
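
As a rough illustration of this surprise-minimization idea, here is a minimal sketch in Python. The diagonal-Gaussian density model and all names are illustrative assumptions for exposition, not the paper's implementation (which also augments the agent's observation with the density model's parameters):

```python
import numpy as np

class SurpriseMinimizingReward:
    """Illustrative sketch: fit a diagonal Gaussian to the states visited
    so far in an episode and score each new state by its log-likelihood,
    so the agent is rewarded for keeping its state distribution predictable."""

    def __init__(self):
        self.states = []

    def reset(self):
        # Call at the start of each episode.
        self.states = []

    def reward(self, state):
        state = np.asarray(state, dtype=np.float64)
        self.states.append(state)
        history = np.stack(self.states)
        mu = history.mean(axis=0)
        var = history.var(axis=0) + 1e-6  # variance floor avoids degeneracy
        # Log-density of the new state under the running Gaussian model.
        return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (state - mu) ** 2 / var)
```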


Contextual Imagined Goals for Self-Supervised Robotic Learning

Ashvin Nair*, Shikhar Bahl*, Alexander Khazatsky*, Vitchyr Pong, Glen Berseth, Sergey Levine

While reinforcement learning provides an appealing formalism for learning individual skills, a general-purpose robotic system must be able to master an extensive repertoire of behaviors. Instead of learning a large collection of skills individually, can we enable a robot to propose and practice its own behaviors automatically, learning about the affordances and behaviors that it can perform in its environment, such that it can then repurpose this knowledge once a new task is commanded by the user? In this paper, we study this question in the context of self-supervised goal-conditioned reinforcement learning. A central challenge in this learning regime is the problem of goal setting: in order to practice useful skills, the robot must be able to autonomously set goals that are feasible but diverse. When the robot's environment and available objects vary, as they do in most open-world settings, the robot must propose to itself only those goals that it can accomplish in its present setting with the objects that are at hand. Previous work studies self-supervised goal-conditioned RL only in a single-environment setting, where goal proposals drawn from the robot's past experience or a generative model are sufficient. In more diverse settings, this frequently leads to impossible goals and, as we show experimentally, prevents effective learning. We propose a conditional goal-setting model that aims to propose goals that are feasible from the robot's current state. We demonstrate that this enables self-supervised goal-conditioned off-policy learning with raw image observations in the real world, enabling a robot to manipulate a variety of objects and generalize to new objects that were not seen during training.
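
As a hedged sketch of what such a context-conditional goal proposer could look like (the architecture, names, and sizes below are assumptions for illustration, not the paper's model), a conditional generative model can sample goals conditioned on an encoding of the current observation, so proposals stay consistent with the objects actually in the scene:

```python
import torch
import torch.nn as nn

class ConditionalGoalProposer(nn.Module):
    """Illustrative sketch of context-conditioned goal proposal:
    a decoder maps a latent sample plus an encoding of the current
    observation to a proposed goal image."""

    def __init__(self, latent_dim=8, context_dim=64, goal_dim=3 * 48 * 48):
        super().__init__()
        self.latent_dim = latent_dim
        self.encode_context = nn.Sequential(
            nn.Linear(goal_dim, context_dim), nn.ReLU()
        )
        self.decode = nn.Sequential(
            nn.Linear(latent_dim + context_dim, 256),
            nn.ReLU(),
            nn.Linear(256, goal_dim),
            nn.Sigmoid(),  # pixel intensities in [0, 1]
        )

    def propose_goal(self, current_obs):
        # Condition on the current observation so proposals are feasible here.
        ctx = self.encode_context(current_obs)
        z = torch.randn(current_obs.shape[0], self.latent_dim)
        return self.decode(torch.cat([z, ctx], dim=-1))
```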


Interactive Architectural Design with Diverse Solution Exploration

Glen Berseth, Brandon Haworth, Muhammad Usman, Davide Schaumann, Mahyar Khayatkhoei, Mubbasir Turab Kapadia, Petros Faloutsos

In architectural design, architects explore a vast number of design options to maximize various performance criteria while adhering to specific constraints. In an effort to assist architects in such a complex endeavour, we propose IDOME, an interactive system for computer-aided design optimization. Our approach balances automation and control by efficiently exploring, analyzing, and filtering space layouts to better inform architects' decision-making. At each design iteration, IDOME provides a set of alternative building layouts which satisfy user-defined constraints and optimality criteria with respect to a user-defined space parametrization. When the user selects a design generated by IDOME, the system performs a similar optimization process with the same (or different) parameters and objectives. A user may iterate this exploration process as many times as needed. In this work, we focus on optimizing built environments using architectural metrics, improving the degree of visibility, accessibility, and information gain for navigating a proposed space. This approach, however, can be extended to support other kinds of analysis as well. We demonstrate the capabilities of IDOME through a series of examples, performance analysis, user studies, and a usability test. The results indicate that IDOME successfully optimizes the proposed designs with respect to the chosen metrics and offers a satisfactory experience for users with minimal training.
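
The propose/filter/select interaction loop described above can be summarized in a short sketch (all function names here are hypothetical placeholders, not IDOME's API):

```python
from typing import Callable

def interactive_design_loop(
    initial_layout,
    generate_alternatives: Callable,   # optimizer: layout -> candidate layouts
    satisfies_constraints: Callable,   # user-defined feasibility check
    ask_user_to_select: Callable,      # human in the loop
    iterations: int = 5,
):
    """Hedged sketch of an interactive design-optimization loop."""
    layout = initial_layout
    for _ in range(iterations):
        # Propose alternatives, then filter to those meeting the constraints.
        candidates = [
            c for c in generate_alternatives(layout)
            if satisfies_constraints(c)
        ]
        if not candidates:
            break  # no feasible alternatives; keep the current design
        # The user's selection seeds the next round of optimization.
        layout = ask_user_to_select(candidates)
    return layout
```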


Gamification of Crowd-Driven Environment Design

Michael Brandon Haworth, Muhammad Usman, Davide Schaumann, Nilay Chakraborty, Glen Berseth, Petros Faloutsos, Mubbasir Kapadia

This paper describes how human creativity, within a gamified collaborative design framework, can address the complexity of predictive environment design. The framework is predicated on gamifying crowd objectives and presenting environment design problems as puzzles. A usability study reveals that the framework is considered usable for the task. Participants were asked to configure an environment puzzle to reduce an important crowd metric, the total egress time. The design task was constructed to be straightforward and uses a simplified environment as a probe for understanding the utility of gamification and the performance of collaboration. Single-player and multiplayer designs outperformed both optimization-based and expert-sourced designs of the same environment, and multiplayer designs further outperformed the single-player designs. Single-player and multiplayer iterations followed linear and exponential decreasing trends in total egress time, respectively. Our experiments provide strong evidence for a novel approach to crowdsourcing collaborative environment design.


Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks

Glen Berseth, Christopher Pal

Imitation learning, the ability to reproduce observed behaviour, is a challenging and vital problem: it is what enables animals to understand and mimic one another from observation. Many state-of-the-art methods for imitation rely on additional data that is often not available in the real world; for example, along with an expert's joint positions, the torques the expert applied. In this work, we describe a learning system that allows an agent to reproduce the behaviour of 3D simulated robots from video. This progress will enable us to create robots that can learn behaviour from observing humans, and allow humans to instruct robots in a very natural way.
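
A minimal sketch of the recurrent Siamese idea (layer sizes and names are assumptions for illustration, not the paper's exact architecture): both videos pass through the same frame encoder and recurrent network, and the distance between their final embeddings can serve as a learned comparison between the agent's behaviour and the demonstration:

```python
import torch
import torch.nn as nn

class RecurrentSiameseDistance(nn.Module):
    """Sketch of a recurrent Siamese distance between two videos.
    Shared weights on both branches make this a Siamese comparison."""

    def __init__(self, frame_dim=3 * 64 * 64, feat_dim=128):
        super().__init__()
        self.frame_encoder = nn.Sequential(
            nn.Linear(frame_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim)
        )
        self.rnn = nn.GRU(feat_dim, feat_dim, batch_first=True)

    def embed(self, video):
        # video: (batch, time, frame_dim) -> final hidden state (batch, feat_dim)
        feats = self.frame_encoder(video)
        _, h = self.rnn(feats)
        return h[-1]

    def forward(self, agent_video, demo_video):
        return torch.norm(self.embed(agent_video) - self.embed(demo_video), dim=-1)
```

In an RL setting, this distance would be negated to form an imitation reward (e.g. reward = -distance), so the agent is driven to make its own video indistinguishable from the demonstration.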