TorchDriveEnv Release

Wed May 08 2024

“While environment overfitting can occur in any learning algorithm, it is particularly pernicious in reinforcement learning.” [Whiteson et al.]

What does this mean? In practical terms, it means that many popular RL algorithms are de facto overfit to a small collection of environments (Atari, MuJoCo, etc.), in the sense that the algorithms themselves often do not work well in new environments. This overfitting shows up in a number of ways, from reliance on particular reward scalings to assumptions about the "shape" of the conditional policy distribution family, and so forth.

To remedy this, it is incumbent upon the field to generate new environments and, indeed, for such new environments to be sufficiently challenging to push algorithm development beyond the status quo.

With our release of TorchDriveEnv we have done just this.

At the heart of self-driving lies a complex planning and control problem: get where you want to go without driving off the road, crashing into other road users, and so forth.

TorchDriveEnv is the first OpenAI Gym-compatible, open-source, publicly available, industrial-grade, top-down 2D self-driving planning and control environment that contains, by default, realistic, diverse, and reactive human-like non-ego actors. You might be tempted to think of this as multi-agent RL, but it is not: TorchDriveEnv is a single-agent RL environment in which the environment and the driving tasks within it are made realistically complex through interactions with ITRA-derived NPCs. These NPCs and their associated GPU computation are provisioned, for free, to academic users via Inverted AI's INITIALIZE and DRIVE APIs.
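To make the single-agent framing concrete, here is a minimal toy sketch in Python. It is not TorchDriveEnv's API: ToyDriveEnv, its 1-D lane, its NPC rule, and the five-value step return (borrowed from recent Gym-style interfaces) are all assumptions made for illustration. The point it shows is that the NPC's behaviour lives inside the environment's step, so the learner only ever chooses the ego action.

```python
import random


class ToyDriveEnv:
    """Hypothetical 1-D stand-in: the ego follows a single NPC along a lane."""

    def __init__(self, seed=0, horizon=50):
        self.rng = random.Random(seed)
        self.horizon = horizon

    def reset(self):
        self.t = 0
        self.ego = 0.0    # ego position along the lane
        self.npc = 10.0   # NPC starts ahead of the ego
        return (self.ego, self.npc), {}

    def _npc_speed(self):
        # The NPC is part of the environment dynamics and reacts to the ego:
        # it speeds up when the ego gets close behind it.
        gap = self.npc - self.ego
        base = 1.0 + 0.2 * self.rng.random()
        return base + (0.5 if gap < 5.0 else 0.0)

    def step(self, action):
        self.ego += float(action)        # only the ego is controlled from outside
        self.npc += self._npc_speed()    # the NPC moves on its own
        self.t += 1
        collided = self.ego >= self.npc
        reward = -10.0 if collided else float(action)  # toy reward: progress, no crash
        terminated = collided
        truncated = self.t >= self.horizon
        return (self.ego, self.npc), reward, terminated, truncated, {}


env = ToyDriveEnv()
obs, info = env.reset()
total, done = 0.0, False
while not done:
    ego, npc = obs
    action = 1.0 if npc - ego > 5.0 else 0.5  # hand-written ego policy
    obs, reward, terminated, truncated, info = env.step(action)
    total += reward
    done = terminated or truncated
```

From the agent's side this is an ordinary single-agent loop, even though another road user is moving and reacting in the scene.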

To train, test, and validate in TorchDriveEnv means to optimize a reward (which you can modify) that, by default, rewards smooth, collision-free, on-road, low-jerk driving that follows waypoints specified in space on a variety of built-in CARLA maps (which you can augment with your own). Training and validation scenarios are kept separate (you can design and include your own of both), and Weights & Biases experiment tracking is integrated as well.
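As a rough illustration of what a reward of this shape might look like, here is a hedged Python sketch. It is not the reward TorchDriveEnv ships with: the StepInfo fields, the sketch_reward function, and every weight and threshold below are hypothetical, chosen only to show how collision, off-road, jerk, and waypoint terms could be combined.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step summary of the ego vehicle (not a TorchDriveEnv type)."""
    collided: bool           # ego overlapped another road user
    offroad: bool            # ego left the drivable surface
    jerk: float              # rate of change of acceleration
    dist_to_waypoint: float  # distance to the current target waypoint


def sketch_reward(info: StepInfo,
                  collision_penalty: float = 10.0,
                  offroad_penalty: float = 10.0,
                  jerk_weight: float = 0.1,
                  waypoint_bonus: float = 1.0,
                  waypoint_radius: float = 2.0) -> float:
    """Toy reward: penalize collisions, off-road driving, and jerk; reward reaching waypoints."""
    reward = 0.0
    if info.collided:
        reward -= collision_penalty
    if info.offroad:
        reward -= offroad_penalty
    reward -= jerk_weight * abs(info.jerk)       # smoother driving loses less
    if info.dist_to_waypoint <= waypoint_radius:
        reward += waypoint_bonus                 # within reach of the waypoint
    return reward


# A smooth, on-road step that reaches its waypoint.
good_step = sketch_reward(StepInfo(False, False, 0.0, 1.0))
# A step that ends in a collision far from the waypoint.
bad_step = sketch_reward(StepInfo(True, False, 0.0, 5.0))
```

Exposing the weights as arguments mirrors the idea that the reward is something you can modify, though the actual configuration mechanism may look quite different.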

To get a sense of the environment, and of roughly what training and validating in the presence of ITRA NPCs is like, check out these GIFs, in which a partially SAC-trained agent interacts with Inverted AI NPCs on CARLA maps in test scenarios (SAC ego agent in red, ITRA NPCs in blue):

Or go to the Inverted AI website sandbox to try out INITIALIZE and DRIVE, no coding required.

A more complete description of TorchDriveEnv is provided in this arXiv paper. TorchDriveEnv is directly pip-installable

pip install torchdriveenv[baselines]

and source code is available on GitHub.

Sign up for a free Inverted AI API key using your academic email and you're ready to go. Let's solve self-driving together.