Learning Adaptive Safety for Multi-Agent Systems

Preprint and supplementary material available online.

Overview

Ensuring safety in dynamic multi-agent systems is challenging due to limited information about the other agents. Control Barrier Functions (CBFs) show promise for safety assurance, but current methods make strong assumptions about the other agents and often rely on manual tuning to balance safety, feasibility, and performance. In this work, we study the problem of adaptive safe learning for multi-agent systems with CBFs. We show how emergent behavior can be profoundly influenced by the CBF configuration, highlighting the need for a responsive and dynamic approach to CBF design. We propose ASRL, a novel adaptive safe RL framework, which fully automates the optimization of the policy and the CBF coefficients to enhance safety and long-term performance through reinforcement learning. By directly interacting with the other agents, ASRL learns to cope with diverse agent behaviors and keeps cost violations below a desired limit. We evaluate ASRL in a multi-robot system and in a competitive multi-agent racing scenario, against learning-based and control-theoretic approaches. Building on CBF-based control, we further aim to formulate a theory of safe control synthesis for hybrid dynamical systems.
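To make the role of the CBF coefficients concrete, below is a minimal sketch of a CBF safety filter with a linear class-K term `alpha * h(x)`, the kind of coefficient ASRL adapts online. The single-integrator dynamics, the circular-obstacle barrier, and the `cbf_filter` helper are illustrative assumptions, not the exact formulation used in the paper.

```python
# Minimal sketch of a CBF safety filter with a tunable class-K coefficient.
# Assumptions (not the paper's exact setup): single-integrator dynamics
# x_dot = u, barrier h(x) = ||x - x_obs||^2 - r^2, and alpha(h) = alpha * h.
import numpy as np

def cbf_filter(u_nom, x, x_obs, radius, alpha):
    """Project u_nom onto the half-space of inputs satisfying the CBF condition
    grad_h(x) . u >= -alpha * h(x)."""
    h = np.dot(x - x_obs, x - x_obs) - radius**2
    grad_h = 2.0 * (x - x_obs)
    # Closed-form solution of: min ||u - u_nom||^2  s.t.  grad_h . u >= -alpha*h
    slack = np.dot(grad_h, u_nom) + alpha * h
    if slack >= 0.0:  # nominal input already satisfies the CBF condition
        return u_nom
    # Otherwise, shift u_nom minimally onto the constraint boundary.
    return u_nom - slack * grad_h / np.dot(grad_h, grad_h)

# Example: the agent heads straight at an obstacle at distance 1.
x, x_obs = np.array([0.0, 0.0]), np.array([1.0, 0.0])
u_nom = np.array([1.0, 0.0])
for alpha in (0.5, 5.0):
    print(alpha, cbf_filter(u_nom, x, x_obs, radius=0.5, alpha=alpha))
```

In this toy example, a larger `alpha` lets the agent keep moving toward the obstacle while still far from it, whereas a smaller `alpha` brakes much earlier; this is precisely the conservativeness/performance trade-off that the CBF coefficients control and that ASRL tunes automatically.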

Video Overview

Installation

The implementation has been tested with Python 3.8 under Ubuntu 20.04.

Steps

  1. Clone this repository.
  2. Install the requirements:

```bash
pip install -r requirements.txt
```

Docker

For better reproducibility, we will soon release a Dockerfile to build a container with all the necessary dependencies. :construction_worker:

Reproducing the Results

We assume that all the experiments are run from the project directory and that the project directory is added to the PYTHONPATH environment variable as follows:

```bash
export PYTHONPATH=$PYTHONPATH:$(pwd)
```
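To double-check that the variable is set, the following hypothetical one-liner prints the entries of `PYTHONPATH` (the project directory should be among them):

```python
import os
# Print the PYTHONPATH entries; on Linux, os.pathsep is ":".
print(os.environ.get("PYTHONPATH", "").split(os.pathsep))
```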

Experiment 1 - End-to-End Training


  1. For the multi-robot environment, run from the project directory:

```bash
./scripts/run_exp_baselines.sh [0-6]
```

where the exp-id [0-6] denotes runs with PPOPID, PPOLag, CPO, IPO, DDPGLag, TD3Lag, and PPOSaute, respectively.

  2. Similarly, for the racing environment, run:

```bash
./scripts/run_exp_baselines.sh [7-13]
```

The results will be saved in the logs/baselines folder. A sketch for sweeping all the baseline ids is shown below.
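For convenience, all the multi-robot baseline ids can be swept from a short script. The following is a sketch assuming the script interface documented above; the id-to-algorithm mapping is taken from the list, and the racing ids 7-13 follow the same pattern.

```python
# Hypothetical sweep over the multi-robot baseline ids 0-6; assumes the
# run_exp_baselines.sh interface above and execution from the project root.
import subprocess

BASELINES = ["PPOPID", "PPOLag", "CPO", "IPO", "DDPGLag", "TD3Lag", "PPOSaute"]

for exp_id, algo in enumerate(BASELINES):
    print(f"[exp {exp_id}] {algo}")
    subprocess.run(["./scripts/run_exp_baselines.sh", str(exp_id)], check=True)
```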

Experiment 2 - Ablation Study


We provide a few ablated models, which augment the built-in controllers with adaptive safety, in the checkpoints folder.

To play with trained models with adaptive safety, run:

```bash
./scripts/run_checkpoint_eval.sh [0-1]
```

where the exp-id [0-1] denotes runs for the particle-env and racing environments, respectively.

Publications

Contributors

Citation


```bibtex
@misc{berducci2023learning,
      title={Learning Adaptive Safety for Multi-Agent Systems},
      author={Luigi Berducci and Shuo Yang and Rahul Mangharam and Radu Grosu},
      year={2023},
      eprint={2309.10657},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}
```