RPCBF:

Constructing Safety Filters Robust to Model Error and Disturbances via Policy Control Barrier Functions

Luzia Knoedler1*, Oswin So2*, Ji Yin3, Mitchell Black4, Zachary Serlin4, Panagiotis Tsiotras3, Javier Alonso-Mora1, Chuchu Fan2
* Both authors contributed equally to this work.
1 Delft University of Technology, 2 Massachusetts Institute of Technology, 3 Georgia Institute of Technology, 4 MIT Lincoln Labs

Robust Policy CBFs: Constructing robust CBFs from the Policy Value Function

In this work, we leverage the insight that the maximum-over-time constraint function is a CBF for any choice of rollout policy \(\pi\). The policy value function for a policy \(\pi\) is defined as

\( \displaystyle V_\infty^{h,\pi}(x) \coloneqq \sup_{t \geq 0}\, h(x_t^\pi) \),

where \(x_t^\pi\) denotes the state at time \(t\) when following \(\pi\) from \(x\), and the avoid set \( \mathcal{A} \) is described as the superlevel set of some continuous constraint function \(h\):

\( \displaystyle \mathcal{A} = \{ x \mid h(x) > 0 \} \).
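
As a short reading aid (a consequence of the two definitions above, assuming deterministic, time-invariant dynamics and policy; not quoted from the paper): because the supremum includes \(t = 0\),

\( \displaystyle h(x) \leq V_\infty^{h,\pi}(x) \),

so the zero sublevel set of the policy value function lies outside the avoid set:

\( \displaystyle \{ x \mid V_\infty^{h,\pi}(x) \leq 0 \} \subseteq \{ x \mid h(x) \leq 0 \} = \mathcal{A}^c \).

Along a trajectory of \(\pi\), the value cannot increase, since for any \(s \geq 0\)

\( \displaystyle V_\infty^{h,\pi}(x_s^\pi) = \sup_{t \geq s}\, h(x_t^\pi) \leq \sup_{t \geq 0}\, h(x_t^\pi) = V_\infty^{h,\pi}(x) \),

so a rollout under \(\pi\) that starts in this sublevel set remains in it and never enters \( \mathcal{A} \).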

\(V_\infty^{h,\pi}\) contains knowledge about the invariant set, which can be used to render a (potentially unsafe) nominal policy \(\pi_\mathrm{nom}\) safe via a safety filter framework. However, computing \(V_\infty^{h,\pi}\) over the infinite horizon is computationally intractable. Although an approximation of the policy value function \(V_\infty^{h,\pi}\) can be learned [1], doing so requires certifying the neural network as a valid CBF and limits interpretability. Furthermore, the learned approximation does not account for uncertainties in the system dynamics. We therefore present a practical approximation of Robust Policy CBFs.

We define the robust counterpart of the policy value function introduced above as

\( \displaystyle V_\infty^{h,\pi}(x) \coloneqq \sup_{t \geq 0}\, \sup_{d(\cdot)} h(x_t^\pi) \),

where the inner supremum is taken over disturbance trajectories \(d(\cdot)\). Since this formulation is computationally intractable, we propose a practical (finite-time and sampling-based) approximation:

\( \displaystyle V_{T,N}^{h,\pi}(x_0) \coloneqq \max_{i = 1,\ldots,N}\sup_{0 \leq t < T} h(x_t^i) \),

where \(x_t^i\) is the state at time \(t\) under the \(i\)-th of \(N\) sampled disturbance trajectories.

More details can be found in the sections Finite-Time Approximation (of Infinite Time) and Sampling-Based Approximation (of Worst-Case Disturbance), or in the paper.

Simulation Experiments

To assess the safety improvements brought about by the proposed RPCBF, we integrate it with Shield-MPPI [2] and evaluate the performance on AutoRally.

[Animation: MPPI]

[Animation: Shield-MPPI-RPCBF]

Hardware Experiments

We conduct hardware experiments on the Crazyflie platform to test whether the proposed RPCBF is robust to disturbances encountered in the real world. The error between a simple double integrator model and the true dynamics is treated as an acceleration disturbance. We randomly generate a (possibly unsafe) nominal trajectory that may intersect the cylindrical obstacle; this trajectory is tracked using the onboard PID controller. The controller runs at 100 Hz.
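
To make the idea of treating model error as an acceleration disturbance concrete, the sketch below computes the acceleration that a double integrator model fails to explain in logged data. It is only an illustration under stated assumptions: the function names, the finite-difference estimate, and the log format are hypothetical and are not taken from the paper or from the Crazyflie software. Only the 100 Hz rate and the double integrator model come from the text above.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
DT = 0.01  # 100 Hz control rate, as stated in the text


def acceleration_residuals(
    velocities: Sequence[Vec3], commands: Sequence[Vec3], dt: float = DT
) -> List[Vec3]:
    """Per-step mismatch d_k = (v_{k+1} - v_k) / dt - u_k.

    Under the double integrator model v' = u + d, this is the acceleration
    the model does not explain (hypothetical finite-difference estimate).
    """
    residuals: List[Vec3] = []
    for v_now, v_next, u in zip(velocities, velocities[1:], commands):
        d = tuple((vn - vc) / dt - uc for vc, vn, uc in zip(v_now, v_next, u))
        residuals.append(d)  # type: ignore[arg-type]
    return residuals


def disturbance_bound(residuals: Sequence[Vec3]) -> float:
    """Largest absolute residual over all steps and axes."""
    return max(abs(c) for r in residuals for c in r)


if __name__ == "__main__":
    # Made-up log: three velocity samples and the two commands between them.
    logged_v = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (0.03, 0.0, -0.01)]
    logged_u = [(1.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    res = acceleration_residuals(logged_v, logged_u)
    print(res, disturbance_bound(res))
```

A bound obtained this way could serve as the range from which disturbance trajectories are sampled, although how the disturbance set is actually chosen in the experiments is not specified here.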

PCBF Safety Filter enters Obstacle

RPCBF Safety Filter is safe

Finite-Time Approximation (of Infinite Time)

We introduce an approximation by considering a finite horizon \(T\):

\( \displaystyle V_\infty^{h,\pi}(x_0) = \max \{ \sup_{0 \leq t < T}\, h(x_t^\pi) , V_\infty^{h,\pi}(x_T)\} \approx \underbrace{\sup_{0\leq t < T} h(x_t^\pi)}_{\coloneqq V^{h,\pi}_T(x_0) }\).
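
The finite-horizon value can be sketched numerically by rolling out the policy and keeping the running maximum of \(h\). The example below is a minimal illustration, not the paper's implementation: the 1-D double integrator, the explicit-Euler discretization, the braking policy, the wall constraint, and all function names are assumptions chosen for brevity. Because the approximation drops the tail term \(V_\infty^{h,\pi}(x_T)\) from the maximum, the finite-horizon value can only underestimate the infinite-horizon one, which the two horizons in the example make visible.

```python
from typing import Callable, Tuple

State = Tuple[float, float]  # (position, velocity) of a hypothetical 1-D double integrator


def step(x: State, u: float, dt: float) -> State:
    """One explicit-Euler step of p' = v, v' = u (assumed toy dynamics)."""
    p, v = x
    return (p + v * dt, v + u * dt)


def make_braking_policy(dt: float, u_max: float = 1.0) -> Callable[[State], float]:
    """Hypothetical rollout policy: brake as hard as the input limit allows."""
    def policy(x: State) -> float:
        # Cancel the velocity in one step if possible, otherwise saturate.
        return max(-u_max, min(u_max, -x[1] / dt))
    return policy


def h(x: State, wall: float = 1.0) -> float:
    """Assumed constraint: h > 0 exactly when the position is past the wall."""
    return x[0] - wall


def finite_horizon_value(
    x0: State,
    policy: Callable[[State], float],
    constraint: Callable[[State], float],
    dt: float,
    num_steps: int,
) -> float:
    """Maximum of `constraint` over the first `num_steps` states of the rollout."""
    x = x0
    value = constraint(x)
    for _ in range(num_steps - 1):
        x = step(x, policy(x), dt)
        value = max(value, constraint(x))
    return value


if __name__ == "__main__":
    dt = 0.01
    pi = make_braking_policy(dt)
    # A horizon that ends before the vehicle has stopped gives a smaller value
    # than a horizon that covers the whole braking maneuver.
    for num_steps in (50, 200):
        print(num_steps, finite_horizon_value((0.0, 1.0), pi, h, dt, num_steps))
```

In this toy setting the longer horizon covers the full braking maneuver, so extending it further does not change the value; in general the horizon has to be long enough for the rollout policy to bring the system to a state from which the tail term no longer matters.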

Sampling-Based Approximation (of Worst-Case Disturbance)

We further introduce a sampling-based approximation of the worst-case disturbance over the finite horizon. We take the worst case among \( N \) sampled disturbance trajectories:

\( \displaystyle V_{T,N}^{h,\pi}(x_0) \coloneqq \max_{i = 1,\ldots,N}\sup_{0 \leq t < T} h(x_t^i) \).
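
The sampled maximum can be sketched as follows. This is a minimal illustration under assumptions that are not from the paper: a 1-D double integrator with the disturbance entering as an acceleration, explicit-Euler steps, a braking rollout policy, a wall constraint, and disturbances drawn independently and uniformly from a hypothetical bound `d_max` at every step. All function names are made up for the example.

```python
import random
from typing import List, Tuple

State = Tuple[float, float]  # (position, velocity), hypothetical 1-D double integrator


def step(x: State, u: float, d: float, dt: float) -> State:
    """Explicit-Euler step with the disturbance d entering as an acceleration."""
    p, v = x
    return (p + v * dt, v + (u + d) * dt)


def braking_policy(x: State, dt: float, u_max: float = 1.0) -> float:
    """Hypothetical rollout policy: cancel the velocity, saturated at u_max."""
    return max(-u_max, min(u_max, -x[1] / dt))


def h(x: State, wall: float = 1.0) -> float:
    """Assumed constraint: h > 0 exactly when the position is past the wall."""
    return x[0] - wall


def rollout_max_h(x0: State, disturbances: List[float], dt: float) -> float:
    """Max of h over x0 and the states reached by applying each disturbance in turn."""
    x = x0
    value = h(x)
    for d in disturbances:
        x = step(x, braking_policy(x, dt), d, dt)
        value = max(value, h(x))
    return value


def sampled_robust_value(
    x0: State,
    dt: float,
    num_steps: int,
    num_samples: int,
    d_max: float,
    rng: random.Random,
) -> float:
    """Worst rollout value among `num_samples` random disturbance trajectories."""
    worst = float("-inf")
    for _ in range(num_samples):
        disturbances = [rng.uniform(-d_max, d_max) for _ in range(num_steps)]
        worst = max(worst, rollout_max_h(x0, disturbances, dt))
    return worst


if __name__ == "__main__":
    x0 = (0.0, 1.0)
    for n in (1, 32):
        print(n, sampled_robust_value(x0, 0.01, 200, n, 0.5, random.Random(0)))
```

Since the maximum is taken over a finite set of sampled trajectories, the estimate is a lower bound on the supremum over all admissible disturbances, and adding samples to an existing set can only raise it.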

Supplementary Video

Related Works

  1. Oswin So, Zachary Serlin, Makai Mann, Jake Gonzales, Kwesi Rutledge, Nicholas Roy, and Chuchu Fan, "How to Train Your Neural Control Barrier Function: Learning Safety Filters for Complex Input-Constrained Systems", IEEE International Conference on Robotics and Automation (ICRA), 2024
  2. Ji Yin, Charles Dawson, Chuchu Fan, and Panagiotis Tsiotras, "Shield model predictive path integral: A computationally efficient robust MPC method using control barrier functions", IEEE Robotics and Automation Letters, 2023

Abstract

Control Barrier Functions (CBFs) have proven to be an effective tool for performing safe control synthesis for nonlinear systems. However, guaranteeing safety in the presence of disturbances and input constraints for high relative degree systems is a difficult problem. In this work, we propose the Robust Policy CBF (RPCBF), a practical method of constructing CBF approximations that is easy to implement and robust to disturbances via the estimation of a value function. We demonstrate the effectiveness of our method in simulation on a variety of high relative degree input-constrained systems. Finally, we demonstrate the benefits of RPCBF in compensating for model errors on a hardware quadcopter platform by treating the model errors as disturbances.