Are Learned Perception-Based Controllers Bound by the Limits of Robust Control?


The difficulty of optimal control problems has classically been characterized in terms of system properties such as the minimum eigenvalues of controllability/observability Gramians. We revisit these characterizations in light of the growing popularity of data-driven techniques such as reinforcement learning (RL) in control settings where observations are high-dimensional images and transition dynamics are not known beforehand. Specifically, we ask: to what extent are quantifiable control and perceptual difficulty metrics of a control task predictive of the performance of various families of data-driven controllers? We modulate two different types of partial observability in a cartpole “stick-balancing” problem. First, we vary the height of a single visible fixation point on the cartpole, which tunes the fundamental limits of performance achievable by any controller. Second, we use either depth or RGB image observations of the scene, which adds different levels of perception noise without affecting system dynamics. In these settings, we empirically study two popular families of controllers: RL and system identification-based H∞ control, using visually estimated system state. Our results show that the fundamental limits of robust control have corresponding implications for the sample efficiency and performance of learned perception-based controllers.
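
To make the Gramian-based notion of difficulty concrete, the following is a minimal sketch, not the setup used in this work. It assumes a frictionless cartpole linearized about the upright position, with the pole modeled as a point mass, and a small-angle observation y = x + h·θ of a fixation point at height h along the pole. The function names, the physical parameters (M, m, l, g), the one-second horizon, and the time step are all illustrative assumptions. The sketch computes the smallest eigenvalue of a finite-horizon observability Gramian as h varies.

```python
import numpy as np


def cartpole_linearized(M=1.0, m=0.1, l=1.0, g=9.81):
    """State matrix for an assumed frictionless cartpole, linearized upright.

    State is [x, x_dot, theta, theta_dot]; the pole is a point mass m at
    distance l from the pivot, and M is the cart mass (all values assumed).
    """
    a = -m * g / M                # coupling of theta into cart acceleration
    b = (M + m) * g / (M * l)     # unstable pendulum stiffness
    return np.array([[0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, a, 0.0],
                     [0.0, 0.0, 0.0, 1.0],
                     [0.0, 0.0, b, 0.0]])


def expm_taylor(X, terms=20):
    """Matrix exponential by truncated Taylor series (adequate for small X)."""
    out = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out


def min_gramian_eig(h, horizon=1.0, dt=0.01, **params):
    """Smallest eigenvalue of a finite-horizon observability Gramian.

    The observation is y = x + h * theta (small-angle position of a point at
    height h on the pole). The Gramian integral is approximated by a Riemann
    sum over `horizon` seconds with step `dt`.
    """
    A = cartpole_linearized(**params)
    C = np.array([[1.0, 0.0, h, 0.0]])
    Ad = expm_taylor(A * dt)      # one-step state transition matrix
    W = np.zeros((4, 4))
    row = C.copy()                # holds C @ Ad**k
    for _ in range(int(round(horizon / dt))):
        W += row.T @ row * dt
        row = row @ Ad
    return float(np.linalg.eigvalsh(W)[0])  # eigvalsh sorts ascending


if __name__ == "__main__":
    for h in (0.1, 0.25, 0.5, 1.0):
        print(f"h = {h:.2f}  min eigenvalue = {min_gramian_eig(h):.3e}")
```

In this simplified model, the observation carries no information about the pole angle when h = m·l/(M + m), so the smallest eigenvalue drops to numerical zero there. A finite horizon is used because the upright equilibrium is unstable and the infinite-horizon Gramian does not exist.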