### Data Predictive Control: Bridging Machine Learning and Controls for Cyber-Physical Systems

Machine learning and control theory are two foundational but largely disjoint fields. Machine learning requires data to produce models, and control systems require models to provide stability and performance guarantees for plant operation. Machine learning is widely used for regression and classification, but so far data-driven models have not been suitable for closed-loop control of physical plants. The challenge for data-driven approaches is now to close the loop for real-time control and decision making.

We present novel data-driven approaches to synthesize control-oriented models that bridge machine learning and controls. These approaches apply to many areas, such as building control, process control and autonomous systems; our current focus is on their application to volatile energy markets.

Essentially, all models are wrong, but some are useful. - George E. P. Box

Our algorithms generate predictive models using Regression Trees, Random Forests and Gaussian Processes for finite-time receding horizon control. With these models we can not only predict the state of the building, but also generate control strategies using only historical weather, schedule, set-point and electricity consumption data. We call this approach Data Predictive Control (DPC). We have shown that, for a realistic building model, the control strategies generated by DPC are remarkably similar to those of Model Predictive Control (MPC), while DPC, unlike MPC, remains scalable.

Check out my talk at Microsoft Research Redmond to learn more.

### Data Predictive Control using Random Forests

The central idea behind DPC is to obtain control-oriented models using machine learning or black-box modeling, and formulate the control problem in a way that receding horizon control (RHC) can still be applied and the optimization problem can be solved efficiently.
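To make the receding horizon idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `model` is a hypothetical scalar predictor standing in for a learned one, the input set is a small discrete grid, and the planner simply enumerates input sequences rather than solving the efficient optimization problem that DPC aims for. It only shows the loop structure: plan over the horizon, apply the first input, then re-plan.

```python
from itertools import product

def model(x, u, d):
    """Hypothetical one-step predictor x_next = f(x, u, d), not a learned model."""
    return 0.8 * x + 0.5 * u + 0.2 * d

def plan(x0, forecast, x_ref, inputs=(0.0, 1.0, 2.0), weight=0.01):
    """Enumerate every input sequence over the horizon and return the cheapest."""
    best_cost, best_seq = float("inf"), None
    for seq in product(inputs, repeat=len(forecast)):
        x, cost = x0, 0.0
        for u, d in zip(seq, forecast):
            x = model(x, u, d)
            # Tracking error plus a small penalty on control effort.
            cost += (x - x_ref) ** 2 + weight * u ** 2
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq

def receding_horizon(x0, disturbances, x_ref, horizon=3):
    """Re-plan at every step and apply only the first planned input."""
    x, applied, states = x0, [], []
    for k in range(len(disturbances) - horizon + 1):
        seq = plan(x, disturbances[k:k + horizon], x_ref)
        u = seq[0]
        # Here the "plant" is the same hypothetical model used for planning.
        x = model(x, u, disturbances[k])
        applied.append(u)
        states.append(x)
    return applied, states

applied, states = receding_horizon(0.0, [1.0] * 6, x_ref=5.0)
print(applied, states)
```

Exhaustive enumeration grows exponentially with the horizon, which is one reason a formulation that keeps the optimization problem efficient matters.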

Consider a black-box model of a dynamical system given by $x_{k+1}=f(x_k,u_k,d_k)$, where $x$, $u$ and $d$ represent states, inputs and disturbances, respectively. Depending on the learning algorithm, $f$ is typically nonlinear, nonconvex and sometimes nondifferentiable (as is the case with regression trees and random forests), with no closed-form expression. Such functional representations learned through black-box modeling may not be directly suitable for control and optimization: the optimization problem can be computationally intractable, or, because of the nondifferentiabilities, we may have to settle for a sub-optimal solution found with evolutionary algorithms. These problems can be eliminated by decomposing $f(x_k,u_k,d_k)=g(d_k,x_k,h(u_k))$, where both $g$ and $h$ are learned from data and $h(u_k)$ is convex and differentiable, and thus suitable for optimization. The DPC algorithm with Random Forests (below) exploits this functional decomposition, or separation of variables, to overcome the aforementioned challenges with black-box optimization. More info available here.
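One possible reading of this decomposition can be sketched in a few lines of Python. The sketch is an illustration under stated assumptions, not the DPC algorithm itself: it uses a single tree split (a stump) instead of a random forest, splits only on a scalar disturbance $d$, fits a line in a scalar input $u$ inside each leaf, and ignores the state. The function names, the synthetic data and the quadratic cost are all hypothetical. Because the leaf model is linear in $u$, choosing the input reduces to a convex problem with a closed-form solution.

```python
def fit_line(us, ys):
    """Least-squares fit y ~ a + b*u for scalar u; returns (a, b)."""
    n = len(us)
    mu, my = sum(us) / n, sum(ys) / n
    var = sum((u - mu) ** 2 for u in us)
    cov = sum((u - mu) * (y - my) for u, y in zip(us, ys))
    b = cov / var if var > 0 else 0.0
    return my - b * mu, b

def sse(us, ys, a, b):
    """Sum of squared errors of the line a + b*u on the given samples."""
    return sum((y - (a + b * u)) ** 2 for u, y in zip(us, ys))

def fit_stump(data, min_leaf=2):
    """data: list of (d, u, y). Split on d only, then fit a line in u per leaf."""
    best = None
    ds = sorted({d for d, _, _ in data})
    for lo, hi in zip(ds, ds[1:]):
        thr = 0.5 * (lo + hi)
        left = [(u, y) for d, u, y in data if d < thr]
        right = [(u, y) for d, u, y in data if d >= thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        models = [fit_line(*zip(*part)) for part in (left, right)]
        err = sum(sse(*zip(*part), *m) for part, m in zip((left, right), models))
        if best is None or err < best[0]:
            best = (err, thr, models)
    _, thr, models = best
    return thr, models

def best_input(model, d, y_ref, lam=0.0, u_min=0.0, u_max=2.0):
    """Minimize (a + b*u - y_ref)^2 + lam*u^2 over [u_min, u_max] in closed form."""
    thr, models = model
    a, b = models[0] if d < thr else models[1]  # the disturbance selects the leaf
    denom = b * b + lam
    u = b * (y_ref - a) / denom if denom > 0 else u_min
    # For a one-dimensional convex cost, clipping to the bounds is optimal.
    return min(max(u, u_min), u_max)

# Synthetic, noise-free data with two disturbance regimes (an assumption).
data = [(d, u, (1 + 2 * u) if d < 0.5 else (3 + 0.5 * u))
        for d in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0) for u in (0.0, 1.0, 2.0)]
model = fit_stump(data)
print(best_input(model, 0.2, y_ref=4.0))  # 1.5
print(best_input(model, 0.8, y_ref=4.0))  # 2.0 (clipped at u_max)
```

The design choice to note is that the nondifferentiable part (the split) only ever sees variables that are known at decision time, so the optimizer never has to search across it.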

### Interactive Analytics

Today’s energy dashboards are static: they only process historic data or provide analyses that are baked in. For example, they are used for simple data analytics, monitoring, visualization or anomaly detection. These analyses are formulated using recommended guidelines, experience and best practices. These shortcomings of energy dashboard systems call for Interactive Analytics using Data Predictive Control.

Interactive Analytics or IAX is an energy analytics engine that learns from past building usage patterns to answer queries about prediction and control set point recommendations. It uses Amazon Alexa, a cloud service which allows for natural language interaction, to procedurally generate dashboards in response to user queries. Using Data Predictive Control at the backend, it can not only predict the state of the building but also generate control strategies using only historical weather, schedule, set-points and electricity consumption data.

You can think of IAX as a Siri for querying buildings’ energy usage. It allows a user to ask queries such as “Describe the conditions for Levine Hall when power consumption is more than 1.4 MW?”, “What is the predicted consumption for tomorrow?”, “What will happen if I change the cooling temperature to 27 $^\circ$C and the chilled water temperature to 8 $^\circ$C?”, “Give me some set point options to choose from for Levine Hall for tomorrow?” and “Suggest a strategy to curtail power consumption to 1.1 MW”. The answers to these questions, or the recommendations, are provided visually, textually and orally. IAX provides an easy way to increase financial rewards and reduce participation risk in Demand Response programs. It predicts power consumption and generates optimal curtailment strategies with confidence.

### Local Interpretability

Beyond the accuracy of control strategies synthesized from a black-box model, building operators are interested in solutions that are also interpretable and trustworthy. Thus, the DPC recommendations should be traceable so that they can be verified to be stable and safe.

There is no true interpretation of anything; interpretation is a vehicle in the service of human comprehension. - Andreas Buja

There is always a trade-off between the accuracy and the interpretability of black-box models trained using machine learning. For example, random forests and neural networks, however accurate, cannot explain why a particular prediction should be trusted. On the other hand, decision trees are highly interpretable because of the structure of the algorithm, but they tend to overfit very easily and thus result in poor accuracy. The goal here is to investigate the possibility of explaining predictions from any training method, such as random forests and neural networks.
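As a rough illustration of what a local, model-agnostic explanation can look like, the Python sketch below estimates how sensitive a single prediction is to each input by central finite differences. This is a deliberately simple stand-in, not the method investigated here: `black_box`, its coefficients, the feature names and the step sizes are all hypothetical, and the only thing the explainer needs from the model is a `predict`-style callable.

```python
def local_sensitivities(predict, x, steps):
    """Central-difference slope of predict with respect to each feature around x."""
    out = {}
    for name, h in steps.items():
        up = dict(x, **{name: x[name] + h})
        down = dict(x, **{name: x[name] - h})
        out[name] = (predict(up) - predict(down)) / (2 * h)
    return out

def black_box(x):
    """Hypothetical stand-in for a trained model: power (MW) from two features."""
    return 0.5 + 0.03 * x["outside_temp"] + 0.002 * (30 - x["setpoint"]) ** 2

point = {"outside_temp": 30.0, "setpoint": 24.0}
steps = {"outside_temp": 0.5, "setpoint": 0.5}
sens = local_sensitivities(black_box, point, steps)
for name, slope in sens.items():
    print(f"{name}: {slope:+.3f} MW per unit change")
```

For this point, the output says that power rises with outside temperature and falls as the set point is raised. With piecewise-constant models such as random forests, very small steps would report zero sensitivity almost everywhere, so the step sizes have to be chosen on the scale at which the features actually vary.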