Learning Policy-Aware Models for Model-Based Reinforcement Learning via Transition Occupancy Matching

Publication
L4DC