Oleh Rybkin

I am a Ph.D. student in the GRASP laboratory at the University of Pennsylvania advised by Kostas Daniilidis. I am now spending the Summer of 2019 at UC Berkeley with Sergey Levine and Chelsea Finn. I am interested in deep learning, computer vision, and robotics.

Previously, I received my bachelor's degree from Czech Technical University in Prague, where I also worked on camera geometry as an undergraduate researcher advised by Tomas Pajdla. For this research, I spent time at INRIA with Josef Sivic and at TiTech with Akihiko Torii.


Google Scholar  /  GitHub  /  Email  /  CV  /  LinkedIn

News
  • Jun 2019: Three new papers presented at ICML and RSS workshops!
  • Apr 2019: New preprint on keyframe-based video prediction.
  • Mar 2019: I gave an invited talk on predictive models at Google, Mountain View (slides).
  • Feb 2019: I will be spending Spring and Summer 2019 at UC Berkeley with Sergey Levine and Chelsea Finn.
  • Dec 2018: Paper on discovering an agent's action space accepted to ICLR 2019 in New Orleans.
  • Jul 2018: I presented our work on discovering an agent's action space at the ICVSS 2018 in Sicily.
Research

I am broadly interested in designing learning algorithms with properties of human intelligence, which includes problems in artificial intelligence, machine perception, and cognitive robotics. My recent interests are in temporal representation learning through generative and predictive models. Specifically, I've been working toward making machines understand phenomena like agent motion, physics, and interesting moments in time through video prediction, and toward making use of this understanding for control. I am also interested in intrinsic motivation and in understanding human behavior through prediction.

During my bachelor's, I worked on camera geometry for structure from motion and proposed an algorithm for robust estimation of camera focal length. Check out this and my other fun projects on my GitHub page.

HEDGE: Hierarchical Event-Driven Generation
Frederik Ebert*, Karl Pertsch*, Oleh Rybkin*, Chelsea Finn, Dinesh Jayaraman, Sergey Levine
Workshop on Generative Modeling and Model-Based Reasoning for Robotics and AI at ICML, 2019
paper / poster / workshop page

We propose a hierarchical predictive model that predicts a sequence starting from high-level events and progressively fills in finer and finer details. We train the model on goal-conditioned prediction on videos of up to 80 frames (12.5 seconds).

Visual Planning with Semi-Supervised Stochastic Action Representations
Karl Schmeckpeper, David Han, Kostas Daniilidis, Oleh Rybkin
Workshop on Generative Modeling and Model-Based Reasoning for Robotics and AI at ICML, 2019
paper / poster / workshop page

We learn to infer an action representation from either motor or sensory input by using a dual variational autoencoder. By learning a dynamics model in this semi-supervised manner, we achieve both high data efficiency and strong planning performance.

Perception-Driven Curiosity with Bayesian Surprise
Bernadette Bucher, Anton Arapin, Ramanan Sekar, Feifei Duan, Marc Badger, Kostas Daniilidis, Oleh Rybkin
Workshop on Combining Learning and Reasoning at RSS, 2019
paper / poster / workshop page

We learn a latent variable model of the dynamics of image observations, and use it to construct an agent that maximizes the Bayesian surprise of future frames. The Bayesian agent can perform exploration that is more robust in stochastic environments than simpler prior prediction schemes.

KeyIn: Discovering Subgoal Structure with Keyframe-based Video Prediction
Karl Pertsch*, Oleh Rybkin*, Jingyun Yang, Kosta Derpanis, Joseph Lim, Kostas Daniilidis, Andrew Jaegle
Workshop on Task-Agnostic Reinforcement Learning at ICLR, 2019
project page & videos / arXiv / poster / slides / talk (1 minute) / workshop page

We discover keyframes in videos by learning to select frames that enable prediction of the entire sequence. We show that our method improves performance of hierarchical planning by finding meaningful keyframes in demonstration data.


Learning what you can do before doing anything
Oleh Rybkin*, Karl Pertsch*, Kosta Derpanis, Kostas Daniilidis, Andrew Jaegle
International Conference on Learning Representations (ICLR), 2019
project page & videos / paper / arXiv / poster / slides

We learn to discover an agent's action space along with a dynamics model from pure video data. After a calibration stage, the model can be used to perform model predictive control, requiring orders of magnitude fewer action-annotated videos than other methods.


Predicting the Future with Transformational States
Andrew Jaegle, Oleh Rybkin, Kosta Derpanis, Kostas Daniilidis
ArXiv, 2018
project page & videos / arXiv

The model predicts future video frames by learning to represent the present state of a system together with a high-level transformation that is used to produce its future state.


Blog

The reasonable ineffectiveness of pixel metrics for future prediction
2018

MSE loss and its variants are commonly used for training and evaluation of future prediction. But is this the right thing to do?


Note for undergraduate/master students

I am actively looking for students who are strongly motivated to work on a research project, including students who want to do a Master's thesis. Check out some of my work above and if you find it interesting, do send me an email!

Current mentees: Ramanan Sekar.
Previous mentees: Karl Schmeckpeper (Ph.D. @ Penn), Anton Arapin (M.S. @ UChicago).

website template credit