Oleh Rybkin

I am a Ph.D. student in the GRASP laboratory at the University of Pennsylvania advised by Kostas Daniilidis. I am interested in deep learning, computer vision, and robotics. Most of my recent work concerns deep predictive models of videos.

I received my bachelor's degree from Czech Technical University in Prague, where I worked with Tomas Pajdla. I've spent time at INRIA with Josef Sivic, TiTech with Akihiko Torii, and UC Berkeley with Sergey Levine and Chelsea Finn.


Google Scholar  /  GitHub  /  Email  /  CV  /  LinkedIn

  • Dec 2019: A preprint on learning predictive models from observation and interaction is out.
  • Jun 2019: Three new papers presented at ICML and RSS workshops!
  • Apr 2019: New preprint on keyframe-based video prediction.
  • Mar 2019: I gave an invited talk on predictive models at Google, Mountain View (slides).
  • Feb 2019: I will be spending Spring and Summer 2019 at UC Berkeley with Sergey Levine and Chelsea Finn.
  • Dec 2018: Paper on discovering an agent's action space accepted to ICLR 2019 in New Orleans.
  • Jul 2018: I presented our work on discovering an agent's action space at the ICVSS 2018 in Sicily.

I am interested in building agents that can predict the future and use this predictive capability to act in the world. I believe that vision as a sensing modality is crucial for making such agents general-purpose, and that testing these algorithms on real robotic systems is one of the surest ways to make progress toward intelligence. My recent work in this area involves machines trying to understand agent motion, physics, interesting moments in time, and human behavior, as well as intrinsically motivated machines.

During my bachelor's, I worked on camera geometry for structure from motion and proposed an algorithm for robust estimation of camera focal length. Check out this and my other fun projects on my GitHub page.

Learning Predictive Models From Observation and Interaction
Karl Schmeckpeper, Annie Xie, Oleh Rybkin, Stephen Tian, Kostas Daniilidis, Sergey Levine, Chelsea Finn
arXiv preprint, 2019
Workshop on Generative Modeling and Model-Based Reasoning for Robotics and AI at ICML, 2019
project page & videos / arXiv / workshop version

We learn action representations that generalize between robot data and passive observations of other agents (e.g. humans). This enables the use of additional, diverse sources of data to train models for robotic control.


HEDGE: Hierarchical Event-Driven Generation
Frederik Ebert*, Karl Pertsch*, Oleh Rybkin*, Chelsea Finn, Dinesh Jayaraman, Sergey Levine
Workshop on Generative Modeling and Model-Based Reasoning for Robotics and AI at ICML, 2019
paper / poster / workshop page

We propose a hierarchical predictive model that predicts a sequence starting from high-level events and progressively fills in finer and finer details. We train the model on goal-conditioned prediction on videos of up to 80 frames (12.5 seconds).

KeyIn: Discovering Subgoal Structure with Keyframe-based Video Prediction
Karl Pertsch*, Oleh Rybkin*, Jingyun Yang, Kosta Derpanis, Joseph Lim, Kostas Daniilidis, Andrew Jaegle
Workshop on Task-Agnostic Reinforcement Learning at ICLR, 2019
project page & videos / arXiv / poster / slides / talk (1 minute) / workshop page

We discover keyframes in videos by learning to select frames that enable prediction of the entire sequence. By using the keyframe structure of the data for prediction, our method is further able to perform planning for longer horizons.


Learning what you can do before doing anything
Oleh Rybkin*, Karl Pertsch*, Kosta Derpanis, Kostas Daniilidis, Andrew Jaegle
International Conference on Learning Representations (ICLR), 2019
project page & videos / paper / arXiv / poster / slides

We learn to discover an agent's action space along with a dynamics model from pure video data. The model can be used for model predictive control, requiring orders of magnitude fewer action-annotated videos than other methods.



The reasonable ineffectiveness of pixel metrics for future prediction

MSE loss and its variants are commonly used for training and evaluation of future prediction. But is this the right thing to do?
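One pathology is easy to demonstrate with a toy example (a hypothetical sketch, not from the post): under MSE, a blurry prediction that hedges between plausible futures can score better than a sharp prediction whose content is slightly misplaced. Here frames are simplified to 1-D pixel arrays.

```python
import numpy as np

# Toy 1-D "frame": the ground truth has a bright pixel at position 3.
ground_truth = np.zeros(8)
ground_truth[3] = 1.0

# A sharp prediction whose object is off by one pixel...
sharp_shifted = np.zeros(8)
sharp_shifted[4] = 1.0

# ...versus a blurry prediction that hedges over both positions.
blurry = np.zeros(8)
blurry[3] = 0.5
blurry[4] = 0.5

def mse(a, b):
    """Mean squared error between two frames."""
    return float(np.mean((a - b) ** 2))

print(mse(ground_truth, sharp_shifted))  # 0.25
print(mse(ground_truth, blurry))         # 0.0625 -- the blur "wins"
```

The blurry frame achieves a quarter of the sharp frame's error despite looking less like any real future, which is why pixel-level metrics can reward averaging over outcomes rather than committing to one.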


Science reading list

The Structure of Scientific Revolutions, Thomas S. Kuhn.
Vision, David C. Marr.

Computing Machinery and Intelligence, Alan M. Turing.
The importance of stupidity in scientific research, Martin A. Schwartz.
As we may think, Vannevar Bush.

Note for undergraduate/master students

I am actively looking for students who are strongly motivated to work on a research project, including students who want to do a Master's thesis. Check out some of my work above, and if you find it interesting, do send me an email!

Current mentees: Ramanan Sekar, Shenghao Zhou.
Previous mentees: Karl Schmeckpeper (PhD @ Penn), Anton Arapin (MS @ UChicago).

website template credit