Oleh Rybkin

I am a second-year Ph.D. student in the GRASP laboratory at the University of Pennsylvania, where I work on deep learning and computer vision with Kostas Daniilidis.

Previously, I received my bachelor's degree from the Czech Technical University in Prague, where I also worked as an undergraduate researcher advised by Tomas Pajdla. As part of this research, I spent two summers abroad, at INRIA with Josef Sivic and at TiTech with Akihiko Torii.

My name is best pronounced as "Oleg", and I prefer being called that in less formal settings.

Google Scholar  /  GitHub  /  Email  /  CV  /  LinkedIn

  • May 2019: We will be presenting a poster at ICLR 2019 in New Orleans.
  • Dec 2018: We presented our work at the Infer2Control workshop at NeurIPS 2018 in Montreal.
  • Jul 2018: I presented our work at ICVSS 2018 in Sicily.
  • Jun 2018: We presented our work at the LAIR workshop at RSS 2018 in Pittsburgh.

My general interest is in creating neural network models that advance our computational understanding of cognition, a broad goal that encompasses artificial intelligence, machine perception, and cognitive robotics. Recently, I have been working on making machines understand motion and intuitive physics through video prediction. I am also exploring several ideas related to intrinsic curiosity and meta-learning.

During my bachelor's, I worked on camera geometry for structure from motion. Check out this and my other fun projects on my GitHub page.

Learning what you can do before doing anything
Oleh Rybkin*, Karl Pertsch*, Kosta Derpanis, Kostas Daniilidis, Andrew Jaegle
International Conference on Learning Representations (ICLR), 2019
project page & videos / paper / arXiv / poster

The method learns the action space of a robot, together with a predictive model, from pure video data. It can be used for model predictive control and requires orders of magnitude fewer videos annotated with actions.


Predicting the Future with Transformational States
Andrew Jaegle, Oleh Rybkin, Kosta Derpanis, Kostas Daniilidis
arXiv, 2018
project page & videos / arXiv

The model predicts future video frames by learning to represent the present state of a system together with a high-level transformation that is used to produce its future state.



The reasonable ineffectiveness of pixel metrics for future prediction

MSE loss and its variants are commonly used for training and evaluation of future prediction. But is this the right thing to do?
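One way to see the problem (a toy sketch of my own, not code from this project): when the future is uncertain, MSE rewards a model that averages over possible outcomes, producing a blurry prediction that scores better than any single sharp, plausible future. The tiny example below uses 1-D "frames" with two equally likely outcomes:

```python
import numpy as np

# Two equally likely future frames (1-D "images" for simplicity):
# an object ends up either on the left or on the right.
outcome_a = np.zeros(10); outcome_a[2] = 1.0   # object moves left
outcome_b = np.zeros(10); outcome_b[7] = 1.0   # object moves right

sharp_guess = outcome_a                        # commit to one plausible future
blurry_guess = 0.5 * (outcome_a + outcome_b)   # average over both futures

def mse(pred, target):
    return np.mean((pred - target) ** 2)

# Expected MSE over the two equally likely outcomes:
sharp_cost = 0.5 * (mse(sharp_guess, outcome_a) + mse(sharp_guess, outcome_b))
blurry_cost = 0.5 * (mse(blurry_guess, outcome_a) + mse(blurry_guess, outcome_b))

print(sharp_cost, blurry_cost)  # 0.1 vs 0.05: the blurry average wins under MSE
```

Under expected MSE, the implausible blurry average beats the sharp, physically plausible prediction, which is exactly why pixel metrics can be misleading for evaluating future prediction.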

