
CIS 620 - Learning in Few-Labels Settings

Spring 2021, University of Pennsylvania

Dan Roth

Course Description

Machine Learning works when we have a lot of labeled data. However, in many realistic settings we do not have enough training data. In most cases this is due to semantic shift (a shift in the label space Y) or domain shift (where the target domain X differs from the domain for which we have training data), but it can also be due to the complexity and compositionality of the task. Some examples of these settings are:

And, of course, similar challenges exist in computer vision and other subareas of AI.

The goal of this class is to define and understand the space of Learning in Low Labels Settings: to understand the problems and the methods that have been studied for these settings. We will do this mostly in the context of natural language understanding, with possible digressions to computer vision.

We will consider methods such as

We will do so in the context of multiple tasks.

You will read, present, and discuss papers, and work on two projects: a small, well-defined one in the first third of the semester, and a large, open-ended one in the rest of the semester.

Important Dates

Date Event
Feb 15, 2021 First Critical Survey Due
Mar 8, 2021 Second Critical Survey Due
Mar 15, 2021 Project 1 Paper Submission Deadline and Presentation
Mar 22, 2021 Project 2 Proposal Due
Mar 29, 2021 Third Critical Survey Due
Apr 5, 2021 Project 2 Progress Report and Brief Presentation
Apr 19, 2021 Fourth Critical Survey Due
Apr 26, 2021 Project 2 Final Presentation
May 5, 2021 Project 2 Due

Pre-requisites

Machine Learning: a Machine Learning class (CIS 419/519/520 or equivalent). NLP: knowledge of NLP (equivalent to a basic Computational Linguistics/NLP class).

Time and Location

Lectures

Mon 3PM-6PM
Synchronously via Zoom

Office Hours

Mon 6PM-7PM