Project Ideas - CIS 400/401
IDEA: Remotely operated Amateur Radio station

PROFESSOR: Jonathan M. Smith (Email: jms at cis)

DESCRIPTION: Amateur (HAM) Radio has a long history at Penn and we are reestablishing a facility in the Engineering School, which is ultimately intended to be operable via the Internet. This will involve erecting a new antenna, connecting a radio to it, controlling the radio with a PC, and providing authenticated remote operation of the transceiver. There are many aspects to working on HAM-related projects, ranging from learning about radio communications systems and using and programming software radios, to multiuser management and network security.

IDEA: Prototyping Quantitative Trust Management (QTM) systems

PROFESSOR: Jonathan M. Smith (Email: jms at cis)

DESCRIPTION: Trust management is an approach to authorization which allows one party in a system to authorize another party to carry out some action. In a distributed system, cryptography is used to provide digital signatures which can be used to validate the rights of the authorizer. Powerful as trust management is, it presumes authorizers can be trusted, among other assumptions. It may not fully account for complex circumstances.

Reputation-based trust management, on the other hand, incorporates an element of risk analysis, based on actors and their actions. Quantitative trust management combines features from policy-based trust management and reputation-based trust management so that complex and flexible policies can be interpreted dynamically, with context taken into account through the risk analysis.

The essence of this project is to extend the QuanTM system being implemented at Penn and to apply it to some networked applications. This project may involve considerable interaction with graduate students.

IDEA: Learning templates from radiology reports

PROFESSOR: Lyle Ungar (Email: ungar at cis)
Co-advisor: Prof. Curtis Langlotz (Radiology, School of Medicine)

DESCRIPTION: Health care providers often dictate their reports using templates that contain slots that can be filled with a variety of different procedures, measurements or findings. These templates are currently challenging to write. We are developing a sequence alignment method that uses dynamic programming to efficiently extract templates that are common across sets of reports. 

This project will involve developing specialized sequence alignment code (in Python) for this problem, scaling it up to handle the two million radiology reports we have, and analyzing the extracted templates and the frequencies of the contents of their slots. We collaborate closely with the Medical School to ensure that our results are clinically relevant.
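As a rough illustration of the idea, the sketch below aligns two reports token by token with a standard global alignment (Needleman-Wunsch) and turns every position where they disagree into a slot. It is not the project's actual method: the function names, the scoring values, the `<SLOT>` marker, and the sample sentences are all assumptions made for this example.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two token lists and return the aligned token pairs.

    A pair is (x, y); None marks a gap. Scoring values are illustrative.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def extract_template(report_a, report_b, slot="<SLOT>"):
    """Keep tokens shared by both reports; collapse differences into slots."""
    template = []
    for x, y in align(report_a.split(), report_b.split()):
        token = x if x == y else slot
        if token == slot and template and template[-1] == slot:
            continue  # merge adjacent slots into one
        template.append(token)
    return " ".join(template)


# Made-up example sentences, not real report text.
print(extract_template(
    "the liver measures 14 cm in length",
    "the spleen measures 11 cm in length",
))
```

Pairwise alignment like this costs time proportional to the product of the two report lengths, so scaling to millions of reports would need something beyond the all-pairs approach shown here.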

IDEA: Learning "state models" of language

PROFESSOR: Lyle Ungar (Email: ungar at cis)
Co-advisor: Prof. Dean Foster (Statistics)

This project involves building statistical models to predict a variety of properties of the words in Wikipedia (and other corpora), including what entity type (e.g., person, place, organization ...) the words are, what they link to, and what part of speech they are. We do this using an alternative to Hidden Markov Models (HMMs) based on canonical correlation analysis (CCA), a generalization of Principal Component Analysis (PCA). Students should bring an interest in machine learning or natural language processing.

IDEA: Use of social networking information for target marketing

PROFESSOR: Lyle Ungar (Email: ungar at cis)
Co-advisor: Prof. Shawndra Hill (OPIM, Wharton)

DESCRIPTION: People who are linked in social networks such as Facebook or LinkedIn tend to be more similar to each other than random people (they are "homophilous"), and tend to have more similar purchase patterns. We would like to better understand how social networks can be used to predict purchase patterns and hence to improve targeted marketing.

In this work we will test several hypotheses:
(1) Product categories differ in how strongly users who are "close" in the social network share product preferences.
(1a) Predictions for product categories that are more strongly segmented will benefit more from use of social networks.
(2) Demographics (age, gender, geographic location) can be predicted using social networks. Using these predicted demographics can improve accuracy of predicting purchase patterns, but will not capture the full effect of social network links.
(3) Different kinds of links exist, such as close friends, colleagues, and family. Different link types offer differing benefits for predicting product preference. These link types can either be explicit (e.g. Facebook vs. LinkedIn) or implicit, and found by analyzing the structure of the social graph.
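
One simple way to start probing hypothesis (1) is to compare, for each product category, how often two linked users both bought from it against how often two arbitrary users did. The sketch below computes that ratio on toy data. The function names, the "lift" measure, and the data are assumptions made for this example, not the project's actual methodology.

```python
from itertools import combinations


def both_bought_rate(pairs, purchases, category):
    """Fraction of user pairs in which both users bought from `category`."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(
        1 for u, v in pairs
        if category in purchases[u] and category in purchases[v]
    )
    return hits / len(pairs)


def category_lift(edges, purchases, category):
    """Co-purchase rate among linked pairs divided by the rate among all pairs.

    Values above 1 suggest that linked users share this preference more often
    than arbitrary users do. Returns None when no pair at all shares it.
    """
    linked = both_bought_rate(edges, purchases, category)
    baseline = both_bought_rate(
        combinations(sorted(purchases), 2), purchases, category
    )
    return linked / baseline if baseline else None


# Toy data (hypothetical users, links, and purchase categories).
purchases = {
    "ann": {"books"},
    "bob": {"books", "music"},
    "cy": {"music"},
    "dee": {"garden"},
}
edges = [("ann", "bob"), ("bob", "cy")]

for category in ("books", "music", "garden"):
    print(category, category_lift(edges, purchases, category))
```

Comparing this ratio across categories gives a first, crude reading of which categories are more "socially" driven. A real study would also need to control for demographics, as hypothesis (2) suggests.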

IDEA: A Universal Synchronization Substrate for iPhone Applications

PROFESSOR: Benjamin Pierce (Email: bcpierce at cis)

DESCRIPTION: There are scores of iPhone applications that perform various sorts of synchronization, storing files or data objects both on the local device and on a desktop machine or a server somewhere in the cloud.  Unfortunately, the implementations of these services are mostly quite clunky -- in most, synchronization must be initiated manually instead of being automatically performed as needed, there are various complicated setup procedures, etc.  

The goals of this project are (a) to design a really smooth, easy to use synchronization service for sharing data between iPhones and other machines and (b) to package it as a library that can be used by other iPhone developers to build a new generation of synchronizable apps.

This is a challenging project, requiring picking up a good deal of background in iPhone development, distributed programming, and synchronization algorithms -- definitely not for the faint-hearted.  There is plenty of meat for a two-person team.

IDEA: A Lightweight Distributed Filesystem

PROFESSOR: Benjamin Pierce (Email: bcpierce at cis)

DESCRIPTION: Despite decades of research in advanced filesystems, replicating files across multiple hosts on the Internet remains an esoteric practice.  The purpose of this project is to adapt the Unison file synchronizer (Google "unison") to make it behave more like a true distributed filesystem.  Specific ideas for extensions include: (1) making Unison synchronize filesystem changes continuously, rather than just when asked to, (2) making Unison work with "passive" servers such as WebDAV, and (3) extending Unison's internal archive data structure to permit synchronization of more than two hosts at a time.

This is a difficult project: Unison is written in OCaml (for software engineering, speed, and portability), and its internals are fairly complex.  Moreover, since it is widely used, there is a heavy emphasis on stability, code quality, and testing for any changes that are going back into the main distribution.  The project is probably most appropriate for a two-person team.

In return for these challenges, the project offers an opportunity to get your hands dirty with a real distributed software system, and potentially to improve the lives of a large user community.

IDEA: Interested in robots, cameras, control systems, and GPUs?

PROFESSOR:  Daniel E. Koditschek
Galen Clark Haynes, Postdoc (Email:  gchaynes at seas)
Berkay Deniz Ilhan, Ph.D. Candidate

We are proposing a Senior Design project whose goal is to integrate lots of tiny camera modules onto an existing robot, RHex. Project goals include the design and integration of several embedded systems, software development of device drivers and high-level programming, as well as the use of computer vision techniques to control the mobile robot, potentially making use of techniques such as CUDA to access the power of mobile GPUs.

The full project proposal is located here:

Two talented and highly motivated students (EE or CIS) are sought to join an existing group of EE students.

IDEA: Auto-Plug: Open Automotive Architecture for Plug-n-Play Services

PROFESSOR:  Rahul Mangharam (Email:  rahulm at seas)
Willy Bernal, Ph.D. Candidate (Email:  willyg at seas)

This is an open system and network architecture for Plug-n-Play services for 3rd party hardware devices and software modules. It allows vehicles to become extensible, customizable, and more integrated with evolving technology over the lifetime of the vehicle. We are looking for two motivated students to work on the project.

The full project proposal is located here:

For further information, please visit

IDEA: Application-aware Anonymity (A3)

PROFESSOR:  Boon Thau Loo (Email:  boonloo at seas)

The A3 system is a distributed peer-to-peer service that provides high performance anonymity "for the masses". A3 allows applications to construct anonymous onion paths that adhere to application-specific constraints (e.g., end-to-end latency).  We are looking for a team of 2 students to help with the design and development of the A3 system. We are planning a code release plus a deployment on PlanetLab as a service. The project will also involve building a secure coordinate system called Veracity.

Please refer to our project website for more information.

IDEA: Programmable Routers with RapidNet/OpenFlow

PROFESSOR:  Boon Thau Loo (Email:  boonloo at seas)

OpenFlow is an open standard that enables one to develop and deploy experimental protocols in production networks. It provides a mechanism to add programmability to routers that support the OpenFlow API. We are looking for a team of 2 students to integrate RapidNet, a declarative networking toolkit developed at Penn, with OpenFlow on a target platform of Linux PCs with multiple NICs. The project will also involve devising new router features enabled by programmability, including network monitoring, rule-based access control, and active networking.

In addition to the above two projects, we are also looking for students to participate in the DS2 project, where we have well-defined senior projects available.

For more information, please contact Prof. Loo.

IDEA: Visualization of graph search-based planning

PROFESSOR:  Maxim Likhachev (Email:  maximl at seas)

Graph searches are often very difficult to debug: the graphs they run on are often very large, have a high branching factor, and have a highly non-planar structure. This project involves developing a generic visualization tool to simplify the debugging of graph searches. You would concentrate on visualizing a certain class of graph searches, such as Dijkstra's algorithm and breadth-first search (BFS). The project will require programming in C/C++.
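
To give a feel for what such a tool would consume, the sketch below runs BFS on a small graph and records one snapshot per expanded node, which a viewer could then replay step by step. It is written in Python only for brevity (the project itself calls for C/C++). The function name, the snapshot fields, and the toy graph are assumptions made for this example.

```python
from collections import deque


def bfs_trace(graph, start, goal):
    """Run BFS from start to goal, returning (path, trace).

    `graph` maps each node to a list of neighbors. `trace` holds one snapshot
    per expanded node: the node, the nodes it newly discovered, and the
    frontier afterwards. `path` is empty if the goal is unreachable.
    """
    parent = {start: None}
    frontier = deque([start])
    trace = []
    while frontier:
        node = frontier.popleft()
        discovered = []
        if node != goal:
            for nbr in graph.get(node, ()):
                if nbr not in parent:  # first time this node is seen
                    parent[nbr] = node
                    frontier.append(nbr)
                    discovered.append(nbr)
        trace.append({
            "expanded": node,
            "discovered": discovered,
            "frontier": list(frontier),
        })
        if node == goal:
            break

    # Walk parent pointers back from the goal to rebuild the path.
    path = []
    if goal in parent:
        node = goal
        while node is not None:
            path.append(node)
            node = parent[node]
        path.reverse()
    return path, trace


# Toy graph (hypothetical).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
path, trace = bfs_trace(graph, "A", "D")
print("path:", path)
for step in trace:
    print(step)
```

Keeping the search and the recorded trace separate from any drawing code is one possible design. It would let the same viewer replay Dijkstra's algorithm or other searches that emit the same kind of snapshots.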