Projects and Teams - CIS 400/401
Project: Spam Detection for Wikipedia

Students: Phillip Baker, Brittney Exline, Avantika Agrawal

Advisor: Oleg Sokolsky (sokolsky@cis) (also contact Andrew West, westand@cis.upenn)

DESCRIPTION: Wikipedia has been facing an increasing threat from spammers who post unproductive links on popular pages in order to boost their Google PageRank scores or to promote their personal blogs. This project will involve developing a system that automatically classifies link-spam edits in Wikipedia. We will begin by populating a corpus of edits and labeling these as spam or non-spam. We will then examine the corpus to develop a taxonomy of spam behavior, and extract features that can classify spam. Using a machine-learning classifier, we will then implement a real-time edit processing system for Wikipedia.

Project: AutoPlug

Students: Ross Boczar, Jason Suapengco, Gabriel Torres

Advisor: Rahul Mangharam

DESCRIPTION: In 2009, 15.2 million vehicles were recalled in the United States, and of those, 1.3 million were recalled by General Motors due to software issues alone. This recall resulted in more than 136 million dollars in losses for the firm. There is a need to remotely diagnose, update and certify automotive software for efficient warranty and safety management. This proposal will discuss the ability to build an automotive Electronic Controller Unit (ECU) test-bed to develop mechanisms and protocols for remote diagnosis, programming and testing of future vehicles. Using this electronic test-bed, a case study will be conducted demonstrating the viability of remote warranty and software management via AutoPlug. The ECU test-bed can also be used as a standard for other testing to occur on the modern car’s electronic system and opens doors to third-party development.

Project: Project Recommendation System

Students: Edward Siegel, Vikas Shanbhogue, Bennett Blazei

Advisor: Zach Ives

DESCRIPTION: Provide an intuitive system that can recommend electronic products to customers who are not technically savvy.

Project: Hermes: The Free OBD Project

Students: Evan Hyde, Michael Ottavi-Brannon

Advisor: Oleg Sokolsky

DESCRIPTION: As of 2008, the United States has over 255 million registered vehicles traveling over its roadways. Although automobiles provide an effective means of moving people and possessions from one location to another, they are complicated, error-prone systems. In order to increase the safety and lifespan of cars, auto manufacturers have equipped them with error lights and warning messages to inform the operator of system malfunctions. However, these measures are of very little use to the average driver, providing little information about the source of the error and even less information about cost-effective solutions. The complexity of the automobile has outpaced, and for good reason, the efficacy of vehicle warning systems. Symbols to indicate various malfunctions have become difficult to interpret, if not confusing. To improve the car ownership experience, individuals should be able to cheaply diagnose and resolve basic vehicle problems without having to resort to a vehicle specialist. Hermes: The Free OBD Project will connect vehicle owners with crowdsourced solutions to common vehicle problems. By providing higher levels of consumer access to vehicle health and well-being, customers will be able to make more cost-effective decisions regarding vehicle maintenance and claim more ownership of their vehicles.

Project: Vehicle Traffic Simulation

Students: Joseph Weinhoffer, Fen Fei Yang

Advisor: Norm Badler

DESCRIPTION: Computer graphics has recently become steadily more applicable for use in technological devices and entertainment. Along with that development, the use of virtual crowd simulations has gained popularity and necessity in creating realistic movies, video games, and training simulations, among others. An important piece of designing a realistic population simulator is to incorporate automobiles, traffic, road networks, and the associated rules and regulations to allow those vehicles to properly interact with other vehicles and pedestrians. Many traffic simulators currently exist independently of a crowd simulator with human agents. The goal of this project is to design a traffic simulation system that can be incorporated into an already existing crowd simulation system, full-scale 3D model city, and road network. Currently the crowd simulator does not include vehicles, and is therefore not entirely realistic. By including a traffic simulation, the environment will gain another level of detail and realism.

Project: Real-time Processing of Multiplexed Data Acquired via Flexible Active Electrode Arrays

Students: Robert B. Yaffe, Daniel S. Rosenthal

Advisor: Brian Litt

DESCRIPTION: Build a real-time online processing system for the new generation of flexible active electronics. The real-time system will utilize parallel processing on both the CPU and GPUs. Real-time evoked response averages will be calculated and a colormap of real-time features will be displayed. Frequency domain analysis will be performed. The intent of this project is to provide experimenters using these electrodes with tools to analyze the results of their experiments in real time and to adjust experimental parameters accordingly.
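
As an illustration of the evoked response averaging mentioned above, the following is a minimal single-channel sketch, not the project's implementation. It assumes each trial arrives as a fixed-length list of samples already aligned to stimulus onset; the class name RunningEvokedAverage and its methods are hypothetical.

```python
class RunningEvokedAverage:
    """Incrementally averages stimulus-aligned epochs for one channel.

    Hypothetical helper for illustration only; assumes every epoch has
    the same number of samples and is already aligned to stimulus onset.
    """

    def __init__(self, n_samples):
        self.count = 0                  # number of epochs seen so far
        self.mean = [0.0] * n_samples   # current average, one value per sample

    def update(self, epoch):
        """Fold one new epoch into the running average and return it."""
        if len(epoch) != len(self.mean):
            raise ValueError("epoch length does not match the average")
        self.count += 1
        for t, value in enumerate(epoch):
            # Incremental mean: new_mean = old_mean + (x - old_mean) / n
            self.mean[t] += (value - self.mean[t]) / self.count
        return self.mean


avg = RunningEvokedAverage(n_samples=3)
avg.update([1.0, 2.0, 3.0])
avg.update([3.0, 4.0, 5.0])
print(avg.mean)
```

Updating the mean incrementally avoids storing every past trial, which is one way an average could be kept current while data is still being acquired.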

Project: Polynomial Time Approximation of Nash Equilibrium

Students: Ryan Menezes

Advisor: Sanjeev Khanna

DESCRIPTION: Develop a polynomial-time approximation algorithm for the problem of computing a Nash equilibrium.
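
To make the notion of an approximate equilibrium concrete, here is a minimal sketch of a simple known construction for two-player games with payoffs in [0, 1]. It is not necessarily the approach this project takes. The row player mixes equally between an arbitrary row and a best response to the column player's reply, which guarantees that neither player can gain more than 0.5 by deviating. The function names are hypothetical.

```python
def half_approx_nash(R, C):
    """Return mixed strategies (x, y) for a bimatrix game.

    R[r][c] and C[r][c] are the row and column player's payoffs,
    assumed to lie in [0, 1]. Illustrative sketch only.
    """
    n_rows, n_cols = len(R), len(R[0])
    i = 0                                            # arbitrary starting row
    j = max(range(n_cols), key=lambda c: C[i][c])    # column's best response to i
    k = max(range(n_rows), key=lambda r: R[r][j])    # row's best response to j
    x = [0.0] * n_rows
    x[i] += 0.5
    x[k] += 0.5                                      # x[i] becomes 1.0 if i == k
    y = [0.0] * n_cols
    y[j] = 1.0
    return x, y


def expected_payoff(M, x, y):
    """Expected payoff under payoff matrix M when the players mix with x and y."""
    return sum(x[r] * y[c] * M[r][c]
               for r in range(len(x)) for c in range(len(y)))


def max_regret(R, C, x, y):
    """Largest gain either player could get by switching to a pure strategy."""
    row_best = max(sum(y[c] * R[r][c] for c in range(len(y)))
                   for r in range(len(x)))
    col_best = max(sum(x[r] * C[r][c] for r in range(len(x)))
                   for c in range(len(y)))
    return max(row_best - expected_payoff(R, x, y),
               col_best - expected_payoff(C, x, y))


# Matching pennies, rescaled so that payoffs lie in [0, 1].
R = [[1.0, 0.0], [0.0, 1.0]]
C = [[0.0, 1.0], [1.0, 0.0]]
x, y = half_approx_nash(R, C)
print(x, y, max_regret(R, C, x, y))
```

The construction needs only two best-response computations, so it runs in time linear in the size of the payoff matrices.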

Project: Applying Distributed Constraint Solving to Policy-Based Channel Selection in Wireless Radio Networks

Students: Yash Saini, Aditi Jain

Advisor: Boon Thau Loo

DESCRIPTION: There are two specific problems that we are working to address. The first is developing a declarative distributed constraint solving platform. This platform will focus on computing optimized solutions to network-based constraint satisfaction problems. The platform will be developed through integrating RapidNet (an open-source declarative language based development toolkit for implementing network protocols and providing simulation and experimentation capabilities) with Gecode (an open-source constraint solving toolkit). There are many use cases for the proposed platform. Some examples include policy-based channel selection, firewall configuration, and cloud configuration. The second problem we are addressing is the implementation of one use case for the platform: policy-based channel selection. This application will be created using our developed constraint solving platform, and its performance will subsequently be assessed using the ns-3 network simulator.

Project: Recognizing Violence in Movies

Students: Lei Kang, Matteus Pan

Advisor: Ben Taskar (and Ben Sapp)

DESCRIPTION: We propose a system for automatic detection of violence in movies, which was inspired by the manual work conducted by the Annenberg-Robert Wood Johnson Coding of Health and Media Project (CHAMP) to map the correlation between portrayals of violence in the media and violence in reality. Analyzing the movie at the shot level, we extract three classes of feature sets – action interest points over the spatio-temporal domain, pose/object detection over the spatial domain, and audio event detection – and build a predictive model using support vector machines (SVMs) to classify each shot as violent or non-violent. Most action recognition research is tested on very structured datasets, with little motion blur, occlusion, or background clutter. Of the few systems dealing with action recognition in realistic settings, our system is the first to consider both object and audio information, recognizing that many violence-related actions possess characteristic sounds and objects, such as gunshots and weapons. Finally, we also filter interest points in the spatial and spatio-temporal domain by running human body detection algorithms, only maintaining interest points that fall within the boundaries of the human body. Our current implementation, which uses only action recognition with spatio-temporal interest points (STIP) and no filtering, achieves a binary classification accuracy of 56%. Significant improvement is possible, because the system performs well and poorly in very predictable areas, implying that additional information will have large benefits. The final system will be evaluated by its accuracy improvement over the individual components, as well as over other comparable systems.
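
The interest-point filtering step described above can be sketched as follows. This is a minimal illustration, not the project's code. It assumes the human body detector returns axis-aligned bounding boxes of the form (x_min, y_min, x_max, y_max) and that interest points are (x, y) pixel coordinates; both representations and the function names are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def inside(point: Point, box: Box) -> bool:
    """True if the point lies within the box, borders included."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def filter_interest_points(points: List[Point], person_boxes: List[Box]) -> List[Point]:
    """Keep only the interest points that fall inside at least one person box."""
    return [p for p in points if any(inside(p, b) for b in person_boxes)]


points = [(5.0, 5.0), (50.0, 50.0), (12.0, 8.0)]
person_boxes = [(0.0, 0.0, 10.0, 10.0), (11.0, 0.0, 20.0, 10.0)]
print(filter_interest_points(points, person_boxes))
```

A frame with no detected people yields an empty list under this rule, so a real system would need to decide how to treat such shots.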

Project: Pennochio - Mapping Motion Capture onto a Humanoid Robot in Real Time

Students: H. Anthony Arena

Advisor: C.J. Taylor

DESCRIPTION: Use motion capture techniques to allow a human operator to remotely control a humanoid robot.

Project: ARMADA: Autonomous Robotic Mail And Delivery Assistant

Students: Lauren Frazier, Zachary Meister

Advisor: Jianbo Shi

DESCRIPTION: Offload resource-intensive processing tasks to an off-board computer in order to build an effective robotic mail delivery system using the PR2 robot.

Project: SmarterCIS

Students: Parijat Sarkar, Toon Sripatanaskul

Advisor: Zach Ives

DESCRIPTION: Previous work in the ASPEN project has resulted in extending data integration techniques to the distributed stream world, while adding new abstractions for physical phenomena. This paper looks to extend the ASPEN project through the development of the SmarterCIS application. This application would change the scope of the SmartCIS monitoring system from a single-node monitoring system to a cluster-level one. Additionally, SmarterCIS would have a manual control system which allows the user to effectively manage the resources in a cluster. This would result in power savings without any compromise on performance. The expected contributions of this work are the integration of the Xen virtual machine monitoring system and the Ganglia architecture into the ASPEN runtime system. By leveraging the added functionality of these systems, SmarterCIS aims to be a more accurate predictive resource management system by combining data from various sensors (including physical, virtual and web-based ones) and allowing increased control over resource management actions such as virtual image migration. Finally, an experiment will be carried out to validate the effectiveness of this application, in which resource consumption and performance data with and without the SmarterCIS system will be compared. In the future, the aim is that this project can be scaled up to larger clusters and eventually to entire data centers.

Project: Biologically Motivated Approaches to Speech Recognition

Students: Mishal Awadah, Robert Hass, John P. Mayer

Advisor: Mark Liberman

DESCRIPTION: While some knowledge about language processing in humans has been incorporated in automated speech recognition, a number of psycholinguistic ideas have yet to be seriously pursued. Hierarchical Temporal Memory networks (HTMs), which are machine learning tools designed to emulate computation in neural circuits, have not (as far as we know) been applied to the task of phoneme identification. The motor theory of speech perception, which suggests that the meaningful elements of speech are not auditory but articulatory, likewise has yet to make its way into a computer speech recognition system. Finally, a variety of representations of sound data are available, some more faithful than others to human audition. Our project is to assess the performance of these ideas on speech recognition.

Project: Haplotyping

Students: Loius Bergelson

Advisor: Lyle Ungar


Project: Temporal Browsing and Visualization of Large News Corpora

Students: Jeremy Liu, Antony Vo, Qian Wang

Advisor: Ani Nenkova

DESCRIPTION: We aim to present and implement a new interface for browsing and visualizing news corpora using temporal information. Our goal is to go beyond current search and visualization techniques, which place an emphasis on what articles are most relevant or most similar to a given search term in a short time frame. Instead, we aim to browse and search over all time frames. We would like to be able to automatically determine the news articles that reflect major events of any time period and suggest both historically and currently interesting events. In addition, given a topic of interest, we want to be able to find similar news articles on different topics that co-occur with the original topic. In order to carry out this task, we will use techniques to index and efficiently search the massive New York Times corpus with over 1.8 million articles. In addition, we must combine efficient search with algorithms to determine which articles are of temporal interest. In order to determine temporal relevance, we will use techniques in computational linguistics, information retrieval, and statistics.

Project: Roicrop: An Efficient System for Tracing and Tracking Rigid and Semiflexible Filaments

Students: Victor Janmey

Advisor: Kostas Daniilidis

DESCRIPTION: Detect macromolecular filaments via image recognition.