CIS senior project First Place award
Gesture-Based Interaction with Home Automation Devices Through Sonar Techniques
Student: Tyler Altenhofen
Advisor: Kostas Daniilidis
Integrating a home with internet-connected devices (IoT devices) allows users to control and monitor the state of their house while away. Unfortunately, the interface that makes it possible to interact with the abundance of devices present in the average home makes for a crowded and clunky solution when it comes to controlling IoT devices in your direct vicinity. Often it’s easier to flip a light switch than to navigate large menus on a tiny screen. This is why I have designed a gesture-based system for controlling the IoT devices that are right around you. With my protocol, a user simply waves a phone or tablet in the direction of the IoT device they wish to interact with. This then executes an action (e.g. turning a light on) or presents the user with a set of controls (e.g. a television remote), depending on the device. Identifying the IoT device that is being waved towards is accomplished by having the waved phone emit a series of ultrasonic sounds at a constant rate. When the phone is moved towards an IoT device, the sounds are picked up earlier than normal, and when the phone moves farther away they are received later. This allows each IoT device to plot out a path of distances between the phone and itself. We then select the IoT device with a path that most resembles a wave gesture. Currently, my solution identifies the correct IoT device with approximately 90% accuracy in a normal home environment.
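A minimal sketch of the timing idea described above, assuming each device logs the arrival time of every pulse and knows the emission interval. The function names and the wave-scoring rule (approach followed by retreat) are hypothetical stand-ins for illustration, not the project's actual method.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def relative_distances(arrival_times, pulse_interval):
    """Turn pulse arrival times into distance offsets relative to the first pulse.

    A pulse that arrives earlier than its expected slot means the phone moved
    closer (negative offset); a later arrival means it moved away.
    """
    t0 = arrival_times[0]
    offsets = []
    for k, t in enumerate(arrival_times):
        expected = t0 + k * pulse_interval
        offsets.append((t - expected) * SPEED_OF_SOUND_M_S)
    return offsets


def wave_score(distances):
    """Hypothetical score: how far the phone approached and then receded."""
    closest = min(distances)
    i = distances.index(closest)
    if i == 0 or i == len(distances) - 1:
        return 0.0  # no approach-then-retreat shape
    approach = distances[0] - closest
    retreat = distances[-1] - closest
    return min(approach, retreat)


def pick_device(arrivals_by_device, pulse_interval):
    """Select the device whose distance path looks most like a wave."""
    scores = {
        name: wave_score(relative_distances(times, pulse_interval))
        for name, times in arrivals_by_device.items()
    }
    return max(scores, key=scores.get)


# Illustrative data: the lamp hears the middle pulses early, the TV does not.
arrivals = {
    "lamp": [0.0, 0.0995, 0.199, 0.2995, 0.4],
    "tv": [0.0, 0.1, 0.2, 0.3, 0.4],
}
selected = pick_device(arrivals, pulse_interval=0.1)
```

In this made-up example the lamp's path dips by roughly a third of a metre and returns, so it scores highest.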
CIS senior project Second Place award
D.A.R.L.I.N.A. - The Distributed Alternately Routed Local INternet Aggregator
Students: Alex Liao, Nathaniel Chan
Advisor: Zachary Ives
Bandwidth limits on a per-device basis are employed by many public wireless hotspots. While beneficial for the hotspot operators, the throttling of internet bandwidth significantly degrades end-user experience. This is the problem that D.A.R.L.I.N.A. solves, by building on two key observations. First, common internet usage patterns are burst-like, meaning that devices use almost no internet bandwidth most of the time. Second, devices often have excess communication capacity that can be used for local communication. D.A.R.L.I.N.A. leverages these two observations using a network routing system that allows for effective pooling of bandwidth across multiple laptop devices. A local network is established dynamically using WiFi-Direct, and network traffic from any one device is routed through all of the devices using custom software built on the SOCKS5 protocol. This means that network traffic is distributed over the pool of devices, allowing a single device to utilize all of the aggregate internet bandwidth available. In testing, the system demonstrated significant improvements to single-device network throughput with low overhead. File download speeds scaled linearly with the pool size and utilized over 90% of aggregate internet bandwidth. Improvements in web page loading times were also observed, with speed-ups of up to 35%. Testing also revealed that pooled internet bandwidth was distributed equally in heavy load scenarios, where each device in the pool demanded full utilization. In other words, no device saw its minimum on-demand bandwidth decrease significantly through the use of the system.
CIS senior project Third Place award
Students: David Freifelder, Kunal Malhotra
Advisor: Kostas Daniilidis
ShadowDrone aims to utilize gesture control to command a semi-autonomous drone in various simple tasks. The system allows the user to control the drone via a custom gesture capture device and a command set that includes taking off, landing, simple directional input, and a return function. The system is intended to be intuitive and minimalistic so that it may be easily learned, as well as portable, using an embedded system and wireless components.
CIS senior project honorable mention
Manta: An Android Peer-to-Peer file sharing protocol for MANETS
Students: Tanner Haldeman, Dhriti Kishore, Pia Kochar
Advisor: Boon Thau Loo
Mobile devices play a vital role in modern communications. Specifically, sharing media files such as photos and videos allows us to share our experiences with others, transmit important visual information, and document unjust acts in a way that was previously impossible. However, the cost of cellular data (particularly during international travel) and locations with inconsistent Wi-Fi coverage limit this mode of communication. To address this problem, we have developed a protocol and system for sharing large files with nearby devices using the Wifi-P2P protocol on Android. Our system allows users to make selected local files available to others and download files owned by nearby devices. The set of Android devices in a specific region forms a Mobile Ad-Hoc Network (MANET), consisting of peer-to-peer connections and continuous re-configuration (devices repeatedly forming and breaking connections with one another). This lack of network structure informed many design decisions for our protocol, such as using a three-way handshake during a request to ensure files get sent accurately and not storing static information about the state of the network on any node. Our goal was to develop a protocol that scales well to very large networks and maintains fault tolerance in the presence of arbitrary amounts of re-configuration. Therefore, in addition to our Android application, we developed a Python simulation that allows us to alter parameters and collect data on protocol performance.
CIS senior project honorable mention
Introducing Physically-Based Materials to the Core Specification for glTF 2.0
Student: Mohamad Moneimne
Advisors: Stephen Lane and Patrick Cozzi
The Graphics Library Transmission Format (glTF) is a file format that is increasingly becoming the industry standard for encoding complex, three-dimensional scenes in a light-weight and portable manner. Although it fulfills its role in transmitting 3D data efficiently, it doesn’t support a powerful asset called Physically-Based Materials. These materials have the ability to represent lifelike substances by simulating the interaction between light and surfaces at a microscopic level. With a clear demand for this capability within industry, the graphics community began working to introduce Physically-Based Materials to the core specification of glTF over the past year. Throughout this process, my goal was to make this information accessible to developers around the world. With this in mind, I created an open source reference implementation for Physically-Based Rendering in WebGL and wrote a tutorial that explains the theory behind the method. This project has been adopted by the graphics open source community, directly influencing the transition from glTF 1.0 to glTF 2.0 and empowering developers to create highly realistic 3D content at interactive rates.
CIS senior project honorable mention
Distributed Computational Number Theory
Students: Joshua Fried, Gabriel Naghi
Advisor: Jonathan M. Smith
Strong cryptography is the cornerstone of secure online communication today. Several of the most commonly used cryptographic schemes rely on the computational difficulty of solving mathematical problems. One such scheme, RSA, relies on the difficulty of factoring composite integers into their prime factors. With the invention of quantum computers on the horizon, cryptographers believe that current forms of RSA will be rendered useless. A possible solution to this problem that is being explored involves the use of significantly larger integers. However, widely available big number libraries, such as the GNU Multiple Precision Library (GMP), are not built to work with numbers of these sizes. One of the most egregiously slow operations in this setting is the simple multiplication of pairs of integers. Distributed Computational Number Theory (DCNT) seeks to solve this problem by providing a big number software library that is inherently distributed and parallelized, and therefore scalable. At its core, it has a powerful multiplication routine based on Schönhage and Strassen’s modular multiplication algorithm. This algorithm was chosen from several paradigms that were implemented and evaluated for their efficiency during parallel execution. DCNT uses the Message Passing Interface (MPI) for interprocessor and internode communication. Using a group of networked computers, it is able to perform multiplications on terabyte-size numbers in under thirty minutes. By contrast, running the same operation in GMP requires days of computation and extreme amounts of concentrated resources. DCNT’s demonstrated success has led to its active use in cryptographic research.
CIS senior project honorable mention
Finding a Great Burger: Data-Driven Understanding of Adjectives in Yelp Reviews
Student: Veronica Wharton
Advisor: Chris Callison-Burch
Adjectives like "fine", "good", "great", and "outstanding" are semantically similar but differ in intensity (i.e., "fine" < "good" < "great" < "outstanding"). Understanding these intensity differences is a necessary part of reasoning about natural language. We have developed techniques to automatically learn the relative relationship of scalar adjectives. Our approach is based on pairwise adjective intensity relationships that are inferred by analyzing pairs of adjectival paraphrases from the Paraphrase Database (http://paraphrase.org). We showcase this research with GrubGrader, a web application that ranks restaurants along an axis of the user’s choosing (e.g., quality of food, friendliness of service). GrubGrader ranks up to 86K restaurants according to the adjectives and nouns used to describe them in 2.2M Yelp reviews. GrubGrader is backed by AdjectiveAnalyzer, an API that we developed that clusters and ranks semantically similar user-given adjectives.
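One way to turn pairwise intensity relationships into a full ordering is a topological sort. The sketch below assumes the pairs are already available as (weaker, stronger) tuples and are free of contradictions. The abstract does not say how the pairs are combined, so this is an illustrative choice, and the function name is hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def rank_adjectives(pairs):
    """Order adjectives from weakest to strongest given (weaker, stronger) pairs."""
    predecessors = {}
    for weaker, stronger in pairs:
        # Each stronger adjective must come after every adjective it outranks.
        predecessors.setdefault(stronger, set()).add(weaker)
    return list(TopologicalSorter(predecessors).static_order())


# Illustrative pairs matching the example in the text.
pairs = [
    ("fine", "good"),
    ("good", "great"),
    ("great", "outstanding"),
    ("fine", "great"),
]
ranking = rank_adjectives(pairs)
```

If the inferred pairs contain a cycle (for example, noisy evidence in both directions), `static_order` raises `graphlib.CycleError`, so a real system would need some way to resolve conflicts first.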
CIS senior project honorable mention
Eye of Horus
Students: Ajay Patel, Jeff Barg
Advisor: Nadia Heninger
The big-picture goal of this project is to develop technology that uses fingerprinting techniques and machine learning to automatically infer what specific page on a website a user is viewing, even when the user is on a modern, cryptographically sound HTTPS connection. HTTPS is believed to be practically secure and is believed not to leak what particular page on a website a user is viewing, even when being surveilled by a government agency like the NSA. We show with our analysis that attackers may still be able to determine what page you are viewing on a particular site. While the content of the ciphertext is theoretically secure, HTTPS leaks information like the size of the ciphertext and the timing and direction of ciphertext packets flowing over the network. Using novel network isolation techniques, we built a web crawler that was able to index and fingerprint a subset of Wikipedia pages. We then trained a machine learning model over thousands of network flow fingerprints to predict what page was being viewed. The result was a fairly accurate model for predicting which page out of the subset of indexed pages was viewed by a particular user. Our results have social implications for privacy and the state of web traffic encryption, but also have some legitimate use cases, such as parents who want to monitor what their child is viewing over their home WiFi network.
CIS senior project honorable mention
Sprout: Mobile Collaboration Made Easy
Students: Reuben Abraham, JT Cho, Kieraj Mumick, Igor Pogorelskiy
Advisor: Swapneel Sheth
Collaboration is hard. Technology has rapidly evolved to improve collaborative experiences - look at Dropbox Paper and Google Docs. However, these tools have not adapted well to mobile and have left the rest of the developer world behind, unable to incorporate real-time technologies without the resources of these big-name companies. In order to provide a real-time collaborative experience, there were several algorithms that we considered, namely CRDT, Operational Transform, and Differential Synchronization (DiffSync). After thorough research into the three choices, we selected DiffSync due to its lightweight, low-bandwidth, and mobile-friendly nature. Our solution implements DiffSync on both the client and the server side. It was robust enough to be successfully incorporated in a real-time mobile note-taking application. Over two weeks of heavy testing, we suffered no data loss, though this is not a guarantee. In terms of the developer experience, developers can add a Sprout editor to their applications with just 50 lines of code. Due to some frontend bugs, our end-users have found Sprout to be slightly more difficult to use than default text editors. Given more time, these bugs could be ironed out to make Sprout a seamless experience. Throughout the development process, we encountered the challenges of developing a real-time collaborative framework. Dealing with out-of-order messages proved to be a significant challenge, and we dealt with this issue using fuzzy patching logic. Given more time, we would like to further explore other known techniques for collaborative text editing, including CRDT.
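The core diff-and-patch cycle of Differential Synchronization can be sketched as follows. This is a simplified illustration, not Sprout's code: the helper names are hypothetical, and the patch step here is strictly positional, whereas the project describes fuzzy patching that tolerates a diverged or out-of-order target.

```python
import difflib


def make_edits(shadow, text):
    """Diff the last-synced copy (shadow) against the current text."""
    matcher = difflib.SequenceMatcher(a=shadow, b=text, autojunk=False)
    return [
        (i1, i2, text[j1:j2])  # replace shadow[i1:i2] with this string
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_edits(base, edits):
    """Apply positional edits; assumes base matches the shadow the diff used."""
    out, pos = [], 0
    for i1, i2, replacement in edits:
        out.append(base[pos:i1])
        out.append(replacement)
        pos = i2
    out.append(base[pos:])
    return "".join(out)


# One half of a sync cycle: the client sends its edits, the server applies them.
client_shadow = "hello world"
client_text = "hello brave world"
edits = make_edits(client_shadow, client_text)

server_text = "hello world"
server_text = apply_edits(server_text, edits)
client_shadow = client_text  # the shadow advances once the edits are sent
```

Only the edits travel over the network, which is what keeps the approach low-bandwidth; the server then repeats the same steps in the other direction.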
CIS senior project honorable mention
RootNote: Optimizing Digital Note-Taking for Learning
Students: Lauren Leung, Memoria Matters
Advisor: Stephanie Weirich
Research has shown that using laptops and tablets to take notes leads to student distraction, and thus a decrease in learning. Education experts have identified an absence of digital note-taking software that reduces distraction and is optimized for students’ in-class learning. RootNote addresses this need through an accessible note-taking web application with features chosen to encourage positive note-taking behaviors. These include PDF lecture slide insertion, text-in image uploading, TeX equation insertion, and reminders which utilize positive reinforcement to encourage students to stay on task and take detailed notes. These features were chosen by analyzing preexisting research with the help of Dr. Ryan Baker and soliciting student feedback through surveys and a focus group. During interviews, over 60% of students described text-in images as useful for transcribing diagrams or other content written on a blackboard, and 100% cited an equation editor as something they would use and did not have access to in current note-taking applications. Students’ positive response to these features confirms the need for note-taking software targeted towards the classroom experience.
CIS senior project honorable mention
Students: Jordan Hurwitz, Kelly Tan, Koen Van Der Hoeven
Advisor: Norman Badler
In the United States alone, 21.7% of the population aged 18+ report difficulties reaching overhead or using fingers to grasp things, with an additional 3.7% of adults aged 18-64 having independent living disabilities. These difficulties hinder the productivity and lifestyles of a large percentage of the population. BrainRoom seeks to provide a hands-free, assistive headset to aid individuals with disabilities. BrainRoom is a portable hands-free system that automates activities of daily living in a home. By using non-invasive brain-computer interfacing, our headset reads electroencephalogram (EEG) data, which, when used in coordination with visual data obtained through Google Glass, outputs targeted binary signals used to carry out day-to-day activities. BrainRoom uses an Emotiv EPOC along with its related software to capture and parse brain signals, while using Google Glass’ camera to identify and broadcast the IDs of QR codes to a web server. For our demonstration, we used an Arduino-controlled light bulb that accepts Bluetooth input and interfaces with our software; however, BrainRoom’s technology can be extended to smart bulbs and other analogous wireless hardware. Through a learning phase, the BrainRoom software evaluates a threshold at which neural activity will trigger hardware on or off after locking on through the Glass. Through evaluation of multiple channels, we were able to establish effective channels for the wearer and showed that, as the channel threshold rises, the number of false positives drops significantly, aiding individuals with limited mobility in their daily lives.
CIS senior project honorable mention
Beyond Accuracy: Measuring Fairness in Machine Learning
Students: Ava Dagostino, Rodrigo Ornelas, Pranav Ramabhadran
Advisor: Aaron Roth
We built a web system that evaluates the fairness of machine learning (ML) results to ensure that future applications explicitly account for discriminatory biases. Fairness in ML is about ensuring that algorithms are not biased with respect to sensitive attributes (e.g. race, gender) when making predictions. Currently, there exists no consensus among researchers about how to evaluate fairness; there are several competing metrics. Achieving perfect fairness by all metrics simultaneously is provably impossible, so quantifying them is necessary to find the right balance. In practice, there exists a tradeoff between fairness and accuracy, and current techniques focus almost exclusively on the latter. This problem is important because ML algorithms are increasingly being used to make decisions such as criminal sentencing, hiring, and credit scoring. Judges in the US use the ML-based COMPAS system to generate risk scores when making parole decisions. While COMPAS is similarly accurate for both black and white defendants, a study by ProPublica found that the systematic way in which it makes mistakes discriminates against black defendants. Our system accepts ML results from users and returns intuitive scores and graphs that measure fairness using balance, predictive parity, and calibration. We also implemented an algorithm-agnostic post-processing technique defined by a research paper in the field. This technique equalizes true and false positive rates across groups with a minimal loss of accuracy. The combination of several fairness metrics and the identification of the accuracy-optimal point of fairness will allow future ML applications to both identify and address potential biases.
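The per-group quantities behind these metrics are straightforward to compute from binary predictions. The sketch below is an illustration with hypothetical function names and made-up data, not the project's code: it reports each group's true positive rate, false positive rate (the two rates the post-processing step equalizes), and positive predictive value (the quantity compared under predictive parity).

```python
def group_rates(y_true, y_pred, groups):
    """Compute TPR, FPR, and PPV separately for each group."""
    counts = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts.setdefault(group, {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
        if pred == 1:
            c["tp" if truth == 1 else "fp"] += 1
        else:
            c["fn" if truth == 1 else "tn"] += 1

    def ratio(num, den):
        return num / den if den else None  # undefined when a group has no such cases

    return {
        group: {
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),
        }
        for group, c in counts.items()
    }


def rate_gap(rates, key):
    """Largest difference in one rate between any two groups (0 means equal)."""
    values = [r[key] for r in rates.values() if r[key] is not None]
    return max(values) - min(values)


# Made-up labels and predictions for two groups of four people each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = group_rates(y_true, y_pred, groups)
```

In this toy data both groups are classified with the same overall accuracy (6 of 8 correct in total, 3 of 4 in each group), yet group "a" receives more false positives and group "b" more false negatives, which is the kind of disparity accuracy alone does not reveal.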
Students: Vivek Raj; Spiro Metaxas; Chad Nachiappan; JJ Lee
Advisors: Insup Lee; James Weimer
In 2010, the national cost for diabetic non-compliance and its related complications amounted to $105 billion. Furthermore, 2 out of every 3 diabetic patients do not rigorously follow their physician-prescribed regimen, and for a single patient, the cost of non-compliance is $11,000 per year. According to a 2002 NIH study, pre-diabetic patients receiving an intensive personalized behavioral modification experience were 58% less likely to develop diabetes. As a result, we built Debra, a mobile application to assist diabetic patients in complying with their prescribed medication, exercise, and diet schedule through behavior modification. Debra tracks users' adherence to a regimen, analyzes their behavior, and provides a community-oriented system to encourage behavior modification. The Debra system uses a scoring engine to rate a patient's adherence to their schedule, identify areas of improvement, and provide tailored recommendations to improve adherence. The system is tested and verified first through patient usage and second through surveys of endocrinology specialists.
Real-World Analytics with Machine Vision
Students: Luke Carlson, Noah Shpak, Lukas Vacek
Advisor: Kostas Daniilidis
Art galleries and museums generally use intuition, not concrete data, to curate exhibits and design spaces. In conversations with gallery curators in both Philadelphia and NYC, we found that they wanted a simple and non-intrusive way to gauge audience interest and focus. Working with an avant-garde gallery in Brooklyn, we collected footage of an exhibition to create a heat-map of activity that allowed the curator to extrapolate from this data and make better-informed design decisions. The heat-map was generated algorithmically, shading highly visited areas red while leaving less attended sections blue in an image of the gallery. We applied a pedestrian detection algorithm pre-trained by OpenCV in order to detect gallery visitors. To handle rooms with large audiences, we also built a background subtraction algorithm where we track movement on a per-pixel level and normalize counts across the full image. This project could be used to evaluate and A/B test various layouts and art pieces on display. We plan to continue working with our first gallery and other interested partners such as the Penn Museum.
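The per-pixel counting and normalization step can be illustrated with a small, dependency-free sketch. It uses simple frame differencing as a stand-in for the project's background subtraction, treats frames as grayscale grids of integers, and the threshold value and function names are assumptions made for the example.

```python
def motion_counts(frames, threshold=25):
    """Count, for every pixel, how often it changed between consecutive frames."""
    height, width = len(frames[0]), len(frames[0][0])
    counts = [[0] * width for _ in range(height)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(cur[y][x] - prev[y][x]) > threshold:
                    counts[y][x] += 1
    return counts


def normalize(counts):
    """Scale counts to the range 0..1 relative to the busiest pixel."""
    peak = max(max(row) for row in counts)
    if peak == 0:
        return [[0.0] * len(row) for row in counts]
    return [[c / peak for c in row] for row in counts]


def to_rgb(value):
    """Map 0..1 activity to a blue (quiet) to red (busy) color."""
    return (round(255 * value), 0, round(255 * (1 - value)))


# Three tiny 2x2 grayscale frames: the top-left pixel changes twice,
# the bottom-right pixel changes once.
frames = [
    [[0, 0], [0, 0]],
    [[100, 0], [0, 0]],
    [[0, 0], [0, 100]],
]
heat = normalize(motion_counts(frames))
colors = [[to_rgb(v) for v in row] for row in heat]
```

The resulting color grid could then be blended over a still image of the gallery to produce the overlay described above.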
Parsley - Syntax Highlighting for Natural Language
Students: Alan Aquino, Matt Howard, Pranav Kunapuli
Advisor: Mitch Marcus
It is often the case that, while reading grammatically complex passages, readers get confused by one or more sentences and need to re-read the passage in order to understand it properly. Our goal was to create an interface through which our users may read faster and better understand the page content. Our solution consists of both a back-end server and a front-end interface. The back-end server makes use of the Stanford phrase-parser to create syntax trees from paragraphs of text. The front-end interface makes a request to the server with the text from the website and then displays the resulting syntax tree using our custom highlighting rules via a Chrome Extension. Our rules were designed to emphasize the most important grammatical aspects of a sentence, such as the principal noun and verb, while drawing less attention to more complicated sections by delineating boundaries to show phrase relationships. We used Amazon Mechanical Turk to gather human test subject data to evaluate the effectiveness of our approach. While we could not confirm our hypothesis that highlighted text improved reading comprehension, we were able to conclude that there was no statistical difference in understanding between hand-parsed and machine-parsed text. We learned through the experimentation process that we needed to allot more time to gather results, allow for a wider range of data points to establish true statistical significance, and introduce variety in our testing material so that we can establish causality between our independent and dependent variables.
Students: Alex Sands, Ben Hsu, Cathy Chen
Advisor: Stephen Lane
Traveling to a foreign country is difficult because you typically cannot understand what is written around you. Often, important information is communicated in real-world text, signs, and menus that are easy for native speakers of the language to read, but difficult for travelers. Augmented reality, however, presents a new opportunity to see foreign language environments through the lens of your native language. Seeing this opportunity, we built Language Vision, an application that allows anyone to translate real-world text into their own native language in real time, while preserving the scene around them. The application has a simple interface on the mobile frontend, but a complex sequence of API integrations and image processing techniques on the backend. It allows the user to select a language, hold their smartphone up to the scene around them, and see the foreign language translated into their native language as if it were actually written on the sign. The application has an average word error rate of 11% when recognizing text in signs from a straight-on angle. By combining OCR, translation, phase correlation, and color detection techniques, Language Vision strives to reduce language and communication barriers between people with different backgrounds.
Shark: Online Data Analysis
Students: Alex Frias and Alex Peckman
Advisors: Swapneel Sheth and Arvind Bhusnurmath
Current methods for performing data analysis require many different tools as well as inherent knowledge of those tools. Shark is a website that provides a user-friendly experience for obtaining data and performing statistical analyses on either a user’s own data or data scraped by Shark. Many users would like to benefit from various data analyses but have not had the available tools or know-how to do so. To accommodate this user base, Shark provides a built-in customizable web scraper as well as point-and-click tools for performing statistical analysis. By reducing the number of steps, tools, and knowledge required to obtain results, Shark allows more people to engage with different types of data in meaningful ways. Shark was designed to be easily expanded upon to help accommodate more data sources and analysis types.
Streamlined Studying: An Intelligent Platform for Study Material
Students: Michael Rudow, Nickhil Nabar, Matthew Chiaravalloti, Kevin Wang
Advisors: Jonathan M. Smith, Mitchell Marcus
Conventional studying practices are constrained by the interfaces through which users engage with their study materials. The standard file system conforms to a tree structure and handwritten notes are inherently chronological; neither representation captures the inherent graphical structure of relationships between notes in a course. We used the results of a user survey to design a solution for the inefficiency of conventional study media. Streamlined Studying is a web application which allows users to organize, navigate, and annotate their course materials. Through representing connections between documents in a graphical structure and leveraging standard natural language processing techniques along with user aided refinement, we improve the interaction between the user and the course material. The results of a controlled experiment indicate that our application slightly improves the speed at which users look up information. Moreover, users offered qualitative feedback indicating that they prefer engaging with course materials via Streamlined Studying over traditional media.
Students: Rachel Chan, Doug Dolitsky, Prashant Joshi
Advisor: Jean Gallier
For our senior design project, we present Green Bin, a smart trash can that can sort your recyclables and trash for you. Our idea was born when we were sitting in Engineering Cafe one day and noticed the lack of attention and care when students were throwing out their trash. Rather than leaving it to people to decide which bin to throw their trash into, we have transferred the decision making to a machine. For our purposes, we attached a Raspberry Pi camera to a trash can to simulate the desired behavior of image detection. We utilized Google's Cloud Vision API to perform image recognition analysis on items that people throw out. If the API fails to recognize the object with at least 70% confidence, Green Bin will place the object in the trash. We hope our device provides a smart solution to the current inefficiencies in the way waste is being disposed of on Penn’s campus.
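The fallback rule described above can be sketched as a small decision function. It takes already-recognized labels as (name, confidence) pairs rather than calling the Cloud Vision API, and the set of recyclable labels and the function name are hypothetical placeholders.

```python
CONFIDENCE_THRESHOLD = 0.70  # below this, the recognition result is not trusted

# Hypothetical label set; the project's actual categories are not stated.
RECYCLABLE_LABELS = {"plastic bottle", "aluminum can", "paper"}


def choose_bin(labels):
    """Pick a bin from (label, confidence) pairs; default to trash when unsure."""
    confident = [
        name.lower() for name, score in labels if score >= CONFIDENCE_THRESHOLD
    ]
    if any(name in RECYCLABLE_LABELS for name in confident):
        return "recycling"
    return "trash"


decision = choose_bin([("Plastic bottle", 0.92), ("Drink", 0.81)])
```

Defaulting to trash when recognition is uncertain reflects the behavior described in the abstract, where an object that is not recognized with enough confidence is placed in the trash.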
PAL: Machine Learning For Mental Health
Students: Rachel Adducci, Lauren Datz, Harrison Huh, Anastasiya Kravchuk-Kirilyuk
Advisor: Eric Eaton
The goal of PAL is to increase mental health awareness by using a machine learning algorithm to predict an individual's psychological well-being. This project is based on the well-known connection between physical and mental health. PAL pulls physiological data from Fitbit, a commercially available activity tracker that accurately measures activity level and sleep quality, and collects self-report measures of mood, stress, and worry. Using the Fitbit data as features, the algorithm predicts the mood, worry, and stress level of the users through supervised learning. PAL shows that easily accessible biophysical features like sleep duration and step count are effective predictors of mental health. For example, the results indicate that the less sleep a user gets, the more worried they are. The success of this project demonstrates the vast potential for future applications of machine learning in mental health.
Checkowl: A Chatbot for Checkout
Students: Brent Shulman, Lee Criso, Alec Olesky, Ryan Greenberg
Advisor: Susan Davidson
CheckOwl is a solution to the difficulty of checking out while shopping on a smartphone. Over 50% of online shopping carts never make it through the checkout process, and CheckOwl aims to reduce this number. Using an SMS interface and NLP, users will be able to quickly and seamlessly move through the checkout process while avoiding the hassles of multiple text fields on a small screen or distracting interruptions midway. There is currently more shopping done on mobile phones than on laptop computers, and there is clearly an issue with using a checkout process designed for computers when shopping on a smartphone. This product interacts with the user through text messages, allowing them to converse with a chat bot designed to mimic a concierge. The chat bot will collect the information necessary to complete the process or, if the user is returning, relay previously collected information to the website they are making a purchase from. This end-to-end process will allow users to check out quickly while shopping on their phones, increasing the number of completed sales for websites that choose to include CheckOwl as an option.
Know Your Nyms? A Game of Semantic Relationship Discovery
Students: Ross Mechanic, Dean Fulgoni, Hannah Cutler
Advisor: Chris Callison-Burch
To understand natural language, a system must have knowledge of the semantic relationships between words. Familiar relationships include synonyms and antonyms, but others are more complex. Learning these relationships automatically with statistical techniques remains difficult in academic research. Alternatively, our goal is to learn relationships from people directly, using crowdsourcing and gamification. To this end, we developed KnowYourNyms?, an interactive web-based game in which players must accurately name semantic relationship pairs. While providing users with an engaging and educational experience, the application collects large amounts of data that can be used to improve state of the art classifiers for all types of semantic relationships. The data also broadly informs us of how people perceive the relationships between words, which can be useful for research in psychology and linguistics.
Introductory Online IDE
Students: James Park, Jose Ovalle, Holden McGinnis, Neil Wei
Advisor: Benedict Brown
The introductory experience in computer science relies heavily on the integrated development environment (IDE) used. With difficult installation, lack of compatibility, or features detrimental to the educational experience, the IDE may fail to serve its intended purpose. This is especially true in an academic setting, where students depend on instructors to guide them through the basics. Students may spend an unreasonable amount of time setting up their environments, sometimes requiring additional assistance from instructors. Existing solutions have limitations in performance, distribution, and convenience. Our solution to this problem is to expand on an existing online IDE that supports workspace cloning, Cloud9. We created a general plugin that expands Cloud9’s capabilities, such as online Java graphics support, checkstyle and compilation, and an interactive REPL for certain languages. Our evaluation included a user survey and latency measurements of the compilation/checkstyle tool. Both novice and experienced users reported a positive experience using the IDE. While the responsiveness of our features generally seems slower than their non-online counterparts, this did not seem to significantly hinder user experience.
Students: Elizabeth Nammour, Kalan Porter, Kerem Kazan
Advisor: Rajeev Alur
Efficient, low latency processing of high-volume data streams is essential in a growing number of data processing applications. StreamQRE is a stream processing language that extends linguistic constructs of relational query languages and regular expressions to provide the programmer with an expressive, natural syntax, and a high-throughput query engine. We present a multithreaded extension of the core StreamQRE language, allowing it to scale with additional computational resources. We evaluate our implementation with respect to comparable high-performance engines. Our experimental results exhibit 3.8 times the throughput of the single-threaded StreamQRE implementation, and 10 times that of its most efficient competitor. We demonstrate this speedup by benchmarking a series of complex and computationally expensive queries on network traffic data. We also discuss potential efficiency benefits across a wide cross-section of use cases including network monitoring, fraud detection, high frequency finance, and IoT data processing.
Students: Sagar Poudel, Tahmid Shahriar, Ashutosh Agrawal
Advisor: Kostas Daniilidis
More than 90% of total retail sales still occur at brick and mortar stores, and yet there is a significant difference between the online and offline shopping experience. While shopping online, for example, one can very easily search for discounted products or check whether promotional coupons are available for a particular product. The goal of SATVision was to enhance the brick and mortar store shopping experience and thus bridge the gap between shopping online and at a physical retail store location. We tried to achieve this by building an Android application, as internet enabled smartphones are ubiquitous. The application leverages augmented reality to display product information such as available discounts and promotions and allows the user to search for a product and receive navigation instructions, leading to a shopping experience more fit for the 21st century. Our application processes in-store mapping data and product information in the form of .csv and JSON files that were collected by COSY, a startup we partnered with for the project. An efficient shortest path algorithm is used to navigate our user to the particular item they are looking for. Furthermore, we use established computer vision techniques to improve the positioning and stability of augmented reality virtual assets on the camera feed and thus provide a higher than standard AR experience.
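The abstract does not say which shortest path algorithm SATVision uses. As a rough illustration of the navigation step, the sketch below runs Dijkstra's algorithm over a small weighted graph of store locations. The `STORE` layout, its node names, and the distances are hypothetical and are not taken from the project's mapping data.

```python
import heapq

def shortest_path(graph, start, goal):
    """Return (distance, path) from start to goal using Dijkstra's algorithm.

    graph maps each node to a dict of {neighbor: walking_distance}.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == goal:
            break
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if goal not in dist:
        return float("inf"), []
    # Walk the predecessor links backwards to rebuild the route.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[goal], path

# Hypothetical store layout: nodes are locations, weights are distances in meters.
STORE = {
    "entrance": {"aisle_1": 4.0, "aisle_2": 6.0},
    "aisle_1": {"entrance": 4.0, "aisle_3": 5.0},
    "aisle_2": {"entrance": 6.0, "aisle_3": 2.0, "checkout": 7.0},
    "aisle_3": {"aisle_1": 5.0, "aisle_2": 2.0, "checkout": 3.0},
    "checkout": {"aisle_2": 7.0, "aisle_3": 3.0},
}

distance, route = shortest_path(STORE, "entrance", "checkout")
print(distance, route)
```

In an application like the one described, the returned route would then be turned into turn-by-turn instructions for the shopper.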
Reanimator - Realistic Face Generation Using Forensic Sketches
Students: Josh Karnofsky, Devin Stein, Reed Rosenbluth, Yagil Burowski
Advisor: Camillo J. Taylor
Despite the abundance of technology in today's world, not every crime is captured on film. Without video evidence, police rely on forensic sketches created from eyewitness reports. These sketches are publicized on local news channels and matched against police databases. Unfortunately, there are inaccuracies associated with these methods. We sought a way to improve existing methods by leveraging recent advances in machine learning technology. Research shows that humans are better at identifying people from pictures than from sketches. Reanimator was inspired by this research: if we can generate realistic pictures from forensic sketches, then we can improve overall identification rates. In order to achieve this, we leveraged a state of the art form of neural network known as conditional adversarial networks. We trained the neural networks on datasets with photographs of people and corresponding sketches. After tweaking and training, these models are capable of generating a realistic picture from a forensic sketch. Once the models were satisfactory, we integrated them into an iPad application that allows forensic sketch artists to either create a sketch or import an existing one and generate realistic faces from it. To further improve our model's accuracy, the application allows for filtering based on features such as race. We conducted surveys to compare individuals' ability to match face sketches to corresponding real face pictures with their ability to match our generated face pictures to corresponding real face pictures. Our survey results indicated a 26.6% increase in matching ability when individuals identified faces using the faces generated by our model.
Maple: Quantifying Readability for Medical Papers
Students: Spencer Lake, Zack Elliot, Omar Paladines, Zhi Zheng
Advisor: Ani Nenkova
Medical papers contain important knowledge regarding patient care, yet they are often hard to understand for patients and even doctors. Maple quantifies the readability of medical papers in hopes of facilitating more effective patient care. It operates as a Chrome extension that enriches the display of search results. Maple provides readability scores and offers definitions for the most difficult words within the text. In order to quantify readability, we accumulated over two hundred thousand abstracts from seven subdomains. Using this pool, we represented text in terms of how typical the words used in the text are. One class of features uses language models to characterize typical lexical usage within domains, such as newspaper, spoken, and medical. Another class of features captures the most common words in a domain. We studied the features individually and combined them in a linear regression model to predict the readability of abstracts. In order to evaluate our model, we designed a web application to gather samples of perceived readability scores for abstracts, thus allowing us to rank abstracts and their domains. Our model has a statistically significant correlation with perceived difficulty ratings. Moreover, we have added a component that identifies the least typical words within an abstract, which are then defined to enhance readability. Maple is significant as it serves as a proof of concept in evaluating the readability of technical text in not just medicine but also other highly technical fields.
Rune: Multiplayer Augmented Reality
Students: Brian Tong, Kyuil Lee
Advisor: CJ Taylor
Innovations in dedicated AR and VR hardware like Hololens, Oculus Rift and HTC Vive have enabled developers and content creators to make amazing applications and immersive experiences. However, given the cost of these early adopter systems, most people have not experienced cutting edge AR/VR. The popularity of games like Pokemon GO demonstrates the immense interest in AR content and especially multiplayer content. Part of this popularity comes from the low barrier to entry (a smartphone with minimal system requirements). However, this comes at the cost of not being able to leverage newer, more powerful hardware for accurate but computation heavy AR tracking. Rune is a mobile platform for creating multiplayer marker-based AR content and is built around the idea of bringing better AR tracking to a wider audience while preserving the fun and social aspects of multiplayer content. It uses a custom designed fiducial marker for fast detection and player localization. Using OpenCV, we implement a real time detection algorithm that allows us to precisely overlay 3D content over the marker with OpenGL. Leveraging newer mobile hardware for this type of tracking gives us the ability to create more convincing and immersive AR experiences. For the multiplayer aspect, we implement a simple client-server model. Aggregation of user inputs, physics simulations and general state changes are carried out on the server and broadcast to all clients. This allows us to maintain a consistent state across clients while also reducing each client’s computation load. To demonstrate the features of the platform, we implemented a small multiplayer game.
Fixing Natural Occlusions in Facial Detection and Recognition
Student: Alexander Piatski
Advisors: Jianbo Shi and Joao Sedoc
Facial Detection (FD) and Recognition (FR) in modern biometric applications have pervaded various parts of everyday life, especially in Europe and Asia. However, these applications are still prone to errors on non-ideal image samples, particularly those containing natural facial occlusions such as glasses or facial hair. The project investigated FR and FD results on occluded facial profiles, and found a few ways to improve on them. The project took a benchmarked, open-source facial dataset to train a convolutional neural network (CNN) to work as a Generative Adversarial Network (GAN) to generate false facial images. A Haar-Cascade facial detector was then modified to infer various natural occlusions. The GAN would then generate a false facial segment on the masked part of the image. In general, the GAN reconstructions were not accurate enough to lead to improved levels of FR. However, they did lead to improved FD across various commercial, off-the-shelf detectors. Because most real-world FR systems don't take an image sample until a face is detected, FD improvements lead to increased efficiency of the FR system. Future work in the area should find or create a large database of individuals both with and without occlusions. Currently, most massive open-source facial datasets only have individuals in a single category. Such a database would allow for FR comparisons on an individual level, something that ended up outside the scope of this project.
Synchronized Video Viewer
Students: David Liao, Yoojin Kim, Zhan Xiong Chin
Advisor: Boon Thau Loo
The number of remote teams has been growing. Online platforms that facilitate remote collaboration have taken on an essential role in the workflow of those teams. However, most of these platforms are text-based. This project creates a platform that supports a collaborative video viewing experience. Synchronized Video Viewer creates a virtual room that allows synchronized video watching and other features, as though viewers were in the same room. Each user can create private rooms and invite other users. Each room supports synchronized playback of the selected video, including play, pause, and video time change. Users can select a video to watch or stream previously selected videos. While streaming, users can mark on the video using canvas tools, including drawing and erasing. If a user returns to a previous time stamp, the saved state of the video and the drawing is replayed.
Music from Video: Reconstruction Violin Performances
Student: Nathaniel Chodosh
Advisor: Jianbo Shi
In order to gain some traction in the area of machine understanding of musical performances, I have created a system for analyzing first person videos of violin performances. Specifically, my system can take a first person video of a violin performance and generate an audio track corresponding to what is being played. The audio prediction is performed by a deep learning system, so the first step in the pipeline is data regularization. In this context, regularization means applying image stabilization in order to fix the position of the violin, making the learning step easier. To accomplish this, I have implemented the well known Kanade-Lucas-Tomasi (KLT) algorithm over dense features. The KLT algorithm uses gradient descent to find an optimal alignment between the violin in neighboring video frames. Using the stabilized video, I then trained a deep neural network to predict the note being played in each frame of the video. All of the input videos together yield only ~26k data points, which isn't enough to train a network from scratch. To circumvent this, I implemented a version of a general image classifier, initialized its parameters to ones trained on millions of images, and then fine tuned the weights for the problem of note prediction. This yields a system that can predict notes with ~95% accuracy, where accuracy is measured as notes classified correctly divided by total samples. The predicted notes are then turned into an audio track using a MIDI synthesizer.
AccessiBill - Accessible Bill Tracking
Students: Jane Guo, Lilach Brownstein
Advisor: Chris Murphy
We believe that modern technologies and interfaces should be accessible and easy to use for anyone and everyone. Unfortunately, many groups of people today still have trouble incorporating software programs into their everyday lives. In this project, we hope to help some of these groups use more technologies in their daily tasks. First, solutions for personal data visualization, interface design, and accessibility were researched for a user group of people who are age 55+ and describe themselves as uncomfortable with using smartphones. Based on immersion research findings, a key need in this user group is a way to track personal bills and bill payment dates. Using initial findings, a prototype was developed in Photoshop and tested on the user group. The design was then refined and implemented on Android in Java and XML. Finally, structured user tests were conducted to further refine the design into a final product. The final product is an Android app that our users can download and use. The app uses local Android storage to store persistent information about the user's bills and due dates. Some key features in the app include color signaling, heavy use of words over icons, and skeuomorphism (features mimicking real life as much as possible).
Students: Gabriel Duemichen, Long Nguyen, Matt Wojcieszek
Advisors: Rakesh Vohra, (Sanjeev Khanna)
Our project consists of identifying and examining the cycles created by the exchange of favors, or non-monetary debts, between people. To do this, we created a website that allows users to enter favors and to rank them comparatively. We then used an algorithm based on TTC to identify positive sum cycles in which everyone receives something they value more than the favor they do. We have noticed that single task cycles do not start forming until the graph of favors becomes more clustered, due to variance in user preferences. We will continue analyzing graphical structures conducive to cycles, single missing link cycle detection, and task combination in cycles.
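The abstract gives no detail on the algorithm beyond its being based on TTC. Assuming TTC refers to the top trading cycles procedure, a minimal sketch of the cycle-finding step might look like the following. The function name, the preference format, and the example users are all hypothetical, and this is the textbook procedure rather than the project's variant.

```python
def top_trading_cycles(preferences):
    """Find favor-exchange cycles with a basic top trading cycles procedure.

    preferences maps each person to a list of people whose favor they want,
    most preferred first. Each person offers exactly one favor.
    Returns a list of cycles; in each cycle, every person receives the favor
    offered by the next person in the list (the last wraps to the first).
    """
    remaining = set(preferences)
    cycles = []
    while remaining:
        # Each remaining person points at the best favor still available,
        # or at themselves if none of their choices remain.
        points_to = {
            p: next((q for q in preferences[p] if q in remaining), p)
            for p in remaining
        }
        # Following pointers from any person must eventually repeat a node.
        node = min(remaining)  # deterministic starting point
        seen = []
        while node not in seen:
            seen.append(node)
            node = points_to[node]
        cycle = seen[seen.index(node):]
        cycles.append(cycle)
        remaining -= set(cycle)
    return cycles

# Hypothetical users and rankings.
prefs = {
    "ann": ["bob", "cat"],
    "bob": ["cat", "ann"],
    "cat": ["ann", "bob"],
    "dan": ["ann"],
}
cycles = top_trading_cycles(prefs)
trades = [c for c in cycles if len(c) > 1]  # length-1 cycles mean no exchange
print(trades)
```

Cycles of length one correspond to users who could not be matched; only longer cycles represent actual exchanges of favors.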
VisualizeNN: A Dynamic Artificial Neural Network Visualization Engine
Students: Alexander Yang, Helen Qu
Advisor: Shivani Agarwal
Artificial neural networks have recently come to the forefront in the field of machine learning, and have demonstrated expert proficiency at a variety of complex tasks. These successes have led to an explosion of demand for talent in industry and interest in academia. However, assistive tools for machine learning students and practitioners have not kept up with the pace of innovation, and are sorely lacking. Training neural networks can often be a frustrating experience, as their performance depends upon the practitioner's choice of a large number of parameters, which affect both the structure of the network and how it learns. As a result, practitioners are often left unable to determine both why their model underperforms and the steps necessary for improvement. Our tool improves upon the status quo by supplying practitioners with information regarding model state and performance during the training process in real time. By capturing the dynamics of model training, we illuminate a whole host of issues that were elided by previous tools, which could only interact with static, trained networks. Furthermore, we provide a highly extensible Python API that interfaces directly with Theano, allowing practitioners to easily use our visualization suite with their existing models and to extend it to support their own proprietary models as well. In private testing on a set of Kaggle competitions, use of our tool has been shown to markedly increase the speed at which practitioners were able to train models that achieved baseline accuracy levels on image classification tasks.
Incognito: Distributed Anonymous Browsing
Students: Kevin Yim, Mahir Karim
Advisor: Brett Hemenway
As you browse the internet, your activity can be recorded and tracked without your knowledge or consent. This information can be used for purposes ranging from gathering usage statistics to government surveillance. Current tools aim to improve privacy by obscuring a user's unique IP address, allowing them to browse without being tracked. However, these existing solutions incur serious latency penalties or fail to provide a dynamic, changing IP address. We present Incognito, a distributed system that obscures the user's IP by redirecting web browser requests through a pool of shared peers. By improving on existing solutions, we provide a service that aims to be fast, lightweight, and dynamic.
PlantSense: Bringing Hands-on Data Analysis Education to Philadelphia Students
Students: Margaret Li, Alex Hu, Elizabeth Walton
Advisors: Benedict Brown and Jorge Santiago-Aviles
The Philadelphia public school system faces three seemingly distinct but related problems. First, with average class sizes of 30-40 students and rising, teachers have neither the time nor the funds to create lesson plans that include more hands-on learning, and even if they did, classes are too large for these activities to be practical. Secondly, Philadelphia students never learn how to extrapolate hypotheses and assumptions when given data. Lastly, inner city students have very little access to fresh fruits and vegetables, as well as to the gardening that produces these foods. PlantSense offers a solution for all three problems. PlantSense is a web application that communicates with a hydroponics system to send and receive requests and data. The data about light intensity, pH, temperature, etc., which the hydroponics system monitors with different sensors, is combined with the video feed of plant growth to allow students to create graphs and then write and store hypotheses associated with said data. Our application gives students the chance to run various tests by changing the monitored variables, and to keep track of and add their thoughts about the results. Students also have personalized accounts that can be accessed from school, home, the library, or anywhere else they wish. After testing and further evaluation, we have found that PlantSense is able to make educational indoor gardening worthwhile by not only assisting in teaching how environmental factors impact plant growth but also teaching data analysis in a personalized and easy-to-use manner.
Students: Sacha Best, Scott Freeman, Sebastian Lozano, Nova Fallen
Advisor: Eric Eaton
News articles are meant to be unbiased sources of information. However, it is widely believed that most popular news sources are biased toward one political ideology or another. Devil's Advocate is a web application that helps users understand the bias in the news articles they read. Users can query the system with a topic of interest and retrieve articles from 15+ top news sources. Articles from those sources are split into sentences, and a bias score is assigned to each sentence. Articles have an aggregate bias score based on the sentences they contain, and sources are assigned a bias based on the articles they produce. Bias scores for each sentence are determined by predicting the probability of the sentence in dueling conservative and liberal language models trained on archival data. The language models are created using LSTMs, a variant of recurrent neural networks that works well on sequential data. On a provenance prediction task of determining whether sentences came from conservative or liberal archival articles, the model correctly predicted the source of 26.9% of test sentences, and labeled 57.52% of test sentences as neutral, indicating that the probability of the sentence is similar under both models. These evaluation results show that our system is able to detect features of a sentence that make it more likely to be “conservative” or “liberal.” The web application visually highlights the biased sentences in an interactive manner.
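The dueling-model scoring idea can be illustrated with a small sketch. The project uses LSTM language models; the version below swaps in simple add-one-smoothed unigram models so that it stays self-contained. The class and function names, the neutrality threshold, and the placeholder training text are all assumptions made for illustration.

```python
import math
from collections import Counter

class UnigramModel:
    """Add-one-smoothed unigram model (a stand-in for the project's LSTMs)."""

    def __init__(self, sentences, vocab):
        self.counts = Counter(w for s in sentences for w in s.lower().split())
        self.total = sum(self.counts.values())
        self.vocab_size = len(vocab) + 1  # +1 slot for unseen words

    def avg_log_prob(self, sentence):
        """Average per-word log probability of the sentence under this model."""
        words = sentence.lower().split()
        if not words:
            return 0.0
        log_prob = sum(
            math.log((self.counts[w] + 1) / (self.total + self.vocab_size))
            for w in words
        )
        return log_prob / len(words)

def bias_label(sentence, conservative_lm, liberal_lm, threshold=0.5):
    """Label a sentence by comparing its likelihood under the two models.

    The threshold is a hypothetical tuning parameter: sentences whose scores
    differ by less than it are treated as neutral.
    """
    diff = conservative_lm.avg_log_prob(sentence) - liberal_lm.avg_log_prob(sentence)
    if abs(diff) < threshold:
        return "neutral"
    return "conservative" if diff > 0 else "liberal"

# Placeholder corpora with made-up tokens, standing in for archival articles.
corpus_c = ["alpha beta alpha"]
corpus_l = ["gamma delta gamma"]
vocab = {w for s in corpus_c + corpus_l for w in s.split()}
lm_c = UnigramModel(corpus_c, vocab)
lm_l = UnigramModel(corpus_l, vocab)

for sentence in ["alpha alpha", "alpha gamma", "gamma delta"]:
    print(sentence, "->", bias_label(sentence, lm_c, lm_l))
```

Averaging the log probability per word keeps long and short sentences comparable; article-level and source-level scores could then be aggregated from these sentence labels, as the abstract describes.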
Students: Christopher Besser, Eric Kwong, Greg Dikopf, Nick Wein
Advisors: Swapneel Sheth and Arvind Bhusnurmath
GitStrategy is a tool for instructors and project managers to cleanly track contributions to GitHub repositories. Most existing technologies for monitoring git contributions have steep learning curves and produce output in formats that are not easily human readable, like .json. A few existing technologies, such as diff logs on github.com, may be human readable and user friendly, but are too tedious to dig through to completely analyze a repository. Furthermore, none of these tools are comprehensive collections of all interesting statistics. Project managers may need to use multiple analysis tools to fully analyze a single repository. GitStrategy simplifies analyzing repositories by providing a large assortment of repository statistics in a simple interface, while also rating the quality of the code. A project manager or instructor wishing to analyze a repository need only learn how to launch a node.js application. Node.js is far better documented than existing repository analysis tools, and is also a less specialized tool useful for a broader range of purposes. Our system uses the D3 library to convert these repository statistics into graphs that even users without an extensive knowledge of software engineering and Git practices can understand, and renders them in a web app interface. Users can choose the statistics they are interested in to tailor results to their needs. Because our tool rates code quality while displaying repository statistics like cyclomatic complexity over time, we believe it can be used to study the relationship between certain repository statistics and overall code quality.
XpenseBook: Simplify Expense Reports
Student: Molly Wang
Advisor: Chris Murphy
97% of travel managers and employees feel that preparing an expense report is the biggest barrier to efficient expense report processing. XpenseBook's goal is to help simplify the expense report process by streamlining users' biggest pain point (setting up, entering data, and attaching receipts) through a mobile app developed and refined by user-centered evaluation. XpenseBook is an Android app built using the Android Studio IDE, with Firebase for the database, authentication, storage, and notification. XpenseBook was created using a combination of the principles that underlie agile development and user-centered design (UCD). Both processes are iterative, incorporating testing with users and refinement. XpenseBook's user tests evaluated 1) increased efficiency and ease in the expense report creation process, by measuring the time it takes users to create an expense report when presented with the app for the first time; 2) app layout muscle memory, by measuring the time it takes users to create an expense report again; and 3) user interface intuitiveness, by recording what steps users take to create the expense report and any errors or confusion they experience. Test results showed increased efficiency in the expense report creation process with each successive UI iteration and 5% fewer errors compared to 3rd-party expense software used at an average company. One main takeaway from this project is how to balance Agile development and UCD, mimicking real-life product development cycles. It was interesting to learn about standout startups that have managed to make both processes work together.