CIS 455 / 555: Internet and Web Systems

Spring 2010

Location: Towne 313, Monday/Wednesday 10:30AM - 12:00PM

Instructor

Zachary Ives, zives@cis.upenn.edu, (215) 746-2789
Location: 576 Levine Hall N (a.k.a. GRW building)
Office hours: Mon 1:30-2:30 or by arrangement

Teaching Assistants

Katie Gibson, gibsonk@seas.upenn.edu
Office hours: Wed 1:00-2:00, 612 Levine

Ruchir Jha, ruchirj@seas.upenn.edu
Office hours: Th 3:00-4:00, 612 Levine

Chandni Singh, chandni@seas.upenn.edu

Course Objectives

This course focuses on the issues encountered in building Internet and web systems: scalability, interoperability (of data and code), atomicity and consistency models, replication, and location of resources, services, and data. Note that it is not about building database-backed or PHP/JSP/Servlet-based web sites (for this, see CIS 330/550). Here, we will learn how a Servlet server itself is built!

We will examine how XML standards enable information exchange; how web services support cross-platform interoperability (and what their limitations are); how "cloud computing" services work; how to do replication and Akamai-like content distribution; and how application servers provide transaction support in distributed environments. We will study techniques for locating machines, resources, and data (including directory systems, information retrieval indexing and ranking, web search, and publish/subscribe systems); we'll discuss collaborative filtering and mining the Web for patterns; we'll investigate how different architectures support scalability (and the issues they face). We'll also examine the ideas that have been proposed for tomorrow's Web, including the "Semantic Web," and see some of the challenges, research directions, and potential pitfalls.

An important goal of the course is not simply to discuss issues and solutions, but to provide hands-on experience with a substantial implementation project. This semester's project will be a peer-to-peer implementation of a Google-style search engine, including distributed, scalable crawling; indexing with ranking; and even PageRank. We will also incorporate the use of topic-specific recognizers and mash-ups in order to support forwarding of certain searches to Google Maps.

As a side effect of the material of this course, you will learn about some aspects of large-scale software development: assimilating large APIs, thinking about modularity, reading other people's code, managing versions, debugging, and so on.

News

Texts and Readings

Prerequisites

This course expects familiarity with threads and concurrency, as well as strong Java skills (those highly proficient in C++ or C# may be able to translate their skills easily). It requires a significant amount of programming and the ability to work with your classmates in teams.

Format

The format will be two one-and-a-half-hour lectures a week, plus assigned readings from handouts. There will be regular homework assignments and a substantial implementation project with experimental validation and a report, plus a midterm and a final exam.

Grading

Graded work consists of three homework assignments, a midterm, a final exam, and the project. Breakdown: homework 25%, midterm 15%, final 15%, project 40%, participation/intangibles 5%.

Other Resources

Significant Dates


Last revised: January 7, 2010