CIS 455 / 555: Internet and Web Systems

Spring 2008

Location: Towne 315, Tuesday/Thursday 4:30 - 6:00PM

Instructor

Zachary Ives, zives@cis.upenn.edu, (215) 746-2789
Location: 576 Levine Hall N (a.k.a. GRW building)
Office hours: T 3:30-4:30 (before class) or by arrangement

Teaching Assistant

Mengmeng Liu, mengmeng@seas.upenn.edu
Location: 575 Levine Hall N
Office hours: Wed 3:00-4:00PM or by arrangement

Course Objectives

This course focuses on the issues encountered in building Internet and web systems: scalability, interoperability (of data and code), atomicity and consistency models, replication, and location of resources, services, and data. Note that it is not about building database-backed or PHP/JSP/Servlet-based web sites (for this, see CIS 330/550). Here, we will learn how a Servlet server itself is built!

We will examine how XML standards enable information exchange; how web services support cross-platform interoperability (and what their limitations are); how to do replication and Akamai-like content distribution; and how application servers provide transaction support in distributed environments. We will study techniques for locating machines, resources, and data (including directory systems, information retrieval indexing and ranking, web search, and publish/subscribe systems); we'll discuss collaborative filtering and mining the Web for patterns; we'll investigate how different architectures support scalability (and the issues they face). We'll also examine the ideas that have been proposed for tomorrow's Web, including the "Semantic Web," and see some of the challenges, research directions, and potential pitfalls.

An important goal of the course is not simply to discuss issues and solutions, but to provide hands-on experience with a substantial implementation project. This semester's project will be a peer-to-peer implementation of a Google-style search engine, including distributed, scalable crawling; indexing with ranking; and even PageRank. We will also incorporate the use of topic-specific recognizers and mash-ups in order to support forwarding of certain searches to Google Maps.

As a side effect of the material of this course, you will learn about some aspects of large-scale software development: assimilating large APIs, thinking about modularity, reading other people's code, managing versions, debugging, and so on.

News

Texts and Readings

Prerequisites

This course expects familiarity with threads and concurrency; hence a prerequisite is CSE 380 (Operating Systems), CIS 505, or equivalent. Knowledge of Java programming is also required. A suggested prerequisite is CSE 330 (Database Management Systems), CIS 550, or equivalent. This course requires a significant amount of programming and the ability to work in teams with your classmates.

Format

The format will be two one-and-a-half-hour lectures a week, plus assigned readings from handouts. There will be regular homework assignments and a substantial implementation project with experimental validation and a report, plus a midterm and a final exam.

Grading

The graded components are three homework assignments, a midterm, a final exam, and the project. Breakdown: homework 21%, midterm 14%, final 20%, project 40%, participation/intangibles 5%.

Other Resources

Significant Dates


Last revised: January 13, 2008