Size Does Matter!
(From the Age of Closed-Loop to the Age of Open-Loop)

Roch Guerin (guerin@ee.upenn.edu)
University of Pennsylvania

By many accounts, the Internet is out of control (spam, DDoS, DNS attacks, etc.), and in spite of its touted simplicity and flexibility, the sheer weight and complexity of existing protocol suites make introducing fixes to those problems increasingly difficult. Yet the Internet keeps growing, and it keeps allowing experimentation with and deployment of new services. It morphs with every "killer app" that gets added, from the web, to Skype, to YouTube and Joost. There are glitches and setbacks, but the growth continues unabated in spite of repeated predictions of impending doom, e.g., from a video overdose, as many have recently warned. So on one hand, the Internet is faced with a series of increasingly serious and potentially crippling problems, and on the other hand it is getting bigger and more diverse by the day, with new services and new types of users, e.g., wireless devices, being continuously added. This makes for a challenging situation, where "fixing the Internet" could be compared to trying to change wheels on an accelerating freight train. This is hard not only because solutions need to aim at a moving target, but also because the very fact that the target is still moving shows that the Internet is far from the stage where its deficiencies have hurt it enough to make it ripe for either major changes or replacement. By way of comparison, one could argue that the phone network had been at a standstill for many years before the Internet came along and ultimately displaced it. The current situation is vastly different.
However, while the Internet's sheer size and its continuing growth are the main contributors to the challenges one faces when attempting to improve it, they also create opportunities to leverage size as a critical component in developing solutions to these problems. In particular, the magnitude of the Internet's coverage and reach also means that it offers an incredible diversity in terms of both connectivity and the characteristics of that connectivity. This global diversity is further supplemented by the growing availability of local diversity in the form of new access technologies, e.g., multi-homing, multi-band radios, etc. In other words, in today's Internet, there is increasingly a plethora of choices for realizing connectivity between any two points. As a result, as the Internet and its diversity keep growing, the odds that the many connectivity options it offers all go bad simultaneously are fast diminishing, and therein lies the key to developing successful solutions. Specifically, while we should still strive to improve the stability, correctness, security, performance, etc., of the many protocols and technologies that play a role in delivering the ubiquitous connectivity that has made the Internet so successful, solutions that are to keep up with its growth, rather than hinder it, must directly take advantage of that growth. This has a number of important implications. First and foremost, it is critical both to provide easy access to and to leverage the different connectivity options that may be available at any given point in the Internet. This affects routing protocols, in particular a protocol such as BGP that is predicated on the selection of a single (best) path, as well as the forwarding behavior of routers.
For example, if diversity is viewed as desirable, it should be incorporated in the path selection process to favor choices that maintain or increase the number of forwarding alternatives available. Similarly, on the forwarding side, packets of an individual flow should preferably be forwarded over as many distinct paths as possible, rather than all be mapped onto the same next hop as is commonly done in today's routers. A second important implication is that both end-systems and routers should incorporate mechanisms that seek to directly exploit diversity. In particular, packet replication becomes an important functionality, which, when coupled with mechanisms that map different packet replicas onto different forwarding decisions, can deliver significant improvements in performance and reliability, simply by taking advantage of the diverse characteristics of the underlying paths (they may not all be perfect, but they will most likely not all be bad). In a sense, this is no different from the age-old design principle of building highly reliable systems from unreliable components. Specifically, the Internet is just like any other complex, large-scale system: its reliability can to some extent be improved by making its components more reliable, but only redundancy can provide a solution that overcomes the intrinsic challenge associated with scale. Last but not least, as is common when scale is a key factor, reactive solutions are to be used only with caution. This is not only because the many possible root causes behind observed changes can make identifying the appropriate reaction difficult, but also because large-scale reactions, e.g., from many users, can introduce problems of their own, such as instabilities and oscillations. In other words, open-loop approaches should and will play a major role in developing solutions that can scale with the Internet.
This will be true for both network control functions, e.g., the computation of backup paths, and mechanisms at the level of end-user flows, e.g., the use of diversity coding as previously alluded to. In summary, solving the Internet's problems can only be realized by viewing its size as an advantage rather than as a factor that renders solutions more complex and difficult. With size comes diversity, and by exploiting diversity it is possible to develop solutions that combine multiple weak links into one strong link. More generally, it is important to acknowledge that no matter what improvements one can make to individual components, achieving reliability in any large-scale system requires some level of redundancy. The focus should, therefore, be on exploring how best to add and utilize redundancy across the Internet and its users.
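
The claim that several weak links can be combined into one strong link can be illustrated with a small calculation. The sketch below is not taken from any deployed mechanism; it assumes that each path drops packets independently with a fixed loss rate, which real Internet paths only approximate (paths that share links fail together). The function names and the loss rates are hypothetical, chosen purely for illustration. One replica of each packet is sent on every path, open-loop, with no feedback and no retransmission, and a packet counts as delivered if at least one replica arrives.

```python
import random
from math import prod


def delivery_probability(loss_rates):
    """Probability that at least one replica arrives.

    Assumption: path losses are independent, so the probability that
    every replica is lost is the product of the per-path loss rates.
    """
    return 1.0 - prod(loss_rates)


def simulate_replication(loss_rates, packets=100_000, seed=0):
    """Estimate the delivered fraction under open-loop replication.

    Each packet is replicated once per path. There is no feedback and
    no retransmission; the sender never reacts to observed losses.
    """
    rng = random.Random(seed)
    delivered = 0
    for _ in range(packets):
        # A replica survives a path when the random draw is at or above
        # that path's (hypothetical) loss rate.
        if any(rng.random() >= loss for loss in loss_rates):
            delivered += 1
    return delivered / packets


if __name__ == "__main__":
    # Hypothetical loss rates for three individually poor paths.
    paths = [0.10, 0.20, 0.30]
    for k in range(1, len(paths) + 1):
        used = paths[:k]
        print(k, "path(s):",
              "analytic", round(delivery_probability(used), 4),
              "simulated", round(simulate_replication(used), 4))
```

Under the independence assumption, three paths that individually lose 10%, 20% and 30% of packets jointly lose only 0.6% of them. The price is bandwidth, since every packet is sent three times; schemes such as the diversity coding mentioned earlier aim to obtain similar protection with less overhead than plain replication.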