Seminars, Seminar Topics, Notes on Engineering Computers, Electronics, Mechanical, Mba: Adding Intelligence to Internet

Satellites have been used for years to provide communication network links. Historically, the use of satellites in the Internet can be divided into two generations. In the first generation, satellites were simply used to provide commodity links (e.g., T1) between countries. Internet Protocol (IP) routers were attached to the link endpoints to use the links as single-hop alternatives to multiple terrestrial hops. Two characteristics marked these first-generation systems: they had limited bandwidth, and they had large latencies that were due to the propagation delay to the high orbit position of a geosynchronous satellite.

In the second generation of systems now appearing, intelligence is added at the satellite link endpoints to overcome these characteristics. This intelligence is used as the basis for a system for providing Internet access engineered using a collection or fleet of satellites, rather than operating single satellite channels in isolation. Examples of intelligent control of a fleet include monitoring which documents are delivered over the system to make decisions adaptively on how to schedule satellite time; dynamically creating multicast groups based on monitored data to conserve satellite bandwidth; caching documents at all satellite channel endpoints; and anticipating user demands to hide latency.

Two scaling problems face the Internet today. First, it will be years before terrestrial networks are able to provide adequate bandwidth uniformly around the world, given the explosive growth in Internet bandwidth demand and the amount of the world that is still unwired. Second, the traffic distribution is not uniform worldwide: Clients in all countries of the world access content that today is chiefly produced in a few regions of the world (e.g., North America). A new generation of Internet access built around geosynchronous satellites can provide immediate relief. The satellite system can improve service to bandwidth-starved regions of the globe where terrestrial networks are insufficient and supplement terrestrial networks elsewhere. This new generation of satellite system manages a set of satellite links using intelligent controls at the link endpoints. The intelligence uses feedback obtained from monitoring end-user behavior to adapt the use of resources. Mechanisms controlled include caching, dynamic construction of push channels, use of multicast, and scheduling of satellite bandwidth. This paper discusses the key issues of using intelligence to control satellite links, and then presents as a case study the architecture of a specific system: the Internet Delivery System, which uses INTELSAT's satellite fleet to create Internet connections that act as wormholes between points on the globe.


The first question is whether it makes sense today to use geosynchronous satellite links for Internet access. Alternatives include wired terrestrial connections, low earth orbiting (LEO) satellites, and wireless wide area network technologies (such as Local Multipoint Distribution Service or 2.4-GHz radio links in the U.S.).

We see three reasons why geosynchronous satellites will be used for some years to come for international Internet connections.

The first reason is obvious: it will be years before terrestrial networks are able to provide adequate bandwidth uniformly around the world, given the explosive growth in Internet bandwidth demand and the amount of the world that is still unwired. Geosynchronous satellites can provide immediate relief. They can improve service to bandwidth-starved regions of the globe where terrestrial networks are insufficient and can supplement terrestrial networks elsewhere.

Second, geosynchronous satellites allow direct single-hop access to the Internet backbone, bypassing congestion points and providing faster access time and higher net throughput. In theory, a bit can be sent the distance of an international connection over fiber in a time on the order of tens of milliseconds. In practice today, however, latencies of international connections via terrestrial links are an order of magnitude larger. For example, in experiments we performed in December 1998, the mean round-trip times between the U.S. and Brazil (vt.edu to embr.net.br) over terrestrial links were 562.9 msec (via teleglobe.net) and 220.7 msec (via gzip.net) [Habib]. In contrast, the mean latency between the two routers at the two endpoints of a satellite link between Bangladesh and Singapore, measured in February 1999, was 348.5 msec. A geosynchronous satellite has a sufficiently large footprint over the earth that it can be used to create wormholes in the Internet: constant-latency transit paths between distant points on the globe [Chen]. The mean latency of an international connection via satellite is competitive with today's terrestrial connections, but the variance in latency can be reduced.
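The theoretical side of this comparison can be checked with a rough propagation calculation. The sketch below is illustrative only: the fiber refractive index, the nominal geostationary altitude, and the 7,000 km route length are assumptions chosen for the example, not figures from the measurements cited above, and routers, queuing, and slant-range geometry are ignored.

```python
# Back-of-envelope propagation delays. Every constant here is an assumption
# made for illustration, not a value taken from the cited experiments.
C_VACUUM_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBER_INDEX = 1.47            # assumed refractive index of silica fiber
GEO_ALTITUDE_KM = 35_786.0    # nominal geostationary altitude above the equator


def fiber_one_way_ms(path_km: float) -> float:
    """One-way propagation delay over fiber, ignoring routers and queuing."""
    speed_in_fiber = C_VACUUM_KM_S / FIBER_INDEX
    return path_km / speed_in_fiber * 1000.0


def geo_one_hop_ms(up_km: float = GEO_ALTITUDE_KM,
                   down_km: float = GEO_ALTITUDE_KM) -> float:
    """One-way delay for ground -> satellite -> ground.

    The defaults place both stations directly under the satellite, which
    gives the smallest possible delay; real slant ranges are longer.
    """
    return (up_km + down_km) / C_VACUUM_KM_S * 1000.0


if __name__ == "__main__":
    route_km = 7_000.0  # hypothetical international route length
    print(f"fiber, {route_km:.0f} km one way: {fiber_one_way_ms(route_km):.1f} ms")
    print(f"geosynchronous single hop, one way: {geo_one_hop_ms():.1f} ms")
```

Under these assumptions the fiber delay comes out in the tens of milliseconds, while a single geosynchronous hop cannot go below roughly 240 msec one way. The satellite's appeal is that this delay is nearly constant, not that it is small.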

As quality-of-service (QoS) guarantees are introduced by carriers, the mean and variance in latency should go down for international connections, reducing the appeal of geosynchronous satellites. However, although QoS may soon be widely available within certain countries, it may be some time until it is available at low cost between most countries of the world.

A third reason for using geosynchronous satellites is that the Internet's traffic distribution is not uniform worldwide: clients in all countries of the world access content (e.g., Web pages, streaming media) that today is chiefly produced in a few regions of the world (e.g., North America). This implies that a worldwide multicast architecture that caches content on both edges of the satellite network (i.e., near the content providers as well as near the clients) could provide improved response time to clients worldwide. We use this traffic pattern in the system described in the case study (Section 3).

One final point of interest is whether the LEO satellites being deployed today will displace the need for geosynchronous satellites. The low orbital position makes the LEO footprint relatively small. Therefore, international connections through LEOs will require multiple hops in space, much as today's satellite-based wireless phone systems operate. The propagation delay accumulated over those hops will eliminate any latency advantage that LEOs have over geosynchronous satellites. On the other hand, LEOs have an advantage: they are not subject to the constraint in orbital positions facing geosynchronous satellite operators. So the total available LEO bandwidth could one day surpass that of geosynchronous satellites.
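The footprint argument can be made concrete with a little spherical geometry. The sketch below estimates the ground radius of the region from which a satellite is above the horizon. It assumes a spherical earth, a 0-degree minimum elevation angle, and a 780 km LEO altitude picked purely as an example; practical footprints are smaller because usable elevation angles are higher.

```python
import math

EARTH_RADIUS_KM = 6_371.0  # mean earth radius (spherical-earth assumption)


def footprint_radius_km(altitude_km: float) -> float:
    """Ground distance from the sub-satellite point to the horizon.

    The earth-central angle to the edge of visibility is
    acos(R / (R + h)); multiplying by R gives the arc length.
    """
    central_angle = math.acos(EARTH_RADIUS_KM / (EARTH_RADIUS_KM + altitude_km))
    return EARTH_RADIUS_KM * central_angle


if __name__ == "__main__":
    # Altitudes are illustrative assumptions, not taken from the text.
    for name, altitude in (("LEO (780 km)", 780.0), ("GEO (35,786 km)", 35_786.0)):
        print(f"{name}: footprint radius ~ {footprint_radius_km(altitude):.0f} km")
```

With these inputs the geosynchronous footprint radius is about three times that of the LEO, which is why a single geosynchronous hop can span distances that would need several LEO hops.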

The overall system must achieve a balance between the throughput of the terrestrial Internet connection going into the warehouse, the throughput of the warehouse itself, the throughput of the satellite link, the throughput of each kiosk, and the throughput of the connection between a kiosk and its end users. In addition, a balance among the number of end users, the number of kiosks, and the number of warehouses is required.

Consider some examples. As the number of end users grows, so will the size of the set of popular Web pages that must be delivered, and the bandwidth required for push, real-time, and timely traffic.

Let's look at Web traffic in detail. Analysis of end-user traffic to proxy servers at America Online done at Virginia Tech shows that an average user requests one URL about every 50 seconds, which indicates a request rate of 0.02 URLs per second. (This does not mean that a person clicks on a link or types a new URL every 50 seconds; instead, each URL requested typically embeds other URLs, such as images. The average rate of the individual URLs requested either by a person or indirectly as an embedded object is one every 50 seconds.) Thus, a kiosk supporting 1,000 concurrent users must handle a request rate of 20 per second (1,000 × 0.02). The median file size from the set of traces cited above (DEC, America Online, etc.) is 2 kilobytes [Abdulla]. Thus, the kiosk Hypertext Transfer Protocol (HTTP)-level throughput to end users must be about 40 kilobytes per second.

At the other end, the warehouse has a connection to the Internet. The bandwidth of this connection must exceed that of the satellite connection, because the warehouse generates cache consistency traffic. The servers within the warehouse and kiosk also have limited throughput, for example, the rate at which the cache engines can serve Web pages. To do multicast transmission, a collection of content (Web pages, pushed documents) must be bundled up at the application layer at the warehouse into a unit for transmission to a multicast group, then broken down into individual objects at the kiosk. This assembly and disassembly process also limits throughput.
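The kiosk sizing arithmetic can be reproduced with a few lines of code, which makes it easy to vary the inputs. This is a minimal sketch: the function name and parameters are hypothetical, and the defaults simply restate the per-user request interval and median object size quoted above.

```python
def kiosk_load(concurrent_users: int,
               seconds_per_request: float = 50.0,
               median_object_kb: float = 2.0) -> tuple:
    """Return (requests per second, HTTP throughput in kilobytes per second).

    Each user issues one URL request every `seconds_per_request` seconds,
    and every response is approximated by the median object size.
    """
    requests_per_second = concurrent_users / seconds_per_request
    throughput_kb_per_s = requests_per_second * median_object_kb
    return requests_per_second, throughput_kb_per_s


if __name__ == "__main__":
    rate, throughput = kiosk_load(1_000)
    print(f"{rate:.0f} requests/s, {throughput:.0f} KB/s")
```

Both results scale linearly with the user count, so the same function can be run in reverse to estimate what population a given kiosk throughput supports. Because Web object sizes tend to be skewed toward a few large files, a median-based figure is probably best read as a lower bound on the bytes actually served.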

A second issue is how to handle Web page misses at kiosks. If the kiosk has no terrestrial Internet connection, then these misses obviously must be satisfied over the satellite channel. This reduces the number of kiosks that a satellite link can handle. On the other hand, if the kiosk does have a terrestrial connection, an adaptive decision might be to choose the satellite over the terrestrial link if there is unused satellite capacity and if the performance of the terrestrial link is erratic.
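One way to picture this adaptive decision is as a small policy function evaluated on each cache miss. The sketch below is not the system's actual algorithm. The `LinkState` fields, the utilization threshold, and the use of the coefficient of variation of recent round-trip times as a measure of "erratic" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Sequence


@dataclass
class LinkState:
    """Hypothetical snapshot of what a kiosk knows about its two links."""
    satellite_utilization: float          # fraction of satellite capacity in use, 0..1
    terrestrial_rtts_ms: Sequence[float]  # recent RTT samples; empty if no terrestrial link


def choose_link(state: LinkState,
                max_utilization: float = 0.8,
                max_rtt_cv: float = 0.5) -> str:
    """Pick the link used to satisfy a cache miss.

    Thresholds are illustrative assumptions, not tuned values.
    """
    if not state.terrestrial_rtts_ms:
        return "satellite"  # no terrestrial link: the miss must go over the satellite

    has_spare_capacity = state.satellite_utilization < max_utilization
    average_rtt = mean(state.terrestrial_rtts_ms)
    # Coefficient of variation: spread of the RTT samples relative to their mean.
    rtt_cv = pstdev(state.terrestrial_rtts_ms) / average_rtt if average_rtt > 0 else 0.0
    is_erratic = rtt_cv > max_rtt_cv

    return "satellite" if has_spare_capacity and is_erratic else "terrestrial"
```

For example, `choose_link(LinkState(0.3, [100, 900, 150, 1200]))` selects the satellite, because the terrestrial samples vary widely and the satellite has spare capacity. The same samples with a satellite utilization of 0.95 fall back to the terrestrial link.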

A third issue is how to handle Domain Name System (DNS) lookups. A DNS server is necessary at kiosks to avoid the delay of sending lookups over a satellite. However, how should misses or lookups of invalidated entries in the kiosk's DNS server be handled? One option is for the DNS traffic to go over a terrestrial link at the kiosk, if one is available. An alternative is for the warehouse to multicast DNS entries to the kiosks, based on host names encountered in the logs transmitted from the kiosks to the warehouse.
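The multicast alternative depends on the warehouse knowing which host names the kiosks' users actually request. A minimal sketch of that selection step follows, assuming the logs forwarded by the kiosks contain full request URLs. The function name and the `top_n` cutoff are hypothetical, and resolving the names and multicasting the resulting records are left out.

```python
from collections import Counter
from typing import Iterable, List
from urllib.parse import urlsplit


def hosts_to_multicast(logged_urls: Iterable[str], top_n: int = 100) -> List[str]:
    """Return the most frequently requested host names found in kiosk logs.

    The warehouse could resolve these names and multicast the DNS entries
    so that kiosk DNS servers are populated before users ask for them.
    """
    counts = Counter()
    for url in logged_urls:
        host = urlsplit(url).hostname  # None for entries without a host part
        if host:
            counts[host] += 1
    return [host for host, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    # Example log entries are made up for illustration.
    sample = [
        "http://a.example/index.html",
        "http://b.example/",
        "http://a.example/logo.gif",
    ]
    print(hosts_to_multicast(sample, top_n=1))
```

Ranking by request count is only one possible policy. A real deployment would also need to respect the time-to-live of each DNS record when deciding how often to re-send it.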

A fourth issue is fault tolerance. If a kiosk goes down and reboots, or a new kiosk is brought up, there must be a mechanism for that kiosk to obtain information missed during the failure.

The idea for the IDS was conceived at INTELSAT, an international organization that owns a fleet of geostationary satellites and sells space segment bandwidth to its international signatories. Work on the prototype started in February 1998. In February 1999, the prototype system stands poised for international trials involving ten signatories of INTELSAT. A commercial version of IDS will be released in May 1999.

The building blocks of IDS are warehouses and kiosks. A warehouse is a large repository (terabytes of storage) of Web content. The warehouse is connected to the content-provider edge of the Internet by a high-bandwidth link. Given the global distribution of Web content today, an excellent choice for a warehouse could be a large data-center or large-scale bandwidth reseller situated in the U.S. The warehouse will use its high-bandwidth link to the content providers to crawl and gather Web content of interest in its Web cache. The warehouse uses an adaptive refreshing technique to assure the freshness of the content stored in its Web cache. The Web content stored in the warehouse cache is continuously scheduled for transmission via a satellite and multicast to a group of kiosks that subscribe to the warehouse.

The centerpiece of the kiosk architecture is also a Web cache. Kiosks represent the service-provider edge of the Internet and can therefore reside at national service providers or ISPs. The storage size of a kiosk cache can therefore vary from a few gigabytes to terabytes. Web content multicast by the warehouse is received, filtered for subscription, and subsequently pushed into the kiosk cache. The kiosk Web cache also operates in the traditional pull mode. All user requests for Web content to the service provider are transparently intercepted and redirected to the kiosk Web cache. The cache serves the user request directly if it has the requested content; otherwise, it uses its link to the Internet to retrieve the content from the origin Web site. The cache stores a copy of the requested content while passing it back to the user who requested it.
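The combined push and pull behavior of the kiosk cache can be summarized in a short sketch. It is a simplified illustration, not the IDS implementation. The class name, the notion of named subscription channels, and the `fetch_origin` callback are assumptions, and eviction, freshness checks, and HTTP details are omitted.

```python
from typing import Callable, Dict, Set


class KioskCache:
    """Toy model of a kiosk Web cache with push and pull paths."""

    def __init__(self, subscriptions: Set[str],
                 fetch_origin: Callable[[str], bytes]) -> None:
        self.subscriptions = subscriptions  # channels this kiosk subscribes to (hypothetical)
        self.fetch_origin = fetch_origin    # retrieves a URL from the origin Web site
        self.store: Dict[str, bytes] = {}

    def on_multicast(self, channel: str, url: str, body: bytes) -> bool:
        """Push path: keep multicast content only if the kiosk subscribes to it."""
        if channel not in self.subscriptions:
            return False
        self.store[url] = body
        return True

    def get(self, url: str) -> bytes:
        """Pull path: serve from the cache, else fetch from the origin and keep a copy."""
        if url in self.store:
            return self.store[url]
        body = self.fetch_origin(url)
        self.store[url] = body
        return body
```

A request for a page that arrived by multicast is served without touching the kiosk's Internet link. Any other page is fetched from the origin once, and later requests for it are served from the cache.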
