Computing on the Edge

By Jim Dutton


What does your PC do while you're sleeping? If the answer is "compute the trajectories of flying toasters," then you may not be using your computing hardware to its fullest extent.

And yet, every time you search the web or send an email, you're using someone else's server and paying them, in one way or another, for the privilege. That's because your little PC probably isn't powerful enough to index the entire web or store emails from everyone in your organization. Given that there is a relatively small number of server machines at the center of the Internet and a very large number of PCs around the edge, wouldn't it be cool if there were some way to pool the unused resources of all (or some large number of) the edge machines in order to accomplish some of these resource-intensive tasks?

That's exactly what's happening with peer-to-peer computing or P2P, the latest industry fad and this week's essential buzzword of the VC community. For example, each user of Napster, the music sharing program, is essentially donating spare cycles and storage on her PC for others' use, in return for a symmetrical promise that, when she needs to find an MP3 file, other users will pitch in to find it and send it to her. In that sense, the Napster network is a little bit like a cooperative computing service where users barter resources in a very specific marketplace with exactly one commodity: digitized songs. From another perspective, Napster's peer-to-peer network is like a big old mainframe, made up of millions of PCs, to which individual users can submit batch jobs (searches and downloads) for execution.

Virtual Mainframes

If you read the trade rags, you know that Andreessen invented hypertext, Gosling invented portable programming, and Napster invented peer-to-peer computing. Well, sort of. Napster deserves tremendous credit to be sure (as do the other two, I guess), but as is usually the case with revolutionary technologies, the essential ideas underlying the revolution have evolved for quite some time.

For example, Datapoint Corporation, a large builder of desktop mini-computers and distributed operating systems in the late 1970s, built and deployed something called the Batch Job Facility, or BJF. This was before the WWW, before Ethernet, even before Microsoft and the IBM PC, and Datapoint needed a way to sell local area networks, which they invented (sort of), into enterprises that only understood mainframe computing. Their solution was to let individual users work on the desktop machines and LANs during the workday and then donate those resources to the BJF at night and on weekends. The BJF managed the distributed resources and queued up jobs to be executed as the necessary resources became available. It didn't matter whether the required CPU, storage, and peripherals were in the next room or across town (all Datapoint's buildings in San Antonio were connected via line-of-sight IR links on the roofs, which effectively produced a citywide WAN, except on foggy days).
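
The queue-until-resources-appear idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not Datapoint's actual design; the function name, the job names, and the resource labels are all made up for the example.

```python
from collections import deque

def dispatch(queue, machines):
    """Start every queued job whose needs some idle machine can meet.

    queue:    deque of (job_name, set_of_required_resources)  [hypothetical format]
    machines: dict of machine_name -> set of resources it offers while idle
    Returns a list of (job_name, machine_name); unmatched jobs stay queued.
    """
    started = []
    idle = dict(machines)              # copy so the caller's dict is untouched
    for _ in range(len(queue)):        # visit each queued job exactly once
        job, needs = queue.popleft()
        # First idle machine that offers everything the job requires.
        host = next((m for m, offers in idle.items() if needs <= offers), None)
        if host is None:
            queue.append((job, needs))  # keep waiting; relative order is preserved
        else:
            del idle[host]              # one job per machine in this sketch
            started.append((job, host))
    return started

# Nightly run: two desktops have been donated, nobody has a tape drive.
queue = deque([("payroll", {"cpu", "printer"}),
               ("report", {"cpu"}),
               ("backup", {"tape"})])
machines = {"desk1": {"cpu"}, "desk2": {"cpu", "printer"}}
started = dispatch(queue, machines)
```

Here the payroll and report jobs start, while the backup job stays in the queue until some machine with a tape drive goes idle.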

But that's pre-history. More recent examples of systems that utilize your desktop machine to accomplish distributed tasks aren't that hard to find. In fact, the killer app that immediately preceded Napster as the eyeball champion of the universe also embodied some P2P aspects.

Two Peers, Talking

Mirabilis, the company that invented instant messaging (sort of), built ICQ to allow Internet users to send text messages to one another without going through an intervening server. In other words, server-less email. This is a far cry from a virtual mainframe, but it does utilize the computing and communication resources of the edge machines in order to perform simple tasks. This direct connection between users' PCs is more scalable, more robust, and faster than having to send all messages through a central server. But the architecture is not without its difficulties. Firewalls, erected to prevent unblessed message traffic into and out of users' machines, wreak havoc with the pure P2P setup and require an auxiliary server to relay messages, often masquerading as an HTTP server in order to fool the firewall. The term "HTTP tunneling" is a euphemism for this kind of deliberate security breach.

Instant messaging applications also demonstrate a common requirement of many P2P systems: the need to know whether the user on the other end is currently online and available to receive your message. Some instant messaging systems, like Ding! from Activerse, provide this presence information in a P2P manner as well, while others revert to a client-server model to publish information about a user's online status. Apart from the technical advantages of publishing presence information from the edge machines rather than a central server, this turns out to be an important distinction from the legal and political viewpoints as well.

Napster's Achilles' Heel

Who owns the Internet? The World Wide Web? Silly questions, right? Now, who owns instant messaging? The answer is: AOL. Between AOL Instant Messenger and ICQ, the uber ISP controls over 90% of all users of IM software. The reason they are able to corner that market and prevent others (e.g., Microsoft) from tapping into their network is that the presence components of their IM systems are centralized rather than distributed. This gives AOL an opportunity to refuse to supply the necessary information to anybody else's client, which is a good thing for AOL, and for the founders of Mirabilis who received about $300M for inventing this centralized presence system (sort of). But for the Internet community as a whole, including users who prefer to use Yahoo Messenger or MSN Messenger and still speak to their ICQ buddies, it's a giant bummer.

And believe it or not, Napster made the same mistake. Although the file storage and file transfers occur on the edge machines in the Napster architecture, the directory of tunes, the Napster namespace, resides in a central server in order to make searching easier. Unfortunately, this architectural anomaly has turned out to be the short hairs by which the RIAA has grabbed Napster in its attempt to destroy the service and crush what the recording industry sees as a major threat to its well-being. The central directory, which is the only constant piece of the system apart from the protocol, is viewed as proof that Napster is promoting the distribution of unlicensed digital copies of copyrighted music. Once again, giant bummer for the Internet community.
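
To make the split concrete, here is a minimal Python sketch of a central directory that knows only who has which title, while the files themselves stay on the edge machines. It is not Napster's actual protocol or data model; the class, its methods, and the peer names are hypothetical.

```python
from collections import defaultdict

class CentralDirectory:
    """Hypothetical central index: maps song titles to the peers that hold them.

    Only names live here. The files themselves never pass through the server.
    """

    def __init__(self):
        self._index = defaultdict(set)   # lowercased title -> set of peer ids

    def register(self, peer_id, titles):
        """A peer announces the titles it is willing to share."""
        for title in titles:
            self._index[title.lower()].add(peer_id)

    def unregister(self, peer_id):
        """A peer goes offline; drop it from every title."""
        for peers in self._index.values():
            peers.discard(peer_id)

    def search(self, text):
        """Return {title: [peers]} for titles containing the search text."""
        text = text.lower()
        return {title: sorted(peers)
                for title, peers in self._index.items()
                if text in title and peers}

directory = CentralDirectory()
directory.register("alice", ["Blue Moon"])
directory.register("bob", ["Blue Moon", "Moon River"])
hits = directory.search("blue")   # the download itself would go peer to peer
```

Every search passes through this one object, which is exactly why it is both the convenient part of the design and the part that is easiest to shut down.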

Hoping to avoid the wrath of recording industry lawyers, some vendors have expanded the MP3 sharing service to include other kinds of files and have repaired the Achilles' heel by storing and searching the namespace on the edge machines as well. Nullsoft, Inc., the inventors of skinnable desktop clients (sort of), released an open-source file sharing app called Gnutella which uses a worm-like algorithm to search shared directories on thousands of connected machines. Ironically, Nullsoft is a wholly owned subsidiary of America Online, which is about to merge with Time Warner, which is a founding member of the RIAA. Not surprisingly, the Gnutella release was yanked from Nullsoft's site within hours, but not before the genie was out of the bottle and off to live a life of its own on the West Coast. The client is still available for download from multiple sites and, since it doesn't require a central server for anything, neither AOL nor the RIAA can stop it now.
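
A search that spreads from neighbor to neighbor can be sketched as a hop-limited flood. The Python below is a simplified simulation run in one process, not the Gnutella wire protocol; the function name, the hop limit (ttl), and the sample network are assumptions made for illustration.

```python
def flood_search(network, shared, start, text, ttl=3):
    """Search peers reachable within `ttl` hops of `start`.

    network: dict of peer -> list of neighboring peers   [hypothetical format]
    shared:  dict of peer -> list of shared filenames
    Returns {peer: [matching filenames]}. There is no central index anywhere.
    """
    text = text.lower()
    seen = {start}          # never ask the same peer twice
    frontier = [start]      # peers being asked at the current hop
    hits = {}
    while frontier and ttl >= 0:
        next_frontier = []
        for peer in frontier:
            matches = [f for f in shared.get(peer, []) if text in f.lower()]
            if matches:
                hits[peer] = matches
            # Pass the query on to neighbors that have not seen it yet.
            for neighbor in network.get(peer, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
        ttl -= 1            # each hop uses up one unit of the hop limit
    return hits

# A chain of five peers: a - b - c - d - e
network = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
shared = {"c": ["Song2.mp3"], "e": ["song.mp3"]}
near = flood_search(network, shared, "a", "song", ttl=2)
far = flood_search(network, shared, "a", "song", ttl=4)
```

With a limit of two hops the query reaches only peer c; raising the limit to four lets it reach peer e as well. The hop limit is what keeps one search from washing over the entire network.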

Little Green Peers

File sharing and direct communications are interesting P2P applications, but they are limited-purpose apps that don't come close to the anything-for-a-buck, general purpose nature of my old BJF. However, there is a company that is using edge resources to solve computationally difficult problems of a very general nature, like graphics rendering and genome mapping. The company, United Devices, Inc. of Austin, Texas, is closely related to an academic peer computing project at UC Berkeley called SETI@Home, an ongoing attempt to search voluminous radio telescope data in order to find signs of intelligent life in other star systems. Presumably, they will also have underutilized edge computers.

Rather than rely on a tit-for-tat reward system like Napster's (I'll share my music if you'll share yours), United Devices actually plans to pay subscribers based on the number of MIPS they donate to the system in a given time period. That's because the end product of the distributed computation is not likely to be interesting or even accessible to the people whose edge machines helped produce it. Of course, there are problems, like web searching, that are amenable to this kind of distributed solution and whose results would be very useful to the resource donors themselves.

A P2P Platform

There are many, many possible applications of peer-to-peer architectures. Problem is, distributed applications are notoriously hard to design and code properly. What's needed is a standard framework, a platform for creating P2P tools and toolsets, that provides services for account maintenance and authentication, security and encryption, data synchronization, persistent storage, and peer communication.

As of October 24, 2000, such a platform exists and is available for download and use by P2P developers. The company and its self-titled first product are called Groove. Founded by Ray Ozzie, of Lotus Notes fame, the company has been operating in stealth mode for several years and has produced what appears to be a well-thought-out, reasonably complete platform, with a documented software development kit and a number of usable example tools. The initial toolset includes presence and instant messaging, chat, shared whiteboard, co-browser, and file sharing, all done without the aid of a central server.

So, in order to start using up all those excess edge cycles, I have downloaded the Groove SDK and plan to use it to create a brand new peer-to-peer application which I have just invented (sort of). Imagine a bunch of little toasters, with wings...


Created by: dutton on: 2000-11-29
Last modified by: dutton on: 2000-11-29
Copyright © 2000 by Jim Dutton
All Rights Reserved