Happy Birthday Gnutella!

March 20, 2010

OK, I’ve been severely neglecting the ol’ blog, I know. I’m breaking the silence to join Janko Roettgers and NewTeeVee in wishing Gnutella a Happy Birthday in the week of its 10-year anniversary. We had some great times at LimeWire and battling it out on the Gnutella Developer’s Forum back in the day.

One of my favorite moments had to be in the summer of 2000 when Vinnie Falco, creator of BearShare, and my fellow LimeWire developer and partner in crime in the old days, Chris Rohrs, would talk about Gnutella architecture decisions. Vinnie’s an Italian American who was living in Miami Beach at the time. Chris is an MIT grad who grew up one town away from me in Deerfield, MA, one of the oldest towns in the country (close to my beloved Greenfield), and about as culturally far away from Miami Beach as one can get in the US. Chris would give Vinnie some piece of advice about this or that, and Vinnie would respond with the perfect Miami Beach Italian American accent, “Hey, you’re talking to the Vin.” And that’s how the Gnutella network was built.

Chris has been tearing it up at Google for years now, and Vinnie probably owns some tiny island in the South Pacific where he buried a chest full of BearShare gold back in 2002.

Check out Janko’s more thorough article, the best and one of the few histories of Gnutella I’ve seen written down.

I’ll be writing more here soon. I’ve been hammering out so much code for so many different things (all feeding back into LittleShoot) that it’s been a struggle to find time to blog about them. Technology’s progressing so rapidly on so many fronts, it’s more fun to keep coding! Stay tuned for news about LittleProxy, a sweet little HTTP proxy I’ve been working on using Netty. LittleProxy will be a core component of a new system for circumventing censors around the world. Oh yeah, and lots of other stuff too.

LimeWire Arista RIAA Deposition Recap

February 16, 2008

So I finished my grueling 6 hour deposition an hour or so ago. Present at the deposition were Greg Bildson of LimeWire, Charles Baker (my and LimeWire’s counsel), the RIAA counsel, RIAA special advisor, Kelly Truelove, the counsel for Arista et al from Cravath, the stenographer, and the videographer. I would have liked to have released and distributed the video of the deposition on LittleShoot, Gnutella, and my web servers as a clear demonstration of non-infringing uses, but it looks like it will not be publicly released for the time being.

I fear my testimony damaged LimeWire’s case in large part due to various discussions with Mark Cuban, Jim Griffin, Serguei Osokine and others on the Pho list. Here’s a little excerpt I wrote on 10/27/06:

I believe passionately in p2p and believe it has a bright future, but I do not support the vast majority of p2p companies out there because they’re almost entirely devoted to infringement.

The Cravath lawyer highlighted this and several similar comments as indicating I think LimeWire is completely devoted to distributing infringing content. They successfully pinned me down on this point with precise “yes” and “no” questions, as in “do you have any reason to think you did not write that statement.” I don’t think LimeWire actively sought to make money from infringing content. I think LimeWire was in large part a victim of its historical time, a time when the Internet was still a baby and when users were not savvy about producing and distributing their own works. As a result, the vast majority of digital content available at the time was copyrighted, but only because that’s what the users had. YouTube was not possible then because you didn’t have a threshold of the population who would be comfortable uploading videos to servers and because bandwidth wasn’t cheap enough.

That said, LimeWire is primarily used for distributing infringing material, but it’s clearly the users distributing that material outside of the intents of the LimeWire creators, myself included. When I started working at LimeWire, we were building the Lime Peer Server and planning how Gnutella would be used to search for everything from apartment listings to cars. Despite our best efforts, those plans never came to fruition. My primary critique of LimeWire and of other p2p applications is that they didn’t think as creatively as they could have about other uses of the technology, with the exception of Skype. The conversation on Pho was in the aftermath of the YouTube sale when the potential for distributing non-infringing content was obvious. I think we could have seen that sooner at LimeWire and could have more actively pursued a p2p-enabled YouTube using DMCA protections, but that’s easy to say in retrospect.

My comments on Pho were somewhat taken out of context. The Cravath lawyer succeeded in what apparently is the oldest trick in the book: put you to sleep with hours of mind-numbing questioning about the details of query routing hashes and long-forgotten forum posts before slipping in the key potentially incriminating questions just when they think your brain has turned to complete mush. By the time they got to the questions on Pho, I couldn’t remember my name let alone articulately clarify my thoughts on a forum thread from over a year ago. This prevented me from continually pointing out that the Pho forum threads were focused on the details of YouTube’s protections under the DMCA safe harbors and how they could apply to p2p.

Here’s another snippet from Pho they highlighted. I believe I wrote this in response to one of Jim Griffin’s comments:

I agree the underlying technology for LimeWire and Skype are similar. The point is that one makes all of its money off of infringing content while the other does not. You think that’s all great in the spirit of innovation. I think they should be as innovative with their businesses as they are with their technology, like Skype. You say they make money from the same source, I guess the technology. I think that’s ridiculous. There’s so much room to innovate with p2p outside of infringement that it’s mind boggling there hasn’t been more.

The key issue is that, while LimeWire clearly makes money from users’ infringement, they never intended that to be the case. It’s the content that’s infringing, not LimeWire. I simply wished we thought bigger — thought beyond the existing uses of the technology, along the lines of what Skype was able to do. That’s not to say it would have been easy, however, and that’s not to say LimeWire’s liable because they did not more vigorously pursue more creative paths.

As I emphasized continually in the deposition, we were always creating a generalized tool for media distribution. It was a tool for dynamically searching millions of computers for any type of content. We worked with universities around the world, particularly the Stanford Peers Group, on creating the most efficient algorithms for distributed search. Our competitors included Google and Yahoo as much as they did Kazaa, a point the Cravath lawyer failed to fully appreciate or take seriously, even though I could not have been more serious.

If you’re giving a deposition any time soon, my advice is to continually stay on your toes and to watch out for the ol’ “put you to sleep with the most boring questions you can possibly imagine” trick. It’s a trap.

Hopefully in the long run the First Amendment will matter more than making sure the record industry has plenty of cash to pay the most expensive lawyers in the business to help line their pockets.


O’Reilly, GData, Open Standards

September 4, 2006

Tim O’Reilly’s post about GData and the importance of open standards articulates the argument for expanding the open infrastructure, for standardizing the “small pieces” that together do the heavy lifting of the Internet and make everything work together.

I like David Weinberger’s “small pieces” phrase, and I’ll adopt it here. Open standards and open source work so well, and so well together, because the pieces are small. Each standard solves a very specific problem. This allows each open source implementation of those standards to be limited in scope, lowering the barriers to entry for writing and maintaining them. The Internet today exists because of small pieces, particularly HTTP, HTML, CSS, XML, etc.

Together, these small pieces form the web platform that has fostered the startling array of innovations over the last ten years. O’Reilly’s key phrase is “A Platform Beats an Application Every Time”. If there’s any lesson to take away from the Internet, this is it. A platform beats an application because it fosters an entire ecosystem of applications that can talk to each other using these small pieces. The ability to talk to each other makes each application far more powerful than if it were an isolated island. Just like an ecosystem, platforms create new niches and continually evolve as new actors emerge, and they create needs for new protocols.

This is why the current Internet lies in such a precarious state. The ecosystem has evolved, and has created needs for new protocols that do everything from traverse NATs to publish data. As the system becomes more complex, however, we’re forgetting that central tenet that small pieces made the whole thing work in the first place. In most cases, standards for solving the problems exist, but private actors either don’t realize it or decided to use their own versions regardless. This is like companies in 1994 deciding to ignore HTTP and implement their own versions.

Take NATs for example. The IETF’s SIP, TURN, STUN, and ICE provide an excellent, interoperable framework for traversing NATs. Nevertheless, Skype, BitTorrent, and Gnutella all implement their own proprietary versions of the same thing, and they don’t work as well as the IETF versions. As a result, none of them can interoperate, and the resources of all NATted computers remain segmented off from the rest of the Internet as a wasted resource. Skype can only talk to Skype, BitTorrent can only talk to BitTorrent, and Gnutella can only talk to Gnutella in spite of standards that could make all three interoperate. In Skype and BitTorrent’s case, they even ignore HTTP. They decided to completely forgo interoperability with the rest of the Internet for file transfers.
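
To give a feel for how small one of these pieces is, here is a minimal sketch of the first message a client sends to discover its public address through a NAT: a STUN Binding Request. It assumes the header layout from RFC 5389, which was published after this post (the STUN of 2006 was RFC 3489, with a slightly different header). The function name is mine, and the sketch only builds the bytes; sending them over UDP to a STUN server is left out.

```python
import os
import struct
from typing import Optional

STUN_BINDING_REQUEST = 0x0001    # message type for a Binding Request
STUN_MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Build the 20-byte header of a STUN Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)  # random 96-bit transaction ID
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    # Network byte order: type (2 bytes), attribute length (2 bytes, zero here),
    # then the 4-byte magic cookie.
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id


request = build_binding_request(b"\x00" * 12)
assert len(request) == 20
```

Any client that speaks this format can ask any compliant server the same question, which is the whole point of standardizing it.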

GData, in contrast, gets high marks for interoperability. It uses the Atom Publishing Protocol (APP), RSS, and HTTP. RSS and HTTP are, of course, widely deployed already. APP is a good standard that leverages HTTP and solves very specific publishing problems on top of that. APP lets you modify any data you submit, one of Tim Bray’s first criteria for “Open” data. Google Base, built on top of GData, also shares AdSense revenue with users, fulfilling Tim Bray’s second criterion of sharing value-added information from submitted data.

The only part of GData I have a problem with is OpenSearch. OpenSearch is sort of half of an Internet standard because it emerged from a single company, Amazon, in the face of better standards out of the W3C: RDF and SPARQL.

SPARQL and RDF together create an abstraction layer for any type of data and allow that data to be queried. They create the data portion of the web platform. As Tim says, “The only defense against [proprietary data] is a vigorous pursuit of open standards in data interchange.” Precisely. RDF and SPARQL are two of the primary protocols we need in this vigorous pursuit on the data front. The Atom Publishing Protocol is another. There are many fronts in this war, however. We also need to push SIP, STUN, TURN, and ICE in terms of making the “dark web” interoperable, just as we need to re-emphasize the importance of HTTP for simple file transfers. These are the protocols that need to form, as Tim says, “a second wave of consolidation, which weaves it all together into a new platform”. If we do things right, this interoperable platform can create a world where free calling on the Internet works as seamlessly as web browsers and web servers, where every browser and every server automatically distribute load using multisource “torrent” downloads, and where all data is shared.

Standards are the key to this open infrastructure.


Daswani on LimeWire Complaint

August 29, 2006

Susheel Daswani has posted another excellent piece in his series about the recent RIAA action against LimeWire.  Susheel worked with me at LimeWire before moving on to study law.

The case will be fascinating to watch.  At LimeWire, we really did all wake up one day to find ourselves in the entertainment business, much to our displeasure.  As I have discussed, the Gnutella developer community was always much more interested in the technology.  Many of the most active members, like Serguei Osokine, Philippe Verdy, or Raphael Manfredi, had absolutely no commercial interest in Gnutella other than as a theoretical and technical exercise.  Susheel and I worked extensively with Serguei to write the current standards for distributed search on Gnutella.  That just always seemed so much more important than the copyright dust up, and it’s sad that it’s come to this.


BitTorrent: Old Technology in a New Box

August 21, 2006

The myth of BitTorrent goes something like this: Bram Cohen, hacker extraordinaire, realized circa 2001 that it would be more efficient to break files up into pieces on different servers and to download those pieces separately. This would distribute the load across multiple servers, providing a more robust architecture for accessing the file. The trouble is, the practice was common well before BitTorrent came on the scene. Cohen simply wrote another implementation of a technology that had already become commonplace in the P2P community. The first implementation I know of was Justin Chapweske’s work on SwarmCast in 2000. As I remember it, Justin’s creativity pointed the way for us all.
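
For readers who haven’t seen the idea spelled out, here is a toy sketch of swarming. It is not LimeWire’s, SwarmCast’s, or BitTorrent’s actual code: the function names are made up, the “peers” are just in-memory copies of the same bytes, and the round-robin assignment stands in for the real work of picking sources and making network requests.

```python
from typing import Dict, List, Tuple


def split_into_ranges(file_size: int, piece_size: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) byte ranges that cover the whole file."""
    return [(start, min(start + piece_size, file_size))
            for start in range(0, file_size, piece_size)]


def swarm_download(sources: List[bytes], piece_size: int) -> bytes:
    """Take each piece from a different source, round-robin, then reassemble."""
    file_size = len(sources[0])
    pieces: Dict[int, bytes] = {}
    for index, (start, end) in enumerate(split_into_ranges(file_size, piece_size)):
        source = sources[index % len(sources)]  # stand-in for a network request
        pieces[start] = source[start:end]
    # Pieces may arrive in any order in a real client; sort by offset to rebuild.
    return b"".join(pieces[start] for start in sorted(pieces))


original = b"swarm downloading predates BitTorrent"
copies = [original, original, original]  # three peers holding the same file
assert swarm_download(copies, piece_size=8) == original
```

No single source serves more than a fraction of the file, which is all “distributing the load” means here.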

Heck, we even released swarm downloading in LimeWire long before BitTorrent ever made a public release, as I first announced here. I wrote almost none of the downloading code, but my old LimeWire buddies Chris Rohrs and Sumeet Thadani have more of a claim to having “invented” swarm downloading than Bram Cohen. LimeWire’s also an open source project, and we were working on the swarming implementation as early as January of 2001, as you can see from the CVS logs. Cohen didn’t even start working on it at all until May of 2001. What’s more, it never occurred to us at LimeWire to think of it as a new idea because, well, it wasn’t.

Why do I care? It’s just that it keeps coming up, most recently in the O’Reilly e-mail from a couple of days ago seeking ETech 2007 participants, where they describe “BitTorrent’s use of sufficiently advanced resource locators and fragmented files” as the type of new innovation they’re looking for. I was a history major in college (in addition to computer science), so these things matter to me. Cohen himself perpetuates the myth, most blatantly on the BitTorrent web site where it says: “While it wasn’t clear it could be done, Bram wanted to enable effective swarming distribution – transferring massive files from server to client with the efficiency of peer-to-peer – reliably, quickly and efficiently.” The fact is, it was clear it could be done because people like Justin and us over at LimeWire had already done it!

The Wired article on Cohen from January 2005 takes the cake, though. The article says “Cohen realized that chopping up a file and handing out the pieces to several uploaders would really speed things up.” Again, he “realized” it because he saw that others were already doing it. They go on to describe how traditional file sharing networks are “slow because they suffer from supply bottlenecks. Even if many users on the network have the same file, swapping is restricted to one uploader and downloader at a time.” It’s all just blatantly wrong.

Now, don’t get me wrong. I love BitTorrent. I think BitTorrent is amazing and a perfect example of the kind of enabling technology that makes us all more free. It offers the clearest hope for a future of media distribution beyond the inadequate cable and network broadcast model we see today. It’s just that BitTorrent’s innovation was far less sexy. BitTorrent worked because it did less, not because it had any amazing new ideas. BitTorrent took what many p2p applications were already doing and scrapped most of it. BitTorrent scrapped search. It didn’t bother with a fully connected network. It didn’t worry about file management. It just took the downloading component and packaged it up nicely. Cohen realized that the downloading technology alone was all people wanted or needed in many cases, and that the tricky distributed search part was often unnecessary. Hats off: it has really changed the way many of us think about technology.

That said, BitTorrent was old technology in a new package. The innovation was in the packaging.


LimeWire and the Napster Curse

August 11, 2006

Shawn Fanning’s 1999 release of Napster forever associated peer-to-peer technology with music piracy, and we all bear the burden of that curse today.  Why do I call it a curse?  Because p2p can and will be used in far more powerful ways than for distributing copyrighted works, and its inclusion in the incredibly boring copyright squabbles is downright disrespectful to the technology’s potential.

I will continue to hammer home the point that peer-to-peer is about efficiently pooling the resources of all internetworked computers around the world.  Folding@home is a fantastic example of this potential.  So is Skype, and so is Vonage.  Vonage, you say?  That’s a new one.  Why would I include Vonage in a list of p2p applications?  Because Vonage is a service for connecting your computer directly with the person you’re calling, just as Napster connected your computer directly to the person you were downloading from.  The RIAA does not want you to think about Vonage when you think about p2p.  They want you to think about Napster, to “tar [all p2p applications] with the ‘Napster’ brush”, as Joshua Wattles put it in the MGM v. Grokster amicus brief in the district court.  As critical thinkers, we need to break out of this cage.

Now, Robert Cringely might disagree with my use of the term “peer-to-peer” in this context.  The wide variety of applications in the wild smudges the lines delineating peer-to-peer, grid computing, and distributed computing, however.  My purpose is not to say I am correct versus Cringely in my definition of peer-to-peer.  Peer-to-peer on some level means whatever I or Cringely or anyone else says it means.  I am talking about p2p as any system that enables a direct network connection between peers that were exclusively clients in the traditional client/server architecture.  I think the definition has to be this general given the breathtaking diversity of applications in the field.

Nevertheless, Napster shined a spotlight on what to me is one of the most boring aspects of p2p — using it to distribute the overwhelmingly putrid crap coming out of Hollywood and the music studios.  If the first p2p application to explode in the mainstream had been a distributed computing project that found a cure for breast cancer or if SETI@home had actually discovered extra-terrestrial life, the rhetoric surrounding the technology would be far different today.  Instead we started in the gutter with Napster.

The RIAA’s suit against LimeWire marks the latest chapter in this saga.  Their case is built on “inducement”, the idea that LimeWire as a company has historically encouraged users to use the software for copyright infringement.  Now, the MGM v. Grokster decision is more nuanced than that, but I’ll keep it simple for now.  In the RIAA’s version of reality, LimeWire sought to inherit the user base of Napster and to capitalize on the ability of the software to be used for infringement.

This could not be further from the truth.  When my colleagues and I began work on LimeWire in the summer of 2000, we set out to break Napster’s curse.  We saw far more potential in the technology.  Far from building a new tool for music piracy, we instead generalized the Gnutella protocol for commerce.  We imagined users searching for everything from apartment listings to new recipes on Gnutella, much the way many now use SPARQL, REST, or Amazon’s OpenSearch for searching a variety of web services simultaneously.  We added XML schemas to Gnutella requests and responses to enable this shift, and we created the “Lime Peer Server” that businesses could use to serve XML search results over Gnutella.  One real estate company even used the system for a short time.  Despite our best efforts, it never took off.
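
To make the idea concrete, here is a rough sketch of what an XML-annotated search might look like. The element and attribute names are invented for illustration and are not the schemas we actually shipped; the two helper functions are likewise hypothetical. The point is only that a structured query can be matched field by field against structured results of any type, not just file names.

```python
import xml.etree.ElementTree as ET


def build_query(kind: str, **fields: str) -> str:
    """Serialize a search as a small XML document, one attribute per field."""
    root = ET.Element(kind + "s")             # e.g. <apartments>
    ET.SubElement(root, kind, attrib=fields)  # e.g. <apartment city="..."/>
    return ET.tostring(root, encoding="unicode")


def matches(query_xml: str, listing: dict) -> bool:
    """True if the listing has every field the query asks for, ignoring case."""
    wanted = ET.fromstring(query_xml)[0].attrib
    return all(listing.get(key, "").lower() == value.lower()
               for key, value in wanted.items())


query = build_query("apartment", city="Boston", bedrooms="2")
listing = {"city": "boston", "bedrooms": "2", "rent": "900"}
assert matches(query, listing)
```

A peer server would run something like `matches` against its own inventory and answer only the queries it understood.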

Why didn’t the peer server work?  What about the other types of searches our XML schemas enabled?  Well, LimeWire was always a generalized tool, agnostic to file types.  Much as you can attach any type of file to an e-mail, users could share any type of file using Gnutella.  P2p technology is particularly effective at sharing large files because you can remove the burden from a single server and instead download files using many different peers.  It just so happens that from the year 2000 to 2006 the vast majority of large files on the Internet were copyrighted media files.  As a result, that’s what users primarily shared on LimeWire.

Now, compare this for a moment to the RIAA’s inducement claims.  Our entire business model for LimeWire was based on selling servers and on getting paid to route traffic to business clients.  Making money off of LimeWire the program was never on our radar.  How, then, can the RIAA claim that we induced users to use the software for infringing purposes?  The fact was that LimeWire users used it to infringe despite the fact that our entire business model had nothing to do with infringement.

Stepping back a bit further, there has always been a disconnect between the p2p developer community and users of file sharing software.  The p2p developer community has always been interested in things like how to search millions of computers in real time or how to ensure trust between peers.  “Oops, I Did it Again” just never seemed quite as interesting.

The p2p community is also far larger than the developers at file sharing companies.  Putting aside the engineers at Vonage for a moment, the p2p community includes the folks at Microsoft Research working on Pastry, the fantastic work coming out of the Stanford Peers Group, as well as researchers at MIT, Rice, Berkeley, and almost every major research university in the world with a computer science department.  Why is this community so vast?  Because of the potential for p2p to harness the world’s collective computing capacity more efficiently to solve the world’s most pressing problems.  Using this technology to make “Pirates of the Caribbean: Dead Man’s Chest” more universally available is not high on the priority list for the researchers at Microsoft just as it wasn’t at the forefront of my mind when I worked on LimeWire.  I’d greatly prefer it if users were more interested in contributing their computing cycles to the understanding of protein folding or to distributing their own creative works than they were in downloading Jenna Jameson’s latest.

The fact is, we are all still living with Napster’s legacy, with the RIAA attempting to draw a straight line from Napster through Aimster, Grokster, Kazaa, and now LimeWire.  The world is more complicated than that, and we are too sophisticated to believe that storyline despite its attractive simplicity.


Hello Blog People

July 20, 2006

So I’ve finally broken down and started a blog. I’m doing this out of desperation. I’ve simply accumulated so many Writely rants and quasi-journal entries addressed to the public that it’s a little embarrassing not to lay it all out there. I also come across daily developments in the tech world that I feel so strongly about that I’m constantly stopping my work to jot down my opinion, only for no one but me to ever see it! In many cases this is the desire to correct others in the interest of furthering general understanding, as I experienced most recently with one of Cringely’s entries where he’s a bit off in his understanding of p2p, but that will have to wait for a future entry.

To give you a little background, I received my undergraduate degree from Brown University in U.S. History and Computer Science. My first job straight out of college was working as a programmer at LimeWire. In fact, I was hired to work for a company called “LimeObjects”. By the time I actually started 2 weeks later, though, Mark Gorton and others had decided to fold LimeObjects and to turn it into LimeWire. I will talk in more detail about my 4 years at LimeWire in future entries. For now, though, I’ll just say that people like Chris Rohrs, Greg Bildson, Susheel Daswani, Robert Soule, Sumeet Thadani, Sam Berlin, Anurag Singla, and I had a heck of a lot of fun from the early days of Gnutella through turning Gnutella into the mature family of protocols it is today. We also worked with a tremendous and talented group of people from around the world who participated in the Gnutella community through the Gnutella Developer’s Forum (GDF). Bringing LimeWire from nothing to a profitable company, and open-sourcing it along the way, was an invaluable and a fascinating experience I will always cherish.

I decided to leave LimeWire in February of 2004. There were many factors leading to my decision, as I will elucidate in upcoming entries. They all relate to technology and to my interest in interoperability between peer-to-peer and the larger Internet. My ideas were too radical a departure from LimeWire to make them a part of the same project, however, and it made the most sense to strike off on my own and to start a separate company. That company is now Last Bamboo LLC, founded with 4 friends/partners, and now very close to the public release of our first application, “LittleShoot”. I can’t say more about LittleShoot yet, but stay tuned. It’s cool.

For now, this blog will comment on related happenings on the Internet. I’ll also throw out little tidbits from the history of LimeWire and Gnutella here and there as appropriate. I hope everyone enjoys it, and please don’t hesitate to tell me when you think I’m nuts. As Yochai Benkler and many others have pointed out, the Internet’s enabling of collaborative filtering of ideas is essential to maximizing human creative resources. Please engage.