Mark Cuban: An Investor, Not a Lawyer

September 29, 2006

Mark Cuban came through New York yesterday and told a group of advertisers the YouTube founders “are just breaking the law.”  Really, Mark?  In fact, that’s far from clear, and Mark knows it.  YouTube has substantial protection under the DMCA section 512 safe harbors, as Fred von Lohmann and others have made clear.  They appear to be on particularly firm ground under section 512(a), the provision designed to protect companies like Cisco from liability when they route infringing bits.  Just like Cisco, the argument goes, YouTube is at the whim of an automated system where users choose which bits to send.  While the DMCA safe harbors have barely been tested in this context, Cuban knows very well that YouTube could easily win this one.

So, why all the fuss?  Cuban has an interest in the outcome of the online video wars.  He’s a significant investor in Red Swoosh, the p2p content delivery network.  I have friends at Red Swoosh, and I like what they do.  Their technology makes YouTube look like the kids’ stuff it is.  If Red Swoosh won the video wars instead of YouTube, the Internet would be a better, more efficient place with higher-resolution video.  The trouble is, Cuban would also stand to make a heck of a lot of money, and I’m struggling to find another explanation for his crusade.  He knows perfectly well he’s overplaying his hand with his predictions of YouTube’s demise.  He knows perfectly well the legal questions hang in the balance, even tilting in YouTube’s favor in my own reading of the DMCA safe harbors.  So why’s he doing it?

I’d love to hear any other explanations out there, as I like Cuban’s general style and thinking.  I’d love someone to tell me I’m wrong.


MySpace Zapr Link Tool, Bandwidth Hell, and NAT Traversal

September 12, 2006

I just read Mark Cuban’s blog for the first time in a while, and I like his fast and loose style, so don’t be surprised if my posts get a little less formal.

Moments after catching up with Mark, Mick from Zapr blogged about the new MySpace Zapr link tool. I quickly gave it a spin. At first, it blew me away. The link you’d ultimately use to download some of Henning Schulzrinne’s fascinating lecture slides from my machine looked like this:

http://72.3.247.245:81/GETF?(null)&(null)&adamfisk&HORTON-LAPTO&2f615f21d986d501

I looked at that and scratched my head. I even shot off a quick e-mail to my good buddy Jerry Charumilind to figure out what I was missing. I assumed 72.3.247.245 was the public IP address of the Starbucks I’m sitting in here in Sheridan Square, New York City, and that they had somehow figured out how to publicly address my machine, using my locally running Zapr instance to open up some ports. UPnP? No, that just wouldn’t work in most cases. Too brittle. Were they doing an ICE-style direct transfer, the way I would? Not possible without Zapr installed on both ends.

Then I turned to my trusty nslookup and discovered “72.3.247.245” is, in fact, one of their servers. I suspect they use the raw IP address to make it look like they’re doing something fancy, but, in fact, they’re just relaying traffic. Suddenly the world made sense again!
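
If you want to repeat the check yourself, a couple of lines of Python do what nslookup did for me. The IP is the one from the link above; whatever its PTR record says today may of course differ from what I saw:

    import socket

    # Reverse-lookup the address embedded in the Zapr link. In my case the
    # answer pointed at one of Zapr's hosted servers, not at my laptop.
    ip = "72.3.247.245"
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        print(f"{ip} resolves back to {hostname}")
    except socket.herror:
        print(f"no PTR record for {ip}")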

Don’t get me wrong, it’s still a nifty service and a nice implementation. It’s getting towards the seamless integration of the “dark web” I’d like to see — the integration of edge resources, in this case content. If they were using open protocols, I’d give them a thumbs up; unfortunately, we can add them to the long list of services ignoring interoperability. If they are using standards, they’re certainly not advertising it. Aside from that, the main problem I see is their bandwidth costs if the service gets popular. Yikes! They’re clearly not caching files on their servers, since your file becomes unavailable the moment you go offline. That means if something gets popular, not only will the user’s machine get hammered, but so will Zapr’s. The user’s machine would just never hold up (think of the most popular YouTube videos hosted on your machine at home), and the Zapr servers would have a heck of a time too.

How do you get around this? Do it just like Skype, just like Gizmo, and just like LittleShoot: require users on both ends to have the software installed, and pierce the NATs and firewalls to connect the computers directly. That solves the problem of killing the central server. What about the user’s poor machine? Keep track of all the locations for the file and load-balance requests across the Internet. How you do that is another question I’ll leave for another day (hint: not with BitTorrent).
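
To make the NAT-piercing part concrete, here’s a rough sketch of UDP hole punching in Python, assuming some rendezvous step (not shown) has already told each peer the other’s public IP and port. The addresses and port below are placeholders, not anything from a real deployment:

    import socket

    LOCAL_PORT = 5000
    PEER_ADDR = ("203.0.113.7", 5000)   # peer's public address, learned via the rendezvous step

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", LOCAL_PORT))
    sock.settimeout(2.0)

    # Both peers run this at roughly the same time. Each outbound packet
    # opens a mapping in the sender's own NAT; once both mappings exist,
    # packets start getting through and the two machines talk directly.
    for attempt in range(10):
        sock.sendto(b"punch", PEER_ADDR)
        try:
            data, addr = sock.recvfrom(1024)
            print(f"direct path established with {addr}: {data!r}")
            break
        except socket.timeout:
            continue
    else:
        print("hole punching failed; a real client would fall back to a relay")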


O’Reilly, GData, Open Standards

September 4, 2006

Tim O’Reilly’s post about GData and the importance of open standards articulates the argument for expanding the open infrastructure, for standardizing the “small pieces” that together do the heavy lifting of the Internet and make everything work together.

I like David Weinberger’s “small pieces” phrase, and I’ll adopt it here. Open standards and open source work so well, and so well together, because the pieces are small. Each standard solves a very specific problem. This allows each open source implementation of those standards to be limited in scope, lowering the barriers to entry for writing and maintaining them. The Internet today exists because of small pieces, particularly HTTP, HTML, CSS, XML, etc.

Together, these small pieces form the web platform that has fostered the startling array of innovations over the last ten years. O’Reilly’s key phrase is “A Platform Beats an Application Every Time”. If there’s any lesson to take away from the Internet, this is it. A platform beats an application because it fosters an entire ecosystem of applications that can talk to each other using these small pieces. The ability to talk to each other makes each application far more powerful than it would be as an isolated island. Just like an ecosystem, a platform creates new niches and continually evolves as new actors emerge, creating needs for new protocols along the way.

This is why the current Internet lies in such a precarious state. The ecosystem has evolved and created needs for new protocols that do everything from traversing NATs to publishing data. As the system becomes more complex, however, we’re forgetting the central tenet that small pieces made the whole thing work in the first place. In most cases, standards for solving these problems exist, but private actors either don’t realize it or decide to use their own versions regardless. This is like companies in 1994 deciding to ignore HTTP and implement their own versions of it.

Take NATs, for example. The IETF’s SIP, TURN, STUN, and ICE provide an excellent, interoperable framework for traversing NATs. Nevertheless, Skype, BitTorrent, and Gnutella all implement their own proprietary versions of the same thing, and they don’t work as well as the IETF versions. As a result, none of them can interoperate, and the resources of all NATted computers remain segmented off from the rest of the Internet, wasted. Skype can only talk to Skype, BitTorrent can only talk to BitTorrent, and Gnutella can only talk to Gnutella, in spite of standards that could make all three interoperate. Skype and BitTorrent even ignore HTTP, completely forgoing interoperability with the rest of the Internet for file transfers.
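
As a taste of how small these pieces really are, here’s a minimal STUN Binding Request in Python: it asks a STUN server what public address and port your packets appear to come from, which is the first step in any of these traversal schemes. The server named below is just a commonly used public endpoint, an assumption on my part; any STUN server will do:

    import os
    import socket
    import struct

    STUN_SERVER = ("stun.l.google.com", 19302)   # placeholder; any STUN server works
    MAGIC_COOKIE = 0x2112A442

    # 20-byte header: type 0x0001 (Binding Request), zero-length body,
    # the fixed magic cookie, and a random 96-bit transaction ID.
    txn_id = os.urandom(12)
    request = struct.pack("!HHI12s", 0x0001, 0, MAGIC_COOKIE, txn_id)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(3.0)
    sock.sendto(request, STUN_SERVER)
    data, _ = sock.recvfrom(2048)

    msg_type, msg_len, _cookie = struct.unpack("!HHI", data[:8])
    assert msg_type == 0x0101, "expected a Binding success response"

    # Walk the attributes looking for XOR-MAPPED-ADDRESS (0x0020).
    pos = 20
    while pos < 20 + msg_len:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        if attr_type == 0x0020:
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            raw_ip = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            ip = socket.inet_ntoa(struct.pack("!I", raw_ip))
            print(f"public mapping seen by the server: {ip}:{port}")
            break
        pos += 4 + attr_len + ((4 - attr_len % 4) % 4)   # attributes are 32-bit aligned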

GData, in contrast, gets high marks for interoperability. It uses the Atom Publishing Protocol (APP), RSS, and HTTP. RSS and HTTP are, of course, widely deployed already. APP is a good standard that leverages HTTP and solves very specific publishing problems on top of it. APP lets you modify any data you submit, the first of Tim Bray’s criteria for “Open” data. Google Base, built on top of GData, also shares AdSense revenue with users, fulfilling Tim Bray’s second criterion: sharing the value added to submitted data.
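
Part of APP’s appeal is that it really is just HTTP carrying Atom, so a sketch fits in a few lines: POST an entry to a collection, and the server hands back the URI you later use to edit or delete it. The collection URL and the auth header below are placeholders, not real GData endpoints:

    import urllib.request

    ENTRY = """<?xml version="1.0"?>
    <entry xmlns="http://www.w3.org/2005/Atom">
      <title>Hello from APP</title>
      <content type="text">Plain HTTP plus Atom is all it takes.</content>
    </entry>
    """

    req = urllib.request.Request(
        "https://example.com/feeds/posts/default",        # hypothetical APP collection
        data=ENTRY.encode("utf-8"),
        headers={
            "Content-Type": "application/atom+xml",
            "Authorization": "Bearer PLACEHOLDER_TOKEN",   # whatever auth the service requires
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # APP answers 201 Created and points at the new entry, which is what
        # lets you go back later and modify the data you submitted.
        print(resp.status, resp.headers.get("Location"))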

The only part of GData I have a problem with is OpenSearch. OpenSearch is only sort of half an Internet standard: it emerged from a single company, Amazon, in the face of better standards from the W3C, namely RDF and SPARQL.

SPARQL and RDF together create an abstraction layer for any type of data and allow that data to be queried. They create the data portion of the web platform. As Tim says, “The only defense against [proprietary data] is a vigorous pursuit of open standards in data interchange.” Precisely. RDF and SPARQL are two of the primary protocols we need in this vigorous pursuit on the data front. The Atom Publishing Protocol is another. There are many fronts in this war, however. We also need to push SIP, STUN, TURN, and ICE to make the “dark web” interoperable, just as we need to re-emphasize the importance of HTTP for simple file transfers. These are the protocols that need to form, as Tim says, “a second wave of consolidation, which weaves it all together into a new platform”. If we do things right, this interoperable platform can create a world where free calling on the Internet works as seamlessly as web browsers and web servers, where every browser and every server automatically distributes load using multisource “torrent” downloads, and where all data is shared.
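
To give a flavor of the data front, here’s the standard SPARQL protocol at work: a query shipped over plain HTTP, with results coming back as standard SPARQL JSON. The same query runs against any endpoint that speaks the protocol; the endpoint URL below is a placeholder:

    import json
    import urllib.parse
    import urllib.request

    ENDPOINT = "https://example.org/sparql"   # hypothetical SPARQL endpoint
    QUERY = """
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?homepage WHERE {
      ?person foaf:name ?name .
      ?person foaf:homepage ?homepage .
    } LIMIT 10
    """

    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": QUERY})
    req = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(req) as resp:
        results = json.load(resp)

    # Each binding maps the SELECT variables to concrete values.
    for binding in results["results"]["bindings"]:
        print(binding["name"]["value"], binding["homepage"]["value"])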

Standards are the key to this open infrastructure.