TechCrunch 50, LittleShoot, and the Aftermath

August 15, 2008

Things have been on hold here for the public release of LittleShoot as we have awaited word on our TechCrunch 50 application.  We didn’t make it! Darn.  It came down to the wire — we were in the last batch of companies to receive notification we weren’t in the conference on Friday at 1:24 AM EST. The 50 companies that made it in should know now.  Here’s the e-mail:

 

Dear TechCrunch50 Candidate:

We are sorry to inform you that your company was not selected as a finalist for the TechCrunch50 conference. As you know, we are only able to select a very, very small percentage of the more than 1,000 outstanding applications we receive.

Your company was among a select set of candidates that we considered, and it was a difficult decision driven purely by the limited number of presentation slots. Since we regarded your business so highly, we want to make sure you still get the opportunity to participate in the conference in our DemoPit.
(http://techcrunch50demopit.eventbrite.com).

As a DemoPit company, you will have the opportunity to be nominated for the People’s Choice award and win the 50th spot on the TechCrunch50 main stage. As the 50th company to present, the People’s Choice award winner will be able to compete for the $50,000 TechCrunch50 award. Act fast, as spaces are very limited and first come, first served.

Additionally, all DemoPit companies will benefit from the exposure generated by media attending the event. We do anticipate having approximately 300 members of the international press in attendance.

If you have questions regarding the TechCrunch50 Demo Pit opportunity, please email Dan Kimerling at dan@techcrunch.com.

Sincerely,

–Jason, Heather & Michael
and the TechCrunch50 Team

 

I’ve got a lot of respect for the TechCrunch folks and the way they give unfunded companies a shot, and I thoroughly enjoyed meeting Jason Calacanis down at the Mahalo Tech Meetup last night in one of my first nights in LA. I appreciate the tremendous work Jason, Michael, Heather, and the other folks at TechCrunch have put it to the process.  

That said, it’s on.  I feel like the guy on draft day who didn’t go on the first round.  It’s the “meritocracy” thing that gets me – TechCrunch 50 is touted as a pure meritocracy.  I’d put LittleShoot’s technology up there with anyone, and it just kills me to think 50 startups beat us out.  We can tell ourselves they had better business models, better marketing plans, yada yada yada, but I’m taking it to mean they had better technology.  If there’s anything that motivates me, that’s it. I have great respect for the other applicants, and we all supported each other on the TechCrunch blog as we agonized through the waiting process.  I wish everyone the best of luck, but the LittleShoot public beta is on its way.

Here’s a link to the LittleShoot demo video we submitted for TechCrunch for people unfamiliar with the Little Fella’:

LittleShoot Demo


P2P in Flash 10 Beta — the Questions Facing a YouTube, Skype, and BitTorrent Killer

May 21, 2008

As I’ve reported, the inclusion of P2P in Flash 10 Beta represents a fundamental disruption of the Internet platform. As with all disruptions, however, this one will progress in fits and starts. Flash 10′s details limit the full power of its P2P features. While features like VoIP will be fully enabled, it will take some ingenuity to turn Flash 10 into a more generalized P2P platform. Here are the issues:

1) Flash Media Server (FMS)

You’ll need Flash Media Server (FMS) to take advantage of Flash P2P. At $995 for the “Streaming Server” and $4,500 for the “Interactive Server”, FMS is beyond the reach of most developers working on their own projects, severely limiting Flash P2P’s disruptive potential. In an ideal world, the new P2P protocols would be openly specified, allowing open source developers to write their own implementations. As it stands now, a single company controls a potentially vital part of the Internet infrastructure, and encryption will likely thwart the initial reverse engineering efforts of open source groups like Red5.

2) No Flash Player in the Background

As David Barrett (formerly of Akamai/Red Swoosh) has emphasized on the Pho List, Flash Player only runs when it’s loaded in your browser. As soon as you navigate to another page, Flash can no longer act as a P2P server. P2P programs like Red Swoosh, BitTorrent, and LittleShoot don’t have this limitation, and it means Flash can’t save web sites as much bandwidth as those full-blown applications can. This limits but does not eliminate Flash’s threat to CDNs. Sure, you could get around this using AIR, but that creates another major barrier to adoption.

3) Usability

While Flash 10 has the ability to save files to your computer and to load them from your computer (essential for P2P), it pops up a dialog box each time that happens. While this is an important security measure, it cripples Flash 10′s ability to mimic BitTorrent because you’d have dialogs popping up all the time to make sure you as a user had authorized any uploads of any part of a file.

4) Limited APIs

While all the required technology is there in the Real Time Media Flow Protocol (RTMFP), ActionScript’s API limits some of the P2P potential of Flash 10. P2P downloading breaks up files into smaller chunks so you can get them from multiple other computers. Flash 10 can only save complete files to your computer — you can’t save in small chunks. As a result, you’d have to use ActionScript very creatively to achieve BitTorrent or LittleShoot-like distribution or to significantly lower bandwidth bills for sites serving videos. It might be possible, but you’d have to work some magic.

So, that’s the deal. There’s still a lot more documentation coming our way from Adobe, so there are undoubtedly useful nuggets yet to be discovered.

Even given all these limitations, however, the key point to remember is the Internet has a new, immensely powerful protocol in its arsenal: Matthew Kaufman and Michael Thornburgh’s Real Time Media Flow Protocol (RTMFP). While Flash might use it primarily for direct streaming between two computers now (think VoIP), it introduces the potential for so much more.

Keep your helmet on.


P2P in Flash 10 Beta – a YouTube, Skype, and BitTorrent Killer

May 16, 2008

The inclusion of p2p in the Flash 10 beta threatens to bring down everyone from YouTube to Skype. Using P2P, Flash sites will be able to serve higher quality video than YouTube at a fraction of the cost. Meanwhile, the combination of the Speex audio codec and the Real Time Media Flow Protocol (RTMFP) will enable sites to seamlessly integrate VoIP without requiring a Skype install. The impact of this change is hard to fathom. We’re talking about a fundamental shift in what is possible on the Internet, with Flash demolishing almost all barriers to integrating P2P on any site.

Hank Williams and Om Malik have discussed the potential for Flash 10 to be used for P2P CDNs, and they’re largely right on. The biggest problem I see with P2P CDNs is oddly latency, however. While P2P theoretically enables you to choose copies of content closer to you on the network, you still have to negotiate with a server somewhere to establish the connection (for traversing NATs), nullifying the P2P advantage unless you’re talking about really big files. As Hank identifies, the sites serving large files are the CDN’s best customers, so we are talking about a significant chunk of the CDN business up for grabs. That said, CDNs could easily start running Flash Media Servers themselves with integrated RTMFP. They’ve already addressed the server locality problem, and taking advantage of Flash deployments would simply be an optimization. Whether the CDNs will realize this shift has taken place before it’s too late is another question.

To me, the really vulnerable players are the video sites themselves and anyone in the client-side VoIP space. Writing a VoIP app is now equivalent to writing your own Flash video player. All the hard stuff is already done. Same with serving videos. You no longer have to worry about setting up an infinitely scalable server cluster — you just offload everything to Flash. No more heavy lifting and no more huge bandwidth bills. In the BitTorrent case, it’s mostly a matter of usability. As with Skype, you no longer need a separate install. Depending on what’s built in to the Flash Media Server, you also no longer need to worry about complicated changes on the server side, and downloads will happen right in the browser.

The stunning engineering behind all of this should be adequately noted. The Real Time Media Flow Protocol (RTMFP) underlies all of these changes. On closer inspection, RTMFP appears to be the latest iteration of Matthew Kaufman and Michael Thornburgh’s Secure Media Flow Protocol (SMP) from Adobe’s 2006 acquisition of Amicima. Adobe appears to have acquired Amicima specifically to integrate SMP into Flash, now in the improved form of RTMFP. This is a very fast media transfer protocol built on UDP with IPSec-like security and congestion control built in. The strength of the protocol was clear to me when Matthew first posted his “preannouncement” on the p2p hackers list. Very shrewd move on Adobe’s part.

Are there any downsides? Well, RTMFP, is for now a closed if breathtakingly cool protocol, and it’s tied to Flash Media Server. That means Adobe holds all the cards, and this isn’t quite the open media platform to end all platforms. If they open up the protocol and open source implementations start emerging, however, the game’s over.

Not that I have much sympathy, but this will also shift a huge amount of traffic to ISPs, as ISPs effectively take the place of CDNs without getting paid for it. While Flash could implement the emerging P4P standards to limit the bleeding at the ISPs and to further improve performance, this will otherwise eventually result in higher bandwidth bills for consumers over the long term. No matter — I’d rather have us all pay a little more in exchange for dramatically increasing the numbers of people who can set up high bandwidth sites on the Internet. The free speech implications are too good to pass up.

Just to clear up some earlier confusion, Flash Beta 10 is not based on SIP or P2P-SIP in any way. Adobe’s SIP work has so far only seen the light of day in Adobe Pacifica, but not in the Flash Player.


Ian Clarke’s Freenet 0.7 Released

May 9, 2008

After 3 years of development, the latest version of Freenet is here. This version protects users from persecution for even using Freenet, let alone for the content they’re distributing. Freenet is a vital tool against censorship, particularly in countries like China where freedom of speech is often severely curtailed. For the unfamiliar, here’s the quick description of Freenet from their site:

Freenet is free software which lets you publish and obtain information on the Internet without fear of censorship. To achieve this freedom, the network is entirely decentralized and publishers and consumers of information are anonymous. Without anonymity there can never be true freedom of speech, and without decentralization the network will be vulnerable to attack.

Congratulations to Ian Clarke, Matthew Toseland, and the other the Freenet developers. The quote on the Freenet site epitomizes the importance of the project:

“I worry about my child and the Internet all the time, even though she’s too young to have logged on yet. Here’s what I worry about. I worry that 10 or 15 years from now, she will come to me and say ‘Daddy, where were you when they took freedom of the press away from the Internet?’”
–Mike Godwin, Electronic Frontier Foundation

Freenet is a vital weapon in that war.

I’m also excited to have Ian as a new addition to the LittleShoot advisory board, one of many things we’ll be making more announcements about soon.  I’ve always had a great respect for Ian’s emphasis on p2p’s importance as a politically disruptive tool for free speech.  We all got caught up in the copyright wars and missed the big picture, but not Ian.


Atlassian JIRA — Automating the Standalone Install on MySQL

February 2, 2008

I’ve started thinking of my coding wanderings as akin to Alice’s rabbit holes — magical new places I play around in for probably a little too long. Automating sysadmin-type work with shell scripts has become my latest rabbit hole. Quickly running new services on Amazon’s EC2 is my inspiration.

So, this is the first little snippet, a simple initial building block that will become a part of larger scripts down the road. For those who don’t know, Atlassian has started giving away free licenses to all of their products to open source projects, so this gives you access to JIRA, Bamboo, Confluence, FishEye, Clover, Crowd, etc. These tools are amazingly useful and are all the best or amongst the best at what they do. Check out the Atlassian web site for more info.

This script automates the two trickiest parts of installing JIRA:

  1. Connecting to your database. In this case we connect to MySQL.
  2. Customizing the port to run JIRA on.

In the first instance, the script automatically downloads the MySQL JDBC driver, creates the JIRA database, and configures the JIRA user name and password for MySQL. The port customization is something you frequently want because so much runs on 8080 by default. These tasks are more annoying than tricky, but this script makes them a breeze.

Prerequisites:

  1. MySQL already running on the default port.
  2. You need to know your MySQL root password. The script will use it to create the JIRA database and to set permissions for the JIRA MySQL user.
  3. A downloaded version of JIRA standalone from the Atlassian web site. This will be a file called atlassian-jira-VERSION-standalone.tar.gz. The script just looks for a file starting with “atlassian-jira” and ending in “tar.gz” in the current directory.
  4. Java installed with JAVA_HOME set.

Future scripts will also include automated installing and configuring of MySQL as well as Java, but for now you need them configured ahead of time. I chose to run JIRA standalone because in my experience getting the separate wars to play nicely with my existing wars was tricky. In particular, some of the Atlassian war files take awhile to start up and don’t shut down as cleanly as they should. Using the standalone versions insures they won’t interfere with your other webapps.

When you have JIRA downloaded, MySQL running, and Java configured, go ahead and download the script from the LittleShoot web site.

Here’s all you need to run:

./jira.bash

The script will guide you through the process of configuring and running JIRA, and it should be really self-explanatory. When the script is done, you’ll still need to run through JIRA’s configuration procedure within the browser, but the script has taken care of the hard part.

If you need to install JIRA from another script, you can also run something like the following, modifying it for your values of course.

./jira.bash jirauser jirapwd yourMySql_root_password adamfisk

The last argument is the user name of the user on the system who should own the jira directory.

Below is the full script.

#!/usr/bin/env bash
#
# This script performs all the JIRA configuration and setup for running
# JIRA on MYSQL.  This includes creating the JIRA database and creating
# a user on the database.
#
# If no arguments are passed to the script, it prompts you for the
# data it needs.  Otherwise, you must pass all the required data on the
# command line.  This makes it easier to incorporate this script into
# other scripts if desired.
#
# If you decide to pass in arguments, they are (in order):
#
# 1) The name of the new jira user in the database.
# 2) The password of the new jira user in the database.
# 3) Your MYSQL root password to create the JIRA database.
# 4) The user account to install JIRA under.  This account should
#    already exist on the system.
#
# To run this script:
#
# YOU MUST HAVE DOWNLOADED JIRA STANDALONE INTO YOUR CURRENT DIRECTORY
#
# That file should be the downloaded copy of JIRA standalone.
#
# If you have any problems, please see the excellent guide at:
# http://confluence.atlassian.com/display/JIRA/Setting+up+JIRA+Standalone+and+MySQL+on+Linux
#

function die
{
echo $1
exit 1
}

ls ./atlassian-jira-*.tar.gz > /dev/null || die "The Atlassian JIRA tar.gz file must be in the current directory.  Have you successfully downloaded JIRA standalone?"

netstat -na | grep 3306 > /dev/null || die "MySQL does not appear to be running on port 3306.  JIRA cannot be installed without MySQL running"

function askUser
{
echo "Please enter your JIRA database user name:"
read JIRA_USER_NAME

echo "Please enter your JIRA database password:"
read JIRA_PWD

echo "Please enter your MySQL root password:"
read MYSQL_ROOT_PWD

echo "What's the name of the user account on this machine you'd like to install JIRA under?"
read USER_ACCOUNT
}

ARGS=4
if [ $# -ne "$ARGS" ]
then
    if [ $# -ne "0" ]
    then
        echo "Usage: jira.bash jira_mysql_user_name jira_mysql_password mysql_root_password user_account"
        echo "You can also just run ./jira.bash to have the script guide you through the setup process."
        die
    else
        askUser
    fi
else
    JIRA_USER_NAME=$1
    JIRA_PWD=$2
    MYSQL_ROOT_PWD=$3
    USER_ACCOUNT=$4
fi

echo "............................................................"
echo "  Hello $USER, let's start setting up JIRA standalone."
echo "............................................................"

function modifyPort
{
  echo "What port would you like to use for JIRA?  The default is 8080."
  read CUSTOM_PORT
  echo "What shutdown port would you like to use for JIRA?  The default is 8005."
  read CUSTOM_SHUTDOWN_PORT
  echo "OK, got it.  Proceeding with install."
}

echo "Would you like to change the port JIRA runs on from the default of 8080? [y/n]"
read CHANGE_PORT
case $CHANGE_PORT in
y)
  modifyPort || die "Could not modify port"
  ;;
Y)
  modifyPort || die "Could not modify port"
  ;;
*)
  echo "OK, using default port of 8080.  Proceeding with install."
  CUSTOM_PORT=8080
  CUSTOM_SHUTDOWN_PORT=8005
  ;;
esac

function installJira
{
echo "Expanding `ls ./atlassian-jira-*.tar.gz`..."
tar xzf `ls ./atlassian-jira-*.tar.gz` || die "Could not open jira tgz file.  Aborting."

# Add a symbolic link to whichever version of JIRA we're running.
ln -s `ls | grep atlassian-jira-` jira

echo "Downloading MYSQL JDBC connector..."

# Somewhat bad to hard code this, but I don't think JIRA users alone will have much of an impact on this server.
curl -o mysqlj.tgz http://mirrors.24-7-solutions.net/pub/mysql/Downloads/Connector-J/mysql-connector-java-5.1.5.tar.gz
tar xzf mysqlj.tgz
mv mysql-connector-java-5.1.5/mysql-connector-java-5.1.5-bin.jar jira/common/lib || die "Could not move myql jdbc jar"

echo "Customizing server.xml..."
cp jira/conf/server.xml jira/server.xml.copy
perl -pi -e s/Server\ port=\"8005\"/Server\ port=\"$CUSTOM_SHUTDOWN_PORT\"/g jira/conf/server.xml || die "Could not set shutdown port"
perl -pi -e s/Connector\ port=\"8080\"/Connector\ port=\"$CUSTOM_PORT\"/g jira/conf/server.xml || die "Could not set JIRA port"
perl -pi -e s/username=\"sa\"/username=\"$JIRA_USER_NAME\"/g jira/conf/server.xml || die "Could not modify jira user name"
perl -pi -e s/password=\"\"/password=\"$JIRA_PWD\"/g jira/conf/server.xml || die "Could not modify jira password"
perl -pi -e s/driverClassName=\"org.hsqldb.jdbcDriver/driverClassName=\"com.mysql.jdbc.Driver/g jira/conf/server.xml
perl -pi -e s/jdbc:hsqldb:\\$\{catalina.home\}\\/database\\/jiradb\"/jdbc:mysql:\\/\\/localhost\\/jiradb?autoReconnect\=true\&\;useUnicode\=true\&\;characterEncoding\=UTF8\"\\/\>/g jira/conf/server.xml || die "Could not set jdbc"
perl -pi -e s/minEvictableIdleTimeMillis\=/\/\"20\"\ \\/\>--\>/g jira/conf/server.xml || die "Could not finish comment"

echo "Customizing entityengine.xml..."
cp jira/atlassian-jira/WEB-INF/classes/entityengine.xml jira/entityengine.xml.copy || die "Could not make entityengine backup"
cp jira/atlassian-jira/WEB-INF/classes/entityengine.xml . || die "Could not copy entityengine to current directory"

perl -pi -e s/name=\"defaultDS\"\ field-type-name=\"hsql\"/name=\"defaultDS\"\ field-type-name=\"mysql\"/g entityengine.xml || die "Could not set entityengine database to MYSQL"
perl -pi -e s/schema-name=\"PUBLIC\"//g entityengine.xml || die "Could not remove public schema from entiry engine"

mv entityengine.xml jira/atlassian-jira/WEB-INF/classes/ || die "Could not move entity engine"

chown -R $USER_ACCOUNT jira || die "Could not set permissions to specified user: $USER_ACCOUNT"

cat < jira.sql
create database if not exists jiradb character set utf8;
GRANT ALL PRIVILEGES ON jiradb.* TO '$JIRA_USER_NAME'@'localhost'
IDENTIFIED BY '$JIRA_PWD' WITH GRANT OPTION;
flush privileges;
EOL
mysql -uroot -p$MYSQL_ROOT_PWD < jira.sql || die "Could not set up database for JIRA.  Is your root password correct?"
echo "Starting JIRA on port $CUSTOM_PORT..."
./jira/bin/startup.sh || die "Could not start JIRA"

echo ""
echo "-----------------------------------------------------------------------------------------------------------------"
echo "  Great, JIRA's starting up.  You should be able to access it momentarily on port $CUSTOM_PORT on this machine."
echo "-----------------------------------------------------------------------------------------------------------------"
}

installJira

exit 0

O’Reilly, GData, Open Standards

September 4, 2006

Tim O’Reilly’s post about GData and the importance of open standards articulates the argument for expanding the open infrastructure, for standardizing the “small pieces” that together do the heavy lifting of the Internet and make everything work together.

I like David Weinberger’s “small pieces” phrase, and I’ll adopt it here. Open standards and open source work so well, and so well together, because the pieces are small. Each standard solves a very specific problem. This allows each open source implementation of those standards to be limited in scope, lowering the barriers to entry for writing and maintaining them. The Internet today exists because of small pieces, particularly HTTP, HTML, CSS, XML, etc.

Together, these small pieces form the web platform that has fostered the startling array of innovations over the last ten years. O’Reilly’s key phrase is “A Platform Beats an Application Every Time”. If there’s any lesson to take away from the Internet, this is it. A platform beats an application because it fosters an entire ecosystem of applications that can talk to each other using these small pieces. The ability to talk to each makes each application far more powerful than if it were an isolated island. Just like an ecosystem, platforms create new niches and continually evolve as new actors emerge, and they create needs for new protocols.

This is why the current Internet lies in such a precarious state. The ecosystem has evolved, and has created needs for new protocols that do everything from traverse NATs to publish data. As the system becomes more complex, however, we’re forgetting that central tenet that small pieces made the whole thing work in the first place. In most cases, standards for solving the problems exist, but private actors either don’t realize it or decided to use their own versions regardless. This is like companies in 1994 deciding to ignore HTTP and implement their own versions.

Take NATs for example. The IETF’s SIP, TURN, STUN, and ICE provide an excellent, interoperable framework for traversing NATs. Nevertheless, Skype, BitTorrent, and Gnutella all implement their own proprietary versions of the same thing, and they don’t work as well as the IETF versions. As a result, none of them can interoperate, and the resources of all NATted computers remain segmented off from the rest of the Internet as a wasted resource. Skype can only talk to Skype, BitTorrent can only talk to BitTorrent, and Gnutella can only talk to Gnutella in spite of standards that could make all three interoperate. In Skype and BitTorrent’s case, they even ignore HTTP. They decided to completely forgoe interoperability with the rest of the Internet for file transfers.

GData, in contrast, gets high marks for interoperability. It uses the Atom Publishing Protocol (APP), RSS, and HTTP. RSS and HTTP are, of course, widely deployed already. APP is a good standard that leverages HTTP and solves very specific publishing problems on top of that. APP lets you modify any data you submit, one of Tim Bray’s first criteria for “Open” data. Google Base, built on top of GData, also shares AdSense revenue with users, fulfilling Tim Bray’s second criteria of sharing value-added information from submitted data.

The only part of GData I have a problem with is OpenSearch. OpenSearch is sort of half of an Internet standard because it emerged from a single company Amazon, in the face of a better standards out of the IETF, RDF and SPARQL.

SPARQL and RDF together create an abstraction layer for any type of data and allow that data to be queried. They create the data portion of the web platform. As Tim says, “The only defense against [proprietary data] is a vigorous pursuit of open standards in data interchange.” Precisely. RDF and SPARQL are two of the primary protocols we need in this vigorous pursuit on the data front. The Atom Publishing Protocol is another. There are many fronts in this war, however. We also need to push SIP, STUN, TURN, and ICE in terms of making the “dark web” interoperable, just as we need to re-emphasize the importance of HTTP for simple file transfers. These are the protocols that need to form, as Tim says, “a second wave of consolidation, which weaves it all together into a new platform”. If we do things right, this interoperable platform can create a world where free calling on the Internet works as seamlessly as web browsers and web servers, where every browser and every server automatically distribute load using multisource “torrent” downloads, and where all data is shared.

Standards are the key to this open infrastructure.


Chris Holmes and Architectures of Participation

August 30, 2006

My good friend Chris Holmes’s recent Tech Talk to Google is now available on Google video. Chris’s work touches on a lot of things, but you can think of it as helping to implement an open standards and open source-based infrastructure for things like Google Maps and Google Earth. You should check out his thoughts.

I get all excited when Chris talks about open standards as a cornerstone of democracy. With the web changing rapidly, we all need to remember this lesson. The web itself was based on the simple open architecture of HTTP and HTML. Analogous standards exist for geographic data. Chris’s work focuses on expanding the web platform to also support geographic data, much as my work focuses on expanding the web platform to support P2P.

I’ll write more about “architectures of participation” in the future. While “Web 2.0″ is a much catchier name, I think “architectures of participation” clears up a lot of the confusion surrounding these issues. I also think it digs deeper. A lot of the Web 2.0 thinking focuses on collaboration on the level of individual web sites. I have no problem with that, and I just love collaborative projects like Wikipedia. There’s a distinct lack of discussion about how architectures of participation at the standards layer enables all of this, though, I think because more people understand web sites than the standards driving them.

Wikipedia would, of course, never exist if we didn’t have HTTP and HTML. HTTP and HTML are really quite simple protocols, but look what they’ve enabled! Imagine what could happen if we really started growing the protocol layer of the web, integrating things like geographic standards and SIP onto standard web projects. What could collaborative projects do atop a more powerful infrastructure? I’m not sure, but it’s a question we should be taking a harder look at.


Daswani on LimeWire Complaint

August 29, 2006

Susheel Daswani has posted another excellent piece in his series about the recent RIAA action against LimeWire.  Susheel worked with me at LimeWire before moving on to study law.

The case will be fascinating to watch.  At LimeWire, we really did all wake up one day to find ourselves in the entertainment business, much to our displeasure.  As I have discussed, the Gnutella developer community was always much more interested in the technology.  Many of the most active members, like Serguei Osokine, Phillipe Verdy, or Raphael Manfredi had absolutely no commercial interest in Gnutella other than as a theoretical and technical exercise.  Susheel and I worked extensively with Serguei to write the current standards for distributed search on Gnutella.  That just always seemed so much more important than the copyright dust up, and it’s sad that it’s come to this.


Cringely, Skype, Open Infrastructure

August 16, 2006

I have seemingly plunged myself into a running debate with Robert X. Cringely about the finer points of p2p telephony and NAT traversal. I would first like to acknowledge the remarkable breadth of Cringely´s technical knowledge. An early Apple employee and a longtime savvy observer of technology, Cringely somehow has a strong grasp of the highly specialized technology underlying VoIP. His range is startling.

That said, Cringely continues to stumble over the finer technical details. I only bring this up because the fundamental problems with Skype lie in those details, as I´ll explain. They are what make Skype a “closed garden” and a detriment to the “open infrastructure” I´ve advocated. Cringely puts forth an explanation of Skype´s NAT traversal, asserting that Skype uses STUN, TURN, and ICE servers to do the heavy lifting. There are several problems with this assertion. First, there´s no such thing as an “ICE server”. ICE is a client-side protocol that makes use of STUN and TURN “candidate” endpoints to establish the best possible connection between two peers. Second, and most importantly, Skype doesn´t implement any single one of these protocols regardless. While Cringely likely understands this, his post makes no reference to this key distinction.

For the uninitiated, STUN, TURN, and ICE allow clients to traverse NATs. They are all IETF drafts that continue to change frequently, and they are typically used alongside SIP to power VoIP. This is true for almost every VoIP provider, except for Skype. Skype unfortunately chose to implement a proprietary version of each one, breaking interoperability with other VoIP providers. This makes Skype much like your cell phone in the U.S., where you typically cannot switch cell phone providers while keeping the same phone. If you switch from Sprint to Verizon, for example, you have the joy of putting your $200 phone in a box in your closet or going through the hassle of selling it on eBay. Skype has given us a similar gift. You could never use Skype software with Vonage or Gizmo, for example. If Skype used SIP, TURN, STUN, and ICE, you theoretically could.

Skype´s proprietary protocol is also an issue on web pages. On eBay, you can now press a hyperlink to call users over Skype. This link starts with “skype:” much as typical links start with “http:”. Links on web pages to initiate a phone call will become increasingly common. You could easily imagine links on MySpace pages for calling other users, for example, and some savvy MySpace users likely already have them. The problem is that every other VoIP provider uses SIP, which long ago standardized its own interoperable URIs that start with “sip:”. So, because Skype chose to implement proprietary versions of everything, you will likely have to choose between two links when making a call, one of the form “skype:afisk” and another of the form “sip:afisk@lastbamboo.org”.

Imagine if web servers made a similar choice circa 1994. In this world, instead of every link starting with “http:”, some would start with “http:” and others would start with “mywretchedprotocol:”. All browsers would have to support both. What a nightmare! We have Skype to thank for starting us along that path with VoIP.

The implications of this issue go further. While SIP certainly has its problems (why did they ever include UDP?), the carefully designed interworking of the SIP family of protocols is a thing of beauty. SIP does not depend on STUN or TURN or ICE, for example, just as STUN does not depend on TURN or SIP or ICE, etc. This allows each protocol to evolve independently and allows different developers to implement different parts of the protocol family. One open source project can simply write a STUN server, for example, while another could write a SIP server, and another a TURN client. In the end, the user gets better software because engineers can break apart the problem and focus on implementing one piece well. And they´re all documented in Internet drafts that anyone can read. Skype´s use of proprietary protocols butchers this system.

Because of its careful engineering, SIP can also carry any type of traffic over any type of transport. You can use SIP to transfer RSS feeds using HTTP over TCP as LittleShoot does, for example. Or you can just make a phone call. SIP is simply the signaling protocol used to establish the connection. Skype doesn´t have anywhere near this flexibility.

Skype´s decision to use a closed protocol has security implications as well. When calls are routed through supernodes, for example, there´s a built-in “man-in-the-middle” that can monitor all traffic. Skype encrypts calls, but do they use both server and client authentication to prevent the man-in-the-middle from launching a replay attack? If they don´t, then it´s theoretically possible for an attacker to become a supernode to listen to all of your calls. As a closed protocol, Skype isn´t open to public scrutiny in the security community that could otherwise identify and fix such vulnerabilities. There could be people implementing this exploit to monitor and decrypt your Skype calls right now. While one independent security audit claims Skype does implement both client and server authentication, this is one person evaluating their architecture as opposed to the throngs of security experts who would be eager to identify holes if the system were open. We just don´t know.

These issues all point to the importance of an open infrastructure and to the power of SIP as a bedrock of the next generation of Internet applications. As people like Vint Cerf have noted, SIP may be to the next ten years what HTTP was to the last ten, unless Skype gets in the way and everything degenerates into a battle of ugly proprietary implementations of the same thing.

I choose to believe that good engineering wins in the end. Protocols like SIP, HTTP, and XMPP will enable a new generation of far more powerful applications capable of seamlessly circumventing NATs and pooling resources to put the maximum possible power in the hands of the user.


Cringely Doesn’t Understand P2P

August 2, 2006

Mark Stephens, a.k.a. Robert X. Cringely, the guy who seems to get everything, just doesn’t seem to understand peer-to-peer. His recent post “The Skype is Falling” misses the mark several times. As Yochai Benkler has noted so clearly, p2p is about computers collaborating to share resources, much as Wikipedia is about humans collaborating to share knowledge. On Wikipedia, different people have different levels of knowledge on different topics. That’s what makes it work so well! Specialists in different areas on Wikipedia don’t have to worry about the things they don’t know about — they just contribute what they can. Together, they combine the best of human knowledge for everyone’s maximum benefit.

Peer-to-peer is at its most powerful when it does precisely the same thing — when each peer contributes as much as it’s capable of contributing. Just like humans, computers come in many shapes and sizes. Some have 200 GB hard drives and a dial-up modem. Others are cell phones with no hard drive and little memory, but with bandwidth to spare. Well-architected p2p networks take this into account, harvesting whatever resources are available at any time for the maximum benefit of the network as a whole.

Computers without firewalls and not behind NATs are one of the most precious resources on p2p networks because they supply services others can’t. Because anyone can connect to them, they serve as the glue holding any p2p network together. Without these nodes, most p2p networks simply would not function. Perhaps most importantly, they facilitate connections between NATted/firewalled nodes, allowing any node on a network to theoretically connect to any other.

This is where Cringely just doesn’t get it. He goes into detail describing how Skype uses “servers” to facilitate NAT traversal between peers. He points to the use of servers as somehow making Skype not p2p. The fact is, Skype uses distributed non-firewalled peers as “servers” to allow other peers to connect. This is ahh, well, precisely like every other p2p network on the planet. This architecture could not be more peer-to-peer. In fact, this is p2p at its best – the network is harvesting all available peer resources dynamically.

Cringely claims that “a lot of Skype connections aren’t p2p at all” because of this server interaction and that these servers need to have a “surplus of bandwidth to handle the conversation relay.” This is where his thesis really starts to unravel. He apparently does not understand that the servers are simply used as a signaling protocol to relay contact information between two peers. The call itself typically happens directly between the two computers using UDP NAT hole punching, a practice that VoIP really brought into mainstream use but that is also used in various games and in most p2p networks. There are certainly cases where the hole punching fails, such as with trying to connect 2 peers both behind symmetric NATs, but the call is typically direct. If it weren’t, call quality would often be horrible.

This is a key difference because the vast majority of the bandwidth requirements for calls are for the call itself. The headers exchanged for establishing the call are negligible. I would hazard a guess that 99% of the packets transferred in VoIP calls around the globe are voice packets, not headers. These packets never touch the server unless UDP hole punching fails. It’s an open question as to what Skype does when UDP hole punching fails, but I’d actually be surprised if they were even using “supernodes” in this case, likely instead routing their calls through their own servers. I just don’t think supernodes would be able to provide enough bandwidth for high enough call quality.

Far from the sort of faux p2p Cringely describes, Skype’s use of non-firewalled nodes to negotiate voice sessions between two or more peers is p2p at its finest. Now, don’t get me wrong, I have plenty of problems with Skype’s use of a proprietary protocol instead of SIP, and I’d prefer them to be open source, but this part of Cringely’s analysis just doesn’t make any sense.


Follow

Get every new post delivered to your Inbox.