Django and More on Google App Engine with App Engine Patch

November 16, 2008

I’ve recently had the chance to play with Waldemar Kornewald’s “Google App Engine Patch” and have come away very impressed. All LittleShoot Google App Engine (GAE) projects now run on it for a couple of simple reasons:

  1. Seamless Django integration (including 1.0.1)
  2. Thorough documentation
  3. Healthy open source development community with an excellent steward in Kornewald and frequent new releases

The Django integration got me first. Almost everything works, such as manage.py, Django authentication, Django testing (the original reason I switched), etc. There are also lots of other goodies in there, like support for boto’s SQS module. If you’re unfamiliar with it, boto SQS allows you to call Amazon’s Simple Queue Service (SQS) from Python. That’s a huge step in getting around GAE’s limitation on longer lived, CPU-intensive tasks. Just queue it up in SQS, and your GAE app will keep humming along fine — cloud integration at its finest.

To get started with App Engine Patch, download the zip file from the home page and work off the sample project. The download includes App Engine Patch itself as well as the sample project.  

With Django App Engine Helper not supporting Django 1.0 and seemingly inactive, App Engine Patch is a godsend and gets a huge thumbs up.

Great work Waldemar, and don’t forget to donate.


Voting Location Via Text Message

October 30, 2008

My buddies Chris Muscarella and Benjamin Stein over at Mobile Commons just released a simple service to find your voting place via text message.  Here’s all you have to do:

text pp then your street address and zip to 69866 (eg: pp 101 market st 94105)

That’ll give you the polling place for your address or the number for the Election Protection Coalition if your address isn’t in their system.  Their original post is here.

We’ll be studying this election for centuries.  Great job fellas.  I’m rooting for your servers holding up!


Hadoop on EC2 with OpenSolaris

October 2, 2008

The OpenSolaris crew just announced you can run Hadoop AMIs on EC2 running on top of OpenSolaris. That’s just cool. I’m still not ready to abandon my Django code running on App Engine (I’ve got a post coming up on the stellar new update to Google App Engine Patch, by the way), but I’d love to play with it. Anyone else given it a go?

You can run it with:

ec2-run-instances –url https://ec2.amazonaws.com ami-2bdd3942 -k <your-keypair-name>

You can get more info on running OpenSolaris on EC2 here.


Amazon Web Services vs. Google App Engine: The Race to the One-Click Cloud

August 27, 2008
One-Click Shopping

Can Amazon Build the One-Click Cloud?

It’s a great time to program for the cloud, no matter what Ted Dziuba’s entertaining but barely coherent rants have to say (will someone get that guy some experience?). Amazon and Google are going toe-to-toe, with Amazon’s addition of sorting in Simple DB bringing it up to par with Google App Engine’s Datastore API. Sorting was the biggest missing piece in Simple DB and the most compelling reason to choose the Datastore API instead. No longer.  

But Google App Engine (GAE) and the Datastore API still win. Here’s why:

  1. The Datastore API is projected to be 10x cheaper. $0.15-$0.18 per GB-month sounds a lot better than Simple DB’s $1.50 per GB-month.
  2. GQL. GAE’s SQL subset is just brain dead simple. As adept as programmers are at learning new frameworks, it’s nice to have something brain dead every once in awhile. Simple DB takes a few more cycles to learn (brain cycles that is — more coffee and such. Modafinil perhaps? Anyone tried it? I’m curious).
  3. GAE has better Object Relational Mapping (ORM). GAE basically uses Django’s sweet ORM system. You’ve got to jump through a lot more hoops to get something as nice with Simple DB. 
  4. GAE automatically scales the web application, not just the database. With Amazon, you have to add load balancing and bring machines up and down yourself, even if you’re using Simple DB. While there are third-party tools to help, they’re not built-in. Again, GAE is brain dead here.  

Sure, App Engine only supports Python. The ultimate question, though, is what functionality can you get in the end? For web apps, App Engine gives you more, particularly for scaling (which is kind of the whole point). Don’t know Python? Learn it. It will save you time in the end. Instead of endlessly fiddling with your load balancer and custom scripts for bringing instances up and down, you’ll spend your time adding the next killer feature your users will love.

In the end, the Amazon/Google “main event” is a huge win for you, me, and our users. The sorting announcement from Amazon comes on the heals of a flurry of other new features from both companies, including Amazon’s impressive persistent storage addition for EC2 called the Elastic Block Store, querying by attributes on Simple DB, GAE’s support for 10 applications per user instead of 3, GAE’s batch writes, etc. Neither one is pulling any punches, and the tools at our disposal as developers are progressing at a breathtaking pace as a result.

Amazon’s is clearly the more complete offering (you can do anything on it, in any language), but it needs to learn from Google’s focus on the dominant deployment scenarios.  Amazon could easily win if it does the following:

  1. Makes Simple DB pricing competitive with Google’s projected prices.
  2. Adds a query language for Simple DB along the lines of GQL.
  3. Adds automatic scaling for web applications, not just the database.
  4. Offers complete deployment solutions for the dominant web applications frameworks, from Tomcat/Spring/Hibernate to Django and Zend, with ORM models already adapted to Simple DB, instances automatically replicated with traffic, etc. Basically the same thing as App Engine for more web app frameworks than App Engine supports and adapted to the Amazon platform. Sure, there are third-party solutions for some of this stuff, but those will never be trusted as much as something offered directly from Amazon.

I’m a big fan of Amazon and Werner Vogels (one of the most innovative people in the industry, and also apparently a pretty nice guy), but Amazon desperately needs to learn from what Google has done. It’s ultimately a question of “usability” for developers. The originators of “one-click shopping” are losing in the game they practically invented. 

Amazon needs to turn on the one-click cloud.


TechCrunch 50, LittleShoot, and the Aftermath

August 15, 2008

Things have been on hold here for the public release of LittleShoot as we have awaited word on our TechCrunch 50 application.  We didn’t make it! Darn.  It came down to the wire — we were in the last batch of companies to receive notification we weren’t in the conference on Friday at 1:24 AM EST. The 50 companies that made it in should know now.  Here’s the e-mail:

 

Dear TechCrunch50 Candidate:

We are sorry to inform you that your company was not selected as a finalist for the TechCrunch50 conference. As you know, we are only able to select a very, very small percentage of the more than 1,000 outstanding applications we receive.

Your company was among a select set of candidates that we considered, and it was a difficult decision driven purely by the limited number of presentation slots. Since we regarded your business so highly, we want to make sure you still get the opportunity to participate in the conference in our DemoPit.
(http://techcrunch50demopit.eventbrite.com).

As a DemoPit company, you will have the opportunity to be nominated for the People’s Choice award and win the 50th spot on the TechCrunch50 main stage. As the 50th company to present, the People’s Choice award winner will be able to compete for the $50,000 TechCrunch50 award. Act fast, as spaces are very limited and first come, first served.

Additionally, all DemoPit companies will benefit from the exposure generated by media attending the event. We do anticipate having approximately 300 members of the international press in attendance.

If you have questions regarding the TechCrunch50 Demo Pit opportunity, please email Dan Kimerling at dan@techcrunch.com.

Sincerely,

–Jason, Heather & Michael
and the TechCrunch50 Team

 

I’ve got a lot of respect for the TechCrunch folks and the way they give unfunded companies a shot, and I thoroughly enjoyed meeting Jason Calacanis down at the Mahalo Tech Meetup last night in one of my first nights in LA. I appreciate the tremendous work Jason, Michael, Heather, and the other folks at TechCrunch have put it to the process.  

That said, it’s on.  I feel like the guy on draft day who didn’t go on the first round.  It’s the “meritocracy” thing that gets me – TechCrunch 50 is touted as a pure meritocracy.  I’d put LittleShoot’s technology up there with anyone, and it just kills me to think 50 startups beat us out.  We can tell ourselves they had better business models, better marketing plans, yada yada yada, but I’m taking it to mean they had better technology.  If there’s anything that motivates me, that’s it. I have great respect for the other applicants, and we all supported each other on the TechCrunch blog as we agonized through the waiting process.  I wish everyone the best of luck, but the LittleShoot public beta is on its way.

Here’s a link to the LittleShoot demo video we submitted for TechCrunch for people unfamiliar with the Little Fella’:

LittleShoot Demo


Where Google App Engine Spanks Amazon’s Web Services: S3, EC2, Simple DB, SQS

May 28, 2008

First off, I loooove Amazon Web Services (AWS), and we make heavy use of S3, EC2, Simple DB, and Elastic IPs for LittleShoot. We run everything on Amazon, but that’s about to change.  I’ve been in the App Engine beta for about a month, and despite all of the astonishing ways AWS makes building web sites easy, Google App Engine makes it far easier. Here’s why:

1) Google App Engine Has Better Scalability  

Google migrates your application across its infrastructure automatically as needed. With EC2, you have to manually detect machine load and bring instances up or down accordingly. You need to set up load balancing and clustering. While Amazon gives you far more control, it’s also far more work.  With Google App Engine, it’s all done for you.

2) Google App Engine Has a Better Database

Google App Engine’s Big Table blows Amazon’s Simple DB out of the water, and that’s coming from a big fan of Simple DB and Amazon CTO Werner Vogels. Simple DB thankfully yanks developers out of the relational database mindset and automatically replicates data across machines for scalability. You do have to learn a completely new query syntax, however, and, as this blog has noted, sorting is not officially supported. Simple DB is also still in beta.  

With App Engine, you’re using the same database, Big Table, Google engineers use to power some of the busiest sites on the Internet. Billions of queries have been hammering out kinks in Big Table for years. You know it will scale.  What’s more, App Engine’s “GQL” gives developers a familiar SQL-like syntax, lowering the learning curve compared to Simple DB.  Big Table also supports sorting.  Perhaps most significantly, Simple DB costs far more. While Google’s final pricing announcement later this year may change, today’s announcement didn’t mention any difference in price for data stored in the database versus anywhere else. On Simple DB, that data costs $1.50 per GB-month. On App Engine, it appears to cost $0.15 – $0.18 per GB-month. Wow.

3) Google App Engine is Cheaper 

Beyond the database, App Engine gives you the first 5 million or so page views per month for free.  That’s a lot of page views. It doesn’t put you up with the Internet’s top dogs, of course, but at 5 million page views you should be making cash. App Engine is free precisely when you’re building your company and keeping costs low is the most important. If you go beyond that 5 million, Google’s I/O event today will reveal newly announced prices that are remarkably similar to Amazon’s current offerings. They both price everything per GB or CPU-hour, and the numbers are barely distinguishable. That first 5 million page views and the apparent huge disparity in database storage pricing are by far the biggest differentiators, both dramatically tipping the scales in favor of Google.

Conclusion

For building scalable web applications quickly, App Engine beats AWS by a surprisingly wide margin.  Note, however, this refers specifically to web applications. For anything custom, you need Amazon. Because App Engine only supports Python, you also need Amazon for running any non-Python code. While this is a significant difference, many good developers are facile with multiple languages and can move rapidly between them.  Amazon’s flexibility makes it win out for many applications, but not for the most common application there is: web sites.  App Engine is more of a “domain-specific cloud” for web applications, but it’s shockingly good at what it does.  

Oh yeah, and it’s cheap.


P2P in Flash 10 Beta — the Questions Facing a YouTube, Skype, and BitTorrent Killer

May 21, 2008

As I’ve reported, the inclusion of P2P in Flash 10 Beta represents a fundamental disruption of the Internet platform. As with all disruptions, however, this one will progress in fits and starts. Flash 10′s details limit the full power of its P2P features. While features like VoIP will be fully enabled, it will take some ingenuity to turn Flash 10 into a more generalized P2P platform. Here are the issues:

1) Flash Media Server (FMS)

You’ll need Flash Media Server (FMS) to take advantage of Flash P2P. At $995 for the “Streaming Server” and $4,500 for the “Interactive Server”, FMS is beyond the reach of most developers working on their own projects, severely limiting Flash P2P’s disruptive potential. In an ideal world, the new P2P protocols would be openly specified, allowing open source developers to write their own implementations. As it stands now, a single company controls a potentially vital part of the Internet infrastructure, and encryption will likely thwart the initial reverse engineering efforts of open source groups like Red5.

2) No Flash Player in the Background

As David Barrett (formerly of Akamai/Red Swoosh) has emphasized on the Pho List, Flash Player only runs when it’s loaded in your browser. As soon as you navigate to another page, Flash can no longer act as a P2P server. P2P programs like Red Swoosh, BitTorrent, and LittleShoot don’t have this limitation, and it means Flash can’t save web sites as much bandwidth as those full-blown applications can. This limits but does not eliminate Flash’s threat to CDNs. Sure, you could get around this using AIR, but that creates another major barrier to adoption.

3) Usability

While Flash 10 has the ability to save files to your computer and to load them from your computer (essential for P2P), it pops up a dialog box each time that happens. While this is an important security measure, it cripples Flash 10′s ability to mimic BitTorrent because you’d have dialogs popping up all the time to make sure you as a user had authorized any uploads of any part of a file.

4) Limited APIs

While all the required technology is there in the Real Time Media Flow Protocol (RTMFP), ActionScript’s API limits some of the P2P potential of Flash 10. P2P downloading breaks up files into smaller chunks so you can get them from multiple other computers. Flash 10 can only save complete files to your computer — you can’t save in small chunks. As a result, you’d have to use ActionScript very creatively to achieve BitTorrent or LittleShoot-like distribution or to significantly lower bandwidth bills for sites serving videos. It might be possible, but you’d have to work some magic.

So, that’s the deal. There’s still a lot more documentation coming our way from Adobe, so there are undoubtedly useful nuggets yet to be discovered.

Even given all these limitations, however, the key point to remember is the Internet has a new, immensely powerful protocol in its arsenal: Matthew Kaufman and Michael Thornburgh’s Real Time Media Flow Protocol (RTMFP). While Flash might use it primarily for direct streaming between two computers now (think VoIP), it introduces the potential for so much more.

Keep your helmet on.


P2P in Flash 10 Beta – a YouTube, Skype, and BitTorrent Killer

May 16, 2008

The inclusion of p2p in the Flash 10 beta threatens to bring down everyone from YouTube to Skype. Using P2P, Flash sites will be able to serve higher quality video than YouTube at a fraction of the cost. Meanwhile, the combination of the Speex audio codec and the Real Time Media Flow Protocol (RTMFP) will enable sites to seamlessly integrate VoIP without requiring a Skype install. The impact of this change is hard to fathom. We’re talking about a fundamental shift in what is possible on the Internet, with Flash demolishing almost all barriers to integrating P2P on any site.

Hank Williams and Om Malik have discussed the potential for Flash 10 to be used for P2P CDNs, and they’re largely right on. The biggest problem I see with P2P CDNs is oddly latency, however. While P2P theoretically enables you to choose copies of content closer to you on the network, you still have to negotiate with a server somewhere to establish the connection (for traversing NATs), nullifying the P2P advantage unless you’re talking about really big files. As Hank identifies, the sites serving large files are the CDN’s best customers, so we are talking about a significant chunk of the CDN business up for grabs. That said, CDNs could easily start running Flash Media Servers themselves with integrated RTMFP. They’ve already addressed the server locality problem, and taking advantage of Flash deployments would simply be an optimization. Whether the CDNs will realize this shift has taken place before it’s too late is another question.

To me, the really vulnerable players are the video sites themselves and anyone in the client-side VoIP space. Writing a VoIP app is now equivalent to writing your own Flash video player. All the hard stuff is already done. Same with serving videos. You no longer have to worry about setting up an infinitely scalable server cluster — you just offload everything to Flash. No more heavy lifting and no more huge bandwidth bills. In the BitTorrent case, it’s mostly a matter of usability. As with Skype, you no longer need a separate install. Depending on what’s built in to the Flash Media Server, you also no longer need to worry about complicated changes on the server side, and downloads will happen right in the browser.

The stunning engineering behind all of this should be adequately noted. The Real Time Media Flow Protocol (RTMFP) underlies all of these changes. On closer inspection, RTMFP appears to be the latest iteration of Matthew Kaufman and Michael Thornburgh’s Secure Media Flow Protocol (SMP) from Adobe’s 2006 acquisition of Amicima. Adobe appears to have acquired Amicima specifically to integrate SMP into Flash, now in the improved form of RTMFP. This is a very fast media transfer protocol built on UDP with IPSec-like security and congestion control built in. The strength of the protocol was clear to me when Matthew first posted his “preannouncement” on the p2p hackers list. Very shrewd move on Adobe’s part.

Are there any downsides? Well, RTMFP, is for now a closed if breathtakingly cool protocol, and it’s tied to Flash Media Server. That means Adobe holds all the cards, and this isn’t quite the open media platform to end all platforms. If they open up the protocol and open source implementations start emerging, however, the game’s over.

Not that I have much sympathy, but this will also shift a huge amount of traffic to ISPs, as ISPs effectively take the place of CDNs without getting paid for it. While Flash could implement the emerging P4P standards to limit the bleeding at the ISPs and to further improve performance, this will otherwise eventually result in higher bandwidth bills for consumers over the long term. No matter — I’d rather have us all pay a little more in exchange for dramatically increasing the numbers of people who can set up high bandwidth sites on the Internet. The free speech implications are too good to pass up.

Just to clear up some earlier confusion, Flash Beta 10 is not based on SIP or P2P-SIP in any way. Adobe’s SIP work has so far only seen the light of day in Adobe Pacifica, but not in the Flash Player.


OpenSocial, Facebook, Google, OpenGadgets

January 28, 2008

I still find all of the attacks on OpenSocial to be naive. Did anyone ever really think each company would open up it’s social graph? Apparently so, but I certainly didn’t. How could Google possibly get everyone to join and dictate they all open their data? One step at a time! Perhaps I’ve never been disappointed because I never conceived of OpenSocial as anything but OpenGadgets. I don’t think the importance of OpenGadgets should be overlooked, however. Google Gadgets is a generally sound approach to standardizing gadget making using html and JavaScript. It’s certainly a big step up from learning some new proprietary markup language invented by Zuckerburg & co.

Perhaps this ties in to my general disdain for Facebook. Sure, it’s a heck of a lot better than MySpace, but is that really saying much? It’s still a social network, which is just inherently cheesy and doesn’t solve any interesting technical problem whatsoever. I find it shocking there’s talk of people leaving Google to go to Facebook. Maybe I’m a tech snob, but old Sergey and Larry actually solved a really challenging technical problem at Google. It makes sense to me they have legions of programmers rallying behind them. Facebook? This guy was a freshman at Harvard who knew a little PHP. Not to mention he stole the idea from his “buddies” and broke away to do it on his own. There’s all this excitement about a company that’s just not interesting technically with sketchy business ethics? I just don’t get it.

I’ll stop soon, but let’s just touch on the “Facebook platform.” Come on. We really need another proprietary platform? There’s a platform I’ve come to know and love that’s simply an astounding place for innovation called the “Internet.” It’s really cool. It’s really open. I’ve come to like other platforms like Ning, but I just have no interest in writing something in Facebook’s markup language that will make them more advertising cash. People like Facebook because there’s money to be made. That’s the only conclusion I can make. It’s not a horrible reason, but we should at least call it like it is. All the ramblings about how innovative the platform is remind me of company valuations in the late 90s. A lot of talk. I have yet to see any truly innovative Facebook app. Seriously. Please don’t super poke me. Ever. Most Facebook apps are not only uninteresting, but I actively wish they didn’t exist. They make my life worse and waste my time.

I’ll take the Internet any day.


O’Reilly, GData, Open Standards

September 4, 2006

Tim O’Reilly’s post about GData and the importance of open standards articulates the argument for expanding the open infrastructure, for standardizing the “small pieces” that together do the heavy lifting of the Internet and make everything work together.

I like David Weinberger’s “small pieces” phrase, and I’ll adopt it here. Open standards and open source work so well, and so well together, because the pieces are small. Each standard solves a very specific problem. This allows each open source implementation of those standards to be limited in scope, lowering the barriers to entry for writing and maintaining them. The Internet today exists because of small pieces, particularly HTTP, HTML, CSS, XML, etc.

Together, these small pieces form the web platform that has fostered the startling array of innovations over the last ten years. O’Reilly’s key phrase is “A Platform Beats an Application Every Time”. If there’s any lesson to take away from the Internet, this is it. A platform beats an application because it fosters an entire ecosystem of applications that can talk to each other using these small pieces. The ability to talk to each makes each application far more powerful than if it were an isolated island. Just like an ecosystem, platforms create new niches and continually evolve as new actors emerge, and they create needs for new protocols.

This is why the current Internet lies in such a precarious state. The ecosystem has evolved, and has created needs for new protocols that do everything from traverse NATs to publish data. As the system becomes more complex, however, we’re forgetting that central tenet that small pieces made the whole thing work in the first place. In most cases, standards for solving the problems exist, but private actors either don’t realize it or decided to use their own versions regardless. This is like companies in 1994 deciding to ignore HTTP and implement their own versions.

Take NATs for example. The IETF’s SIP, TURN, STUN, and ICE provide an excellent, interoperable framework for traversing NATs. Nevertheless, Skype, BitTorrent, and Gnutella all implement their own proprietary versions of the same thing, and they don’t work as well as the IETF versions. As a result, none of them can interoperate, and the resources of all NATted computers remain segmented off from the rest of the Internet as a wasted resource. Skype can only talk to Skype, BitTorrent can only talk to BitTorrent, and Gnutella can only talk to Gnutella in spite of standards that could make all three interoperate. In Skype and BitTorrent’s case, they even ignore HTTP. They decided to completely forgoe interoperability with the rest of the Internet for file transfers.

GData, in contrast, gets high marks for interoperability. It uses the Atom Publishing Protocol (APP), RSS, and HTTP. RSS and HTTP are, of course, widely deployed already. APP is a good standard that leverages HTTP and solves very specific publishing problems on top of that. APP lets you modify any data you submit, one of Tim Bray’s first criteria for “Open” data. Google Base, built on top of GData, also shares AdSense revenue with users, fulfilling Tim Bray’s second criteria of sharing value-added information from submitted data.

The only part of GData I have a problem with is OpenSearch. OpenSearch is sort of half of an Internet standard because it emerged from a single company Amazon, in the face of a better standards out of the IETF, RDF and SPARQL.

SPARQL and RDF together create an abstraction layer for any type of data and allow that data to be queried. They create the data portion of the web platform. As Tim says, “The only defense against [proprietary data] is a vigorous pursuit of open standards in data interchange.” Precisely. RDF and SPARQL are two of the primary protocols we need in this vigorous pursuit on the data front. The Atom Publishing Protocol is another. There are many fronts in this war, however. We also need to push SIP, STUN, TURN, and ICE in terms of making the “dark web” interoperable, just as we need to re-emphasize the importance of HTTP for simple file transfers. These are the protocols that need to form, as Tim says, “a second wave of consolidation, which weaves it all together into a new platform”. If we do things right, this interoperable platform can create a world where free calling on the Internet works as seamlessly as web browsers and web servers, where every browser and every server automatically distribute load using multisource “torrent” downloads, and where all data is shared.

Standards are the key to this open infrastructure.


Follow

Get every new post delivered to your Inbox.