A Look at Open Source Inside Connected

Open Source

The cost of building software products has dramatically fallen compared to a decade ago. Products that used to take millions of dollars are now being built for hundreds of thousands if not tens of thousands of dollars. Two of the most important drivers of falling costs have been open source software and cloud computing.

Yesterday I had the delightful task of rebuilding one of our production cloud images for Connected. What I realized during that process was the full extent to which we rely on open source software to build Connected. Connected wouldn't be what it is today and couldn't have been built nearly as quickly or cheaply without the incredible amount of open source used throughout the stack. I thought I'd take a moment to catalog all the open source software we use to give you a sense of just how much it has truly changed the cost of software development.

Production Operating Systems
Fedora - OS used on our web servers
CentOS - OS used on our queue servers

Data Tier
MySQL - data storage
mysql-proxy - used for automatic db failover
memcached - hot cache

Web Servers
Apache - application web server
mod_wsgi - interface to Python application code
Nginx - static files and load balancing web server

Application Code
WordPress - hosts our blog
Python - application programming language

Python Libraries
django - Python web framework
setuptools - easy package installation
pip - even easier package installation
virtualenv - isolated package installations
mysql-python - Python MySQL driver
BeautifulSoup - HTML parser
lxml - HTML parser
django_compressor - JS and CSS static file compression
django-indexer - simple key/value store
django-paging - simple paging
django-sentry - detailed web request error logging
greenlet - concurrent programming library
eventlet - concurrent programming library
pyopenssl - SSL support
gdata - Google Data API library
httplib2 - advanced http support
pycrypto - cryptography
python-openid - OpenID
pytz - timezone support
tlslite - SSL support
feedparser - broad feed parsing support
iso8601 - ISO 8601 date conversion
thrift - cross-language development
evernote - Evernote API library
python-dateutil - automatic date conversion
vobject - vCard support
suds - SOAP API library
python-ntlm - NTLM authentication
dnspython - DNS querying
django-storages - common Django storage back-ends
boto - S3 library
python-memcached - memcached library
aweber_api - AWeber API
django-templatetag-sugar - simplified django templates
oauth2 - OAuth library
pyssh - SSH client
django-logging - django debugging
debug_toolbar - django debugging
daemon - daemonize your background processes

Front-end Code
jQuery - light-weight javascript library
jQuery UI - UI widgets for javascript
jquery-autocomplete - autocomplete text field
jquery-fancybox - pop-up dialog
sencha - mobile javascript framework
micro_templating.js - John Resig's simple javascript templating
underscore.js - powerful data manipulation javascript library
Backbone.js - light-weight javascript MVC framework

Developer Tools
svn - version control
svnX - Mac svn client
Eclipse - developer tools
iTerm - alternative terminal client
PyDev - Python support in Eclipse
Pylint - Python static analysis
pyflakes - Python static analysis
Munin - graphing and monitoring
yui compressor - javascript compressor

Understanding the Players in the Social Data Layer

Qwerly

Social is clearly one of the biggest trends on the web right now, with the majority of new apps and services taking advantage of your friends to provide a more participatory experience. This extends across desktop and mobile applications as well as across most verticals, including media, e-commerce, travel, and more.

But what’s most exciting to me is what is happening a layer below these applications - the rise of the social data layer. The social data layer provides a set of compelling APIs that any application can take advantage of to quickly immerse its experience in social. Just as cloud computing significantly reduced the cost of building web applications, these social data platforms are significantly reducing the friction in creating compelling social experiences.

While Facebook is clearly leading the efforts in providing the social data layer, there is a growing set of startups and other providers of social data that new applications can take advantage of. I thought I’d take a moment to describe the current landscape from my perspective.

Social Network APIs
Without a doubt, at the core of the social data layer are the social networks that enable access to both their rich social profile data as well as robust social graph APIs.

Facebook
Facebook, now with over 750 million active members, is not only the largest of the social network providers, but also kicked off the social data revolution by opening up their APIs in 2007. Any app developer building a social application should strongly consider making Facebook their base, with the largest & truest social graph across all the networks.

Twitter
Twitter, with 300 million registered accounts, provides a unique set of opportunities as their one-way follow mechanic has led to Twitter’s social graph being described as the interest graph as opposed to a pure-play social graph. Since you can follow people you may not know, whether because you are interested in their area of expertise or just want to keep an eye on them, it creates a unique set of graph nodes that are compelling for a variety of applications. And of all the social data providers, Twitter has been the closest in keeping up with Facebook in terms of API robustness and has even gone on to create an entire layer of streaming APIs that are unique to Twitter and their data set.

LinkedIn
At 120 million members, LinkedIn is by far the largest professional graph with the richest searchable resume data for each of its members. It’s a clear choice for any professional application. LinkedIn has had renewed focus of late on their API offering and has expanded beyond basic profile APIs to also allow you to query their company data, group data, and jobs database.

Email Providers
Another rich source of implicit social data that I believe is still significantly under-utilized is the email inbox. Locked inside one’s inbox is almost a truer representation of one’s social graph compared to that which is mapped on explicit social networks. And we are only now starting to see applications leverage this data in interesting ways.

Gmail
Gmail is leading the pack in opening up their platform to third party developers. For one, they launched an OAuth extension to their IMAP APIs, which now allows you to have delegated access to a user’s inbox without the user having to share their credentials. Given how sensitive the inbox is, this one addition goes a long way to ensure user trust. In addition, they launched the Gmail Contextual Gadget extension point that allows apps to be embedded right within Gmail. Unfortunately this extension is currently limited to Google Apps, but will hopefully be ported to consumer Gmail as well.
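
To make the delegated-access idea concrete, here is a minimal sketch in Python. It assumes the later OAuth 2.0 flavor of this mechanism (SASL XOAUTH2) rather than the original OAuth extension described above, and it assumes you have already obtained an access token through some OAuth flow. The function names, the placeholder address, and the placeholder token are mine, for illustration only.

```python
import imaplib

def build_xoauth2_string(email_address, access_token):
    """Build the SASL XOAUTH2 client response (before base64 encoding)."""
    raw = f"user={email_address}\x01auth=Bearer {access_token}\x01\x01"
    return raw.encode("utf-8")

def open_inbox(email_address, access_token, host="imap.gmail.com"):
    """Open the inbox with a delegated token instead of the user's password."""
    conn = imaplib.IMAP4_SSL(host)
    # imaplib base64-encodes whatever bytes the callback returns.
    conn.authenticate(
        "XOAUTH2",
        lambda _challenge: build_xoauth2_string(email_address, access_token),
    )
    conn.select("INBOX", readonly=True)
    return conn

# Placeholder values; a real token comes from an OAuth authorization flow.
auth_bytes = build_xoauth2_string("user@example.com", "ACCESS_TOKEN")
print(auth_bytes)
```

The point is that the application only ever holds a revocable token, never the user's password. Calling `open_inbox` needs network access and a valid token, so only the string-building step is exercised here.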

Yahoo Mail
Yahoo Mail also provides a set of APIs to query their inbox directly. It’s nicer than Gmail’s interface since you can bypass IMAP altogether and use their web standard APIs. Yahoo has also invested in mail as a platform by enabling applications to be installed right within the Yahoo Mail interface. Unlike Gmail, these apps are targeted at both consumers and professionals leveraging Yahoo Mail.

Windows Live Hotmail and AOL Mail are other notable mentions here due to their large user bases. However, neither has devoted serious resources to opening up a platform to access their inbox data, though straight POP and IMAP access is available.

Inbox APIs
While you could develop an application that directly speaks to the various email APIs out there, there are a set of inbox API startups looking to simplify the entire effort of accessing inboxes on your user’s behalf.

Context.IO
Context.IO provides a robust inbox API that will automatically index the inbox of your end-user and provides an easy-to-use REST API for accessing those messages. If you have ever dealt with IMAP, you’ll appreciate all the work that Context.IO does for you so you don’t have to deal with the complexities associated with it. They currently enable indexing an IMAP account with speed and scale.

Jexy
Jexy is still in its infancy, but one to watch for normalized API access to email, calendar, notes, and more. Their goal is to provide a single interface across any inbox, whether it’s IMAP, POP, Exchange, etc. Looking forward to trying their beta when it becomes available.

Social Data Providers
The social data providers supply connection data to map an email address to social profiles across the web. This becomes very useful when trying to acquire social data from email addresses or trying to fill out a full social profile of a given user.

Rapleaf
Rapleaf was the first compelling offering in this space with one of the largest databases of social data. Unfortunately they have discontinued their social profile lookup API that returned various social profiles for a given e-mail address due to negative press around how they acquired their data. They do still offer a useful personalization API that will give you data on the user behind an e-mail address, including age, gender, household income, and more.

Qwerly
Qwerly allows you to search both by an e-mail address to find associated social profiles or by a single social profile and find other associated social profiles. The data has been leveraged by many email marketing providers, CRM tools, and more to help provide more data about your customers. They have an interesting approach to acquiring their data through a sophisticated social profile crawler.

FullContact
FullContact is a more recent entrant to the social data space, with a compelling contact API that allows you to send it a partial contact record (name, email, phone number, etc) and have them fill it out with more complete data, including social profiles. Again a very useful data source for apps looking to build a more complete profile of users and connections.

Fliptop
Fliptop also enables looking up an e-mail address and returning both social profiles for that person as well as demographic data like name, gender, location, and more. Worth checking them out as well.

Google Social Graph API
Google also provides an answer to this problem via their public crawler. The Social Graph API enables accessing social profile data via their search engine and allows you to search across a variety of attributes. It’s definitely worth looking at for your needs as I hear their data set has gotten much better over the years.

Social Influence APIs
With the rise of the power of the collective community across social networks, key influencers become more and more important. And now there are a set of APIs for you to understand just how influential a person is across their areas of expertise. This data can be used for scenarios ranging from CRM, to customer support, to social marketing campaigns.

Klout
Klout is the most well known social influence provider available. While they got their initial start analyzing Twitter data, they have since expanded to analyze 10 different social media properties, including Facebook, LinkedIn, YouTube, Blogger, and more. For any user, Klout provides an overall influence score as well as details on a given person's areas of expertise.

PeerIndex
PeerIndex is another provider of social influence data. They similarly provide detailed scores on each user to help you better understand their topic expertise as well as overall audience reach.

Personal Data Stores
These projects attempt to bring all your personal data together and then make them available to services via a unified API. They provide value both to the end user in the aggregation but also to developers via their API.

Singly
Singly is the company behind the Locker Project, an open-source effort to create a personal data store of all your personal data from across the web. While a useful end-user service in itself, they also plan on offering a rich set of APIs for developers to take advantage of to get access to this personal data store for their applications. While still in the early phases, certainly a worthwhile effort and one to watch.

Greplin
Greplin is the ultimate search tool across your personal data. They index all your social accounts, inbox, and more and provide a simple Google-style interface to search across the data. They currently have an API in closed beta, but will hopefully open it up shortly to allow other developers to take advantage of their rich index.

API Aggregators
When you are looking to integrate with a variety of APIs, it’s often useful to consider leveraging an API aggregator that normalizes social data for you into a single API interface.

Gnip
Gnip is the most well known API aggregator, providing comprehensive access across a variety of social data providers. They are also the official Twitter partner for getting access to the Twitter firehose of data. So if you have extreme Twitter needs, these are also your guys.

Apigee
Apigee is useful in the development stage, as they provide developer tools for exploring APIs, making it much easier to get started with a variety of APIs. I expect over time these guys may even help normalize these different APIs for developers, though they are currently focused on working mainly with API publishers.

Standards
There have been several attempts as well to develop standards for sharing social profile data that will hopefully continue to get traction amongst publishers.

PortableContacts
PortableContacts was designed as a standard for publishers to share their contact data uniformly across a variety of publishers. Plaxo was one of the first publishers to support it, as Joseph Smarr was a strong advocate for it while he was there. Google also has an implementation of PortableContacts. However, we haven’t yet seen many of the other providers of contact data take up PortableContacts, so its usefulness is currently limited.

Webfinger
Webfinger attempts to bring back the old finger protocol that allowed you to get identity information. The new webfinger API enables modern access to identity information. Google again currently implements Webfinger, but it continues to have low adoption amongst other data providers.
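
For a sense of what a Webfinger lookup looks like, here is a rough Python sketch that builds the lookup URL for an email-style account. It assumes the form that was later standardized (a `/.well-known/webfinger` endpoint with a `resource` parameter), which may differ from the draft implementations available at the time of writing. The function name and the example account are hypothetical.

```python
from urllib.parse import urlencode

def webfinger_url(account):
    """Return the lookup URL for an account such as 'user@example.com'."""
    user, _, host = account.partition("@")
    if not user or not host:
        raise ValueError("expected an address of the form user@host")
    # The account is passed as an 'acct:' URI in the 'resource' parameter.
    query = urlencode({"resource": f"acct:{account}"})
    return f"https://{host}/.well-known/webfinger?{query}"

# Hypothetical account, for illustration only.
url = webfinger_url("carol@example.com")
print(url)
```

Fetching that URL from a provider that supports the protocol would return a small document describing the identity, which is the modern equivalent of what finger used to provide.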

If you know of other startups or technologies that help to access the social data layer, please leave a comment!

Evernote, The 100 Year Company

Evernote Trunk Conference

Today I had the pleasure of attending the Evernote Trunk Conference, Evernote’s first ever developer-focused event. I was excited to attend not only because Connected is an app in the Evernote Trunk, but also because Evernote has been a break-out success story in the productivity space. I thought I’d share some of my thoughts from the day.

The 100 Year Company
When Evernote raised their recent $50M round of funding, Phil Libin announced his vision for Evernote being a 100 year company and explained how this monster round of funding would help ensure that was possible. He re-iterated at the event that there is no exit strategy for Evernote and he is truly committed to building Evernote for the long-haul. It’s refreshing and bold to see such ambition and I believe their recent moves reflect making it a reality.

Evernote Acquires Skitch
Evernote is already putting that monster round to use and today announced their first acquisition: Skitch. A very complementary tool, Skitch has become a popular Mac app for editing and annotating photos. It’s exciting to see these two productivity apps together and speaks to Evernote’s larger vision of productivity beyond simple note taking. I expect we’ll see more acquisitions from Evernote in the near future along these lines.

Freemium Works
At the event, Phil shared the following update on user growth, engagement, and monetization stats:

Evernote stats: 12.5M users, 4.5M 30 day active users, 42k daily new users, 568k paying users #evernote_etc


They continue to reinforce that freemium can indeed work given the right combination of a large base of users (thanks to their success on mobile app platforms), cheap per user costs, and a product that users absolutely love.
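
To put those numbers in perspective, the ratios they imply can be worked out directly. This is just a quick sketch in Python using the figures quoted in the tweet; the variable names are mine.

```python
# Figures quoted in the tweet above; variable names are illustrative.
total_users = 12_500_000
monthly_active_users = 4_500_000
paying_users = 568_000

# Share of all registered users who were active in the last 30 days.
active_share = monthly_active_users / total_users

# Share of users who pay, measured against all users and against actives.
paid_share_of_total = paying_users / total_users
paid_share_of_active = paying_users / monthly_active_users

print(f"30-day active: {active_share:.1%}")
print(f"paying (of all users): {paid_share_of_total:.1%}")
print(f"paying (of active users): {paid_share_of_active:.1%}")
```

That works out to roughly 36% of users being active in a given month, about 4.5% of all users paying, and about 12.6% of active users paying.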

Evernote is Building a Global Brand
It was exciting to see how thoughtfully Evernote has been thinking about establishing their brand. Gabe Campodonico, Evernote’s Creative Director, walked us through how they established their brand identity and what it stands for. The images below give a sense of the different iterations Evernote went through to come up with their perfect logo:

Evernote Trunk Conference: Logo Designs 1

Evernote Trunk Conference: Logo Designs 2

But in addition, they are focused on building a truly global brand. Only 1/3 of Evernote’s audience is in the US and the rest is globally distributed, with Japan being the second largest country. Apparently in Japan there are dozens of books written specifically about Evernote, a testament to its adoption throughout the country.

Evernote is Committed to the Platform
Today’s event showcased Evernote’s commitment to its nascent platform. The excitement in the room from developers was outstanding. And Evernote promised a significant set of enhancements to their API to make it much easier for developers to develop across platforms, including new SDKs for Silverlight, Actionscript, Node.js, and more. They even hinted at new APIs coming out that would enable even deeper integration into Evernote via their new Galleries functionality.

I continue to be impressed with both Evernote’s strong vision and their continued excellent execution. Looking forward to what's next!

The Resurgence of LinkedIn

In the wake of LinkedIn’s announcement of reaching 100 million members, I’ve been impressed with the resurgence they have had in the past year. I thought I would showcase some of the recent product innovation from LinkedIn as well as cultural shifts we’ve seen from within the company that have contributed to this growth.

Product Innovation
For the longest time LinkedIn’s product pace has always been overshadowed by the more nimble Facebook, which has constantly been pushing the envelope both on the speed and shape of innovation. While I won’t try to argue that LinkedIn has caught up, we’ve certainly seen it pick up the pace in the past year as well as start to get comfortable in its own skin, understanding exactly where it provides unique value that Facebook, Twitter, and other social networks cannot.

The latest example of this is LinkedIn Today, a social news product for professionals that automatically builds a daily digest of the top news you’ll likely be interested in based on how it’s trending amongst your professional network. Despite a product like Flipboard already existing, you can see how LinkedIn Today provides unique value targeted at professionals and aspires to be more on par with a next generation Wall Street Journal.

LinkedIn also realized that much of its value is for sales, business development, and hiring managers seeking out specific contacts that can help them achieve their goals. To that end, we saw the redesign last year of search to support Faceted Search, a much smarter way of filtering your search results to find exactly who you may be looking for. This search mechanism differs significantly from the keyword-centric Google Search, name-centric Facebook search, and recommendations-centric Twitter results. Instead it’s precisely optimized for what LinkedIn users are doing - narrowing their results down to find candidates for the task at hand.
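
If you haven't run into faceted search before, the idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not how LinkedIn implements it; the records, field names, and function names are all made up.

```python
from collections import Counter

# Hypothetical people records; the field names are illustrative only.
PEOPLE = [
    {"name": "Ana", "company": "Acme", "location": "San Francisco", "industry": "Software"},
    {"name": "Ben", "company": "Acme", "location": "New York", "industry": "Software"},
    {"name": "Cy", "company": "Globex", "location": "San Francisco", "industry": "Finance"},
    {"name": "Di", "company": "Initech", "location": "San Francisco", "industry": "Software"},
]

def apply_facets(records, selected):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r[f] == v for f, v in selected.items())]

def facet_counts(records, facet):
    """Count how many of the remaining records fall under each facet value."""
    return Counter(r[facet] for r in records)

results = apply_facets(PEOPLE, {"location": "San Francisco", "industry": "Software"})
names = [r["name"] for r in results]
company_counts = facet_counts(results, "company")
print(names)
print(dict(company_counts))
```

Each facet you select narrows the result set, and the counts for the remaining facets tell you how much further each next click would narrow it. That is what makes it a good fit for hunting down a specific kind of candidate.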

Beyond that, LinkedIn invested heavily in bringing the LinkedIn experience to you, with revamped iPhone and Blackberry apps as well as a new Android app. These apps provide full profile information on the go for both people in your network as well as new folks you may have just met. They are also complete with Bump-style business-card replacement functionality, enabling you to add a LinkedIn contact on the go.

This is just a taste of some of the new products LinkedIn has developed or enhanced in the past year, with many other little feature improvements throughout.

InDays
Beyond this product innovation, we’ve also seen some important cultural shifts inside LinkedIn. One of the most exciting is an initiative spearheaded by Adam Nash called InDays. Every month employees are encouraged to spend one day working on projects outside of their core responsibilities. Alongside InDay, LinkedIn also throws a Hackday contest to showcase the best applications that come out of these internal projects. They started posting these internal projects publicly through a separate LinkedIn Labs site. There is some resemblance to Google’s 20% time, though the time dedicated to these efforts at LinkedIn remains fairly minimal.

The kinds of applications that have come out of InDays have typically shown the robustness, power, and richness of the unique dataset that LinkedIn has. For example, the analytics team at LinkedIn developed InMaps, a stunning way of visualizing your connection graph, complete with clustering of your contacts into similar groups. I found clusters of folks from my previous employer Microsoft, colleagues from my alma mater at The University of Pennsylvania, as well as a cluster of fellow silicon valley entrepreneurs. It’s amazing how they were able to deduce these purely on connection data.
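
I don't know what algorithm InMaps actually uses, but a much simpler stand-in shows how groups can fall out of connection data alone. The Python sketch below treats my contacts as a graph (leaving me out, since I'm connected to everyone) and reports each connected component as a cluster. The contact names and links are made up.

```python
from collections import defaultdict

def cluster_contacts(connections):
    """Group contacts into clusters using only who-knows-whom pairs.

    Each cluster is a connected component of the contact graph.
    """
    graph = defaultdict(set)
    for a, b in connections:
        graph[a].add(b)
        graph[b].add(a)

    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Iterative depth-first search to collect one component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Hypothetical contacts and links, for illustration only.
links = [("amy", "bob"), ("bob", "cal"), ("dee", "eli"), ("fay", "gus"), ("gus", "dee")]
clusters = cluster_contacts(links)
print(clusters)
```

Real networks are messier, since a single contact who bridges two groups would merge them under this approach, so a production system would presumably need something more robust such as community detection. Still, it shows why no profile data is needed to find the groups.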

Another great application was The Year in Review, which visualized all the contacts within your network that had changed jobs. LinkedIn ended up sending this out as an email marketing campaign to all of its users. It’s interesting to see how many of your colleagues have changed jobs as there are always surprises in there that you didn’t even realize.

I’m sure this culture of encouraging innovative ideas from its own employees has boosted employee morale as well as given users access to functionality that traditionally wouldn’t be on their roadmap. I’m excited to watch these efforts continue.

App Platform
While LinkedIn has been fairly slow to encourage third party developers to leverage their rich data set in their own apps, we are finally seeing this start to change. LinkedIn has been investing more resources in their API platform as of late, recently releasing OAuth 2.0 endpoints as well as a full-featured Javascript API.

Developers are also starting to leverage the API in more ways in their own applications. Cubeduel created a fun hot-or-not style contest for your co-workers, Bump leveraged it to enable sharing of LinkedIn data on meeting folks, and popular calendaring apps Timebridge and Tungle.me provide further details on meeting attendees. My own relationship management application Connected uses the LinkedIn API extensively to provide full work history details on all of your contacts.

This renewed energy behind the API will allow developers to take advantage of their rich dataset in functionality beyond what LinkedIn is likely to provide.

Acquisitions
LinkedIn has traditionally been absent from the acquisition scene unlike many of its similarly sized tech brethren. However, that all changed this past year when LinkedIn made not only its first acquisition, but a total of three.

The first acquisition was mSpoke in August 2010. mSpoke developed content recommendation technology which was presumably leveraged for a variety of recommendation scenarios throughout LinkedIn’s product. LinkedIn quickly followed this up with the acquisition of ChoiceVendor in September 2010. ChoiceVendor offered ratings and reviews for B2B service providers. And most recently, LinkedIn acquired CardMunch in January 2011. CardMunch makes it easy to convert a business card into a digital contact record simply by photographing a business card via your iPhone and having it automatically transcribed by humans for accuracy.

All these acquisitions have been predominantly talent acquisitions for small dollar amounts, but great ways to inject young talent into the fold.


All the recent changes within LinkedIn are clearly in preparation for their upcoming IPO, which they publicly announced their intention to pursue in January. I hope this more aggressive LinkedIn continues along these lines, as I truly believe they are sitting on a goldmine of data whose possibilities they have only begun to explore.

The imeem Mafia



Last night I read Sarah Lacy's excellent post entitled Inside the DNA of the Facebook Mafia. If you haven't read it yet, you should. It not only catalogues many of the excellent startups that have come out of Facebook, but also the emerging patterns amongst the bunch.

It got me thinking about my own experience at my previous startup, imeem. When I think back on imeem, I always felt that we had an incredible group of fascinating, talented, and ambitious people. While we never achieved our ultimate goals at imeem, I was sure many of these same folks would move on to something great afterwards. I told myself to keep a lookout as I was sure many would likely start their own ventures.

And sure enough, about a year since imeem's acquisition by MySpace, more than ten new exciting startups have been founded by the original imeem crew. I thought I'd take a moment to showcase some of them.

The most well known of the bunch is obviously Mixed Media Labs, the creator of the popular iPhone and Android photo sharing app, picplz. It's well known not only for its impressive traction thus far, but because it was started by imeem's founder Dalton Caldwell and his right-hand man Bryan Berg. In addition, they have brought together many from the original crew including Tim DeGraw, Allan Hsu, and even Ali Aydar as a director. They've raised funding from Andreessen Horowitz and are a strong contender in the now heated mobile photo sharing space.

Mobile has become a popular space for many of the imeem alumni. One of our top mobile developers at imeem, Ty Amell, teamed up with our search guy, Will Palmeri, to start Stackmob, an application platform to ease the development of mobile apps. They even convinced one of the back-end rockstars at imeem, Keith Dreibelbis, to join them. Similarly, imeem's CMO and Head of Biz Dev Steve Jang went on to start Schematic Labs, which is also focusing on the mobile space. He's roped in former imeem designer Alex Katzen into the mix as well.

Some took imeem's success in the entertainment space and propelled it into their own incarnation of an entertainment property. Our COO Ali Aydar went on to be CEO of Sporcle, a gaming site with endlessly entertaining quizzes and more. VP of Sales David Wade went on to start Popdust, a music news, reviews, and gossip site.

Still others have gone in completely different directions, following their passions wherever they lead them. For example, Sameer Alsakran, who managed imeem's entire big data infrastructure, including our large Hadoop cluster, is continuing his work in the Hadoop space with his latest venture White Label Labs. Raj Irukulla and Gina Olsen, two folks who were always passionate about great food, went on to start their own startups in the space. Raj founded FoodPair, which helps you find recipes to make with whatever ingredient you choose. Gina started Mothergood, which produces wholesome snacks for expectant mothers.

As many of you know, I got to imeem myself because my own startup, Anywhere.FM, was acquired by imeem. As I fully expected, the three co-founders of Anywhere.FM have now gone on to start their own new ventures as well. Anson Tsai is already having amazing success with Cardpool, the easiest way to buy and sell gift cards. Lux Chen is following his dream of getting into gaming with an upcoming iPhone/iPad game. I myself have started Connected, a personal relationship manager that brings all your contacts and conversations together in one place.

Though it's too early to tell which will ultimately be successful, I've continued to be impressed with what the imeem crew has gone on to do. Maybe one day we'll even see a post on TechCrunch about the imeem Mafia instead of the already popular Paypal or Facebook Mafias ;)

Why I Abandoned the Rackspace Cloud

Rackspace Cloud

As many of you know, I'm a huge proponent of on-demand computing as I believe it's the best starting point for most early stage web startups. Cloud computing allows a venture to replace high initial capital expenditures with operating expenses that grow in proportion to your traction. Equally important is its ability to flexibly scale up and down with the ebb and flow of your business. While it may make sense at a later stage to move to your own data center as you look to optimize costs, it rarely should be a priority in the tumultuous early days when you are still searching for product/market fit.

At my previous startup Anywhere.FM, we were an early adopter of Amazon Web Services in 2007. I've continued to be an early adopter of next generation cloud platforms as I'm always interested in understanding the bleeding edge innovations. Last year I initially saw a lot of promise in Google App Engine, but ultimately chose to abandon it due to its shortcomings. Just recently I tried the Rackspace Cloud, which is shaping up to be the fiercest competitor against AWS. I thought I'd share my experience with you.

What Attracted Me
Rackspace is not shy about looking to compete aggressively against AWS. They offer on their website a detailed head-to-head comparison with Amazon's EC2. While they suggest a variety of reasons why you might choose Rackspace over EC2, there was one specific advantage that caught my eye and ultimately led me to run a full-scale test on Rackspace. It was the fact that Rackspace offers cheaper low-end server instances that may be all you need for certain tasks. While Amazon's cheapest instance starts at $63.24/mo (1.7GB RAM) for continuous running, Rackspace's cheapest instance starts at $10.95/mo (256MB RAM). For certain tasks that are not CPU or memory-bound but instead network bound, the low-end Rackspace instances may be sufficient and cost effective. Given this fit the characteristics of my workload, I decided to shift some of my resources to Rackspace. I ultimately expanded to a farm of 8 low-end Rackspace instances for a full month.
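
To see what that price gap means for a small farm, here is a quick back-of-the-envelope calculation in Python. It assumes all eight instances sit at each provider's cheapest tier, which is a simplification on my part, and the function and variable names are just for illustration.

```python
# Monthly prices quoted above for each provider's cheapest instance (USD).
EC2_CHEAPEST_PER_MONTH = 63.24        # 1.7GB RAM
RACKSPACE_CHEAPEST_PER_MONTH = 10.95  # 256MB RAM

def monthly_cost(price_per_instance, instance_count):
    """Total monthly cost for a farm of identical instances."""
    return round(price_per_instance * instance_count, 2)

# Assumption: every instance in the farm is the cheapest size.
instances = 8
ec2_total = monthly_cost(EC2_CHEAPEST_PER_MONTH, instances)
rackspace_total = monthly_cost(RACKSPACE_CHEAPEST_PER_MONTH, instances)
savings = round(ec2_total - rackspace_total, 2)
print(ec2_total, rackspace_total, savings)
```

Under that assumption the farm comes to $87.60/mo on Rackspace versus $505.92/mo on EC2, a difference of $418.32/mo, though of course the EC2 instances come with far more RAM each.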

Why I Abandoned
Given that I've been on Amazon Web Services for years now, my expectations for a cloud platform are high. And unfortunately the Rackspace Cloud ended up falling short for my needs.

Unreliable Boxes
One drawback that completely surprised me was the lack of reliability of the servers. This shocked me given Rackspace's long history of being the underpinnings of some of the internet's largest sites even before they launched their cloud offering. But I regularly would receive e-mails like the following:

This message is to inform you that the host server your Cloud Server 'cloud-512-2' is on became unresponsive at 10:15 PM CST today. As a result we have scheduled a reboot to occur on the physical host server and will investigate the issue. If the problem arises again we will proceed with a hardware swap to maintain the integrity of your data.

I received 13 such required reboot notifications during the 33 days that I had instances live on Rackspace! Talk about a maintenance nightmare!

Limitations on OS Images
Rackspace also has some pretty restrictive limitations on backups and images created on instances. First, a backup of an instance is always tied to a particular instance. So say for example you set up and installed all the appropriate software like I did on one instance. Then you took a backup of that instance so that you could deploy additional servers based on the backup. Today you cannot delete the instance from which the backup was created because if you were to do so, you'll lose the backup! So as long as you want to create future instances based on that image, that instance has to live on. This is due to the strong coupling of servers and backups.

Related to this issue is the fact that you cannot share images with others. This means that any software you want on the box you have to install yourself instead of potentially taking advantage of someone else's work.

This is in contrast to Amazon's strong Amazon Machine Image ecosystem. I personally use the great CentOS images from RightScale. This is just not possible on Rackspace.

While Rackspace does plan on addressing both of these issues, they have a ways to go to catch up with Amazon.

No Elastic Storage
The amount of disk that you can have associated with a given Rackspace server instance is based on the server instance package. So if you require more disk space, you'll have to upgrade to a larger instance size and resize your server. While this does mirror most hosting providers, it falls short of the compelling offering of Elastic Block Storage that Amazon provides. Elastic Block Storage is completely resizable persistent storage that can be attached to any Amazon instance, with easy backup, redundancy, and re-attachment to/from different instances. It's great for storing the data files for a database server, for example, and allowing it to grow as necessary. I definitely missed this when I was on the Rackspace cloud.

Bandwidth Costs
Much of my workload involves making requests to third party APIs or scraping web pages. This results in significant network bandwidth both into the box and out of the box.

Currently Amazon is more price competitive on this front. AWS is currently offering completely free data transfer into EC2 instances for the first half of 2010. Compare that to Rackspace's $0.08/GB. In addition, EC2 charges $0.15/GB transferred out, versus Rackspace's higher $0.22/GB transferred out.
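
As a rough illustration of how these rates add up, the sketch below prices a month of transfer on each provider. The per-GB prices are the ones quoted above; the 500 GB in / 100 GB out workload is a made-up example, not my actual usage.

```python
# Per-GB transfer prices quoted in the post (USD, first half of 2010).
PRICES = {
    "EC2": {"in": 0.00, "out": 0.15},
    "Rackspace": {"in": 0.08, "out": 0.22},
}


def transfer_cost(provider, gb_in, gb_out):
    """Monthly bandwidth bill for the given inbound and outbound volume."""
    rates = PRICES[provider]
    return round(gb_in * rates["in"] + gb_out * rates["out"], 2)


# Hypothetical scraping workload: far more data comes in than goes out.
for provider in PRICES:
    print(provider, transfer_cost(provider, gb_in=500, gb_out=100))
```

For an inbound-heavy workload like scraping, most of the gap comes from the inbound charge rather than the outbound one.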

For these reasons I have ultimately decided to abandon the Rackspace Cloud. While none of them were deal breakers themselves, there wasn't enough compelling reason to move my infrastructure away from AWS.

What I'll Miss
Despite abandoning the Rackspace Cloud, there are definitely some aspects I will miss.

Fanatical Support
Rackspace doesn't lie when they say they have fanatical support; they really do. I signed up for Rackspace at midnight on the weekend and I still received a friendly call from a Rackspace agent to confirm my identity and ask me if I had any questions about the service. And this wasn't an offshore calling center either. It was a very knowledgeable American representative.

Similarly, they have always available Live Chat that I have taken advantage of on multiple occasions. Why waste time digging through forums to try to find an answer to your question when you can ask extremely helpful and expert chat support staff immediately? Compare this to AWS, which charges significant additional fees for that level of support.

Easy Backups
While the server backup system does have its limitations, it was refreshingly easy to set up. Just tell it in the control panel to back up daily and you are all set! Compare it to the details in this article for getting reliable backups on AWS.

Instance Names
Something so basic that was again very refreshing was the ability to provide easy to remember names for your server instances. AWS can be frustrating at times to look at the Control Panel and simply see instance IDs like i-462e3b1e, leaving you wondering what exactly is running on the instance.

While I have decided to abandon the Rackspace Cloud and stick with AWS, I'm sure I'll be re-evaluating Rackspace as it matures. Nonetheless I'm excited to see strong competition in the marketplace since it forces all cloud providers to continue to innovate. I expect to continue to see significant advancements in the space from all the major vendors as they evolve.


5 Social Platform Predictions for 2010

Leading Social Platforms

As many of you know, I'm a big proponent of open platforms and have spent much of my career designing, building, or leveraging open platforms and APIs. While we have seen explosive growth in social platforms over the past several years, I believe they are still very early in their history.

I wanted to put out my 5 social platform predictions for 2010, as I think we are poised to see another exciting year of innovation.

Facebook & Twitter become the social and real-time protocols of the web.
There is no doubt that Facebook and Twitter have become the de facto social and real-time platforms of the web. I'm not going to argue which service will win because I strongly believe there is room for both as they serve important and separate functions. But I think both will see the next evolution of their dominance. We've already witnessed the transition of these services from valuable end-user apps to compelling platforms serving thousands of scenarios through a strong platform ecosystem. The next leg of this evolution will be their transformation into ubiquitous protocols that underlie all major social and real-time scenarios across the web. We will start to think of Twitter as a micro-messaging protocol just as we think of existing communication protocols like SMS and IM. Facebook will become the universal address book protocol for connecting and sharing with everyone in our lives. The unique aspect of this transition, if successful, is that each of these new protocols will be owned by one dominant player. This is a scary reality that we may soon be faced with.

Facebook taxes the value creation happening on top of its platform.
Facebook has seen incredible growth in 2009, reaching over 350 million users worldwide. In addition to strong user growth, they have seen equally exciting platform growth. App growth, specifically in social gaming, has been coupled with strong monetization for large and small app makers alike. The platform ecosystem's aggregate revenue is rumored to have exceeded Facebook's own revenue. While Facebook has fully embraced this in order to further its priority of platform growth over platform monetization, I think we'll start to see a shift towards Facebook looking to capture more of the value being created on top of its platform. Facebook already makes a significant portion of its advertising dollars from app developers promoting their games. And Facebook has spent much of 2009 developing its own virtual currency platform that would provide an easy and efficient way for Facebook to tax its app developers. Look for this to launch in 2010.

LinkedIn's professional networking platform sees significant app innovation.
LinkedIn made its first foray into apps in Oct 2008 when it launched its on-site InApps platform. This platform allowed third party developers to build applications on LinkedIn. However, it was a completely closed platform that was never made generally available to developers and never grew past 13 available applications. Then a year later in Nov 2009 LinkedIn finally opened up its platform for third parties to leverage its data in apps on their own sites. The platform is still in the early days and has many limitations, but does pave the way for real innovation leveraging the valuable professional networking data repository LinkedIn has built. With this true opening I expect to finally see significant app innovation, despite it being plagued with delays until now. But better late than never, since LinkedIn still holds an incredibly valuable data set with tons of untapped scenarios.

API monetization becomes an important question for emerging platforms.
API monetization remains in its infancy. While there are APIs that are currently being monetized, what the customer is typically paying for is access to a restricted data set more than anything else. This holds for Twitter's monetization of its real-time data with Google and Microsoft, for the licensing deals LinkedIn struck with various partners to leverage its professional resume data, and for Compete with its click-stream aggregate site analytics data. We are starting to see the freemium business model apply to APIs as well, with Compete, Urban Mapping, and Whitepages offering both free and paid APIs. In 2010 we'll start to see the discussion of API business models come to the forefront and I expect to see progress across the various models.

OAuth WRAP gains popularity over the original protocol.
The OAuth protocol has been a very important specification that has resulted in true standardization of API authentication across social, media, and other APIs. This has made the task of integrating third party APIs simpler for the developer, thus allowing a developer to interact with more APIs than ever before. Yet one thing that continues to be an extremely confusing aspect of the OAuth protocol is the signature authentication process. While it is standardized in the spec, we've seen a variety of slightly different implementations that have made it difficult for developers to get started with a new API. Just take a look at the LinkedIn API forum, where much of the discussion is around this authentication process. Twitter has kept around non-OAuth based authentication to make it easy for developers to get started without having to get bogged down with the OAuth details. Yet OAuth WRAP solves this very pain point by leveraging SSL for the authentication and removing the need for the developer to manually implement the authentication step. We'll see OAuth WRAP or a variant gain traction to further ease developer pain in this space.
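
To give a sense of why the signature step trips people up, here is a minimal sketch of the HMAC-SHA1 signing used by the original OAuth 1.0 protocol, written with only the Python standard library. It is simplified on purpose: it assumes the URL is already normalized and carries no query string, and the function names, URL, and secrets are placeholders of my own rather than any provider's API.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode a value, leaving only unreserved characters literal."""
    return quote(str(value), safe="~")


def signature_base_string(method, url, params):
    """Build METHOD&url&params, with parameters encoded and then sorted."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign_hmac_sha1(method, url, params, consumer_secret, token_secret=""):
    """Return the base64 HMAC-SHA1 signature for a request."""
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode("ascii")
    base = signature_base_string(method, url, params).encode("ascii")
    digest = hmac.new(key, base, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder request: every value here is made up for illustration.
params = {"b": "2", "a": "1 x"}
print(signature_base_string("GET", "http://example.com/api", params))
print(sign_hmac_sha1("GET", "http://example.com/api", params, "secret"))
```

Every detail matters here: the double encoding of the parameter string, the sort order, and the trailing "&" in the key even when there is no token secret. A mismatch in any one of them yields a signature the server rejects, which is exactly the kind of friction OAuth WRAP avoids by relying on SSL instead.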


The PayPal Wars and its Lessons for Today's Entrepreneurs

PayPal Mafia

I was perusing Andrew Chen's bookshelf and came across The PayPal Wars by Eric M. Jackson. It turned out to be a riveting tale of the entire journey of PayPal, from its early conception to its monstrous success, retold by one of its earliest hires in marketing. It's a story I thought I knew, but there was so much more to it than the simple success story we all hear about.

I thought I'd take a moment to reflect on the five most important lessons I learned from their journey and my thoughts on their application to today's entrepreneurs.

1. Strong long-term vision coupled with pragmatic short-term attainable goals
One of the most insightful aspects of the story was getting a chance to understand Peter Thiel's management style from the inside. Though he is known as a brilliant financial strategist, it was a simple management tactic he used throughout that really struck me as effective. Peter Thiel is no doubt a man with a big vision. His vision for PayPal was no less than to become "the Microsoft of payments, the financial operating system of the world" and he often talked about solving corrupt government practices of inequitable wealth distribution through his simple online service. This world domination view became a strong rallying cry for all employees. Yet he was always careful to put in place pragmatic short-term attainable goals that kept the team focused and executing against an achievable milestone. This included carefully monitoring, tracking, and rewarding success on increases in percentage of eBay listings using PayPal versus their biggest competitor at the time, Billpoint.

2. It's all about assembling an A-team
Equally exciting about the story was being introduced to the entire management team, which was full of rock stars. Peter was big on hiring incredibly talented, brilliant, and ambitious folks. They didn't necessarily need to be experts at the role they were assuming, but smart and hungry all around. Many of the strategic decisions they made and their success in doing so can be directly attributed to the team. Today all of these top execs have moved on to new and equally ambitious projects, including Peter Thiel (Clarium Capital, Founders Fund, Palantir), Max Levchin (Slide), David Sacks (Geni, Yammer), Reid Hoffman (LinkedIn), and Elon Musk (Tesla). By focusing on bringing the right team together, entrepreneurs can ensure that they can handle even the most difficult challenges.

3. Push decision making down as far as possible
One of the key takeaways for me was that leadership is not about having all the best ideas, but instead about providing a culture for great ideas to win, regardless of where in the company they came from. PayPal promoted this culture throughout and even institutionalized it through the producer role. Producers at PayPal are what we would typically call product managers these days. They were essentially given full authority over and responsibility for a feature area or discipline and expected to execute and report on their success. Even the naming of the role "producer" showed how focused they were on results and holding people accountable to them. By pushing decision making down, they allowed the best ideas to win from the people who were living and breathing the space all day long, not just the senior management team.

4. Even the winners have rocky roads
While it's easy to look upon the success of PayPal as a sure fire win, it was clear from this story that throughout they faced many challenges and potentially disastrous situations.

In the early days fraud became a horrendous issue for PayPal. They were losing millions of dollars to the mafia and other organized fraud circles. The reputation, business, and economics were at stake if they could not solve this problem. Max however was able to build strong algorithms for detecting the fraud and eventually reduce the risk. In addition, they had a rather tumultuous merger with X.com which caused a complete shuffle in the star management team, Peter Thiel leaving as CEO, and then his eventual return. And their IPO, which occurred prior to their acquisition by eBay, was almost doomed by pending lawsuits and banking regulatory issues.

It's an important lesson in that all startups have rocky roads, even the successful ones. One needs to know what they are getting into when embarking on a startup, but be equally persistent in overcoming the obstacles along the way.

5. The risk of dependence on a platform
Yet the biggest issue that plagued PayPal throughout its journey is one that is near and dear to many entrepreneurs today: dependence on platforms. PayPal's rise was built on top of eBay as the preferred method of receiving payment for both buyers and sellers. Yet their success was completely dependent on eBay. eBay constantly changed policies which completely disrupted PayPal's service and caused them to endlessly be scrambling to maintain listing share. The situation got monumentally worse when eBay acquired PayPal's competitor Billpoint and made it the default payment method on their service. Many would have assumed PayPal was dead at that point. Yet PayPal was able to survive, through a variety of tactics.

The most important tactic that I think is a key lesson for today's entrepreneurs was building a consumer brand and strong customer affinity towards it. When eBay made changes which threatened PayPal's position, PayPal often appealed to their own users, who flooded the eBay forums with complaints. This strong customer affinity eventually forced eBay to have no choice but to buy PayPal if they were going to keep their customers happy.

While platforms like Facebook and Twitter have created huge opportunities for entrepreneurs, they have also left many of the largest startups leveraging them feeling at risk from the whims of the platform. My belief is that the startups with a strong independent brand and consumer affinity have the highest likelihood of survival.


I'd highly recommend The PayPal Wars to anyone looking to really understand what it's like to work inside a rocket ship startup with an A-team at its helm.

Google App Engine Task Queues, Push vs. Pull Paradigm, and Web Hooks

Despite my post last week on the Shortcomings of Google App Engine and my decision to move away from it as a viable platform for upcoming projects, I have been impressed with the overall architecture and design of their experimental Task Queue API.

Google throughout its years has been a leader in interface design, and that has been reflected not only in the UI of the products they have built, but also in the countless APIs they have published. Google has made available some of the easiest to use yet most powerful APIs. A clear focus on leveraging open standards where possible has helped them along the way. Google App Engine is probably the strongest testament to this, allowing developers to quickly build web applications that scale to millions of users on an easy to use Python or Java runtime environment. Their latest experimental design for the Task Queue API in Google App Engine is no exception.

Definition
Before I discuss its advantages, I should provide a definition of a task queue:

Task Queue is defined as a mechanism to synchronously distribute a sequence of tasks among parallel threads of execution. The global problem is broken down into tasks and the tasks are enqueued onto the queue. Parallel threads of execution pull tasks from the queue and perform computations on the tasks. The runtime system is responsible for managing thread accesses to the tasks in the queue as well as ensuring proper queue usage (i.e. dequeuing from an empty queue should not be allowed).

Source: Task Queue Implementation Pattern: Ekaterina Gonina (Author), Jike Chong (Shepherd), UC Berkeley ParLab

Task queues have all sorts of uses for offline processing, including periodically pulling data from third party sources, computing aggregate statistics, delivery of emails to users, etc.

Simple Interface
One of the most straightforward advantages of Google's Task Queue API is its very simple interface. While you can define a set of configuration options, they are all optional. Enqueuing a task for execution is as simple as the following:

#python

from google.appengine.api.labs import taskqueue

# Add the task to the default queue.
# `key` is assumed to be defined earlier by the calling code.
taskqueue.add(url='/worker', params={'key': key})

A default queue is provided, though you can easily define additional queues with their own execution options. After being enqueued, the task is run as soon as possible (according to the queue's scheduling options). Optional configuration options are specified in a queue.yaml file, including queue names, rates of processing, and bucket sizes.

Push vs. Pull
While a simple interface is nice, the push vs. pull model of the GAE Task Queue is what makes it really shine. To understand this advantage, let's compare it to another popular cloud based queue solution, Amazon Simple Queue Service (SQS). With SQS, you define a queue and it becomes a central repository for unprocessed tasks. Then you create a set of worker processes (on, say, Amazon EC2 servers) that regularly poll the queue to see if there are available tasks for processing. If a worker process finds an available task, the task becomes locked, allowing that worker to process it without other workers having access to it. Once the work is complete, it is removed from the SQS queue.

While this approach provides a lot of flexibility, it requires constantly running worker processes that are polling for available work. In addition, if there is a spike in tasks in the queue, you must also manage the scale up and eventual scale down of worker processes.

In contrast to this mechanism, GAE Task Queue provides a push model. Instead of having an arbitrary number of worker processes constantly polling for available tasks, GAE Task Queue instead pushes work to workers when tasks are available. This work is then processed by the existing auto-scaling GAE infrastructure, allowing you to not have to worry about scaling up and down workers. You simply define the maximum rates of processing and GAE takes care of farming out the tasks to workers appropriately.
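
The difference between the two models can be sketched with a toy in-memory simulation. This is not how SQS or GAE are implemented; the function names are hypothetical and the standard library queue simply stands in for the hosted service.

```python
import queue


def run_pull(tasks, handler, polls):
    """Pull model: a worker polls the queue, even when it is empty."""
    q = queue.Queue()
    for task in tasks:
        q.put(task)
    done, empty_polls = [], 0
    for _ in range(polls):
        try:
            task = q.get_nowait()
        except queue.Empty:
            empty_polls += 1  # wasted poll: the worker ran for nothing
            continue
        done.append(handler(task))
    return done, empty_polls


def run_push(tasks, handler):
    """Push model: the queue invokes the handler once per task, no polling."""
    return [handler(task) for task in tasks]


done, wasted = run_pull([1, 2, 3], lambda x: x * 2, polls=5)
print(done, wasted)
print(run_push([1, 2, 3], lambda x: x * 2))
```

Both models finish the same work, but the pull worker keeps spending cycles on empty polls, and someone has to decide how many such workers to keep running.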

Web Hooks
What is also compelling about GAE Task Queue is its use of web hooks as the description of a task unit. When you break it down, an individual task consists of the code to execute for that task as well as the individual data input for that specific task.

The web already provides a great mechanism for this through HTTP requests, their GET and POST input, and the resulting status response. Since in GAE you already define code to execute on an HTTP request, you can leverage the same mechanism for defining the execution code for tasks. As far as the data input, the GET querystring params or HTTP POST body provide suitable mechanisms for providing any kind of input. In this way, a task description is simply a URL that handles the request and a set of input parameters to that request.

This allows you to leverage everything you have already learned in building web request handlers in GAE for user-initiated requests. More importantly, it leverages the fact that GAE has already invested heavily in auto-scaling web request handling. It can simply re-use this infrastructure for task queues without having to invent a separate scaling architecture.
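
The idea of a task being nothing more than a URL plus parameters can be sketched in plain Python. The routing table, the PushQueue class, and the status codes below are my own stand-ins for illustration and are not part of the GAE API.

```python
from collections import deque

ROUTES = {}  # URL path -> handler, standing in for the app's request routing


def route(path):
    """Register a function as the handler for a URL path."""
    def register(fn):
        ROUTES[path] = fn
        return fn
    return register


@route("/worker")
def worker(params):
    # Return an HTTP-style status code; 200 means the task succeeded.
    return 200 if "key" in params else 400


class PushQueue:
    """Toy queue that delivers each task to the handler for its URL."""

    def __init__(self):
        self.pending = deque()

    def add(self, url, params):
        self.pending.append((url, params))

    def run(self):
        results = []
        while self.pending:
            url, params = self.pending.popleft()
            handler = ROUTES.get(url)
            results.append((url, handler(params) if handler else 404))
        return results


q = PushQueue()
q.add("/worker", {"key": "abc"})
q.add("/missing", {})
print(q.run())
```

Because the handler is an ordinary request handler, the same code path serves a task and a browser request, and the response status tells the queue whether the task succeeded.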

Shortcomings
While the overall design of GAE Task Queues is compelling, it suffers from the same shortcomings I mentioned in my previous post. Namely, a given task has a 30 second deadline. That means any individual task cannot perform more than 30s of computation, including getting data from the data store, calling third party APIs, computing aggregations, etc. In many cases, this is fine, since you can simply enqueue many small tasks and make tasks granular enough to always complete in 30s. However, this often does introduce needless complexity in task division and some tasks simply cannot be divided into less than 30 seconds of processing.
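
Making tasks granular usually comes down to batching the work before enqueuing it. Here is a minimal sketch of that idea; `add_task` is a hypothetical stand-in for whatever enqueue call you use (such as `taskqueue.add`), and the "/worker" URL and "ids" parameter are made up.

```python
def chunk(items, size):
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def enqueue_batches(items, size, add_task):
    """Enqueue one small task per batch and return how many were created."""
    batches = chunk(items, size)
    for batch in batches:
        add_task(url="/worker", params={"ids": ",".join(map(str, batch))})
    return len(batches)


# Record the enqueue calls instead of hitting a real queue.
calls = []
count = enqueue_batches(list(range(7)), 3, lambda **kw: calls.append(kw))
print(count, [c["params"]["ids"] for c in calls])
```

The catch is choosing a batch size small enough that the slowest batch still finishes inside the deadline, which is exactly the extra bookkeeping I would rather not have to do.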

Overall, I find the design of the GAE Task Queues compelling and think it's a great pattern for modeling queue infrastructure, whether it's on or off Google App Engine.

If you enjoyed this post, feel free to subscribe by RSS or Email or follow me on Twitter.


Shortcomings of Google App Engine

As many of you know, I have been a huge fan of Google App Engine. I love the vision and truly believe it's the first real platform-as-a-service as opposed to the other dominant cloud platform Amazon AWS. While AWS has significantly moved the industry forward with on-demand virtualized instances and cloud storage, it has not developed a fully scalable runtime environment comparable to Google App Engine. Sure, Google App Engine only supports a very restricted use case and set of technologies, but constraints can be liberating. If the scenario fits for your web app, the freedom to focus on your app and not on infrastructure and scaling is very compelling.

Thus far I've created a variety of small production apps on app engine, including this blog, TuneChimp, and MonkeySort. I am now in the process of embarking on a large project and have been planning on using Google App Engine for it. However, I have run into a variety of shortcomings in GAE that currently and for the foreseeable future seem insurmountable. It has led me to have to reconsider my platform choice for this project and at this point relying on Amazon AWS (or an alternative cloud platform) seems like the ideal option.

For those also considering building applications on top of Google App Engine, I wanted to discuss these shortcomings so that you can make an informed decision when making your own platform choice.

Urlfetch Requests Can't Take More Than 10 Seconds
Google App Engine in Python requires you to proxy all your third party HTTP requests through their urlfetch library. They have created a lightweight wrapper around it to allow Python developers to use their typical urllib and urllib2 interfaces. However, the urlfetch library still has a hard restriction that enforces a deadline on any outgoing HTTP request to a maximum of 10 seconds. While in many scenarios this restriction is fine, when building a mashup application that leverages third party APIs (whether it's Twitter, Facebook, YouTube, or others), there are many scenarios where you realistically run into this urlfetch deadline restriction. Some APIs allow you to break up long requests by paging, which allows you to get below the 10 second limit, but this often needlessly complicates your code. In addition, some third party APIs are simply poorly written, don't allow paging, or have high latencies that make it impossible to get meaningful results within the 10s limit.
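
The paging workaround looks roughly like the sketch below. The `fetch_page(offset, limit)` callable is a hypothetical stand-in for a single API request, and the fake API here just slices a list so the example runs without a network.

```python
def fetch_all(fetch_page, page_size, max_pages=100):
    """Collect results page by page so each individual request stays small."""
    results = []
    for page in range(max_pages):
        batch = fetch_page(offset=page * page_size, limit=page_size)
        results.extend(batch)
        if len(batch) < page_size:  # short page: nothing more to fetch
            break
    return results


# Fake third party API backed by a list, recording how often it is called.
DATA = list(range(25))
requests_made = []


def fake_fetch(offset, limit):
    requests_made.append(offset)
    return DATA[offset:offset + limit]


items = fetch_all(fake_fetch, page_size=10)
print(len(items), requests_made)
```

One slow request becomes several faster ones, at the cost of loop state, a stop condition, and more round trips, and it only works when the API supports paging in the first place.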

While Google has already increased this deadline once to the higher 10 second limit, they have provided no roadmap or expectation of increasing this deadline further.

Requests Can't Run for More Than 30 Seconds
In addition to the urlfetch restriction, any web request cannot take longer than 30 seconds. This is the entire time allotted to responding to a request. The scenarios for longer requests typically are not user web requests, but instead offline tasks that periodically call third party APIs to get the latest data and cache it locally, or any other kind of offline computation. While Google App Engine has provided the scheduled tasks and task queue APIs to create a nice facility for this, the fact that any individual task can still only take 30 seconds severely limits the possibilities. While in many cases you can smartly divide the tasks into increments of less than 30 seconds, this again requires significant management of task breakdown which may create needless complexity in your application. Or there may even be scenarios which simply can't be modeled as tasks that must return within 30 seconds.
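
One common way to manage that breakdown is to work against a time budget and hand the remainder to a follow-up task. The sketch below shows the idea in plain Python; the function name and the injectable `clock` parameter are my own, and re-enqueuing the follow-up task is left to the caller.

```python
import time


def process_with_budget(items, start, handle, budget_seconds,
                        clock=time.monotonic):
    """Handle items[start:] until the budget runs out.

    Returns the index to resume from, or None when everything is done.
    """
    deadline = clock() + budget_seconds
    i = start
    while i < len(items):
        if clock() >= deadline:
            return i  # caller enqueues a new task that resumes here
        handle(items[i])
        i += 1
    return None


seen = []
resume_at = process_with_budget(["a", "b", "c"], 0, seen.append, 25)
print(resume_at, seen)
```

In practice you would set the budget comfortably below 30 seconds and pass the returned index as a parameter of the next task. Even then, a single item that takes longer than the deadline cannot be rescued this way.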

Since Google App Engine is designed primarily to respond to user web requests and not designed to be an engine for significant offline processing, there is no roadmap for significantly increasing the amount of time an individual task can take.

Can't Open Sockets To Arbitrary Ports
Given that Google App Engine is a constrained runtime environment, it has an understandable limitation of preventing you from opening sockets on arbitrary ports. This restriction is necessary for security and scalability and Google can only be expected to enable these scenarios by providing their own wrapper libraries for each desired scenario. However, this leads to restrictions on important scenarios. For example, if your application wants to incorporate email and connect to an IMAP server, then you have no ability to do this on GAE.

While Google does plan to eventually add additional services to their capabilities, there is no plan for providing a general capability for opening sockets.

Can't Support HTTPS on Own Domain
Google App Engine allows you to launch applications on either their appspot.com domain at a custom subdomain or to host your application on your own domain through Google Apps. However, if you want to handle HTTPS requests, it has to be done on the appspot.com domain. You cannot support HTTPS on your custom domain. This is a significant restriction since it prevents you from providing a fully secured experience for a user on your own domain.

Given technical limitations, Google has not provided a roadmap for when this issue will be solved, leaving people who require this functionality to simply find another cloud platform.


While I love the promise of Google App Engine, each of these technical limitations in the current platform with no clear roadmap to enable these scenarios has led me to abandon Google App Engine for my next project. I plan to continue to monitor Google App Engine developments and see where it goes as I still believe it is a great platform for certain constrained scenarios. But I plan to spend my time now investigating Amazon AWS and Rackspace Cloud.



Clara Shih, The Facebook Era, and Business Opportunities on Facebook

Several months ago I had the opportunity to sit in on a guest lecture Clara Shih gave at the Stanford Seminar on People, Computers, and Design. Clara has spent the last several years at Salesforce leading their social networking product strategy, and she also developed Faceconnector, the first business app on Facebook that made it easy to integrate Facebook profile data into Salesforce CRM tools. With this insight, Clara recently authored The Facebook Era, a look at how social networks have changed people's behaviors, expectations, and relationships, and the resulting business opportunities this has created.

After attending the seminar, I decided to read the book and wanted to share some of the key trends discussed and the business opportunities that arise from them.

Opportunity #1: Transparent transitive trust opens up social advertising possibilities
Clara speaks at length about transitive trust: the notion that if I have mutual friends with you that I trust, I am by extension more likely to trust you. In addition, if a friend of mine endorses a product or service, I am more likely to respond positively to the brand. Facebook creates complete transparency in both of these scenarios, allowing you to quickly see who are your mutual friends with someone as well as to see a variety of brand endorsements through fan pages, groups, status messages, and a variety of notifications of engagement with various brand applications.

Facebook's enabling of passive word of mouth brand recommendations creates many opportunities around social campaigns that leverage the higher conversion rate associated with word of mouth referrals. Brands can capitalize on this new channel and users' willingness and desire to associate themselves with the brands they care about to supercharge their previously offline and unscalable user referral programs.

SocialMedia has been one of the early innovators on social ads, with the Word of Mouth Impression being their latest ad creation. This new ad product attempts to incorporate your friends' sentiment around the advertised brand to increase conversion rates of ad campaigns. Appirio has also built a Referral Management for Viral Marketing product to enable brand advocates to easily share their brand preferences with their friends on Facebook.

Opportunity #2: Explicit self expression makes hyper targeting a reality
Compared to the social networks that came before it, Facebook has encouraged authentic online identities and personalities. This has led to users willingly and explicitly expressing themselves on their profiles and streams. With this has come never before available deep data about users' demographics, behaviors, and interests. Of course this creates exciting new opportunities for advertisers to hyper target a set of users based on their interests.

The Facebook Ads platform has provided an opportunity to get at some of this hyper targeted data. Advertisers can now target their campaigns on specific demographics and interest keywords found on user profiles. However, this platform is still in its infancy, only scratching the surface on the targeting vectors that are possible. Hopefully with time Facebook will continue to innovate on their ad platform as well as open up an advertising API to allow third party developers to help with the innovation around targeting. To date third parties have been severely limited by Facebook in their ability to use the data available on user profiles for the purposes of targeting.

Opportunity #3: New forms of casual interactions enable maintaining and growing weak ties
Clara also emphasizes the importance of weak ties on social networks. These are the friends you have on social networks that aren't your closest real life friends, but instead the people you casually or occasionally keep up with. What's important about this class of individuals is that research has shown these are the connections that will be most important to you in terms of business relationships, since over the life of your career these are the people that you are likely to benefit from in one way or another. Facebook enables you to easily maintain and grow connections with these weak ties through casual interactions, including reading their status updates, posting wall posts, sending messages, and more.

Many applications have been developed on Facebook to further encourage casual interactions to grow these weak ties. MyCalendar and Birthday Cards enable you to easily remember your friends' birthdays and events and send them online greeting cards. Even many of the social games on Facebook have the effect of allowing you to casually interact with your friends and remind them of you.


Facebook has truly enabled new scenarios and behaviors for people across the world. With these new interactions come many new business opportunities to leverage the social graph to create value. Clara Shih has done a great job of researching and documenting this trend. So check out her book!

Lessons Learned from imeem

Before moving on to a new phase in my career, I always like to reflect on the previous experience and put together key takeaways that I can leverage in the next opportunity.

It's that time again as this past Wednesday was my last day at imeem. As some of you know, imeem acquired Anywhere.FM at the end of 2007. Since then I've helped to migrate Anywhere.FM, develop the imeem Media Platform, and contribute to a variety of monetization projects. But now I'm eager to move on to the next adventure :)

Since I have a blog this time around, I thought I would share my lessons learned from imeem with all of you.

Content Matters. Having interesting or exclusive content is a great source of traffic. imeem's decision to allow user uploaded content has definitely helped it obtain valuable SEO for hard to find tracks. People who really want access to a specific song will go wherever they need to in order to find it. imeem's ability to get exclusive content like the Britney Spears' Circus pre-release album definitely resulted in a nice bump in traffic as well.

Creating a Community. While many people would say developing a web 2.0 UGC site is easy because users contribute significant content, it's actually a lot of work to harvest the desired community. It requires staff to police user-contributed content, answer questions\moderate forums, contribute\manage editorial content, encourage appropriate behavior, and so on. In order to scale, it's important to automate as much as possible and allow users to help manage the community themselves.

Conversion Funnel Optimization. Don't underestimate drop-off rates resulting from adding an extra click in a flow. There are lots of flows that can be optimized simply by removing extraneous pages or reducing non-essential exit paths. It's worth re-looking at all your important conversion funnels to see if you can further optimize (sign up, contributing UGC, sharing, purchase events, premium account sign up, etc).

A/B Testing. Everyone knows that A/B testing is a good idea and they should do it. Yet still so few people do. And why is that? It's because A/B testing is hard and the tools often used to perform it are limited. However, if you take the time to either build or use an existing great A/B testing framework, the cost of A/B testing goes down significantly and becomes easier to do on a regular basis. Investing in A/B testing tools is hugely useful, especially for optimizing monetization for sites where small gains have huge effects due to volume. (There's likely a startup opportunity to provide better A/B testing tools that let you look at the full effect of variations over time).
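
As a rough illustration of how small the core of such a framework can be, here is a minimal sketch of deterministic variant assignment. It is not imeem's implementation; the function name, the experiment key format, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Map a user to a variant of one experiment (hypothetical helper).

    Hashing the experiment name together with the user id means the same
    user always sees the same variant, without storing any assignment.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Interpret the hash as an integer and pick a bucket from it.
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]
```

Because the assignment is a pure function of the user and the experiment, it can be recomputed later when analyzing logged events, which is one way to look at the effect of a variation over time.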

Widgets. While allowing widgets to be embedded on third party sites significantly extends your reach and can be a huge opportunity for building your brand, the amount of traffic that converts to destination site users is often minimal and the ability to monetize widget traffic is still dismal. When developing widgets, one needs to think very carefully about the actual benefits for the site and exactly how much functionality to expose in the widgets versus reserving for only the destination site.

Using an API Internally. Building an external API is a great discipline that improves even the quality of internal API methods. It forces you to think through good design, re-usability, and creating common usage metaphors. All things you should be thinking about for internal APIs. Speed to market for imeem's own apps has significantly increased with the creation of our external APIs. The audio and video flash players, the imeem Uploader, the VIP player, MySpace\Hi5 apps, and the mobile app were all built on top of these APIs.

Evangelizing a Platform. While getting large well-known companies to use your developer platform provides great case studies that will help you convince other developers to jump on board, you have to trade this off with the fact that large companies take a long time to decide whether to engage as well as a long time to build. For each large integration, you could probably get 3-5 small integrations up and running.

Focus on Monetization. Very few music startups have focused on monetization. There are still a lot of novel business models that should be tried in the music space. Instead startups have focused on building compelling products without much care to the business model. There is room for innovation in the music space if people are willing to tinker with music business models as well.

Competing with Free. It's very difficult to compete with free. Users have come to expect free music streaming from the days of Napster and BitTorrent. And now there are plenty of free music streaming services (either illegal or legally ad-supported) that continue to propel users' expectation to pay nothing for music consumption. It's very difficult to aggressively advertise or charge users without fear of users flocking to the competition, which just gives it all away. The one nice thing about the recession is that it has forced imeem's competitors to more aggressively monetize in order to stay afloat\get funding and therefore allows imeem to follow suit without fear of losing traffic.

Media and Entertainment Monetization. It's tough to monetize media and entertainment properties through advertising, lead gen, or affiliate revenue. This is because users are there for socializing and consuming content and have very little purchase intent. While they do share a vertical interest in music, associated music commerce opportunities are limited either because of the small margin the publisher gets working with partners for digital downloads and ringtones or because the providers for concert tickets and merchandise are still not aggregated well or lack established affiliate programs. (There's likely another startup opportunity in an aggregated music merchandise storefront and affiliate program).

Direct Sales. Having a direct sales team is expensive. Not only do you need sales reps, but you need sales planning support, post sales production support, and trafficking support. It may make more sense to outsource direct sales to rep firms in the early days of a startup's life.

International Monetization. It's very difficult to monetize traffic outside of the US and a few key markets (UK, Canada, etc). Ad spend in most other countries is still very low since their online ad markets are still nascent. Oftentimes it is probably a better use of time trying to improve US monetization or trying to attract additional US traffic as opposed to trying to optimize international monetization. (I smell a startup opportunity for anyone who can crack international monetization).

Users' Willingness to Pay. I was surprised that users are actually willing to pay for online services. Obviously conversion rates are very, very low. But it was surprising to me to learn that people were willing to pay at all, anywhere from $3/mo to $100/year for imeem's VIP subscription service. The features were really around convenience. Not even access to content. People will pay for quality products.

Online Audio Ads. Online audio ads are a promising area for innovation and monetization. There is still $21B being spent on offline audio ads and there is clearly an opportunity to move some of those dollars online. No one is aggressively innovating with the right ad unit. Most are simply re-purposing offline audio ads online.

Incentivized CPA Offers. Incentivized CPA offers can be used to monetize a variety of digital goods even outside of the social gaming space. However, the highest eCPMs seen thus far are still in their use in social gaming.

Music Licensing. It's very difficult to get on-demand streaming deals done with all four major labels. And this doesn't even include indie content. Even if you get the deals, you are looking at a large upfront payment to each label, giving up equity, plus a rev share or per-stream fee. The labels have not been looking to give the deals to everyone either, instead focusing on making some large bets.

Value of Data. Every site of any interesting size has a wealth of data. It's important to know exactly what data is tracked and available and to mine it wherever appropriate. All too often this valuable data goes unleveraged. On the other extreme, many believe that the ultimate business model lies in selling data. For those who believe this, though, I think monetizing the data itself is tougher than it looks. But there are many valuable insights that can be gained by mining it for product improvements and getting a better insight into your audience.

Don't Believe Everything You Read. It's interesting being on the inside of a large web property with many eyes watching it. Since I had the inside scoop, I knew that many times imeem was written about, the reporters simply got it wrong. Either because of misinformation, not really understanding the service, or some rumor that someone else started. Because of this, I've become much more critical of what I read online in the tech press and look much more closely at their stated sources of information.

Does Facebook Connect Deliver on its Promise?

The announcement of Facebook Connect in May 2008 brought the next major chapter in the Facebook platform story. After building the first and most successful social networking platform, Facebook decided to expand beyond its own destination to bring the power of the social graph to any third party site.

Facebook Connect - Button

Facebook Connect promised to deliver on the following five tenets: trusted authentication, real identity, friend linking, dynamic privacy, and social distribution. In the half a year since the announcement, how has Facebook Connect done?

Let's take a closer look at how Facebook Connect has fared on each of these tenets.

Trusted Authentication
Facebook Connect has by far the best authentication and single sign-on solution to date. It wins due to its simple and clear user experience. While Windows Live ID, OpenID, Google Friend Connect, and others have in the past provided single sign-on solutions, none saw significant traction. The most important innovation this time around is a rather simple one: Facebook Connect provides a javascript-based light box sign-in screen on the same page without redirecting the user to a third party site for authentication. In addition, the user simply logs into Facebook from the light box if they aren't already (but who isn't always logged into Facebook these days), and then simply authorizes the app with one click.

Facebook Connect - Dialog

It's very satisfying to go to a Connect-enabled site, hit the Connect button, select authorize, and immediately have a presence on a site. No longer is there the friction of deciding whether to go through the hassle of creating an account, setting up a password, and giving away additional personal info just to begin to experience a site's benefits.

Facebook has stated that many sites are already seeing significant success with trusted login:

Some sites that have chosen to include login have already told us that they have seen a two-time or more increase in registrations and 2/3 of users creating accounts via Facebook Connect.

Real Identity
In addition to the sign-on experience, Facebook Connect provides publishers with rich access to authentic user profile data, including profile picture, real name, birthday, location, relationship status, work history, and much more.

For many small publishers, users will be much more willing to supply this data to Facebook and are not likely to take the time to do so on the publisher's own site. Therefore this is a great opportunity for publishers to take advantage of access to this info as well as further simplify their site's registration process.

One important caveat though is you do not get access to a user's email address, which is often one of the most important profile fields during a registration process. Due to privacy and spam concerns, Facebook prevents access to this info. A publisher then has two alternatives. On one hand, a publisher can prompt for an email address outside of the Facebook Connect registration process. While this provides the greatest control, additional profile fields will reduce some of the friction-free benefits of Connect. The alternate approach is to leverage the email methods Facebook provides. For each Facebook Connect authenticated user, you are provided a proxied email address which you can use to email the user. However, this approach does have several constraints. The total number of emails you can send the user is governed by Facebook email limits, thus forcing you to adhere to Facebook messaging constraints. In addition, keep in mind you'll have to prompt the Facebook user with a Facebook permission javascript window to have the user opt-in to email notifications from your application prior to leveraging the proxied email address.

Facebook Connect - Email Permission

Friend Linking
One of the strongest promises of Facebook Connect is to allow a user to take their social graph with them across the web. As users come to your site and connect through Facebook Connect, they are automatically able to see which of their friends are already on the site and see their activities.

Once you have a decent community of Facebook Connect users, this works quite well. Since you have access to the Facebook uids of all of a user's friends, you can now show the activity of friends already using your service.

The problem though comes initially when you launch a Facebook Connect implementation. If you have a large existing user base that has not yet connected via Facebook Connect, then no friends will show up for a new Facebook Connect user, even if their friends exist on the site using the previous authentication mechanism. To help solve this problem, Facebook provides a publisher the ability to submit hashes of email addresses for all of their existing users to Facebook, so that the publisher can prompt a new Facebook Connect user with an invitation dialog to invite existing users of the site to connect via Facebook Connect so they can share friend connections. While this is a useful feature, it requires double opt-in from both the user and their friends. Many who receive such invites may choose to ignore them, making it difficult to jump start the social graph process with an existing user base. Facebook decided to require the double opt-in to ensure privacy for all Facebook users and avoid some of the issues of the previous Beacon product.
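
To make the hashing step concrete, here is a minimal sketch of preparing email hashes for an existing user base. The exact hash format Facebook expects is not covered here, so the use of SHA-256 and the helper names below are assumptions for illustration only; the point is that addresses are normalized and hashed before they leave your servers.

```python
import hashlib


def normalize_email(email):
    """Trim whitespace and lowercase so equivalent addresses hash the same."""
    return email.strip().lower()


def hash_email(email):
    """Return a hex digest of the normalized address.

    SHA-256 is an assumption for this sketch, not the format Facebook specifies.
    """
    return hashlib.sha256(normalize_email(email).encode("utf-8")).hexdigest()


def hash_user_emails(emails):
    """Hash every non-empty address, dropping duplicates."""
    return sorted({hash_email(e) for e in emails if e and e.strip()})
```

Only the hashes would be submitted, so the publisher never hands over raw addresses, and a match can only be made for an address that is already known on the other side.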

Facebook Connect - Invitation

Dynamic Privacy
One tenet Facebook strongly advocates as a real win for Facebook Connect is dynamic privacy. This is an important pillar for Facebook given its previous blunder with Beacon. Users are now in full control, allowing them to choose whether to connect to each site with Facebook Connect, whether to connect existing publisher user accounts with Connect, and the ability to share and un-share profile information with sites and friends.

From a publisher's standpoint though, this really just creates implementation limitations. Since all data from Facebook Connect is subject to the Facebook Platform TOS, which limits data caching to 24 hours, a publisher needs to adhere to this restriction and always pull data dynamically from Facebook.
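
One way to live with the 24 hour limit is a cache that refuses to serve anything older than that. The sketch below is only an illustration under assumptions: the class name, the `fetch` callable, and the in-memory dictionary are hypothetical stand-ins for whatever Facebook call and cache store a publisher actually uses.

```python
import time

# The 24 hour caching limit described above, in seconds.
MAX_AGE_SECONDS = 24 * 60 * 60


class ProfileCache:
    """In-memory cache that re-fetches any entry older than max_age."""

    def __init__(self, fetch, max_age=MAX_AGE_SECONDS, clock=time.time):
        self._fetch = fetch      # hypothetical callable: uid -> profile data
        self._max_age = max_age
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # uid -> (stored_at, data)

    def get(self, uid):
        now = self._clock()
        entry = self._entries.get(uid)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]
        # Missing or stale: pull fresh data and restart the clock for this uid.
        data = self._fetch(uid)
        self._entries[uid] = (now, data)
        return data
```

The trade-off is that every profile costs at least one remote call per day, which matters for pages that render many friends at once.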

Social Distribution
Facebook Connect also allows users to share their activities back to Facebook in their profile Wall and News Feed. This allows users to share their experiences with your site with their friends and hopefully drive more awareness of your site through Facebook feed channels.

Facebook makes it easy to pop-up a javascript light-box window to allow a user to approve the feed story. It even provides simple options for a full, short, or one-line story, so the user can decide how much they wish to emphasize this activity. While this is an opt-in message (again, correcting their mistakes from Beacon), a user can select to save their preference for this specific activity so future stories can be posted without further approval.

Facebook Connect - Feed Form

While this is definitely a valuable feature for users who are seeking to make all their activities viewable from Facebook, I wouldn't expect this to generate much traffic to your destination site. In the good days of the Facebook Platform, much of the platform growth was due to viral feed stories, notifications, and invitations. As users began to find these channels considerably spammy, Facebook significantly locked down all of them. These days you rarely see many third party app news stories in the News Feed due to such limitations. I expect that right now Facebook is allowing Facebook Connect stories to crop up to encourage publishers and users to use Facebook Connect, but I suspect over time these will also be constrained. In addition, Facebook has little desire to encourage users to leave Facebook and thus is at odds with ramping up distribution to third party sites.

Conclusion
For small publishers, Facebook Connect is a definitive win for its trusted authentication, real identity, and friend linking capabilities. It makes it easy to "socialize" a website that has classically been a straight content site or niche community by significantly reducing friction.

While Facebook Connect does provide value for large publishers as well, there are key issues that publishers must carefully think through before adopting Connect, including access to an email address, merging their existing social graph with Facebook Connect, dealing with Facebook TOS caching restrictions, and more.

To date we have seen some quick and obvious wins from Facebook Connect, including blog comments and socializing content sites. However I'm still waiting for a truly innovative Facebook Connect implementation that goes beyond basic authentication and friend linking to really take advantage of all that Facebook integration has to offer. Is your site ready to take on my challenge?

Resources
For those getting started with Facebook Connect, check out these resources:

The Rise of Media APIs

Last year at the SocialMedia Business School I gave a presentation on Media APIs and how they could be leveraged to enhance existing experiences or build entirely new services around freely available media content. Since I have received many follow up questions, I thought I would take the opportunity to expand on the topic.

Media APIs
2008 definitely saw a rise in media APIs across the major media content types with ProgrammableWeb cataloging 38 music, 42 video, and 37 photo APIs.

Photos
In the photo space, Flickr continued its dominance and saw strong growth in the use of its APIs, reaching a record of 704 API calls per second and ranking as ProgrammableWeb's second most popular API (after only Google Maps).

Videos
YouTube finally launched its own video APIs in a big way in March 2008, quickly becoming the default video API of choice. YouTube did a great job with their offering, providing full programmatic access to their video player as well as a chromeless player to allow third party sites to brand the experience and customize the player controls as they saw fit.

Music
2008 also saw a slew of entrants in the music API space, with imeem launching its APIs in March (disclaimer: I manage the imeem APIs), followed by Last.FM API 2.0 in June, Yahoo Music in August, and iLike and MTV in September. Each came with its own strengths, ranging from unlimited on-demand full length streaming, to limitless music-related metadata, to high quality music videos.

In all 3 categories, API providers have offered very complete solutions to recreate the entire site's user experience as well as leverage a lot of the site's underlying infrastructure. Each provides upload APIs to offload the cost of uploading, storing, and serving media content. They each also provide advanced search APIs to programmatically find exactly the content you are looking for as well as full access to associated metadata for every media item.

Media API Mashups
We are already starting to see categories of resulting mashups emerging, including startups seeking to innovate on the browsing, personalization, and social aspects of experiencing media.

Browsing Experience
We saw a variety of startups in 2008 evolving the user experience of browsing and consuming media content. For example, uvLayer created a beautiful and fluid webtop experience for experiencing YouTube and Flickr content.

Personalization Experience
Given the continued media fragmentation and move away from prime time TV towards online content spread across dozens of popular destinations, it has become much more difficult for a consumer to constantly find interesting content to consume. To fill this void, a variety of media aggregation plays have emerged to bring the best of these content sites together and provide recommendations based on your personal tastes. ffwd is one such service that has aggregated videos from many APIs and organized them in personalized channels based on your expressed tastes.

Social Experience
Others have focused exclusively on the social experience of sharing and discovering media with your friends. Slide has spent a lot of its effort on this, with the integration of premium video into FunSpace and premium music into Top Friends.

API Cost
All the major API providers have decided to provide their APIs for free to developers and hope to monetize through advertising within the media playback experience or indirectly monetize through driving traffic to their destination sites. This is good news for entrepreneurs looking to take advantage of these APIs.

However, I suspect with the flight towards revenue for many technology companies in 2009, entrepreneurs leveraging the media content should expect significant advertising in all syndicated media.

Licensing Limitations
It's not all good news though when it comes to media APIs. Some have very serious licensing limitations that can have significant implications on a startup that is built completely on top of them.

The limitations range from caps on number of API calls, to explicit language preventing building products that compete in any way with the API provider, to the ability to force you into a rev share agreement at a later date. So read the terms of use of each of the APIs carefully before embarking on an integration.

Monetization
The most serious limitations that some API providers have are those around commercial use. This limits your ability to monetize the service that leverages the third party content. Some prevent it outright whereas others have severe limitations on how and when you can monetize the content. Keep in mind as well that the content provider typically reserves the right to advertise in the media itself (instream audio or video pre-rolls\overlays typically). This also limits your own ability to command high CPMs from brand advertisers on media pages as you cannot ensure you can command 100% share of voice for the advertisers.


I hope this provides a more detailed overview for anyone thinking about leveraging media APIs in their next application. Embedded below are the slides from the original presentation.

Intro To Media APIs

Palm Gets it Right With Mojo Developer Platform

The most exciting news out of CES 2009 was the Palm announcement of the Palm Pre, webOS, and Palm Mojo Application Framework.

Palm, to my surprise, has reinvented itself and gotten back into the smartphone game with the release of the sexy Palm Pre device and new webOS operating system. Early indications suggest it should be in the same consideration set as Apple iPhone, Android G1, and Blackberry Storm. Yet the most innovative news of this announcement was the new Palm Mojo Application Framework.

Only several weeks ago I was having a conversation with one of my colleagues about the double-edged sword of open native mobile platforms. While the opening up and associated app stores have created a lot of opportunities for developers, they have also required developers to learn many disparate development platforms for each and every device. Sure, Android should make it simpler to port apps across supported devices, but I suspect it will go the way of OpenSocial in that it won't bring the promise of write once\run everywhere, but instead the philosophy of learn once and then simply minimize cost of porting to anywhere.

In the desktop world we have just gone through a revolution where we have moved many apps away from native Mac, Windows, and Linux applications to a web world where we build cross-browser and cross-OS experiences. And we are continuing to encourage the revolution with even more powerful browsers like Firefox and now Google Chrome.

I am eager to skip the pain of native proprietary platform mobile client apps and jump right to a world of mobile web browser based applications with cross-device javascript libraries to provide hooks into the native operating systems. Of course the typical criticism of browser based mobile apps today is that they can't take advantage of many of the benefits of the native device, including location based services, address book, local cache and offline data access, and native UI components and gestures. Yet these are all solvable problems by simply having each of the popular platforms exposing javascript APIs for each of these components. Joe Hewitt's early work on iUI shows just how powerful the existing iPhone Safari browser already is in allowing you to recreate full fidelity iPhone native app experiences within the browser.

Palm is taking the first step in realizing this vision by releasing the Mojo Application Framework to allow developers to build applications on the new webOS using the web technologies they already know: HTML5, CSS, and Javascript. This allows organizations to tap into their existing web assets and already vast experience in building scalable web applications. Developers can leverage the local storage capabilities of HTML5 to have offline access to data. They also have full access to gestures, transitions, and more and access to many of Palm's native device components. It even has full support for background running applications and user notifications, a common criticism of the iPhone platform.

While it's too early to tell whether the Palm Pre, webOS and Mojo will take off, it is definitely a step in the right direction for mobile developer platforms. I see way too many examples of native applications on the iPhone that could be much more cheaply developed, more stable and robust, more easily maintained, and available across many more devices by simply making enhanced mobile web apps. I hope to see iPhone and Android opening up even more capabilities in their browsers through javascript APIs and making apps developed with web technologies feel like full fidelity applications on the device, just as Palm is promising to do.

Top Underhyped Open Platforms

2008 was definitely a year of open platforms with the continued growth of the Facebook and OpenSocial communities, the unveiling of the iPhone and Android app stores, and the countless Twitter clients and mashups.

Yet I believe there is still considerable untapped opportunity in several promising platforms that have yet to see significant traction in terms of hype, developers, and ultimately end users.

So here is my list of the top underhyped platforms that I hope to see many entrepreneurs build on in 2009.

Webmail Platforms: Yahoo! Mail, Gmail
There are so many interesting problems to solve with email and we as entrepreneurs are finally going to be able to innovate on them with the opening up of the popular webmail platforms.

Just think of the pain that is email today:
  • Constant overload of email volume and very few ways to sort through the clutter
  • A sophisticated and natural social graph locked in email with no easy way to leverage it
  • Endless files shared through email that are problematic to find, store, and share
While companies like Xobni have developed very innovative solutions to these problems on predominant desktop mail clients like Outlook, we can finally bring these and new innovations to the webmail services we all now live by.

Yahoo! Mail, with over 250M users, has announced its application platform, launched a few white listed applications (Xoopit, Wordpress), and plans on opening up more broadly in 2009. Gmail, with 100M users, already has launched several first party gadgets for Gmail (Google Docs, Google Calendar) and has a developer sandbox available for any developer to place Google gadgets in their Gmail sidebar. I expect in 2009 we will see the complete opening up of these platforms and maybe even Windows Live Hotmail, with its 250M users as well.

Professional Networking Platforms: LinkedIn, Xing
I'm glad much of the craze of social networking platforms has died down as I don't think I could take many more vampire bites, food fights, or fluff friend races. These days it looks like the social networking apps that are still driving acquisition and engagement are those of social gaming and while I occasionally dabble with them, I fail to experience any lasting value from them.

However I'm hopeful on professional networking apps as I think they will likely delve deeper and look to provide more real value than their social networking counterparts.

I'm relieved the LinkedIn Platform did finally launch, but so far I've seen too little too late. I do find the Reading List and Blog Link apps useful to see what my colleagues are reading and writing, but there is so much more I hope to see. LinkedIn was smart in their early thinking of the platform in that they were looking to open up not only first degree contacts but also second and third degree to allow innovation on introductions, new contact-related applications, and more. I want to see applications actually start to leverage what is truly unique about LinkedIn. I think LinkedIn app innovation though will continue to be significantly hampered by their closed platform approach that requires an approval for even getting access to the sandbox. I hope LinkedIn decides to open up the sandbox to allow anyone to build innovative applications but then holds a tight review process to ensure it stays professional and relevant.

Xing, a popular German and European professional networking site, has announced its own OpenSocial platform for 2009, so this should provide interesting opportunities in the international space as well.

Cloud Platforms: Amazon Web Services, Google App Engine, Windows Azure
Many startups have already built consumer and enterprise apps on top of AWS and are starting to dabble with Google's and Microsoft's own cloud platforms. This has already been a disruptive shift in reducing initial CapEx for startups and helping to bring down both the operational cost and effort for basic infrastructure.

But where I think the significant opportunity is in 2009 is building infrastructure applications on top of these cloud platforms to provide higher level services to other startups looking to more easily leverage these cloud solutions. RightScale, for example, is one such infrastructure play that sits on top of AWS but makes it easy for you to manage and auto-scale your EC2 instances. Heroku is another exciting example of a very high level Ruby on Rails development platform that allows developers to simply focus on their app code while Heroku takes care of the rest in terms of spawning EC2 instances, managing load, and more.

Yet there are still lots of much needed services to be built to support cloud platforms that I expect we'll see much more of as more and more startups move to leveraging the cloud.


Got your own thoughts on an underhyped platform? Leave it in the comments!

Is Google App Engine Ready for Prime Time?

I recently took the time to build a web application on Google App Engine and wanted to share my thoughts on the experience and the pros and cons of Google App Engine as a web development platform.

TuneChimp

tunechimp screenshot

The app I developed is TuneChimp, a music mashup that was recently named ProgrammableWeb's Mashup of the Day and a finalist in Mashable's Y! BOSS Challenge. TuneChimp makes it easy for you to discover the very best music, videos, photos, and more for an artist by mashing up content from imeem, YouTube, Flickr, Yahoo, Last.FM, Google News, and more. TuneChimp takes advantage of the dozens of music-related APIs that are now available on the web to auto-generate an artist profile to quickly discover new artists or play music from your favorites. The most useful feature is that it takes the top tracks from an artist based on robust Last.FM audio-scrobbling data and creates a playable playlist using imeem's on-demand music streaming platform.

The core of the application calls 11 different APIs, appropriately caches the datasets, and cross-links the various datasets to put together a meaningful artist profile. The APIs are all accessible through REST endpoints in either JSON or XML formats.
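The fetch, cache, and merge flow described above can be sketched roughly as follows. This is a minimal illustration and not TuneChimp's actual code: the TTLCache class, the build_profile function, and the two fake fetchers are hypothetical stand-ins for real API clients, and the cache is a plain in-memory dict.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (hypothetical stand-in)."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get_or_fetch(self, key, fetch):
        """Return a fresh cached value, or call fetch() and cache the result."""
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch()
        self._store[key] = (now, value)
        return value


def build_profile(artist, sources, cache):
    """Merge the results of every source into one profile dict."""
    profile = {"artist": artist}
    for name, fetch in sources.items():
        key = (name, artist.lower())
        # Bind fetch as a default argument so each lambda keeps its own source.
        profile[name] = cache.get_or_fetch(key, lambda f=fetch: f(artist))
    return profile


# Hypothetical stand-ins for real API clients; they only count their calls.
calls = {"top_tracks": 0, "photos": 0}


def fake_top_tracks(artist):
    calls["top_tracks"] += 1
    return [artist + " - Track 1", artist + " - Track 2"]


def fake_photos(artist):
    calls["photos"] += 1
    return [artist.lower().replace(" ", "_") + "_1.jpg"]


SOURCES = {"top_tracks": fake_top_tracks, "photos": fake_photos}
```

Building the same artist profile twice with one shared cache should hit each source only once; the second call is served entirely from the cache until the entries expire.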

Pros
The single greatest advantage of Google App Engine is speed to market. TuneChimp was designed as a small weekend project to let me perform competitive analysis on the most popular music-related APIs (I manage the imeem Media Platform and wanted to see how we fared against other offerings). The beauty of GAE was that the same day I started coding I had a basic site up that pulled data from several APIs. Since GAE only supports a narrow web app scenario, it is extremely simple to set up, develop, and deploy an app. No need to install an OS, configure Apache, or optimize MySQL.

While some complain the datastore APIs are limiting because you can't perform classic relational join operations, anyone who has been involved in a large web app built on an open source stack knows that those operations don't scale anyway. GAE forces you to design for scale from the beginning, but it's an easy mental model to learn and super-fast to get up and running. The db.Model objects are similar to the models in other web frameworks, like RoR, so they are also very easy to pick up. And the immediate scalability benefit means you don't have to worry about how you are going to handle extra load.
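To make the "no joins" mental model concrete, here is a minimal sketch of the usual workaround: copy the fields you need onto the entity at write time so a read never has to look up a second entity. It deliberately does not use the GAE SDK; the Datastore class, save_track, and track_label are hypothetical names for illustration only.

```python
class Datastore:
    """In-memory stand-in for an entity store that has no join support."""

    def __init__(self):
        self._entities = {}

    def put(self, kind, key, entity):
        self._entities[(kind, key)] = dict(entity)

    def get(self, kind, key):
        return self._entities.get((kind, key))


def save_track(store, track_id, title, artist_id):
    """Write a track, copying the artist name onto it (denormalization)."""
    artist = store.get("Artist", artist_id)
    store.put("Track", track_id, {
        "title": title,
        "artist_id": artist_id,
        "artist_name": artist["name"],  # duplicated so reads need no join
    })


def track_label(store, track_id):
    """Read path: a single lookup, no second query for the artist."""
    track = store.get("Track", track_id)
    return track["artist_name"] + " - " + track["title"]
```

The trade-off is on the write side: if an artist is renamed, every track carrying a copy of the name has to be updated as well.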

Currently GAE is limited to Python, and that has put a lot of people off trying it. I decided to bite the bullet and learn Python, and I am so happy that I did. I see Python as a great compromise between PHP and Ruby on Rails: Python is more explicit, like PHP, but produces cleaner code and still has many of the productivity benefits of RoR when paired with the GAE framework (or the Django framework if you choose).

Cons
Unfortunately I encountered some serious bugs in GAE during my development. One bug prevented any web request in production from returning multiple cookies. Many APIs use cookies for authentication, so it was impossible to read from certain APIs without implementing hacks. I filed the bug, complained to my contacts at Google, and it still took months for this issue to be addressed. GAE is still clearly a work in progress, and the bleeding-edge developers who are willing to engage with it now will have to continuously invent hacks to get around these kinds of bugs for some time to come.

In addition, the restrictive high-CPU quotas and inflexibly short time-outs make it VERY difficult to reliably build on top of third party APIs with varying response times. I ended up having to build in retry logic and significant caching to try to work around these time-outs. At the same time, without the ability to run long-running processes and cron jobs, a developer is forced to continue to host a server outside of the GAE environment to perform batch processing and more.
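The retry-plus-caching workaround mentioned above might look something like the sketch below. It is an assumption about the general shape, not the code TuneChimp used: UpstreamTimeout and fetch_with_retry are hypothetical names, the cache is a plain dict, and the fetcher is any callable that raises UpstreamTimeout when the third party API is too slow.

```python
import time


class UpstreamTimeout(Exception):
    """Raised by a fetcher when the third party API is too slow."""


def fetch_with_retry(key, fetch, cache, attempts=3, backoff=0.1,
                     sleep=time.sleep):
    """Call fetch() up to `attempts` times, then fall back to the cache."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            value = fetch()
        except UpstreamTimeout as exc:
            last_error = exc
            # Wait a little longer before each retry; skip the wait
            # after the final attempt.
            if attempt < attempts - 1:
                sleep(backoff * (2 ** attempt))
            continue
        cache[key] = value  # remember the last good response
        return value
    if key in cache:
        return cache[key]  # serve the stale value rather than fail
    raise last_error
```

Serving a stale cached value when every attempt times out keeps the page rendering, at the cost of showing data that may be out of date.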

Probably the greatest drawback of Google App Engine, though, is the proprietary stack that it is built on and the resulting lock-in. This creates significant technology risk for a startup building on top of GAE, since it's going to be extremely costly to move to a different infrastructure if necessary. Hopefully some of the projects third parties are working on to port the GAE web framework and datastore will mitigate some of the issues associated with this lock-in.

Overall
While Google App Engine has become my web development platform of choice for all my weekend projects, I would not yet take the risk of running a production web business on top of GAE. The platform, though, is very promising, and I hope to see my concerns addressed over time, along with large web app success stories built on top of this cloud platform.