Understanding the Players in the Social Data Layer


Social is clearly one of the biggest trends on the web right now, with the majority of new apps and services taking advantage of your friends to provide a more participatory experience. This extends across desktop and mobile applications as well as across most verticals, including media, e-commerce, travel, and more.

But what’s most exciting to me is what is happening a layer below these applications: the rise of the social data layer. The social data layer provides a set of compelling APIs that any application can take advantage of to quickly immerse its experience in social. Just as cloud computing significantly reduced the cost of building web applications, these social data platforms are significantly reducing the friction in creating compelling social experiences.

While Facebook is clearly leading the efforts in providing the social data layer, there is a growing set of startups and other providers of social data that new applications can take advantage of. I thought I’d take a moment to describe the current landscape from my perspective.

Social Network APIs
Without a doubt, at the core of the social data layer are the social networks that enable access to both their rich social profile data as well as robust social graph APIs.

Facebook, now with over 750 million active members, is not only the largest of the social network providers, but also kicked off the social data revolution by opening up their APIs in 2007. Any app developer building a social application should strongly consider making Facebook their base, with the largest & truest social graph across all the networks.

Twitter, with 300 million registered accounts, provides a unique set of opportunities, as their one-way follow mechanic has led to Twitter’s social graph being described as the interest graph as opposed to a pure-play social graph. Since you can follow people you may not know but whose area of expertise interests you, or whom you simply want to keep an eye on, it creates a unique set of graph nodes that are compelling for a variety of applications. And of all the social data providers, Twitter has been the closest in keeping up with Facebook in terms of API robustness and has even gone on to create an entire layer of streaming APIs that are unique to Twitter and their data set.
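To make the interest graph idea concrete, here is a minimal sketch of a one-way follow graph. It is only an illustration of the concept, not Twitter's API or data model; the class name, method names, and sample accounts are all made up for this example.

```python
from collections import defaultdict


class FollowGraph:
    """Directed graph: an edge a -> b means 'a follows b' (hypothetical model)."""

    def __init__(self):
        self.following = defaultdict(set)  # user -> accounts they follow
        self.followers = defaultdict(set)  # user -> accounts following them

    def follow(self, a, b):
        self.following[a].add(b)
        self.followers[b].add(a)

    def mutual(self, user):
        """Accounts the user follows that follow back (friend-like ties)."""
        return self.following[user] & self.followers[user]

    def interests(self, user):
        """Accounts the user follows without reciprocation (interest-like ties)."""
        return self.following[user] - self.followers[user]


g = FollowGraph()
g.follow("alice", "bob")
g.follow("bob", "alice")
g.follow("alice", "nasa")

print(sorted(g.mutual("alice")))     # ['bob']
print(sorted(g.interests("alice")))  # ['nasa']
```

In a two-way friendship model every edge would be mutual, so the second set would always be empty. The one-way follow is what makes the interest edges visible.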

At 120 million members, LinkedIn is by far the largest professional graph with the richest searchable resume data for each of its members. It’s a clear choice for any professional application. LinkedIn has had renewed focus of late on their API offering and has expanded beyond basic profile APIs to also allow you to query their company data, group data, and jobs database.

Email Providers
Another rich source of implicit social data that I believe is still significantly under-utilized is the email inbox. Locked inside one’s inbox is almost a truer representation of one’s social graph compared to that which is mapped on explicit social networks. And we are only now starting to see applications leverage this data in interesting ways.

Gmail is leading the pack in opening up their platform to third party developers. For one, they launched an OAuth extension to their IMAP APIs, which now allows you to have delegated access to a user’s inbox without the user having to share their credentials. Given how sensitive the inbox is, this one addition goes a long way to ensure user trust. In addition, they launched the Gmail Contextual Gadget extension point that allows apps to be embedded right within Gmail. Unfortunately this extension is currently limited to Google Apps, but will hopefully be ported to consumer Gmail as well.
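As a rough sketch of what delegated IMAP access looks like in practice, the snippet below builds the SASL string for the XOAUTH2 mechanism. This assumes the later OAuth 2.0 based mechanism rather than whichever version was current when this was written, and it assumes you have already obtained an access token through a normal OAuth flow. The email address and token are placeholders.

```python
def build_xoauth2_string(user_email: str, access_token: str) -> str:
    """Return the raw (not yet base64-encoded) XOAUTH2 SASL string.

    Fields are separated by the control character \\x01 and the string
    ends with two of them.
    """
    return f"user={user_email}\x01auth=Bearer {access_token}\x01\x01"


# Placeholder values for illustration only.
raw = build_xoauth2_string("user@example.com", "ACCESS_TOKEN")
print(repr(raw))

# With a real token, the string could be handed to Python's imaplib,
# which base64-encodes the response itself:
#
#   import imaplib
#   imap = imaplib.IMAP4_SSL("imap.gmail.com")
#   imap.authenticate("XOAUTH2", lambda _challenge: raw.encode())
```

The important point is that the user's password never appears anywhere: the application only ever holds a token that the user can revoke.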

Yahoo Mail
Yahoo Mail also provides a set of APIs to query their inbox directly. It’s nicer than Gmail’s interface since you can bypass IMAP altogether and use their web standard APIs. Yahoo has also invested in mail as a platform by enabling applications to be installed right within the Yahoo Mail interface. Unlike Gmail, these apps are targeted at both consumers and professionals leveraging Yahoo Mail.

Windows Live Hotmail and AOL Mail are other notable mentions here due to their large user bases. However, neither has devoted serious resources to opening up a platform to access their inbox data, though straight POP and IMAP access is available.

Inbox APIs
While you could develop an application that directly speaks to the various email APIs out there, there is a set of inbox API startups looking to simplify the entire effort of accessing inboxes on your users’ behalf.

Context.IO provides a robust inbox API that will automatically index the inbox of your end-user and provides an easy-to-use REST API for accessing those messages. If you have ever dealt with IMAP, you’ll appreciate all the work that Context.IO does for you so you don’t have to deal with the complexities associated with it. They currently enable indexing an IMAP account with speed and scale.

Jexy is still in its infancy, but one to watch for normalized API access to email, calendar, notes, and more. Their goal is to provide a single interface across any inbox, whether it’s IMAP, POP, Exchange, etc. Looking forward to trying their beta when it becomes available.

Social Data Providers
The social data providers supply connection data to map an email address to social profiles across the web. This becomes very useful when trying to acquire social data from email addresses or trying to fill out a full social profile of a given user.

Rapleaf was the first compelling offering in this space with one of the largest databases of social data. Unfortunately they have discontinued their social profile lookup API that returned various social profiles for a given e-mail address due to negative press around how they acquired their data. They do still offer a useful personalization API that will give you data on the user behind an e-mail address, including age, gender, household income, and more.

Qwerly allows you to search either by an e-mail address to find associated social profiles or by a single social profile to find other associated social profiles. The data has been leveraged by many email marketing providers, CRM tools, and more to help provide more data about your customers. They have an interesting approach to acquiring their data through a sophisticated social profile crawler.

FullContact is a more recent entrant to the social data space, with a compelling contact API that allows you to send it a partial contact record (name, email, phone number, etc) and have them fill it out with more complete data, including social profiles. Again a very useful data source for apps looking to build a more complete profile of users and connections.

Fliptop also enables looking up an e-mail address and returning both social profiles for that person as well as demographic data like name, gender, location, and more. Worth checking them out as well.

Google Social Graph API
Google also provides an answer to this problem via their public crawler. The Social Graph API enables accessing social profile data via their search engine and allows you to search across a variety of attributes. It’s definitely worth looking at for your needs as I hear their data set has gotten much better over the years.

Social Influence APIs
With the rise of the power of the collective community across social networks, key influencers become more and more important. And now there is a set of APIs for you to understand just how influential a person is across their areas of expertise. This data can be used for scenarios ranging from CRM, to customer support, to social marketing campaigns.

Klout is the most well known social influence provider available. While they got their initial start analyzing Twitter data, they have since expanded to analyze 10 different social media properties, including Facebook, LinkedIn, YouTube, Blogger, and more. For any user, Klout provides an overall influence score as well as details on a given person's areas of expertise.

PeerIndex is another provider of social influence data. They similarly provide detailed scores on each user to help you better understand their topic expertise as well as overall audience reach.

Personal Data Stores
These projects attempt to bring all your personal data together and then make it available to services via a unified API. They provide value both to the end user through the aggregation and to developers via their API.

Singly is the company behind the Locker Project, an open-source effort to create a personal data store of all your personal data from across the web. While a useful end-user service in itself, they also plan on offering a rich set of APIs for developers to take advantage of to get access to this personal data store for their applications. While still in the early phases, certainly a worthwhile effort and one to watch.

Greplin is the ultimate search tool across your personal data. They index all your social accounts, inbox, and more and provide a simple Google-style interface to search across the data. They currently have an API in closed beta, but will hopefully open it up shortly to allow other developers to take advantage of their rich index.

API Aggregators
When you are looking to integrate with a variety of APIs, it’s often useful to consider leveraging an API aggregator that normalizes social data for you into a single API interface.

Gnip is the most well known API aggregator, providing comprehensive access across a variety of social data providers. They are also the official Twitter partner for getting access to the Twitter firehose of data. So if you have extreme Twitter needs, these are also your guys.

Apigee is useful in the development stage, as they provide developer tools for exploring APIs, making it much easier to get started with a variety of APIs. I expect over time these guys may even help normalize these different APIs for developers, though they are currently focused on working mainly with API publishers.
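The normalization idea behind API aggregators can be sketched in a few lines. This is a toy illustration only, not how Gnip or Apigee work: the provider names, raw field names, and the `Profile` shape are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """A single, provider-independent shape for a social profile (hypothetical)."""
    source: str
    user_id: str
    display_name: str


# One small adapter per provider; the raw field names are made up.
def from_network_a(raw: dict) -> Profile:
    return Profile("network_a", str(raw["id"]), raw["name"])


def from_network_b(raw: dict) -> Profile:
    return Profile("network_b", raw["id_str"], raw["screen_name"])


ADAPTERS = {"network_a": from_network_a, "network_b": from_network_b}


def normalize(source: str, raw: dict) -> Profile:
    """Route a raw payload to the adapter for its provider."""
    return ADAPTERS[source](raw)


profiles = [
    normalize("network_a", {"id": 42, "name": "Ada Lovelace"}),
    normalize("network_b", {"id_str": "7", "screen_name": "ada"}),
]
for p in profiles:
    print(p.source, p.user_id, p.display_name)
```

The application code only ever sees `Profile`, so adding another provider means writing one more adapter rather than touching every consumer.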

There have been several attempts as well to develop standards for sharing social profile data that will hopefully continue to get traction amongst publishers.

PortableContacts was designed as a standard for publishers to share their contact data uniformly across a variety of publishers. Plaxo was one of the first publishers to support it, as Joseph Smarr was a strong advocate for it while he was there. Google also has an implementation of PortableContacts. However, we haven’t yet seen many of the other providers of contact data take up PortableContacts, so its usefulness is currently limited.
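To give a feel for why a uniform contact format is attractive, here is a small sketch that reads a PortableContacts-style JSON response. The sample payload is invented for illustration, and the field names (`entry`, `displayName`, `emails`, `value`, `primary`) follow my reading of the spec, so check them against a real provider before relying on them.

```python
import json

# Invented sample response in the PortableContacts JSON style.
SAMPLE = """
{
  "startIndex": 0,
  "itemsPerPage": 2,
  "totalResults": 2,
  "entry": [
    {"id": "1", "displayName": "Ada Lovelace",
     "emails": [{"value": "ada@old.example.com", "type": "home"},
                {"value": "ada@example.com", "type": "work", "primary": "true"}]},
    {"id": "2", "displayName": "Charles Babbage", "emails": []}
  ]
}
"""


def primary_email(entry: dict):
    """Prefer the email flagged as primary, else the first one, else None."""
    emails = entry.get("emails", [])
    for email in emails:
        if str(email.get("primary", "")).lower() == "true":
            return email.get("value")
    return emails[0].get("value") if emails else None


def contacts(doc: dict):
    """Flatten a response into (display name, email) pairs."""
    return [(e.get("displayName"), primary_email(e)) for e in doc.get("entry", [])]


result = contacts(json.loads(SAMPLE))
print(result)
# [('Ada Lovelace', 'ada@example.com'), ('Charles Babbage', None)]
```

If every contact provider returned this shape, the same two functions would work against all of them unchanged.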

Webfinger attempts to bring back the old finger protocol that allowed you to get identity information. The new Webfinger API enables modern access to identity information. Google again currently implements Webfinger, but it continues to have low adoption amongst other data providers.
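For a sense of how a Webfinger lookup is addressed, the sketch below builds the discovery URL for an account. It assumes the form later standardized in RFC 7033 (a `/.well-known/webfinger` endpoint with a `resource` query parameter), which may differ from the draft in use when this was written. The account is a placeholder and no network request is made.

```python
from urllib.parse import urlencode


def webfinger_url(account: str) -> str:
    """Build the lookup URL for an account of the form user@host."""
    user, _, host = account.rpartition("@")
    if not user or not host:
        raise ValueError(f"expected user@host, got {account!r}")
    query = urlencode({"resource": f"acct:{account}"})
    return f"https://{host}/.well-known/webfinger?{query}"


print(webfinger_url("bob@example.com"))
# https://example.com/.well-known/webfinger?resource=acct%3Abob%40example.com
```

Much like the old finger command, the identifier itself tells you which host to ask, so no central directory is needed.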

If you know of other startups or technologies that help to access the social data layer, please leave a comment!