Complete WordPress Data from Automattic

Every Post, Comment, and Like You Need

Gnip offers blog content from WordPress, representing over 70 million blogs and growing at a rate of over 180,000 new posts per day. These premium data streams deliver unmatched realtime coverage of posts, comments, and "Likes" from the biggest content creation system in the world, which powers approximately 15% of the top million websites worldwide and over 22% of all new sites on the web.


Learn More About Complete WordPress Data from Automattic



Extracted from the Automattic Firehose, your WordPress PowerTrack stream will deliver full coverage of the blog post, comment, and "Like" data you want, based on the filtering criteria you provide.


Covering all WordPress data, the WordPress Firehose gives you access to a realtime stream of all public post, comment, and "Like" data from millions of blogs.


Enterprise-Grade Filters

Want to avoid the complexity of having to sort through the entire contents of a full firehose of data? With PowerTrack you can apply a wide range of operators to easily filter for the exact data you want.


Format Normalization

Want to use just one parser to consume all of your data? Activity Streams format from Gnip enables you to consume all of your data in a single standardized format across all of our data sources.
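To give a feel for what "one parser" can look like, here is a minimal sketch in Python. It assumes each activity arrives as a single JSON object and uses property names from the Activity Streams 1.0 JSON convention (id, verb, actor.displayName, published). Those field names, the Activity record, and the parse_activity helper are illustrative assumptions; the actual Gnip payload may name or nest things differently.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Activity:
    """Small common record; the field choice is illustrative only."""
    activity_id: Optional[str]
    verb: Optional[str]
    actor_name: Optional[str]
    published: Optional[str]


def parse_activity(line: str) -> Activity:
    """Parse one JSON-encoded activity into the common record.

    Missing fields become None instead of raising, so the same
    function can be reused across sources with slightly different payloads.
    """
    data = json.loads(line)
    actor = data.get("actor") or {}
    return Activity(
        activity_id=data.get("id"),
        verb=data.get("verb"),
        actor_name=actor.get("displayName"),
        published=data.get("published"),
    )


# Hypothetical sample payload, not real Gnip output.
sample = (
    '{"id": "example-1", "verb": "post", '
    '"actor": {"displayName": "Example Blogger"}, '
    '"published": "2013-01-01T00:00:00Z"}'
)
activity = parse_activity(sample)
print(activity.verb, activity.actor_name)
```

Because every source is delivered in the same shape, the same function could in principle handle a post, a comment, or a "Like" and only the verb would differ.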


URL Expansion

Gnip expands short URLs (such as links from common URL shorteners) and provides the expanded URLs as added metadata in your streams.


Language Detection

Gnip uses a language detection algorithm to apply language metadata when you access premium data in Activity Streams format. We currently support 24 languages.


Dedicated Support Team

We offer technical onboarding, technical documentation, code snippets and libraries on GitHub, and rapid support response during business hours with 24/7 emergency response to ensure your success.


Rules API

You can update, manage, and organize all of your rules through our dedicated Rules API. Rules update on the fly, and you never have to worry about disconnecting from your realtime stream to update and compile your rules.



Backfill automatically protects you from data loss caused by brief disconnects from your realtime connection. We track the activities we're sending you in realtime and if you reconnect within 5 minutes, your stream picks up right where it left off.
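On the client side, the 5-minute window can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name and the way disconnect times are recorded are hypothetical, and it treats a gap of exactly 5 minutes as outside the window, which is a conservative reading of "within 5 minutes".

```python
from datetime import datetime, timedelta, timezone

# The 5-minute figure comes from the description above.
BACKFILL_WINDOW = timedelta(minutes=5)


def backfill_covers_gap(disconnected_at: datetime,
                        reconnected_at: datetime,
                        window: timedelta = BACKFILL_WINDOW) -> bool:
    """Return True if the reconnect happened inside the backfill window.

    A gap equal to the window is treated as NOT covered (assumption).
    A negative gap (clock skew or bad input) is also treated as not covered.
    """
    gap = reconnected_at - disconnected_at
    return timedelta(0) <= gap < window


dropped = datetime(2013, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(backfill_covers_gap(dropped, dropped + timedelta(minutes=3)))
print(backfill_covers_gap(dropped, dropped + timedelta(minutes=7)))
```

A check like this can help a client decide whether to simply resume consuming the stream or to flag the outage for follow-up.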


Redundant Stream

A redundant stream can mitigate the effects of a disconnect by providing a second live connection to your stream. The redundant stream can help bridge periods of disconnection and prevent potentially missed data.
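If you consume both connections at once, the same activity will usually arrive twice, so a client typically needs to drop duplicates. The sketch below shows one way to do that in Python, assuming each activity carries a unique ID. The ActivityDeduplicator class and its max_ids setting are hypothetical and not part of any Gnip library.

```python
from collections import OrderedDict


class ActivityDeduplicator:
    """Remember recently seen activity IDs, up to a bounded count."""

    def __init__(self, max_ids: int = 10000):
        self._seen = OrderedDict()
        self._max_ids = max_ids

    def is_new(self, activity_id: str) -> bool:
        """Return True the first time an ID is seen, False for repeats."""
        if activity_id in self._seen:
            return False
        self._seen[activity_id] = None
        # Evict the oldest ID once the cache exceeds its limit,
        # so memory stays bounded on a long-running connection.
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)
        return True


dedup = ActivityDeduplicator()
primary = ["a1", "a2", "a3"]
redundant = ["a2", "a3", "a4"]  # overlaps with the primary connection
merged = [i for i in primary + redundant if dedup.is_new(i)]
print(merged)
```

Bounding the cache is a design choice: duplicates from two live connections tend to arrive close together, so only recent IDs need to be remembered.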


Customized Solutions, Predictable Pricing

Gnip offers customized solutions with predictable pricing to meet the needs of your business. There are three easy steps to getting started.

  1. We discuss what you are looking to do with WordPress data
  2. We get you into a trial so you can test our solution and determine data volumes
  3. We put together the best package for your needs


Payload Elements


  • Blog Name, ID, URL
  • Date Time Published
  • Post Title, Content, Summary, URL
  • Actor ID, URL, DisplayName, Hashed Email


  • Post Comment Count
  • Blog ID
  • Post Author ID, URL, DisplayName, Hashed Email
  • Post ID, URL, Summary
  • DateTime Published
  • Comment ID, URL, InReplyTo URL, DisplayName, Body
  • Actor ID, DisplayName, Hashed Email


  • Like ID
  • Blog ID, URL, DisplayName
  • Post ID, URL, DisplayName
  • DateTime Published
  • Actor ID, URL, DisplayName, Hashed Email



Industries Using WordPress

Social Media Monitoring

WordPress content runs deep, and with over 525,000 posts and comments per day there is plenty to monitor. WordPress also passes along "Like" data, which can be associated with specific keywords and monitored.

Business Intelligence

When correlated with other internal business data, WordPress data can expose trends and patterns you may not have otherwise found. WordPress is full of influencers whose content can change your business.


Many of the bloggers on WordPress are fashionistas, foodies, and enthusiasts of all kinds. Finding out what they have to say about your brand can inform your product development, supply chain management, marketing, and more.

Sources Similar to WordPress

As blogging platforms, Tumblr and WordPress both allow free-form text and have rich, deep content for analysis. With both text and images on both platforms, getting data from Tumblr makes sense if you are getting data from WordPress.

Facebook data shares many characteristics with WordPress data, but it is more realtime and offers far more content to analyze. Facebook streams are a great addition to your WordPress stream.

If the blogging aspect of WordPress interests you, then consider the microblogging platform Twitter as an additional source. While Tweets might be shorter than WordPress posts, the volume is so large you can find whatever you're looking for.

Gnip for

Shoppers Tweet about what they bought, but they turn to blogs to share why they purchased.

Gnip for Blogs Package

Gnip for Blogs is a package offering that combines content from four of the most popular long-form social data sources. This package of data from Tumblr, WordPress, IntenseDebate and Disqus gives you seamless access to discussions happening in the blogosphere. Blog content, now better together.

Success Stories

WordPress is an important source to add to the social media data mix, one that we know our customers will take full advantage of.


Gnip has been a fantastic partner as our product and business have scaled. They have world-class technology and people capable of delivering an enterprise service that meets our extremely high quality requirements.

Simply Measured