Saturday, 11 December 2010

Are Posterous Fudging Visitor Statistics?

I've recently started using Posterous for the blog of my new developer recruitment startup. While Posterous is a great product, I'm somewhat concerned that the readership counts it reports are significantly higher than what Google Analytics shows.

Earlier this week I wrote an article on What Software Developers Watch on TV. Here are the stats for it:

Posterous Stats:

Google Analytics:

As you can see, Posterous reports three times as many visitors as Google Analytics (3000 vs 1000). A quick search shows that many other people are seeing the same issue.

I also uncovered a response by Posterous co-founder Sachin Agarwal on the topic:

First of all, when people read your posts through the Posterous reader (like I do) that counts as a view for that post on Posterous. But that would not hit google analytics. Same for RSS feeds.

If people hit your site and have javascript disabled, that would still count as a view on Posterous, but would be ignored by Google analytics

If I go to the main page for your site that will count as a post view for *each* of the posts on that page. Google would count that as one view for that page, and no views for each post.

So the fundamental difference here is google analytics is counting when that particular page is loaded with javascript, while we count anytime that *post* is loaded, on any page anywhere.
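To make the two counting models concrete, here is a minimal sketch of how I read that explanation. It is not Posterous's or Google's actual code: the `count_views` function, the request tuples and the post slugs are all made up for illustration.

```python
from collections import Counter

def count_views(requests):
    """Tally views under the two models described in the quote above.

    Each request is a (page, posts_on_page, js_enabled) tuple. These
    fields are hypothetical; neither product exposes data in this shape.
    """
    post_views = Counter()  # Posterous-style: every post rendered, anywhere
    page_views = Counter()  # GA-style: one hit per page, only if JS runs
    for page, posts_on_page, js_enabled in requests:
        for post in posts_on_page:
            post_views[post] += 1
        if js_enabled:
            page_views[page] += 1
    return post_views, page_views

requests = [
    ("/tv-post", ["tv-post"], True),         # normal browser visit
    ("/tv-post", ["tv-post"], False),        # visitor with JavaScript disabled
    ("/", ["tv-post", "older-post"], True),  # homepage showing two posts
    ("rss", ["tv-post"], False),             # feed reader, no GA script runs
]
post_views, page_views = count_views(requests)
print(post_views["tv-post"], page_views["/tv-post"])
```

In this toy sample the post is credited with four views under the first model but its page with only one under the second, so the models can legitimately diverge. The question is by how much.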

However, I don't believe that this response explains the discrepancy. My Posterous is relatively new, so almost no one is reading it via an RSS reader or via the homepage (well, 27 users are, according to GA). JavaScript is disabled by only 1-3% of internet users. While these factors might explain a 5% discrepancy in the stats, they come nowhere near explaining a threefold difference.
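As a back-of-the-envelope check, the sketch below adds up the views those factors could account for. It uses the rounded figures from this post (3000 vs 1000), the 27 homepage visitors, and the worst-case 3% JavaScript-disabled rate. It assumes RSS readership is zero, and the `explained_views` helper is just an illustrative name.

```python
def explained_views(ga_visitors, homepage_visitors, js_disabled_rate):
    """Estimate views Posterous would count that Google Analytics would miss."""
    # If a fraction r of real visitors have JavaScript disabled, GA only
    # sees (1 - r) of them, so scale the GA figure back up.
    real_visitors = ga_visitors / (1 - js_disabled_rate)
    js_disabled = real_visitors - ga_visitors
    # Each homepage load counts as a post view on Posterous but not in GA.
    return js_disabled + homepage_visitors

ga, posterous = 1000, 3000  # rounded figures from the stats above
extra = explained_views(ga, homepage_visitors=27, js_disabled_rate=0.03)
gap = posterous - ga
share = extra / gap
print(f"explained: {extra:.0f} of {gap} extra views ({share:.1%})")
```

Under these assumptions the listed factors account for roughly 58 of the 2000 extra views, which is only a few percent of the gap.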

I've run a fair number of blogs and websites over the years and there are only two things I can think of that would cause such an inflation of numbers:
  1. If Posterous are counting each image download as a page view (this would explain a doubling of the numbers, but I don't believe it is what is happening)
  2. If Posterous are counting page views generated by bots and crawlers.
I suspect it's almost certainly the second option. I've certainly had websites where two-thirds of the traffic was caused by automated crawlers.
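For what it's worth, a crude way to separate crawler traffic from human traffic is to look for telltale substrings in the User-Agent header. The sketch below is only an illustration of that idea. I have no knowledge of how Posterous counts views, the marker list is far from complete, and the sample strings are hypothetical log entries.

```python
# Substrings that commonly appear in crawler User-Agent strings.
# This list is an assumption for illustration, not an exhaustive registry.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

def is_probable_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def split_views(user_agents):
    """Return (human_views, bot_views) for a list of User-Agent strings."""
    bots = sum(1 for ua in user_agents if is_probable_bot(ua))
    return len(user_agents) - bots, bots

# Hypothetical sample of three requests for the same post.
sample = [
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.10 Chrome/8.0 Safari/534.10",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
]
humans, bots = split_views(sample)
print(humans, bots)
```

A view counter that skipped this kind of filtering would credit all three requests to the post, even though only one of them came from a person.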

But if this is the case, then the visitor numbers that Posterous are displaying to their users are completely wrong.

Unfortunately, it's also to Posterous's advantage to show inflated numbers. The biggest threat to a blogging platform isn't its users defecting to another platform, but rather its users getting bored and giving up blogging. And lack of audience is the major factor in causing a blogger to get bored and give up. By posting inflated numbers (whether intentionally or otherwise), Posterous is likely getting users to keep blogging when they might otherwise have given up.

I may of course be wrong; but unless Posterous can show that those thousands of extra viewers are genuine users, I'm going to feel slightly morally uncomfortable relying upon their (otherwise awesome) product.

Update: From the discussion taking place on Hacker News, where one of the Posterous founders responds evasively to the issue, it seems that automated bots/crawlers are actually responsible for the majority of these "phantom visits".