On Privacy of Private RSS Feeds

I have been using Google Reader as my main RSS aggregator for several years now. Unlike some others, however, I continue to use a desktop-based RSS client for private feeds. This was an intuitive decision; I didn't spend much time thinking about it.

Earlier this week, I was in for a big surprise. I did a search on a public search engine, and the results included a link that I recognized as coming from a private feed. In other words, a public index included information that should only be available to registered and authorized users of a particular site.

So I started digging. First of all, there are three general ways to implement a private feed. This post summarizes them nicely: a unique secret token in the URI, a cookie, or HTTP authentication. The unique secret token seems to be the most popular these days, possibly because the other two methods make it harder to subscribe to such a feed in online readers.
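To make the difference between the three concrete, here is a minimal Python sketch of what a feed client would send in each case. The feed URL, the `token` query parameter, and the `session` cookie name are my own placeholders, not taken from any particular publisher, and the HTTP authentication variant assumes Basic authentication.

```python
import base64
from urllib.request import Request

# Hypothetical feed location, used only for illustration.
FEED = "https://example.com/feeds/private.xml"


def token_request(token):
    # The secret travels in the URI itself, so anyone (or any bot)
    # that learns the URI can fetch the feed.
    return Request(f"{FEED}?token={token}")


def cookie_request(session_id):
    # The URI is the same for everyone; the secret is in a cookie header.
    return Request(FEED, headers={"Cookie": f"session={session_id}"})


def basic_auth_request(user, password):
    # HTTP Basic authentication: credentials go in the Authorization header.
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(FEED, headers={"Authorization": f"Basic {credentials}"})
```

Only the first variant puts the secret into something that can be copied around as a plain link, which is exactly what makes it both convenient for online readers and easy to leak.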

With the unique secret token method, however, a feed publisher must somehow tell search bots that this content is not to be indexed. Otherwise, we rely on the URI never being discovered, which becomes problematic now that so many people are switching to online readers. I found an old story on TechCrunch that brought up the same issue and discussed efforts by Bloglines to set up a standard for this, but I could not confirm whether those efforts led anywhere. This leaves one well-known method: robots.txt.

Dear publishers of private feeds! Please make sure your robots.txt disallows access to the private feed URIs on your sites. In the last couple of days I checked two major publishers of private feeds that use the unique secret token method, and neither of them has a proper Disallow rule in its robots.txt. Ouch!
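For publishers, a rough sketch of what such a rule could look like, checked with Python's standard `urllib.robotparser`. The `/feeds/private/` path is an assumption made up for this example; the point is to disallow the common path prefix and never list an individual token, since robots.txt is itself publicly readable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that serves private feeds
# under /feeds/private/<secret-token>.xml
ROBOTS_TXT = """\
User-agent: *
Disallow: /feeds/private/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A private feed URL should be off limits, a public one should not.
print(is_crawlable(ROBOTS_TXT, "https://example.com/feeds/private/abc123.xml"))
print(is_crawlable(ROBOTS_TXT, "https://example.com/feeds/public.xml"))
```

Keep in mind that this only helps with bots that honor robots.txt; it is a request, not access control.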

Am I missing anything? If you think my theory is wrong and there is a better method, please let me know in the comments below.

Categories: blogging

Comments (2)

Ian // 04 Mar 2009

Dmitriy, thanks for this information. Has anything changed since this post?

Dmitriy // 04 Mar 2009

Ian, I haven't checked. I do know that I occasionally run across content from a private feed on one of the social networks that has somehow made its way into a search engine's public index. I contacted that social network once and my request was supposed to go to the right team, but I am not sure what happened from that point on.

You can test whether a given site is affected by looking at the URL at which its private RSS feed is offered and checking whether the site's robots.txt disallows that URL.
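If you would rather not read robots.txt by eye, the check can be scripted. This is only a sketch using Python's standard library; the function names are mine, and it assumes the site serves robots.txt at the usual location on the same host as the feed.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(feed_url):
    """Build the robots.txt URL for the host that serves feed_url."""
    parts = urlsplit(feed_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def feed_is_exposed(feed_url, user_agent="*"):
    """Return True if robots.txt does NOT stop user_agent from fetching the feed.

    This downloads the site's robots.txt, so it needs network access.
    """
    parser = RobotFileParser(robots_url_for(feed_url))
    parser.read()
    return parser.can_fetch(user_agent, feed_url)
```

Calling `feed_is_exposed()` with your own private feed URL tells you whether a well-behaved crawler would consider that URL fair game.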

This phenomenon is not very widespread at this point, however.