Deactivated the WordPress RSS Cloud plugin

After some research into the architecture of RSS Cloud and how the WordPress plugin implements it, I decided to deactivate the WordPress RSS Cloud plugin.

In "2002 != 2009", a recent post on Scripting News, Dave Winer alluded to third-party concerns about how well RSS Cloud scales. A quick Google search turned up the following posts raising those concerns:

Workbench: There's a Reason RSSCloud Failed to Catch On

techbrew: Is rssCloud All Wet?

BlogAnon: Why the RSS Cloud probably won't scale to Twitter like levels…

The blog posts exaggerate the problem, but if I understand things correctly, the problem is serious with regard to the WordPress plugin.

Every time you publish a new post to a WordPress site with N RSS Cloud subscribers, the RSS Cloud plugin sends N HTTP messages. Ideally, a site should only need to send one message, to a cloud server which manages the subscribers itself. Unfortunately, the WordPress plugin uses the WordPress site itself as the cloud server, so the site bears the entire cost of providing real-time updates to your subscribers. With no cloud server involved, it's a purely client/server, unicast messaging situation with linear growth. That means very popular WordPress sites can easily DOS themselves every time a new post is published.
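To make the linear growth concrete, here is a minimal sketch that only counts the notifications the publishing site itself has to send for one new post. The function name and the subscriber counts are made up for illustration; nothing here is taken from the plugin's code.

```python
def publisher_messages(subscribers: int, uses_cloud_server: bool) -> int:
    """Count the HTTP notifications the publishing site sends for one new post."""
    if uses_cloud_server:
        # One ping to a separate cloud server, which notifies subscribers itself.
        return 1
    # The site acts as its own cloud server and notifies every subscriber directly.
    return subscribers


# Hypothetical subscriber counts, chosen only to show the trend.
for n in (10, 1_000, 100_000):
    own = publisher_messages(n, uses_cloud_server=False)
    delegated = publisher_messages(n, uses_cloud_server=True)
    print(f"{n} subscribers: {own} messages as own cloud, {delegated} when delegated")
```

The delegated case stays flat no matter how many subscribers there are, while the self-hosted case grows one-for-one with the audience.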

Each notification receiver (subscriber) then fetches the updated feed. When you have a cloud server, it's possible to delegate caching the feed to a scalable location, but the WordPress plugin doesn't support that. So, each subscriber fetches your entire RSS feed, not just the new piece, from your WordPress site. That's a very significant amount of instantaneous traffic hitting your server.

(If the fetches weren't instantaneous, users wouldn't get real-time updates, which would make the whole system useless. So the traffic must arrive as soon as the subscribers can fetch it.)
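A back-of-the-envelope sketch of that burst, assuming every subscriber downloads the whole feed once per new post. The 50 KiB feed size and the subscriber counts are assumptions picked for illustration, not measurements.

```python
def burst_bytes(subscribers: int, feed_size_bytes: int) -> int:
    """Bytes served right after one post if every subscriber refetches the full feed."""
    return subscribers * feed_size_bytes


FEED_SIZE = 50 * 1024  # assumed feed size in bytes (hypothetical)

# Hypothetical subscriber counts.
for n in (100, 10_000, 1_000_000):
    mib = burst_bytes(n, FEED_SIZE) / 1024**2
    print(f"{n} subscribers -> about {mib:.1f} MiB served in one burst")
```

This ignores HTTP headers, compression, and conditional GETs, so treat it as a rough upper-level estimate of the shape of the problem rather than a prediction.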

I know my site isn't so popular it could suffer a DOS attack from RSS Cloud subscribers, even using the WordPress plugin as-is, but I also don't want to advocate the use of this plugin. Not just because it's a dangerous architecture, but because scalability and distributed systems are two of my research areas in my graduate studies, and frankly, I'd look foolish if I advocated using it.

It appears the RSS Cloud design can be quite scalable. One area of improvement would be to provide subscribers with deltas of the RSS instead of the entire feed, as PubSubHubbub does, but that's not an absolute dealbreaker. I'm not convinced the current RSS Cloud server implementation scales to huge numbers of subscribers or sites, but based on my limited research I think it's possible to do a very nice job on that part of the equation.
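As a rough illustration of what a delta means here, the sketch below keeps only the feed items a subscriber has not seen yet. The item dictionaries, the "guid" key, and the function name are my own assumptions for the example; this is not PubSubHubbub's actual wire format or API.

```python
def feed_delta(items: list[dict], seen_guids: set[str]) -> list[dict]:
    """Return only the items whose GUID the subscriber has not already received."""
    return [item for item in items if item["guid"] not in seen_guids]


# Hypothetical feed with three items; the subscriber already has the first two.
feed = [
    {"guid": "post-1", "title": "Old post"},
    {"guid": "post-2", "title": "Older post"},
    {"guid": "post-3", "title": "Brand new post"},
]
already_seen = {"post-1", "post-2"}

delta = feed_delta(feed, already_seen)
print(f"full feed: {len(feed)} items, delta: {len(delta)} item(s)")
```

Sending only the new items keeps each update small regardless of how long the feed has grown.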

One concern for site publishers is analytics. Since the real-time feed requests do not have to go to the original server, they will not appear in the publisher's access logs. It would be up to the cloud server operator to provide access to that data somehow; that is outside the spec.

So why isn't wordpress.com getting DOS'd, since all wordpress.com blogs now support RSS Cloud? The answer is simple: there aren't many RSS Cloud subscribers yet, since very few people are using a compatible RSS feed reader today. It's a disaster waiting to happen, so the plugin will have to be improved and wordpress.com will need to implement a scalable RSS Cloud if RSS Cloud ever becomes popular on the reader side.

Written on September 10, 2009