RSS ugliness

I find it ironic to post a criticism of RSS through a medium where RSS itself is so dominant, but here goes anyway 🙂

RSS is mainly used to deliver news (more often breaking news than editorials), so the timeliness with which a news item arrives matters. The mechanism by which RSS works is pretty simple. An RSS feed is a plain text file served by a web server. News readers keep ‘polling’ their subscribed feeds at regular intervals to pick up updates. In general, news readers set their polling rates at highly aggressive levels to keep news delivery timely.
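To make the polling mechanism concrete, here is a minimal Python sketch of what a news reader does on every poll. It is only an illustration: the `FeedReader` class and the `guid`/`title` fields are hypothetical names, and the feed is passed in as an already parsed list rather than fetched over HTTP.

```python
class FeedReader:
    """Toy news reader that remembers which items it has already seen."""

    def __init__(self):
        self.seen = set()  # identifiers of items delivered on earlier polls

    def poll(self, feed_items):
        """Take the WHOLE feed, as a real poll would, and return only new items."""
        new_items = [item for item in feed_items if item["guid"] not in self.seen]
        self.seen.update(item["guid"] for item in new_items)
        return new_items


# Hypothetical feed contents, standing in for the file on the web server.
feed = [
    {"guid": "post-1", "title": "First post"},
    {"guid": "post-2", "title": "Second post"},
]

reader = FeedReader()
first = reader.poll(feed)   # both items are new to this reader
second = reader.poll(feed)  # same feed transferred again, nothing new in it
```

The point of the sketch is that the second poll transfers the same feed and yields nothing, which is the common case when the polling rate is much higher than the update rate.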

Many sites complain that the RSS feeds on their sites get ‘polled’ so often that the bandwidth they consume is significant compared to the traffic to their main website. This is obviously not a good trend, as bandwidth is scarce and expensive. Also, most of the transfers exist only so that news readers can check whether there has been a news update, so it’s quite likely that a significant fraction of RSS hits are never even viewed. Some analysis of this can be found at Robert Scoble’s blog and Internetnews. The general workaround for the problem is to limit the size of RSS items or to use an RSS distribution service.
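A rough back-of-the-envelope calculation shows how quickly this adds up. The Python sketch below assumes every poll transfers the full feed; the function name and all the numbers (subscriber count, polling rate, feed size, update rate) are made up for illustration and are not measurements from any real site.

```python
def daily_polling_cost(subscribers, polls_per_hour, feed_bytes, updates_per_day):
    """Estimate bytes served per day and the fraction of polls that find nothing new.

    Assumes each poll downloads the entire feed and that all arguments are positive.
    """
    polls_per_reader = polls_per_hour * 24
    total_polls = subscribers * polls_per_reader
    total_bytes = total_polls * feed_bytes
    # At most one poll per update can bring a reader something new.
    useful_polls = subscribers * min(updates_per_day, polls_per_reader)
    wasted_fraction = 1 - useful_polls / total_polls
    return total_bytes, wasted_fraction


# Hypothetical site: 1000 readers polling every 30 minutes,
# a 50 kB feed, and 3 new items per day.
total_bytes, wasted = daily_polling_cost(
    subscribers=1000, polls_per_hour=2, feed_bytes=50_000, updates_per_day=3
)
print(total_bytes)  # 2400000000 bytes, i.e. 2.4 GB per day
print(wasted)       # 0.9375, i.e. about 94% of polls fetch nothing new
```

Under these assumed numbers, shrinking the feed reduces the total proportionally, which is why limiting item size is the usual first workaround.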

It is amusing to note that RSS appeared and became successful at a time when well-engineered and time-tested alternatives already existed. The most prominent is e-mail. It’s well tested, robust and, most importantly, a notification medium by design. Instead of messy polling, it is based on one-time, reliable delivery of the message directly to the interested party (I am talking about the SMTP protocol for mail delivery). There is no shortage of email clients on all kinds of platforms, and they offer umpteen ways to automatically filter and organize one’s mail. Email has also evolved significantly over the years. Problems like duplicate delivery and authentication have been taken care of. It’s sad to see the powerful and efficient system of email newsletters being abandoned in favor of RSS feeds.

Please note that the IMAP protocol for accessing mail is a client protocol and does use polling. However, it sensibly restricts polling to a very small transaction in which the client asks the server for a count of unread messages. This is far better than fetching the whole feed again, as RSS does. Having said that, IMAP is known to be pretty resource-hungry itself. It is quite possible for a thin-client interface to mail (webmail, remote login) to actually be more efficient than IMAP.

RSS is evolving too. There are services which allow you to cache your website’s feed on their servers. The service then takes over the subsequent distribution load. This reminds me a lot of how DNS works. DNS is probably one of the most heavily used, scalable and, design-wise, underestimated protocols on the Internet. It also works by placing a large number of distributed caches between the authoritative content (in the case of DNS, the host-name to IP address mappings; in RSS’s case, the feed content) and its consumers. However, there are enough differences between the two to point out the problems with RSS’s general design. DNS data is reused (a lot), so caching it at a local server makes sense. Updates are infrequent and often non-critical, which makes an elaborate delivery mechanism (like email, for instance) overkill. DNS hence resorts to simple expiration of cache entries to refresh the data in the caches. In contrast, RSS data is discarded the moment it has been used, so polling and caching at a ‘blog service’ site makes little architectural sense other than offloading bandwidth usage. We already touched upon the timeliness and update frequency of RSS, both of which are in stark contrast with DNS.
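For readers unfamiliar with expiration-based caching, here is a minimal Python sketch of the idea. It is a simplification and not how any real resolver is implemented: the `TtlCache` class, its parameters and the lookup table are hypothetical, and a single fixed lifetime stands in for the per-record TTL that real DNS data carries.

```python
import time


class TtlCache:
    """Cache that answers from memory until an entry expires, then refetches."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # function that asks the authoritative source
        self._ttl = ttl_seconds    # assumed fixed lifetime for every entry
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]        # still fresh: no trip to the origin
        value = self._fetch(key)   # missing or expired: refresh from the origin
        self._entries[key] = (value, now + self._ttl)
        return value


# Hypothetical authoritative data, using a documentation-only address.
records = {"example.com": "192.0.2.1"}
cache = TtlCache(fetch=records.__getitem__, ttl_seconds=300)
address = cache.get("example.com")        # first lookup goes to the origin
address_again = cache.get("example.com")  # served from the cache for 300 s
```

This pays off only when the same entry is looked up many times between changes, which holds for DNS but not for a news item that is read once and discarded.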

Overall, a step backwards in technology…

