Much of the brouhaha about bandwidth and scalability of RSS feeds stems from the common misconception that feed data is second-class data. The Radio Userland model is just one source of the misconception: data is dropped into the queue, shuffles its way down the chain, and then disappears off the end, a window onto a moving data flow if you will. (Feel free to extend the chain metaphor to its logical conclusion.)
If you run with this misconception, you jump to all sorts of myopic conclusions, such as:
- Data is redundant – if you miss a post, who cares
- GUIDs are nice-to-haves – if I update a post, who cares if the feed sends a second copy
- Full text isn’t important – just click and you’ll go to the item in a browser
- You’ll get the item eventually – who cares if it is a day late, so long as I can read it
- Filtering defeats simplicity – I’d rather have a million items to read through manually, than have a useful and usable mechanism for cutting, slicing, dicing and sorting important data
- Take what the site pushes – I like being subservient to big publishers, so I don’t have to think about what might be important to me
Wrong, wrong, wrong. Feed items are first-class data, and should be treated as such.
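To make "first class" a bit more concrete, here is a minimal sketch of what an aggregator might do if it took GUIDs seriously: keep every item it has ever seen, keyed by GUID, so an edited post replaces the old copy instead of turning up as a duplicate. The names `FeedItem`, `ItemStore` and `upsert` are mine and purely illustrative; this is not how Radio Userland or any particular aggregator works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedItem:
    guid: str   # stable identifier supplied by the publisher
    title: str
    body: str   # full text, not just a teaser


class ItemStore:
    """Hypothetical store that keeps items by GUID instead of letting them scroll away."""

    def __init__(self):
        self._items = {}

    def upsert(self, item):
        """Store an item and report whether it was 'new', 'updated' or 'unchanged'."""
        existing = self._items.get(item.guid)
        if existing is None:
            self._items[item.guid] = item
            return "new"
        if existing == item:
            # Same GUID, same content: the feed simply repeated itself.
            return "unchanged"
        # Same GUID, different content: the post was edited, so replace it.
        self._items[item.guid] = item
        return "updated"

    def get(self, guid):
        return self._items.get(guid)

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    store = ItemStore()
    first = FeedItem("tag:example.org,2004:1", "Hello", "First draft")
    edited = FeedItem("tag:example.org,2004:1", "Hello", "Second draft")
    print(store.upsert(first))   # new
    print(store.upsert(first))   # unchanged
    print(store.upsert(edited))  # updated
    print(len(store))            # 1
```

The point of the sketch is only that, with a dependable GUID, "who cares if the feed sends a second copy" stops being a shrug and becomes a one-line lookup.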
When Scoble talks about pulling down full text for posted items, he’s not a nutter with shares in fat pipe providers, he’s an information consumer with a hunger for knowledge, and only he knows how and where he’ll use that data. Once I have that data in my possession, I can do what I like with it: read it, re-publish it, annotate it, combine it, scan it, summarise it, print it (in my preferred format/style), convert it to spoken audio (using Windows and Mac voice tools), cross-analyse relationships between items, etc. And once the data starts to move away from free text and into more structural data, the sky’s the limit. I’m like Scoble (I can’t believe I just said that), and our numbers are growing.
Regardless of the filtering nirvana that I usually rant about, the future is coming, and it is full text and structural data that is regularly updated on demand, navigable, and maintains data and relationship integrity. Sure, our current protocols are completely inadequate for these purposes, but then we’ve already been saying this to the RSS bigots for almost a year now.
I’m sensing momentum… perhaps the fog is starting to clear.