RSS: From Format to Plumbing to Nothing in 13 Years

Is RSS being forced out by technology companies that want more control over all aspects of the user interaction? That’s the contention of an article recently published in the Sydney Morning Herald (see “Apple Joins the War on RSS,” by Adam Turner, 1 August 2012, via the RSS Specifications blog). Turner points out how RSS is no longer a part of either Apple Mail or Apple Safari in the latest version of its operating system, Mountain Lion. He goes on to show how major Internet sites like Facebook and Twitter have been removing the in-built RSS feeds from their pages, making it harder to subscribe to information streams without subscribing to the service itself. Google Plus, Turner notes, never had RSS to start with.

When RSS started, it was a tool for individuals to use to track the web sites and people they were interested in. In the early 2000s, sites tried to get you, the reader, to subscribe to their RSS feeds as a way to retain readership. By the late 2000s and the first few years of the teens, RSS was less a selling point than a fundamental part of any web site. It had evolved into an open, universal data exchange standard for web sites. Applications could easily sniff it out through the headers of web pages, which are read by applications but not seen by human users unless they view a page’s source. Perhaps this change from shiny fixture on the kitchen counter to plumbing behind the wall was not a sign of its fundamental importance, as I posited previously.

As Turner points out, the HTML code that indicates the presence of an RSS feed is increasingly rare in a web page’s header. For example, look at the source of the RSS4Lib Twitter profile page or the RSS4Lib Facebook Page. Neither has a <link rel="alternate"> element in the document header to let an application know that there’s an RSS feed present.
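
For the curious, here is a minimal Python sketch of how an application might look for that header link. It assumes the common autodiscovery convention of a link element with rel="alternate" and an RSS or Atom type attribute; the find_feeds name and the example.org URL are mine, purely for illustration.

```python
from html.parser import HTMLParser

# MIME types that conventionally mark a <link> element as a feed (assumption:
# RSS and Atom are the only two types we care about here).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects the href of every <link rel="alternate"> that advertises a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None when written without a value.
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        feed_type = attributes.get("type", "").lower()
        href = attributes.get("href", "")
        if "alternate" in rels and feed_type in FEED_TYPES and href:
            self.feeds.append(href)


def find_feeds(html):
    """Return the feed URLs advertised in the given HTML source."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


page = (
    '<html><head><title>Example</title>'
    '<link rel="alternate" type="application/rss+xml" '
    'title="RSS" href="http://example.org/feed.rss">'
    '</head><body></body></html>'
)
print(find_feeds(page))
```

When find_feeds returns an empty list for a page, a feed reader has nothing to latch onto, which is the situation on the Twitter and Facebook pages mentioned above.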

While I regret the change in philosophy that has led popular social networking sites to make it harder for the content on the site to be used in other venues, I suppose I understand it. I imagine the Twitters and Facebooks of the world are thinking something along the lines of this: “If we can prevent that scourge of openness, RSS, from liberating individual users’ content, we can sell more ads or control more interactions.” In a commercial sense, that’s plausible, even if it doesn’t wholly reflect reality.

At the same time, if Apple no longer indicates that RSS feeds exist in pages that you visit in Safari (and if other browsers follow suit), that will drive a fundamental change in the way individuals discover and access Internet content. Sure, discovery will happen, but it will happen through your social networks, mediated by major services. And it will happen in short-form: a few characters in a tweet, or a snippet on Facebook. It won’t happen in long-form, in a context that you (the consumer) manage. If information wants to be free, as the saying goes, it needs a path to follow. RSS seemed like it was that path. What’s next?

Drupal in Libraries

One of the reasons I’ve been so absent on RSS4Lib over the past eighteen months or so is a larger project I was working on: a book, Drupal in Libraries, Volume 14 of the Tech Set ® series edited by Ellyssa Kroski.

The book is written as a primer for technically proficient librarians who want to learn more about Drupal and manage a web site using it, but who are not themselves coders. The only time you’ll need to be typing commands directly into a terminal emulator (and even that is optional) is to install and decompress the Drupal software. The rest of the book is focused on what you can do with Drupal from the administrative interface. The book has 10 chapters, as follows:

  1. Introduction
  2. Types of Solutions Available (how and where you can get Drupal, seek development and/or technical support)
  3. Planning (this chapter is available as a free sample)
  4. Social Mechanics (working with your organization to build a successful project)
  5. Implementation (this is the bulk of the text and walks you through the basics of adding and configuring modules, creating content types, and working with various features such as views and panels)
  6. Marketing (how to sell Drupal to your staff and to your IT organization, and how to sell the site to your patrons once it’s launched)
  7. Best Practices (tips and tricks for building a secure and stable Drupal site)
  8. Metrics (measuring the success of your new site)
  9. Developing Trends (up-and-coming tools and modules to be aware of)
  10. Recommended Reading (an annotated bibliography of books, articles, and learning resources)

The book also has a companion web site, with additional information and discussion forums. If you have questions about the book, or Drupal in a library setting, stop on by.

You can purchase the book through Amazon.com, from Neal-Schuman, or through your favorite bookseller.

Pinterest

Pinterest (http://www.pinterest.com/) is the latest social media tool to emerge from the fringes to the spotlight. It’s something of a social media bulletin board for interesting images. Once you set up an account (invitation only, but you can request an invitation; mine came within hours), you are given a bookmarklet tied to your account so that you can start pinning images you find on the web.

When you’re on a page that has an image you want to “pin,” you click the bookmarklet. Pinterest shows you thumbnails of all the images on that particular page. You select the thumbnail image you want and the board you want to add it to (you can create as many boards as you like).

Uses of Pinterest for Libraries

Pinterest has some interesting uses for libraries:

Copyright Questions

One of the interesting challenges faced by Pinterest is that of copyright. Pinterest works by copying a thumbnail image of whatever it is that you pin. When you pin an image, the original is linked from the thumbnail. While this is probably not, strictly speaking, allowed by copyright law, I suspect Pinterest is operating under the theory that if Google can cache a thumbnail of an image (or even of an entire web page) for its search tools, then they can do the same.

Complications arise, though, when one Pinterest user copies an image from another. You can "repin" another user’s image to one of your own boards. At that point, you’ve created another copy of the image on your board that links to the "original" (that is, the thumbnail on someone else’s board) and not to the original artist’s. There’s been quite a kerfuffle about this of late.

There’s a very nice summary of the issues around "pinning" things at the University of Minnesota’s Copyright Librarian blog (and a follow-up post) that I encourage you to read. It summarizes the issues far better than I can.

Pinterest via RSS

Pinterest doesn’t document its RSS feeds well, but I stumbled across some instructions for how they can be made.

  1. To get an RSS feed for all of a particular user’s boards, add “feed.rss” to the end of the URL of the user’s Pinterest page. So, for example, the RSS feed for the Darien Library’s Pinterest account is http://pinterest.com/darienlibrary/feed.rss.
  2. To get an RSS feed for a specific board, remove the end “/” from the board’s URL and then add “.rss”. So the Darien Library’s Best Books for Babies and Toddlers board has the feed http://pinterest.com/darienlibrary/best-books-for-babies-toddlers.rss.
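
If you want to script this, the two rules above can be sketched in a few lines of Python. The function names are hypothetical, and the URL patterns are only the ones described in this post; Pinterest doesn’t document them, so they could change without notice.

```python
def user_feed_url(user_page_url):
    """Rule 1: append "feed.rss" to the user's page URL."""
    # Normalize so the result has exactly one "/" before "feed.rss".
    return user_page_url.rstrip("/") + "/feed.rss"


def board_feed_url(board_url):
    """Rule 2: drop the trailing "/" from the board URL, then append ".rss"."""
    return board_url.rstrip("/") + ".rss"


print(user_feed_url("http://pinterest.com/darienlibrary/"))
print(board_feed_url("http://pinterest.com/darienlibrary/best-books-for-babies-toddlers/"))
```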

Happy syndicating! (And don’t ask about the potential for copyright issues when you re-publish an RSS feed of a Pinterest board that itself has copyrighted but unlicensed images on it.)

Curators’ Codes to Standardize ‘Hat Tips’ and ‘Vias’

An interesting proposal was made at SXSW this week to standardize the way we bloggers, and other content aggregators and curators, make reference to those from whom we get interesting tidbits that spark a thought (a ‘hat tip’) or are the source of our post (a ‘via’). The glyphs are called Curator’s Codes. They are Unicode characters meant to be a standard (if not a real one, a standard of practice) for giving credit where credit is due:

Symbol        Purpose    HTML Code
ᔥ (U+1525)    Via        <span style="font-family:sans-serif;text-decoration:none;">&#x1525;</span>
↬ (U+21AC)    Hat Tip    <span style="font-family:sans-serif;text-decoration:none;">&#x21ac;</span>

The symbol itself is the link to the source. Curator’s Codes could be rendered in line, much like a brief citation, or used as freestanding blocks. Or, really, in any way that’s sensible to the author. As in, for example, the hat tip for this post: ↬ David Carr, “A Code of Conduct for Content Aggregators”.
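
As a rough illustration of “the symbol itself is the link,” here is a small Python sketch that wraps one of the two glyphs in a link to the source. The span markup is the one from the table above; wrapping it in an a element, the curator_code function name, and the example.org URL are my own assumptions, not part of the Curator’s Code proposal.

```python
from html import escape

# Character references for the two glyphs, as given in the table above.
CODES = {"via": "&#x1525;", "hat tip": "&#x21ac;"}

# Inline style copied from the HTML code in the table above.
SPAN_STYLE = "font-family:sans-serif;text-decoration:none;"


def curator_code(kind, source_url):
    """Return an HTML snippet in which the glyph links to the source."""
    entity = CODES[kind]  # raises KeyError for anything but "via" / "hat tip"
    return '<a href="{url}"><span style="{style}">{entity}</span></a>'.format(
        url=escape(source_url, quote=True),  # escape &, <, > and quotes
        style=SPAN_STYLE,
        entity=entity,
    )


print(curator_code("hat tip", "http://example.org/article?id=1&page=2"))
```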

What’s the point? To quote the folks at Curators Code:

While we have systems in place for literary citation, image attribution, and scientific reference, we don’t yet have a system that codifies the attribution of discovery in curation as a currency of the information economy, a system that treats discovery as the creative labor that it is.

As we madly link from thing to thing, and others, in turn, pick up our post and run with it, quoting here, paraphrasing there, it’s all too easy for something one author says to be lost in the expounded thoughts of another. Making a simple, standard way for authors to cite others is a good thing. So is a quick way to indicate the kind of citation: are you quoting or paraphrasing, or giving credit to someone else who sparked a thought? Standardization may be a good answer. If adopted, it could even lead to better machine parsing of interconnections between blog posts, tweets, Facebook, etc.

Update 13 March 2012: There’s an interesting contrarian view at The Brooks Review.

The Paradox of RSS and Web Scale Discovery

Web Scale Discovery systems (products like Summon, EBSCO Discovery Service, Primo Central, and so on) make their customers love them through their comprehensiveness. These systems index hundreds of millions of items (some approach a billion) from scholarly and popular sources, library catalogs, institutional repositories, and more. No matter how esoteric the topic you are looking for, you’re almost certain to find something that’s related. Or close to being related.

With their vast reach, these discovery systems open the door to being almost omniscient alert services. Their coverage is vast, so whenever something new is published on a topic, it is likely to find its way into the discovery index. The challenge, it turns out, is in letting people know when something new is available.

Discovery systems are primarily retrieval systems. They cast a wide net, and sort their results in relevance order. When something new is added to the index and the same search is run, the new items appear somewhere in the list. This is the challenge for any kind of current awareness system (whether it is RSS or email alerts).

If the system simply runs the search again and provides an RSS feed of the 100 most relevant results, for most searches, the new material will be nowhere near the top and the feed will contain exactly what you have already seen. For many topics, the new items won’t even make the relevancy cut and will be excluded.

If the system runs the search and provides an RSS feed in reverse chronological order (newest items on top), the newest items may well be so far down the relevancy ranking that they are, in fact, nearly irrelevant. Try a couple of experiments. Do a search in your favorite tool and move down to the 5,000th result. Is it the item you’ve been looking for all your life? Almost certainly not. Do the same search, but re-sort by publication date (newest first). Is the top result relevant to your query? Again, probably not.

So what is needed is some sort of hybrid database structure. The items from the original search result set that pass some relevancy threshold need to be saved. Whenever new items are added, these new items are compared to the existing list. If they are more relevant than items in the previously seen list, they are added to an alert, and the list of previously seen and previously alerted items grows. Figuring out which items are new (to the user) is not trivial.
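
To make the idea concrete, here is a minimal Python sketch of that hybrid structure. It is an illustration under loud assumptions, not how any discovery product works: I assume each search run returns (item id, relevance score) pairs, that scores are comparable from one run to the next, and that “more relevant than items in the previously seen list” can be simplified to “passes the same fixed threshold.” The TopicAlert class and its method names are hypothetical.

```python
class TopicAlert:
    """Remembers which sufficiently relevant items a user has already seen."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical minimum relevance score
        self.seen = {}              # item id -> score when first recorded

    def _relevant_unseen(self, results):
        # Keep only items that pass the threshold and are not yet recorded.
        return [
            (item_id, score)
            for item_id, score in results
            if score >= self.threshold and item_id not in self.seen
        ]

    def seed(self, results):
        """Save the relevant items from the original search; no alert is sent."""
        for item_id, score in self._relevant_unseen(results):
            self.seen[item_id] = score

    def check(self, results):
        """Process a re-run: return unseen relevant items, most relevant first."""
        fresh = sorted(
            self._relevant_unseen(results),
            key=lambda pair: pair[1],
            reverse=True,
        )
        # Record alerted items so they are never reported twice.
        for item_id, score in fresh:
            self.seen[item_id] = score
        return fresh


alert = TopicAlert(threshold=0.5)
alert.seed([("a", 0.9), ("b", 0.4)])
print(alert.check([("a", 0.9), ("c", 0.7), ("d", 0.2), ("e", 0.95)]))
```

Each call to check could become one batch of entries in an RSS feed. The hard parts the sketch skips are exactly the ones that make this non-trivial in practice: choosing the threshold, and coping with relevance scores that shift as the index grows.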

Discovery and RSS are almost inherently at odds with one another. Any ideas on how to build a usable RSS feed to stay apprised of a topic?

Google Reader’s A-Changin’

Google recently announced that they are soon to relaunch Google Reader with a new design and are “going to bring Reader and Google+ closer together, so you can share the best of your feeds with just the right circles.” Although I am not a huge fan of Google+ (aside from the coolness of Hangouts, I haven’t seen a reason to convert from Twitter and Facebook; my social circles don’t seem to be active in Google+), one of the things that has griped me about Reader is that there has been no way to share RSS items with my Plus circles. If nothing else, that will soon change.
Something else that will change is that the Google Reader API (an unofficial, undocumented, and formally unsupported API) will at some point be phased out. This doesn’t make a difference to users of the Google Reader web site, but does matter for anyone who has been using Google Reader to track what has been read in applications like FeedDemon and others.
If you want to get your data from Google, they will continue to offer an OPML download of your feeds, but will be augmenting the list of subscribed feeds with your other personal data, including your shared items, friends, likes, and starred items. What you do with them then is your business.

The Link to This Post Has a Half Life Measured in Hours

A recent research report by Hilary Mason of Bit.ly explores the lifespan of a link shared through social media. Her finding is that links shared via Twitter, Facebook, etc., have remarkably short life spans. She measured the half-life of shared links (the time by which a link has received half of the clicks it will ever receive after reaching its peak) and learned that, for most links, the half-life is two to three hours. (The outlier exception is links shared from YouTube, where the half-life of a shared link is a whopping 7.4 hours.)
Graphs and the full report are available on the bitly blog.
Of course, this post is immortal, because as we all know, blog posts never die. Right?

Just How Dead is YOUR RSS Feed?

There has been another incarnation of the “RSS is dead” meme in the past weeks, with posts at TechCrunch and GigaOM debating the point. The conclusion of these posts seems to be that RSS is continuing its gradual evolution from being perceived as an end-user tool to being viewed as plumbing. And this is probably a good thing.
While I still consume most of my “blog-like” news and commentary via an aggregator, I rely more on recommendations through my social networks for learning what’s new. Perhaps that’s because I’ve become lazy about actively following lots of sources, and prefer the crowd to do the filtering for me. Perhaps it’s because the blogs and news sources I follow are less frequently updated (I know this blog falls in that category). Whatever the reason, I know my consumption patterns have changed. And I’ll wager that most people feel too busy to sift through everything published in every publication they like, and prefer instead to find like-minded individuals who share things of interest. Again, much as I do.
Still, if you’re curious to learn how your feed is consumed (and don’t use Feedburner or the equivalent), take a look at RSS4Lib’s YourStats log file analysis program. If you upload your publication’s log files and tell it what your RSS feed URL is, it will show you where your RSS feed is consumed — providing a good guess at your RSS readership. You may find the numbers surprising (high or low).
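
If you would rather poke at your own logs, the basic idea can be sketched in Python. This is a simplified illustration, not the YourStats code: it assumes your server writes the Apache “combined” log format, and the feed_consumers function, the /feed.xml path, and the sample user agents are made up for the example.

```python
import re
from collections import Counter

# One line of Apache "combined" log format (an assumption about your server):
# host ident user [time] "METHOD path protocol" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def feed_consumers(log_lines, feed_path):
    """Count requests for feed_path, grouped by user agent string."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") == feed_path:
            counts[match.group("agent")] += 1
    return counts


sample = [
    '192.0.2.1 - - [01/Sep/2011:10:00:00 -0400] "GET /feed.xml HTTP/1.1" 200 5120 "-" "ExampleReader/1.0"',
    '192.0.2.2 - - [01/Sep/2011:10:05:00 -0400] "GET /feed.xml HTTP/1.1" 200 5120 "-" "OtherBot/2.0"',
    '192.0.2.1 - - [01/Sep/2011:11:00:00 -0400] "GET /feed.xml HTTP/1.1" 304 - "-" "ExampleReader/1.0"',
    '192.0.2.3 - - [01/Sep/2011:11:30:00 -0400] "GET /index.html HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
]
for agent, hits in feed_consumers(sample, "/feed.xml").most_common():
    print(hits, agent)
```

Request counts are only a proxy for readership. A single hosted aggregator may fetch the feed once on behalf of many subscribers, so treat the result as a good guess rather than a head count.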

Farewell, Bloglines, It’s Been Swell

Bloglines, the venerable RSS reader that I (and tens of thousands of others) have used since 2005, is shutting down on October 1, 2010. Bloglines is making it easy to continue your feedreading habit elsewhere, replacing their front page with three simple steps to export your folders and subscriptions in OPML format:

Exporting Bloglines subscriptions into OPML
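
Once you have the export, the file is plain XML and easy to inspect. Here is a minimal Python sketch that lists the feeds in an OPML file. It assumes the usual OPML convention of outline elements carrying an xmlUrl attribute, with folders as nested outlines; the feeds_from_opml name and the sample subscriptions are invented for illustration and are not taken from a real Bloglines export.

```python
import xml.etree.ElementTree as ET


def feeds_from_opml(opml_text):
    """Return (title, feed URL) pairs for every subscription in an OPML document."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks nested outlines too, so feeds inside folders are included.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # folder outlines have no xmlUrl and are skipped
            title = outline.get("title") or outline.get("text") or url
            feeds.append((title, url))
    return feeds


sample_opml = """<opml version="1.0">
  <head><title>Subscriptions</title></head>
  <body>
    <outline text="Libraries">
      <outline text="Example Library Blog" type="rss"
               xmlUrl="http://example.org/library/feed.rss"/>
    </outline>
    <outline title="Example News" type="rss"
             xmlUrl="http://example.org/news/feed.rss"/>
  </body>
</opml>"""

for title, url in feeds_from_opml(sample_opml):
    print(title, url)
```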

In retrospect, this seems inevitable, and I’m surprised that my fondness for Bloglines’ simplicity made me put up with its quirky behavior for so long. (Quirky, of course, means almost constant brief outages on their perpetual beta version.) Bloglines’ move into selling advertisements on its front page (see Bloglines Succumbs to Advertising from September 2008) was obviously not enough to bring in the revenue needed to keep the service running. When your only serious competitor is Google, I suspect almost nothing can save you.
In the blog post announcing the shutdown, the trend behind the news is made clear:

The real-time information RSS was so astute at delivering (primarily, blog feeds) is now gained through conversations, and consuming this information has become a social experience. As Steve Gillmor pointed out in TechCrunch last year, being locked in an RSS reader makes less and less sense to people as Twitter and Facebook dominate real-time information flow. Today RSS is the enabling technology – the infrastructure, the delivery system. RSS is a means to an end, not a consumer experience in and of itself. As a result, RSS aggregator usage has slowed significantly, and Bloglines isn’t the only service to feel the impact. The writing is on the wall.

I made a similar point about the phase change in RSS from being a commodity in itself to being a transport mechanism in September 2009. Just as soundbite reporting in television and radio news changed those media, so has ‘textbite’ exchange of information changed the Internet. The overwhelming force of the conversation in Twitter and Facebook, where the granularity of information exchange is much smaller and seems to permeate the Internet with greater fluidity, has changed the game.
I’m not giving up on my RSS feeds (from blogs, news services, and other sources), but I’m switching to the only other game in town: Google Reader.