May 8, 2008

Tagging's Long Tail

Tagging systems offer a fascinating opportunity to study how people tag and what collective wisdom can be generated from the masses. Tim Spalding, in a recent post at Thingology, The Long Tail of Ann Coulter, observes that tag use in LibraryThing follows the Long Tail principle. That is, a few tags are used a great deal to describe a given item, while many other tags are used just once. These singleton tags reflect the idiosyncratic nature of individual taggers.
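To make the head-and-tail split concrete, here is a minimal sketch in Python. The tag list is invented for illustration; it is not LibraryThing or MTagger data, and the variable names are my own.

```python
from collections import Counter

# Hypothetical tags applied to a single book by different users (invented data).
tags = [
    "politics", "politics", "conservative", "politics", "conservative",
    "non-fiction", "gift from dad", "read on plane", "to-sell",
]

counts = Counter(tags)

# The "head": tags used more than once, most frequent first.
head = [tag for tag, n in counts.most_common() if n > 1]

# The "tail": singleton tags, each applied exactly once.
singletons = sorted(tag for tag, n in counts.items() if n == 1)

print("Head:", head)
print("Singletons:", singletons)
```

Even in this toy list, the singletons outnumber the popular tags, which is the shape Spalding describes.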

I've been thinking about the value of these singleton tags, without conclusive results, in connection with MTagger, the tagging application we built at the University of Michigan library. With the 8.5 million items in the library catalog, or even the 55,000 web pages on our site, will enough tags ever be applied to enough items to make it a useful mode for a newcomer to find an individual item, or are they just an aide-memoire for the person who applied them? In other words, do tags way off in the Long Tail matter?

The more I've pondered this, the more I realize that it's not an either-or question. Tagging, at least in the library environment, is most valuable as a personal collection tool. It offers a way for library users to bring together things that seem similar to them for their own purposes. The real value of tagging is like that of a library: it's the collections, the constructed universe of things that someone (a librarian, a subject expert, a user) brought together. While my tags may prove of no value to anyone else in finding a particular item, the mass of items I've used that idiosyncratic tag on may very well guide a future user in resource discovery.

May 6, 2008

Academic Institutions on Facebook

Melissa Cheater at the Academica blog compiled a survey of institutions of higher education with a presence on Facebook and published a post titled "How higher ed is using Facebook Pages."

She found more than 420 IHE-related Facebook Pages. It is interesting to note that Facebook does not provide a standard way to identify authorship -- so she was unable to determine who published each page: the school, a staff or faculty member, or simply someone who thought there should be one. This poses an interesting question of "authority" -- how reliable are Facebook Pages as sources of valid and trusted information?

The full post is worth a read.

May 2, 2008

High Tolerance for Ambiguity

The 2.0 world -- in libraries in particular, or the web in general -- is helping to address the information management problem of ambiguity. In his inaugural column in the May issue of KMWorld, "Now, everything is fragmented," David Snowden notes: "The more you structure material, the more you summarize (either as an editor or using technology), the more you make material specific to a context or time, the less utility that material has as things change..."

Much of what the knowledge management world, and, for that matter, librarians more broadly, seek to accomplish is to get the right bit of information to the person who is looking for it at the right time. However, as we build systems to accomplish that task, we often run counter to both the defining characteristic of our age and what Snowden describes as one of the defining characteristics of our species. He writes:

First, we live in a world subject to constant change, and it’s better to blend fragments at the time of need than attempt to anticipate all needs. We are moving from attempting to anticipate the future to creating an attitude and capability of anticipatory awareness. Second, we are homo sapiens at least in part because we were first homo narrans: the storytelling ape. Dealing with anecdotal material from multiple sources and creating our own stories in turn has been a critical part of our evolutionary development.

Information systems are typically built to remove ambiguity. They are tailored to the specific need at hand. Snowden notes that there is a risk to building systems that remove ambiguity by "chunking" information into discrete elements. This risk is shown through research (in national security, in particular) that indicates raw intelligence is more useful over longer periods of time than the reports based on that raw data. 2.0 environments, in which users of information build on the raw materials, mixing and matching sources in novel ways, are more flexible and can accommodate needs as they change over time.

A mentor and twice-supervisor of mine described someone's ability to survive in an organization by saying that the individual either had or lacked a "high tolerance for ambiguity." Having a high tolerance was a good thing: if you could keep your relative sanity as organizational priorities and day-to-day exigencies changed, you were in good shape. As librarians, we need to develop a high tolerance for ambiguity in the information systems we design and provide. By this, I don't mean developing to wishy-washy specifications. I do mean that we need to build systems that enable our users to pursue information-seeking paths we don't, or can't, anticipate. Systems must be built to allow others to get to the raw data, manipulate it, and do what they will. As we meet today's information needs, we must also allow for flexible interpretation and serendipity of discovery.

April 28, 2008

RSS Awareness Day

Thursday, 1 May 2008, is RSS Awareness Day. There's a grassroots effort to increase the awareness and use of RSS (and syndication tools in general). On the RSS Awareness Day site, it is claimed that "Feedburner recently reported that they track around 60 million RSS subscribers."

Of course, there are a lot more Internet users today than there were in 2005 (one estimate puts the total at 1.3 billion at the end of December 2007). I would go so far as to triple Feedburner's estimate to 180 million RSS subscribers, to account for all the users that Feedburner does not know about. And there have to be millions of them: people who "use RSS" without being actively aware of it, such as through "live bookmarks" in Firefox, Safari, and IE, or from web sites that themselves are amalgamations of feeds from other publications. People do not need to know what RSS is to use it.

Still, even if we triple the number of users Feedburner thinks there are to 180 million, that is only 13.8% of the 1.3 billion users out there. That's not a particularly overwhelming market penetration figure for something as gosh-darned handy as RSS.
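For anyone who wants to check or adjust the back-of-the-envelope math, here is a quick sketch in Python. The subscriber and user counts are the figures quoted above; the multiplier of 3 is my own guess, not a measured value, so change it to taste.

```python
# Figures quoted in this post; the multiplier is a guess, not a measurement.
feedburner_subscribers = 60_000_000   # as reported on the RSS Awareness Day site
internet_users = 1_300_000_000        # one estimate for the end of December 2007
multiplier = 3                        # assumed allowance for users Feedburner cannot see

estimated_subscribers = feedburner_subscribers * multiplier
share = estimated_subscribers / internet_users

print(f"Estimated RSS subscribers: {estimated_subscribers:,}")
print(f"Share of Internet users: {share:.1%}")
```

Even a multiplier of 5 would put the share at under a quarter of Internet users.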

So -- talk about RSS on May 1, especially if you can do so without preaching to the converted. You and I probably do not need to be sold on the benefits. But our patrons do. But our parents probably don't. Take advantage of the first RSS Awareness Day to spread the word.

April 24, 2008

RSS and Legal Liability

A French court has found that the publisher of a web site is liable for invasion of privacy because it republished rumors, via RSS feeds, that were themselves libelous. See French Websites liable for story in RSS reader (Out-Law.com). The publishers of the third-party sites had to pay fines of between 500 and 1,000 euros. Out-Law.com, a British legal news site, notes that, "while there has not been a test case in the UK on link liability," there is a legal precedent that could be relevant in English common law: "A Court of Appeal ruling ... found that a man who stood by a roadside placard drawing the attention of passers by to it was liable for its defamatory content, even though he did not create or erect the placard."

This French case may not have any relevance in the U.S., where the legal concepts of freedom of speech and privacy are construed differently. I find it interesting that one publisher could be guilty of libel by reproducing, without any conscious effort, an RSS feed from another source. One of the strengths of RSS is also one of its drawbacks -- you subscribe to the feed, come what may.

Do any RSS4Lib readers have opinions on this? Fire away in the comments.