May 13, 2008

RSS Feeds & Copyright

Copyright and fair use are poorly understood by the population at large (just ask high school teachers or college professors how much time they spend vetting submitted papers for flagrant -- let alone subtle -- plagiarism). Syndication technologies such as RSS and Atom make it so easy to repurpose works that what's proper -- morally or legally -- is often overlooked. After all, feeds are purpose-built to make content portable. The common reasoning runs: if the author did not want others to copy the content, the author would not send it out in a format designed for simple syndication.

The Australian magazine PC World runs an interesting article by Larry Borsato: "Who owns 'public' content? RSS feed ownership brought into question." In the article, Borsato recounts a recent incident in which a commercial entity reproduced, in toto, his blog posts via RSS on its web site. Borsato publishes under a Creative Commons non-commercial attribution license, and he felt the site had violated it; it was, after all, a commercial entity. While the question was resolved amicably, it highlights, once again, the difference between how copyright is frequently viewed in the syndicated environment and how it is often seen in the print world. Borsato concludes:

In the same way that I can't reprint a Harry Potter book and start selling it for my own gain, we need to realize that we can't do that with RSS feeds or other Web content either. While Fair Use is OK, you can't just start lifting and reusing entire bodies of work without permission.

Like many other facets of life in the Internet age, technological possibility is outstripping common practice -- and often outstripping common sense. Some of this particular misconception, about what can legitimately be done with online content, can be cleared up through experience and training. Some of it will inevitably be resolved through better technological solutions. But when it comes down to it, we as bloggers must take greater responsibility for tracking how our content is used.

May 8, 2008

Tagging's Long Tail

Tagging systems offer a fascinating opportunity to study how people tag and what collective wisdom can be generated from the masses. Tim Spalding, in a recent post at Thingology, "The Long Tail of Ann Coulter," observes that tag use in LibraryThing follows the Long Tail principle. That is, a few tags are used a great deal to describe a given item, while many other tags are used just once. These singleton tags reflect the idiosyncratic nature of individual taggers.
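To make the head-versus-tail distinction concrete, here is a minimal sketch that counts how often each tag is applied to a single item and separates out the singletons. The tag list is invented for illustration (it is not LibraryThing data), and `tag_distribution` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical tags applied by different users to one item (invented data).
tags = (
    ["politics"] * 5
    + ["conservatism"] * 3
    + ["humor", "gift from dad", "read in 2007"]
)

def tag_distribution(applied_tags):
    """Return (tags ranked by frequency, tags that were used exactly once)."""
    counts = Counter(applied_tags)
    ranked = counts.most_common()
    singletons = sorted(tag for tag, n in counts.items() if n == 1)
    return ranked, singletons

ranked, singletons = tag_distribution(tags)
print("Head of the distribution:", ranked[:2])
print("Singleton tags in the tail:", singletons)
```

In this toy data two tags account for eight of the eleven applications, while the three singletons say more about the individual tagger than about the item.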

I've been thinking about the value of these singleton tags, without conclusive results, in connection with MTagger, the tagging application we built at the University of Michigan library. With the 8.5 million items in the library catalog, or even the 55,000 web pages on our site, will enough tags ever be applied to enough items to make tagging a useful way for a newcomer to find an individual item, or are tags just an aide-memoire for the person who applied them? In other words, do tags way off in the Long Tail matter?

The more I've pondered this, the more I realize that it's not an either-or question. Tagging, at least in the library environment, is most valuable as a personal collection tool. It offers a way for library users to bring together things that seem similar to them for their own purposes. The real value of tagging is like that of a library: it's the collections, the constructed universe of things that someone (a librarian, a subject expert, a user) brought together. While my tags may prove of no value to anyone else in finding a particular item, the mass of items I've used that idiosyncratic tag on may very well guide a future user in resource discovery.
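One way to picture tags as personal collections is to group items by the (user, tag) pair that brought them together, then ask which other items share a collection with the one a newcomer is looking at. The sketch below assumes a simple list of (user, tag, item) records; the record shape, the names, and the helper functions are all hypothetical and are not a description of how MTagger stores its data.

```python
from collections import defaultdict

# Hypothetical (user, tag, item) records; all values are invented.
taggings = [
    ("alice", "thesis-ch2", "item-101"),
    ("alice", "thesis-ch2", "item-205"),
    ("alice", "thesis-ch2", "item-317"),
    ("bob", "history", "item-205"),
    ("bob", "history", "item-400"),
]

def build_collections(records):
    """Group items into personal collections keyed by (user, tag)."""
    collections = defaultdict(set)
    for user, tag, item in records:
        collections[(user, tag)].add(item)
    return collections

def related_items(collections, item):
    """Return every other item that shares at least one collection with `item`."""
    related = set()
    for members in collections.values():
        if item in members:
            related |= members
    related.discard(item)
    return related

collections = build_collections(taggings)
neighbors = related_items(collections, "item-205")
print(sorted(neighbors))
```

The tag "thesis-ch2" would mean nothing to a stranger searching the catalog, yet the collection behind it still leads from one item to its neighbors.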

May 6, 2008

Academic Institutions on Facebook

Melissa Cheater at the Academica blog compiled a survey of institutions of higher education with a presence on Facebook and published a post titled "How higher ed is using Facebook Pages."

She found more than 420 IHE-related Facebook Pages. It is interesting to note that Facebook does not provide a standard way to identify authorship -- so she was unable to determine who published each page: the school, a staff or faculty member, or someone who simply thought there should be one. This poses an interesting question of "authority" -- how reliable are Facebook Pages as sites of valid and trusted information?

The full post is worth a read.

May 2, 2008

High Tolerance for Ambiguity

The 2.0 world -- in libraries in particular, or the web in general -- is helping to address the information management problem of ambiguity. In his inaugural column in the May issue of KMWorld, "Now, everything is fragmented," David Snowden notes: "The more you structure material, the more you summarize (either as an editor or using technology), the more you make material specific to a context or time, the less utility that material has as things change..."

Much of what the knowledge management world, and, for that matter, librarians more broadly, seeks to accomplish is to get the right bit of information to the person who is looking for it at the right time. However, as we build systems to accomplish that task, we often run counter to both the defining characteristic of our age and what Snowden describes as one of the defining characteristics of our species. He writes:

First, we live in a world subject to constant change, and it’s better to blend fragments at the time of need than attempt to anticipate all needs. We are moving from attempting to anticipate the future to creating an attitude and capability of anticipatory awareness. Second, we are homo sapiens at least in part because we were first homo narrans: the storytelling ape. Dealing with anecdotal material from multiple sources and creating our own stories in turn has been a critical part of our evolutionary development.

Information systems are typically built to remove ambiguity. They are tailored to the specific need at hand. Snowden notes that there is a risk to building systems that remove ambiguity by "chunking" information into discrete elements. This risk is shown through research (in national security, in particular) that indicates raw intelligence is more useful over longer periods of time than the reports based on that raw data. 2.0 environments, in which users of information build on the raw materials, mixing and matching sources in novel ways, are more flexible, allowing for changing needs to reflect themselves over time.

A mentor and twice-supervisor of mine described someone's ability to survive in an organization by saying that the individual either had or lacked a "high tolerance for ambiguity." Having a high tolerance was a good thing: if you could keep your relative sanity as organizational priorities and day-to-day exigencies changed, you were in good shape. As librarians, we need to develop a high tolerance for ambiguity in the information systems we design and provide. By this, I don't mean developing to wishy-washy specifications. I do mean that we need to build systems that enable our users to pursue information-seeking paths we don't, or can't, anticipate. Systems must be built to allow others to get to the raw data, manipulate it, and do what they will. As we meet today's information needs, we must also allow for flexible interpretation and serendipity of discovery.

April 28, 2008

RSS Awareness Day

Thursday, 1 May 2008, is RSS Awareness Day. There's a grassroots effort to increase the awareness and use of RSS (and syndication tools in general). On the RSS Awareness Day site, it is claimed that "Feedburner recently reported that they track around 60 million RSS subscribers."

Of course, there are a lot more Internet users today than there were in 2005 (one estimate puts the total at 1.3 billion at the end of December 2007). I would go so far as to triple Feedburner's estimate to 180 million RSS subscribers, to account for all the users that Feedburner does not know about. And there have to be millions of them: people who "use RSS" without being actively aware of it, such as through "live bookmarks" in Firefox, Safari, and IE, or from web sites that themselves are amalgamations of feeds from other publications. People do not need to know what RSS is to use it.

Still, even if we triple the number of users Feedburner thinks there are to 180 million, that is only 13.8% of the 1.3 billion Internet users out there. That's not a particularly overwhelming market penetration figure for something as gosh-darned handy as RSS.
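For anyone who wants to check the arithmetic or try a different guess, the back-of-the-envelope estimate can be reproduced in a few lines. The subscriber and user counts are the figures quoted in this post; the multiplier of three is my own rough guess, not a measured value.

```python
# Figures quoted in the post; the multiplier is a guess, not a measurement.
feedburner_subscribers = 60_000_000
internet_users = 1_300_000_000
multiplier = 3

estimated_rss_users = feedburner_subscribers * multiplier
penetration = estimated_rss_users / internet_users

print(f"Estimated RSS users: {estimated_rss_users:,}")
print(f"Share of Internet users: {penetration:.1%}")
```

Changing `multiplier` shows how sensitive the percentage is to that guess.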

So -- talk about RSS on May 1, especially if you can do so without preaching to the converted. You and I probably do not need to be sold on the benefits. But our patrons do. But our parents probably don't. Take advantage of the first RSS Awareness Day to spread the word.