ASIS&T 2007: Next-Generation Catalog: Prototypes and Prospects

Chip Nilges

Nilges is VP of Business Development at OCLC. Currently working on WorldCat Local.
People view libraries favorably as source of great information (from Perceptions report). Report identifies a problem: where do you start your search? 84% say search engine; 2% started at a library site. There is a huge gap there.
How do libraries deliver value (collections, services, and community) to the user, on the network, at the point of need? This is what OCLC is trying to solve.
OCLC’s strategy is to weave libraries into the web. Open WorldCat and WorldCat Local came out of this strategic goal.
Open WorldCat is a syndication project. It puts OCLC catalog records into Google, Yahoo, etc., to get the data where it’s being searched. Predictable URLs, machine interfaces. Hooked in to Google Scholar, for example, as a way to search the catalog. “Give away” WorldCat data. Launched about a year ago; use of WorldCat overall has tripled in 3 years.
Things under development recently:
Personal profiles, citations (in various standard forms);
List creation/management/sharing, expanded metadata coverage to better expose collections of interest to users;
Personalization — features being developed now.
OCLC wants to get into job of citation management — moving in that direction.
OCLC is measuring traffic. In 2006/7: 129.4 million referrals from partner sites to the Open WorldCat landing page; 7.6 million clickthroughs from Open WorldCat to library services. This is huge.
WorldCat Local: Not in original plan to release a next-generation catalog. But from library demand, it came about. OCLC “doesn’t do portals” — it’s just a search box. Service is centrally-hosted, customized view and search algorithm. A library gets a search box and a custom URL. Standard search algorithm is ‘tweaked’ to present local items first. Local holdings displayed in record.
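To make the "local items first" idea concrete, here is a minimal sketch. It is not OCLC's algorithm, which the talk did not describe in detail. It assumes each result carries a relevance score and a set of holding-library symbols; the `Record` fields, the `rank_local_first` function, and the symbols `LOCAL` and `OTHER` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    title: str
    relevance: float                             # hypothetical relevance score
    holdings: set = field(default_factory=set)   # hypothetical library symbols


def rank_local_first(records, local_symbol):
    """Put locally held records first; order by relevance within each group."""
    return sorted(
        records,
        # False sorts before True, so locally held records come first.
        key=lambda r: (local_symbol not in r.holdings, -r.relevance),
    )


results = [
    Record("Global title A", 0.95, {"OTHER"}),
    Record("Local title B", 0.80, {"LOCAL", "OTHER"}),
    Record("Local title C", 0.90, {"LOCAL"}),
]
ranked = rank_local_first(results, "LOCAL")
print([r.title for r in ranked])
```

A two-part sort key is only one way to do this; a real system might instead boost the relevance score of local items so that a very strong non-local match can still outrank a weak local one.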
OCLC learning it’s a different thing to design for librarians than for customers. Learning a lot about customers.
What’s searched in WCL? WorldCat, metadata of 33 million articles, local repositories as indexed in WorldCat. Object is to bring in good enough data from OCLC sources that libraries can replace their federated search engine. Also indexing local repositories.
WorldCat Local fulfillment requirements: interoperate with local management systems and with local delivery services. Pilot partners: University of Washington, Peninsula Library System, State of Illinois libraries, Ohio State University (12/2007), University of California System Melvyl pilot (spring 2008).
Upcoming features:
Institution search
Identities integration
Big challenge for OCLC — balancing local needs with global needs; local record vs. master record. User wants continuity, systems don’t provide it.
There may be an OpenURL resolver on the way; some clients are asking for it.
Q: Is inclusion of Open Access journals considered?
A: Yes — open access books, archival materials, ejournals. Lots coming over next two years.

NGC: Next Generation Catalog
Andrew Pace

Our patrons are already “next generation”; it’s our systems that aren’t. Quick demo of Endeca: faceted browsing, shelf browsing, etc. Why do Endeca? Unresponsive vendors; early experiments in NGC; casual conversation with Endeca; formal conversation with Endeca (2/2005-6/2005); fast implementation (7/2005-1/2006).
What’s the big picture? Improve quality of catalog, exploit data already in the catalog. Build a more flexible catalog tool that can be integrated with future tools not yet invented.
Why do Endeca? Facets were a nice byproduct, but relevance ranking was the target. There’s little in the literature about relevance ranking for bibliographic surrogates. Improved response time, enhanced natural language searching, and true browsing. Automatic word stemming (for certain words).
Sits on top of library catalog system. Daily data load from catalog. Used to improve the discovery process.
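The faceted browsing mentioned above can be illustrated with a small sketch. This is not Endeca's implementation; it only shows the general idea of counting facet values over a result set and narrowing by one of them. The record layout and the `facet_counts` and `refine` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical records, as they might look after a daily load from the catalog.
records = [
    {"title": "American Revolution: A History", "format": "Book",
     "subjects": ["United States", "History"]},
    {"title": "Revolutionary War Maps", "format": "Map",
     "subjects": ["United States", "Maps"]},
    {"title": "Letters from the Revolution", "format": "Book",
     "subjects": ["History", "Correspondence"]},
]


def _values(record, field):
    """Return a record's values for a field as a list (single strings are wrapped)."""
    raw = record.get(field, [])
    return [raw] if isinstance(raw, str) else list(raw)


def facet_counts(records, field):
    """Count how many records carry each value of the given facet field."""
    counts = Counter()
    for record in records:
        counts.update(set(_values(record, field)))
    return counts


def refine(records, field, value):
    """Keep only the records that have the chosen facet value."""
    return [r for r in records if value in _values(r, field)]


format_counts = facet_counts(records, "format")
history_only = refine(records, "subjects", "History")
print(format_counts.most_common())
print([r["title"] for r in history_only])
```

Each refinement returns a smaller result set, so the facet counts can be recomputed and shown again after every click.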

Data and analysis

From July 06 to Jan 07… 67% of users do search. 20% do browse. 8% do pure navigation (through LCSH headings).
26% of navigation is by subject topics — people are refining their searches by subject.
See Lown & Hemminger (2007) for a detailed transaction log display.
The “revolutionary war” problem. A search in catalog gives you LCSH subject headings. U.S. revolution gets 10 pages of subject records. In Endeca, working on this. Do you get the top n subjects in browse?
Expanding scope to 10 million records in the Research Triangle libraries.
Emily Lynema and Tito Sierra — a web service on Endeca that allows access to the catalog. Yields RSS new book feeds. Enables mobile device searching. New books wall w/jacket images. Resource lists for embedding in other web pages with web services.
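As a rough sketch of how catalog search results could be turned into a new-books RSS feed, the example below builds a minimal RSS 2.0 document with Python's standard library. It is not the NCSU web service; the `build_new_books_feed` function, the book data, and the `example.org` URLs are placeholders for illustration.

```python
import xml.etree.ElementTree as ET


def build_new_books_feed(books, feed_title, feed_link):
    """Build a minimal RSS 2.0 feed (as a string) from a list of book dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Recently added titles"
    for book in books:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = book["title"]
        ET.SubElement(item, "link").text = book["url"]
    return ET.tostring(rss, encoding="unicode")


# Placeholder data standing in for results returned by a catalog web service.
new_books = [
    {"title": "Example Title One", "url": "https://example.org/record/1"},
    {"title": "Example Title Two", "url": "https://example.org/record/2"},
]
feed_xml = build_new_books_feed(
    new_books, "New Books", "https://example.org/newbooks"
)
print(feed_xml)
```

The same list of results could just as easily feed a mobile search page or an embedded resource list; the point is that one web service output serves several front ends.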


Q: Students when faced with too many options don’t learn the best way to do something.
A: It’s more important that they get what they want at the destination; entry path not so important.
Q: Endeca is “next generation OPAC”; what about next-generation catalog — describing information?
A: NCSU hasn’t done anything yet to change its cataloging practices; what they’ve done is exposed all that work so that it is accessible to users.

eXtensible Catalog
Judy Briden

The eXtensible Catalog (XC) is a project to design and build a system that gives libraries an alternative way to reveal their collections and to integrate library content into other systems. It will be open source and collaborative. Customizable locally.
XC will have a UI with faceted browsing. Locally customizable without significant programming skills. Interface customizable. Multiple metadata schemas (MARC, DC, etc.). Informed by user research.
Two phases to project.
1) One-year grant to write a plan. Completed in summer 2007. Proof of concept prototype, C4, that displays the basic UI that will be bundled with XC. Uses Lucene as search engine. Interesting feature: from an article search, clicking a link (generated from MetaLib) takes the user straight to the full text rather than to the OpenURL screen.
2) Just funded — starting the project.
XC can be used as a new interface to an existing single repository — or integrate multiple repositories (at the interface level).
XC will address the needs of many libraries and be flexible, extensible — anyone can contribute.


Q: What open source license will XC be released under?

Next Generation Catalog: the Minnesota Report
Janet Arth

In March 2006, Ex Libris demoed Primo prototype to UMn and others. They were looking for development partners. UMn became one of those partners. Bibliographic data are extracted from catalog and put into Primo.
Usability was in the contract between UMn and Ex Libris. Minnesota did studies. They have access to an amazing usability lab at Minnesota.
Three usability rounds.

  1. First used proof-of-concept version (completely canned search results).
  2. Second used demo site with live, but anonymized, data.
  3. Third used live test site.

Most users actually use drop-down boxes to narrow their search (item type, with/without keywords, location); very few typed a word and hit search without narrowing it.
In usability debriefing, asked about tags (a part of Primo). Users saw tags as way that future users could see what past users had thought. None thought they would use tags. Few in study actually used tags. Useful as a discovery tool — way to expand search. But not strong support for tagging. Almost universally viewed as something others would use, not selves.


Q: Are you happy with Primo?
A (Arth): Mostly yes; but realistically, we didn’t have money to explore other tools the same way.
Q: Has University of Washington looked at how many people are using WorldCat Local vs. the native catalog?
A (Nilges): Not sure what ‘take rate’ is.
Q: Is there a web service interface to WorldCat Local?
A (Nilges): ISBN, yes — but not extensive yet. Coming soon.
Q: Preference for WorldCat Local vs. native catalog?
A: In academic libraries, tendency toward WorldCat Local. In publics, the other way. Perhaps this reflects a difference between what’s generally a system of libraries (academic) vs. a single library (public)?
Q: To what extent have we “bridged the gap” with these projects? Are we doing enough to get people to start their search at the library, or is this not even a goal?
A (Briden): Our content needs to be where students are doing their work; we can’t change their behaviors. Library fits in their thinking, it’s just not the first thing. It should be *one* of the first things, though.
A (Nilges): Ditto. Need to build interfaces that allow your services to be everywhere.
A (Pace): We need to avoid self-fulfilling prophecy. We need to make our catalogs useful, entertaining, helpful — so when people do get there, they like the experience and find it of benefit. Make catalog “sticky”.
Q: Does the underlying catalog data need to change to continue making improvements?
A (Arth): We have good data. Challenges lie in merging it.
A (Nilges): Separating inventory management and finding; pulling other data in with the cataloging. Not clear to what extent the data need to be unified; perhaps only connected.
A (Briden): Opportunity to bring tags into collaboration with subject headings; use tags synergistically. Catalogers have opportunity to work with user-generated data. Pull it together in ways that will make more sense.