- In the Library with the Lead Pipe - http://www.inthelibrarywiththeleadpipe.org -
We’re Gonna Geek This Mother Out
Posted By Ross Singer On August 5, 2009 @ 11:36 pm In Uncategorized
I am not much of a book reader. I have a home computer. It has a working internet connection. Any interest I have in genealogy or local history could probably be exceeded serendipitously by talking to family or neighbors and by wandering around the city. As a family, we do not watch many movies. I cannot seem to pay attention to audiobooks. Our taxes are complicated enough that I use software to figure them out.
What I am saying is that I am not the target market of public libraries. Despite that, I am completely intrigued by them.
I worked for many years as a technologist in academic libraries. They were all large research institutions with big collections, budgets and staff. I was not the target market for them either (at least not after I graduated), but I understood the principal demographics of their constituencies and their expectations of the library. I witnessed the shift in the academic library from book depository to IT shop (whether or not all academic librarians agree on this assessment). When the university library stopped being “the place where the books are” it began to lose some of its identity and many began trying to create “social spaces” within the library (presentation rooms, coffee shops, information commons, etc.). The primary purpose of these endeavors, however, seemed to be to help market the library as the information hub of the institution. Since the information is available mainly via technology, and the technology makes the information fungible, it became necessary to reinforce the library’s importance to the community.
If this seems complicated, well, that is because it is. The future of the academic library is in little jeopardy, really, because its role and utility within the larger organization is pretty well defined and not easily replicated by some other group or service.
This is not to say (by any stretch of the imagination) that academic libraries are satisfactorily meeting the needs of their “customers”. Not by a long shot. However, if the academic library is increasingly becoming a technology organization then many of the academic library’s problems are technological problems, theoretically with technological solutions. The majority of these problems are at the intersection of “the way we have always done things” and “where do we go from here”. That is, these are technology problems that are enveloped in a sticky skein of sociopolitical issues. If the interminable committee meetings were ever to wear down that outer skin, it might just be possible to make some real progress. The library technologist’s hope springs eternal.
The public library, on the other hand, appears to be roughly the inverse. It is primarily a social service that has clumsily tacked technology designed for academic libraries onto the top. Any argument for the merits of library applications pretty much breaks down when applied to the public library. The audience is different and their needs are different. While, without a doubt, enabling research is within the scope of the public library, in reality the vast majority of transactions there are far more modest. The public that the library serves, largely underwhelmed by our complicated bibliographic search tools, instead uses Amazon.com, a technology company that has quite cleverly tapped into social activities (lists, “people who bought x also bought y,” etc.) to pitch its products. Most importantly, the packaging is slick and effortless.
It is a shame, especially considering how interesting, fun, and rewarding the projects would be, how poorly public libraries seem to be able to execute their technology. Not that technology is ignored; indeed, my local library employs a vast array of applications to try to aid its users. But this comes across as a hodge-podge: many different interfaces, none of them terribly satisfying and not in sync with each other. This is not exclusive to my library, of course, nor is it uncommon in an academic library setting. It would also be fairly easily remedied by some technical expertise, cooperation, and a little bit of vision.
Whereas the academic library is likely not in jeopardy, the public library is subject to far more fickle decision makers. If the primary benefactors of the service, middle class tax payers, see no benefits resulting from the library’s existence, it may find itself subject to political pressure. This is a population that, on the whole, is pretty wowed by style and convenience which tend not to be libraries’ strong suit.
My wife, Selena, is a steadfast supporter of the public library. She had an awakening about five years ago after shelling out tons of money to Amazon for books she read only once.
“Oh my God, they’ve got all these books. For free!”
At this point, we had been married for about four years and I had been working in a library for about ten. It is not like I hadn’t brought up the possibility of the public library before, but I had no defense for her complaints about the catalog’s interface (SirsiDynix’s iBistro). It took her requiring her high school students to get a library card to see the merits of the library and ever since she has been a devoted advocate.
That does not mean that she still doesn’t have her complaints. They are legitimate gripes and, thankfully, almost completely technical. Her issues are:
The third point is not exactly technical, I realize, but it has an effect on the library as a whole. This will not stop me from offering a technical suggestion that might help.
There are several opportunities in the library for serendipitous discovery: the children’s book room, the returned book cart, the new books shelf, maybe a staff picks list. It has not, historically, been the forte of the library catalog. One of Amazon’s many strengths lies in its recommendations and groupings. Simply by being me and doing what I do, Amazon finds and presents me with things I might be interested in based on how other people with profiles like mine shop. While the recommendations are generally hit and miss, they have made me aware of many things (especially music) that I would have had no way of discovering before.
There are various reasons that the library is reluctant to start creating profile based services for its borrowers: USA PATRIOT act style privacy concerns, population sizes that are too small to produce meaningful recommendations, etc. U.S. libraries should pay close attention to the UK’s MOSAIC project, however, a JISC-funded initiative to harvest and mine circulation data with the intention of providing recommendations based on borrower usage. Assuming concerns surrounding the differences in privacy rights can be met, this could really begin to pave the way forward for such services.
In the meantime, there are tangible ways to provide less targeted, although still meaningful, recommendations: best seller, award, and book club lists. Best sellers lists are, of course, a very rough metric of what is currently popular across America and, in the case of some lists, targeted at particular demographics (the New York Times, Essence Magazine, Evangelical Christian Publishers Association, Powell’s Bookstore, Independent Mystery Booksellers Association, etc.). While best sellers lists give no context of the actual content (outside, possibly, of fiction or non-fiction) and certainly are no barometer of the quality of the work, they do at least provide a list of books that are currently popular, which might be all the discovery some users need, especially the specialty lists.
When I checked my local public library for access to their collection based on best sellers lists, I was rather surprised to find that they did not have any. This seemed so simple and such an easy win for them, that I thought I would mock something up so they could use it. The New York Times has opened their best sellers lists and book and movie reviews through their API service making it very simple to create a set of interfaces based on their best sellers to the library catalog. Unfortunately, this proved to be harder than I originally had hoped because the library catalog has no machine readable interface. There is no easy way to provide mashups to my library. This is not terribly surprising; the same was true for Atlanta-Fulton County Public Library’s catalog. This is a terrible shame. If the library is unable to provide the resources to create interesting and vibrant technological services, they really should do everything in their power to facilitate these services being created by members of their community. This is exactly how Ann Arbor District Library cultivated its “Super Patron”, Ed Vielmetti. Ironically, the AADL already had a strong technological base and probably needed to depend less on their constituents than other libraries. I suppose this stands to reason, though, what with the rich getting richer and whatnot.
The issue of completely closed systems resonates with me especially hard. For the last two years, I have been working on a project to build a specification to provide access to library data, via the Atom Publishing Protocol, called Jangle. I was my employer’s representative to work on the Digital Library Federation’s Integrated Library System and Discovery Interface API. This year, my work has primarily been split between trying to herd Jangle along and trying to find opportunities to expose library data and services as Linked Data. I also wrote a book chapter on possible ways to make your library data more accessible for mashing up. Sadly, all of this is an exercise in futility if libraries have no machine readable accessible means to provide their data. This lack of openness is a major setback to libraries and the potential services they can offer their users.
While I was trying to figure out a new plan of attack for implementing something like this, I did find that best sellers lists are not uncommon in public libraries; a cursory scan found them at Atlanta-Fulton Public Library, Knox County (TN) Public Library and the Nashville Public Library among several others. The AFPL and Knox County PL both had them integrated directly into their OPACs. Both use SirsiDynix powered OPACs: the AFPL uses iBistro and Knox County uses Rooms. Nashville Public Library uses BookSite.com, a third-party service that compiles lists and tries to emulate the look and feel of the original library website.
They all suck.
The problem with BookSite.com is that it, apparently, has no way to check the host library to see if the selection is even in the collection, much less if it is available or when it will be. This requires the user to click on the link, initiate a catalog session, see if the item exists, check the availability, click the back button, find the next item of interest, click on the link, enter the catalog, etc. While this may not seem terrible, every link they follow to an item that does not exist or is not available diminishes their confidence that they will ever find something available. Let us not forget, also, that our OPACs tend to be horribly slow at initiating or reallocating sessions. All of this just adds to a frustrating user experience. This is another example of where a lack of APIs hamstrings third-party developers: despite the intentions of the library to provide a better experience by purchasing subscriptions to products like BookSite, the end result is still awkward.
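To make the missing piece concrete, here is a minimal sketch of what a list page could do if the catalog exposed holdings in a machine readable form. Everything in it is an assumption for illustration: the `Holding` record, the `annotate` function, the placeholder ISBN keys, and the copy and hold counts are all made up, and no real catalog or BookSite interface is being described.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    """Hypothetical summary of what the library holds for one title."""
    copies: int      # total copies owned
    available: int   # copies on the shelf right now
    holds: int       # borrowers currently waiting


def annotate(best_sellers, holdings):
    """Attach a status line to each list entry so nobody has to click through."""
    rows = []
    for book in best_sellers:
        holding = holdings.get(book["isbn"])
        if holding is None:
            status = "not in collection"
        elif holding.available > 0:
            status = f"{holding.available} of {holding.copies} copies available"
        else:
            status = f"all {holding.copies} copies out, {holding.holds} holds"
        rows.append({**book, "status": status})
    return rows


# Illustrative data only: the ISBN keys are placeholders and the counts are invented.
best_sellers = [
    {"isbn": "isbn-0001", "title": "Best Friends Forever"},
    {"isbn": "isbn-0002", "title": "Olive Kitteridge"},
    {"isbn": "isbn-0003", "title": "My Sister's Keeper"},
]
holdings = {
    "isbn-0001": Holding(copies=12, available=0, holds=40),
    "isbn-0002": Holding(copies=4, available=1, holds=0),
}

for row in annotate(best_sellers, holdings):
    print(f"{row['title']}: {row['status']}")
```

The point of the sketch is only that the join itself is trivial once the data is reachable; the hard part is that the catalog offers nothing to join against.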
One would then think that incorporating these lists directly into the OPAC would be an improvement. Unfortunately, this is not really the case. While item availability is shown (assuming the item is even held), the display is just an ugly, OPAC title list view. Understandably, practically any title that appears on a best sellers list is more than likely going to be checked out (and will probably have a wait). From a user’s perspective, though, this offers very little as a “discovery interface.”
Quite a few of the entries were fairly misleading as well. They provided hope for the user that the title might actually be available, but required going to the full title screen (similar to BookSite.com), only to see that all of the copies are, in fact, unavailable; they just have some status set that the OPAC cannot recognize as “available” or “unavailable.”
What is unacceptable here is that the poor user is presented with a list of 15 dead ends. If the library is unable to provide any of these particular titles, what can it offer the borrower that might be related or relevant? Each of these books represents a possible avenue of interest into the collection. They also define a particular point of interest in the collective national consciousness that can be utilized to present other works held by the library that may not be new, but could be just as much of interest to the user. The “traditional” library avenues of providing similarity tend to be fairly weak substitutes when it comes to this. Dewey Decimal Classification (common to the majority of public libraries), which provides the “shelf browse,” is completely ineffective in the case of fiction works for anything other than finding other titles by the same author or another writer with the same last name. Browsing on subject headings is also a rather blunt tool. “”, “Female friendship Fiction.”, “African American women Fiction.”. None of these, individually, captures the essence of why a particular book is on a particular best sellers list. The MARC 65x field is unable to capture timbre. And this is a huge area where the public library is failing the public.
There are products and projects that begin to address this disparity between what the casual user wants and expects and how the library catalog has evolved (or not) for the web. BiblioCommons’ business model is to provide this social context layer over the collection by facilitating and aggregating circulation data, reviews, lists, and other means to allow library users to directly influence the relationships between works. SOPAC could be considered an open source alternative to BiblioCommons; it is a suite of components featuring a public interface built atop the popular FLOSS content management system Drupal. One of the pieces, Insurge, is intended to provide a means to share this social data between the various implementations: reviews, ratings, and recommendations. The design of Insurge theoretically allows it to work independently of SOPAC, the Drupal module, although, in practice, this has yet to happen. Both of these are complete OPAC replacements, relegating the integrated library management system to its rightful place as an inventory control system.
At the other end of the spectrum is LibraryThing for Libraries, which takes the incredibly pragmatic approach of integrating into the existing vendor-supplied OPAC interface. Like the other two, it leverages the much broader LibraryThing community to help enhance the local collection. Of the three currently available options, it, by far, provides the richest and most comprehensive social enrichment because the community already exists. The others have to build this community and the content from scratch. One has to wonder, really, how Syndetics has sold a single subscription since LTfL was released: LibraryThing gives everything a Syndetics subscription could, plus gives the user relevant alternatives from their own library’s collection.
That being said, LTfL also shares the same limitation as Syndetics (or any other “shoehorned in the OPAC” enrichment package): the OPAC is still there. This content, these tags, the ratings: none of these are available to the searcher until she has already found something. Queries do not include this community supplied content, there is no spellcheck, results cannot be sorted by rating. If public libraries are to stay relevant, these interfaces have to be dropped. The future of the ILMS itself is a different matter entirely, though its usefulness as an inventory control system is out of scope here. This is just about the OPAC.
I strongly believe that the future of the public library collection interface has to be tied into some kind of content management system. I am unable to find any hard statistics to back this up, but I do not think it is much of a stretch of the imagination to say that a vast amount of library circulation is casual, popular reading. Just walk into any branch and browse the collection; the overwhelming majority is not research material. While certainly there are lots of archival, local history, reference, and research items at any public library, can any one of them, honestly, say that these types of activity make up the majority of what cardholders want, need, or expect to do there? Why, then, are the interfaces optimized to perform these tasks, arguably, at the expense of the majority? Of course, sophisticated information retrieval still needs to be supported—the line between “hobby” and “research” can be blurry—but perhaps it does not need to be the primary function of the public interface. The social nature of the library as place and collection need to be merged.
The concept of CMS as OPAC is not new or original (or exclusively useful to public libraries): as previously mentioned, SOPAC is a Drupal module, as is the Mellon Foundation-funded eXtensibleCatalog (XC) project. Scriblio is a plugin for the WordPress blogging platform. Several years ago, I was working on a project to build a catalog using the Daisy CMS as a back end. Even SirsiDynix’s Rooms was an attempt to merge the content and collection, albeit with the aesthetic of a traditional web OPAC, the speed of a federated search engine and the general user experience of a root canal. At a certain point, a library collection grows to a size at which it cannot feasibly be kept dynamic and fresh with the catalogers as the sole editors of the content. There is a growing need for “marginalia,” independent of the MARC record, to tie the individual items within the library to each other, to events, to groups, to anything. The separation between the “catalog” and the general information about the library makes no sense.
Besides the integration of general content, collection, and public contribution, the single most important improvement needed for the public interface is search. It is amazing and somewhat appalling how, despite our claims that our systems are designed as being highly advanced information retrieval tools, they fail utterly at retrieving information. My local public library recently deployed the federated search product WebFeat, undoubtedly in a well intentioned attempt to help their users navigate the various silos of information that inconveniently require searching individually: the catalog, the audiobooks, the photograph collection, and the various databases they subscribe to. It is also, by the gentlest assessment possible, a complete train wreck of a user experience. Besides being slower than the stock catalog interface, it does a terrible job at searching. It is understandable that the library would want to highlight and improve access to their database collection (as well as have a unified search interface for their “general collection”), but it does not seem likely that a borrower looking for something by Nora Roberts to take with them to the beach cares much about results from InfoTrac OneFile. Requiring said borrower to enter their library card number before they can search just lessens the experience even more.
Another requirement the library places on the searcher, that they must be an excellent or informed speller, is also unfortunate. As I try out these interfaces, there are two searches I try so I can see how effective they are in aiding the hapless searcher. The searches are “Olive Kitteredge” and “Jody Picoult.” It is depressing how unhelpful our search interfaces are.
For “Olive Kitteredge,” an understandable misspelling of Olive Kitteridge, the Pulitzer Prize winning best selling book, I got:
“Jody Picoult” seems a perfectly reasonable misspelling of the multiple best selling novelist and author of My Sister’s Keeper, which was recently adapted to film. In the same order:
These are not edge cases. These are searches for current best sellers and a Pulitzer Prize winner and both of them are only off by one letter. Of the sixteen searches, eleven of them ended in failure. While not comprehensive, these were eight libraries chosen mostly at random. For all of the current fixation on faceted and graphical search results (and to be fair, Queens Borough Public Library’s AquaBrowser implementation passed the Picoult test and provided “kitteridge” in its similarity graph), none of these bells and whistles matter one whit if the search interface cannot even help the user past the search screen. Amazon not only presented the correct “did you mean” suggestions, it also provided relevant search results with these bad searches.
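A “did you mean” suggestion for one-letter misspellings like these does not require anything exotic. The following is a minimal sketch, assuming the catalog can supply a list of known titles and author names; it uses `difflib.get_close_matches` from the Python standard library, which is a generic string similarity measure and not what Amazon or any library vendor actually uses. The `did_you_mean` function, the `KNOWN_TERMS` list, and the 0.8 cutoff are illustrative choices.

```python
import difflib

# Stand-in for titles and author names pulled from the catalog (assumed data).
KNOWN_TERMS = ["Olive Kitteridge", "Jodi Picoult", "Jennifer Weiner", "Nora Roberts"]


def did_you_mean(query, known_terms, cutoff=0.8):
    """Return the closest known term to the query, or None if nothing is close."""
    # Compare case-insensitively, but hand back the term in its original casing.
    lookup = {term.lower(): term for term in known_terms}
    matches = difflib.get_close_matches(
        query.lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


print(did_you_mean("Olive Kitteredge", KNOWN_TERMS))
print(did_you_mean("Jody Picoult", KNOWN_TERMS))
```

A real catalog would have far more terms than this and would want an index built for the purpose, but the two test searches above are well within reach of a few lines of standard library code.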
Of course, correcting a search for “Jennifer Wiener” to Jennifer Weiner is irrelevant if the book the borrower is interested in will not be available for 89 days, as the Knox County Public Library was displaying last week for Best Friends Forever (as of this writing, the New York Times #1 Best Seller for Hardcover Fiction). That is nearly three months. Forget summer reading, you will be lucky to get this book before the winter solstice. While I am normally extremely supportive of large, cooperative borrowing consortiums, such as Georgia’s PINES, the advantages of such a system, regardless of the size and scale, still completely break down when it comes to such enormous spikes of popularity. It does not matter how many copies are in the system if everywhere from metropolises to backwaters has a run on the same title. This is not exclusive to best sellers, of course; consider titles on school curricula or summer reading lists. Backlogs are bad for credibility.
Popularity, however, is fleeting. It is unreasonable for an underfunded library system to exhaust its limited collection development budget purchasing dozens of copies of the new hot thing which tomorrow may not circulate again ever (consider James Frey’s A Million Little Pieces). For cases such as this, rather than borrowing from other libraries that have nothing to give, it makes more sense to borrow from the public. Many of these most popular titles are best sellers, after all, and “best seller” by its very meaning implies that a lot of people own that book. Once read and passed around to your circle of friends, what do you do with this book? For these very popular, highly circulating titles, it makes sense to create a system that allows book owners in the community to donate their copy. Once a particular title passes some predefined threshold (two holds for every copy, as an arbitrary example), provide a link in the best sellers list to encourage people to give the library their copy. Links to this page would need to be present elsewhere, too: after all, the person that owns the book wouldn’t be looking for it on the library website since they already own it. Advertise on the library website. Have an announcement on the local NPR affiliate. Post the list of books the library wants to have donated near drop boxes.
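The threshold rule is simple enough to state in a few lines. This is a minimal sketch, assuming the hold and copy counts are available for each title; the function name `needs_donation_appeal` and its parameters are made up for illustration, and the default of two holds per copy is the same arbitrary example used above. The sketch treats reaching the threshold as enough to trigger the appeal.

```python
def needs_donation_appeal(holds, copies, holds_per_copy=2.0):
    """Decide whether a title should carry a 'donate your copy' link.

    The appeal is shown once the hold queue reaches `holds_per_copy`
    holds for every copy the library owns.
    """
    if copies <= 0:
        # The library owns no copies: any waiting borrower justifies the appeal.
        return holds > 0
    return holds / copies >= holds_per_copy


# Illustrative numbers only.
print(needs_donation_appeal(holds=40, copies=12))  # queue is well past 2 per copy
print(needs_donation_appeal(holds=3, copies=4))    # queue is short
```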
The donor would be given a tax write off based on the value of their book on the open market. When the popularity spike diminishes, the library could either return the book to the original owner or, perhaps, register itself as an Amazon affiliate (as an example, I am not sure of the legalities or practicalities of this, nor is this an endorsement for Amazon.com) and sell the used copies with the proceeds going back to the library like any friends of the library book sale. The tax write off (as well as satisfaction of performing a public good) would probably be more desirable to many potential donors than going through the process of selling the book themselves.
What all of this points to is that public libraries need to place as high an importance on technology as they do on the social and physical aspects of their organization. A lot of effort goes into speaker series, story time, game nights, and movie nights. A lot of planning. A lot of investment. If that investment is not given, nobody will come to them. The web presence is no different. If the web tools are an afterthought, a haphazard, sloppy collection of off-the-shelf tools that neither help the user achieve their goals nor capture their interest, the public will write the library off. Just as a speaker series is a combination public service and library marketing tool, the web site must be more so, as it is more public than any event.
At the same time, the library should not have to break the bank investing in the most cutting edge and expensive technology (or worse, break the bank with the run of the mill, dreadful applications currently pitched to them). Many of these issues could quite easily be addressed simply by hiring a competent and creative developer. By pooling these development resources, even more ambitious accomplishments can be achieved. Georgia PINES (despite OCLC’s marketing department’s claims) built the first truly “web scale” ILMS simply because they had a need and were willing to devote the resources towards building it. Joe Lucia, the University Librarian at Villanova, made an intriguing and provocative statement on the NGC4LIB mailing list two years ago with this:
“What if, in the U.S., 50 ARL libraries, 20 large public libraries, 20 medium-sized academic libraries, and 20 Oberlin group libraries anted up one full-time technology position for collaborative open source development. That’s 110 developers working on library applications with robust, quickly-implemented current Web technology…. Instead of being technology followers, I venture to say that libraries might once again become leaders….”
He was speaking in this case of academic libraries (he mentions 20 public libraries, but I remain unconvinced that the average public library has all that much in common with its academic counterpart), but it is not too difficult to imagine this in the context of public libraries. There are, after all, nearly three times as many public library systems in the United States as there are academic libraries. Surely, collectively, they could figure out how to fund such an endeavor to provide a truly powerful development team committed solely to the technology needs of public libraries.
If added to this was an infrastructure and environment that cultivated an opportunity to harvest the contributions of “super patrons” and “citizen developers,” as well as graphic designers, usability and accessibility experts, entire services could be provided by the constituency just as BiblioCommons, LibraryThing, or SOPAC solicits content. One of the many distractions I had while writing this article came from a desire that I had to not just complain about my public library, but actually build some alternatives that could be contributed back to them. However, as I mentioned previously, there is no machine readable access to their collection for me to build upon. In order to write something interesting and, hopefully, useful, I first had to write a crawler to harvest their catalog. I have yet to gain the nerve to actually run it; there is no robots.txt file, but it still seems rude and underhanded. It is also ridiculous that I have to resort to such tactics just to sketch out some proofs-of-concept.
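For what it is worth, a crawler can at least be made polite. The following is a minimal sketch of the permission and pacing check, not the crawler described above: it uses `urllib.robotparser` from the Python standard library, treats a missing robots.txt as an empty one, and falls back to a conservative delay between requests. The `crawl_policy` function, the `sketch-bot` user agent, the example URLs, and the ten second default are all assumptions for illustration.

```python
from urllib import robotparser


def crawl_policy(robots_txt, user_agent, url, default_delay=10.0):
    """Return (allowed, delay_seconds) for fetching `url` as `user_agent`.

    `robots_txt` is the text of the site's robots.txt, or an empty string
    when the site does not publish one.
    """
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(user_agent, url)
    # Honor a Crawl-delay directive if present, otherwise wait the default.
    delay = parser.crawl_delay(user_agent)
    seconds = float(delay) if delay is not None else default_delay
    return allowed, seconds


# No robots.txt at all: fetching is permitted, so the only restraint is our own.
print(crawl_policy("", "sketch-bot", "http://catalog.example.org/record/1"))
```

An empty robots.txt permits everything, which is exactly why it settles nothing about manners: the delay, the hours the crawler runs, and whether to ask first are still up to the person running it.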
If all three tiers of this ecosystem were to become a reality (cooperative development team, local developer resources, and a public contribution network), the library would be well-placed to remain relevant for many years in the community’s consciousness. It is difficult to see if the initiative or vision is available to establish such an environment, however. Significant improvement would be rather easy to accomplish. All it would take is a little imagination and some commitment.
Maybe I should just start my crawler and see what happens.
Thanks to: Brett Bonfield for not only convincing me to write this article, but also tirelessly reviewing it and for guiding this along even when I was getting flaky. Also thanks to Dan Chudnov for reviewing it and helping me find a better focus, even if he agreed with only about half of what I wrote. Lastly, I’d like to thank my wife, Selena, without whom I would have had no inspiration, ideas, or “research subjects.”
Article printed from In the Library with the Lead Pipe: http://www.inthelibrarywiththeleadpipe.org
URL to article: http://www.inthelibrarywiththeleadpipe.org/2009/were-gonna-geek-this-mother-out/
This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.