With research scholars finding library catalogues difficult to use, web discovery tools are becoming popular. However, they come with many caveats.

Google Scholar and Wikipedia continue to be favourite destinations for online searching, even in the academic world. In its 2012 survey of the top 10 trends in academic and research libraries, the Association of College and Research Libraries of the USA reported that a majority of users considered library catalogues “difficult to use” and often the “last resort” for locating scholarly information. Innumerable information literacy sessions and research studies dismissing the proficiency of Google and Google Scholar have been ineffective in dissuading the academic community from relying on these popular search engines.

When Google was launched, many libraries started redesigning their websites to give their users a Google-like experience and so attract the user community. Positive experiences with Google raised users' expectations: they came to prefer a single access point that could search multiple resources simultaneously.

In the early 2000s, this led to the invention of the federated search engine, also known as the metasearch engine, which provided a single platform to select and search multiple databases at the same time and return a single set of results. Many libraries implemented this technology, but its technical capabilities in terms of retrieval speed, searching across multiple resources and relevancy ranking soon failed to sustain the interest of the user community.
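
The idea can be illustrated with a minimal sketch in Python. It is not any vendor's implementation: the two "databases" are hypothetical in-memory lists standing in for remote services, and the function names are invented for this example. The point is that the query is sent to every source at search time and the answers are merged afterwards, so the user waits for the slowest source.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote databases; a real federated engine
# would send the query over the network to each provider's own search service.
SOURCES = {
    "database_a": ["Library catalogues and users", "Metadata standards"],
    "database_b": ["Discovery tools in libraries", "Library catalogues and users"],
}

def search_source(name, query):
    """Search one source at query time and tag each hit with its origin."""
    terms = query.lower().split()
    return [(title, name) for title in SOURCES[name]
            if all(term in title.lower() for term in terms)]

def federated_search(query):
    """Broadcast the query to every source, then merge into one result set."""
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda name: search_source(name, query), SOURCES))
    merged = {}
    for batch in batches:
        for title, name in batch:
            # De-duplicate by title, remembering every source that returned it.
            merged.setdefault(title, []).append(name)
    return merged
```

Because nothing is indexed in advance, every search repeats this round trip to each source, which is one plausible reason for the speed problems described above.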

The launch of Google Scholar in 2004 and its popularity forced many libraries to embed the Google Scholar search box in their websites, though with a caveat that users should evaluate information before using it for academic work. The popularity of Google Scholar and the failure of federated search engines around the same time led to demand for a new searching technology that could compete with Google Scholar in both scope and speed.

It is difficult to find out how much content Google Scholar indexes, and it does not provide any list of the resources from which it retrieves results. Though it claims to index both subscribed and open access content, and to rank results the way researchers do when retrieving scholarly information, its relevancy algorithm is unknown to the academic world. It does not even define the term ‘scholarly’, and its results may include non-peer-reviewed content.

In 2008, Serials Solutions, an online content provider owned by ProQuest, developed a single search interface with the idea of providing a Google Scholar-like searching experience over its own proprietary content. It launched ‘Summon’, a web discovery platform (such platforms are also known as next generation catalogues), to give users a single search interface. There are four leading discovery service providers that have reputed libraries as their customers: ‘Summon’ from Serials Solutions, ‘EBSCO Discovery Service’ from EBSCO, ‘WorldCat Local’ from OCLC and ‘Primo Central’ from Ex Libris.

Web scale discovery services have three components: a central index that compiles metadata from various publishers and content providers at the back end; a discovery or search interface that retrieves results from this index; and a technology that hyperlinks to the full text of the content subscribed to by the libraries implementing these solutions.

One of the significant features distinctive to web scale discovery is its ability to index the local collections of libraries, such as the online catalogue, the institutional repositories and the library's own subscribed full-text electronic content. These disparate objects are merged into the central index, which provides real-time information drawn from multiple sources including newspapers, book reviews, journals, magazines, e-books, databases, indexing and abstracting services and many more.
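
A minimal sketch of such a central index, again in Python, may make the contrast with federated search clearer. The records, field names and functions below are hypothetical and chosen only for illustration; commercial services use far richer metadata. Records from different sources are harvested and indexed in advance, so a search consults one index instead of querying each source.

```python
from collections import defaultdict

# Hypothetical metadata records harvested in advance from different sources.
RECORDS = [
    {"id": 1, "source": "library catalogue", "title": "Information retrieval basics"},
    {"id": 2, "source": "institutional repository", "title": "Thesis on information literacy"},
    {"id": 3, "source": "publisher feed", "title": "Journal article on retrieval speed"},
]

def build_index(records):
    """Map each title word to the ids of the records that contain it."""
    index = defaultdict(set)
    for record in records:
        for word in record["title"].lower().split():
            index[word].add(record["id"])
    return index

def search(index, query):
    """Return ids of records matching every query word, using only the index."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)

# Built once, ahead of any search.
INDEX = build_index(RECORDS)
```

Here `search(INDEX, "information")` finds records from both the catalogue and the repository in a single lookup, without contacting either source at search time.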

The search interface provides multiple searching strategies, allowing the user to refine results with a number of filters, such as subject, peer-reviewed content and full-text availability, and to limit them to library-owned content. The relevancy ranking of results is weighted on factors such as keyword matches with the subject and title of the content, peer-review status, currency and surrogates of local print content.
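
The filtering and weighting described above can be sketched as follows. The weights, the five-year cut-off for currency and the sample records are all assumptions made for this illustration; vendors do not publish their actual relevancy algorithms.

```python
# Hypothetical weights; real services do not disclose theirs.
WEIGHTS = {"title": 3.0, "subject": 2.0, "peer_reviewed": 1.0, "recent": 0.5}

RECORDS = [
    {"title": "Discovery services in libraries", "subjects": ["libraries"],
     "peer_reviewed": True, "year": 2012, "full_text": True},
    {"title": "A history of card catalogues", "subjects": ["discovery"],
     "peer_reviewed": False, "year": 1998, "full_text": True},
    {"title": "Evaluating discovery tools", "subjects": ["discovery", "evaluation"],
     "peer_reviewed": True, "year": 2011, "full_text": False},
]

def matches(record, keyword):
    """A record is a hit if the keyword appears in its title or subjects."""
    return keyword in record["title"].lower() or keyword in record["subjects"]

def score(record, keyword, current_year=2013):
    """Add up the weights of every ranking factor the record satisfies."""
    total = 0.0
    if keyword in record["title"].lower():
        total += WEIGHTS["title"]
    if keyword in record["subjects"]:
        total += WEIGHTS["subject"]
    if record["peer_reviewed"]:
        total += WEIGHTS["peer_reviewed"]
    if current_year - record["year"] <= 5:  # assumed cut-off for "currency"
        total += WEIGHTS["recent"]
    return total

def search(keyword, peer_reviewed_only=False, full_text_only=False):
    """Match, apply the user's filters, then rank by descending score."""
    keyword = keyword.lower()
    hits = [r for r in RECORDS if matches(r, keyword)]
    if peer_reviewed_only:
        hits = [r for r in hits if r["peer_reviewed"]]
    if full_text_only:
        hits = [r for r in hits if r["full_text"]]
    hits.sort(key=lambda r: score(r, keyword), reverse=True)
    return [r["title"] for r in hits]
```

With these sample values, `search("discovery")` ranks the recent, peer-reviewed record that matches on both title and subject first, and the older record that matches on subject only last.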

The other technology, which provides hyperlinks for the citations retrieved, is the ‘Link Resolver’. It uses the OpenURL standard to provide context-sensitive linking between a citation and the electronic full text of the resource cited, for materials to which the user has authorised access.
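
As a rough illustration, an OpenURL packs the citation's metadata into the query string of a link addressed to the library's resolver. The sketch below uses key names from the OpenURL 1.0 (Z39.88-2004) key/encoded-value format for journals; the resolver address and the citation itself are invented for this example.

```python
from urllib.parse import urlencode

# Hypothetical address of a library's link resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

def build_openurl(citation):
    """Encode a journal-article citation as an OpenURL 1.0 query string."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    for field, value in citation.items():
        params["rft." + field] = value  # "rft" marks the item being cited
    return RESOLVER_BASE + "?" + urlencode(params)

# An invented citation, used only to show the shape of the link.
example_url = build_openurl({
    "atitle": "Evaluating discovery tools",
    "jtitle": "Journal of Examples",
    "issn": "1234-5678",
    "volume": "12",
    "spage": "45",
    "date": "2012",
})
```

The resolver receiving such a link would look up the library's subscriptions and redirect the user to a full-text copy they are entitled to read; that lookup is outside the scope of this sketch.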

Libraries can embed a Google-like search box in their websites and other interfaces and brand it as their own. Currently, six libraries in India subscribe to such a service, including the Indian Institute of Management Bangalore (IIMB).

As web scale discovery services are only three years old, there are hardly any major studies evaluating their efficacy. Several challenges need to be overcome: the participation of competing publishers in the compilation of the index, the normalisation of metadata collected from various databases, and the title-by-title integration of content subscribed to by libraries.

The success of web discovery will depend on the extent of participation by publishers and content providers and on the standardisation of practices in the representation of content. Currently, the National Information Standards Organization, USA, is working towards establishing these standards so that web discovery can overcome its limitations and retain the interest of the scholarly community.

(The author is a librarian at the Indian Institute of Management, Bangalore.)