Intermediation and the Digital Library
Paper presented at the 1998 joint international conference of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities, 5-10 July 1998, Lajos Kossuth University, Debrecen, Hungary

Keywords: librarianship as a profession, online searching, end-user searching, subject access

Little Jack Horner
Sat in the corner,
Eating a Christmas pie;
He put in his thumb,
And pulled out a plum,
And said, What a good boy am I!

The Question of Intermediation

Until I thought about it, I never realized how weird Little Jack Horner is. So, he eats with his hands: he's little, he's a kid, that's OK. But it's Christmas and he's off by himself; in a corner, which is usually the place of punishment; he has a whole pie to himself, but no turkey or anything else (maybe it's dessert time and he got tired of the adults at the table?); and he's latched onto a plum and tells us he has done well, which is redundant, because we all know what "plum" implies.

What is really weird is that Little Jack Horner exemplifies an ideal of information acquisition. The pie is a well-defined territory that he is working over, his thumb is an instrument that provides direct and effective access, the plum is the good stuff that he is extracting, and he is happy in his activity: he feels good about himself and his situation. Perhaps best of all, he can do it alone and he knows what he is doing. The self-sufficiency exhibited here is a stage familiar to anyone who has raised a child.

The question to be explored is whether Little Jack Horner can stay in his corner as print resources are supplemented, and replaced, by digital counterparts. Put in other words, and to move into a less analogical mode, does the digital library place its user in a position of increasing dependence on human assistance? Are we entering a stage where even a sophisticated library user will be unable or unlikely to make adequate use of what is available?
Reflection on two decades of experience with transition in public and academic libraries leads me to propose that the answer is "yes".

The Library Profession

Before developing support for this thesis, it seems appropriate to say something about the library profession. It emerged, in the present common understanding, in the latter half of the nineteenth century. Practitioners in this field, as in others like nursing and primary education, have tended to be female, with corresponding status and compensation. There has been a great divide between what are called public services and technical services (epitomized by the activities of reference and cataloguing). Libraries themselves have been classified into four general types: school, public, academic, and special. For a variety of social and technological reasons, these distinctions are becoming less rigid and less definite. Concurrently there is emerging another category which can variously be described as the freelance librarian, contract researcher, or information consultant.

The credential of the librarian, a master's degree in library science, is prevalent in the profession. It does not, however, enjoy the degree of exclusionary power exercised by parallel qualifications in fields like education, accounting, law, medicine, and engineering. In part this difference relates to the considerable but informal role played by apprenticeship in becoming qualified to practice. Also significant is the public funding and service nature of much of the work, which generates a perception that what is provided is "free" and consequently of small value.

Although incorrect or incomplete information can have serious consequences, accountability and liability tend to be far greater in other professions. Credentials, finances, legal status, health, and safety seem much more immediate and more important than information does.
The contract implicit in a client's direct purchase of a service contributes to the importance assigned to results. A corollary is that much of the consultation provided by a librarian is on-demand and real-time, with less allowance for supporting research and a reflective weighing of possibilities and alternatives.

From Print to Digital: One Example

In the latter days of the print era, some twenty years ago, when even the fruits of data processing usually reached the end user as a printed product, getting at specific information (finding the plum on the dinner table) was still largely intuitive and a much more integrated exercise. That intuition was based on the transmission of habits developed over centuries of dealing with print-based information. The user's ability to navigate was in step with the education system's methods and with publishers' conventions.

An illustrative comparison of subject indexes for periodical articles provides some evidence for this assertion. The printed indexes published by the H.W. Wilson Company will be set alongside the databases made available by SilverPlatter.

The indexing practices of Wilson provide coordination and easy crossover among the broad range of subjects covered by their separate indexes. As well, there is considerable correspondence to the general thesaurus of the Library of Congress Subject Headings, which provide subject access to the monograph collections of most academic and public libraries in North America.
Beyond this important general consistency lies much other work done for the user before any search for information is undertaken:

• Careful selection of material to be indexed
• Uniform and good typographic design
• Control of subject terminology
• Precoordination of subject terms
• Repetition of entry under appropriate headings
• Embedding of cross-references

Ironically, this paragon of print did not migrate as rapidly or as smoothly as it might have into electronic delivery, in no small measure because the printed product was so good. And because it was print. Wilson developed its own software as an extension of its primary business of providing information.

SilverPlatter, on the other hand, has emerged as a data retailer, providing a deceptively uniform and relatively simple interface for a wide variety of producers. Databases whose range and depth would not have been economical in the print era now provide users with citations to grey literature that is not available in most libraries and may be unobtainable. This company may be seen as a CD-ROM/LAN successor to centralized connect-charge database vendors like Dialog. The software itself is "free", with revenue included in the purchase price of the databases selected. In some respects, the uniformity it offers simplifies training and user support, reducing the need for intermediation.

There are significant limitations to this last observation, however. One is that the apparent uniformity of the SilverPlatter interface tends to mask significant differences in the underlying data which are (or were) much more apparent in the producers' printed products. Here are three examples:

• Journal of Economic Literature relies heavily on a classified approach to the information; an online approach through a general subject heading can be difficult.
• Psychological Abstracts (PsycLit) has a strong and well-developed thesaurus which is still most easily consulted in its printed form.
The subject terms themselves, and the thesaurus which controls them, can easily remain hidden to an online user.
• In more detail, consider the change in subject indexing introduced by MLA International Bibliography in 1981. An electronic searcher can naively combine the two files (before and after the change) and search them in what appears to be the same way. However, the only very useful subject term (as opposed to title keyword) available for searching in the older file is the name of a literary author. This is readily apparent in the printed product, but almost invisible online.

Another weakness of the uniformity offered by SilverPlatter is that it is only a part of the information universe, one (albeit major) interface among many others. All of these share the instability inherent in continued development and migration. Further discussion below will be devoted to the multiplicity of interfaces.

Aspects of the Transition

It may be that we are progressing toward the look and feel of a common interface, some combination of windows, menus, and hypertext that will begin to allow intuitive navigation similar to that of the printed page. But this vision appears in the screen very dimly. Much more in evidence are users who experience the following:

• Lining up to get at a machine
• A less stable environment with more down time
• New interfaces that do not match old habits (even electronic ones)
• Having to seek training and choose appropriate sessions when they do not know what they do not know
• Selecting from a proliferating, overlapping, and ever-changing set of appropriate resources (print and electronic) and finding their way to them, whether networked or standalone
• Trying to navigate unfamiliar systems on the basis of screen helps, tip sheets, and manuals
• At the least, flying solo through a ready-made result set
• Variety and complexity associated with data capture/printing through download, session logging, email, etc.
(in contrast with the directness and simplicity of photocopying)

Beyond these difficulties lie three changes that call for special comment.

First is a multiplicity of interfaces. It is increasingly possible to access the same data by different paths, in different media, and through different software and hardware, with significant variation in the results obtainable. Underlying some of this is a transition from telnet and proprietary clients to http, with the robustness and functionality of the latter still falling noticeably short of what the older technology has provided. Also contributing to the multiplicity is vendor competition.

Second is an extension from surrogate to full text. Traditional library catalogues, bibliographies, and indexes are ultimately spatial conventions and, in some instances, products of print poverty. While there is often a useful spatial convention in the physical arrangement of printed texts (e.g. the classification of books on a shelf), the order is one-dimensional and normally excludes alternatives (like shelving multiple copies wherever they might be desired). A catalogue or index is unconstrained by physical linearity and provides multiple pointers to the same item, whether by description (author, title, etc.) or subject. It has also been possible for a library to provide reference to many more printed items than it could ever hope to own, house, or directly make available. With recent decreases in the cost of electronic storage and the technical ability to access all content, the bibliographic record moves from necessary means (a record that makes it possible to locate the complete text) to useful adjunct (header information on an electronic text retrieved through direct searching). One is a pointer, the other an identifier.
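
The difference between the one-dimensional order of the shelf and the multiple pointers of a catalogue can be put in a minimal sketch. Everything in it is an assumption made for illustration: the three items, their shelf marks, and the function names are invented, and no real catalogue system is being described.

```python
from collections import defaultdict

# Hypothetical holdings, invented for illustration only.
ITEMS = [
    {"shelf_mark": "A10", "author": "Able", "title": "Gardens of Text",
     "subjects": ["Indexing", "Printing"]},
    {"shelf_mark": "B20", "author": "Baker", "title": "Maps and Margins",
     "subjects": ["Cartography", "Printing"]},
    {"shelf_mark": "C30", "author": "Able", "title": "Order on the Shelf",
     "subjects": ["Classification", "Indexing"]},
]


def shelf_order(items):
    """The shelf: a single physical sequence, each item in exactly one place."""
    return [item["shelf_mark"] for item in sorted(items, key=lambda i: i["shelf_mark"])]


def build_catalogue(items):
    """The catalogue: one entry per access point (author, title, each subject),
    every entry pointing back to the item's single place on the shelf."""
    catalogue = defaultdict(list)
    for item in items:
        access_points = [("author", item["author"]), ("title", item["title"])]
        access_points += [("subject", s) for s in item["subjects"]]
        for point in access_points:
            catalogue[point].append(item["shelf_mark"])
    return catalogue


catalogue = build_catalogue(ITEMS)
print(shelf_order(ITEMS))                   # one linear arrangement
print(catalogue[("subject", "Indexing")])   # items gathered by subject
print(catalogue[("author", "Able")])        # the same items gathered by author
```

Each item occupies one place in `shelf_order` but is reachable through four catalogue entries, which is the sense in which the catalogue record serves as a pointer.
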
The challenges of coping with full text may be most dramatically illustrated by Lexis-Nexis, with its one and one-third billion documents (growing at a rate approaching ten million per week), representing over eighteen thousand data sources.

Third, and perhaps most significant, is a widespread and unreflective move to raw keyword searching for data retrieval. (This is related to, but not a necessary consequence of, the increasing availability of full text.) What the computer can do with keywords is very like what a shotgun can do to a barrel of goldfish in a dark closet. Most users seem happy to hit a few surface swimmers almost at random, while the big smart ones continue to circle at the bottom. The sense of power delights, and there is not much rational evaluation of what pulling the trigger has done. This raises doubts about a good deal of what is called "research".

Cumbersome and limiting as they were, the card catalogue and the printed subject index did force users to approach the data through structured subject terminology. In general, keywords were not available. It seems safe to say that most data users remain largely unconscious of things like the principles of source selection in a database, the structure of records and fields, the types of data these contain, and the indexing practices adopted. Even setting this assumption aside, it is rare to find a user possessing the self-conscious linguistic sophistication required to construct a natural-language equivalent for a thesaurus-controlled term.

It is sobering to consider that there are only two ways of organizing information for direct human access, with raw keyword searching well toward the weaker end of the spectrum. The two are classification (semantic order, generally with hierarchy) and alphabetization (a conventional and purely arbitrary order), most familiar perhaps in the book as analytic table of contents and index.
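
The contrast drawn above between raw keyword searching and structured subject terminology can be made concrete in a small sketch. It rests on stated assumptions: the four records, the heading "Young adults", and the table of "use" references are all invented for illustration and are not taken from any actual thesaurus or database.

```python
# Hypothetical records: invented titles and subject headings, for illustration only.
RECORDS = [
    {"id": 1, "title": "Teenagers and the public library", "subjects": ["Young adults"]},
    {"id": 2, "title": "Reading habits of adolescents", "subjects": ["Young adults"]},
    {"id": 3, "title": "Youth services reconsidered", "subjects": ["Young adults"]},
    {"id": 4, "title": "Adolescents in Victorian fiction", "subjects": ["Characters in literature"]},
]

# Hypothetical "use" references: natural-language words leading to the one
# authorized heading under which the indexer has gathered the material.
USE_REFERENCES = {
    "teenagers": "Young adults",
    "adolescents": "Young adults",
    "youth": "Young adults",
}


def keyword_search(records, word):
    """Raw keyword search: match the word against the words of each title."""
    word = word.lower()
    return [r["id"] for r in records if word in r["title"].lower().split()]


def subject_search(records, term):
    """Controlled search: follow any 'use' reference to the authorized
    heading, then match that heading against the assigned subjects."""
    heading = USE_REFERENCES.get(term.lower(), term)
    return [r["id"] for r in records if heading in r["subjects"]]


print(keyword_search(RECORDS, "adolescents"))  # title matches only
print(subject_search(RECORDS, "adolescents"))  # everything under the heading
```

In this toy file the keyword search retrieves records 2 and 4, missing two relevant items and picking up one that merely shares a title word, while the subject search retrieves records 1, 2, and 3. The gain depends entirely on work done beforehand by an indexer, and on the searcher knowing that the heading exists.
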
Of course, the alphabet itself can allow for semantic random access (assuming the mind can generate the point of entry). And semantic order is likely to embed itself within alphabetic organization through controlled indexing terminology, provision of cross references, and distinctions grouped under a concept as subheadings. Ultimately these constitute a thesaurus.

Conclusion

Implicit in the foregoing examples and analyses is an academic perspective, one that assumes an orientation toward research rather than casual and uncritical browsing of information. Given the reality that communities of discourse employ and rely, probably far too heavily, on personal acquaintance and informal means, the information superstructure may become significant only at the boundaries of those communities. Nevertheless, a scientific approach to knowledge assumes that truth does not require the imprimatur of the familiar, that subject areas cannot be closed and cosy, and that research must be more than conversation and/or extrapolation from randomly acquired information.

This leads to the question of who should gather information, how it should be gathered, and whether the gathering itself is integral to the process of dealing with it appropriately. As regards the last of these, scholarly habits, particularly in the humanities, have reflected a belief that the gathering cannot be delegated, although some aspects of the work may be performed by a research assistant. At a simpler level, the question becomes one of the role of information gathering/library research in a student's education (understood as gaining basic familiarity with the methods, tools, and data of a particular discipline).

Practicalities aside, it is here that we come to a tradeoff that is being exacerbated by the shift from print to electronic delivery of information.
Will the quest for data continue to be primarily a do-it-yourself undertaking, with increasing time and energy required to cope with (if not master) the variety, changes, magnitudes, overlaps, and proliferations of the digital library? Or will a need for efficiency and a desire for good results compel most researchers to rely on the intermediation of a specialist in information management?

As the costs of human support increase, and public funding declines, a library user can look forward to less direct assistance in an environment that grows more complex and unfamiliar. The effectiveness of group training divorced from particular questions and needs is limited, but it is the only practical way of teaching methods, strategies, and procedures. Opaque technologies have increased the importance of understanding what is being done. However, most users operate on immediate need, and are much more concerned with end than means. Consequently, information seekers should develop a critical self-consciousness regarding what they do not know. When help is provided one-on-one and on demand, it is likely to take the form of a request whose derivation is not explained. This may require greater trust in the information provider, and could lead the user to a willingness to invest sufficient time to develop information-finding skills.

In retrospect, the current situation may seem peculiarly transitional, with the durability, stability, accessibility, and universality of the printed page manifested somehow once again in the conventions of an electronic screen.

Suggestions for Further Reading:

Basch, Reva. Secrets of the super searchers. Wilton, CT : Eight Bit Books, 1993
Bledstein, Burton J. The culture of professionalism : the middle class and the development of higher education in America. New York : Norton, c1976
Clausen, Helge. "The future information professional: old wine in new bottles? Part one" Libri 40:4 (Dec 1990) 265-277 ; " ... Part two" Libri 41:1 (Mar 1991) 22-36
Connell, Tschera Harkness. "Subject searching in online catalogs: metaknowledge used by experienced searchers" Journal of the American Society for Information Science 46:7 (Aug 1995) 506-518
Dalrymple, Prudence W., and Jennifer A. Younger. "From authority control to informed retrieval: framing the expanded domain of subject access" College & research libraries 52:2 (Mar 1991) 139-149
Foskett, A.C. The subject approach to information. 5th ed. London : Library Association, 1996
Hagler, Ronald. The bibliographic record and information technology. 3rd ed. Chicago : American Library Association, 1997
Harter, Stephen P., and Anne Rogers Peters. "Heuristics for online information retrieval: a typology and preliminary listing" Online review 9:5 (Oct 1985) 407-424
Park, Bruce. "Libraries without walls; or, librarians without a profession" American libraries 23:9 (Oct 1992) 746-747
Quint, Barbara. "Disintermediation" Searcher 4 (Jan 1996) 4,6
Studwell, William E. "Will intermediaries be an essential component of online subject access in the future?" Technicalities 13:8 (Aug 1993) 8-9
Tenopir, Carol. "Generations of online searching" Library journal 121:14 (1 Sep 1996) 128, 130
White, Herbert S. "Information intermediation: a fancy name for reference work" Library journal 120:5 (15 Mar 1995) 44-45
For contact information see Joseph Jones.
Presented July 1998
Reformatted for migration July 2013