Prince Leopold Institute of Tropical Medicine

The Library

ITG library home page eBooks Journals Electronic journals Databases WebSPIRS Internet links ITG home page
Last updated on January 20, 2004


Database retrieval software - an introduction

Guidelines for using the various database retrieval software packages

Some general remarks on searching information in bibliographic databases.
WebSPIRS software (version 5.0): searching bibliographic databases using a World Wide Web interface.
WinSPIRS software (version 5.0): searching bibliographic databases using a Windows interface.

For information on technical requirements, see "Connecting to the ITG library database server".

Some general remarks on searching information in bibliographic databases

Selecting an appropriate database

When searching for bibliographic information, selection of an appropriate database is probably the most important requisite for success. Each database has its specific quantitative and content selection criteria. No single database in the world covers all published literature (and certainly not all 'grey' literature, such as internal reports).

When considering a database the following criteria might help you evaluate its appropriateness for your specific research topic:

  • To what extent does the database cover your topic? E.g. it would not make much sense to search for exhaustive information on tuberculosis in the Ebola and Marburg Virus Disease Literature database.

  • How extensive is the database? E.g. the number of records, the number of source items, the number of years covered. When searching for the immunological aspects of malaria, Medline or CAB Health are far more appropriate than the highly selective Tropical Endemic Diseases Control database.

  • How up-to-date is the database? E.g. last update, update frequency, usual backlog. Recently published articles may feature in CCOD, but will generally not be included in Medline until a few months later.

  • What types of publications are included? E.g. Medline features journal articles only; Tropical Endemic Diseases Control, Health Care in Developing Countries and other databases produced by the ITG Library also contain references to books, book chapters, abstracts and unpublished documents or reports.

  • How much overlap is there between the database and your library? This criterion could be useful when, instead of an exhaustive list, you are merely looking for a limited number of relevant publications which are immediately available.

Most answers to these questions can be found in the database descriptions.

Searching for journal titles

Retrieving journal titles may pose some problems. Alphabetical indexes are therefore especially useful for searching journal titles. Many systems have field-specific indexes, so all available journal names can be viewed. If these are not available, a general (free-text) index probably will be.

Searching for authors

Retrieving author names may also pose some problems.

Searching for subjects

Subject information can be found mainly in the "title" and "keywords" fields. Therefore it may be worthwhile to use both "keyword searching" and "free-text" retrieval. Generally the former will give the better results, but for exhaustive results the combination of both is required.
  1. Using keywords:

    Theoretically, this should be the most precise method, as keywords are part of a controlled vocabulary. They are assigned in a consistent fashion, after careful consideration of the document described. Ideally there is a hierarchical thesaurus with powerful "explode" capabilities. There are, however, several disadvantages:

    • Each database may have its own specific keyword system.
    • Indexers may not always be consistent and occasionally make mistakes.
    • Some systems automatically create or map entries to extra keywords, e.g. based on words in the abstracts or the bibliographies. While these may generally be helpful, in many cases they will also yield superfluous or faulty hits.
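The "explode" capability mentioned above can be illustrated with a minimal Python sketch. It assumes a hypothetical three-level mini-thesaurus and four invented records; the terms, the `explode` function and the `keyword_search` helper are illustrative only and do not reproduce any real thesaurus or retrieval software.

```python
# Hypothetical mini-thesaurus: each keyword maps to its narrower terms.
# The terms are illustrative and not copied from any real thesaurus.
NARROWER = {
    "parasitic diseases": ["protozoan infections", "helminthiasis"],
    "protozoan infections": ["malaria", "leishmaniasis"],
    "malaria": ["malaria, cerebral"],
}

# Toy records (invented): record id -> keywords assigned by the indexer.
RECORDS = {
    1: {"malaria"},
    2: {"malaria, cerebral"},
    3: {"leishmaniasis"},
    4: {"helminthiasis"},
}


def explode(term):
    """Return the term followed by all narrower terms below it."""
    found = [term]
    for child in NARROWER.get(term, []):
        found.extend(explode(child))
    return found


def keyword_search(term, exploded=False):
    """Return sorted ids of records indexed with the term,
    or with any narrower term when exploded is True."""
    wanted = set(explode(term)) if exploded else {term}
    return sorted(rid for rid, kws in RECORDS.items() if kws & wanted)


if __name__ == "__main__":
    # No toy record carries the broad keyword itself.
    print(keyword_search("protozoan infections"))
    # Exploding also picks up records indexed with narrower terms.
    print(keyword_search("protozoan infections", exploded=True))
```

In this toy setting a search on the broad keyword alone finds nothing, because the indexer used the most specific term available for each record, while the exploded search retrieves the records indexed under the narrower terms as well.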

  2. Searching free text:

    Free-text retrieval is often understood as searching for just any word that is present anywhere in the database (e.g. in the "author address" field). Other systems limit this option to a few text-based fields like "title" and "abstract" (and sometimes "keywords"). It is obvious that this difference will influence retrieval results, as in the first case you may be overwhelmed by excessive hits while in the second you may not be able to retrieve information from certain fields (e.g. original language title).

    There are two major disadvantages to 'free-text' searching:

    • Low recall or sensitivity: using natural language phrases may yield useful results, but these may be only a fraction of what is available in the database, because indexer and searcher have different points of view:
      • Synonyms may be used.
      • Higher or lower level terms may be used.
      • Singular forms may be used instead of plural ones, and vice versa. E.g. in Medline, using the term "children" will yield thousands of records, while "child" is a more appropriate and even more productive search term.
      • Terms from different languages may be used. E.g. there are dozens of records featuring the French or Dutch term "tuberculose" (but these hits are limited to "original title" and, rather surprisingly, "author address"), while "tuberculosis" yields almost fifty times as many hits.
      • Multiple-word concepts are only found when they happen to occur in the literal form in which they were entered, while using the correct keyword entry is far more efficient and relevant. E.g. the free-text search "cancer treatment" may find only a fraction of what the thesaurus combination "explode neoplasms / drug therapy, therapy" yields.

    • Low relevance or specificity: conversely, free-text searching will often generate many extra hits compared to keyword searching, yet a large proportion of these may not be sufficiently relevant. Some examples from a free-text search on the word "malaria" make this clear. The word may carry little weight when it is part of a long list of diseases, as in "... ranging from malaria and tuberculosis to Ebola fever and AIDS...", or when another concept is in focus, as in "tuberculosis was the second (after malaria) leading cause ...". Whether a hit is relevant also depends on the angle of research. The explicit absence of a disease, as in "new cases of malaria were not observed" or "the patient was treated symptomatically after exclusion of malaria", may constitute useful information. Papers investigating the intricacies of the "malaria parasite" or the "malaria vector" may have little clinical relevance, but in a broader sense they do belong to the domain of malaria. But "the simian malaria parasite", "two rodent malaria species" or "a potential malaria vector" are generally not what one expects when searching for information on human malaria, and "babesiosis is a malaria-like illness ..." is a genuine miss. And to end with a classic: using "AIDS" when searching for the "acquired immune deficiency syndrome" results in dozens of "hearing aids", "audiovisual aids", "diagnostic aids", "teaching aids", etc., as well as the verb form "aids" in the sense of "helps".
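The two disadvantages above correspond to the usual measures of recall (the share of relevant records that were found) and precision (the share of found records that are relevant). The following Python sketch computes both for a free-text search and a keyword search on "malaria". The four records, their keywords and their relevance judgements are invented for illustration, and the helper names are hypothetical; real databases will behave less neatly.

```python
import re

# Toy records (invented): text, indexer keywords, and whether a searcher
# interested in human malaria would judge the record relevant.
RECORDS = [
    {"text": "Chloroquine resistance in malaria patients",
     "keywords": {"malaria"}, "relevant": True},
    {"text": "Paludisme grave chez l'enfant",
     "keywords": {"malaria"}, "relevant": True},
    {"text": "Babesiosis is a malaria-like illness",
     "keywords": {"babesiosis"}, "relevant": False},
    {"text": "Tuberculosis was the second (after malaria) leading cause of death",
     "keywords": {"tuberculosis"}, "relevant": False},
]


def free_text_hits(word):
    """Indexes of records whose text contains the word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [i for i, rec in enumerate(RECORDS) if pattern.search(rec["text"])]


def keyword_hits(keyword):
    """Indexes of records to which the indexer assigned the keyword."""
    return [i for i, rec in enumerate(RECORDS) if keyword in rec["keywords"]]


def recall_and_precision(hits):
    """Return (recall, precision) of a hit list against the toy judgements."""
    relevant_total = sum(rec["relevant"] for rec in RECORDS)
    relevant_found = sum(RECORDS[i]["relevant"] for i in hits)
    recall = relevant_found / relevant_total
    precision = relevant_found / len(hits) if hits else 0.0
    return recall, precision


if __name__ == "__main__":
    print("free text:", recall_and_precision(free_text_hits("malaria")))
    print("keyword:  ", recall_and_precision(keyword_hits("malaria")))
```

In this constructed example the free-text search misses the French-language title (lower recall) and picks up the babesiosis and tuberculosis records (lower precision), while the keyword search finds exactly the two relevant records.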

  3. Some concluding remarks on subject searching:

    • Don't focus too strongly on predetermined concepts. One should not expect the specific words one wants to find to always feature literally. Often a different spelling or a synonym is used instead. Therefore, use the alphabetic indexes or structured thesauri when available. Also, take into account that almost all database software and keyword systems are available in English only.

    • Don't get too specific too soon. The results you find may be disappointing because your search formulations are too specific: a subject may well be covered in a publication, even though this is not explicitly clear from the "title" or the "keyword" fields. E.g. most books on infectious diseases do contain a chapter on malaria, yet this word will probably not feature in the bibliographical record of the book. In the same fashion, it is not wise to start by combining all the concepts you have in mind. If just one of them is missing, the search may fail. Therefore it may be more rewarding to start searching for the most specific concept, and add extra concepts only when results are sufficiently numerous.
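The advice to add concepts one at a time can be sketched in Python as follows. The records and the `search_all` and `narrow_stepwise` helpers are hypothetical and only illustrate the strategy; they do not correspond to any command of the retrieval software described on these pages.

```python
# Toy records (invented): record id -> words from the title and keyword fields.
RECORDS = {
    1: {"malaria", "children", "kenya", "bednets"},
    2: {"malaria", "children", "tanzania"},
    3: {"malaria", "adults", "kenya"},
    4: {"infectious", "diseases", "textbook"},
}


def search_all(concepts):
    """Boolean AND: ids of records containing every given concept."""
    return sorted(rid for rid, words in RECORDS.items()
                  if set(concepts) <= words)


def narrow_stepwise(concepts, max_hits):
    """Add concepts in the given order, one at a time.

    Stops as soon as the result set has at most max_hits records, or when
    the next concept would leave no hits at all. Returns the concepts
    actually used and the matching record ids.
    """
    used, hits = [], sorted(RECORDS)
    for concept in concepts:
        candidate = search_all(used + [concept])
        if not candidate:
            break  # adding this concept would make the search fail
        used, hits = used + [concept], candidate
        if len(hits) <= max_hits:
            break
    return used, hits


if __name__ == "__main__":
    wanted = ["malaria", "children", "uganda"]
    print(search_all(wanted))                   # all concepts at once
    print(narrow_stepwise(wanted, max_hits=1))  # one concept at a time
```

Combining all three concepts at once returns nothing because no toy record mentions "uganda", whereas the stepwise approach still returns the two records on malaria in children.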

    Try to bear these (human) limitations in mind and do not lose courage too quickly. Use your imagination: even with the ubiquitous World Wide Web, the perfect universal retrieval system has not been invented yet.

    Related pages: Catalogs produced by ITG library | Databases produced by ITG library | International databases subscribed to by ITG library | Page author: Dirk Schoonbaert