The next point up toward the right designates a thesaurus. A thesaurus concerns the relationships between terms (words or phrases) structured in a (semantically weak) taxonomy. So a thesaurus is a taxonomy plus some term semantic relations.
The ANSI/NISO Monolingual Thesaurus Standard defines a thesaurus as "a controlled vocabulary arranged in a known order and structured so that equivalence, homographic, hierarchical, and associative relationships among terms are displayed clearly and identified by standardized relationship indicators ... The primary purposes of a thesaurus are to facilitate retrieval of documents and to achieve consistency in the indexing of written or otherwise recorded documents and other items."2 These relationships fall into four categories: equivalence, homographic, hierarchical, and associative.
Table 7.4 shows these relationships and synonyms for the relations, and it provides definitions and examples.
A thesaurus is typically used to associate the rough meaning of one term with the rough meaning of another. In Figure 7.7 (Center for Army Lessons Learned [CALL] Thesaurus, 2002), "radar imagery" is narrower than "aerial imagery," which in turn is narrower than "imagery" and related to "imaging systems."
2 ANSI/NISO Z39.19-1993 (R1998), p. 1.
Table 7.4 Semantic Relations of a Thesaurus
| RELATION | DEFINITION | EXAMPLE |
|---|---|---|
| Synonym / Similar to | A term X has nearly the same meaning as a term Y. | "Report" is a synonym for "document." |
| Homonym / Spelled the same / Homographic | A term X is spelled the same way as a term Y, which has a different meaning. | The "tank," which is a military vehicle, is a homonym for the "tank," which is a receptacle for holding liquids. |
| Broader Than (Hierarchic: parent of) | A term X is broader in meaning than a term Y. | "Organization" has a broader meaning than "financial institution." |
| Narrower Than (Hierarchic: child of) | A term X is narrower in meaning than a term Y. | "Financial institution" has a narrower meaning than "organization." |
| Associated / Associative / Related | A term X is associated with a term Y, i.e., there is some unspecified relationship between the two. | A "nail" is associated with a "hammer." |
Figure 7.7 Example from Center for Army Lessons Learned Thesaurus.
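The hierarchical and associative relations in Table 7.4 can be pictured as labeled links between terms. The following is a minimal sketch, not the CALL Thesaurus's actual format: the `Thesaurus` class and its method names are hypothetical, and BT, NT, and RT are the conventional thesaurus abbreviations for broader term, narrower term, and related term. The sketch also assumes each term has at most one broader term.

```python
from collections import defaultdict

# Reciprocal pairs: recording one direction implies the other.
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT"}


class Thesaurus:
    """Toy thesaurus (hypothetical class): terms joined by BT, NT, and RT links."""

    def __init__(self):
        # term -> relation code -> set of target terms
        self.links = defaultdict(lambda: defaultdict(set))

    def relate(self, term, relation, other):
        """Record `term relation other` together with its reciprocal link."""
        self.links[term][relation].add(other)
        self.links[other][INVERSE[relation]].add(term)

    def broader_chain(self, term):
        """Walk BT links upward; assumes each term has at most one parent."""
        chain, seen = [], {term}
        parents = self.links[term]["BT"]
        while parents:
            term = sorted(parents)[0]
            if term in seen:  # guard against accidental cycles
                break
            chain.append(term)
            seen.add(term)
            parents = self.links[term]["BT"]
        return chain


# The three links described for the CALL example in the text.
call = Thesaurus()
call.relate("radar imagery", "BT", "aerial imagery")
call.relate("aerial imagery", "BT", "imagery")
call.relate("aerial imagery", "RT", "imaging systems")

print(call.broader_chain("radar imagery"))        # ['aerial imagery', 'imagery']
print(sorted(call.links["imaging systems"]["RT"]))  # ['aerial imagery']
```

Storing the reciprocal link on every insert keeps "broader than" and "narrower than" consistent with each other, which is the pairing Table 7.4 describes.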
Thus, a thesaurus is a controlled vocabulary designed to support information retrieval by guiding both the person assigning metadata and the searcher to choose the same terms for the same concept. For example, a thesaurus conforming to ISO 2788 supports navigation and term selection by showing relationships between terms that are close in meaning.
A thesaurus ensures that:
■ Concepts are described in a consistent manner.
■ Experienced users are able to easily refine their searches to locate the information they need.
■ Users do not need to be familiar with technical or local terminology.
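One way to see how a thesaurus gets the indexer and the searcher onto the same term is to map every variant to a single preferred term before indexing and before searching. The sketch below is illustrative only: the `USE` and `NARROWER` tables, the function names, and the document ids are all hypothetical, and treating "report" as the preferred term for "document" is an assumption made for the example (Table 7.4 only says the two are synonyms).

```python
# Hypothetical entry vocabulary: variant term -> preferred term.
USE = {"document": "report"}

# Hypothetical hierarchy: preferred term -> narrower terms (one level).
NARROWER = {"organization": ["financial institution"]}


def preferred(term, use=USE):
    """Map a term to its preferred form; unknown terms pass through unchanged."""
    key = term.strip().lower()
    return use.get(key, key)


def build_index(docs):
    """Build preferred term -> set of document ids from assigned keywords."""
    index = {}
    for doc_id, keywords in docs.items():
        for keyword in keywords:
            index.setdefault(preferred(keyword), set()).add(doc_id)
    return index


def search(index, query, narrower=NARROWER):
    """Return documents indexed under the query term or its narrower terms."""
    term = preferred(query)
    hits = set(index.get(term, set()))
    for child in narrower.get(term, ()):
        hits |= index.get(child, set())
    return hits


# Two indexers chose different words for the same concept.
DOCS = {
    "d1": ["Report"],
    "d2": ["document"],
    "d3": ["financial institution"],
}
INDEX = build_index(DOCS)

print(sorted(search(INDEX, "Document")))      # ['d1', 'd2']
print(sorted(search(INDEX, "organization")))  # ['d3']
```

Because both sides pass through `preferred`, a searcher who types "document" still finds the item indexed as "Report," and a search on a broad term also reaches items indexed under its narrower terms.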
Although most people are familiar with Roget's Thesaurus3 (it's probably sitting on the shelf above your desk, right next to the dictionary), a common resource available to technologists is the WordNet thesaurus (WordNet: http://www.cogsci.princeton.edu/~wn/).4 WordNet was designed according to psycholinguistic theories of human lexical memory. WordNet can be searched interactively online or downloaded and used by software developers who wish to incorporate thesaural knowledge into their applications.
In WordNet, a word is given one or more definitions. Distinct definitions for an individual word are usually called word senses for that word; the notion that a given word such as bank has multiple meanings or word senses is called polysemy ("multiple senses") in linguistics. A word (or common phrase) in WordNet also has the information typically associated with it according to the ISO 2788 standard:
■ Synonyms: those nodes in the taxonomy that are in the "means the same as" relation; in neo-Latinate, "same name."
■ Hypernyms: those nodes that are in the "parent of" or "broader than" relation in the taxonomy; in neo-Latinate, "above name"; if the taxonomy is a tree, there is only one parent node.
■ Hyponyms: those nodes that are in the "child of" or "narrower than" relation; in neo-Latinate, "below name."
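The three kinds of information listed above can be modeled with a small structure in which each word sense is a node holding its synonyms and links to its parents and children. This is a hand-built sketch and does not use WordNet's own data files or API: the `Synset` class and helper names are hypothetical, the second sense of "bank" and the short glosses are illustrative placeholders rather than quoted WordNet output, and the "financial institution" to "organization" link follows Table 7.4 rather than WordNet's actual hierarchy.

```python
class Synset:
    """One word sense (hypothetical class): synonymous words plus a gloss."""

    def __init__(self, words, gloss):
        self.words = list(words)
        self.gloss = gloss
        self.hypernyms = []  # parents ("broader than")
        self.hyponyms = []   # children ("narrower than")

    def add_hypernym(self, parent):
        """Link this sense to a parent and record the reverse link."""
        self.hypernyms.append(parent)
        parent.hyponyms.append(self)


def senses(word, synsets):
    """All senses containing `word`; more than one means the word is polysemous."""
    return [s for s in synsets if word in s.words]


def hypernym_path(synset):
    """Follow the first parent upward; assumes the taxonomy is a tree."""
    path = []
    while synset.hypernyms:
        synset = synset.hypernyms[0]
        path.append(synset.words[0])
    return path


organization = Synset(["organization"], "placeholder gloss")
financial_institution = Synset(["financial institution"], "placeholder gloss")
bank_1 = Synset(
    ["depository financial institution", "bank", "banking concern", "banking company"],
    "a financial institution that accepts deposits and channels the money "
    "into lending activities",
)
bank_2 = Synset(["bank"], "illustrative second sense: sloping land beside water")

bank_1.add_hypernym(financial_institution)
financial_institution.add_hypernym(organization)
SYNSETS = [organization, financial_institution, bank_1, bank_2]

print(len(senses("bank", SYNSETS)))                  # 2
print([w for w in bank_1.words if w != "bank"])      # synonyms of sense 1
print(hypernym_path(bank_1))                         # ['financial institution', 'organization']
```

Keeping synonyms inside the sense, rather than on the word, is what lets one spelling such as "bank" point to several unrelated places in the taxonomy.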
3 Roget's Thesaurus has multiple versions. The free version (because the copyright has expired, i.e., the 1911 edition) is available at many sites, including http://www.cix.co.uk/~andie/cog-ito/roget.shtml or http://promo.net/cgi-promo/pg/t9.cgi?entry=22&full=yes&ftpsite= ftp://ibiblio.org/pub/docs/books/gutenberg/. Both are the result of the Project Gutenberg project: http://www.gutenberg.net/.
4 Also see Fellbaum (1998), Miller et al. (1993).
In WordNet 1.7.1, for example, the word bank has the following definitional (word sense) and hypernymic (parent of or broader than) information associated with it:
1. depository financial institution, bank, banking concern, banking company -- (a financial institution that accepts deposits and channels the money into lending activities; "he cashed a check at the bank"; "that bank holds the mortgage on my home")