Communicating with Databases in Natural Language - Wallace M.

Wallace M. Communicating with Databases in Natural Language - Ellis Horwood Limited, 1985. - 170 p.

Three basic concepts of semantics are synonymy, incompatibility and hyponymy. Synonyms are simply words which mean the same. Incompatibles are words which exclude each other (such as “married” and “single”). A hyponym is a word whose meaning is subordinate to the meaning of another word, so “employee” is a hyponym of “person”.
In terms of our data models we can illustrate synonymy and incompatibility quite straightforwardly. Two synonyms refer to the same value in the database, and two incompatibles to alternative values. Incompatibles are useful for distinguishing the “conjunctive” use of the English word “and” (“John is tall and thin”) from the “disjunctive” use (“List employees named Lawler and Smith”). The disjunctive use of “and” only occurs when it joins incompatibles. NLUs that can deal with these uses of “and” are the system defined by M. King in [27], INTELLECT, and those systems which reject null queries, such as REL.
Hyponymy is a surprisingly important concept for database query because English includes several words like “who” and “when” which can only be mapped onto databases via hyponyms (“who” becomes “which employee” or “which client”; “when” becomes “what date” or “what time”). The really sophisticated approach to hyponymy is the ‘subordinate’ link in a semantic net [32]. A technique employed with an ordinary relational model is the use of domain hierarchies [10]. In a relational database the database values are divided into a number of separate domains. If “employee” is a domain name, an NLU attached to the database will translate the English word “employee” into a variable, X, that may range over the domain of employees. If a domain hierarchy is implemented which includes
PERSON
    EMPLOYEE
    CLIENT
then the English word “person” translates into a variable which may range over employees or clients. (Domain hierarchies are also mentioned in Chapter 5, section 3.2.)
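The domain hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function names and the mapping of “who” onto “person” are hypothetical and are not taken from any of the systems cited.

```python
# Hypothetical domain hierarchy: each domain maps to its immediate subdomains.
DOMAIN_HIERARCHY = {
    "person": ["employee", "client"],
    "employee": [],
    "client": [],
}

# Hypothetical mapping from question words to the domain they stand for
# (an assumption made for this sketch).
QUESTION_WORDS = {"who": "person"}


def leaf_domains(domain):
    """Return the stored domains that a variable over `domain` may range over."""
    children = DOMAIN_HIERARCHY[domain]
    if not children:
        return [domain]
    leaves = []
    for child in children:
        leaves.extend(leaf_domains(child))
    return leaves


def domains_for_word(word):
    """Translate an English word into the list of domains its variable ranges over."""
    word = word.lower()
    domain = QUESTION_WORDS.get(word, word)
    if domain not in DOMAIN_HIERARCHY:
        raise KeyError(f"no domain known for {word!r}")
    return leaf_domains(domain)
```

With this table, `domains_for_word("employee")` yields only the employee domain, while `domains_for_word("person")` and `domains_for_word("who")` both yield the employee and client domains.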
One other semantic concept is antonymy, where two words contrast with each other. Examples are “large” and “small”, “high” and “low”. Such words are not mutually exclusive since, for example, a large mouse is a small animal.
32 NATURAL LANGUAGE ENQUIRY [Ch. 2]
Thus sentences which use antonyms are only true to a degree. To deal with such language a fuzzy logic system such as PRUF [60] would be required, but this is well beyond current data model implementations. Notice that NLUs can deal with queries such as “how high ...”, if the database stores heights as numeric values. In this case, comparisons such as “higher than” and “lower than” can also be dealt with.
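As a rough illustration of why numeric storage makes such comparisons tractable, the sketch below maps the two comparative phrases onto ordinary numeric comparisons. The phrase table, the `height` column name and the sample rows are all invented for the example.

```python
import operator

# Hypothetical phrase table: comparative phrase -> (numeric column, comparison).
COMPARATIVES = {
    "higher than": ("height", operator.gt),
    "lower than": ("height", operator.lt),
}

# Made-up sample rows, for illustration only.
BUILDINGS = [
    {"name": "A", "height": 30},
    {"name": "B", "height": 75},
    {"name": "C", "height": 120},
]


def select(rows, phrase, threshold):
    """Return the rows whose numeric value satisfies the comparative phrase."""
    column, compare = COMPARATIVES[phrase]
    return [row for row in rows if compare(row[column], threshold)]


names = [row["name"] for row in select(BUILDINGS, "higher than", 50)]
```

No notion of “high” or “low” in the absolute sense is needed here, which is why this case avoids the fuzzy-logic problem raised by antonyms.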
5. SPELLING CORRECTION
When INTELLECT fails to recognise an English word it enters into a dialogue with the user such as the example in the last section. Two systems that attempt to guess the misspelt word before resorting to a dialogue with the user are PLANES and LIFER.
5.1 Detection of spelling errors
Firstly, there are bound to be some spelling mistakes that will not even be spotted by the spelling corrector. If dictionary lookup is bottom-up, as in PLANES, then a misspelling that yields another word (e.g. “plan” for “plane”) will not be detected. On the other hand, if dictionary lookup is top-down then spelling errors will be indistinguishable from grammatical errors, because the system will only be able to record that no word of the right category could be found. This is the situation in the LIFER system. Faced with a misspelling (or a grammar error), LIFER fails the parse and tries a different production, recording at which point in the sentence the failure occurs. If the sentence cannot be parsed, LIFER treats its rightmost failpoint as a spelling error, and tries spelling correction.
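The two detection strategies can be contrasted with a toy sketch. It is not the PLANES or LIFER implementation: the lexicon, the flat list of productions and the function names are assumptions chosen to keep the example small.

```python
# Hypothetical core dictionary: word -> grammatical category.
LEXICON = {
    "list": "verb",
    "show": "verb",
    "the": "det",
    "plane": "noun",
    "plan": "noun",
}

# Hypothetical productions, each a flat sequence of categories.
PRODUCTIONS = [
    ["verb", "noun"],
    ["verb", "det", "noun"],
]


def bottom_up_unknown_words(words):
    """Bottom-up lookup: flag every word that is absent from the dictionary."""
    return [word for word in words if word not in LEXICON]


def rightmost_failpoint(words):
    """Top-down lookup: try each production in turn.

    Returns None if some production parses the whole sentence, otherwise the
    largest word index at which any production failed.
    """
    furthest = -1
    for production in PRODUCTIONS:
        position = 0
        for category in production:
            if position >= len(words) or LEXICON.get(words[position]) != category:
                break
            position += 1
        else:
            if position == len(words):
                return None  # the sentence parsed
        furthest = max(furthest, position)
    return furthest
```

For `["list", "the", "plnae"]` the bottom-up check flags “plnae” and the top-down check reports index 2. For `["list", "the", "plan"]` neither check reports anything, since “plan” is itself a word of the right category. For `["list", "plane", "the"]`, a grammatical error with no misspelling, the top-down check still reports index 2, which is the ambiguity the text describes.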
5.2 Correcting spelling errors
Given that a system detects a misspelt word, the word is matched against a list of correct words. If any of these is a good enough match, the system confirms the spelling correction with the user, and the sentence is parsed with the new word. (LIFER does not check back with the user until the parse is complete.)
To ensure the spelling correction is fast and efficient, the list of alternatives must be kept as short as possible, and the matching algorithm must be as fast as possible.
List of alternatives. Clearly a top-down dictionary lookup will ensure that the list of alternatives is as short as possible, since only words of the expected category need be considered. Furthermore, it would be sensible to confine spelling correction to the core dictionary, so as to avoid extensive searching through the data dictionary or database.
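A minimal sketch of how the expected category shortens the list of alternatives is given below. The core dictionary and its categories are hypothetical.

```python
# Hypothetical core dictionary: word -> grammatical category.
CORE_DICTIONARY = {
    "list": "verb",
    "show": "verb",
    "count": "verb",
    "the": "det",
    "plane": "noun",
    "pilot": "noun",
    "flight": "noun",
}


def alternatives(expected_category=None):
    """Return candidate corrections from the core dictionary.

    With no expected category (bottom-up lookup) every word is a candidate;
    with one (top-down lookup) only words of that category are.
    """
    if expected_category is None:
        return sorted(CORE_DICTIONARY)
    return sorted(
        word
        for word, category in CORE_DICTIONARY.items()
        if category == expected_category
    )
```

Here `alternatives()` returns all seven words, whereas `alternatives("noun")` returns only three, so the matcher has less than half as many comparisons to make.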
Matching algorithm. One matching algorithm looks for single letter substitutions, insertion or removal of a single letter, or reversal of two letters. This algorithm involves performing the reverse operation on the misspelt word and looking for a match. The PLANES system, however, uses a kind of template matching which gives a score for each match. The word which matches best is suggested as the correct spelling. A fuller discussion of possible matching algorithms is in [49].
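The single-error algorithm can be sketched as follows: every word reachable from the misspelt word by undoing one error is generated, and the result is checked against the list of alternatives. The function names, and the assumptions that words are lowercase ASCII and that the two reversed letters are adjacent, are choices made for this sketch. The template matching used by PLANES is not shown.

```python
import string


def single_edit_variants(word, alphabet=string.ascii_lowercase):
    """Return every string obtained from `word` by undoing one single-letter error."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # Undo an inserted letter by removing one letter.
    removals = {left + right[1:] for left, right in splits if right}
    # Undo a dropped letter by inserting one letter.
    insertions = {left + ch + right for left, right in splits for ch in alphabet}
    # Undo a substituted letter by replacing one letter.
    substitutions = {
        left + ch + right[1:] for left, right in splits if right for ch in alphabet
    }
    # Undo a reversal by swapping two adjacent letters.
    reversals = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    return (removals | insertions | substitutions | reversals) - {word}


def correct(misspelt, alternatives):
    """Return the alternatives that differ from `misspelt` by exactly one error."""
    variants = single_edit_variants(misspelt.lower())
    return sorted(word for word in alternatives if word in variants)
```

For example, `correct("plnae", ["plane", "plan", "pilot"])` returns `["plane"]`, because swapping “n” and “a” recovers the word, while “plan” would need two operations. Keeping the list of alternatives short matters here because the final step is a membership test against that list.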