To deal with circumlocutions in such a general way, QPROC would have to transform queries like
“I want to know Smith’s salary”
into
“I want (that I know Smith’s salary)”.
Such transformations are really detours on the way to eliciting the intended formal query.
Certain data items stored in current databases can be expressed using predications. For example, “ICL’s order for pencils” can be described as an event, “ICL’s ordering pencils”. It seems probable, however, that users avoid complex constructions when attempting to express themselves simply and clearly (e.g. ).
There is a relationship between actual events (ICL supplies a computer), probabilities (ICL may supply a computer) and possibilities (ICL can supply a computer).
If all three queries are to be dealt with by QPROC, the three constructions ‘supply’, ‘can supply’ and ‘may supply’ must be entered separately, each with its own derived relation. QPROC has no general concept of actual, probable and possible.
There are many other linguistic generalisations, for example the linguistic concept of a verb of change. Such verbs include “open”, “learn”, “empty”, etc. These verbs may have a subject, direct object and with-object — “John burnt the newspaper with his lighter” — or just a subject — “The newspaper burnt”. If QPROC categorised all natural language verbs in this way, the cases for each verb could be deduced from its category. However, QPROC does not utilise linguistic generalisations of this kind firstly because they all seem to have exceptions, and secondly because it would require the ‘knowledge engineer’ to understand them if he were to set up the system on a new application.
Instead, QPROC aims to be relatively simple and transparent.
A fully portable natural language understanding system would include a complete dictionary in it. Such a dictionary would give the meaning of each word in terms of other words in the language. When such a natural language system was attached to an application, a particular set of words would be redefined to map onto the application. The dictionary would then be used to map all the other words onto the application.
Some research is under way on a set of ‘semantic primitives’ which underlies natural language; however, the fully portable language system is not even on the horizon: it is out of sight. Language is too complicated to allow each word to be defined by other words. None of the definitions is precise enough to enable the definition to be substituted for the defined word in any sentence containing that word. Any attempt to reduce language to a fixed set of primitives will lose part of the meaning of the remaining words (e.g. “tenacious”, “stubborn” and “pig-headed” are not quite synonyms!). Finally, there are several groups of words which do not have a semantic primitive. One such group is “just”, “fair”, “even-handed”, whose definition has been an object of study for centuries!
3. QPROC DESIGN
3.1 The data dictionary
An essential component in setting up a QPROC application is a “data dictionary” (INTELLECT’s data dictionary is termed the ‘LEXICON’). This is a file of information about the data. It lists the relations, their attributes, and their domains. It also records some of the data values that may occur in the database; for example, the domain ‘party’ has the values conservative, labour, liberal and ‘SDP’.
These values are all recorded in the data dictionary even though, at some instant, one of them may not occur in the data.
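The following is a minimal sketch of how such a data dictionary might be written as PROLOG facts. It is not QPROC's actual file format, which the text does not give: the predicate names relation/2, attribute_domain/3, domain_value/2 and values_of/2, and the 'candidate' relation with its attributes, are assumptions made for illustration. Only the domain 'party' and its four values come from the text.

```prolog
% Hypothetical data-dictionary facts (illustrative names, not QPROC's own).

% relation(Name, Attributes): a relation and the attributes it carries.
% The 'candidate' relation is an invented example.
relation(candidate, [name, party, constituency]).

% attribute_domain(Relation, Attribute, Domain): the domain of an attribute.
attribute_domain(candidate, party, party).

% domain_value(Domain, Value): values that may occur in the database,
% recorded whether or not they are present in the data at a given moment.
domain_value(party, conservative).
domain_value(party, labour).
domain_value(party, liberal).
domain_value(party, 'SDP').

% values_of(+Domain, -Values): collect every recorded value of a domain.
values_of(Domain, Values) :-
    findall(V, domain_value(Domain, V), Values).
```

With these facts loaded, the query `?- values_of(party, Vs).` collects the four party values in the order in which they are listed, independently of what the database currently holds.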
The data dictionary is combined with the system’s in-built basic vocabulary to yield the full vocabulary required for this application.
Finally there is a facility to define synonyms for any word.
Fig. 6.1 - Generation of the full vocabulary
3.2 The full vocabulary
In QPROC the full vocabulary and synonyms are stored as a list of PROLOG facts with the form
‘&word’(Part_Of_Speech, Continuation, Ending, Meaning)
The functor ‘&word’ is the natural language word preceded by a ‘&’ (to avoid name clashes).
Of particular interest is the ‘Continuation’.
An example entry is
‘&social’(noun, “democrat”, singular&_, party: ‘SDP’).
The ‘continuation’ is “democrat” and this tells QPROC to check, whenever it comes across the word “social”, that the next word is “democrat”.
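The continuation check can be sketched in a few lines of PROLOG. This is an illustration under stated assumptions rather than QPROC's own code: the word is held as the first argument of a hypothetical word/5 fact instead of being the functor, the continuation is a list of atoms ([] when there is none), and the predicates lookup/3 and starts_with/3 are invented for the example.

```prolog
% word(Word, PartOfSpeech, Continuation, Ending, Meaning)
% Simplified, hypothetical form of a vocabulary entry: Continuation is the
% list of words that must follow Word, or [] if nothing is required.
word(social, noun, [democrat], singular, party('SDP')).
word(labour, noun, [],         singular, party(labour)).

% starts_with(+Prefix, +Tokens, -Rest): Tokens begins with Prefix,
% and Rest is whatever follows it.
starts_with([], Rest, Rest).
starts_with([W|Ws], [W|Ts], Rest) :-
    starts_with(Ws, Ts, Rest).

% lookup(+Tokens, -Meaning, -Rest): match the first token against an entry
% and require the entry's continuation to come next in the input.
lookup([W|Ts], Meaning, Rest) :-
    word(W, _Pos, Continuation, _Ending, Meaning),
    starts_with(Continuation, Ts, Rest).
```

Under these assumptions, `?- lookup([social, democrat, candidates], M, R).` succeeds with `M = party('SDP')` and `R = [candidates]`, whereas `?- lookup([social, worker], M, R).` fails because the required continuation is absent.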
The task of mapping natural language input onto vocabulary entries is further complicated by inflexions: QPROC must also match “social democrats” and “social democratic [party]” onto the vocabulary entry.
Furthermore, some multiple word entries (e.g. “take offence”) inflect on the first word (“takes offence”) rather than the last word.
Non-standard endings are dealt with by synonyms. So “took” is entered as the perfect of “take”, by giving its meaning as “take”:
‘&took’(verb, nil, perfect, “take”).
The entry will now make “took offence” the perfect of “take offence”.
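One way the synonym mechanism could work is sketched below. It is a guess at the behaviour described, not QPROC's implementation: the entry/5 representation, the synonym(Base) wrapper that marks a meaning as pointing to another word, the ending label 'base', the meaning atom take_offence and the predicate resolve/4 are all assumptions introduced for the example.

```prolog
% entry(Word, PartOfSpeech, Continuation, Ending, Meaning)
% Hypothetical representation; 'base' and take_offence are invented labels.
entry(take, verb, [offence], base,    take_offence).
% Irregular form: the meaning slot points at the regular word.
entry(took, verb, [],        perfect, synonym(take)).

% starts_with(+Prefix, +Tokens, -Rest): Tokens begins with Prefix.
starts_with([], Rest, Rest).
starts_with([W|Ws], [W|Ts], Rest) :-
    starts_with(Ws, Ts, Rest).

% resolve(+Tokens, -Ending, -Meaning, -Rest)
% Clause 1: an irregular form keeps its own ending but borrows the
% continuation and meaning of the word it stands for.
resolve([W|Ts], Ending, Meaning, Rest) :-
    entry(W, verb, [], Ending, synonym(Base)),
    entry(Base, verb, Continuation, _, Meaning),
    starts_with(Continuation, Ts, Rest).
% Clause 2: an ordinary entry is matched directly.
resolve([W|Ts], Ending, Meaning, Rest) :-
    entry(W, verb, Continuation, Ending, Meaning),
    Meaning \= synonym(_),
    starts_with(Continuation, Ts, Rest).
```

With this sketch, `?- resolve([took, offence, yesterday], E, M, R).` gives `E = perfect`, `M = take_offence`, `R = [yesterday]`, so a single synonym entry is enough to cover the irregular form of the multi-word verb.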
The continuation can be more complicated than just a word, or list of words. It can, for example, be any word with a certain meaning. Thus in the election application the entry for “Mr” enabled any person’s name of the form “Mr X” to be parsed as a noun phrase: