METHOD AND APPARATUS FOR NATURAL LANGUAGE INTERROGATION BASED ON CONTEXTUAL ANALYSIS
It is widely recognized that the next generation of computers will be commanded and controlled through human voice. This technological leap involves speech recognition used in conjunction with language interpretation. The present paper refers only to language interpretation and, in particular, to a method and apparatus intended to perform intelligent human-computer communication through natural language.
Born as the application-oriented child of the theoretical domain of Computational Linguistics and the more data-oriented domain of Language Technology, the field of Applied Natural Language Processing now displays an ever-growing number and diversity of applications, ranging from computer-aided translation technologies to human-machine natural language interfaces, and from intelligent web searchers to multilingual abstraction and indexing mechanisms. Most scientific research in computational linguistics focuses on the creation of linguistic models that integrate the description of both general and particular aspects of a language. These theoretical models are built and verified on linguistic material that is usually found in large repositories of linguistic data known as corpora. Language models involve the definition of a grammar, as a set of language-specific rules and universal principles, which are then applied to analyze the syntactic-semantic structure of sentences. Grammar formalisms (such as Lexical Functional Grammar, Head-driven Phrase Structure Grammar, etc.) are able to provide almost complete case coverage and in-depth language analysis. Still, NL applications based on formal grammars are practically prohibitive, because an implementation for a specific domain requires a huge amount of work, extensive lexical expertise and a long validation procedure, and is usually very computationally demanding. As an alternative, the engineering of methods and instruments for specific computational processing of natural language is based on the investigation of particular aspects of the language in a specific language register and a well-delimited universe of discourse. Within these limits it is possible to imagine a rough syntactic analysis, or one that eludes syntax completely, as long as consistent and rather tight semantic constraints can be employed to control the gathering of meaning.
Compared to instruments based on grammar checking, these tools, although of limited applicability, can be practical and efficient alternatives: working solutions for specific problems. The above discussion helps to clarify the differences between the present method and other approaches to Natural Language processing. While most of the prior art belongs to the first direction, basing its approach on linguistic grammar and syntactic analysis, the present method and apparatus, referred to as the PARLEX system, follows the second direction. When it analyzes a phrase, PARLEX applies custom-defined contextual rules rather than linguistic grammar rules. In doing this, it does not care whether a word is a noun or a verb; it is interested instead in the meaning of that word in a certain context. To be reasonably efficient (e.g. to minimize the response time and provide adequate coverage), an implementation of this method deals with, but is not limited to, questions within a specific domain, known as the Universe of Discourse, and in a specific language, such as English. This universe represents a set of words and meanings related to the specific field of interest for which the PARLEX system is implemented, such as banking, insurance, weather forecast, computer configuration, ticket purchase etc.
2. Summary of the method and apparatus
The PARLEX system is a bi-contextual transducer, that is, a system aimed at transforming information from the plain NL sentence into its equivalent computational form by applying two-context rules. Rules are called contextual because their form allows two sources of knowledge to be taken into consideration simultaneously and, as such, a piece of information to be interpreted in the context of one piece or the other. The core of the system is an inference mechanism called the PARLEX engine, which processes the fact database to select rules able to generate the new fact database. Initially, the fact database is the plain NL phrase. At the end, the resulting fact database must contain at least the computational form of the phrase. Input to the system could be a query addressed in natural language to a database, a command addressed to a system, or simply a file that needs to be transformed into another form. There is no restriction on the form of the input, although the system is particularly suited to interpreting natural language. One of the features that makes PARLEX different from the prior art is that, by ignoring certain syntactic constructs, PARLEX exhibits a more robust behavior: syntactically invalid word associations that nevertheless bear a meaning are correctly interpreted, and the system gives the proper answer. This tolerance makes it usable even by people with medium literacy. As will be seen below, some syntactic ambiguities are superfluous from a semantic point of view, and therefore a method that parses NL by ignoring syntax will not even perceive them. PARLEX is a contextual engine for interpreting plain NL phrases. Its main goal is to transform a sentence into a computational form representation (i.e. one having no ambiguities and only one interpretation). The sentence is processed and transformed sequentially, by applying user-defined, customizable, contextual rules.
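The processing loop described above can be sketched as follows. This is only an illustrative sketch, not the PARLEX engine itself: the fact database is assumed to be a plain list of tokens, and the names `make_rewrite`, `run_engine` and the symbols `GT`, `SALARY` and `SALARY>` are hypothetical, introduced here for the example.

```python
from typing import Callable, List, Optional

Facts = List[str]
# A rule returns a new fact list, or None when it does not apply.
Rule = Callable[[Facts], Optional[Facts]]


def make_rewrite(pattern: List[str], replacement: List[str]) -> Rule:
    """Hypothetical rule: replace the first occurrence of `pattern`."""
    def rule(facts: Facts) -> Optional[Facts]:
        n = len(pattern)
        for i in range(len(facts) - n + 1):
            if facts[i:i + n] == pattern:
                return facts[:i] + replacement + facts[i + n:]
        return None
    return rule


def run_engine(facts: Facts, rules: List[Rule], max_steps: int = 1000) -> Facts:
    """Apply the first applicable rule repeatedly until none applies."""
    for _ in range(max_steps):
        for rule in rules:
            new_facts = rule(facts)
            if new_facts is not None and new_facts != facts:
                facts = new_facts
                break  # restart the scan on the new fact database
        else:
            return facts  # no rule changed anything: processing is done
    raise RuntimeError("rule set did not converge")


# Assumed toy rules for a payroll-like universe of discourse.
RULES = [
    make_rewrite(["more", "than"], ["GT"]),
    make_rewrite(["earn"], ["SALARY"]),
    make_rewrite(["SALARY", "GT"], ["SALARY>"]),
]

result = run_engine("employees earn more than 20000".split(), RULES)
print(result)  # ['employees', 'SALARY>', '20000']
```

The step limit is a safeguard added for the sketch; the source does not say how the engine treats rule sets that never reach a stable fact database.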
To be efficient, the system deals with domain-specific questions or sentences, which belong to a so-called "Universe of Discourse". Both the rules and the final computational form belong to the same universe as the original sentence. For instance, a system developed to answer questions about weather forecasts will be a novice when dealing with financial matters. What makes PARLEX different from other solutions for NL is the contextual processing mechanism, based on the fact that a sequence of words can have various meanings when combined with various other sequences. For instance, the word "bigger" could mean a simple comparison between two values, as in the phrase "account balances bigger than $1,000". It can also lead to a general comparison, involving a larger number of values, as in the phrase "which one of my accounts has a bigger balance". In this case it is normally understood as (and substituted with) "maximum". The grammar-free approach of PARLEX is more efficient than the traditional grammar checking used by other NL engines because it omits syntactic analysis and gives freedom and flexibility in passing from one context to another. For these reasons a PARLEX application is easier to implement, has smaller code and runs faster, as its convergence towards a final meaning is better. It is also more configurable and more portable from one domain (universe of discourse) to another.
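The two readings of "bigger" can be illustrated with a small sketch. The rule format used here (a target word, a context word, and the side on which the context must appear) is an assumption made for the example and is not the rule syntax of PARLEX; the names `ContextRule`, `interpret`, `GREATER_THAN` and `MAXIMUM` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class ContextRule(NamedTuple):
    target: str   # word to interpret
    context: str  # word that must co-occur with the target
    side: str     # "before" or "after" the target
    meaning: str  # symbol the target is translated into


# Assumed rules for the banking examples in the text.
RULES = [
    ContextRule("bigger", "than", "after", "GREATER_THAN"),
    ContextRule("bigger", "which", "before", "MAXIMUM"),
]


def interpret(tokens: List[str], target: str) -> Optional[str]:
    """Return the meaning of `target` selected by its context, if any."""
    if target not in tokens:
        return None
    i = tokens.index(target)
    for rule in RULES:
        if rule.target != target:
            continue
        window = tokens[i + 1:] if rule.side == "after" else tokens[:i]
        if rule.context in window:
            return rule.meaning
    return None


print(interpret("account balances bigger than $1,000".split(), "bigger"))
print(interpret("which one of my accounts has a bigger balance".split(), "bigger"))
```

The first call selects the pairwise comparison and the second selects the "maximum" reading, because the same word is matched against a different neighbouring sequence.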
3. An intuitive look on the disambiguation problem in NL processing
Rather frequently, natural language sentences contain groups of words whose meanings are different from the literal meaning of their parts taken together. Idiomatic expressions and technical jargon fall into this category. For instance, the Oxford Thesaurus (1991) gives as an example of an idiom "red herring", meaning "false trail", and therefore something that is neither red nor a herring. Such expressions must be considered without trying to break their meanings into elementary building blocks. To interpret them means to translate them directly into symbols that, conventionally or not, incorporate their meaning as a whole. Moreover, it is often the case that a word or a group of words has more than one meaning when taken in isolation from the surrounding context. Still, as soon as an environment is considered, the sense is usually restricted naturally to just one interpretation. Plenty of evidence for this can be found in any English thesaurus, where almost every word has a plethora of senses when taken in isolation, but as soon as a context is considered around it, just one of them remains meaningful in that combination. Consider these examples (from the Oxford Thesaurus, 1991) that display the meaning of the noun plant:
(E3.1) The new plant in Crawley is hiring lathe operators. – factory.
(E3.2) Because of the rains the plants are flourishing this summer. – flowers, trees, vegetables.
The noun plant in examples (E3.1) and (E3.2) denotes two different concepts. The contexts it is placed in are enough to suggest to the reader, in each case, the correct sense. Still, not everything in the surrounding text is significant for word or idiom sense disambiguation. In example (E3.1), the semantic environment in which the word plant participates, that of the actor of a "hiring" event, is enough to obtain the "factory" sense, while none of the other roles around the main event are significant for this disambiguation (In Crawley the plants are ever green. The plant was grown by the lathe operators of the factory.). In example (E3.2), neither because of the rains nor flourishing (and least of all in the summer) is, on its own, a proper disambiguating context for delimiting the "flowers-trees-vegetables" sense of plant (Because of the rains the plants had to temporarily cease production. Because of the general resurrection of the economy the plants are flourishing this summer.). Still, when taken together, because of the rains and flourishing succeed in establishing the "flowers-trees-vegetables" sense for plants. PARLEX is a rule-based system that has a special mechanism for dealing with context in order to disambiguate between word and idiom senses. In doing this it uses bi-contextual rules, a feature that distinguishes it from prior-art rule-based systems.
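The plant examples can be sketched as rules that require one or two context words to be present together. This is a simplified illustration under stated assumptions, not the PARLEX rule format: context is reduced to a bag of lowercase words, the plural "plants" is mapped to "plant" by hand, and the names `SENSE_RULES`, `tokens` and `sense_of` are hypothetical.

```python
import re
from typing import Optional, Set

# Each rule: (target word, context words that must ALL be present, sense).
# The second rule needs two contexts at once, as in example (E3.2).
SENSE_RULES = [
    ("plant", frozenset({"hiring"}), "factory"),
    ("plant", frozenset({"rains", "flourishing"}), "flowers-trees-vegetables"),
]


def tokens(sentence: str) -> Set[str]:
    """Lowercase bag of words, with 'plants' folded into 'plant'."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {"plant" if w == "plants" else w for w in words}


def sense_of(word: str, sentence: str) -> Optional[str]:
    """Return the sense selected by the context, or None if undecided."""
    bag = tokens(sentence)
    if word not in bag:
        return None
    for target, required, sense in SENSE_RULES:
        if target == word and required <= bag:
            return sense
    return None


print(sense_of("plant", "The new plant in Crawley is hiring lathe operators."))
print(sense_of("plant", "Because of the rains the plants are flourishing this summer."))
print(sense_of("plant", "Because of the rains the plants had to temporarily cease production."))
```

With these assumed rules, the third sentence stays undecided: "rains" alone does not select a sense, which mirrors the observation that only the two contexts taken together disambiguate example (E3.2).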
4. Components of PARLEX and their interaction
Simply said, PARLEX applies its transformation rules onto the defined worlds in a loop, until no more rules are applicable. As such, it obtains more and more refined computational objects (representations) until no more refinement can be performed. The processing ends with a computational representation of the original plain NL input. This representation ultimately has no ambiguities and can be understood and interpreted unequivocally by another software module (hence "computational"). As stated above, the type of text processing PARLEX performs is, in principle, independent of a syntactic analysis, but the contextual rule mechanism that it implements also allows an analyst to realize a grammar-based type of analysis, if desired. The type of analysis that we exemplify in the present document is a semantic-based one, which retains only the meaningful words and detects their relationships in a well-defined universe of discourse. In the case of a database interrogation application, or of a command language addressed to a computer-controlled machine, the universe of discourse is very well delimited. The PARLEX method is based on the observation that retaining only the words whose content is significant for the application at hand, and eliminating all the others that have only a syntax-related function, in most cases still leaves us with a phrase from which a human can easily recover an unambiguous interpretation. As examples, take the following pairs of original and filtered sequences:
(E4.1) Give me the list of employees in the toy department that earn more than $20,000. → employees toy department earn more than $20,000
(E4.2) I want all the departments in the 2nd floor that sell items supplied by Crescent Co. → departments 2 floor sell items supplied Crescent
Note that both sentences (E4.1) and (E4.2) are syntactically ambiguous, as the attributive phrase that earn more than $20,000 could be attached either to employees or to the toy department, and analogously that sell items supplied by Crescent Co. can complement either departments or the 2nd floor. Still, the conceptual representation of the attached universe of discourse filters out the interpretation in which a department can earn, or in which a floor can sell items, leaving just one interpretation in each example. The examples show the benefit of not passing through a syntactic analysis phase and jumping directly into a semantic interpretation.
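The filtering and the semantic constraint discussed for (E4.1) and (E4.2) can be sketched as follows. The lexicon contents, the table of which concepts may perform which action, and the function names are all assumptions made for illustration; they are not taken from an actual PARLEX lexicon or rule set.

```python
from typing import List

# Assumed content words of a toy "department store" universe of discourse.
LEXICON = {"employees", "toy", "department", "earn", "more", "than"}

# Assumed conceptual constraints: which concepts may be the actor of a verb.
ALLOWED_SUBJECTS = {
    "earn": {"employees"},
    "sell": {"departments"},
}


def filter_content(sentence: str) -> List[str]:
    """Keep lexicon words and values; drop purely syntactic words."""
    kept = []
    for raw in sentence.split():
        word = raw.strip(".,?!")
        is_value = word[:1] == "$" or word[:1].isdigit()
        if word.lower() in LEXICON or is_value:
            kept.append(word)
    return kept


def subjects_for(verb: str, filtered: List[str]) -> List[str]:
    """Candidates before the verb that the universe allows as its actor."""
    position = filtered.index(verb)
    allowed = ALLOWED_SUBJECTS.get(verb, set())
    return [w for w in filtered[:position] if w in allowed]


e41 = filter_content(
    "Give me the list of employees in the toy department "
    "that earn more than $20,000."
)
print(" ".join(e41))
print(subjects_for("earn", e41))

# (E4.2), starting from the filtered sequence given in the text.
e42 = "departments 2 floor sell items supplied Crescent".split()
print(subjects_for("sell", e42))
```

Because the constraint table has no entry allowing a department to earn or a floor to sell, each verb is left with a single actor without any syntactic attachment decision being made.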
A PARLEX application is structured into three main components: the lexicon, which stores the words or symbols and the families of synonyms; the set of rules, which contains the contextual rules defined by the analyst; and the engine, which processes the sentence based on the other two components. When one application is replaced by another, most of the lexicon and of the set of rules also changes, while the engine is kept.
Independent of the specific syntax employed for rule definitions, PARLEX is essentially a system allowing:
The input to a PARLEX application is a question formulated in plain NL and entered by the user through an input device (keyboard, speech recognition tools). The output is a computational, language-independent representation of the original input, with no ambiguities, ready to be translated into a computer command, an SQL-like database query etc. From a structural perspective, here is a graphical representation of PARLEX modules and their interaction. The data flow is in one direction only: once the sentence is received, the engine processes it and passes the computational form to the "Executor". Depending on the customized implementation of PARLEX, the Executor could launch applications, query databases, start motors etc.