| United States Patent | 5,331,556 |
| Black, Jr., et al. | Jul. 19, 1994 |
Method for natural language data processing using
morphological and part-of-speech information
| Inventors: | Black, Jr.; James Emmett (Schenectady, NY); Zernik; Uri (Schenectady, NY). |
| Assignee: | General Electric Company (Schenectady, NY). |
| Appl. No.: | 082,710 |
| Filed: | Jun. 28, 1993 |
An enhancement and retrieval method for natural language data using a computer is disclosed. The method includes executing linguistic analysis upon a text corpus file to derive morphological and part-of-speech information, as well as lexical variants, corresponding to respective corpus words. The derived linguistic information is then used to construct an enhanced text corpus file. A query text file is linguistically analyzed to construct a plurality of trigger token morphemes, which are then used to construct a search mask stream that is correlated with the enhanced text corpus file. A match between the search mask stream and the enhanced corpus file allows a user to retrieve selected portions of the enhanced text corpus.
26 Claims, 3 Drawing Figures
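
The flow described in the abstract (analyze the corpus, build an enhanced corpus, reduce a query to trigger morphemes, match a search mask against the enhanced corpus) can be illustrated with a minimal Python sketch. It is not the patented implementation. The lexicon `TOY_LEXICON`, the `Token` record, the function names, and the rule that a sentence matches when it contains every trigger root are all assumptions introduced here for illustration; a real system would rely on a full morphological analyzer and part-of-speech tagger.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon: surface form -> (root morpheme, part-of-speech tag).
# This stands in for a real morphological analyzer and tagger.
TOY_LEXICON = {
    "acquired": ("acquire", "VB"),
    "acquires": ("acquire", "VB"),
    "acquisition": ("acquire", "NN"),
    "company": ("company", "NN"),
    "companies": ("company", "NN"),
    "rival": ("rival", "NN"),
    "announced": ("announce", "VB"),
    "the": ("the", "DT"),
    "a": ("a", "DT"),
    "of": ("of", "IN"),
}

# Tags treated as function words and therefore not used as triggers (assumption).
FUNCTION_TAGS = {"DT", "IN"}


@dataclass(frozen=True)
class Token:
    surface: str  # word as it appears in the text, lowercased
    root: str     # root morpheme shared by lexical variants
    pos: str      # part-of-speech tag, "UNK" if not in the lexicon


def analyze(text):
    """Attach a root morpheme and part-of-speech tag to each word."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text.lower()):
        root, pos = TOY_LEXICON.get(word, (word, "UNK"))
        tokens.append(Token(word, root, pos))
    return tokens


def enhance_corpus(sentences):
    """Build the 'enhanced corpus': one list of annotated tokens per sentence."""
    return [analyze(sentence) for sentence in sentences]


def build_search_mask(query):
    """Reduce a query to its trigger roots, dropping function words."""
    return [t.root for t in analyze(query) if t.pos not in FUNCTION_TAGS]


def retrieve(sentences, enhanced, mask):
    """Return the original sentences whose roots include every trigger root."""
    hits = []
    for original, tokens in zip(sentences, enhanced):
        roots = {t.root for t in tokens}
        if all(root in roots for root in mask):
            hits.append(original)
    return hits
```

With this sketch, a query such as "companies acquires" reduces to the trigger roots `company` and `acquire`, so it would retrieve both "The company acquired a rival." and "The company announced the acquisition of a rival.", even though neither sentence contains the query's surface forms. Matching on unordered root sets is a simplification chosen here; the abstract does not specify how the search mask stream is correlated with the enhanced corpus.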