United States Patent 5,331,556
Black, Jr., et al. Jul. 19, 1994

Method for natural language data processing using morphological and part-of-speech information


Inventors: Black, Jr.; James Emmett (Schenectady, NY);
Zernik; Uri (Schenectady, NY).
Assignee: General Electric Company (Schenectady, NY).
Appl. No.: 082,710
Filed: Jun. 28, 1993


Abstract

A method for enhancing and retrieving natural language data using a computer is disclosed. The method includes performing linguistic analysis on a text corpus file to derive morphological and part-of-speech information, as well as lexical variants, corresponding to respective corpus words. The derived linguistic information is then used to construct an enhanced text corpus file. A query text file is linguistically analyzed to construct a plurality of trigger token morphemes, which are then used to construct a search mask stream that is correlated with the enhanced text corpus file. A match between the search mask stream and the enhanced text corpus file allows a user to retrieve selected portions of the enhanced text corpus.
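
The pipeline the abstract describes (annotate the corpus, reduce the query to root morphemes, then match) can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not specify how the search mask stream is encoded or correlated, so the sketch substitutes a simple per-sentence set-membership test. The lexicon `TOY_LEXICON`, the class `EnhancedToken`, and the functions `analyze`, `enhance_corpus`, `build_search_mask`, and `retrieve` are all hypothetical names introduced here for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon: surface form -> (root morpheme, part of speech).
# A real system would use a morphological analyzer and a POS tagger instead.
TOY_LEXICON = {
    "acquired": ("acquire", "VERB"),
    "acquires": ("acquire", "VERB"),
    "acquisition": ("acquire", "NOUN"),
    "companies": ("company", "NOUN"),
    "company": ("company", "NOUN"),
}


@dataclass(frozen=True)
class EnhancedToken:
    surface: str  # word as it appears in the text
    root: str     # root morpheme (falls back to the lowercased word)
    pos: str      # part-of-speech tag ("UNK" if not in the lexicon)


def analyze(word):
    """Attach root and part-of-speech information to a single word."""
    root, pos = TOY_LEXICON.get(word.lower(), (word.lower(), "UNK"))
    return EnhancedToken(word, root, pos)


def enhance_corpus(text):
    """Build the 'enhanced corpus': one list of annotated tokens per sentence."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [[analyze(w) for w in re.findall(r"[A-Za-z]+", s)] for s in sentences]


def build_search_mask(query):
    """Reduce each query word to its root morpheme (the 'trigger' morphemes)."""
    return [analyze(w).root for w in re.findall(r"[A-Za-z]+", query)]


def retrieve(enhanced, mask):
    """Return sentences whose token roots contain every root in the mask.

    Assumption: a plain set-membership test stands in for the patent's
    correlation of a search mask stream with the enhanced corpus.
    """
    hits = []
    for sentence in enhanced:
        roots = {token.root for token in sentence}
        if all(m in roots for m in mask):
            hits.append(" ".join(token.surface for token in sentence))
    return hits


corpus = (
    "Acme acquired two companies. "
    "The weather was fine. "
    "The acquisition pleased the company."
)
enhanced = enhance_corpus(corpus)
mask = build_search_mask("company acquires")
hits = retrieve(enhanced, mask)
```

With this toy lexicon, the query "company acquires" matches both the sentence containing "acquired ... companies" and the one containing "acquisition ... company", because matching happens on root morphemes rather than surface forms. The part-of-speech tag is stored but not used for filtering here; a fuller sketch could restrict matches by tag.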

26 Claims, 3 Drawing Figures




Primary Examiner: McElheny, Jr.; Donald
Attorney, Agent or Firm: Krauss; Geoffrey H.