Principle-Based Parsing for Machine Translation

Unknown author (1987-12-01)

Many syntactic parsing strategies for machine translation systems are based entirely on context-free grammars. These parsers require an overwhelming number of rules; as a result, translation systems that use rule-based parsers either have limited linguistic coverage or perform poorly because of the formidable size of the grammar. This report shows how a principle-based parser with a 'co-routine' design improves parsing for translation. The parser consists of a skeletal structure-building mechanism that operates in conjunction with a linguistically based constraint module; the two pass control back and forth until a set of underspecified skeletal phrase structures has been converted into a fully instantiated parse tree. The modularity of the parsing design accommodates linguistic generalization, reduces the grammar size, allows extension to other languages, and is compatible with studies of human language processing.
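
The back-and-forth control flow described above can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the report's parser: the names `structure_builder`, `apply_constraints`, `parse`, and `LEXICON` are hypothetical, the "tree" is a flat list of nodes, and a simple lexicon lookup stands in for the linguistically based constraint module. It shows only how a structure builder can propose underspecified nodes and hand control to a separate module that instantiates or rejects them.

```python
# Hypothetical toy lexicon; a stand-in for real linguistic constraints.
LEXICON = {"the": "Det", "dog": "N", "barks": "V"}


def structure_builder(words):
    """Coroutine that proposes one underspecified skeletal node per word.

    Each `yield` hands control to the constraint module. The value sent
    back is the instantiated node, or None if the node was rejected.
    """
    nodes = []
    for word in words:
        skeleton = {"word": word, "cat": None}  # category not yet known
        instantiated = yield skeleton           # pass control to constraints
        if instantiated is None:
            raise ValueError(f"no analysis for {word!r}")
        nodes.append(instantiated)
    return nodes


def apply_constraints(skeleton):
    """Toy constraint module: fill in the category, or reject with None."""
    cat = LEXICON.get(skeleton["word"])
    if cat is None:
        return None
    return {**skeleton, "cat": cat}


def parse(words):
    """Drive the two modules, passing control back and forth."""
    builder = structure_builder(words)
    try:
        skeleton = next(builder)  # run the builder to its first proposal
        while True:
            # Instantiate the proposal, then resume the builder with it.
            skeleton = builder.send(apply_constraints(skeleton))
    except StopIteration as done:
        return done.value         # the builder's fully instantiated result


if __name__ == "__main__":
    print(parse(["the", "dog", "barks"]))
```

Keeping the constraint check in its own function means that, in this sketch, the structure builder never mentions a grammatical category; swapping in a different `apply_constraints` changes the analysis without touching the builder, which is the kind of modularity the abstract attributes to the design.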