GPSG-Recognition is NP-Hard

Unknown author (1985-03-01)

Proponents of generalized phrase structure grammar (GPSG) cite its weak context-free generative power as proof of the computational tractability of GPSG-Recognition. Since context-free languages (CFLs) can be parsed in time proportional to the cube of the sentence length, and GPSGs only generate CFLs, it seems plausible the GPSGs can also be parsed in cubic time. This longstanding, widely assumed GPSG "efficient parsability" result in misleading: parsing the sentences of an arbitrary GPSG is likely to be intractable, because a reduction from 3SAT proves that the universal recognition problem for the GPSGs of Gazdar (1981) is NP-hard. Crucially, the time to parse a sentence of a CFL can be the product of sentence length cubed and context-free grammar size squared, and the GPSG grammar can result in an exponentially large set of derived context-free rules. A central object in the 1981 GPSG theory, the metarule, inherently results in an intractable parsing problem, even when severely constrained. The implications for linguistics and natural language parsing are discussed.