Program Understanding through Cliché Recognition

Unknown author (1981-12)

Working Paper

We propose research into automatic program understanding via recognition of common data structures and algorithms (clichés). Our goals are twofold: first, to develop a theory of program structure which makes such recognition tractable; and second, to produce a program (named Inspector) which, given a Lisp program and a library of clichés, will construct a hierarchical decomposition of the program in terms of the clichés it uses. Our approach constrains the possible decompositions of a program according to the teleological relations between its parts. Programs are analyzed by translating them into a language-independent form and then parsing this representation in accordance with a context-free web grammar induced by the library of clichés. Decompositions produced by this analysis will in general be partial, since most programs will not be made up entirely of clichés. This work is motivated by the belief that identification of the clichés used in a program, together with knowledge of their properties, provides a sufficient basis for understanding large parts of that program's behavior. Inspector will become one component of a system of programs known as a programmer's apprentice, in which Inspector's output will be used to assist a programmer with program synthesis, debugging, and maintenance.
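
As a rough illustration of the analysis described above, the following Python sketch reduces a small dataflow graph against a toy library of clichés and returns a hierarchical, possibly partial, decomposition. It is not Inspector's algorithm: a greedy bottom-up reduction stands in for parsing with a context-free web grammar, and the operation labels, the rule format, and the names `find_match` and `recognize` are all hypothetical choices made for this example.

```python
from itertools import permutations

# Assumed rule format (hypothetical): (cliche_name, part_labels, pattern_edges),
# where pattern_edges are (i, j) index pairs into part_labels meaning
# "data flows from part i to part j". Each rule must have at least two parts.

def find_match(nodes, edges, rule):
    """Return a tuple of node ids matching the rule's parts, or None."""
    _, labels, pattern_edges = rule
    for combo in permutations(sorted(nodes), len(labels)):
        labels_ok = all(nodes[n] == lab for n, lab in zip(combo, labels))
        edges_ok = all((combo[i], combo[j]) in edges for i, j in pattern_edges)
        if labels_ok and edges_ok:
            return combo
    return None

def recognize(nodes, edges, rules):
    """Greedily collapse matched subgraphs into single cliche nodes.

    Returns (remaining_nodes, tree), where tree maps each new cliche node
    to the list of parts it was built from.
    """
    assert all(len(rule[1]) >= 2 for rule in rules)  # guarantees termination
    nodes, edges = dict(nodes), set(edges)
    tree, counter, changed = {}, 0, True
    while changed:
        changed = False
        for rule in rules:
            match = find_match(nodes, edges, rule)
            if match is None:
                continue
            counter += 1
            new_id = f"{rule[0]}#{counter}"
            tree[new_id] = list(match)
            matched = set(match)
            # Redirect edges that cross the match boundary to the new node;
            # edges internal to the match disappear.
            redirected = set()
            for a, b in edges:
                a2 = new_id if a in matched else a
                b2 = new_id if b in matched else b
                if a2 != b2:
                    redirected.add((a2, b2))
            edges = redirected
            for n in match:
                del nodes[n]
            nodes[new_id] = rule[0]
            changed = True
            break  # restart from the first rule on the reduced graph
    return nodes, tree

# Toy library of cliches (assumed for illustration only).
rules = [
    ("list-enumeration", ("cdr", "null?", "car"), {(0, 1), (0, 2)}),
    ("accumulation", ("zero", "plus"), {(0, 1)}),
    ("sum-of-list", ("list-enumeration", "accumulation"), {(0, 1)}),
]

# Toy flow graph for a program that sums a list and prints the result.
nodes = {"a": "cdr", "b": "null?", "c": "car",
         "d": "zero", "e": "plus", "f": "print"}
edges = {("a", "b"), ("a", "c"), ("c", "e"), ("d", "e"), ("e", "f")}

remaining, tree = recognize(nodes, edges, rules)
```

In this toy run the enumeration and accumulation parts are grouped first and then combined into a single `sum-of-list` node, while the `print` operation matches no rule and is left over, which mirrors the point that decompositions will in general be partial. A real web-grammar parser would also need to check how a matched subgraph is embedded in its surroundings and to explore alternative parses, both of which this sketch omits.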