Assigning Hierarchical Descriptions to Visual Assemblies of Blocks with Occlusion
This report describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the laboratory's artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contract N00014-75-C-0643.
This memo describes a program for parsing simple two-dimensional piles of blocks into plausible nested subassemblies. Each subassembly must be one of a few types known to the program, such as stack, tower, or arch. Each subassembly has the overall shape of a single block, allowing it to behave as part of another subassembly. Occlusion is represented by an area of the image plane whose contents cannot be seen.
Heuristic aspects of the program are concerned with 1) ambiguity among competing subassemblies due to sloppy placement of the blocks, 2) ambiguity due to uncertain measurements of blocks which are partially occluded, and 3) total ambiguity as to the contents of the occluded region.
Choice among competing subassemblies is accomplished by first making a topological description of the network of conflicts among subassemblies, then considering only the simplest competing subset. If this does not clearly indicate a winner, the system can make an in-depth comparison of the internal structures of the last two competing subassemblies.
Uncertainty as to measurements of blocks is handled by creating a disjunction of more certain blocks, each of which participates in the parsing process. If this disjunction results in a pair of competing subassemblies, only one is used, the other being hidden as an alternate to the first, so that the choice of which will be accepted can be deferred. The choice is deferrable because the alternate subassemblies are so similar that the parsing process does not depend on choosing one of them.
Uncertainty due to occlusion is handled by allowing a potential subassembly to use the occluded area as a "wild card": if the subassembly can be completed by creating a block which intersects the occluded area, it is so completed. Such an imaginary block may later be consolidated with a real one, or it may remain imaginary.
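The "wild card" treatment of occlusion can be illustrated with a small sketch. This is not the memo's program; it is a minimal Python illustration under stated assumptions: blocks and the occluded area are axis-aligned rectangles, a "stack" is simply a vertical run of blocks whose faces touch, and the names Rect, Block, and complete_stack are hypothetical, as is the tolerance parameter tol.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in the image plane (an assumption of this sketch)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Rect") -> bool:
        # True when the two rectangles share some interior area.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass(frozen=True)
class Block:
    rect: Rect
    imaginary: bool = False  # True for blocks invented to fill an occluded gap


def complete_stack(blocks: List[Block], occluded: Rect,
                   tol: float = 0.1) -> Optional[List[Block]]:
    """Try to read `blocks` as one vertical stack (hypothetical criterion).

    A gap between consecutive blocks is tolerated only if an imaginary
    block filling that gap would intersect the occluded area. Returns the
    completed stack from bottom to top, or None if it cannot be completed.
    """
    if not blocks:
        return None
    ordered = sorted(blocks, key=lambda b: b.rect.y0)
    result = [ordered[0]]
    for upper in ordered[1:]:
        lower = result[-1]
        gap = upper.rect.y0 - lower.rect.y1
        if gap < -tol:
            return None  # blocks overlap vertically: not a stack
        if gap > tol:
            # Candidate imaginary block spanning the gap between the two.
            filler = Rect(max(lower.rect.x0, upper.rect.x0), lower.rect.y1,
                          min(lower.rect.x1, upper.rect.x1), upper.rect.y0)
            if filler.x0 >= filler.x1 or not filler.intersects(occluded):
                return None  # the gap is visibly empty, so no wild card
            result.append(Block(filler, imaginary=True))
        result.append(upper)
    return result


bottom = Block(Rect(0, 0, 2, 1))
top = Block(Rect(0, 2, 2, 3))
hidden = Rect(1, 0.5, 3, 2.5)  # occluded area covering part of the gap
stack = complete_stack([top, bottom], hidden)
```

Here the gap between the two real blocks overlaps the occluded area, so the stack is completed with one imaginary block. A later pass could consolidate that block with a real one or leave it imaginary, as the text describes; that step is not sketched here.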
The reason for studying this problem is to become acquainted with the programs and data structures needed to assign a nested structural description to a complicated visual assembly in which occlusion makes the data incomplete. The extension to three-dimensional descriptions should be straightforward.