Using Active Learning to Synthesize Models of Applications That Access Databases

Unknown author (2018-08-28)


We present a new technique that uses active learning to infer models of applications that manipulate relational databases. This technique comprises a domain-specific language for modeling applications that access databases (each model is a program in this language) and an associated inference algorithm that infers models of applications whose behavior can be expressed in this language. The inference algorithm generates test inputs and database configurations, runs the application, then observes the resulting database traffic and outputs to progressively refine its current model hypothesis. The end result is a model that completely captures the behavior of the application. Because the technique works only with the externally observable inputs, outputs, and databases, it can infer the behavior of applications written in arbitrary languages using arbitrary coding styles (as long as the behavior of the application is expressible in the domain-specific language). We also present a technique for automatically regenerating an implementation from the inferred model. The regenerator can produce a translated implementation in a different language and systematically include relevant security and error checks.
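To make the probe, observe, and refine loop concrete, the following is a minimal sketch and not the inference algorithm of this work. It assumes a toy hypothesis space of three hand-written candidate models and a fixed pool of probes, and it runs a probe only when the surviving candidates disagree on it. Every name in it (`RecordingDB`, `black_box_app`, `CANDIDATES`, `PROBE_POOL`, `infer_model`) is hypothetical and introduced purely for illustration; the domain-specific language itself is not modeled here.

```python
import sqlite3

LOOKUP_SQL = "SELECT name FROM users WHERE id = ?"

class RecordingDB:
    """Hypothetical proxy: holds a database configuration and records all traffic."""
    def __init__(self, rows):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
        self.conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
        self.traffic = []

    def query(self, sql, params=()):
        self.traffic.append((sql, tuple(params)))  # externally observable traffic
        return self.conn.execute(sql, params).fetchall()

def black_box_app(db, user_id):
    """Stand-in for the application under inference; the learner never reads this."""
    rows = db.query(LOOKUP_SQL, (user_id,))
    return rows[0][0] if rows else "not found"

def name_for(rows, wanted_id):
    """Name stored under wanted_id in the given configuration, if any."""
    for row_id, name in rows:
        if row_id == wanted_id:
            return name
    return "not found"

# Assumed hypothesis space: each candidate predicts (traffic, output) for a probe.
CANDIDATES = {
    "lookup_by_id": lambda rows, uid: (((LOOKUP_SQL, (uid,)),), name_for(rows, uid)),
    "lookup_fixed_id": lambda rows, uid: (((LOOKUP_SQL, (1,)),), name_for(rows, 1)),
    "ignores_result": lambda rows, uid: (((LOOKUP_SQL, (uid,)),), "not found"),
}

# Assumed probe pool: (database configuration, input) pairs.
PROBE_POOL = [
    ([(1, "alice")], 1),
    ([], 1),
    ([(1, "alice"), (2, "bob")], 2),
]

def infer_model(app, candidates, probe_pool):
    """Run only probes on which surviving candidates disagree, then prune."""
    alive = dict(candidates)
    runs = 0
    for rows, user_id in probe_pool:
        predictions = {name: model(rows, user_id) for name, model in alive.items()}
        if len(set(predictions.values())) < 2:
            continue  # every survivor agrees, so this probe cannot refine anything
        db = RecordingDB(rows)
        output = app(db, user_id)
        observed = (tuple(db.traffic), output)
        runs += 1
        alive = {n: m for n, m in alive.items() if predictions[n] == observed}
    return sorted(alive), runs

survivors, runs = infer_model(black_box_app, CANDIDATES, PROBE_POOL)
print(survivors, runs)  # prints: ['lookup_by_id'] 2
```

In this sketch the second probe is skipped because the two remaining candidates predict identical traffic and output for it, which is the sense in which the input selection is active rather than exhaustive. A realistic hypothesis space would be the set of programs in the modeling language rather than a hand-enumerated list.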