dc.description.abstract | Density estimation is a fundamental statistical problem. Many methods are either
sensitive to model misspecification (parametric models) or difficult to calibrate, especially
for multivariate data (nonparametric smoothing methods). We propose an alternative
approach using maximum likelihood under a qualitative assumption on the shape of
the density, specifically log-concavity. The class of log-concave densities includes many
common parametric families and has desirable properties. For univariate data, these
estimators are relatively well understood and are gaining popularity in both theory and
practice. We discuss extensions to multivariate data, which require different techniques.
After establishing existence and uniqueness of the log-concave maximum likelihood
estimator for multivariate data, we show that a reformulation allows it to be computed
using standard convex optimization techniques. Unlike kernel density estimation and
other nonparametric smoothing methods, this is a fully automatic procedure that
requires no additional tuning parameters.
Since the assumption of log-concavity is non-trivial, we introduce a method for
assessing the suitability of this shape constraint and apply it to several simulated datasets
and one real dataset. Density estimation is often one stage in a more complicated
statistical procedure. With this in mind, we show how the estimator may be used for
plug-in estimation of statistical functionals. A second important extension is the use of
log-concave components in mixture models. We illustrate how an EM-style algorithm
may be used to fit mixture models in which the number of components is known. Applications
to visualization and classification are presented. In the latter case, improvement over a
Gaussian mixture model is demonstrated.
Performance for density estimation is evaluated in two ways. Firstly, we consider
Hellinger convergence (the usual metric for theoretical convergence results on nonparametric
maximum likelihood estimators). We prove consistency with respect to this metric
and heuristically discuss rates of convergence and model misspecification, supported
by empirical investigation. Secondly, we use the mean integrated squared error to
demonstrate favourable performance compared with kernel density estimates using a
variety of bandwidth selectors, including sophisticated adaptive methods.
Throughout, we emphasise the development of stable numerical procedures able to
handle the additional complexity of multivariate data. | |