Spectral methods and computational trade-offs in high-dimensional statistical inference

Wang, Tengyao

dc.creator	Wang, Tengyao
dc.date.accessioned	2018-11-24T23:26:50Z
dc.date.available	2016-10-19T14:07:35Z
dc.date.available	2018-11-24T23:26:50Z
dc.date.issued	2016-10-04
dc.identifier	https://www.repository.cam.ac.uk/handle/1810/260825
dc.identifier.uri	http://repository.aust.edu.ng/xmlui/handle/123456789/3896
dc.description.abstract	Spectral methods have become increasingly popular in designing fast algorithms for modern high-dimensional datasets. This thesis looks at several problems in which spectral methods play a central role. In some cases, we also show that such procedures have essentially the best performance among all randomised polynomial time algorithms by exhibiting statistical and computational trade-offs in those problems. In the first chapter, we prove a useful variant of the well-known Davis-Kahan theorem, which is a spectral perturbation result that allows us to bound the distance between population eigenspaces and their sample versions. We then propose a semi-definite programming algorithm for the sparse principal component analysis (PCA) problem, and analyse its theoretical performance using the perturbation bounds we derived earlier. It turns out that the parameter regime in which our estimator is consistent is strictly smaller than the consistency regime of a minimax optimal (yet computationally intractable) estimator. We show through reduction from a well-known hard problem in computational complexity theory that the difference in consistency regimes is unavoidable for any randomised polynomial time estimator, hence revealing subtle statistical and computational trade-offs in this problem. Such computational trade-offs also exist in the problem of restricted isometry certification. Certifiers for restricted isometry properties can be used to construct design matrices for sparse linear regression problems. As in the sparse PCA problem, we show that there is also an intrinsic gap between the class of matrices certifiable using unrestricted algorithms and the class certifiable using polynomial time algorithms. Finally, we consider the problem of high-dimensional changepoint estimation, where we estimate the time of change in the mean of a high-dimensional time series with piecewise constant mean structure. Motivated by real-world applications, we assume that changes only occur in a sparse subset of all coordinates. We apply a variant of the semi-definite programming algorithm in sparse PCA to aggregate the signals across different coordinates in a near-optimal way so as to estimate the changepoint location as accurately as possible. Our statistical procedure shows superior performance compared to existing methods in this problem.
dc.language	en
dc.publisher	Department of Pure Mathematics and Mathematical Statistics, University of Cambridge
dc.publisher	University of Cambridge
dc.publisher	Department of Pure Mathematics and Mathematical Statistics
dc.publisher	Faculty of Mathematics
dc.publisher	St John's College
dc.subject	Research Subject Categories::MATHEMATICS::Applied mathematics::Mathematical statistics
dc.subject	spectral methods
dc.subject	Davis-Kahan theorem
dc.subject	principal component analysis
dc.subject	PCA
dc.subject	restricted isometry
dc.subject	high-dimensional changepoint estimation
dc.subject	semi-definite programming
dc.title	Spectral methods and computational trade-offs in high-dimensional statistical inference
dc.type	Thesis

Files in this item

Files	Size	Format	View
Wang-2016-PhD.pdf	1.518Mb	application/pdf	View/Open

This item appears in the following Collection(s)

Department of Pure Mathematics and Mathematical Statistics (DPMMS) 248
