Notes on Regularized Least Squares

Unknown author (2007-05-01)

This is a collection of information about regularized least squares (RLS). The facts here are not new results, but we have not seen them usefully collected together before. A key goal of this work is to demonstrate that with RLS, we get certain things for free: if we can solve a single supervised RLS problem, we can search for a good regularization parameter lambda at essentially no additional cost.

The discussion in this paper applies to dense regularized least squares, where we work with matrix factorizations of the data or kernel matrix. It is also possible to work with iterative methods such as conjugate gradient, and this is frequently the method of choice for large data sets in high dimensions with very few nonzero dimensions per point, such as text classification tasks. The results discussed here do not apply to iterative methods, which have different design tradeoffs.

We present the results in greater detail than strictly necessary, erring on the side of showing our work. We hope that this will be useful to people trying to learn more about linear algebra manipulations in the machine learning context.
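To make the "free lambda search" claim concrete, the following is a minimal Python/NumPy sketch. It assumes the kernel RLS convention c = (K + lambda*I)^{-1} y; the function and variable names are illustrative and not taken from this document. The point is that one eigendecomposition of the kernel matrix (an O(n^3) cost) is paid once, after which the solution for each additional lambda costs only O(n^2).

    import numpy as np

    def rls_solutions(K, y, lambdas):
        """Sketch: solve c = (K + lam*I)^{-1} y for many values of lam.

        Assumes K is symmetric positive semidefinite. The eigendecomposition
        is computed once; each additional lam reuses it, which is the
        "essentially no additional cost" referred to in the text.
        """
        eigvals, Q = np.linalg.eigh(K)      # K = Q diag(eigvals) Q^T
        Qty = Q.T @ y                        # projected targets, computed once
        solutions = {}
        for lam in lambdas:
            # (K + lam*I)^{-1} y = Q diag(1/(eigvals + lam)) Q^T y
            solutions[lam] = Q @ (Qty / (eigvals + lam))
        return solutions

    # Toy usage with a random linear-kernel matrix and random targets.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))
    K = X @ X.T
    y = rng.standard_normal(50)
    cs = rls_solutions(K, y, lambdas=[1e-3, 1e-1, 1.0, 10.0])

In practice one would combine this with leave-one-out or validation-set error at each lambda, but the cost structure is the same: the expensive factorization is shared across all candidate values of the regularization parameter.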