A Projected Subgradient Method for Scalable Multi-Task Learning
Recent approaches to multi-task learning have investigated the use of a variety of matrix norm regularization schemes for promoting feature sharing across tasks.In essence, these approaches aim at extending the l1 framework for sparse single task approximation to the multi-task setting. In this paper we focus on the computational complexity of training a jointly regularized model and propose an optimization algorithm whose complexity is linear with the number of training examples and O(n log n) with n being the number of parameters of the joint model. Our algorithm is based on setting jointly regularized loss minimization as a convex constrained optimization problem for which we develop an efficient projected gradient algorithm. The main contribution of this paper is the derivation of a gradient projection method with l1â â constraints that can be performed efficiently and which has convergence rates.