Efficient Algorithms for Learning Sparse Models from Large Amounts of Data

Yoram Singer (Google Inc.)	Yoram Singer ... and the Magic Broom

We will review the design, analysis, and implementation of several sparsity-promoting learning algorithms. We start with a projected gradient algorithm built on an efficient projection onto the L1 ball. We then describe a forward-backward splitting (Fobos) method that incorporates L1 and mixed-norm regularization. We next present adaptive gradient versions of the above methods that generalize well-studied sub-gradient methods. We conclude with a description of a recent approach for "sparse counting", which facilitates compact yet accurate language modeling.
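
As a rough illustration of the first two ingredients named above, the following is a minimal sketch in plain Python, not the speaker's implementation. It assumes the standard sort-based Euclidean projection onto the L1 ball and the soft-thresholding form of a forward-backward step with an L1 penalty. The function names and the parameters z (ball radius), eta (step size), and lam (regularization weight) are hypothetical and chosen only for this example.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def project_onto_l1_ball(v, z=1.0):
    """Euclidean projection of v onto {w : ||w||_1 <= z}, via sorting."""
    if z <= 0:
        raise ValueError("radius z must be positive")
    magnitudes = [abs(x) for x in v]
    if sum(magnitudes) <= z:
        return list(v)  # already inside the ball, nothing to do
    # Scan magnitudes in decreasing order to find the shrinkage amount theta.
    # The condition below holds for a prefix of the sorted list, so the last
    # index that satisfies it determines theta.
    theta = 0.0
    running_sum = 0.0
    for j, u_j in enumerate(sorted(magnitudes, reverse=True), start=1):
        running_sum += u_j
        candidate = (running_sum - z) / j
        if u_j - candidate > 0:
            theta = candidate
    # Shrinking every coordinate by theta lands exactly on the ball's surface.
    return [soft_threshold(x, theta) for x in v]


def forward_backward_l1_step(w, grad, eta, lam):
    """One forward-backward step: gradient step, then L1 shrinkage."""
    # Forward step: plain (sub)gradient descent on the loss.
    half_step = [w_i - eta * g_i for w_i, g_i in zip(w, grad)]
    # Backward step: closed-form solution of the L1-regularized proximal
    # problem, which is coordinate-wise soft-thresholding by eta * lam.
    return [soft_threshold(x, eta * lam) for x in half_step]


if __name__ == "__main__":
    print(project_onto_l1_ball([3.0, 1.0, 0.0], z=2.0))
    print(forward_backward_l1_step([2.0, -0.25], [0.5, 0.0], eta=1.0, lam=0.5))
```

Both routines set small coordinates exactly to zero rather than merely shrinking them, which is the source of the sparsity in the learned weights. The sort makes this version of the projection O(n log n) in the dimension n.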