Abstract
Variable splitting is a classic approach to decomposing a difficult optimization problem into an iterative sequence of easier sub-problems. This framework has found great success in solving signal processing problems across a wide variety of disciplines. Recently, several variable-splitting techniques have been proposed for training deep neural networks, an extremely challenging optimization task. However, these algorithms do not scale to real-world problem sizes because they are largely offline. The first work presented in this dissertation is a stochastic, online alternating-minimization algorithm for training deep neural networks via variable splitting, designed in collaboration with the IBM Thomas J. Watson Research Center. The proposed method demonstrates strong improvements over offline variable-splitting approaches to neural network training and also shows advantages over popular state-of-the-art backpropagation-based training algorithms.
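As background on the variable-splitting idea, one standard reformulation of L-layer network training (shown here for illustration; it is not necessarily the exact formulation used in this dissertation) introduces auxiliary activation variables and relaxes the coupling constraints into quadratic penalties:
\[
\min_{\{W_\ell\},\,\{a_\ell\}} \; \mathcal{L}(a_L, y) \;+\; \frac{\rho}{2}\sum_{\ell=1}^{L} \big\| a_\ell - \sigma(W_\ell a_{\ell-1}) \big\|_2^2 ,
\]
where $a_0$ denotes the network input, $\sigma$ the activation function, and $\mathcal{L}$ the training loss. Alternating minimization then cycles between updating the weights $\{W_\ell\}$ and the auxiliary activations $\{a_\ell\}$, each sub-problem being simpler than the original nested objective.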
After exploring how variable splitting can enhance deep learning, we ask a question in the other direction: how can deep learning techniques enhance variable-splitting methods? One of the most widely used variable-splitting frameworks, the Alternating Direction Method of Multipliers (ADMM), is an extremely powerful optimizer but can take many tens of iterations to reach a good approximate solution. We propose a framework that uses deep learning principles to accelerate ADMM on L1-regularized linear least squares problems. The learned algorithm is generated by the recently proposed technique known as "unrolling", which in our case means implementing truncated ADMM (i.e., ADMM run for a fixed number of iterations) as a neural network. Deep learning techniques can then be applied to strategically modify certain naturally arising linear operators of ADMM with data-driven corrections learned from a training dataset.
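For concreteness, the standard (scaled-form) ADMM iterations for the L1-regularized least squares (lasso) problem referenced above are the following well-known updates; they are given here as background, since the learned corrections in this work act on operators arising in iterations of this kind:
\[
\min_{x,\,z}\ \tfrac{1}{2}\|Ax - b\|_2^2 + \lambda\|z\|_1 \quad \text{subject to } x - z = 0,
\]
\[
x^{k+1} = \big(A^\top A + \rho I\big)^{-1}\big(A^\top b + \rho(z^k - u^k)\big), \qquad
z^{k+1} = S_{\lambda/\rho}\big(x^{k+1} + u^k\big), \qquad
u^{k+1} = u^k + x^{k+1} - z^{k+1},
\]
where $S_{\kappa}$ is the soft-thresholding operator and $u$ is the scaled dual variable. Unrolling truncates these iterations at a fixed depth $K$ and implements them as the layers of a network, within which selected linear operators can be adjusted by data-driven corrections.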
We empirically validate the resulting algorithm in a variety of sparse coding settings, i.e., where the goal is to provide a sparse representation of a signal with respect to a given dictionary. Of particular interest is the challenging multiple-dictionary setting given by morphological component analysis (MCA), a generalization of sparse coding in which the goal is to decompose a signal into additive parts such that each part has a distinct sparse representation in an appropriately chosen dictionary. Our algorithm demonstrates substantial acceleration over common baselines in both settings. Finally, we present a theoretical framework showing that the resulting algorithm is still an instance of ADMM, but applied to a new, learned cost function. By extending a recently proposed Stochastic Alternating Optimization analysis framework, we show that a gradient descent step along this learned loss landscape is equivalent to a modified gradient descent step along the original loss landscape. In this framework, the achieved acceleration can potentially be explained by the network's ability to learn a correction to the gradient that yields a direction of steeper descent.
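For reference, the MCA problem mentioned above is commonly posed as follows (an illustrative formulation; the notation here is ours):
\[
\min_{x_1,\dots,x_C}\ \tfrac{1}{2}\Big\| y - \sum_{c=1}^{C} D_c x_c \Big\|_2^2 + \sum_{c=1}^{C} \lambda_c \|x_c\|_1,
\]
where $y$ is the observed signal, each $D_c$ is the dictionary associated with one morphological component, and the recovered additive parts are $D_c x_c$. Ordinary single-dictionary sparse coding is the special case $C = 1$.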