Abstract

Variable splitting is a classic approach to decomposing a difficult optimization problem into an iterative sequence of easier sub-problems. This framework has found great success in solving signal processing problems across a wide variety of disciplines. Recently, several variable splitting techniques have been proposed for training deep neural networks, an extremely challenging optimization task. However, these algorithms do not scale to real-world problem sizes because they are largely offline. The first work presented in this dissertation is a stochastic, online alternating minimization algorithm for training deep neural networks via variable splitting, developed in collaboration with the IBM Thomas J. Watson Research Center. The proposed method demonstrates strong improvements over offline variable splitting approaches to neural network training, and also shows advantages over popular state-of-the-art backpropagation-based training algorithms.
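To make the variable splitting idea concrete, the sketch below uses generic notation of our own (not necessarily the dissertation's) for one common way the layer activations of an L-layer network are introduced as auxiliary variables and the layer constraints relaxed into quadratic penalties, so that alternating minimization can update one block of variables at a time:

\[
\min_{\{W_\ell\},\,\{a_\ell\}} \;\; \mathcal{L}(a_L, y) \;+\; \frac{\rho}{2}\sum_{\ell=1}^{L} \big\| a_\ell - \sigma(W_\ell a_{\ell-1}) \big\|_2^2, \qquad a_0 = x,
\]

where \(\sigma\) is the layer nonlinearity. Each sub-problem updates a single \(W_\ell\) or \(a_\ell\) with the other variables held fixed, and a stochastic, online variant would replace the full-dataset objective with mini-batch estimates.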

After exploring how variable splitting can enhance deep learning, we ask a question in the other direction: how can deep learning techniques enhance variable splitting methods? One of the most widely used variable splitting frameworks, the Alternating Direction Method of Multipliers (ADMM), is an extremely powerful optimizer but can take many tens of iterations to reach a good approximate solution. We propose a framework that uses deep learning principles to accelerate ADMM on L1-regularized linear least squares problems. The learned algorithm is generated by the recently proposed technique known as "unrolling", which in our case amounts to implementing truncated ADMM (i.e., ADMM run for a fixed number of iterations) as a neural network. Deep learning techniques can then be applied to strategically modify certain naturally arising linear operators of ADMM with data-driven corrections learned from a training dataset.
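For reference, the L1-regularized least squares (lasso) problem and the standard scaled-form ADMM iteration it induces are (standard notation, not specific to the dissertation):

\[
\min_{x,\,z}\;\; \tfrac{1}{2}\|Ax - b\|_2^2 + \lambda\|z\|_1 \quad \text{subject to } x - z = 0,
\]
\[
\begin{aligned}
x^{k+1} &= (A^\top A + \rho I)^{-1}\big(A^\top b + \rho\,(z^{k} - u^{k})\big),\\
z^{k+1} &= S_{\lambda/\rho}\big(x^{k+1} + u^{k}\big),\\
u^{k+1} &= u^{k} + x^{k+1} - z^{k+1},
\end{aligned}
\]

where \(S_{\kappa}\) denotes element-wise soft-thresholding. Unrolling fixes the number of iterations and treats each iteration as a network layer; as the abstract describes, the proposed method then replaces some of the naturally arising linear operators in these updates with data-driven corrections learned from training data, rather than hand-coding all of them.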

We empirically validate the resulting algorithm in a variety of sparse coding settings, i.e., settings where the goal is to find a sparse representation of a signal with respect to a given dictionary. Of particular interest is the challenging multiple-dictionary setting given by morphological component analysis (MCA), a generalization of sparse coding in which the goal is to decompose a signal into additive parts such that each part has a distinct sparse representation within an appropriately chosen dictionary. Our algorithm demonstrates substantial acceleration over common baselines in both settings. Finally, we present a theoretical framework showing that the resulting algorithm is still an instance of ADMM, but applied to a new, learned cost function. By extending a recently proposed Stochastic Alternating Optimization analysis framework, we show that a gradient descent step along this learned loss landscape is equivalent to a modified gradient descent step along the original loss landscape. In this framework, the achieved acceleration could potentially be explained by the network's ability to learn a correction that steers the gradient toward a direction of steeper descent.
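As a point of reference, a common penalized formulation of the MCA decomposition problem over dictionaries \(D_1,\dots,D_C\) is (generic notation; the dissertation's exact formulation may differ):

\[
\min_{z_1,\dots,z_C}\;\; \frac{1}{2}\Big\| x - \sum_{c=1}^{C} D_c z_c \Big\|_2^2 \;+\; \lambda \sum_{c=1}^{C} \| z_c \|_1,
\]

with the recovered additive components given by \(\hat{x}_c = D_c z_c\); setting \(C = 1\) recovers ordinary sparse coding.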

Details

Title
Optimization Methods Combining Variable Splitting with Deep Learning
Author
Cowen, Benjamin
Publication year
2019
Publisher
ProQuest Dissertations & Theses
ISBN
9781085600828
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
2280613842
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.