Abstract
Modern data are high-dimensional and often corrupted by outliers. Extracting meaningful information from such data is the backbone of data-driven machine learning and signal processing. This dissertation studies a class of nonsmooth nonconvex optimization problems arising in robust signal processing and machine learning.
We first study the problem of recovering a low-rank matrix from a number of random linear measurements that are corrupted by outliers taking arbitrary values. We consider a nonsmooth nonconvex formulation of the problem, in which we explicitly enforce the low-rank property of the solution by using a factored representation of the matrix variable and employ an l1-loss function to robustify the solution against outliers. We show that even when a constant fraction of the measurements is arbitrarily corrupted, as long as certain measurement operators arising from the measurement model satisfy the so-called l1/l2-restricted isometry property, the ground-truth matrix can be exactly recovered from any global minimum of the resulting optimization problem. Furthermore, we show that the objective function of the optimization problem is sharp and weakly convex. Consequently, a subgradient method with geometrically diminishing step sizes converges linearly to the ground-truth matrix when suitably initialized. We demonstrate the efficacy of our algorithm for the nonconvex robust low-rank matrix recovery problem through various numerical experiments.
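To make the algorithmic template concrete, the following is a minimal NumPy sketch of a subgradient method with geometrically diminishing step sizes for the factored l1-loss formulation min_U (1/m) ||A(U U^T) - y||_1. The measurement model, the step-size constants mu0 and q, and the random initialization (standing in for a careful, e.g. spectral, initialization) are illustrative assumptions rather than the dissertation's exact setup.

```python
import numpy as np

def robust_matrix_recovery(A, y, r, mu0=1.0, q=0.98, iters=500, seed=0):
    """Subgradient method with geometrically diminishing step sizes for
    min_U (1/m) * ||A(U U^T) - y||_1, where A(X)_i = <A_i, X>.

    A : (m, n, n) array of measurement matrices A_i (assumed model)
    y : (m,) measurements, a fraction of which may be grossly corrupted
    r : target rank of the factored variable U (n x r)
    """
    m, n, _ = A.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n, r)) / np.sqrt(n)  # placeholder initialization
    step = mu0
    for _ in range(iters):
        res = np.einsum('kij,ij->k', A, U @ U.T) - y  # r_i = <A_i, UU^T> - y_i
        S = np.einsum('k,kij->ij', np.sign(res), A) / m
        G = (S + S.T) @ U                  # a subgradient of the l1 loss at U
        U = U - step * G
        step *= q                          # geometric decay: mu_k = mu0 * q^k
    return U
```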
We then consider fast incremental and random-shuffling methods. Incremental methods are widely adopted for solving large-scale learning problems such as training neural networks. We study incremental methods for solving nonsmooth nonconvex optimization problems in which the objective function is weakly convex, and present a family of incremental methods comprising the incremental subgradient, incremental proximal point, and incremental prox-linear methods. We show that the convergence rate of the three incremental algorithms is O(k^{-1/4}) in terms of driving a natural surrogate stationarity measure to zero, which extends the convergence theory of incremental methods from convex optimization to the nonsmooth nonconvex regime. Moreover, when the weakly convex function satisfies an additional regularity property called sharpness, we show that all three incremental algorithms, with geometrically diminishing step sizes and appropriate initialization, converge linearly to the optimal solution set. We also conduct experiments on robust matrix sensing and robust phase retrieval to illustrate the superior convergence properties of the three incremental methods. A sketch of one such method appears below.
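As an illustration of the incremental template, here is a small Python sketch of the incremental subgradient method applied to robust phase retrieval, min_x (1/m) sum_i |(a_i^T x)^2 - y_i|. The random-shuffling order, the step-size schedule, and all names are illustrative assumptions; the incremental proximal point and prox-linear variants differ only in the per-component update.

```python
import numpy as np

def incremental_subgradient_phase_retrieval(Avecs, y, x0, mu0=1.0, q=0.99,
                                            epochs=100, seed=0):
    """Incremental subgradient sketch for robust phase retrieval,
    min_x (1/m) * sum_i |(a_i^T x)^2 - y_i|, with geometric step decay.
    Takes one subgradient step per component f_i, in shuffled order.

    Avecs : (m, n) array of sensing vectors a_i
    y     : (m,) measurements, possibly containing gross outliers
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    m = Avecs.shape[0]
    step = mu0
    for _ in range(epochs):
        for i in rng.permutation(m):              # random-shuffling variant
            ai = Avecs[i]
            r = (ai @ x) ** 2 - y[i]
            g = 2.0 * np.sign(r) * (ai @ x) * ai  # a subgradient of f_i at x
            x -= step * g
        step *= q   # diminish the step size geometrically once per epoch
    return x
```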
Finally, we extend the preceding results for unconstrained optimization to nonsmooth nonconvex optimization over the Stiefel manifold, i.e., optimization with orthogonality constraints, in which the objective function is weakly convex in the ambient Euclidean space. We present a family of Riemannian subgradient-type methods, namely the Riemannian subgradient, incremental subgradient, and stochastic subgradient methods, for solving these problems and show that they all have an iteration complexity of O(ε^{-4}) for driving a natural stationarity measure below ε. In addition, we establish the local linear convergence of the Riemannian subgradient and incremental subgradient methods when the problem at hand further satisfies a sharpness property and the algorithms are properly initialized and use geometrically diminishing step sizes. The fundamental ingredient in the proofs of these convergence results is a new Riemannian subgradient inequality for restrictions of weakly convex functions to the Stiefel manifold, which may be of independent interest. We also show that our convergence results extend to a class of compact embedded submanifolds of Euclidean space. We conclude by discussing the sharpness properties of various formulations of the robust subspace recovery and orthogonal dictionary learning problems and by demonstrating the convergence performance of the algorithms on both problems via numerical simulations.
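The following is a minimal sketch of the Riemannian subgradient step on the Stiefel manifold St(n, k): project the Euclidean subgradient onto the tangent space, take a step, and retract back to the manifold. The polar retraction, the helper names, the subgradient oracle interface, and the step-size constants are illustrative assumptions, not the dissertation's exact implementation.

```python
import numpy as np

def proj_tangent(X, G):
    """Project a Euclidean subgradient G onto the tangent space of the
    Stiefel manifold St(n, k) at X (where X^T X = I_k)."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

def retract(X):
    """Polar retraction onto St(n, k): the nearest orthonormal matrix,
    computed via the thin SVD."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

def riemannian_subgradient(subgrad, X0, mu0=0.1, q=0.99, iters=300):
    """Riemannian subgradient sketch with geometrically diminishing steps.
    `subgrad(X)` is a user-supplied oracle returning a Euclidean
    subgradient of the weakly convex objective at X."""
    X = X0.copy()
    step = mu0
    for _ in range(iters):
        G = proj_tangent(X, subgrad(X))   # Riemannian subgradient at X
        X = retract(X - step * G)         # step, then retract to the manifold
        step *= q                         # geometric step-size decay
    return X
```

For instance, for a dual principal component pursuit style formulation of robust subspace recovery, f(X) = (1/m) sum_i ||X^T a_i||_2, the oracle subgrad(X) could return (1/m) sum_i a_i (X^T a_i / ||X^T a_i||_2)^T wherever X^T a_i is nonzero; this pairing is given only as an assumed illustration.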