Monday, January 8, 2018 - 4:00pm to 4:50pm
LPSC 125

Speaker Information

Hoi-To Wai
Postdoctoral Fellow
Arizona State University


Large-scale problems, where the problem dimension is high or the amount of data involved is huge, are often encountered in today's machine learning and information processing applications. This talk presents two recent first-order methods for large-scale problems under different settings. 

In the first part, I will introduce a decentralized Frank-Wolfe (DeFW) algorithm that runs on a network of machines in a master-less setting. The algorithm is suitable for a scattered data scenario, where data are stored at agents/computers which cooperate to solve a common problem. The DeFW algorithm is "projection-free" such that it handles high dimensional constraints without the pain of projection, and thus resulting in a lower runtime complexity than the state-of-the-art projection-based methods. In terms of convergence rate, we show that the proposed method converges at least sublinearly for convex and non-convex (to a stationary point) optimization problems and the rate depends on the connectivity of the underlying network. As a sidetrack, I will also discuss about a simple online Frank-Wolfe algorithm for stochastic optimization. 

In the second part, I will present a curvature-aided incremental aggregated gradient (CIAG) method for finite sum optimization. This algorithm is suitable for the "big-data" setting, where the objective function consists of $m$ component functions with $m \gg 1$. Such problem is tackled using an incremental optimization scheme, i.e., considering one component function at a time. Here, the novelty is to improve the gradient tracking performance with the aids of curvature / Hessian information. We show that for strongly convex problems, the CIAG method converges linearly at a rate that is comparable to evaluating one iteration for the full gradient method, and is $m$ times faster than the IAG method. Note that IAG is the deterministic counterpart of the popular SAG method.

The efficacies of both methods are supported by a number of numerical experiments on synthetic and real data. This is a joint work with Jean Lafond (Telecom Paristech), Eric Moulines (Ecole Polytechnique), Wei Shi, Angelia Nedich and Anna Scaglione (ASU).

Speaker Bio

Hoi-To Wai received his PhD degree from Arizona State University (ASU) in Electrical Engineering in Fall 2017, B. Eng. (with First Class Honor) and M. Phil. degrees in Electronic Engineering from The Chinese University of Hong Kong (CUHK) in 2010 and 2012, respectively. He is currently a postdoctoral scholar at ASU. Before ASU, he was a PhD student at UC Davis, a visiting scholar at Telecom ParisTech and LIDS of Massachusetts Institute of Technology. His current research interests are in the broad area of signal processing, machine learning and distributed optimization, with a focus of their applications to networked science.

His dissertation on network science has received the 2017's Dean's Dissertation Award from the Ira A. Fulton Schools of Engineering of ASU. He will be joining the Department of System Engineering & Engineering Management, CUHK as an Assistant Professor in Spring 2019.