Project Details
Description
This project advances the nation's development in science and engineering by providing new theory and algorithms for knowledge discovery from high-dimensional data. High-dimensional estimation, a computational procedure that extracts the most useful information from a large pool of redundant or irrelevant features, plays a fundamental role in areas such as medical imaging, biology, and climatology. However, well-established estimation schemes degrade dramatically when the data have complex structure, or when they are contaminated by hardware failures, programming errors, or cyber-attacks. The goal of this project is to significantly broaden the understanding of the fundamental limits of learning algorithms under different types of structure and data errors, to offer comprehensive guidelines for robust algorithmic design, and to highlight the extent to which an intelligent system behaves reliably and consistently. Outputs, such as theoretical results, algorithm implementations, and reusable empirical data, are designed to support a wide range of researchers in machine learning, high-dimensional statistics, signal processing, biology, and other related fields.
The project will be carried out by investigating the interplay of high-dimensional statistics, optimization, and learning theory. The investigator will develop a unified framework for nonlinear estimation in the high-dimensional regime, which covers parameter estimation from quantized measurements and learning with nonlinear activation functions in deep neural networks. In particular, to account for the nonlinear and possibly nonconvex nature of these problems, the investigator will develop efficient constrained optimization algorithms by incorporating inherent geometric structures into algorithmic design and theoretical analysis. Based on the unified framework and the established generic results, the investigator will revisit an ensemble of heuristic algorithms and will provide a theoretical justification of when and why they succeed in practice. Lastly, the investigator will design algorithms that are robust to various types of data corruption, such as adversarial noise, outliers, and malicious noise. To obtain a near-optimal dependence of the sample complexity on the noise rate and data dimension, a series of new statistical results will be established by leveraging tools from, and enriching the theory of, learning theory and robust statistics.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status | Finished |
---|---|
Effective start/end date | 1/09/20 → 31/08/23 |
Funding
- National Science Foundation