NSF Org: |
DMS Division Of Mathematical Sciences |
Recipient: |
|
Initial Amendment Date: | August 24, 2020 |
Latest Amendment Date: | July 10, 2023 |
Award Number: | 2031985 |
Award Instrument: | Continuing Grant |
Program Manager: |
Christopher Stark, cstark@nsf.gov, (703)292-4869, DMS Division Of Mathematical Sciences, MPS Direct For Mathematical & Physical Scien |
Start Date: | September 1, 2020 |
End Date: | August 31, 2025 (Estimated) |
Total Intended Award Amount: | $1,650,000.00 |
Total Awarded Amount to Date: | $1,320,000.00 |
Funds Obligated to Date: |
FY 2021 = $165,000.00; FY 2022 = $165,000.00; FY 2023 = $330,000.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
3400 N CHARLES ST BALTIMORE MD US 21218-2608 (443)997-1898 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
3400 N. Charles Street Baltimore MD US 21218-2625 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): |
OFFICE OF MULTIDISCIPLINARY AC, Special Projects - CCF, MATHEMATICAL SCIENCES RES INST, EPCN-Energy-Power-Ctrl-Netwrks |
Primary Program Source: |
01002122DB NSF RESEARCH & RELATED ACTIVIT; 01002324DB NSF RESEARCH & RELATED ACTIVIT; 01002021DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.041, 47.049, 47.070 |
ABSTRACT
Recent advances in deep learning have led to many disruptive technologies, from automatic speech recognition systems to automated supermarkets to self-driving cars. However, the complex and large-scale nature of deep networks makes them hard to analyze, so they are mostly used as black boxes without formal guarantees on their performance. For example, deep networks provide self-reported confidence scores, but these scores are frequently inaccurate and uncalibrated, and the networks are likely to make large mistakes on rare cases. Moreover, the design of deep networks remains an art and is largely driven by empirical performance on a dataset. As deep learning systems are increasingly employed in our daily lives, it becomes critical to understand whether their predictions satisfy certain desired properties. The goal of this NSF-Simons Research Collaboration on the Mathematical and Scientific Foundations of Deep Learning is to develop a mathematical, statistical and computational framework that helps explain the success of current network architectures, understand their pitfalls, and guide the design of novel architectures with guaranteed confidence, robustness, interpretability, optimality, and transferability. This project will train a diverse STEM workforce with data science skills that are essential for the global competitiveness of the US economy by creating new undergraduate and graduate programs in the foundations of data science and organizing a series of collaborative research events, including semester research programs and summer schools on the foundations of deep learning. This project will also impact women and underrepresented minorities by involving undergraduates in the foundations of data science.
Deep networks have led to dramatic improvements in the performance of pattern recognition systems. However, the mathematical reasons for this success remain elusive. For instance, it is not clear why deep networks generalize or transfer to new tasks, or why simple optimization strategies can reach a local or global minimum of the associated non-convex optimization problem. Moreover, there is no principled way of designing the architecture of a network so that it satisfies certain desired properties, such as expressivity, transferability, optimality and robustness. This project brings together a multidisciplinary team of mathematicians, statisticians, theoretical computer scientists, and electrical engineers to develop the mathematical and scientific foundations of deep learning. The project is divided into four main thrusts. The analysis thrust will use principles from approximation theory, information theory, statistical inference, and robust control to analyze properties of deep networks such as expressivity, interpretability, confidence, fairness and robustness. The learning thrust will use principles from dynamical systems, non-convex and stochastic optimization, statistical learning theory, adaptive control, and high-dimensional statistics to design and analyze learning algorithms with guaranteed convergence, optimality and generalization properties. The design thrust will use principles from algebra, geometry, topology, graph theory and optimization to design and learn network architectures that capture algebraic, geometric and graph structures in both the data and the task. The transferability thrust will use principles from multiscale analysis and modeling, reinforcement learning, and Markov decision processes to design and study data representations that are suitable for learning from and transferring to multiple tasks.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.