Award Abstract # 2023239
TRIPODS: Institute for Foundations of Data Science
NSF Org: DMS, Division Of Mathematical Sciences
Recipient: UNIVERSITY OF WISCONSIN SYSTEM
Initial Amendment Date: August 31, 2020
Latest Amendment Date: August 28, 2023
Award Number: 2023239
Award Instrument: Continuing Grant
Program Manager: Christopher Stark, cstark@nsf.gov, (703)292-4869, DMS Division Of Mathematical Sciences, MPS Direct For Mathematical & Physical Scien
Start Date: September 1, 2020
End Date: August 31, 2025 (Estimated)
Total Intended Award Amount: $4,583,262.00
Total Awarded Amount to Date: $3,603,663.00
Funds Obligated to Date:
FY 2020 = $902,251.00
FY 2021 = $943,895.00
FY 2022 = $805,470.00
FY 2023 = $952,047.00
History of Investigator:
- Stephen Wright (Principal Investigator), swright@cs.wisc.edu
- Michael Newton (Co-Principal Investigator)
- Robert Nowak (Co-Principal Investigator)
- Cecile Ane (Co-Principal Investigator)
- Sebastien Roch (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Wisconsin-Madison, 21 N PARK ST STE 6301, MADISON, WI, US 53715-1218, (608)262-3822
Sponsor Congressional District: 02
Primary Place of Performance: University of Wisconsin-Madison, Madison, WI, US 53715-1218
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): LCLSJAGTNZQ7
Parent UEI:
NSF Program(s): TRIPODS Transdisciplinary Rese, Special Projects - CCF, Algorithmic Foundations
Primary Program Source:
01002223DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
01002324DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 048Z, 075Z, 079Z
Program Element Code(s): 041Y00, 287800, 779600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049, 47.070
ABSTRACT
Data science is making an enormous impact on science and society, but its success is uncovering pressing new challenges that stand in the way of further progress. Outcomes and decisions arising from many machine learning processes are not robust to errors and corruption in the data; data science algorithms are yielding biased and unfair outcomes, as concerns about data privacy continue to mount; and machine learning systems suited to dynamic, interactive environments are less well developed than corresponding tools for static problems. Only by an appeal to the foundations of data science can we understand and address challenges such as these. Building on the work of three TRIPODS Phase I institutes, the new Institute for Foundations of Data Science (IFDS) brings together researchers from the Universities of Washington, Wisconsin-Madison, California-Santa Cruz, and Chicago, organized around the goal of tackling these critical issues. Members of IFDS have complementary strengths in the TRIPODS disciplines of mathematics, statistics, and theoretical computer science, and a proven record of collaborating to push theoretical boundaries by synthesizing knowledge and experience from diverse areas. Students and postdoctoral members of IFDS will be trained to be fluent in the languages of several disciplines, and able to bridge these communities and perform transdisciplinary research in the foundations of data science. In concert with its research agenda, IFDS will engage the data science community through workshops, summer schools, and hackathons. Its diverse leadership, committed to equity and inclusion, proposes extensive plans for outreach to traditionally underrepresented groups. Governance, management, and evaluation of the institute will build on the successful and efficient models developed during Phase I.
To address critical issues at the cutting edge of data science research, IFDS will organize its research around four core themes. The complexity theme will synthesize various notions of complexity from multiple disciplines to make breakthroughs in the analysis of optimization and sampling methods, develop tools for assessing the complexity of data models, and seek new methods with better complexity properties, to make complexity a more powerful tool for understanding and inventing algorithms in data science. The robustness theme considers data that contains errors or outliers, possibly due to an adversary, and will design methods for data analysis and prediction that are robust in the face of these errors. The theme on closed-loop data science tackles the issues of acquiring data in ways that reveal the information content of the data efficiently, using strategic and sequential policies that leverage information gathered already from past data. The theme on ethics and algorithms addresses issues of fairness and bias in machine learning, data privacy, and causality and interpretability. The four themes intersect in many ways, and most IFDS researchers will work in two or more of them. By making concerted progress on these fundamental fronts, IFDS will lower several of the barriers to better understanding of data science methodology and to its improved effectiveness and wider relevance to application areas. Additionally, IFDS will organize and host activities that engage the data science community at all levels of seniority. Annual workshops will focus on the critical issues identified above and others that are sure to arise over the next five years. Comprehensive plans for outreach and education will draw on previous experience of the Phase I institutes and leverage institutional resources at the four sites. 
Collaborations with domain science researchers in academia, national laboratories, and industry, so important in illuminating issues in the fundamentals of data science, will continue through the many channels available to IFDS members, including those established in the TRIPODS+X program. Relationships with other institutes at each IFDS site will further extend the impact of IFDS on domain sciences and applications.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Dasarathy, Gautam and Mossel, Elchanan and Nowak, Robert and Roch, Sebastien. "A stochastic Farris transform for genetic data under the multispecies coalescent with applications to data requirements". Journal of Mathematical Biology, v.84, 2022. https://doi.org/10.1007/s00285-022-01731-5
Song, Chaobing and Wright, Stephen and Diakonikolas, Jelena. "Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums". International Conference on Machine Learning, 2021.
Patel, Vivak and Zhang, Shushu and Tian, Bowen. "Global Convergence and Stability of Stochastic Gradient Descent". Conference on Neural Information Processing Systems, v.36, 2022.
Roh, Yuji and Lee, Kangwook and Whang, Steven Euijong and Suh, Changho. "FairBatch: Batch Selection for Model Fairness". 9th International Conference on Learning Representations, 2021.
Patel, Vivak and Jahangoshahi, Mohammad and Maldonado, Daniel A. "An Implicit Representation and Iterative Solution of Randomly Sketched Linear Systems". SIAM Journal on Matrix Analysis and Applications, v.42, 2021. https://doi.org/10.1137/19M1259481
Newton, Michael A. and Polson, Nicholas G. and Xu, Jianeng. "Weighted Bayesian bootstrap for scalable posterior distributions". Canadian Journal of Statistics, v.49, 2021. https://doi.org/10.1002/cjs.11570
Zheng, Zihao and Mergaert, Aisha M and Ong, Irene M and Shelef, Miriam A and Newton, Michael A. "MixTwice: large-scale hypothesis testing for peptide arrays by variance mixing". Bioinformatics, 2021. https://doi.org/10.1093/bioinformatics/btab162
Xie, Yue and Wright, Stephen J. "Complexity of Proximal Augmented Lagrangian for Nonconvex Optimization with Nonlinear Equality Constraints". Journal of Scientific Computing, v.86, 2021. https://doi.org/10.1007/s10915-021-01409-y
Böhm, Axel and Wright, Stephen J. "Variable Smoothing for Weakly Convex Composite Functions". Journal of Optimization Theory and Applications, v.188, 2021. https://doi.org/10.1007/s10957-020-01800-z
Curtis, Frank E. and Robinson, Daniel P. and Royer, Clément W. and Wright, Stephen J. "Trust-Region Newton-CG with Strong Second-Order Complexity Guarantees for Nonconvex Optimization". SIAM Journal on Optimization, v.31, 2021. https://doi.org/10.1137/19M130563X
Chen, Ke and Li, Qin and Lu, Jianfeng and Wright, Stephen J. "A Low-Rank Schwarz Method for Radiative Transfer Equation With Heterogeneous Scattering Coefficient". Multiscale Modeling & Simulation, v.19, 2021. https://doi.org/10.1137/19M1276327
Chen, Ke and Chen, Shi and Li, Qin and Lu, Jianfeng and Wright, Stephen J. "Low-Rank Approximation for Multiscale PDEs". Notices of the American Mathematical Society, v.69, 2022. https://doi.org/10.1090/noti2488
Roch, Sebastien and Wang, Kun-Chieh. "Sufficient condition for root reconstruction by parsimony on binary trees with general weights". Electronic Communications in Probability, v.26, 2021. https://doi.org/10.1214/21-ECP423
Alacaoglu, A and Bohm, A and Malitsky, Y. "Beyond the Golden Ratio for Variational Inequality Algorithms". Journal of machine learning research, 2023.
Alacaoglu, A and Lyu, H. "Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data". International Conference on Machine Learning, 2023.
Alacaoglu, Ahmet and Fercoq, Olivier and Cevher, Volkan. "On the Convergence of Stochastic Primal-Dual Hybrid Gradient". SIAM Journal on Optimization, v.32, 2022. https://doi.org/10.1137/19M1296252
Alacaoglu, A and Viano, L and He, N and Cevher, V. "A Natural Actor-Critic Framework for Zero-Sum Markov Games". Proceedings of Machine Learning Research, 2022.
Alacaoglu, A and Malitsky, Y. "Stochastic Variance Reduction for Variational Inequality Methods". Proceedings of Machine Learning Research, 2022.
Liu, Bo and Ye, Mao and Wright, Stephen J. and Stone, Peter and Liu, Qiang. "BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach". Advances in neural information processing systems, 2022.
Chen, Ke and Li, Qin and Lu, Jianfeng and Wright, Stephen J. "Random Sampling and Efficient Algorithms for Multiscale PDEs". SIAM Journal on Scientific Computing, v.42, 2020. https://doi.org/10.1137/18M1207430
Zhang, Jifan and Katz-Samuels, Julian and Nowak, Robert. "GALAXY: Graph-based Active Learning at the Extreme". International Conference on Machine Learning, 2022.
Zhu, Yinglun and Katz-Samuels, Julian and Nowak, Robert. "Near Instance Optimal Model Selection for Pure Exploration Linear Bandits". International Conference on Artificial Intelligence and Statistics, 2022.
Coleman, Cody and Chou, Edward and Katz-Samuels, Julian and Culatana, Sean and Bailis, Peter and Berg, Alexander C. and Nowak, Robert and Sumbaly, Roshan and Zaharia, Matei and Yalniz, I. Zeki. "Similarity Search for Efficient Active Learning and Search of Rare Concepts". Proceedings of the AAAI Conference on Artificial Intelligence, 2022.
Katz-Samuels, Julian and Mason, Blake and Jamieson, Kevin and Nowak, Robert. "Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers". Advances in neural information processing systems, 2021.
Ni, Zijian and Prasad, Aman and Chen, Shuyang and Halberg, Richard B. and Arkin, Lisa M. and Drolet, Beth A. and Newton, Michael A. and Kendziorski, Christina. "SpotClean adjusts for spot swapping in spatial transcriptomics data". Nature Communications, v.13, 2022. https://doi.org/10.1038/s41467-022-30587-y
Fan, Wai-Tong Louis and Legried, Brandon and Roch, Sebastien. "An impossibility result for phylogeny reconstruction from k-mer counts". The Annals of Applied Probability, v.32, 2022. https://doi.org/10.1214/22-AAP1805
Hill, Max and Legried, Brandon and Roch, Sebastien. "Species tree estimation under joint modeling of coalescence and duplication: Sample complexity of quartet methods". The Annals of Applied Probability, v.32, 2022. https://doi.org/10.1214/22-AAP1799
Ng, Tun Lee and Newton, Michael A. "Random weighting in LASSO regression". Electronic journal of statistics, 2022. https://doi.org/10.1214/22-EJS2020
Hill, Max and Roch, Sebastien. "Inconsistency of Triplet-Based and Quartet-Based Species Tree Estimation under Intralocus Recombination". Journal of Computational Biology, v.29, 2022. https://doi.org/10.1089/cmb.2022.0265
Yu, Peng and Ericksen, Spencer and Gitter, Anthony and Newton, Michael A. "Bayes Optimal Informer Sets for Early-Stage Drug Discovery". Biometrics, v.79, 2022. https://doi.org/10.1111/biom.13637
Bui-Thanh, Tan and Li, Qin and Zepeda-Núñez, Leonardo. "Bridging and Improving Theoretical and Computational Electrical Impedance Tomography via Data Completion". SIAM Journal on Scientific Computing, v.44, 2022. https://doi.org/10.1137/21M141703X
Legried, Brandon and Molloy, Erin K. and Warnow, Tandy and Roch, Sébastien. "Polynomial-Time Statistical Estimation of Species Trees Under Gene Duplication and Loss". Journal of Computational Biology, v.28, 2021. https://doi.org/10.1089/cmb.2020.0424
Kumar, Pratyush and Rawlings, James B. and Wright, Stephen J. "Industrial, large-scale model predictive control with structured neural networks". Computers & Chemical Engineering, v.150, 2021. https://doi.org/10.1016/j.compchemeng.2021.107291
Ding, Zhiyan and Einkemmer, Lukas and Li, Qin. "Dynamical Low-Rank Integrator for the Linear Boltzmann Equation: Error Analysis in the Diffusion Limit". SIAM Journal on Numerical Analysis, v.59, 2021. https://doi.org/10.1137/20M1380788
Ding, Zhiyan and Li, Qin. "Langevin Monte Carlo: random coordinate descent and variance reduction". Journal of machine learning research, 2021.
Garcia Trillos, Nicolas and He, Pengfei and Li, Chenghui. "Large sample spectral analysis of graph-based multi-manifold clustering". Journal of machine learning research, 2023.
Zhu, Yinglun and Nowak, Robert. "On Regret with Multiple Best Arms". NeurIPS 2020, 2020.
Hou, Xiao and Gao, Song and Li, Qin and Kang, Yuhao and Chen, Nan and Chen, Kaiping and Rao, Jinmeng and Ellenberg, Jordan S. and Patz, Jonathan A. "Intracounty modeling of COVID-19 infection with human mobility: Assessing spatial heterogeneity with business traffic, age, and race". Proceedings of the National Academy of Sciences, v.118, 2021. https://doi.org/10.1073/pnas.2020524118
Xie, Yue and Wright, Stephen J. "Complexity of a projected Newton-CG method for optimization with bounds". Mathematical Programming, 2023. https://doi.org/10.1007/s10107-023-02000-z
Ma, Xiuyu and Korthauer, Keegan and Kendziorski, Christina and Newton, Michael A. "A compositional model to assess expression changes from single-cell RNA-seq data". The Annals of Applied Statistics, v.15, 2021. https://doi.org/10.1214/20-AOAS1423
Ho-Nguyen, Nam and Wright, Stephen J. "Adversarial classification via distributional robustness with Wasserstein ambiguity". Mathematical Programming, v.198, 2022. https://doi.org/10.1007/s10107-022-01796-6
Yao, Zhewei and Xu, Peng and Roosta, Fred and Wright, Stephen J and Mahoney, Michael W. "Inexact Newton-CG algorithms with complexity guarantees". IMA Journal of Numerical Analysis, v.43, 2022. https://doi.org/10.1093/imanum/drac043
Fogg, John and Allman, Elizabeth S. and Ané, Cécile and Thomson, Robert C. (ed.). "PhyloCoalSimulations: A Simulator for Network Multispecies Coalescent Models, Including a New Extension for the Inheritance of Gene Flow". Systematic Biology, v.72, 2023. https://doi.org/10.1093/sysbio/syad030
Garcia Trillos, Nicolas and Jacobs, Matt and Kim, Jakwang. "The multimarginal optimal transport formulation of adversarial multiclass classification". Journal of machine learning research, 2023.
García Trillos, Camilo Andrés and García Trillos, Nicolás. "On the regularized risk of distributionally robust learning over deep neural networks". Research in the Mathematical Sciences, v.9, 2022. https://doi.org/10.1007/s40687-022-00349-9
Craig, Katy and Garcia Trillos, Nicolas and Slepcev, Dejan. "Clustering Dynamics on Graphs: From Spectral Clustering to Mean Shift Through Fokker–Planck Interpolation". Modeling and simulation in science engineering technology, 2021.
Ding, Zhiyan and Li, Qin and Lu, Jianfeng and Wright, Stephen. "Random Coordinate Underdamped Langevin Monte Carlo". Proceedings of Machine Learning Research, 2021.
Roch, Sebastien. "Expanding the Class of Global Objective Functions for Dissimilarity-Based Hierarchical Clustering". Journal of Classification, 2023. https://doi.org/10.1007/s00357-023-09447-x
Ding, Z. and Chen, S. and Li, Q. and Wright, S. J. "Overparameterization of Deep ResNet: Zero Loss and Mean-field Analysis". Journal of machine learning research, 2022.
Chen, Nan and Li, Yingda. "BAMCAFE: A Bayesian machine learning advanced forecast ensemble method for complex turbulent systems with partial observations". Chaos: An Interdisciplinary Journal of Nonlinear Science, v.31, 2021. https://doi.org/10.1063/5.0062028
Ding, Z. and Li, Q. and Lu, J. and Wright, S. J. "Random Coordinate Langevin Monte Carlo". Conference on Learning Theory, 2021.
Tabatabaee, Y. and Roch, S. and Warnow, T. "Statistically Consistent Rooting of Species Trees Under the Multispecies Coalescent Model". Research in Computational Molecular Biology. RECOMB 2023. Lecture Notes in Computer Science. Springer, v.13976, 2023.
Jo, Changhun and Lee, Kangwook. "Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information". Proceedings of Machine Learning Research, v.139, 2021.
Vo, Tien and Mishra, Akshay and Ithapu, Vamsi and Singh, Vikas and Newton, Michael A. "Dimension constraints improve hypothesis testing for large-scale, graph-associated, brain-image data". Biostatistics, 2021. https://doi.org/10.1093/biostatistics/kxab001
Luo, Yuetian and Raskutti, Garvesh and Yuan, Ming and Zhang, Anru R. "A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration". Journal of machine learning research, 2021.
Cai, Ruoyi and Ané, Cécile. "Assessing the fit of the multi-species network coalescent to multi-locus data". Bioinformatics, 2020. https://doi.org/10.1093/bioinformatics/btaa863
Katz-Samuels, Julian and Nakhleh, Julia and Nowak, Robert and Li, Yixuan. "Training OOD Detectors in their Natural Habitats". International Conference on Machine Learning, 2022.
O'Neill, Michael and Wright, Stephen J. "A Line-Search Descent Algorithm for Strict Saddle Functions with Complexity Guarantees". Journal of machine learning research, v.24, 2023.