Award Abstract # 2023239
TRIPODS: Institute for Foundations of Data Science

NSF Org: DMS (Division Of Mathematical Sciences)
Recipient: UNIVERSITY OF WISCONSIN SYSTEM
Initial Amendment Date: August 31, 2020
Latest Amendment Date: August 28, 2023
Award Number: 2023239
Award Instrument: Continuing Grant
Program Manager: Christopher Stark
cstark@nsf.gov
(703) 292-4869
DMS  Division Of Mathematical Sciences
MPS  Directorate For Mathematical & Physical Sciences
Start Date: September 1, 2020
End Date: August 31, 2025 (Estimated)
Total Intended Award Amount: $4,583,262.00
Total Awarded Amount to Date: $3,603,663.00
Funds Obligated to Date: FY 2020 = $902,251.00
FY 2021 = $943,895.00
FY 2022 = $805,470.00
FY 2023 = $952,047.00
History of Investigator:
  • Stephen Wright (Principal Investigator)
    swright@cs.wisc.edu
  • Michael Newton (Co-Principal Investigator)
  • Robert Nowak (Co-Principal Investigator)
  • Cecile Ane (Co-Principal Investigator)
  • Sebastien Roch (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Wisconsin-Madison
21 N PARK ST STE 6301
MADISON
WI  US  53715-1218
(608)262-3822
Sponsor Congressional District: 02
Primary Place of Performance: University of Wisconsin-Madison
Madison
WI  US  53715-1218
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): LCLSJAGTNZQ7
Parent UEI:
NSF Program(s): TRIPODS Transdisciplinary Research In Principles Of Data Science,
Special Projects - CCF,
Algorithmic Foundations
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
01002122DB NSF RESEARCH & RELATED ACTIVITIES
01002021DB NSF RESEARCH & RELATED ACTIVITIES
01002324DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 048Z, 075Z, 079Z
Program Element Code(s): 041Y00, 287800, 779600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049, 47.070

ABSTRACT

Data science is making an enormous impact on science and society, but its success is uncovering pressing new challenges that stand in the way of further progress. Outcomes and decisions arising from many machine learning processes are not robust to errors and corruption in the data; data science algorithms are yielding biased and unfair outcomes, even as concerns about data privacy continue to mount; and machine learning systems suited to dynamic, interactive environments are less well developed than the corresponding tools for static problems. Only by appealing to the foundations of data science can we understand and address challenges such as these. Building on the work of three TRIPODS Phase I institutes, the new Institute for Foundations of Data Science (IFDS) brings together researchers from the University of Washington, the University of Wisconsin-Madison, the University of California, Santa Cruz, and the University of Chicago, organized around the goal of tackling these critical issues. Members of IFDS have complementary strengths in the TRIPODS disciplines of mathematics, statistics, and theoretical computer science, and a proven record of collaborating to push theoretical boundaries by synthesizing knowledge and experience from diverse areas. Students and postdoctoral members of IFDS will be trained to be fluent in the languages of several disciplines, and to bridge these communities and perform transdisciplinary research in the foundations of data science. In concert with its research agenda, IFDS will engage the data science community through workshops, summer schools, and hackathons. Its diverse leadership, committed to equity and inclusion, proposes extensive plans for outreach to traditionally underrepresented groups. Governance, management, and evaluation of the institute will build on the successful and efficient models developed during Phase I.

To address critical issues at the cutting edge of data science research, IFDS will organize its research around four core themes. The complexity theme will synthesize notions of complexity from multiple disciplines to make breakthroughs in the analysis of optimization and sampling methods, develop tools for assessing the complexity of data models, and seek new methods with better complexity properties, making complexity a more powerful tool for understanding and inventing algorithms in data science. The robustness theme considers data that contains errors or outliers, possibly introduced by an adversary, and will design methods for data analysis and prediction that are robust in the face of these errors. The theme on closed-loop data science tackles the problem of acquiring data in ways that reveal its information content efficiently, using strategic and sequential policies that leverage information already gathered from past data. The theme on ethics and algorithms addresses issues of fairness and bias in machine learning, data privacy, and causality and interpretability. The four themes intersect in many ways, and most IFDS researchers will work in two or more of them. By making concerted progress on these fundamental fronts, IFDS will lower several of the barriers to better understanding of data science methodology and to its improved effectiveness and wider relevance to application areas. Additionally, IFDS will organize and host activities that engage the data science community at all levels of seniority. Annual workshops will focus on the critical issues identified above and others that are sure to arise over the next five years. Comprehensive plans for outreach and education will draw on the previous experience of the Phase I institutes and leverage institutional resources at the four sites. Collaborations with domain science researchers in academia, national laboratories, and industry, so important in illuminating issues in the fundamentals of data science, will continue through the many channels available to IFDS members, including those established in the TRIPODS+X program. Relationships with other institutes at each IFDS site will further extend the impact of IFDS on domain sciences and applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 66)
Dai, Minyi and Demirel, Mehmet F. and Liang, Yingyu and Hu, Jia-Mian "Graph neural networks for an accurate and interpretable prediction of the properties of polycrystalline materials" npj Computational Materials, v.7, 2021. https://doi.org/10.1038/s41524-021-00574-w
Lee, Ching-pei and Wright, Stephen J. "Random permutations fix a worst case for cyclic coordinate descent" IMA Journal of Numerical Analysis, v.39, 2018. https://doi.org/10.1093/imanum/dry040
Hill, M. "On the Effect of Intralocus Recombination on Triplet-Based Species Tree Estimation" Research in Computational Molecular Biology (RECOMB 2022), 2022. https://doi.org/10.1007/978-3-031-04749-7_9
Dasarathy, Gautam and Mossel, Elchanan and Nowak, Robert and Roch, Sebastien "A stochastic Farris transform for genetic data under the multispecies coalescent with applications to data requirements" Journal of Mathematical Biology, v.84, 2022. https://doi.org/10.1007/s00285-022-01731-5
Song, Chaobing and Wright, Stephen and Diakonikolas, Jelena "Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums" International Conference on Machine Learning, 2021.
Xu, Jingcheng and Ané, Cécile "Identifiability of local and global features of phylogenetic networks from average distances" Journal of Mathematical Biology, v.86, 2023. https://doi.org/10.1007/s00285-022-01847-8
Patel, Vivak and Zhang, Shushu and Tian, Bowen "Global Convergence and Stability of Stochastic Gradient Descent" Conference on Neural Information Processing Systems, v.36, 2022.
Patel, Vivak "Stopping criteria for, and strong convergence of, stochastic gradient descent on Bottou-Curtis-Nocedal functions" Mathematical Programming, v.195, 2022. https://doi.org/10.1007/s10107-021-01710-6
Roh, Yuji and Lee, Kangwook and Whang, Steven Euijong and Suh, Changho "FairBatch: Batch Selection for Model Fairness" 9th International Conference on Learning Representations, 2021.
Johnston, Liam and Patel, Vivak "Second-Order Sensitivity Methods for Robustly Training Recurrent Neural Network Models" IEEE Control Systems Letters, v.5, 2021. https://doi.org/10.1109/LCSYS.2020.3001498
Patel, Vivak and Jahangoshahi, Mohammad and Maldonado, Daniel A. "An Implicit Representation and Iterative Solution of Randomly Sketched Linear Systems" SIAM Journal on Matrix Analysis and Applications, v.42, 2021. https://doi.org/10.1137/19M1259481