NIH Highlights AI and Advanced Computing in New Data Science Strategic Plan 

July 3, 2025 -- The final 2025-2030 Strategic Plan for Data Science was released in June 2025, charting the course for how biomedical data will transform health research in the United States over the next five years. As former NIH Director Dr. Monica Bertagnolli noted in her opening letter, "The NIH Biomedical Data Ecosystem will bring increasingly effective data and tools that enable the broadest research community possible to contribute to our mission to bring better health to all people."

Discoveries and innovations in health would not be possible without the many researchers and scientists who contribute to NIH’s mission through collaboration and partnership. Their work underpins the progress reflected in this plan.

The following five goals outline priorities that will shape the research data landscape over the next five years:

Goal 1: Improve Data Management and Sharing Capabilities

The NIH Data Management and Sharing Policy went into effect in January 2023, and NIH is doubling down on supporting and effectively implementing it. This goal focuses on three critical objectives:

  1. Supporting the biomedical community in managing, sharing, and sustaining data.
  2. Enhancing FAIR (Findable, Accessible, Interoperable, and Reusable) data principles and harmonization.
  3. Strengthening the NIH data repository ecosystem.

Researchers can expect new tools for preparing and annotating data, improved metadata quality standards, and a data steward program to guide sharing practices. NIH will also work with Tribal communities to develop appropriate data governance frameworks that respect Indigenous data sovereignty through CARE (Collective benefit, Authority to control, Responsibility, and Ethics) principles. For researchers working with sensitive data, streamlined processes for controlled data access are under development.

Goal 2: Enhance Human-Derived Data for Research

Clinical and real-world data offer incredible opportunities but are notoriously tricky to work with. This goal tackles improving access to clinical data sources, adopting health IT standards like Fast Healthcare Interoperability Resources (FHIR) and the Trusted Exchange Framework and Common Agreement (TEFCA), enhancing environmental and lifestyle data integration (the "exposome"), and providing cross-disciplinary training.

The plan includes proposed methods for collecting informed consent when combining data from multiple sources, the development of standards for data generated by home health care devices, and federated frameworks to enable the use of sensitive data in clinical research. It also outlines efforts to develop governance frameworks for data linkages and to conduct real-world pilot projects that integrate environmental factors with clinical common data elements (CDEs), supporting research on determinants of health.

Goal 3: Advance Software, Computational Methods, and Artificial Intelligence

Biomedical research generates large volumes of data, and NIH aims to ensure that researchers have access to advanced tools to analyze these resources effectively. This goal emphasizes balanced investment in software development, computational methods, and artificial intelligence applications. It includes expanded support for community-developed software tools with improved visualization capabilities and established sustainability metrics aligned with FAIR principles. The plan also highlights programs such as NCI’s Information Technology for Cancer Research (ITCR), which provide funding to support tools throughout their entire lifecycle to promote long-term availability beyond individual grant periods.

Beyond traditional analytical approaches, the plan outlines emerging computational areas such as digital twin modeling, privacy-preserving computing, and the integration of theory-based modeling with data-driven methods. The AIM-AHEAD program will continue to expand national networks to broaden access to computational capabilities across institutions. NIH also intends to provide accessible and sustainable tools to support researchers at all levels of expertise as they address the increasing complexity of biomedical data challenges.

Goal 4: Support a Federated Biomedical Research Data Infrastructure

NIH is working toward a federated data ecosystem where researchers can more easily connect disparate datasets across platforms like NHLBI's BioData Catalyst, NCI's Cancer Research Data Commons (CRDC), the All of Us program, and the NIH database of Genotypes and Phenotypes (dbGaP) through the NIH Cloud Platform Interoperability (NCPI) program. This approach maintains institutional control of data while standardizing access processes and interfaces.

Implementation efforts will focus on establishing a robust, connected data resource ecosystem with improved interoperability, developing new search and discovery capabilities through enhanced metadata standards, and exploring emerging computing paradigms. The Researcher Auth Service (RAS) initiative will expand single sign-on capabilities across NIH data resources, streamlining access while maintaining privacy and security standards.

Goal 5: Strengthen the Data Science Community

Data science skills are increasingly essential in all areas of biomedical research. This goal addresses expanding data science expertise at every level—from pre-college students to established investigators. It includes increasing training opportunities, expanding the data science workforce, enhancing collaboration within NIH's Intramural Research Program, and building capacity for every researcher who works with or for NIH.

The plan includes expanded cross-disciplinary training programs, new mentorship initiatives, and greater integration of data science into existing research training. The DATA Scholars program will continue to build NIH’s internal data science capacity, while partnerships with programs such as the Native American Research Centers for Health (NARCH) will help broaden data science expertise across institutions nationwide.

Conclusion

This strategic plan builds on significant progress made since the first Data Science Strategic Plan, with a renewed focus on partnership, capacity-building, and responsible innovation. As the research landscape evolves with unprecedented speed, NIH is working to ensure these powerful data tools and technologies benefit all Americans through more comprehensive scientific discoveries.

The full plan is available on the NIH Office of Data Science Strategy website.


Source: Dr. Susan Gregurick, Associate Director of Data Science, NIH