Creating an NIH Data Management and Sharing Plan

This guide presents information on the NIH Data Management and Sharing Policy, which requires submission of a Data Management and Sharing Plan (DMSP) for NIH-supported research that generates scientific data.

Documenting Where Data will be Preserved

A key part of the NIH's Data Management and Sharing Policy is the requirement to name the specific place where data will be shared. Repositories are online storage sites that preserve digital data in perpetuity and give other researchers and/or the public access to it, subject to the repository's specific privacy settings.

Document in your DMSP the repository where your data will be preserved and shared. Certain NIH ICOs have released helpful guidance documents and tools for selecting a repository, including the NIDDK's Repository Selection Consideration Tool. The related sample workflow below may be used to determine where to deposit data:

  • The NIH ICO (Institute, Center, or Office) releasing the NOFO (Notice of Funding Opportunity) may have requirements on where data from their funded projects should be preserved and shared. If so, use the required repository.
  • If there is no required repository, but an NIH-approved discipline- or data-specific repository exists, use one of those repositories.
  • If no NIH-approved option exists, but a domain-, discipline-, or data-specific repository is vetted and commonly used in your field of study, it may be used.
  • If none of the above apply, and the dataset is small (up to 2 GB in size), it may be included as supplementary material to accompany articles submitted to PubMed Central (see the PubMed Central - Policies - Supplementary Materials guidance: https://www.ncbi.nlm.nih.gov/pmc/about/guidelines/#suppm).
  • If none of the above applies, select a generalist or institutional repository to deposit your data. The NIH provides specific guidance on Selecting a Repository, outlining the key characteristics that any repository should have in order to be appropriate for research data.
    • This blog post by Elliott Smith on the website FAIRsharing.org outlines how the site can be used to identify both domain-specific repositories and related metadata standards.
    • Generalist repositories included in the NIH's Generalist Repository Ecosystem Initiative support uniform standards for data sharing.
    • The Network of the National Library of Medicine hosts a finder tool for identifying NIH-Supported Data Sharing Resources.
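
The sample workflow above can be read as an ordered set of checks, where the first match wins. The following is only an illustrative sketch in Python: the class, field, and function names are hypothetical, the 2 GB threshold is taken from the workflow above, and none of this is an official NIH tool.

```python
from dataclasses import dataclass
from typing import Optional

# Size limit for supplementary material, as stated in the workflow above.
SMALL_DATASET_LIMIT_GB = 2.0


@dataclass
class RepositoryOptions:
    """Hypothetical summary of what a project team has found so far."""
    ico_required_repository: Optional[str] = None  # named by the funding ICO/NOFO
    nih_approved_repository: Optional[str] = None  # NIH-approved discipline/data-specific
    field_repository: Optional[str] = None         # vetted and commonly used in the field
    dataset_size_gb: float = 0.0


def choose_deposit_location(options: RepositoryOptions) -> str:
    """Walk the workflow in order and return the first option that applies."""
    if options.ico_required_repository:
        return options.ico_required_repository
    if options.nih_approved_repository:
        return options.nih_approved_repository
    if options.field_repository:
        return options.field_repository
    if options.dataset_size_gb <= SMALL_DATASET_LIMIT_GB:
        return "PubMed Central supplementary material"
    return "generalist or institutional repository"
```

For example, a project with no required or discipline-specific repository and a 50 GB dataset would fall through to the last step, a generalist or institutional repository. The result of such a check is a starting point for the DMSP, not a substitute for reading the funding ICO's requirements.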

The NIH's Desirable Characteristics for Data Repositories, a section of its Selecting a Repository guidance, outlines characteristics that a chosen repository should meet as closely as possible. A brief summary of the desirable characteristics follows:

  • Metadata and PIDs: Descriptive metadata enables FAIRness (Findability, Accessibility, Interoperability, and Reusability). A unique persistent identifier (PID) is assigned to at least the data record itself, and to other descriptors where possible (Creator, Organization, Subjects, etc.).
  • Easy Access: Records tagged as Open Access are free to access, reuse is enabled through clear licenses, and data are stored in widely used, preferably non-proprietary formats. Guidance on how to use the data is clear.
  • Long-term sustainability: Repository has a long-term management plan and retention policy.
  • Curation/Provenance: Repository either provides, or allows access to people who provide, curation or quality control assistance.
  • Security/Integrity/Confidentiality: Repository’s level of security matches the sensitivity of the data. User confidentiality is assured.

Any repository chosen for storing human data should also meet the Additional Considerations for Repositories Storing Human Data (even if de-identified) as outlined in the Selecting a Repository guidance. These considerations include stricter controls than the above, and should be reviewed carefully.

Describe How to Access Data and Timelines

Describe in your Data Management and Sharing Plan (DMSP) how your datasets, once deposited, will be accessible online. In addition, aim to provide the best possible estimate of a data sharing timeline based on publishing, policy compliance, and patent/ownership concerns.

Data Accessibility:

  • Explain in your DMSP how datasets or subsets of datasets will be accessible once shared. For instance, describe the kind of identifier that will be assigned when the data is stored online, such as a Digital Object Identifier (DOI) or another type of permanent URL.

Timelines for Data Sharing:

  • Per the NIH, the scientific data generated by the project should be shared as soon as possible, no later than the time of an associated publication or the end of the performance period, whichever comes first.
  • Journals may also have policies for sharing datasets associated with publications. If you know which journal you will publish in at the time of the DMSP submission, and if it has data sharing requirements, you may document them in the DMSP along with how this will affect your data sharing timelines.
  • Though data itself may not be patentable, aspects of your project may need to remain confidential for certain periods of time for license or potential patent-related reasons. Check with your project team and all data owners/collaborators, as well as Northwestern University's Innovations and New Ventures Office, to determine patent-related limitations on data sharing timelines.
  • Describe, where possible, how long the shared data will be made available. The NIH institute or center funding the award may have specific requirements; in other cases, length of availability may depend on the requirements of publishers and similar partners.
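
The "whichever comes first" rule in the first timeline bullet above amounts to taking the earlier of two dates. A minimal sketch in Python, assuming the end of the performance period is known and the publication date may not be; the function name and the dates are hypothetical, and journal or patent-related constraints are not modeled:

```python
from datetime import date
from typing import Optional


def latest_sharing_date(performance_period_end: date,
                        publication_date: Optional[date] = None) -> date:
    """Return the latest date by which data should be shared.

    The data should be shared no later than the associated publication
    or the end of the performance period, whichever comes first.
    """
    if publication_date is None:
        # No associated publication yet: the performance period sets the limit.
        return performance_period_end
    return min(publication_date, performance_period_end)


# Hypothetical dates, for illustration only.
deadline = latest_sharing_date(
    performance_period_end=date(2027, 6, 30),
    publication_date=date(2026, 3, 1),
)
```

Here the publication precedes the end of the performance period, so the publication date is the limit. Sharing earlier than this date remains the stated preference.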