To document the types of data your project expects to create in your Data Management and Sharing Plan (DMSP), answer the following questions:
- What is your data's modality?: In the context of the NIH DMSP, modality refers to the high-level type of data you will be collecting, such as imaging, genomic, text sequences, modeling data, mobile, survey, etc.
- In what format will your data be collected?: Format refers to the type of files, generally denoted by file extension, that your project will create, such as CSV, TSV, XML, JSON, fMRI files, SAV, SAS, or DTA.
- How much data will be collected?: Number of study participants, anticipated number of generated files, etc.
- What is the level of anticipated data aggregation?: Will individual-level data be used for the study, or aggregates of groups of data? Will only aggregated data be shared?
- To what level will the data be processed?: Describe the overall data processing for the project, as well as the level of processing applied to the data that will be shared.
- Which data will be shared?: Based on the aggregation and processing considerations above, describe which data can and will be shared at the project's end. For example, this can be only de-identified data, only de-identified subsets supporting publication, etc. Compliance can consist of sharing a subset of the project's data based on legal, ethical, and policy-based data privacy requirements. A rationale for the level of sharing possible must be provided.
- What metadata and other standard documentation will help others understand the data?: Many metadata standards can be used to describe your datasets, from the most general (Dublin Core, DataCite) to NIH-endorsed Common Data Elements, to standards that are highly specific to a field of study, such as the MIAME and MINSEQE standards.
- To find domain-specific metadata standards, try searching these standards catalogs:
- In addition, if other documentation such as data dictionaries or README files is necessary to understand preserved and shared data, describe these files as well.
Domain-specific standards resources from University of Michigan Library's Research Data Management (Health Sciences) Research Guide, https://guides.lib.umich.edu/datamanagement/describe
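
As one illustration of the data dictionary mentioned above, the following Python sketch drafts a starter dictionary from a tabular (CSV) file. It is only a starting point under stated assumptions: the function names (`build_data_dictionary`, `infer_type`), the sample column names, and the output fields are hypothetical and are not part of any NIH requirement or metadata standard. Variable descriptions, units, and coded values still need to be written by the research team.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative only.
SAMPLE_CSV = (
    "participant_id,age,score,site\n"
    "P001,34,12.5,A\n"
    "P002,,9.0,B\n"
    "P003,41,,A\n"
)


def infer_type(values):
    """Guess a simple type label from a list of non-empty strings."""
    if not values:
        return "unknown"
    try:
        for v in values:
            int(v)
        return "integer"
    except ValueError:
        pass
    try:
        for v in values:
            float(v)
        return "decimal"
    except ValueError:
        return "text"


def build_data_dictionary(csv_text):
    """Return one entry per column: name, guessed type, and missing-value count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        present = [v for v in values if v not in ("", None)]
        entries.append({
            "variable": name,
            "type": infer_type(present),
            "n_missing": len(values) - len(present),
            "description": "",  # left blank: to be written by the research team
        })
    return entries


if __name__ == "__main__":
    for entry in build_data_dictionary(SAMPLE_CSV):
        print(entry)
```

A draft like this can be saved alongside a README file and then completed by hand, so that the shared dataset documents each variable's meaning and not just its name.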