A file naming convention is a set of rules that your research team establishes to name research files consistently. Most current file systems cap a file name at about 255 characters, extension included, and some systems also limit the length of the full path, so the practical ceiling is often lower. Best practice is to make each file name as descriptive as possible while keeping it as short as possible.
The following recommendations for developing a file naming convention are adapted from Stanford Libraries Data Management Services, "Data Best Practices and Case Studies - How to Name Files."
An example file name: 20180817_StrsHlth_Survey1_43584_KGH_verbal_v01.pdf
Document your team's agreed-upon file naming convention and post it where the team's other operational documents are stored.
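A convention is easier to follow when file names are built and checked by a small script rather than typed by hand. The sketch below is one possible approach, not a prescribed one. The component names (date, project, instrument, participant ID, initials, content type, version) are assumptions inferred from the example file name above, and `build_filename` and `parse_filename` are hypothetical helpers.

```python
import re
from datetime import date

# Assumed component order, inferred from the example file name:
# date_project_instrument_participantID_initials_content_version.ext
FILENAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})_"
    r"(?P<project>[A-Za-z0-9]+)_"
    r"(?P<instrument>[A-Za-z0-9]+)_"
    r"(?P<participant_id>\d+)_"
    r"(?P<initials>[A-Z]{2,3})_"
    r"(?P<content>[A-Za-z0-9]+)_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def build_filename(day, project, instrument, participant_id,
                   initials, content, version, ext):
    """Join the components with underscores and zero-pad the version."""
    stem = "_".join([
        day.strftime("%Y%m%d"),   # YYYYMMDD sorts chronologically
        project,
        instrument,
        str(participant_id),
        initials,
        content,
        f"v{version:02d}",
    ])
    return f"{stem}.{ext}"


def parse_filename(name):
    """Return the components as a dict, or None if the name breaks the convention."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None


example = build_filename(date(2018, 8, 17), "StrsHlth", "Survey1",
                         43584, "KGH", "verbal", 1, "pdf")
print(example)  # 20180817_StrsHlth_Survey1_43584_KGH_verbal_v01.pdf
```

Running `parse_filename` over a shared folder is a quick way to flag files that drift from the documented convention.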
Data stored in spreadsheets serve as fuel for analyses. To move more effectively from source worksheet to data manipulation software, it helps to adopt tidy data practices.
The precepts behind tidy data are simple, but if applied correctly they allow for complicated manipulations. The basic guidelines for tidy data are:
1. Make variables (related qualities measured across units) the columns in a spreadsheet.
2. Make observations (representing units or instances of study) the rows in a spreadsheet.
A final golden rule: don't combine variables.
An example of combined variables can be seen in the table below:
Participants | Male<30 | Male>30 | HrtRate |
Participant 1 | | 1 | 80 |
Participant 2 | 1 | | 90 |
In the table above, the variables for gender and age range were combined. A tidy version of this table is below:
Participants | Gender | Age | RestingHeartRate | ExerciseHeartRate |
Participant 1 | M | 49 | 80 | 150 |
Participant 2 | M | 21 | 90 | 170 |
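Splitting a combined column back into separate variables can be scripted. The sketch below uses only the Python standard library and assumes the untidy rows have already been read into dictionaries. The `COMBINED` mapping and the `tidy_row` helper are hypothetical names introduced for illustration. Note that the untidy table only records an age range, so the script can recover the range but not the exact ages shown in the tidy table above.

```python
# Untidy rows, as in the first table: gender and age range share a column.
untidy = [
    {"Participants": "Participant 1", "Male<30": "", "Male>30": "1", "HrtRate": "80"},
    {"Participants": "Participant 2", "Male<30": "1", "Male>30": "", "HrtRate": "90"},
]

# Map each combined column to the two variables it hides.
COMBINED = {
    "Male<30": ("M", "<30"),
    "Male>30": ("M", ">30"),
}


def tidy_row(row):
    """Turn one untidy row into a row with one variable per column."""
    flagged = [col for col in COMBINED if row.get(col) == "1"]
    if len(flagged) != 1:
        raise ValueError(f"expected exactly one flagged column in {row!r}")
    gender, age_range = COMBINED[flagged[0]]
    return {
        "Participants": row["Participants"],
        "Gender": gender,
        "AgeRange": age_range,
        "HeartRate": int(row["HrtRate"]),
    }


tidy = [tidy_row(row) for row in untidy]
for row in tidy:
    print(row)
```

With gender and age range in their own columns, filtering or grouping by either one no longer requires parsing column headers.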
Additional best practices when working with data in spreadsheets:
Creating an organization system for your digital folders can be a challenging task, especially if team or project folders will be shared with others. Follow these basic tips to maximize your use of folders:
Working with data in digital files requires careful versioning to ensure the accuracy and authenticity of both original and modified data. File versioning can be achieved through either manual methods or web-based services.
Manual versioning
The simplest way to practice versioning through manual file management is to save a completely new file, which incorporates all the changes made to the original, with a slightly modified filename that allows recording of the version. If you know at the beginning of a research project that files may go through many versions, you may title the original document with "v01" at the end of the filename. Example filenames are:
The version marker ("v" plus a number) can go at the beginning or the end of the filename, as long as the choice is documented and followed in the researcher's data management procedures. Starting with "01" rather than "1" leaves room for version numbers to reach double digits while keeping the files in order when sorted by name.
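Manual versioning is less error-prone when the next filename is computed rather than typed. The sketch below assumes the version marker sits at the end of the filename, before the extension, in the form "_v01". `next_version` is a hypothetical helper, and the underscore separator is an assumption borrowed from the example file name earlier in this guide.

```python
import re
from pathlib import Path

# Matches a trailing version marker such as "_v01" at the end of the stem.
VERSION_SUFFIX = re.compile(r"_v(\d+)$")


def next_version(filename):
    """Return the filename for the next version of a bare file name.

    A file with no version marker becomes version 01; otherwise the
    number is incremented and its zero-padded width is preserved.
    """
    path = Path(filename)
    match = VERSION_SUFFIX.search(path.stem)
    if match is None:
        return f"{path.stem}_v01{path.suffix}"
    width = len(match.group(1))
    number = int(match.group(1)) + 1
    new_stem = VERSION_SUFFIX.sub(f"_v{number:0{width}d}", path.stem)
    return f"{new_stem}{path.suffix}"


print(next_version("20180817_StrsHlth_Survey1_43584_KGH_verbal_v01.pdf"))
print(next_version("notes.txt"))
```

Save the modified file under the returned name and leave the earlier version untouched, so every version stays available.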
Automated versioning
Several online collaboration and filesharing services offer versioning or version control as part of the service. Three popular services are listed below:
Northwestern OneDrive: A cloud-based storage and collaboration system available to faculty, staff, and students for storing and sharing files. Changes to files stored in OneDrive are saved as they are made, and version history is kept for every file, covering all changes whether they were made by the document owner or by a collaborator.
Google Drive: Like OneDrive, Google Drive supports collaboration on stored files and gives access to prior versions, regardless of who updated a file. In addition, files in Drive can be edited by collaborators in real time.
GitHub: Git, the tool that GitHub is built on, is at its core a version control system that lets developers working on the same software project change and update code without overwriting previous versions. Project and data managers can use GitHub for the same purpose. To make changes, clone a copy of the repository from GitHub, commit the changed files, and push the commits back to the repository, where each commit is kept as a distinct version in the history. For more details see "Understanding the GitHub Flow."