Data Management

Documenting Your Data

Metadata is data about data. Descriptive information associated with your datasets helps you and others make sense of your data and cite your work properly. If you plan to deposit in a data repository, consult the repository directly about its metadata requirements; most data repositories have their own metadata standards.
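As an illustration of keeping descriptive information with a dataset, the sketch below writes a small metadata record to a JSON file stored next to the data file. It is a minimal example under stated assumptions, not a repository requirement: the function name, the ".metadata.json" suffix, and the placeholder values are all hypothetical, and the field names are borrowed loosely from common Dublin Core elements. Your repository or discipline may call for a different standard.

```python
import json
from pathlib import Path

# Hypothetical example record; field names loosely follow Dublin Core
# elements, and every value here is a placeholder, not real project data.
EXAMPLE_RECORD = {
    "title": "Example dataset title",
    "creator": "Researcher name",
    "date": "YYYY-MM-DD",
    "description": "What was measured, how, and under what conditions.",
    "format": "text/csv",
    "rights": "License under which the data may be reused",
}


def write_metadata_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write `metadata` as JSON next to `data_file` and return the new path.

    The sidecar is named after the data file with ".metadata.json" appended
    (an assumed convention for this sketch), so the two sort together.
    """
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar
```

Calling write_metadata_sidecar(Path("survey.csv"), EXAMPLE_RECORD) would create survey.csv.metadata.json in the same directory. A plain-text README serves the same purpose if JSON is not a good fit for your project.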

Questions to Consider: 

  • How will you document your data and project?
  • How will you organize your files into directories, and what naming conventions will you apply? 
  • Which file formats will you use for your data, and why?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?

(Sources: UC Berkeley, UMass Amherst, University of Michigan, Alix Keener, Creative Commons Attribution 4.0 license.)

Best Practices and Standards 

File Naming Conventions

  • Be consistent and descriptive in naming, formatting, and organizing files.

  • Be specific and obvious about what the files contain.

  • File names should allow you to identify the precise experiment or dataset a file belongs to.

You might consider including some of the following information in your file names, but you can include any information that will allow you to distinguish your files from one another:

  • Project or experiment name or acronym

  • Location/spatial coordinates

  • Researcher name/initials

  • Date or date range of experiment

  • Type of data

  • Conditions

  • Version number of file

  • Three-letter file extension for application-specific files

(Adapted from Stanford University Libraries)
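The naming elements listed above can be combined programmatically so that every file follows the same pattern. The sketch below is one possible convention, not a prescribed one: the helper names, the order of the elements, the underscore separator, the YYYYMMDD date format, and the two-digit version number are all assumptions chosen for illustration.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase `text` and collapse runs of non-alphanumerics into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_file_name(project: str, researcher: str, collected: date,
                    data_type: str, version: int, extension: str) -> str:
    """Join the naming elements with '_' in a fixed order.

    The date is written as YYYYMMDD so that files sort chronologically,
    and the version is zero-padded so that v02 sorts before v10.
    """
    parts = [
        slug(project),
        slug(researcher),
        collected.strftime("%Y%m%d"),
        slug(data_type),
        f"v{version:02d}",
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


# Hypothetical inputs for illustration only.
print(build_file_name("Soil Survey", "JD", date(2024, 3, 5),
                      "pH readings", 2, "csv"))
# prints: soil-survey_jd_20240305_ph-readings_v02.csv
```

Generating names from one function, rather than typing them by hand, keeps the convention consistent across a project. Adjust the elements and their order to whatever distinguishes your files best.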

Recommended File Formats for Preservation

Certain file formats are more stable and more likely to be accessible in the future. File formats recommended for preservation integrity have the following characteristics: 

  • Non-proprietary

  • Open, documented standard

  • Common usage by research community

  • Standard representation (ASCII, Unicode)

  • Unencrypted

  • Uncompressed

Examples of preferred format choices:

  • PDF/A, not Word

  • ASCII, not Excel

  • MPEG-4, not Quicktime

  • TIFF or JPEG2000, not GIF or JPG

  • XML or RDF, not RDBMS

Source: MIT 
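To apply the preferences above across a project folder, a short script can flag files whose formats have a preferred alternative. The sketch below is illustrative only. The mapping from file extensions to formats is an assumption (the list above names formats, not extensions), and the function name is hypothetical. It does not convert anything; it only reports a suggestion.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from common extensions to the preferred formats named in
# the list above. Extend or correct it for the formats your project uses.
PREFERRED_ALTERNATIVES = {
    ".doc": "PDF/A",
    ".docx": "PDF/A",
    ".xls": "ASCII text",
    ".xlsx": "ASCII text",
    ".mov": "MPEG-4",
    ".gif": "TIFF or JPEG2000",
    ".jpg": "TIFF or JPEG2000",
    ".jpeg": "TIFF or JPEG2000",
}


def suggest_preservation_format(file_name: str) -> Optional[str]:
    """Return a preferred format for `file_name`, or None if none is listed.

    Only the final extension is inspected, case-insensitively.
    """
    return PREFERRED_ALTERNATIVES.get(Path(file_name).suffix.lower())


# Hypothetical file names for illustration only.
for name in ["Report.DOCX", "results.xlsx", "scan.tif"]:
    suggestion = suggest_preservation_format(name)
    if suggestion is not None:
        print(f"{name}: consider {suggestion}")
# prints:
# Report.DOCX: consider PDF/A
# results.xlsx: consider ASCII text
```

A file that returns None is not necessarily suitable for preservation; it simply is not in this small table. Check the repository's own format guidance before depositing.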

Resources

The metadata standards and data documentation you choose will vary by discipline and research project. Most funders who require a data management plan as part of a proposal offer guidance on their preferred methods of documentation. Additionally, the repository you choose for sharing your data may have its own established standards.

Examples of Best Practices and Standards for Documenting Data:

Data Standards

File Formats

Metadata Tools

This work is licensed under a Creative Commons Attribution 4.0 International License.