Data Management

About This Guide

This guide pulls together data management resources from the web and explains available University of Rhode Island services. 

Managing your research data means making plans for how the data will be created, described, shared, stored, and re-used. Well-organized, documented, and preserved data will be easier for you to find, use, and analyze, and easier for your collaborators to use. Most federal agencies and other research funders now require a Data Management Plan (DMP) to be submitted as part of a grant proposal. The DMP format and requirements vary by funding agency.

Regardless of your funding agency's specific requirements, it is helpful to think of data management in terms of both policies and tools. For each area of data management, you will need to first determine your approach, and then find tools or standards to fit the needs of your data. 

What is Data?

Data is defined as the digital recorded factual material commonly accepted in the scientific community as necessary to validate research findings, including data sets used to support scholarly publications (OMB Circular A-110). The definition excludes certain materials, which are listed after the examples below.

Some examples of research data:

  • Documents (text, Word), spreadsheets
  • Laboratory notebooks, field notebooks, diaries
  • Questionnaires, transcripts, codebooks
  • Audiotapes, videotapes
  • Photographs, films
  • Protein or genetic sequences
  • Spectra
  • Test responses
  • Slides, artifacts, specimens, samples
  • Collection of digital objects acquired and generated during the process of research
  • Database contents (video, audio, text, images)
  • Models, algorithms, scripts
  • Contents of an application (input, output, logfiles for analysis software, simulation software, schemas)
  • Methodologies and workflows
  • Standard operating procedures and protocols

Some kinds of records might not be sharable due to the nature of the records themselves, or to ethical and privacy concerns. As defined by the OMB, research data do not include:

  • preliminary analyses,
  • drafts of scientific papers,
  • plans for future research,
  • peer reviews, or
  • communications with colleagues

Research data also do not include:

(A) Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law; and

(B) Personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study.

(Adapted from University of Oregon Libraries)

The Data Life-Cycle

Steps in the Data Life-Cycle

  • Plan: description of the data that will be compiled, and how the data will be managed and made accessible throughout its lifetime.
  • Collect: observations are made either by hand or with sensors or other instruments and the data are placed into digital form.
  • Assure: the quality of the data is assured through checks and inspections.
  • Describe: data are accurately and thoroughly described using the appropriate metadata standards.
  • Preserve: data are submitted to an appropriate long-term archive (i.e., a data center).
  • Discover: potentially useful data are located and obtained, along with the relevant information about the data (metadata).
  • Integrate: data from disparate sources are combined to form one homogeneous set of data that can be readily analyzed.
  • Analyze: data are analyzed.

(Source: DataONE)
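
As one illustration of the Assure and Describe steps above, the following is a minimal Python sketch, not a recommended tool or a DataONE method. The sample observations, the column names (site, date, temp_c), the accepted temperature range, and the metadata fields are all hypothetical assumptions chosen for the example; a real project would use the checks and the metadata standard appropriate to its discipline.

```python
import csv
import io
import json

# Hypothetical sample data standing in for the output of the Collect step.
RAW = """site,date,temp_c
A,2020-06-01,18.4
B,2020-06-01,
C,2020-06-02,190.0
"""


def check_rows(text, low=-50.0, high=60.0):
    """Assure: split rows into usable rows and a list of (line, problem) pairs.

    The range limits are assumed values for illustration only.
    """
    good, problems = [], []
    # The header is line 1 of the file, so data rows start at line 2.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        value = row["temp_c"].strip()
        if not value:
            problems.append((line_no, "missing temp_c"))
        elif not low <= float(value) <= high:
            problems.append((line_no, "temp_c out of range"))
        else:
            good.append(row)
    return good, problems


def describe(rows):
    """Describe: build a small, hypothetical metadata record for the checked data."""
    return {
        "title": "Example temperature observations",
        "variables": {
            "site": "site code",
            "date": "ISO 8601 date",
            "temp_c": "air temperature, degrees Celsius",
        },
        "row_count": len(rows),
    }


good, problems = check_rows(RAW)
metadata = describe(good)
print(problems)
print(json.dumps(metadata, indent=2))
```

Keeping the list of problems alongside the cleaned rows, rather than silently dropping bad records, leaves a trail that can itself be documented and preserved with the data.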

Sharing & Re-Use

Making your data accessible to other researchers is one of the primary goals of data management. A growing number of online repositories can host your data. Before you decide on a repository or method of sharing, consider what data will be shared and whether you need to place any restrictions or conditions on the data.

Questions to Consider: 

  • How will you make the data available? 
  • Who is responsible for managing and controlling the data? 
  • What data will be shared (raw/derived/published)? 
  • How long must the data be retained?
  • Are there issues with privacy or intellectual property (for example, personal, high-security, or commercially sensitive data)?
  • Who owns the data (intellectual property rights information)? 
  • Under what conditions will data be shared (embargo/CC licenses/upon request)? 

(Sources: UC Berkeley; UMass Amherst; University of Michigan, Alix Keener, Creative Commons Attribution 4.0 license.)
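
The questions above can also be recorded in a structured form so that unanswered items are easy to spot while drafting a plan. The following Python sketch is only one possible approach: the SharingPlan class, its field names, and the sample answers (including the retention period and embargo length) are hypothetical placeholders, not requirements of any funder or repository.

```python
from dataclasses import dataclass, asdict


@dataclass
class SharingPlan:
    """Hypothetical record with one field per sharing question."""
    access_method: str = ""          # How will the data be made available?
    data_steward: str = ""           # Who manages and controls the data?
    data_shared: str = ""            # Raw, derived, or published data?
    retention_years: int = 0         # How long must the data be retained?
    privacy_or_ip_issues: str = ""   # Privacy or intellectual property issues?
    owner: str = ""                  # Who owns the data?
    conditions: str = ""             # Embargo, license, or upon request?


def unanswered(plan):
    """Return the names of fields that are still empty or zero."""
    return [name for name, value in asdict(plan).items() if not value]


# Sample answers are placeholders for illustration only.
plan = SharingPlan(
    access_method="public repository",
    data_steward="principal investigator",
    data_shared="derived",
    retention_years=5,
    conditions="12-month embargo, then CC BY 4.0",
)
print(unanswered(plan))
```

In this example the privacy and ownership questions have not yet been answered, so those two field names are reported.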

This work is licensed under a Creative Commons Attribution 4.0 International License.