Research Data Services

Information about how to organize, describe, preserve and share your research data

What is Metadata?

Think of what information would be needed to understand and analyze your data, and/or replicate your results, 20 years from now. That’s what needs to be included in your metadata. People are fond of saying that metadata is “data about data.” NISO has a nice guide called Understanding Metadata.

For a given research project, metadata are generally created at two levels: project level and data level. Project-level metadata describe the “who, what, where, when, how and why” of the dataset, providing context for understanding why the data were collected and how they were used.

Examples of project-level metadata are: 

  1. Name of the project
  2. Dataset title
  3. Project description
  4. Dataset abstract
  5. Principal investigator and collaborators
  6. Contact information
  7. Dataset handle (DOI or URL)
  8. Dataset citation
  9. Data publication date
  10. Geographic description
  11. Time period of data collection
  12. Subject/keywords
  13. Project sponsor
  14. Dataset usage rights
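
As a minimal sketch, the project-level fields listed above can be kept in a simple machine-readable record and checked for gaps. The field names, the helper `missing_fields`, and every value below are hypothetical placeholders for illustration; they do not follow any particular metadata standard.

```python
import json

# Field names mirror the list above. They are illustrative only and
# are not taken from any metadata standard.
PROJECT_FIELDS = [
    "project_name", "dataset_title", "project_description",
    "dataset_abstract", "investigators", "contact", "dataset_handle",
    "dataset_citation", "publication_date", "geographic_description",
    "collection_period", "keywords", "sponsor", "usage_rights",
]


def missing_fields(record):
    """Return the project-level fields that are absent or empty."""
    return [field for field in PROJECT_FIELDS if not record.get(field)]


# Hypothetical, partially completed record with placeholder values.
record = {
    "project_name": "Example Stream Monitoring Project",
    "dataset_title": "Example stream temperature readings",
    "investigators": ["A. Researcher", "B. Collaborator"],
    "contact": "a.researcher@example.org",
    "collection_period": "2020-01-01/2020-12-31",
    "keywords": ["hydrology", "temperature"],
    "usage_rights": "CC0",
}

print(json.dumps(record, indent=2))
print("Still to document:", missing_fields(record))
```

Keeping the record in a plain structure like this makes it easy to see which items still need to be written down before the dataset is shared.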

Data-level metadata are, perhaps not surprisingly, more granular. They describe the data and the dataset in much greater detail.

Data-level metadata might include: 

  1. Data origin: experimental, observational, raw or derived, physical collections, models, images, etc.
  2. Data type: integer, Boolean, character, floating point, etc.
  3. Instrument(s) used
  4. Data acquisition details: sensor deployment methods, experimental design, sensor calibration methods, etc.
  5. File type: CSV, mat, xlsx, tiff, HDF, NetCDF, etc.
  6. Data processing methods, software used
  7. Data processing scripts or codes
  8. Dataset parameter list, including
    • Variable names
    • Description of each variable
    • Units
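
The parameter list in item 8 is often kept as a small "data dictionary" file stored next to the data. The sketch below shows one possible way to write such a file as CSV. The variables, the column names, and the function `data_dictionary_csv` are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical variables for an imaginary stream-temperature dataset.
variables = [
    {"name": "site_id", "description": "Identifier of the sampling site",
     "units": "", "data_type": "character"},
    {"name": "depth", "description": "Sensor depth below the surface",
     "units": "m", "data_type": "floating point"},
    {"name": "water_temp", "description": "Water temperature at sensor depth",
     "units": "degrees Celsius", "data_type": "floating point"},
]


def data_dictionary_csv(variables):
    """Render the variable list as CSV text, one row per variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["name", "description", "units", "data_type"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(variables)
    return buffer.getvalue()


print(data_dictionary_csv(variables))
```

A plain CSV file is used here only because it opens in almost any tool; a metadata standard for your discipline may prescribe its own format for the same information.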

More background on metadata...

  • National Information Standards Organization (NISO): Understanding Metadata
    A PDF document describing metadata, from NISO. NISO, a non-profit association accredited by the American National Standards Institute (ANSI), identifies, develops, maintains, and publishes technical standards to manage information in our changing and ever-more digital environment.
  • Three Categories of Metadata
    View a table summarizing the goals, elements, and sample implementations of the three categories of metadata presented by Cornell University Library.

Selected Metadata Standards & Tools

Discipline: General research data
Standard: DataCite metadata schema
Tools:
  • Open-source tools from DataCite

Discipline: Social, behavioral, and economic sciences
Standard: Data Documentation Initiative (DDI)
Tools:
  • See all DDI tools
  • Nesstar Publisher: standalone tool to create DDI metadata (Windows only)
  • Colectica: Excel add-on that creates DDI metadata (Windows only)

Discipline: Ecology & Biology
Standard: Ecological Metadata Language (EML)
Tools:
  • See all EML tools
  • Morpho: standalone tool (all platforms) that creates metadata, edits data, and uploads both to the Knowledge Network for Biocomplexity
  • DataUp: Excel add-on (Windows only) or web-based tool (all platforms) that creates EML metadata and checks your spreadsheet for compliance with best practices

Discipline: Geospatial
Standard: ISO 19115 (in transition from FGDC CSDGM)
Tools:
  • NOAA's NCDDC page on metadata: links to information about the ISO 19115 metadata standard for environmental data, keywords and controlled vocabularies, and the MERMAid (Metadata Enterprise Resource Management Aid) tool for creating and publishing metadata
  • Federal Geographic Data Committee (FGDC) page on metadata & metadata tools
  • USGS metadata page
  • Metavist: for FGDC metadata creation

Readme template

Sometimes existing metadata standards are not appropriate for a particular dataset, or a research group cannot use them for other reasons. In that case, it is important to look for alternatives and decide how the group is going to document its datasets. The simplest way to record metadata is in a text file, often referred to as a readme file. Structuring the information in a readme file helps ensure that the metadata are thorough and complete, and makes it easier for members of the same team to share the data. To be most useful, a readme template should be tailored to the research group and to the kind of data it documents.

A generic readme template can be found below. This template is published under a CC0 license; feel free to modify and reuse it as you wish. It was designed for documenting datasets that are made publicly available in a repository, but it can be adapted to working datasets.
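
As a rough illustration of structuring a readme file (this is not the template referred to above), the sketch below fills a small text skeleton from a few values. The section headings, the function `build_readme`, and all sample values are hypothetical and should be adapted to your own group and data.

```python
from string import Template

# Hypothetical readme skeleton; the headings are illustrative only.
README_TEMPLATE = Template("""\
GENERAL INFORMATION
Dataset title: $title
Contact: $contact
Date of data collection: $collection_period

FILE OVERVIEW
$file_list

VARIABLES
$variable_list
""")


def build_readme(title, contact, collection_period, files, variables):
    """Fill the skeleton and return the readme text.

    files: dict mapping file name to a short description.
    variables: list of (name, units, description) tuples.
    """
    file_list = "\n".join(
        f"- {name}: {desc}" for name, desc in files.items()
    )
    variable_list = "\n".join(
        f"- {name} ({units}): {desc}" for name, units, desc in variables
    )
    return README_TEMPLATE.substitute(
        title=title,
        contact=contact,
        collection_period=collection_period,
        file_list=file_list,
        variable_list=variable_list,
    )


# Placeholder values for an imaginary dataset.
readme = build_readme(
    title="Example stream temperature readings",
    contact="a.researcher@example.org",
    collection_period="2020-01-01 to 2020-12-31",
    files={"readings.csv": "hourly sensor readings"},
    variables=[("water_temp", "degrees Celsius", "water temperature")],
)
print(readme)
```

Generating the readme from one shared skeleton is one way to keep every dataset in a group documented with the same sections in the same order.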

Other resources

The ETDplus project has published a Metadata guidance brief. It is a short "how to" document written for a student audience, designed to assist students with data management issues related to their theses and dissertations. 

You can access all six ETDplus Guidance Briefs through the Tools and resources page.