After defining what we mean by data, it is helpful to consider what types of data you create or work with, and what formats those data take. Both the types and the formats will shape your data stewardship practices.
Data types generally fall into five categories:
Observational
- Captured in situ
- Can’t be recaptured, recreated or replaced
- Examples: sensor readings, sensory (human) observations, survey results
Experimental
- Collected under controlled conditions, in situ or laboratory-based
- Should be reproducible, but can be expensive to reproduce
- Examples: gene sequences, chromatograms, spectroscopy, microscopy
Derived or compiled
- Reproducible, but can be very expensive to reproduce
- Examples: text and data mining, derived variables, compiled databases, 3D models
Simulation
- Results from using a model to study the behavior and performance of an actual or theoretical system
- Consists of models and metadata, where the input can be more important than the output data
- Examples: climate models, economic models, biogeochemical models
Reference or canonical
- Static or organic collections of (peer-reviewed) datasets, most probably published and/or curated
- Examples: gene sequence databanks, chemical structures, census data, spatial data portals
Research data come in many formats: text, numeric, multimedia, models, software languages, discipline-specific (e.g. crystallographic information file (CIF) in chemistry), and instrument-specific.
Formats more likely to be accessible in the future are:
- Open, documented standards
- In common usage by the research community
- Using standard character encodings (ASCII, UTF-8)
- Uncompressed (desirable, space permitting)
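Two of the criteria above, standard character encoding and lack of compression, can be checked roughly with a short script. The following is a minimal Python sketch, not an official tool: the function names and the list of compressed extensions are assumptions made for illustration, and the encoding check is only a heuristic, since it confirms that a file decodes as UTF-8 but not that it is an open or well-documented format.

```python
from pathlib import Path

# Hypothetical list of extensions treated as compressed; adjust for your own files.
COMPRESSED_SUFFIXES = {".zip", ".gz", ".bz2", ".xz", ".7z", ".rar"}


def is_utf8_text(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (ASCII is a subset of UTF-8)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def looks_compressed(filename: str) -> bool:
    """Return True if the final file extension is in the compressed list above."""
    return Path(filename).suffix.lower() in COMPRESSED_SUFFIXES


def check_file(path: str) -> dict:
    """Summarize both checks for one file on disk."""
    p = Path(path)
    return {
        "utf8_text": is_utf8_text(p.read_bytes()),
        "compressed": looks_compressed(p.name),
    }
```

Calling `check_file` on a data file returns a small dictionary that flags files worth a closer look before they are deposited or shared.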
A table with appropriate and recommended formats for preserving and sharing research data over the long term can be found in the ScholarsArchive@OSU user guide.
Sources: University of Edinburgh Information Services
University of Oregon Libraries
California Digital Libraries
The ETDplus project has published a File Formats guidance brief. It is a short "how to" document written for a student audience, designed to assist students with data management issues related to their theses and dissertations.
You can access the six Guidance Briefs from ETDplus through the Tools and Resources page.