This is the "Archiving & Preservation" page of the "Research Data Services" guide.

Research Data Services  

Last Updated: Jan 14, 2014 URL: http://guides.library.oregonstate.edu/research-data-services


Data Archiving & Preservation

The difference between backing up & archiving

The terms "backup" and "archiving" are often used interchangeably because both involve saving a specific version of a file, but they are very different processes. A backup is a copy of files that are expected to keep changing. Backups are kept for a set period and can be discarded once that period has passed. Archiving preserves a file as-is, often at the end of a project, so that it acts as a static (and usually final) record. [source - DataONE education module]

Plan ahead to preserve your data

In addition to planning for local archive storage options (local server, network or OSU’s digital repository), we recommend that you investigate public data repositories within your subject area or discipline. A searchable list of repositories can be found here, and a list of repositories by discipline is here. See Data Repositories for more information on that option.

In many cases, OSU’s digital repository (or “institutional repository”) ScholarsArchive@OSU (SA@OSU) can be a suitable archive and sharing mechanism for your data. All items deposited into SA@OSU receive a persistent identifier (DOI or ARK), are freely available to anyone, and are full-text searchable, making them discoverable through Google, Google Scholar and other large search engines. If you are interested in depositing data into SA@OSU, or have further questions, please contact us (link).

Things to consider when archiving your data

  • File formats for long term access: The file format in which you keep your data is a primary factor in whether anyone, including you, will be able to use your data in the future. Plan for both hardware and software obsolescence. See the section Organizing Files and File Formats for details on preferable long-term storage file formats.
  • Don’t forget the documentation: Document your research and data so others can interpret the data. It is important to begin to document your data at the very beginning of your research project and continue throughout the project.
  • OSU/OUS data retention policy
    University faculty and researchers have a responsibility to maintain research data and make that data available for preservation by the University both as a matter of research integrity, and because of the University’s ownership rights. Research data must be archived for five years after the closeout, final reporting or publication of a project, with original data retained wherever possible. Additional data sharing and/or archiving requirements may be imposed by the sponsoring agency; the PI is responsible for complying with such requirements.
  • Ownership and privacy
    Make sure that you have considered the implications of sharing data, in terms of copyright and IP ownership, and ethical requirements like privacy and confidentiality. Data generated by research projects at or under the auspices of Oregon State University are owned by the University. However, the principal investigator (PI) is responsible for retention, preservation, distribution, and control of the data.

Maintaining the integrity of your data

Digital data are fragile, regardless of which storage medium you choose (DVD, hard disk, tapes, etc.). Digital data are susceptible to bit rot, and are likely to degrade or decay over time. The recommended methods for combatting bit rot are refreshment and replication.

Refreshment: Periodically copy your data onto a new drive or disk (every 2-5 years).
Replication: Maintain your original copy, an external copy, and an external remote copy. Use at least two forms of storage in two different locations.
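
Refreshment and replication only help if each copy is still identical to the original. One common way to check this is to compare checksums of the files in each copy. The sketch below is an illustration rather than an OSU-recommended tool: it assumes SHA-256 as the checksum, and the function names (sha256_of, build_manifest, compare_copies) are hypothetical names chosen for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(original_root, copy_root):
    """Return relative paths that are missing from, or differ in, the copy."""
    original = build_manifest(original_root)
    copy = build_manifest(copy_root)
    return sorted(
        name for name, checksum in original.items()
        if copy.get(name) != checksum
    )
```

Calling compare_copies on the original folder and a refreshed or remote copy returns an empty list when every original file is present and unchanged in the copy. Any path it returns should be re-copied from a known good source. Files that exist only in the copy are not reported.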

For long-term archiving of finalized data, personal computers and external storage devices are NOT recommended. Networked file servers managed by the information services group in your college or department, or by OSU’s centralized computing group (link), are the best choice. See the OSU Community Network (CN) services and pricing for more details.

Software Obsolescence

Does anyone remember Quattro Pro or Lotus 1-2-3? Exactly. When you archive the final version of your dataset(s), consider using an open, non-proprietary format to ensure that you will be able to fully access it/them in the future. Common open formats for data include plain text (ASCII), HDF and NetCDF. Multimedia formats include JPEG 2000, MNG and PNG. For a list of many other open formats, see here.

If you prefer to keep your data in a proprietary format, there are a couple of ways to ensure continued access to older datasets. When new software versions are released and become established, migrate your older datasets to the newer version or package. If the software becomes obsolete, you may be able to emulate it using a virtual machine. The recommended best practice, however, is to convert your data to an open format, which facilitates both preservation and sharing.
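
As a small illustration of saving data in an open format, the sketch below writes tabular data to a UTF-8 CSV file (plain text) and reads it back to confirm the copy is usable. It assumes the data have already been exported from the original software into Python dictionaries. The function names and the example column names are hypothetical and are not taken from any OSU tool.

```python
import csv


def write_open_copy(rows, fieldnames, csv_path):
    """Write rows (a list of dicts) to a UTF-8 CSV file with a header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_open_copy(csv_path):
    """Read the CSV back as a list of dicts; every value is returned as a string."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # Hypothetical example data: one measurement per row.
    rows = [
        {"site": "A", "temp_c": "12.5"},
        {"site": "B", "temp_c": "9.8"},
    ]
    write_open_copy(rows, ["site", "temp_c"], "measurements.csv")
    print(read_open_copy("measurements.csv") == rows)
```

Because CSV stores everything as text, units and data types are not recorded in the file itself, so they belong in the accompanying documentation.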

Adapted from: University of Oregon | University of Virginia
