To maximize your ability to share, preserve, and reuse digital files, choose their formats carefully. A well-chosen format reduces the risk that your data becomes unreadable when a proprietary format is no longer supported or available.
Formats more likely to remain accessible in the future are:
- Non-proprietary
- Based on open, documented standards
- In common use by the research community
- Encoded with standard character encodings (ASCII, UTF-8)
- Uncompressed (desirable, space permitting)
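One way to check the character-encoding criterion before depositing a plain-text file is sketched below. It is a minimal illustration, not part of any ScholarsArchive@OSU workflow; the function name `detect_standard_encoding` is hypothetical, and the check only confirms that the bytes decode cleanly as ASCII or UTF-8.

```python
from pathlib import Path


def detect_standard_encoding(path):
    """Return "ascii", "utf-8", or None for the file at `path`.

    Hypothetical helper: None means the bytes are not valid ASCII or UTF-8,
    so the file likely uses another encoding (for example ISO 8859-x).
    """
    data = Path(path).read_bytes()
    # Try the stricter encoding first: every ASCII file is also valid UTF-8.
    for encoding in ("ascii", "utf-8"):
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None
```

A file that returns `None` could be re-saved as UTF-8 in a text editor before it is deposited.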
Most content deposited to ScholarsArchive@OSU is textual in nature: theses and dissertations, research articles, presentations, technical reports, conference proceedings, posters, etc. This content must be submitted as PDF. PDF/A-1 (ISO 19005-1) with fonts embedded (.pdf) is preferred. PDF without embedded fonts is also acceptable but not recommended. To save a Microsoft Word document as a PDF with fonts embedded, follow these instructions: https://www.bc.edu/content/dam/files/libraries/pdf/embed-fonts.pdf.
For other content types--such as quantitative and statistical data, spreadsheets, databases, graphics, audio, and video (among others)--use the table below to find an appropriate and recommended format for preserving and sharing your digital files in ScholarsArchive@OSU over the long term.
| Format | Highest Confidence | Medium Confidence | Lowest Confidence |
| --- | --- | --- | --- |
| Text | Plain text -- US-ASCII, UTF-8, UTF-16 with BOM (.txt); SGML with included DTD (.sgm, .sgml); XML with included schema (.xml); PDF/A-1 -- ISO 19005-1 (.pdf) | Plain text -- ISO 8859-x (.txt); Rich Text Format 1.x (.rtf); Cascading Style Sheets (.css); HTML (.html, .htm); LaTeX with referenced files (.latex, .tex); OpenDocument Text (.odt, .sxw); MS Word 2007+ (OOXML) (.docx); PDF with fonts embedded (.pdf) | Microsoft Word (.doc); WordPerfect (.wpd); all others |
| Digitized Books, Maps, Paper etc. | JPEG2000 -- lossless (.jp2); TIFF -- uncompressed (.tiff); PDF/A-1 -- ISO 19005-1 (.pdf) | n/a | All others |
| Raster Graphics | TIFF -- uncompressed or CCITT 4 compressed (.tiff); JPEG2000 -- lossless compression (.jp2); PNG -- 24-bit true color (.png) | TIFF -- compressed (.tiff); JPEG (.jpg); JPEG2000 -- lossy compression (.jp2); GIF (.gif); Digital Negative DNG (.dng); BMP (.bmp); PNG -- 8-bit indexed (.png) | Photoshop (.psd); MrSID (.sid); RAW files; all others |
| Vector Graphics | SVG -- no JavaScript binding (.svg); PDF/A-1 -- ISO 19005-1 (.pdf) | Computer Graphics Metafile (.cgm) | Encapsulated PostScript (.eps); Macromedia Flash (.swf); all others |
| Digitized Audio | BWAV LPCM (.bwav, .wav), 24-bit, 96 kHz | n/a | all others |
| Born Digital Audio | AIFF -- PCM (.aif, .aiff), LPCM codec; WAV -- PCM (.wav), LPCM codec | SUN audio -- uncompressed (.au, .snd); Standard MIDI (.mid); Free Lossless Audio Codec (.flac); Apple Lossless Audio Codec (ALAC) (.m4a); MP3 (.mp3); Advanced Audio Coding (.mp4) | AIFC -- compressed AIFF (.aifc); RealAudio (.rm, .ra); Windows Media Audio (.wma); WAV -- compressed (.wav); Ogg Vorbis (.ogg) (lossy); all others |
| Digitized Video | FFV1/Matroska (.mkv); AVI -- uncompressed (.avi); QuickTime -- uncompressed, motion JPEG (.mov); Uncompressed MXF (.mxf) | Motion JPEG 2000 (.jp2) | ProRes (.mov) |
| Born Digital Video | FFV1/Matroska (.mkv); AVI -- uncompressed (.avi); QuickTime -- uncompressed, motion JPEG (.mov); Uncompressed MXF (.mxf); MPEG-4 (.mp4), H.264 | MPEG-1, MPEG-2 (.mp1, .mp2); Ogg Theora (.ogv, .ogg); ProRes (.mov); Motion JPEG 2000 (.jp2) | Windows Media Video (.wmv); RealVideo (.rm, .rv); all others |
| Spreadsheet or Database | Comma- or tab-separated values (.csv, .tsv, .txt); Delimited text; SIARD: Software Independent Archiving of Relational Databases (.siard) | dBASE (.dbf); OpenDocument Spreadsheet (.ods); MS Excel 2007+ (OOXML) (.xlsx) | Excel (.xls); all others |
| Computer Programs | Computer program source code | Compiled / Executable files |  |
| Presentation | PDF/A-1 -- ISO 19005-1 (.pdf) | OpenDocument Presentation (.odp); MS PowerPoint 2007+ (OOXML) (.pptx) | PowerPoint (.ppt); all others |
| Geospatial | GeoTIFF (.tif) | ESRI Shapefile, making sure all component files are present (.shp, .shx, .dbf); ESRI Geodatabase (.gdb) (prefer Shapefiles); ESRI Export Format (.e00); Geography Markup Language (GML) (.gml); Keyhole Markup Language (KML) (.kml, .kmz) | Other ESRI files |
| Containers | Zip -- no compression; tar (.tar) | Zip -- compressed | All others |
| Quantitative and Statistical Data (See also: Spreadsheet or Database) | Comma- or tab-separated values (.csv, .tsv, .txt); Structured text or markup file containing metadata information: Data Documentation Initiative (.ddi), XML (.xml), JSON (.json); SIARD: Software Independent Archiving of Relational Databases (.siard); HDF5 (.hdf) | SPSS (.sav, .sps, .spv, .spo); SAS (.sas, .sas7bdat); R (.R); HDF4 (.hdf) | Excel (.xls); Other proprietary formats |
| CAD (See also: Vector Graphics) | Industry Foundation Classes (.ifc); Standard for the Exchange of Product Model Data (.step, .stp, .p21); Initial Graphics Exchange Specification (.igs) | Autodesk’s Drawing Interchange File Format/Data eXchange Format (.dxf); AutoCAD (.dwg); Extensible 3D (.x3D); Universal 3D (.u3D); Portable Document Format/Engineering or PDF3D (.pdf) | Other proprietary CAD formats |
| Email | MBOX; EML | MSG; PST |  |
Table reused courtesy of University of Washington Libraries: Preferred File Formats—UW Libraries. (n.d.). Retrieved January 28, 2021, from https://www.lib.washington.edu/preservation/preservation_services/digitization-and-digital-preservation/preferred-file-formats
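The Containers row ranks Zip without compression above compressed Zip. A minimal sketch of how such an archive could be produced and checked with Python's standard `zipfile` module follows; the helper names `make_stored_zip` and `is_uncompressed` are hypothetical and not part of any ScholarsArchive@OSU tooling.

```python
import zipfile
from pathlib import Path


def make_stored_zip(archive_path, files):
    """Write `files` into a Zip archive without compressing them."""
    # ZIP_STORED keeps each member's bytes as-is (no compression).
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for file_path in files:
            file_path = Path(file_path)
            # arcname keeps only the file name, not the directory path.
            archive.write(file_path, arcname=file_path.name)


def is_uncompressed(archive_path):
    """Return True if every member of the archive is stored uncompressed."""
    with zipfile.ZipFile(archive_path) as archive:
        return all(
            info.compress_type == zipfile.ZIP_STORED
            for info in archive.infolist()
        )
```

Because members are stored by file name only, this sketch assumes the files being bundled have distinct names.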