
File Formats for Data

File formats play an important role in facilitating replication, new research, and long-term preservation. While you may collect and analyze data in specialized instrument or application formats, you should aim to share and archive data in formats that can be opened and manipulated by a wide variety of software—sometimes called non-proprietary, open, or platform-independent formats.
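As a rough illustration of moving data out of an application-specific form and into open formats, the sketch below writes the same small table as comma-separated text, tab-separated text, and JSON using only the Python standard library. The sample records, the field names, and the `to_delimited` helper are hypothetical and exist only for this example; a repository may ask for different conventions.

```python
import csv
import io
import json

# Hypothetical sample records standing in for data exported from an
# instrument or application; the field names are assumptions.
records = [
    {"site": "A", "temperature_c": 21.5, "note": "clear, calm"},
    {"site": "B", "temperature_c": 19.0, "note": "light rain"},
]
fieldnames = ["site", "temperature_c", "note"]


def to_delimited(rows, fieldnames, delimiter=","):
    """Serialize rows as delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_delimited(records, fieldnames)        # comma-separated (.csv)
tsv_text = to_delimited(records, fieldnames, "\t")  # tab-separated (.tsv)
json_text = json.dumps(records, indent=2)           # JSON (.json)

print(csv_text)
print(tsv_text)
print(json_text)
```

The `csv` module quotes any field that contains the delimiter, so the note "clear, calm" is wrapped in quotes in the comma-separated output and left unquoted in the tab-separated output. To save the results, write each string to a file opened with an explicit encoding such as UTF-8.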

The chart below lists some file formats that are suitable for sharing and archiving data. For additional information, see the links below. If you submit data to a repository or archive, the data managers will probably request specific formats.

Please suggest additions or revisions – email us

More information on file formats:

Sustainability of Digital Formats (Library of Congress)

Examples of open formats (Wikipedia)

Type of data Recommended formats

Tables, spreadsheets, and databases

Tab-separated tables (.txt — sometimes .tsv or .tab)


Comma-separated tables (.csv or .txt)


Tables using another standard delimiter (e.g., colon or pipe)




OpenDocument Spreadsheet (.ods)


OpenDocument Database (.odb)

Multidimensional arrays

Common Data Format (.cdf)


Network Common Data Format (.nc)


Hierarchical Data Format (usually .hdf or .h5)


See recommendations for tables, spreadsheets, and databases (above)


Data Documentation Initiative (.xml)

Web data

JavaScript Object Notation (.json)


Extensible Markup Language (.xml)


Hypertext Markup Language (.html)

Image files

JPEG (.jpg) or JPEG 2000 (.jp2)


Portable Network Graphics (.png)


Scalable Vector Graphics (.svg)


TIFF (.tiff or .tif)


Portable Document Format (.pdf)

Sound files

WAVE (.wav)


FLAC (.flac)


MP3, formally MPEG Audio Layer III (.mp3; usually suitable for human voice and moderate-quality audio, but may not be suitable for high-fidelity audio)

Video files

MPEG-4 (.mp4)


Motion JPEG 2000 (.mj2)

Geospatial data

GeoTIFF, a georeferenced TIFF (.tif or .tiff)


ESRI Shapefile (.shp, .shx, .dbf)


Keyhole Markup Language (.kml)

Software code

Plain text (usually with an extension that represents the source language)


Plain text (.txt)


Portable Document Format (.pdf)


LaTeX documents (.tex)


Hypertext Markup Language (.html)


OpenDocument Text (.odt)


Extensible Markup Language (.xml)


EPUB (.epub)