Where and how can I publish my dataset?
There are thousands of data repositories, so it’s important to find the right one for your dataset. Please contact your subject area librarian or a Research Data Service team member if you’d like help selecting a repository or depositing your data.
What to look for in a repository
- Does it follow FAIR guidelines?
  - (F)indable: Data are assigned a globally unique and persistent identifier, described with rich metadata, and registered or indexed in a searchable resource
  - (A)ccessible: Data are retrievable by their identifier using a free and universally implementable protocol, with metadata available even if data are no longer available
  - (I)nteroperable: Metadata use a formal, accessible, shared, and broadly applicable language/vocabulary for knowledge representation
  - (R)eusable: (Meta)data are released with a clear and accessible data usage license, are associated with detailed provenance, and meet domain-relevant community standards
- Does it adhere to your funder's requirements?
- Does it allow you to provide the metadata that will help your data be easily found?
  - Many research fields have their own metadata standards, so you'll want to be sure the repository supports the one your data needs.
- Does it have the functionality that your dataset needs?
  - Can it handle the size of your dataset?
  - Can it handle the structure that you want for your dataset?
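If you are comparing several candidate repositories, the checklist above can be recorded in a small script. The following is only a minimal sketch: the `RepositoryProfile` class, its field names, and the `unmet_criteria` function are hypothetical names invented for illustration, and the example values describe a made-up repository, not a real one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RepositoryProfile:
    """Hypothetical record of what you found out about one candidate repository."""
    name: str
    assigns_persistent_id: bool      # Findable: each dataset gets a persistent identifier
    open_protocol_access: bool       # Accessible: data retrievable over a free, open protocol
    uses_shared_vocabulary: bool     # Interoperable: metadata in a shared vocabulary
    offers_clear_license: bool       # Reusable: clear data usage license
    meets_funder_requirements: bool
    supports_needed_metadata: bool
    max_dataset_gb: float            # largest dataset the repository accepts (assumed unit: GB)


def unmet_criteria(repo: RepositoryProfile, dataset_gb: float) -> List[str]:
    """Return the checklist items this repository fails for a dataset of the given size."""
    checks = {
        "persistent identifier (Findable)": repo.assigns_persistent_id,
        "open access protocol (Accessible)": repo.open_protocol_access,
        "shared metadata vocabulary (Interoperable)": repo.uses_shared_vocabulary,
        "clear usage license (Reusable)": repo.offers_clear_license,
        "funder requirements": repo.meets_funder_requirements,
        "metadata support": repo.supports_needed_metadata,
        "dataset size": dataset_gb <= repo.max_dataset_gb,
    }
    # Keep only the labels whose check did not pass, in checklist order.
    return [label for label, passed in checks.items() if not passed]


# Made-up example values, for illustration only.
example = RepositoryProfile(
    name="Example Repository",
    assigns_persistent_id=True,
    open_protocol_access=True,
    uses_shared_vocabulary=True,
    offers_clear_license=True,
    meets_funder_requirements=False,
    supports_needed_metadata=True,
    max_dataset_gb=50.0,
)
print(unmet_criteria(example, dataset_gb=120.0))
```

An empty list means the repository passed every item you recorded. The script does not check anything by itself; the answers still have to come from each repository's documentation or from a librarian.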
Registries/Lists of Data Repositories
- Re3data.org (index of 2,000 data repositories across all disciplines)
- Nature's list of recommended data repositories
- NIH-supported domain-specific repositories
Repository options hosted by Princeton
Princeton Research Data Repository
Princeton has an institutional repository, called DataSpace, for archiving and publicly disseminating digital research data generated by members of the Princeton community.
Other data publishing options at Princeton
- Office of Population Research’s Data Archive - https://opr.princeton.edu/archive/
General, cross-disciplinary data repositories that might be right for your data
Zenodo - https://zenodo.org
Zenodo welcomes research outputs from all fields of research. It accepts any file format, as well as both positive and negative results, and promotes peer-reviewed, openly accessible research.
Dryad - https://datadryad.org
Dryad hosts research data underlying scientific and medical publications. Most data in the repository are associated with peer-reviewed journal articles, but data associated with dissertations and books are also accepted.
Open Science Framework - https://osf.io
OSF is a free, open-source project management tool that provides support through the entire project lifecycle, including pre-registration, collaboration, and storage and publication of data.
Some examples of funder-specific repositories
NIMH Data Archive - https://nda.nih.gov
The National Institute of Mental Health Data Archive (NDA) makes available human subjects data collected from hundreds of research projects across many scientific domains. In addition to NIMH, other institutes use NDA as well, including NIAAA (NOT-AA-19-020).
NIH general - https://nih.figshare.com
A general data repository for all NIH-funded researchers, developed in partnership with Figshare.
NSF National Centers for Environmental Information - http://www.ncei.noaa.gov
NCEI is a consolidation of the former National Oceanographic Data Center (NODC), National Climatic Data Center (NCDC), and National Geophysical Data Center (NGDC).
Some examples of discipline-specific repositories
DataONE - https://www.dataone.org
The Data Observation Network for Earth (DataONE) supports environmental science through a distributed framework and sustainable cyberinfrastructure that provides open, persistent, robust, and secure access to well-described and easily discovered Earth observational data.
VertNet - http://vertnet.org
VertNet is an NSF-funded collaborative project that makes biodiversity data free and available on the web. It is designed to help people discover, capture, and publish biodiversity data.
OpenNeuro - https://openneuro.org
A free and open platform for sharing MRI, MEG, EEG, iEEG, and ECoG data.
Social & Behavioral Sciences
ICPSR - https://www.icpsr.umich.edu/icpsrweb/ICPSR
ICPSR maintains a data archive of more than 250,000 files of research in the social and behavioral sciences. It hosts 21 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism, and other fields.
Code Ocean - https://codeocean.com
Code Ocean is a research collaboration platform that covers the entire lifecycle from the beginning of a project through publication. With direct access to cloud computing and reproducibility best practices built in, no extra software or hardware is needed.
CORE - https://hcommons.org/core
Commons Open Repository Exchange is a repository that allows users to preserve their research and increase its reach by sharing it across disciplinary, institutional, and geographic boundaries.