About DataSpace


Please note that we are in the process of moving from DataSpace to our new research data repository. 

All new submitters, please use Princeton Data Commons. 


DataSpace is an online repository designed for archiving and publicly disseminating digital objects that result from research, academic, or administrative work performed by members of the Princeton University community. It is home to several research collections, organized by department and/or research center, as well as Princeton Theses and Dissertations and some library collections searchable via the Library Catalog.

Browse DataSpace Collections

Why does Princeton University have an online data repository?

Princeton University is committed to supporting research that has the potential to benefit humanity at large (see Princeton's spotlight on research). Research conducted by members of the Princeton community is often publicly funded and therefore entails an explicit responsibility to share the products of the research openly with taxpayers. Some academic journals have data policies requiring that the evidence backing submitted research papers be posted online for peer review and public scrutiny. An increasing number of Princeton researchers also embrace the principles of open research in general, committing to share their data and/or code for the sake of transparency, accountability, and the democratization of knowledge, regardless of whether their funders and publishers require it. DataSpace serves as an institutional conduit for long-term archiving and open dissemination of any digital research product stemming from the Princeton community, in keeping with funder and publisher expectations as well as open research principles.

Who has access to items in DataSpace?

Anyone can browse, download, and investigate the items stored in DataSpace, free of charge. Many, if not most, of these items are intended for other specialists in a given field of research, however. DataSpace does not provide guides for interpretation or understanding beyond the technical documentation and references to related publications supplied by contributing researchers. In addition, DataSpace users should be aware that individual contributors typically retain copyright in their research works, so permission may be required for some forms of use and re-use, depending on the item.

Who contributes to DataSpace?

All faculty, staff, students, and affiliates of Princeton University are welcome to submit their digital research products to DataSpace. If you are a Princeton-affiliated researcher and would like to submit your work to DataSpace, please contact the DataSpace curators to get started.

What types of resources are included in DataSpace?

DataSpace includes digital research data and programming code from all academic domains at Princeton--engineering and applied sciences, humanities, natural sciences, and social sciences--as well as digital copies of Theses and Dissertations dating as far back as 1924. Data files range from tabular datasets, to images and videos, to output files from specialty instruments. Code files include analysis and simulation scripts, along with setup files and source code for custom software. Items with large files or many files organized in a folder hierarchy are typically compressed into consolidated archive files. Most items also include plain text “README” files to explain what is provided and how it can be used. (For storing and browsing written scholarly works produced by members of the Princeton University community, please refer to the Open Access Repository. To search for senior theses, please visit the Library Catalog.)

How should published datasets be cited?

While DataSpace includes items that are dedicated to the public domain, the majority of items require attribution as a condition of reuse. Moreover, published datasets (and accompanying code) are increasingly recognized as stand-alone products of academic research, warranting citation customs similar to those of articles and monographs to ensure that the contributors are given credit for their painstaking work (see a recent Comment article in Nature). Conventions for data citation vary by discipline and publisher, but general guidelines are provided in the Joint Declaration of Data Citation Principles. Further general considerations can be found in a recent article in Nature Scientific Data, and the Princeton University Library has a guide for citing social scientific data. At a minimum, the DataSpace curators recommend that citations for items in DataSpace include the creators/contributors, year of publication, title, publisher, and persistent identifier (DOI, if available).

Some style guides offer specific standards for dataset citation. For example, the APA Publication Manual, 7th Ed. (2020), Section 10.9 provides for an author-date in-text citation format very similar to that of other research works, with a reference list item format that is supplemented by data repository identifiers, version numbers, and a brief type description (see the APA's own example). Here is an example citation for a DataSpace item following the APA style:

LaChance, J., & Cohen, D. (2020). Fluorescence Reconstruction Microscopy Data: Keratinocyte (10x) Phase to Nuclei (DAPI) [Data set]. DataSpace. https://doi.org/10.34770/0pt9-qd20

As an alternative example, the IEEE Reference Guide (V 11.12.2018), Section II-D provides details for citing datasets with different forms of identifiers. Here is the above example DataSpace item given instead in the basic format for IEEE:

J. LaChance and D. Cohen, “Fluorescence Reconstruction Microscopy Data: Keratinocyte (10x) Phase to Nuclei (DAPI).” (March 2, 2020). Distributed by DataSpace. https://doi.org/10.34770/0pt9-qd20.

And as a third example, Nature journals allow research datasets to be cited in reference lists if they have DOIs. Here is the above example DataSpace item citation again in the Nature format:

LaChance, J. & Cohen, D. Fluorescence Reconstruction Microscopy Data: Keratinocyte (10x) Phase to Nuclei (DAPI). DataSpace https://doi.org/10.34770/0pt9-qd20 (2020).
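For researchers who keep their dataset metadata in a script or reference manager, the citation patterns above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DatasetRecord class and the format_apa, format_ieee, and format_nature functions are hypothetical names invented here, not part of DataSpace or any citation library, and the APA sketch follows the APA 7th Ed. conventions of a comma before the ampersand and no period after the DOI. Always check the output against the relevant style guide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical container for the minimum recommended citation elements."""
    creators: tuple      # (family name, initials) pairs, in citation order
    published: date      # full release date; only the year is used by APA/Nature
    title: str
    publisher: str
    doi: str             # bare DOI, without the https://doi.org/ prefix

    @property
    def doi_url(self):
        return f"https://doi.org/{self.doi}"


def join_names(names, pair_sep, final_sep):
    """Join author names; two-author and 3+-author lists use different separators."""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return names[0] + pair_sep + names[1]
    return ", ".join(names[:-1]) + final_sep + names[-1]


def format_apa(rec):
    # Family name first; comma before the ampersand; no period after the DOI.
    names = [f"{family}, {initials}" for family, initials in rec.creators]
    authors = join_names(names, ", & ", ", & ")
    return (f"{authors} ({rec.published.year}). {rec.title} [Data set]. "
            f"{rec.publisher}. {rec.doi_url}")


def format_ieee(rec):
    # Initials first; full release date in parentheses after the title.
    names = [f"{initials} {family}" for family, initials in rec.creators]
    authors = join_names(names, " and ", ", and ")
    d = rec.published
    month = d.strftime("%B")  # month name; assumes an English/C locale
    return (f'{authors}, "{rec.title}." ({month} {d.day}, {d.year}). '
            f"Distributed by {rec.publisher}. {rec.doi_url}.")


def format_nature(rec):
    # Family name first; ampersand without a preceding comma; year at the end.
    names = [f"{family}, {initials}" for family, initials in rec.creators]
    authors = join_names(names, " & ", " & ")
    return (f"{authors} {rec.title}. {rec.publisher} "
            f"{rec.doi_url} ({rec.published.year}).")


# The example DataSpace item cited above.
record = DatasetRecord(
    creators=(("LaChance", "J."), ("Cohen", "D.")),
    published=date(2020, 3, 2),
    title=("Fluorescence Reconstruction Microscopy Data: "
           "Keratinocyte (10x) Phase to Nuclei (DAPI)"),
    publisher="DataSpace",
    doi="10.34770/0pt9-qd20",
)

for formatter in (format_apa, format_ieee, format_nature):
    print(formatter(record))
```

Keeping the metadata in one record and the style rules in separate functions means a new style can be added without touching the data. The sketch uses straight quotation marks for the IEEE title and does not handle style-specific edge cases such as very long author lists or version numbers.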

How is DataSpace implemented?

DataSpace is an implementation of the DSpace platform, a common software solution for universities hosting digital repositories. Staff from the Princeton Office of Information Technology customized DSpace for Princeton’s use case, and staff from the Princeton University Library are currently involved in updating and improving the platform.

Who manages DataSpace?

Research data communities in DataSpace are managed and curated by the Princeton Research Data Service, with technical support from information technology staff at the Princeton University Library. The curators of DataSpace review submissions with an eye toward discoverability, re-usability, and long-term preservation--without partiality to the subject matter or findings of the research. For general questions related to DataSpace, please contact the curators. For help accessing theses and dissertations, please contact Mudd Library.

