Data Security

Securing any research data is important, and it is especially important when working with human subjects data. Protecting data collected from human subjects is critical, and the stewardship of such data is guided by both regulatory and ethical principles.

Princeton University’s Office of Research Integrity and Assurance (RIA) is the authority of record for human subject research data at Princeton, and provides guidance about data security on its website: https://ria.princeton.edu/research-data-security

What is data security?

Data security refers to the protection of data from unauthorized access, use, change, disclosure, and destruction. It includes network security, physical security, and file security.
To protect research data appropriately and effectively, you need to be able to identify the appropriate data classification, which defines the necessary security control requirements for protecting research data.

Data categorization and levels of data security

Research data involving human subjects at Princeton are classified into four categories. The category determines how the data should be treated during collection, analysis, publication, and storage, so it is important to categorize your data correctly as early as possible.

Classification	Examples include
Restricted	Social security number; Bank account number; Driver’s license number; State identity card number; Credit card number; Protected health information, as defined by HIPAA; Personally identifiable information (PII): any information that is uniquely associated with an individual person. Research data are considered highly sensitive when there is a heightened risk that disclosure may result in embarrassment or harm to the research subject. Information that could have adverse consequences for subjects or damage their financial standing, employability, insurability, or reputation should be adequately protected from public disclosure, theft, loss, or unauthorized use, especially if it includes PII.
Confidential	All non-Restricted information contained in personnel files; Donor records
Unrestricted within Princeton	Course descriptions in the Employee Learning Center; Web-based resources designed for University use
Public	Public-facing campus directories; Course offerings; Press releases; Departmental websites

Human subjects data

Human subjects data has its own classification. A human subject is defined as a living individual about whom an investigator (whether professional or student) conducting research:

(i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or

(ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.

If your data meets this definition, see RIA’s Research Data Security guidance. The RIA office provides a full summary table of research data categories; here is a quick summary:

Does not contain personal identifying information (PII)

Information that contains neither personal identifiers nor enough specific data to allow inference of subject identities, such as de-identified data from a survey or experiment.

Contains personal identifying information (PII)

Sensitivity Level 1: Benign information about individually identifiable people
- Examples include: Data from a survey about reading habits; Data from an experiment on pattern recognition
Sensitivity Level 2: Sensitive information about individually identifiable people
- Examples include: Data on employment history, personal relationships; Data from an experiment on racial attitudes
Sensitivity Level 3: Very sensitive information about individually identifiable people
- Examples include: Data on sexual behavior, illegal drug use, criminal behavior, or crime victimization; Data from medical and mental health records

Anonymous vs Confidential

The terms anonymous and confidential mean different things, and each applies differently in different phases of a research study.

Anonymous

During recruitment: a participant’s involvement is anonymous if it is impossible for anyone, even the researcher, to know whether or not that individual participated in the study.
- Ex: an online survey that cannot be linked in any way to the individual.
Collected research data: The data are anonymous if no one, even the researcher, can connect the information back to the individual and the data do not contain any PII.

Confidential

During recruitment: a participant’s involvement is confidential if the research team knows that a particular individual has participated in the research and is obligated to protect that information from disclosure to others outside of the research team.
- Ex: a consent form that documents the individual's participation must be treated as a confidential document.
Collected research data: The data are confidential when there is still a link between the data and the identity of the individual.
- Ex: a study ID number that is common to both the de-identified data and the corresponding list of names or other types of PII.
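
The study ID example above can be sketched in code. The following is a minimal illustration, not an RIA-endorsed procedure: the function name `split_identifiers`, the field names, and the sample records are all hypothetical. It separates PII from the research data and links the two only through a random study ID. Note that removing direct identifiers does not by itself guarantee that subject identities cannot be inferred from the remaining fields.

```python
import secrets


def split_identifiers(records, pii_fields):
    """Separate PII from research data, linked only by a random study ID.

    `records` is a list of dicts; `pii_fields` names the keys treated as PII.
    Both the function and the field names are hypothetical illustrations.
    """
    deidentified = []  # study ID plus research data, with PII fields removed
    key_list = []      # study ID plus PII; keep this file separate and secured
    used_ids = set()
    for record in records:
        study_id = secrets.token_hex(4)
        while study_id in used_ids:  # regenerate on the unlikely collision
            study_id = secrets.token_hex(4)
        used_ids.add(study_id)
        pii = {k: v for k, v in record.items() if k in pii_fields}
        data = {k: v for k, v in record.items() if k not in pii_fields}
        deidentified.append({"study_id": study_id, **data})
        key_list.append({"study_id": study_id, **pii})
    return deidentified, key_list


# Hypothetical survey responses about reading habits.
records = [
    {"name": "Participant A", "email": "a@example.org", "books_per_month": 3},
    {"name": "Participant B", "email": "b@example.org", "books_per_month": 1},
]
deidentified, key_list = split_identifiers(records, pii_fields={"name", "email"})
```

As long as `key_list` exists, the data are confidential rather than anonymous, because the study ID still links each row to an individual.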

For help categorizing your data, contact: IRB@princeton.edu

For help finding active data and storage infrastructure that’s appropriate for your data, contact:

Research Computing: cses@princeton.edu
Or a PRDS team member: prds@princeton.edu

Storing data

The classification of data affects the University storage requirements. See Protect Our Info for more details.

Encryption requirements by classification

Restricted	Information must be encrypted
Confidential	Encryption is not required, but recommended
Unrestricted within Princeton	Encryption is not required, but recommended
Public	No Requirements
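
For scripts that move or archive data, the table above can be encoded as a simple lookup. This is an illustrative sketch only: the names `ENCRYPTION_REQUIREMENT` and `encryption_requirement` are hypothetical, and the University's published guidance remains the authority.

```python
# Encryption requirement per classification, transcribed from the table above.
ENCRYPTION_REQUIREMENT = {
    "Restricted": "required",
    "Confidential": "recommended",
    "Unrestricted within Princeton": "recommended",
    "Public": "none",
}


def encryption_requirement(classification):
    """Return 'required', 'recommended', or 'none' for a classification."""
    try:
        return ENCRYPTION_REQUIREMENT[classification]
    except KeyError:
        # Fail loudly rather than guess: unclassified data needs a human decision.
        raise ValueError(f"Unknown classification: {classification!r}") from None
```

Raising an error for an unknown classification is a deliberate choice in this sketch, since silently treating unclassified data as Public would be the unsafe default.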

Storage tools

Network Drive	Approved for the storage of all classification types
Google Drive	NOT approved for the storage of Restricted information. Permitted for all other classification types.
Microsoft 365	Approved for the storage of documents containing Restricted information, with the exception of protected health information (as defined by HIPAA).
External Drive	Approved for the storage of all classification types. Note the encryption requirements above.
Laptops / Desktops	Approved for the storage of all classification types. Note the encryption requirements above.
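
The storage table above can likewise be sketched as a small check. The names `APPROVED` and `is_storage_approved` are hypothetical. The table states only that Microsoft 365 is approved for Restricted information other than protected health information, so treating it as approved for the less sensitive classifications is an assumption of this sketch. The check does not cover the encryption requirements, which still apply.

```python
CLASSIFICATIONS = {
    "Restricted",
    "Confidential",
    "Unrestricted within Princeton",
    "Public",
}

# Classifications each storage tool is approved for, per the table above.
APPROVED = {
    "Network Drive": CLASSIFICATIONS,
    "Google Drive": CLASSIFICATIONS - {"Restricted"},
    # Assumption: approval for Restricted implies approval for lower levels.
    "Microsoft 365": CLASSIFICATIONS,
    "External Drive": CLASSIFICATIONS,
    "Laptops / Desktops": CLASSIFICATIONS,
}


def is_storage_approved(tool, classification, contains_phi=False):
    """Return True if `tool` is approved for data of this classification.

    `contains_phi` flags protected health information as defined by HIPAA,
    which the table excludes from Microsoft 365.
    """
    if tool not in APPROVED:
        raise ValueError(f"Unknown storage tool: {tool!r}")
    if classification not in CLASSIFICATIONS:
        raise ValueError(f"Unknown classification: {classification!r}")
    if tool == "Microsoft 365" and contains_phi:
        return False
    return classification in APPROVED[tool]
```

For example, `is_storage_approved("Google Drive", "Restricted")` evaluates to `False`, matching the table entry for Google Drive.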

Researcher responsibilities

Data security and classification are everyone’s responsibility, but there is particular guidance for PIs. Students and postdocs: when in doubt, talk with your PI.

Do not divulge, copy, release, sell, alter, or destroy information unless necessary.
Understand the classifications and classify your data appropriately.
Contact the Office of General Counsel prior to disclosure for legal purposes.
Contact the appropriate office prior to disclosure to regulatory agencies, inspectors, examiners, and/or auditors.