Why share data?

Sharing data and related materials, such as analysis code, produced in research projects is increasingly encouraged and has become a requirement for many research funders and academic journals. This is expected to contribute to the transparency, replicability, and reproducibility of empirical social science research. At the same time, researchers are confronted with strict data protection regulations and are sometimes unsure what these imply for sharing their data and other research materials. Data protection regulations are particularly relevant for social science research, which often involves collecting sensitive data.

The new Federal Act on Data Protection (nFADP) is now effective in Switzerland.

  • Do you have questions on what data protection laws imply for your research?
  • Are you unsure how to share your data and related research materials while respecting the legal requirements?
  • Do you want to know where to share your data and related research materials safely?

The online symposium on “Data sharing in the light of the new Data Protection Act” held on September 12, 2023 attempted to answer these questions.

The slides are already available here.

To watch the video again, click here.

The answers to the questions raised during the symposium are below.

Your questions & our answers

New legislation and changes

Compared to the old FADP, what specific changes does the new one require concerning the practice of sharing personal data?

The duty to inform is the main change.

Are there any changes or implications to the storage of research data with the new data protection law?

Not really. Security remains the watchword. The idea is to comply with the highest standards in force and to build in security right from the start (by design and by default).

In addition, you should avoid working with cloud-based storage solutions that are based in countries with insufficient levels of protection. Note also that outsourcing storage counts as data communication.

Does the new protection law only apply to research projects and data collected from 1 September 2023? Or is there a "retroactive" dimension to the management of data from earlier projects and data?

There is no retroactive effect. Anything done in the past under the old legal framework remains legal. Current processing, on the other hand, must comply with the new laws (note that universities are subject to cantonal law).

Do European Union citizens fall under federal law, cantonal law or the GDPR?

It depends. If you are affiliated with a federal institution, federal law applies. If you are affiliated with a cantonal institution, cantonal law applies. If you are a private person carrying out research in Switzerland, federal law applies (private persons section). If you carry out research on EU citizens because of their citizenship (targeting), the GDPR applies, and if you carry out research on EU territory, both Swiss and EU law apply.

What about data from historical datasets? According to federal archive law, we can use documents after 30 years. In which cases does data protection law apply vs. archive law?

Concerning historical data, archive laws and data protection laws work in combination. In general, personal data stored in public archives may be used with the authorization of the authority that archived it. The authority takes the decision on the basis of the data protection law.

Are we in the "safe" countries list to share data?

Yes, we are on the lists of safe countries.

What is the legal situation of a project started in the past (e.g., data from newspapers collected about people where it is debatable whether *all* of them are (still) public figures). Which legal basis applies now? Put differently: does the new law apply to research starting now, or also to research that was initiated several years ago?

Anything done in the past under the old legal framework remains legal. Current processing, on the other hand, must comply with the new laws (note that universities are subject to cantonal law).

As far as “public” people are concerned, researchers need to make an assessment of their importance (relative or absolute) in contemporary history when it comes to using their personal data. Several factors come into play. The important thing is to have sound reasoning.

Concerning sensitive data and the restrictions by data protection laws: is real sharing of sensitive data possible, or can data actually just be deposited, but not reused?

This is possible, provided the right conditions are met: acceptable risk, explicit consent, security, access control, etc.

What about personal data relating to deceased persons? Is it still considered personal data?

Data on deceased persons is not personal data anymore.

Duty to inform

What happens if data is collected from newspapers on politicians, for example? Does the researcher have to inform each of them?

The law allows the use of personal data that has been made available to everyone without formal opposition to its collection. What’s more, “public” people are less covered than ordinary people (according to case law) when it comes to information linked to the reason for which they are known. That said, a case-by-case analysis should always be carried out (with an expert in the field if possible).

When data is shared with software (e.g. transcription software) and the software server is located outside Switzerland (e.g. in Germany), is it necessary to state explicitly that this server is located in Germany? Or is it enough to say that it works with a secure artificial intelligence (AI) as defined by European data protection law?

Participants need to be informed about the categories of recipients. An AI is not a recipient. In other words, mention that you will be working with cloud-based solutions developed by EU companies, and that the information will be stored temporarily on secure servers in line with national data protection law.

If data is collected from an organization (e.g. from police records) and the researcher has obtained authorization from the police, does the researcher have to get the permission of every single person these files are about?

Theoretically, the duty to inform is absolute. That said, when data is transmitted by an authority, we may consider that we are entitled to use them.

If researchers download personal data from third-party data providers, e.g. if researchers download personal data from a repository (third-party data provider), do the people in the data always have to be informed, or isn't this up to the people collecting the data?

The duty to inform concerns both those who share (information on data communication) and those who download (information on data collection). The best scenario is to have data whose collection and sharing have already been adequately informed by the primary team. NB: if it is impossible to contact people again, or would require a disproportionate effort, the duty to inform can be renegotiated.

What if the data is on social media, do researchers still need to inform everyone whose tweets, etc. they collect? E.g. if the researchers use the data to analyse Twitter language?

It depends on the social media policy. Is there an opposition to collection? Is it allowed by general terms and conditions? Otherwise, yes, it is imperative to inform.

Even if the data is publicly accessible, the people whose data is used must be informed, unless the policy of a third-party data provider stipulates otherwise, or the data is from public figures. Is this correct?

Yes, that’s it. That said, it also depends on the difficulty of contacting people. It is decided on a case-by-case basis.

Personal data also applies to generating contact files for surveys or interviews. I would need to collect the data before I inform them, but the law seems to ask for permission before collecting. How can I ask for permission before collecting a potential sample?

In such cases (especially where data is collected from a third party), it is possible to provide information afterwards.

What about public data by or about non-public persons, e.g., by libraries? Would we have to inform everyone in the dataset when re-using such data?

It depends on what people were told in the first place (was the data made public to everybody?) as well as on how difficult it is to contact them (are the addresses available?).

Consent

What about the use of routine data, where no informed consent is available, but only general consent?

“General consent” is used to enable the re-use of data without having to systematically contact people again. Note, however, that such consent does not really exist under the general data protection regime. If the “secondary” project doesn’t fall under the Human Research Act (HRA; LRH in French), care must be taken. In that case, it would be necessary to check what has been promised in the general consent and what has been decided by the ethics commission. If the project falls within the scope of the HRA, this question must be discussed with the relevant cantonal ethics commission.

In which format are we supposed to collect and store the consent?

There is no standard format. Depending on the case, it’s best to store the consent records securely and separately from the rest of the data.

Legitimate interests and responsibility

Is there a notion of “Legitimate interests” that can be used to collect or share data?

A violation of data protection rights could be justified by the notion of overriding public interest. However, it must be demonstrated that the social impact of the research outweighs the interest in defending the fundamental rights of the individual (which is very difficult). An impact/risk assessment must therefore be carried out. It all depends on the type of research and the type of positive/negative impact it may have.

Who is responsible for depositing data?

Responsibility lies with the data controller(s), i.e., the people who decide what is or isn’t done with the data (depending on the case, this may be the institution, the researcher, etc.). The archive is a subcontractor.

Anonymisation

If you can anonymise the data, can you share it on a repository? Or is informed consent also needed for sharing anonymised data?

Anonymization is a process. It requires informed consent at the collection stage and information about the anonymization process. If the data is collected anonymously, it’s a different matter.

When inviting people to participate in a primary survey, should it be indicated that the data will be used in anonymous form for other projects? For example, “anonymous data can be made available for other research”.

Yes, it can be mentioned. In practice, however, it’s very difficult to achieve full anonymization (in the legal sense of the word). It’s better to assume that you have personal data (which you can de-identify as much as possible). It is possible to say that de-identified data will be shared, and to set up conditions (e.g., for access) that will limit the re-identification of participants as much as possible.

If I collect data using a participant panel (e.g. Bilendi, Prolific) there is usually a participant ID attached. Let’s say the dataset otherwise contains no identifiable data, would the ID make it such? Can I share the data freely as long as I remove/hash the participant ID, or are there other things I need to consider?

It all depends on what’s in the data. If they contain information that (directly or indirectly, e.g., by cross-referencing) can be used to identify individuals, then they are personal data. If you remove the IDs and there are otherwise no identifiable data, then you can share the data freely.
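As a minimal sketch of the two options raised in the question (removing or hashing the participant ID), the Python snippet below assumes a dataset held as a list of dictionaries with a hypothetical field named participant_id. The function names and the field name are illustrative assumptions, not part of any panel provider's tooling. Note that a plain, unkeyed hash of a panel ID can be reversed by anyone who holds the list of IDs, so the sketch uses a keyed hash with a secret key that should be stored separately or destroyed. Hashing alone is only de-identification: whether the result can be shared freely still depends on what else is in the data, as explained above.

```python
import hashlib
import hmac
import secrets


def drop_ids(rows, id_field="participant_id"):
    """Return copies of the rows without the ID field (hypothetical field name)."""
    return [{k: v for k, v in row.items() if k != id_field} for row in rows]


def pseudonymise_ids(rows, id_field="participant_id", key=None):
    """Replace each ID with a keyed hash (HMAC-SHA256).

    Returns the new rows and the secret key. Anyone holding the key and the
    original IDs can rebuild the mapping, so keep the key apart from the
    shared data or destroy it.
    """
    if key is None:
        key = secrets.token_bytes(32)  # random secret, never shared with the data
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows stay unchanged
        digest = hmac.new(key, str(row[id_field]).encode("utf-8"), hashlib.sha256)
        new_row[id_field] = digest.hexdigest()[:16]  # shortened pseudonym
        out.append(new_row)
    return out, key


if __name__ == "__main__":
    # Illustrative records only, not real data.
    data = [
        {"participant_id": "P-001", "answer": 3},
        {"participant_id": "P-002", "answer": 5},
        {"participant_id": "P-001", "answer": 4},
    ]
    print(drop_ids(data))
    print(pseudonymise_ids(data)[0])
```

Dropping the ID is the safer choice when records from the same participant do not need to be linked. The keyed hash keeps that link (the same ID always maps to the same pseudonym) at the cost of leaving a re-identification path for whoever holds the key.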

Transfer

Avoid data sharing in the USA: does this mean that repositories like SWISSUbase should not give data to US researchers at all?

Transfers to the USA must be carefully thought through and planned (e.g., by obtaining people’s explicit consent). If it turns out that it is not legal to transfer data to the USA, then it is up to the researcher to request control over access to the data (which SWISSUbase can put in place).

What if the data is shared in association with a published article? Journals increasingly require that the data underlying the study be attached to the article. Does this situation fall within a subcontractor agreement?

Personal data are communicated to a journal in order to make them available to the scientific community. In this kind of case, it’s a good idea to have a contract delimiting this situation (in addition to having informed people of this, or even better, obtained their explicit consent). This contract can be a kind of subcontracting agreement. Be careful, however, if the journal is in the USA.

Data linkage

Question on linking different datasets. We possess linked survey and register data for which we have a data user agreement signed under the old data protection legislation. We do not have the explicit consent of our survey respondents to link our survey data with this register data, necessary under the new data protection legislation. Should we immediately delete this data, or are we allowed to keep this data until our existing data user agreement expires?

It seems that you can use the data until the end of the agreement. That said, it’s hard to answer without more context on the data linkage and data agreement in question.

SWISSUbase

Can a researcher from a University that is not part of the consortium upload data from SWISSUbase? And what are the associated costs?

For depositing data on SWISSUbase: There is no cost for (1) researchers from the social sciences and linguistics from any institute of higher education in Switzerland, and (2) researchers from any faculty at UNIL and any faculty at UNINE. We are working on extending SWISSUbase to cater to the needs of researchers in additional disciplines and institutions across Switzerland.

For data download from the public catalogue: This is free for any researcher in Switzerland, no matter the affiliation.

In case of international collaboration, if the data is archived on SWISSUbase, data protection principles of Swiss legislation are complied with, and if the data set, documentation and meta-documentation are accepted by the SWISSUbase team, does this mean that the international standards (GDPR and others) are complied with?

The person depositing the data is asked by contract to confirm the legality of data collection and distribution (which remains his or her responsibility).

What type of software do you use to run the platform?

The SWISSUbase application code is developed in-house by the team of software engineers at FORS and runs on the Kubernetes container platform. Kubernetes is an open-source system for the management of containerized applications.

Are any commercial solutions (e.g., Tresorit) also considered by your researchers for any reason, to share data (let's say active data instead of cold data)?

For the preservation and sharing of research data, we don’t use a commercial cloud-based solution for security reasons, as most solutions are outside of Switzerland and can’t offer the security necessary to comply with the Swiss Data Protection Act. One additional point of clarification: the research data on the SWISSUbase platform is not necessarily “cold data” (i.e., not used), as it is in the public catalogue so that other researchers can access and reuse it.

How are SWISSUbase datasets cited? Is there a DOI?

Each time a version of a dataset is published, a new DOI is assigned to facilitate long-term traceability. Each DOI assigned to a version of a dataset remains active, meaning that the page is accessible and the metadata is visible, but only the data from the latest published version can be downloaded directly via the public catalogue. Additionally, for each dataset deposited on SWISSUbase, a citation (following the APA standard) is automatically generated and recorded in the catalogue and the user contract.

FORS Replication Service

What is the difference between this service and the Open Science Framework (OSF)?

The service offered by FORS is very similar to the one offered by OSF. However, the data deposited with the FORS replication service is stored in Switzerland, which is an advantage from a legal point of view, since there is no transfer of data abroad. In addition, if you have any questions or require a DOI before depositing your data, you can contact one of our experts directly (dataservice@fors.unil.ch), who will be able to help you.

What are the fundamental differences between SWISSUbase and the FORS Replication Service?

SWISSUbase is a platform for sharing and preserving research data in the long-term. Complete datasets are deposited on SWISSUbase, and rich metadata are added to describe the project and the data. The FORS Replication Service is designed to deposit partial datasets linked to a publication. The deposit process is simpler, there is less metadata to complete, and the deposited data is open to everyone (without any login). Each dataset deposited on SWISSUbase and on FORS Replication Service is assigned a DOI.

Symposium program:

13:30 – 15:30 | Online


The aim of the symposium was to familiarize participants with the new law and its implications for empirical social science research. The participants learned more about SWISSUbase, the platform for data sharing and dissemination, as well as the new FORS replication service for sharing replication materials. The final part of the symposium was devoted to sharing and depositing research data and related materials as a matter of good practice.

  • Introduction – Marieke Heers (FORS)
  • Sharing data in the light of data protection – Pablo Diaz (FORS – UNIL)
  • Data protection and reproducible research – Marieke Heers (FORS)
  • SWISSUbase: the platform for sharing research data – Jennifer Dean (SWISSUbase Project Manager – FORS)
  • FORS Replication Service: the place to share replication materials – Emilie Morgan de Paula (FORS)
  • Good research practices for reproducible research – Mauro Cherubini (UNIL)
  • Discussion & questions

If you have any questions, please contact Dr. Marieke Heers or Emilie Morgan de Paula.
