Make plan for supplementary data package #29
Comments
So the steps should involve something like:
|
As far as I can see, only demographics.csv has sensitive information. Yes, anonymized demographics data could be re-identified. Apparently this is common. I think it is standard practice to have a license for sharing such data where the user agrees not to try to identify persons. As I see it, we can either (1) keep demographics.csv private except under a confidentiality agreement, or (2) anonymize demographics.csv (e.g., remove the names) and share it under a license prohibiting re-identification. Regarding how to make the data public: can we just move all the public data into the public hackathon manuscript repo? |
This will be difficult. A license won't work (licenses grant rights that someone would not otherwise have; these data are facts, so licenses don't apply), and a Data Use Agreement will be difficult to enforce. I suggest we simply not publish data that we can't openly and publicly deposit. I'm also afraid that any personal data, whether previously public or not, will need to go through IRB approval. |
So does this mean that we are inviting acrimony from reviewers because we
|
I don't think so. For example, this is the norm more than the exception in many social science fields - for obvious reasons. We just need to be clear why we are not publishing the data that we aren't, and ideally there'd be a way to get access to them, for example by request and signing a DUA. (Zenodo supports this reasonably.) |
OK, but could we actually make it available anyway after dropping the
|
Any data that can potentially be re-identified needs to be cleared with the IRB unless we're keeping it under wraps. |
Right, but providing a table that lists that at hackathon X there was a
On Sun, 18 Oct 2015 at 16:44, Hilmar Lapp [email protected] wrote:
|
Yes, absolutely. If we had had 10,000 participants and 100 of them were Pacific Islanders, the chance of re-identification would be low. But not so with small numbers. Nobody stated their ethnicity on the public pages, voluntarily or otherwise. |
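To make the small-numbers point concrete, here is a minimal sketch that flags demographic categories with very few members. The column name `ethnicity`, the sample rows, and the threshold of 5 are assumptions for illustration only; none of them are taken from demographics.csv.

```python
from collections import Counter


def small_cells(rows, column, threshold=5):
    """Return {value: count} for categories with fewer than `threshold` rows."""
    counts = Counter(row[column] for row in rows)
    return {value: n for value, n in counts.items() if n < threshold}


# Hypothetical rows; the real column names and values may differ.
rows = [{"ethnicity": "A"}] * 12 + [{"ethnicity": "B"}] * 2
risky = small_cells(rows, "ethnicity")
print(risky)
```

Any category that shows up in the result is small enough that a published count could point at specific individuals, which is the case that would need IRB clearance.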
And, consequently, the chances of getting the IRB's permission to publish
On Sun, 18 Oct 2015 at 16:54, Hilmar Lapp [email protected] wrote:
|
Rutger, I think this isn't as problematic as you are implying. We just publish all the data files except demographics.csv. It is OK to publish conclusions based on data that are not released due to privacy concerns. This happens all the time in medicine and the social sciences. The people identified in demographics.csv have a legal right to keep that data private, and we need to protect that right. If we don't believe we can protect it by anonymizing the data, we should not try to anonymize the data. In fields where data are withheld for privacy reasons, researchers who want the data have to sign an agreement stating that they will also safeguard the privacy rights of the people identified. This is a two-way agreement based on trust. The originator can refuse to share the data with someone they don't trust to keep the agreement. |
Ok, good to know. I simply don't know how this works as I've never published anything that involves human subjects. If the approach you're describing is how it's done, then let's do that. |
We have a data repo to go with this manuscript. Currently it is private because of the private demographic info. Make a list of what we need to share, and make a plan for creating a supplementary data package, or a repository for sharing the required files.
Then write that up as a separate ticket.
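One possible sketch of the packaging step, assuming the data files sit under a single directory and that demographics.csv is the only file to withhold, as discussed in the comments. The function name `build_package`, the directory layout, and the archive name are hypothetical.

```python
import zipfile
from pathlib import Path

# Assumption: demographics.csv is the only sensitive file.
WITHHELD = {"demographics.csv"}


def build_package(data_dir, archive_path, withheld=WITHHELD):
    """Zip every regular file under data_dir except withheld and hidden ones.

    Returns the sorted list of relative paths that were included, which can
    double as the list of shared files.
    """
    data_dir = Path(data_dir)
    archive_path = Path(archive_path).resolve()
    included = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(data_dir.rglob("*")):
            if not path.is_file() or path.name in withheld:
                continue
            rel = path.relative_to(data_dir)
            # Skip hidden files and directories such as .git.
            if any(part.startswith(".") for part in rel.parts):
                continue
            # Never add the archive to itself if it lives inside data_dir.
            if path.resolve() == archive_path:
                continue
            zf.write(path, rel.as_posix())
            included.append(rel.as_posix())
    return included
```

Calling `build_package("data", "supplementary_data.zip")` (both names are placeholders) would return the list of included files, which could be pasted into the follow-up ticket as the inventory of what is being shared.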