Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Make get_UID Pluggable #86

Open
smasuda opened this issue Jul 17, 2024 · 9 comments
Open

Proposal: Make get_UID Pluggable #86

smasuda opened this issue Jul 17, 2024 · 9 comments

Comments

@smasuda
Copy link
Contributor

smasuda commented Jul 17, 2024

Hi there!

I'd like to propose making get_UID pluggable so that users can implement their own version without affecting others.

In certain cases, it is necessary to keep track of UID changes by recording the UIDs before and after modifications over a series of anonymizations done in separate processes. Additionally, some users might want to use a specific prefix to generate their own UIDs. The current implementation doesn't allow this since get_UID is tightly coupled with the code, and its dictionary is volatile.

If this proposal is acceptable, I am willing to raise a PR for review. I would like to hear feedback on whether this change is agreeable.

@pchoisel
Copy link
Collaborator

Hi @smasuda,

Your usecase is definitely valid to me.
However, I'm wondering if your implementation could benefit everyone.

What about this: dicom-anonymizer now accepts a new argument: a path to a json file mapping old UID to new UID. After the end of dicom-anonymizer execution, the json file is eventually (optionally) updated with any new UID that might have been generated.
That way, you could keep the same UID replacement between several datasets.

What do you think about this ? If this does not suit you, I would also agree with making the function replace_element_UID or get_UID injectable.

@smasuda
Copy link
Contributor Author

smasuda commented Jul 29, 2024

Hi @pchoisel - thanks for your feedback!
re: the json file approach, I agree with you that it will be surly suffice for most of the use cases such as an anonymization in a labo, however, it may face a scalability issue when you work with a large dataset with multiple members/teams, which is the original motivation of mine and I'd still want to have a solution.

By making replace_element_UID injectable, one could implement the file approach as well if necessary. What do you think?

@pchoisel
Copy link
Collaborator

@smasuda, no worries.

I'm ok with making replace_element_UID injectable, you can go ahead and open a PR.

Thanks !

@smasuda
Copy link
Contributor Author

smasuda commented Aug 1, 2024

@pchoisel actually I came to have a second thought - once this merged PR ( #83 ) is released, we could achieve the pluggable replace_element_UID via defining your own rule generator.. Not knowing the release cycle of this project, yet would you be able to consider making a small release?

@pchoisel
Copy link
Collaborator

pchoisel commented Aug 5, 2024

Yes, I think we are having enough small features to create a minor release. I can wait after this is merged to create it.

@smasuda
Copy link
Contributor Author

smasuda commented Aug 13, 2024

@pchoisel

I can wait after this is merged to create it.

Just to be on the same page, I'm waiting for the release; after the release I'll try to see replace_element_UID is injectable. Sorry if I'm not clear enough about the steps to take.

@pchoisel
Copy link
Collaborator

All right, I understood it wrong then
I'll try to make a release this week

@sjswerdloff
Copy link

@pchoisel

I can wait after this is merged to create it.

Just to be on the same page, I'm waiting for the release; after the release I'll try to see replace_element_UID is injectable. Sorry if I'm not clear enough about the steps to take.

I'd be very interested in this. I wrote a pseudonymiser that is in PyMedPhys (experimental) and a key aspect of that was to have the UIDs stay consistent amongst objects (referenced UIDs, Study and Series UIDs). To avoid rainbow table attacks against hashed values, I added pepper (which could be shared within a facility to allow for consistent forward conversion of data, but would need to be kept sufficiently secure).

@pchoisel
Copy link
Collaborator

pchoisel commented Sep 3, 2024

Release done !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants