Adding a hash transformation #60
Unanswered
Mohammad-nassar10
asked this question in
Ideas
Replies: 1 comment 3 replies
-
The mover has a digest transformation function that pretty much does this. The supported hashes are:
-
Today, the Arrow-flight module has three actions for transforming columns. The first is redact, which replaces the values of a column with a given fixed "redactValue". The other two are delete columns and filter columns.
We want to add another way of transforming columns, one in which different column values are replaced with different values. One option is to replace each value with its hash. This transformation hides the original values while still making it possible to detect equal values (they have the same hash).
A possible hash function is "hashlib.sha256()", which takes objects of type "bytes", returns 256-bit hashes, and is consistent (it returns the same hash on every machine). The column values must be of type "string" so that they can be converted to "bytes" using the "encode()" function.
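As a quick check of the properties mentioned above, here is a minimal sketch using only the Python standard library. The sample value and the choice of UTF-8 as the encoding are assumptions made for illustration, not something the module prescribes.

```python
import hashlib

value = "alice@example.com"  # hypothetical column value, for illustration only

# sha256() only accepts bytes-like input, so the string is encoded first.
# UTF-8 is an assumed encoding; any fixed encoding gives consistent results.
digest = hashlib.sha256(value.encode("utf-8"))

bits = digest.digest_size * 8      # digest_size is reported in bytes
hex_length = len(digest.hexdigest())
print(bits, hex_length)

# Passing a str directly is rejected, which is why encode() is needed.
try:
    hashlib.sha256(value)
    str_rejected = False
except TypeError:
    str_rejected = True
print("str input rejected:", str_rejected)
```

Because the same input bytes always produce the same digest, two machines that agree on the encoding will agree on the hash.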
The transformation replaces each value in the requested columns with the hexadecimal representation of the hash returned by "hashlib.sha256()".
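The proposed transformation could look roughly like the sketch below. It is not the Arrow-flight module's actual code: the table is modelled as a plain dict of column name to list of strings, and the names "hash_value" and "hash_columns" are hypothetical. It also assumes every value in a requested column is a string, with no nulls.

```python
import hashlib


def hash_value(value: str) -> str:
    """Return the hexadecimal SHA-256 digest of a string (UTF-8 assumed)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def hash_columns(table: dict, columns: list) -> dict:
    """Return a copy of the table with the requested columns hashed.

    Columns that are not requested are copied through unchanged.
    """
    requested = set(columns)
    return {
        name: [hash_value(v) for v in values] if name in requested else list(values)
        for name, values in table.items()
    }


# Hypothetical sample data: "name" is hashed, "city" is left as-is.
table = {
    "name": ["alice", "bob", "alice"],
    "city": ["Haifa", "Zurich", "Haifa"],
}
result = hash_columns(table, ["name"])

# Equal inputs map to equal hashes, so duplicates remain detectable.
print(result["name"][0] == result["name"][2])
print(result["name"][0] == result["name"][1])
```

In a real implementation the same per-value function would be applied to the Arrow string columns of each record batch. Non-string columns would need to be rejected or converted first.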