feat!: Initial Support for FixedSizeBinary
#240
Thanks for the PR! The CI is not passing (which I think you expected).
Thanks, @JayWhite2357. Could we rerun the CI?
Could you start by taking a look at DataType, Column, OwnedColumn, and the others? I made some decisions where I wasn’t sure of the best approach, so your feedback would be really helpful. I also added some ‘FIXME’ comments. By the way, I’d love to connect on Discord. I joined the server, but I'm not sure how to reach you there.
Thanks for the PR! This is heading in the right direction.
I don't have an amazing solution to the concerns here, but hopefully my feedback can help with direction.
@@ -266,6 +303,9 @@ pub enum ColumnType {
     /// Mapped to [`Curve25519Scalar`](crate::base::scalar::Curve25519Scalar)
     #[serde(alias = "SCALAR", alias = "scalar")]
     Scalar,
+    /// Mapped to fixed size binary
+    #[serde(alias = "FIXEDSIZEBINARY", alias = "fixedsizebinary")]
+    FixedSizeBinary(i32),
One thing that would be nice would be having some way of ensuring that the number of bytes must be in the range `[0, 32]`. (Or at least positive.)
At first, I wanted to use `FixedSizeBinary(usize)` since it made more sense to me, but Arrow actually uses `i32` for that, so I decided to stick with the convention.
"having some way of ensuring that the number of bytes must be in the range `[0, 32]`"
Where do you think such a check should go?
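One possible answer, sketched under assumptions: the check could live in a small validated wrapper that is constructed from Arrow's `i32` width. The names `ByteWidth` and `InvalidByteWidth` are hypothetical and do not appear in the PR; the `0..=32` bound is taken from the review comment above.

```rust
use std::convert::TryFrom;

/// Hypothetical wrapper (not part of the PR): a byte width known to lie in 0..=32.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteWidth(i32);

/// Hypothetical error carrying the rejected width.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidByteWidth(pub i32);

impl TryFrom<i32> for ByteWidth {
    type Error = InvalidByteWidth;

    /// Accept only widths in the inclusive range 0..=32.
    fn try_from(width: i32) -> Result<Self, Self::Error> {
        if (0..=32).contains(&width) {
            Ok(ByteWidth(width))
        } else {
            Err(InvalidByteWidth(width))
        }
    }
}

impl ByteWidth {
    /// Return the width as the `i32` that Arrow expects.
    pub fn get(self) -> i32 {
        self.0
    }
}
```

With a wrapper like this, the variant could hold the validated type instead of a raw `i32`, so an out-of-range width would be rejected once at construction rather than checked at every use site.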
// FIXME: Is this interpretation correct? or should we chunk the bytes
// into `byte_width`-sized chunks and convert each chunk into a scalar?
compute_dory_commitment_impl(column, offset, setup)
Breaking it into chunks and converting each chunk into a scalar would work.
However, we should avoid that because it is an expensive conversion.
I believe something like this may work:
- // FIXME: Is this interpretation correct? or should we chunk the bytes
- // into `byte_width`-sized chunks and convert each chunk into a scalar?
- compute_dory_commitment_impl(column, offset, setup)
+ // FIXME: Is this interpretation correct? or should we chunk the bytes
+ // into `byte_width`-sized chunks and convert each chunk into a scalar?
+ compute_dory_commitment_impl(&column.iter().chunks_exact(byte_width).collect_vec(), offset, setup)
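For reference, `chunks_exact` in the standard library is a method on slices rather than on iterators, so the grouping step in the suggestion above would likely start from the underlying `&[u8]`. A minimal sketch of that grouping, assuming the column is a flat byte buffer; the helper `split_into_values` is hypothetical and not part of the PR:

```rust
/// Hypothetical helper: split a flat byte buffer into `byte_width`-sized values.
/// Returns `None` when the width is zero (which would make `chunks_exact` panic)
/// or when the buffer length is not a multiple of the width.
fn split_into_values(bytes: &[u8], byte_width: usize) -> Option<Vec<&[u8]>> {
    if byte_width == 0 || bytes.len() % byte_width != 0 {
        return None;
    }
    // Every chunk has exactly `byte_width` bytes; no remainder is left over.
    Some(bytes.chunks_exact(byte_width).collect())
}
```

Rejecting a ragged buffer up front is a design choice in this sketch; `chunks_exact` on its own would silently drop the trailing partial chunk.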
Be sure to add some tests here ensuring that this is behaving correctly. Even though CI is currently passing, this code is currently incorrect.
Alright, I’ve updated that based on your suggestion and also added a test, `we_can_compute_a_dory_commitment_with_fixed_size_binary_values`, that is currently failing with the following error:
panicked at crates/proof-of-sql/src/proof_primitive/dory/pack_scalars.rs:179:14:
source slice length (1) does not match destination slice length (4)
Tracing it back, I found that the `pack_bit` function is the one panicking. It seems to be related to the implementation of `OffsetToBytes`.
column.iter().enumerate().for_each(|(i, value)| {
    let index = i + offset;
    let row_offset = (index % num_columns) * bit_table_sum_in_bytes;
    let col_offset = current_byte_size * (index / num_columns);
    let offset_index = row_offset + col_offset + byte_offset;
    packed_scalars[offset_index..offset_index + current_byte_size]
        .copy_from_slice(&value.offset_to_bytes()[..]);
});
The issue is that it’s currently iterating over values and calling `offset_to_bytes`, treating each element in the column as its own value.
This is not ideal for `FixedSizeBinary` because `byte_size` consecutive elements together form a single value. I’m not quite sure how to handle this properly. Could you point me in the right direction?
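To illustrate the mismatch: `copy_from_slice` panics whenever the source and destination lengths differ, which matches the reported error if each 1-byte element is copied into a 4-byte destination. The following is a deliberately simplified sketch of packing per value instead of per element. It ignores the row and column offsets used in the real loop, and the function `pack_fixed_size_values` and its `slot_size` parameter are hypothetical, not the library's API.

```rust
/// Hypothetical, simplified packing: copy each `byte_width`-sized value into
/// its own `slot_size`-byte slot of the output, leaving the rest of the slot zeroed.
fn pack_fixed_size_values(column: &[u8], byte_width: usize, slot_size: usize) -> Vec<u8> {
    assert!(byte_width > 0 && byte_width <= slot_size);
    assert_eq!(column.len() % byte_width, 0);

    let num_values = column.len() / byte_width;
    let mut packed = vec![0u8; num_values * slot_size];

    // Iterate over whole values, not individual bytes, so the source slice
    // always has `byte_width` bytes and matches the destination range.
    for (i, value) in column.chunks_exact(byte_width).enumerate() {
        let start = i * slot_size;
        packed[start..start + byte_width].copy_from_slice(value);
    }
    packed
}
```

For example, packing `[1, 2, 3, 4]` with a `byte_width` of 2 into 4-byte slots yields `[1, 2, 0, 0, 3, 4, 0, 0]`.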
Any updates on this, @JayWhite2357? I’ve rebased on top of main.
This is still marked as draft. I just approved the workflows. Once it's ready, mark it as ready for review.
Marked it as ready for review. The test `we_can_compute_a_dory_commitment_with_fixed_size_binary_values` is still failing the CI, though. Let me know what you think.
Signed-off-by: Tarek <[email protected]>
Rationale for this change
This PR adds support for `FixedSizeBinary`.
What changes are included in this PR?
`FixedSizeBinary` in various parts of the codebase. The proposed representation is `FixedSizeBinary(i32, Vec<u8>)`, with conversion implementations from `(i32, Vec<u8>)` and `(i32, &[u8])`.
(`docs/SQLSyntaxSpecification.md`)
`impl<T: MontConfig<4>> From<&[u8]> for MontScalar<T> {`:
Are these changes tested?
Yes, new tests were added.
However, there are some "FIXME" comments in places where I wasn't sure if the implementation is correct. These are open for discussion and review.
Additional Notes