feat!: Initial Support for `FixedSizeBinary` #240

tareknaser · 2024-10-08T18:27:20Z

Rationale for this change

This PR adds support for FixedSizeBinary

This PR is not meant to close #229 but serves as initial support. We should use this as a stepping stone and, after discussions arising from the code, decide how to build on this and address the issue in a follow-up PR.

What changes are included in this PR?

Added support for FixedSizeBinary in various parts of the codebase. The proposed representation is FixedSizeBinary(i32, Vec<u8>), with conversion implementations from (i32, Vec<u8>) and (i32, &[u8]).
Updated documentation to reflect the new changes (docs/SQLSyntaxSpecification.md)
Modified the implementation of impl<T: MontConfig<4>> From<&[u8]> for MontScalar<T> {:
- Previously, this was only used for strings and hashed the value regardless.
- Now, if the byte slice is empty, the result is the zero scalar. If the byte slice has length 31 or less, the bytes are directly converted to a scalar. If the byte slice has length 32, the bytes are hashed using blake3 and the result is converted to a scalar.
- String handling was updated to keep it hashed all the time.

Are these changes tested?

Yes, new tests were added.
However, there are some "FIXME" comments in places where I wasn't sure if the implementation is correct. These are open for discussion and review.

Additional Notes

I have not added literal support yet.
I have attempted to add support in every required place in the code, but please let me know if I missed any areas!

JayWhite2357 · 2024-10-08T19:01:40Z

This PR is not meant to close #229 but serves as initial support. We should use this as a stepping stone and, after discussions arising from the code, decide how to build on this and address the issue in a follow-up PR.

Thanks for the PR! The CI is not passing (which I think you expected).
If there is anything you want our team to give you feedback on, it would be good if you can highlight that with comments since we won't have time for thorough feedback until a PR is ready to be merged/passing CI.

tareknaser · 2024-10-08T21:56:28Z

Thanks, @JayWhite2357

Could we rerun the CI?
cargo test --all-features now passes locally

If there is anything you want our team to give you feedback on,

Could you start by taking a look at DataType, Column, OwnedColumn, and the others? I made some decisions where I wasn’t sure of the best approach, so your feedback would be really helpful. I also added some ‘FIXME’ comments.

By the way, I’d love to connect on Discord. I joined the server but not sure how to reach you there.

JayWhite2357

Thanks for the PR! This is heading in the right direction.
I don't have an amazing solution to the concerns here, but hopefully my feedback can help with direction.

JayWhite2357 · 2024-10-09T12:36:33Z

crates/proof-of-sql/src/base/database/column.rs

@@ -266,6 +303,9 @@ pub enum ColumnType {
    /// Mapped to [`Curve25519Scalar`](crate::base::scalar::Curve25519Scalar)
    #[serde(alias = "SCALAR", alias = "scalar")]
    Scalar,
+    /// Mapped to fixed size binary
+    #[serde(alias = "FIXEDSIZEBINARY", alias = "fixedsizebinary")]
+    FixedSizeBinary(i32),


One thing that would be nice would be having some way of ensuring that the number of bytes must be in the range [0, 32]. (Or at least positive.)

At first, I wanted to use FixedSizeBinary(usize) since it made more sense to me, but Arrow actually uses i32 for that, so I decided to stick with the convention.

having some way of ensuring that the number of bytes must be in the range [0, 32]

where do you think such a check should go?

crates/proof-of-sql/src/base/commitment/committable_column.rs

crates/proof-of-sql/src/base/database/owned_column.rs

JayWhite2357 · 2024-10-09T13:16:53Z

crates/proof-of-sql/src/proof_primitive/dory/dynamic_dory_commitment_helper_cpu.rs

+            // FIXME: Is this interpretation correct? or should we chunk the bytes
+            // into `byte_width`-sized chunks and convert each chunk into a scalar?
+            compute_dory_commitment_impl(column, offset, setup)


Breaking it into chunks and converting each chunk into a scalar would work.
However, we should avoid that because it is an expensive conversion.

I believe something like this may work:

Suggested change

// FIXME: Is this interpretation correct? or should we chunk the bytes

// into `byte_width`-sized chunks and convert each chunk into a scalar?

compute_dory_commitment_impl(column, offset, setup)

// FIXME: Is this interpretation correct? or should we chunk the bytes

// into `byte_width`-sized chunks and convert each chunk into a scalar?

compute_dory_commitment_impl(&column.iter().chunks_exact(byte_width).collect_vec(), offset, setup)

Be sure to add some tests here ensuring that this is behaving correctly. Even though CI is currently passing, this code is currently incorrect.

Alright, I’ve updated that based on your suggestion and also added a test we_can_compute_a_dory_commitment_with_fixed_size_binary_values that is currently failing with the following error:

panicked at crates/proof-of-sql/src/proof_primitive/dory/pack_scalars.rs:179:14: source slice length (1) does not match destination slice length (4)

tracing it back, I found that the pack_bit function is the one panicking. It seems to be related to the implementation of OffsetToBytes.

column.iter().enumerate().for_each(|(i, value)| { let index = i + offset; let row_offset = (index % num_columns) * bit_table_sum_in_bytes; let col_offset = current_byte_size * (index / num_columns); let offset_index = row_offset + col_offset + byte_offset; packed_scalars[offset_index..offset_index + current_byte_size] .copy_from_slice(&value.offset_to_bytes()[..]); });

The issue is that it’s currently iterating over values and calling offset_to_bytes, treating each element in the column as its own value.

This is not ideal for FixedSizeBinary because multiple consecutive values form a single value (byte_size). I’m not quite sure how to handle this properly. Could you point me in the right direction?

Any updates on this? I’ve rebased on top of main @JayWhite2357

This is still marked as draft. I just approved the workflows. Once it's ready, mark it as ready for review.

Marked it as ready for review. The test we_can_compute_a_dory_commitment_with_fixed_size_binary_values is still failing the CI, though. Let me know what you think

Signed-off-by: Tarek <[email protected]>

tareknaser force-pushed the FixedSizeBinary_support branch from cda4220 to ea18cd8 Compare October 8, 2024 20:28

JayWhite2357 reviewed Oct 9, 2024

View reviewed changes

tareknaser mentioned this pull request Oct 12, 2024

feat: Add a Method to Support Dynamic Element Sizes in Sequence spaceandtimelabs/blitzar-rs#44

Merged

tareknaser added 2 commits November 4, 2024 19:05

feat!: initial support for FixedSizeBinary

43934ab

Signed-off-by: Tarek <[email protected]>

fix: use Sequence::from_raw_parts_with_size for FixedSizeBinary

efd18a0

Signed-off-by: Tarek <[email protected]>

tareknaser force-pushed the FixedSizeBinary_support branch from 2d1a834 to efd18a0 Compare November 4, 2024 17:16

JayWhite2357 self-requested a review November 6, 2024 18:05

tareknaser marked this pull request as ready for review November 8, 2024 20:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat!: Initial Support for `FixedSizeBinary` #240

feat!: Initial Support for `FixedSizeBinary` #240

tareknaser commented Oct 8, 2024 •

edited

Loading

JayWhite2357 commented Oct 8, 2024

tareknaser commented Oct 8, 2024

JayWhite2357 left a comment

JayWhite2357 Oct 9, 2024

tareknaser Oct 12, 2024

JayWhite2357 Oct 9, 2024

JayWhite2357 Oct 9, 2024

JayWhite2357 Oct 9, 2024

tareknaser Oct 13, 2024

tareknaser Nov 4, 2024

JayWhite2357 Nov 6, 2024 •

edited

Loading

tareknaser Nov 8, 2024

feat!: Initial Support for FixedSizeBinary #240

Are you sure you want to change the base?

feat!: Initial Support for FixedSizeBinary #240

Conversation

tareknaser commented Oct 8, 2024 • edited Loading

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Additional Notes

JayWhite2357 commented Oct 8, 2024

tareknaser commented Oct 8, 2024

JayWhite2357 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JayWhite2357 Nov 6, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

feat!: Initial Support for `FixedSizeBinary` #240

feat!: Initial Support for `FixedSizeBinary` #240

tareknaser commented Oct 8, 2024 •

edited

Loading

JayWhite2357 Nov 6, 2024 •

edited

Loading