Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Immutable array based on RRB Vector #1286

Open
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

jialunzhang-psu
Copy link
Collaborator

This PR modifies the tree structure based on RRB Vector: A Practical General Purpose Immutable Sequence, ICFP'15. It will support operations including removal and concat.

@jialunzhang-psu jialunzhang-psu marked this pull request as draft December 3, 2024 21:31
@coveralls
Copy link
Collaborator

coveralls commented Dec 3, 2024

Pull Request Test Coverage Report for Build 4819

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 200 of 229 (87.34%) changed or added relevant lines in 4 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.02%) to 83.15%

Changes Missing Coverage Covered Lines Changed/Added Lines %
immut/array/utils.mbt 16 17 94.12%
immut/array/array.mbt 27 30 90.0%
immut/array/tree_utils.mbt 19 30 63.33%
immut/array/tree.mbt 138 152 90.79%
Files with Coverage Reduction New Missed Lines %
ref/ref.mbt 1 71.43%
Totals Coverage Status
Change from base Build 4797: 0.02%
Covered Lines: 4999
Relevant Lines: 6012

💛 - Coveralls

Copy link

peter-jerry-ye-code-review bot commented Dec 26, 2024

‼️ This code review is generated by a bot. Please verify the content before trusting it.

After analyzing the git diff, here are three potential issues I've identified:

  1. Memory Efficiency Issue in iter() method:
pub fn iter[A](self : T[A]) -> Iter[A] {
  Iter::new(fn(yield_) {
    let arr = self.to_array() // TODO: it first converts to an array, which is not efficient
    // ...
  })
}

There's a comment indicating that the current implementation first converts the entire structure to an array before iteration, which is inefficient in terms of memory usage. This could be optimized by implementing a direct iterator over the tree structure.

  1. Potential Integer Overflow in redis_plan:
let total_nodes = node_counts.fold(init=0, fn { acc, x => acc + x })
// round up to the nearest integer of S/branching_factor
let opt_len = (total_nodes + branching_factor - 1) / branching_factor

There's no check for integer overflow when calculating total_nodes or opt_len. For very large arrays, this could lead to undefined behavior.

  1. Potential Safety Issue in redis:
guard j < old_len  // This shouldn't be triggered if the plan was correctly generated

There's a comment suggesting that this guard should never fail if the plan is correct, but there's no validation of the plan before execution. If there's a bug in plan generation, this could lead to runtime errors. Consider adding validation of the redistribution plan before executing it.

These issues primarily affect performance, memory efficiency, and robustness of the implementation. Would you like me to elaborate on any of these points?

@jialunzhang-psu
Copy link
Collaborator Author

jialunzhang-psu commented Jan 7, 2025

Originally I wanted to add concat and split and all other derived operations for immut/array, but now it turns out to be a large enough PR for just adding concat. So I pause here. There is more work to be done on this component, as listed in README.

In a nutshell, the main change of this PR is the introduction of the concat function in tree.mbt. All other things added are either helper functions or tests.

@jialunzhang-psu jialunzhang-psu requested a review from yj-qin January 7, 2025 09:08
@jialunzhang-psu jialunzhang-psu marked this pull request as ready for review January 7, 2025 09:08
@yj-qin yj-qin requested review from Yoorkin and removed request for yj-qin January 7, 2025 09:19
@bobzhang
Copy link
Contributor

@jialunzhang-psu can you do a rebase? @Lampese are you interested in doing a review?

@Lampese
Copy link
Collaborator

Lampese commented Jan 15, 2025

@jialunzhang-psu can you do a rebase? @Lampese are you interested in doing a review?

OK, I will do it later.

immut/array/array.mbt Show resolved Hide resolved
/// Physically copy the array.
/// Since it is an immutable data structure,
/// it is rarely the case that you would need this function.
pub fn copy[A](self : T[A]) -> T[A] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the necessity of this API.
cc @peter-jerry-ye

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think it's necessary. Even it were, maybe it can just return itself?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API was already there when I touched it. I have no problem in removing it or making it private.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we need to remove it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it.

immut/array/array.mbt Outdated Show resolved Hide resolved
immut/array/array.mbt Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants