-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement attribute index for wasm build #12
Conversation
Caution Review failedThe pull request is closed. WalkthroughThis pull request introduces a series of updates across both Rust and TypeScript modules. In Rust, the bst package now has a new Changes
Sequence Diagram(s)sequenceDiagram
participant U as User
participant UI as UI (index.html)
participant Q as WasmAttrQuery
participant R as HttpFcbReader
participant I as AsyncFeatureIter
U->>UI: Enter JSON query and click "Run Query Test"
UI->>Q: Construct WasmAttrQuery from JSON
UI->>R: Call select_attr_query(query)
R->>R: Log query data & compute index sizes
R->>I: Return AsyncFeatureIter for feature iteration
I->>UI: Provide features for display
Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (12)
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Here's the code health analysis summary for commits Analysis Summary
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
src/ts/index.html (1)
103-156
: 🛠️ Refactor suggestionEnhance error handling and query validation.
The implementation needs more robust error handling:
- Add input validation for the query structure
- Implement proper cleanup in case of errors
- Add retry logic for network failures
- Add loading state handling
async function runQueryTest() { + const queryTestBtn = document.getElementById('query-test-btn'); + queryTestBtn.disabled = true; + try { - console.log('Running query test...'); + console.log('Running query test...'); - const urlInput = document.getElementById('url-input'); - const queryInput = document.getElementById('query-input'); - const url = urlInput.value.trim(); - const queryStr = queryInput.value.trim(); + const urlInput = document.getElementById('url-input'); + const queryInput = document.getElementById('query-input'); + const url = urlInput.value.trim(); + const queryStr = queryInput.value.trim(); - if (!url) { - alert('Please enter a valid URL.'); - return; - } + if (!url) { + throw new Error('Please enter a valid URL.'); + } + if (!url.startsWith('https://')) { + throw new Error('URL must use HTTPS.'); + } - let queryArray; - try { - queryArray = JSON.parse(queryStr); - } catch (e) { - alert('Invalid JSON for query.'); - return; - } + let queryArray; + try { + queryArray = JSON.parse(queryStr); + } catch (e) { + throw new Error('Invalid JSON for query.'); + } + + if (!Array.isArray(queryArray) || !queryArray.every(q => + Array.isArray(q) && q.length === 3 && + typeof q[0] === 'string' && + typeof q[1] === 'string')) { + throw new Error('Invalid query format. Expected array of [string, string, any][]'); + } // ... rest of the implementation ... + } finally { + queryTestBtn.disabled = false; + } }
🧹 Nitpick comments (14)
src/rust/fcb_core/src/http_reader/mod.rs (2)
239-301
: New attribute-based feature selection is a solid addition.
The implementation correctly fetches the entire attribute index and processes entries into aMultiIndex
, then queries it.Consider a more incremental or chunked approach for enormous attribute indexes to avoid large memory consumption in a single range fetch.
518-538
:SelectAttr
implementation is straightforward but duplicates some logic withSelectBbox
.Consider abstracting out the shared logic for reading feature buffers into a small helper to reduce code drift.
src/rust/wasm/src/lib.rs (1)
280-348
: Addedselect_attr_query
mirroring the core logic for WASM.
Implementation properly retrieves and processes the attribute index withMultiIndex
.If the attribute index can be huge, consider partial reading or streaming similarly to bounding box queries.
src/ts/fcb_wasm.js (1)
418-470
: NewWasmAttrQuery
class
The approach is solid. However, note that returningthis
explicitly from the constructor (lines 457-458) is a no-op in JavaScript.- return this;
Removing it avoids confusion, since constructors ignore a raw
this
return by default in JS.🧰 Tools
🪛 Biome (1.9.4)
[error] 457-458: The constructor should not return a value.
The constructor is here:
Returning a value from a constructor is ignored.
(lint/correctness/noConstructorReturn)
src/rust/fcb_core/tests/http.rs (2)
41-52
: Consider removing commented out code and using constants.The query vector contains commented out conditions that should either be removed or uncommented. Also, consider extracting the magic number
10.0
into a named constant for better maintainability.+const HEIGHT_THRESHOLD: f64 = 10.0; let query: Vec<(String, Operator, ByteSerializableValue)> = vec![ ( "b3_h_dak_50p".to_string(), Operator::Gt, - ByteSerializableValue::F64(OrderedFloat(10.0)), + ByteSerializableValue::F64(OrderedFloat(HEIGHT_THRESHOLD)), ), - // ( - // "identificatie".to_string(), - // Operator::Eq, - // ByteSerializableValue::String("NL.IMBAG.Pand.0503100000012869".to_string()), - // ), ];
93-94
: Enhance test assertions.The current assertions only verify basic conditions. Consider adding more comprehensive checks:
- Verify specific attribute values in the results
- Test edge cases (empty query, invalid operators)
- Test error conditions
src/ts/fcb_wasm.d.ts (1)
34-57
: LGTM! Well-documented TypeScript definitions.The
WasmAttrQuery
class is well-documented with clear examples. Consider adding:
- Documentation for supported operators (Gt, Eq, etc.)
- Type information for supported value types
src/rust/bst/src/query.rs (2)
130-130
: Address the TODO comment about stream processing.The comment indicates a need for improvement in stream processing and request handling. Consider implementing streaming to handle large result sets more efficiently.
Would you like me to help implement a streaming solution that processes results incrementally?
132-194
: Consider refactoring to reduce code duplication.The
stream_query
function shares significant logic with thequery
method. Consider extracting the common query logic into a shared private function to improve maintainability.Here's a suggested refactor:
fn execute_query(indices: &HashMap<String, Box<dyn AnyIndex>>, conditions: &[QueryCondition]) -> Vec<ValueOffset> { let mut candidate_sets: Vec<HashSet<ValueOffset>> = Vec::new(); for condition in conditions { if let Some(idx) = indices.get(&condition.field) { let offsets: Vec<ValueOffset> = match condition.operator { // ... existing operator match logic ... }; candidate_sets.push(offsets.into_iter().collect()); } } if candidate_sets.is_empty() { return vec![]; } // Intersect candidate sets let mut intersection: HashSet<ValueOffset> = candidate_sets.first().unwrap().clone(); for set in candidate_sets.iter().skip(1) { intersection = intersection.intersection(set).cloned().collect(); } let mut result: Vec<ValueOffset> = intersection.into_iter().collect(); result.sort(); result } impl MultiIndex { pub fn query(&self, query: Query) -> Vec<ValueOffset> { execute_query(&self.indices, &query.conditions) } } #[cfg(feature = "http")] pub async fn stream_query( m_indices: &MultiIndex, query: Query, feature_begin: usize, ) -> Result<Vec<HttpSearchResultItem>> { let offsets = execute_query(&m_indices.indices, &query.conditions); let http_ranges: Vec<HttpSearchResultItem> = offsets .into_iter() .map(|offset| HttpSearchResultItem { range: HttpRange::RangeFrom(offset as usize + feature_begin..), }) .collect(); Ok(http_ranges) }src/ts/package.json (1)
1-15
: Restore essential package metadata and build configuration.The removal of essential package metadata (author, license) and build configuration (scripts, dependencies) could affect:
- Package distribution and compliance
- Build and test processes
- Code minification (terser dependency)
Consider restoring these configurations or documenting why they were removed.
src/rust/bst/Cargo.toml (1)
6-9
: LGTM! Consider adding feature documentation.The feature configuration is well-structured:
- Making
packed_rtree
optional and gating it behind thehttp
feature improves modularity- Setting
http
as a default feature ensures backward compatibilityConsider adding documentation comments to explain the purpose and implications of each feature.
src/rust/wasm/Cargo.toml (1)
35-36
:❓ Verification inconclusive
Verify bst feature configuration and document dependency usage.
The new dependencies look good, but:
- Verify that the
bst
dependency inherits the defaulthttp
feature- Document why
chrono
was addedRun this script to verify the feature inheritance:
🏁 Script executed:
#!/bin/bash # Description: Check if bst dependency enables the http feature by default # Check if bst dependency in fcb_wasm explicitly disables default features rg "bst.*default-features.*=.*false" src/rust/wasm/Cargo.toml # Check if http feature is explicitly disabled rg "bst.*features.*=.*\[(?!.*http.*\])" src/rust/wasm/Cargo.tomlLength of output: 388
Verify bst default features and document chrono usage
The added dependencies align with the PR’s objective, but please take these actions:
- Verify in
src/rust/wasm/Cargo.toml
that thebst
dependency does not disable its default features (ensuring that thehttp
feature remains enabled based onbst/Cargo.toml
).- Confirm in
src/rust/bst/Cargo.toml
that the default feature set indeed includeshttp
.- Add a brief note explaining why
chrono
was added (e.g., for timestamp handling or related purposes).For further verification, please run the following commands manually using PCRE2 (which supports look-around) to double-check the configuration:
#!/bin/bash # Check that bst dependency does not disable default features in wasm Cargo.toml rg --pcre2 'bst\s*=\s*{[^}]*default-features\s*=\s*false' src/rust/wasm/Cargo.toml # Verify that bst/Cargo.toml includes "http" in the default feature list rg --pcre2 'default\s*=\s*\[.*http.*\]' src/rust/bst/Cargo.tomlPlease verify these configurations and update the documentation accordingly.
src/ts/index.html (2)
44-45
: Review hardcoded URL to Google Storage.The default URL points to a specific Google Storage bucket. Consider:
- Moving this to a configuration file
- Documenting the expected data format
- Adding error handling for CORS issues
76-94
: Enhance feature iteration implementation.Good implementation, but consider these improvements:
- Make the feature limit configurable
- Add memory management by cleaning up the iterator
- Consider streaming the features instead of collecting them all
- async function iterateFeatures(iter) { - let count = 0; - while (true) { + async function iterateFeatures(iter, limit = 10) { + let count = 0; + try { + while (true) { - log(`Fetching feature ${count + 1}...`); - const feature = await iter.next(); - if (feature === undefined || feature === null) { - log('No more features.'); - break; - } - log(`Feature ${count + 1}: ${JSON.stringify(feature, null, 2)}`); - count++; - // Limit iteration to avoid excessive logs - if (count >= 10) { - log('Reached feature iteration limit (10).'); - break; + log(`Fetching feature ${count + 1}...`); + const feature = await iter.next(); + if (feature === undefined || feature === null) { + log('No more features.'); + break; + } + log(`Feature ${count + 1}: ${JSON.stringify(feature, null, 2)}`); + count++; + if (count >= limit) { + log(`Reached feature iteration limit (${limit}).`); + break; + } + } + } finally { + // Clean up the iterator + if (iter.free) { + iter.free(); + } + } + log(`Total features fetched: ${count}`); - } - log(`Total features fetched: ${count}`); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/ts/fcb_wasm_bg.wasm
is excluded by!**/*.wasm
📒 Files selected for processing (18)
src/rust/bst/Cargo.toml
(1 hunks)src/rust/bst/src/byte_serializable.rs
(1 hunks)src/rust/bst/src/query.rs
(2 hunks)src/rust/fcb_core/src/http_reader/mod.rs
(7 hunks)src/rust/fcb_core/src/http_reader/wasm_client.rs
(0 hunks)src/rust/fcb_core/src/lib.rs
(1 hunks)src/rust/fcb_core/src/reader/attr_query.rs
(2 hunks)src/rust/fcb_core/tests/http.rs
(2 hunks)src/rust/makefile
(1 hunks)src/rust/packed_rtree/src/lib.rs
(2 hunks)src/rust/wasm/Cargo.toml
(1 hunks)src/rust/wasm/src/lib.rs
(8 hunks)src/ts/.gitignore
(1 hunks)src/ts/fcb_wasm.d.ts
(2 hunks)src/ts/fcb_wasm.js
(10 hunks)src/ts/fcb_wasm_bg.wasm.d.ts
(1 hunks)src/ts/index.html
(4 hunks)src/ts/package.json
(1 hunks)
💤 Files with no reviewable changes (1)
- src/rust/fcb_core/src/http_reader/wasm_client.rs
✅ Files skipped from review due to trivial changes (2)
- src/ts/.gitignore
- src/rust/makefile
🧰 Additional context used
🪛 Biome (1.9.4)
src/ts/fcb_wasm.js
[error] 457-458: The constructor should not return a value.
The constructor is here:
Returning a value from a constructor is ignored.
(lint/correctness/noConstructorReturn)
🔇 Additional comments (24)
src/rust/fcb_core/src/http_reader/mod.rs (8)
2-2
: Import statements look appropriate.
No concerns: the newly imported items are used below without apparent issues.
10-10
: Import frombst
is valid.
Nothing to fix: it's consistent with usage in the newselect_attr_query
logic.
135-144
: R-tree index size calculation appears correct.
It properly checks for non-empty index and feature count, then delegates toPackedRTree::index_size
.
146-164
: Attribute index size calculation is well-structured.
The overflow check is a good safety measure. Returningu64
is consistent with the R-tree calculation logic.
207-207
: Potentially large attribute index on 32-bit systems.
Castingu64
tousize
can overflow on 32-bit targets if datasets are enormous. Verify whether you support 32-bit systems or want to safeguard this.
211-211
: Passing the attribute index size to the R-tree search is consistent with the new logic.
No issues found here.
342-342
: Enum variant forSelectAttr
is aligned with the new selection mode.
No concerns.
353-353
: Properly handlingSelectAttr
innext_feature_buffer
.
Integration is smooth. No issues found.src/rust/wasm/src/lib.rs (9)
6-7
: Logging and JS bridging imports look correct.
No issues found.
10-10
: Serialization imports are valid.
They match the usage ofserde
for bridging data to/from JS.
14-14
: Extendedbst
imports.
No concerns: they align with the new attribute-query approach.
18-18
:chrono
import for date/time handling is appropriate.
Matches usage inWasmAttrQuery
and date conversions.
21-23
: Combined import fromfcb_core
for process utilities.
This grouping is logically consistent.
62-62
:#[wasm_bindgen(start)]
attribute recognized.
This sets up the module initialization on the WASM side. No concerns.
67-68
: Runtime initialization with a custom logger.
Good approach for debugging in WASM. No issues found.
170-179
: R-tree index size computation mirrors the HTTP reader logic.
No concerns: consistent approach for WASM.
181-199
: Attribute index size calculation in WASM environment.
Includes the overflow check andu64
return. Matches the rest of the design.src/ts/fcb_wasm.js (2)
234-239
: Helper_assertClass
Appropriate to validate the WASM object instance type.
405-415
:select_attr_query(query)
bridging to WASM.
Implementation matches the WASM function. No issues found.src/rust/fcb_core/src/lib.rs (1)
4-4
: LGTM! Making the module public is consistent with its usage.The change to make the
fb
module public aligns with its existing public re-export and supports the implementation of attribute index functionality.src/ts/fcb_wasm_bg.wasm.d.ts (1)
10-10
: LGTM! WASM bindings are properly defined.The new functions for attribute query support are correctly exported and match the TypeScript definitions.
Also applies to: 15-17
src/rust/bst/src/byte_serializable.rs (1)
3-3
: LGTM! Good visibility change.Making
OrderedFloat
public is the right choice since it's used in public types and implementations. This change follows Rust's visibility best practices.src/rust/fcb_core/src/reader/attr_query.rs (1)
17-17
: LGTM! Appropriate visibility changes.Making
process_attr_index_entry
andbuild_query
public is correct as they're needed for the new HTTP feature implementation. The functions maintain proper encapsulation and error handling.Also applies to: 88-88
src/rust/packed_rtree/src/lib.rs (1)
570-571
: LGTM! Correct offset calculation adjustment.The addition of
attr_index_size
parameter and its inclusion in thefeature_begin
calculation is correct. This ensures proper offset calculation when attribute indices are present.Also applies to: 586-587
src/rust/fcb_core/tests/http.rs
Outdated
let res = | ||
read_http_file_bbox("https://storage.googleapis.com/flatcitybuf/delft_attr.fcb").await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Consider using local test data.
The test depends on an external URL which could make tests flaky if the resource becomes unavailable. Consider downloading the test data and storing it in the repository or using a mock HTTP server for testing.
🏁 Script executed:
#!/bin/bash
# Check if the test data is accessible
curl -I "https://storage.googleapis.com/flatcitybuf/delft_attr.fcb"
Length of output: 1263
Action Required: Remove external dependency from HTTP test.
While the remote URL currently responds successfully, relying on an external resource—as seen in src/rust/fcb_core/tests/http.rs
at lines 105-106—can lead to flaky tests if the resource becomes unavailable or slow to respond. To enhance test stability, please consider one of the following:
- Storing the test data locally in the repository
- Using a mock HTTP server to simulate the external call
bb90aa6
to
afadb32
Compare
This change is
Summary by CodeRabbit
New Features
Improvements