-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimizer: minimal rule of range scan #645
Comments
@shmiwy will handle this 🤣 |
This looks like a reasonable exercise for an optimizer newbie... If no one is working on it right now, I can take a look? |
I understand what a range scan is: specifying the start and end point of a scan, and (hopefully) uses an index to avoid a full table scan. However, I am not sure what's a filter scan. Reading |
Filters can be pushed down to the storage layer. We can put the predicate inside TableScanExecutor, so that the storage layer can filter more blocks instead of fetching them to the compute layer. FilterScanRule merges filter + table scan to a table scan with filter. |
Here's my understanding of the problem we are trying to solve, and a few questions:
My questions:
impl HeuristicOptimizer {
pub fn optimize(&self, mut root: PlanRef) -> PlanRef {
for rule in &self.rules {
if let Ok(applied) = rule.apply(root.clone()) {
root = applied;
break;
}
}
...
|
Yes. Note that non-pk can also be pushed down to the storage layer. Our column store supports filter scan (on any column) and range filter scan on pk.
Yes, you will need to modify the rule. It would be okay to create a new rule, or modify the existing FilterScanRule. Both will work.
Yes. |
Thanks for replying!
Suppose we create a new rule, call it impl Optimizer {
pub fn optimize(&mut self, mut plan: PlanRef) -> PlanRef {
...
let mut rules: Vec<Box<(dyn rules::Rule + 'static)>> = vec![
Box::new(FilterAggRule {}),
Box::new(FilterJoinRule {}),
Box::new(LimitOrderRule {}),
];
if self.enable_filter_scan {
rules.push(Box::new(FilterScanRule {}));
}
let hep_optimizer = HeuristicOptimizer { rules };
plan = hep_optimizer.optimize(plan);
...
}
impl HeuristicOptimizer {
pub fn optimize(&self, mut root: PlanRef) -> PlanRef {
for rule in &self.rules {
if let Ok(applied) = rule.apply(root.clone()) {
root = applied;
break;
}
} |
No. The optimizer |
Seems like a bug. Need fix 🤣 |
The optimizer should iterate through all rules, and try all of them sequentially, right? If that's the case, it should be an easy fix (and if we have enough tests for the optimizer, it should be relatively easy to verify that this fix won't break anything)? Just verified that if we remove the |
Yes. Welcome to submit a PR. Also cc @st1page 🤣 |
+1
BTW, if the order of the rules can affect the optimizing result, we can split them into multi optimize phases. In other words, construct multiple |
I am finally picking up this again (after a long time)... Currently, fn scan<'a>(
&'a self,
begin_sort_key: &'a [DataValue],
end_sort_key: &'a [DataValue],
col_idx: &'a [StorageColumnRef],
is_sorted: bool,
reversed: bool,
expr: Option<BoundExpr>,
) -> Self::ScanResultFuture<'a> { Since The implementation of range scan currently supports only one key, and always uses the first column: https://github.com/risinglightdb/risinglight/pull/644/files#r867324032 (which seems wrong). |
After #644 gets implemented, our storage layer would have range scan capability!
To make SQL leverage this property, we can have a range filter scan rule, which does the following:
cond1 and cond2 and cond3, ...
, and there is one condition in the form ofpk <op> constant
, then we change the TableScan to scan with pk sorted, and apply the begin / end sort key to the scan interface.We can eventually move forward to #601. But to keep everything simple, we can have this minimal rule.
The text was updated successfully, but these errors were encountered: