Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: 执行外连接的查询过于耗时, tianmu需要执行十四分钟, innodb执行外连接仅需要执行一分钟 #1777

Open
haitaoguan opened this issue May 15, 2023 · 126 comments
Assignees
Labels
A-bug Something isn't working B-performance prio: high High priority

Comments

@haitaoguan
Copy link
Collaborator

haitaoguan commented May 15, 2023

###sql_text
select a.*
  FROM c_acct a
  LEFT JOIN (SELECT b.account_id, b.fiscal_date, b.balance
               FROM (SELECT account_id, max(fiscal_date) fiscal_date
                       FROM c_day
                      WHERE deleted_flag = '0'
                      GROUP BY account_id) a
               JOIN c_day b
                 ON a.account_id = b.account_id
                AND a.fiscal_date = b.fiscal_date
              WHERE b.deleted_flag = '0') c
    ON a.row_id = c.account_id

mysql> select count(*) from c1md_bank_acct;
+----------+
| count(*) |
+----------+
|     1016 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from c1am_acct_day;
+----------+
| count(*) |
+----------+
|  3251200 |
+----------+
1 row in set (0.00 sec)

###InnoDB explain plan
+----+-------------+---------------+------------+-------+-------------------------------------------------+-----------------+---------+----------------------------+---------+----------+-------------+
| id | select_type | table         | partitions | type  | possible_keys                                   | key             | key_len | ref                        | rows    | filtered | Extra       |
+----+-------------+---------------+------------+-------+-------------------------------------------------+-----------------+---------+----------------------------+---------+----------+-------------+
|  1 | PRIMARY     | a             | NULL       | ALL   | NULL                                            | NULL            | NULL    | NULL                       |    1022 |   100.00 | NULL        |
|  1 | PRIMARY     | <derived3>    | NULL       | ref   | <auto_key1>                                     | <auto_key1>     | 8       | mbs.a.ROW_ID               |     409 |   100.00 | NULL        |
|  1 | PRIMARY     | b             | NULL       | ref   | IDX_AMACCTDAY02,IDX_AMACCTDAY03,IDX_AMACCTDAY04 | IDX_AMACCTDAY02 | 12      | mbs.a.ROW_ID,a.fiscal_date |       1 |   100.00 | Using where |
|  3 | DERIVED     | c_day         | NULL       | index | IDX_AMACCTDAY02,IDX_AMACCTDAY03                 | IDX_AMACCTDAY03 | 8       | NULL                       | 3136482 |    10.00 | Using where |
+----+-------------+---------------+------------+-------+-------------------------------------------------+-----------------+---------+----------------------------+---------+----------+-------------+
4 rows in set, 1 warning (0.00 sec)

The optimizer of InnoDB prioritizes selecting small tables for association, with an execution time of 6 seconds.

###Tianmu explain plan
+----+-------------+---------------+------------+------+-------------------------------------------------+-------------+---------+------------------------------------+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table         | partitions | type | possible_keys                                   | key         | key_len | ref                                | rows    | filtered | Extra                                                                                                                                                                     |
+----+-------------+---------------+------------+------+-------------------------------------------------+-------------+---------+------------------------------------+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  1 | PRIMARY     | a             | NULL       | ALL  | NULL                                            | NULL        | NULL    | NULL                               |    1016 |   100.00 | NULL                                                                                                                                                                      |
|  1 | PRIMARY     | b             | NULL       | ALL  | IDX_AMACCTDAY02,IDX_AMACCTDAY03,IDX_AMACCTDAY04 | NULL        | NULL    | NULL                               | 3251200 |     0.14 | Using where                                                                                                                                                               |
|  1 | PRIMARY     | <derived3>    | NULL       | ref  | <auto_key0>                                     | <auto_key0> | 12      | mbs.b.ACCOUNT_ID,mbs.b.FISCAL_DATE |     320 |   100.00 | Using where; Using index                                                                                                                                                  |
|  3 | DERIVED     | c_day         | NULL       | ALL  | NULL                                            | NULL        | NULL    | NULL                               | 3251200 |    10.00 | Using where with pushed condition (`mbs`.`c_day`.`DELETED_FLAG` = '0')(t0) Pckrows: 50, susp. 50 (0 empty 0 full). Conditions: 1; Using temporary; Using filesort |
+----+-------------+---------------+------------+------+-------------------------------------------------+-------------+---------+------------------------------------+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set, 1 warning (0.03 sec)

The optimizer of Tianmu correlates according to the order of table connections, with an execution time of 10 minutes.
@haitaoguan haitaoguan added the A-feature feature with good idea label May 15, 2023
@haitaoguan haitaoguan changed the title feature: feature: Tianmu's optimizer did not prioritize selecting small tables for association. May 15, 2023
@adofsauron adofsauron self-assigned this May 15, 2023
@adofsauron
Copy link
Collaborator

ACK

@adofsauron
Copy link
Collaborator


T:-2.LEFT_JOIN_ON({T:-1},{T:-4,T:-5},C:1)

@adofsauron
Copy link
Collaborator

adofsauron commented May 16, 2023

The problem is that tianmu executes LEFT_JOIN_ON. Changing the table order of the execution plan has little impact on the query time

+----+-------------+---------------+------------+------+---------------+------+---------+------+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table         | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra                                                                                                                                                                     |
+----+-------------+---------------+------------+------+---------------+------+---------+------+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  1 | PRIMARY     | a             | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1016 |   100.00 | NULL                                                                                                                                                                      |
|  1 | PRIMARY     | <derived3>    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  325120 |   100.00 | Using where; Using join buffer (Block Nested Loop)                                                                                                                        |
|  1 | PRIMARY     | xx            | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 3251200 |   100.00 | Using where; Using join buffer (Block Nested Loop)                                                                                                                        |
|  3 | DERIVED     | c1am_acct_day | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 3251200 |    10.00 | Using where with pushed condition (`mbs`.`c1am_acct_day`.`DELETED_FLAG` = '0')(t0) Pckrows: 50, susp. 50 (0 empty 0 full). Conditions: 1; Using temporary; Using filesort |
+----+-------------+---------------+------------+------+---------------+------+---------+------+---------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


@adofsauron
Copy link
Collaborator

Add the WHERE c.account_id IS NOT NULL condition so that null does not exist in the result set, thus converting the outer join to the inner join. In this way, the join condition can be pushed up and down according to the situation, so that the query takes less than 1 second

1019883B-2A75-4a1c-A674-2E4BF3F2F3F5

@adofsauron
Copy link
Collaborator

But for the slow left join, we need to find a calm point. If the problem is nested loop, it needs to be redesigned

@adofsauron adofsauron changed the title feature: Tianmu's optimizer did not prioritize selecting small tables for association. bug: Tianmu's optimizer did not prioritize selecting small tables for association. May 18, 2023
@adofsauron adofsauron added A-bug Something isn't working prio: high High priority and removed A-feature feature with good idea labels May 18, 2023
@adofsauron adofsauron changed the title bug: Tianmu's optimizer did not prioritize selecting small tables for association. bug: 执行外连接的查询过于耗时, tianmu需要执行十四分钟, 但是inodb仅需要执行一分钟 May 22, 2023
@adofsauron
Copy link
Collaborator

同样是客户POC的问题, 影响客户是否使用天幕的决策,必须要尽快解决,本问题单继续对该问题做分析, 尽快处理该问题以满足客户的需求

@adofsauron adofsauron changed the title bug: 执行外连接的查询过于耗时, tianmu需要执行十四分钟, 但是inodb仅需要执行一分钟 bug: 执行外连接的查询过于耗时, tianmu需要执行十四分钟, 但是innodb仅需要执行一分钟 May 22, 2023
@adofsauron
Copy link
Collaborator

对比该SQL的外连接,进行内连接则具有显著的查询性能提升,此因素需要被操作符的物理计划上的逐个分析

@adofsauron
Copy link
Collaborator

adofsauron commented May 22, 2023

可以对比下内连接的执行计划,几乎所有的条件都被上提,而外连接却是在执行nested loop, 由于物化表没有聚集处理,导致时间复杂度暴增

@adofsauron
Copy link
Collaborator

我想想,外连接和内连接差别如此巨大,有点意思,代码写的有点屌

@adofsauron
Copy link
Collaborator

由于为客户POC拿下合同的紧迫性,在客户的现场,建议客户在自己的业务SQL中添加空值拒绝的语义以转为内连接,以在客户现场演示环节向客户展现连接查询的性能

@adofsauron
Copy link
Collaborator

adofsauron commented May 23, 2023

在进行POC以拿下客户的同时,对外连接进行彻底的分析

@adofsauron
Copy link
Collaborator

构建的查询计划倒是符合预期,重点在于物理计划的执行

void CompiledQuery::LeftJoinOn(const TabID &temp_table, std::vector<TabID> &left_tables,
                               std::vector<TabID> &right_tables, const CondID &cond_id) {
  CompiledQuery::CQStep s;
  s.type = StepType::LEFT_JOIN_ON;
  s.t1 = temp_table;
  s.c1 = cond_id;
  s.tables1 = left_tables;
  s.tables2 = right_tables;
  steps.push_back(s);
}

@adofsauron
Copy link
Collaborator

此处尤其要对比内连接的表结构见的关系, 与外连接不同,内连接对于表做了不同的处理

void CompiledQuery::InnerJoinOn(const TabID &temp_table, std::vector<TabID> &left_tables,
                                std::vector<TabID> &right_tables, const CondID &cond_id) {
  CompiledQuery::CQStep s;
  s.type = StepType::INNER_JOIN_ON;
  s.t1 = temp_table;
  s.c1 = cond_id;
  s.tables1 = left_tables;
  s.tables1.insert(s.tables1.end(), right_tables.begin(), right_tables.end());
  steps.push_back(s);
}

@adofsauron
Copy link
Collaborator

adofsauron commented May 23, 2023

这段代码写的有点水平,有点意思,虽说关系型数据库的关系不是标准的集合论,是采用的包模型,不过这块代码确实出彩,干开源的弟兄们看代码时候,可以多看看这块,有两下子

@adofsauron
Copy link
Collaborator

此连接操作在物理执行的时候在执行多重循环,更为麻烦的地方在于内连接不是一个连接操作,而是多个列组成的虚拟表的扫描过滤

@adofsauron
Copy link
Collaborator

void TempTable::AddInnerConds(Condition *cond, std::vector<TabID> &dim_aliases) {
  for (uint i = 0; i < cond->Size(); i++)
    for (uint j = 0; j < dim_aliases.size(); j++) (*cond)[i].left_dims[GetDimension(dim_aliases[j])] = true;
  filter.AddConditions(cond, CondType::ON_INNER_FILTER);
}

@adofsauron
Copy link
Collaborator

void TempTable::AddLeftConds(Condition *cond, std::vector<TabID> &left_aliases, std::vector<TabID> &right_aliases) {
  for (uint i = 0; i < cond->Size(); i++) {
    for (int d = 0; d < (*cond)[i].left_dims.Size(); d++) (*cond)[i].left_dims[d] = (*cond)[i].right_dims[d] = false;
    for (uint j = 0; j < left_aliases.size(); j++) (*cond)[i].left_dims[GetDimension(left_aliases[j])] = true;
    for (uint j = 0; j < right_aliases.size(); j++) (*cond)[i].right_dims[GetDimension(right_aliases[j])] = true;
  }
  filter.AddConditions(cond, CondType::ON_LEFT_FILTER);
}

@adofsauron
Copy link
Collaborator

  if (false_desc_found && outer_dims.IsEmpty()) {
    all_dims.Plus(cond[0].left_dims);  // for FALSE join condition
                                       // DimensionUsed() does not mark anything
    for (int i = 0; i < mind->NumOfDimensions(); i++)
      if (all_dims[i])
        mind->Empty(i);

    return;  // all done
  }

@adofsauron
Copy link
Collaborator

可以看出当将T4表和T5表都与T2表做了内连接后,随后的条件过滤都只针对T2表进行

@adofsauron
Copy link
Collaborator

之所以能这么做,涉及到集合论中的语义,可以参考韦恩图中的示意

@adofsauron
Copy link
Collaborator

通过将join操作符转换为限制的做法,成功的避免了对join的耗时处理

@adofsauron
Copy link
Collaborator

traversed_dims_size: 3251200 matched_dims_size: 1016 joined_tuples: 1032256 outer_tuples_: 0

@adofsauron
Copy link
Collaborator

mysql> select count(1) from c1md_bank_acct;
+----------+
| count(1) |
+----------+
|     1016 |
+----------+
1 row in set (0.00 sec)

mysql> select count(1) from c1am_acct_day;
+----------+
| count(1) |
+----------+
|  3251200 |
+----------+
1 row in set (0.00 sec)

@adofsauron
Copy link
Collaborator

  left_dims = {
    v = std::vector<bool> of length 3, capacity 64 = {1, 0, 0}
  }, 

--Type for more, q to quit, c to continue without paging--
right_dims = {
v = std::vector of length 3, capacity 64 = {0, 1, 1}
},

@adofsauron
Copy link
Collaborator

散列构建过程中已经出错

if (traversed_dims_.Intersects(matched_dims_) || !compatible) {
// both materialized - we should rather use a simple loop
// could not prepare common encoding
why_failed = JoinFailure::FAIL_HASH;
return false;
}

1 similar comment
@adofsauron
Copy link
Collaborator

散列构建过程中已经出错

if (traversed_dims_.Intersects(matched_dims_) || !compatible) {
// both materialized - we should rather use a simple loop
// could not prepare common encoding
why_failed = JoinFailure::FAIL_HASH;
return false;
}

@adofsauron
Copy link
Collaborator

(gdb) p traversed_dims_
$1 = {v = std::vector of length 3, capacity 64 = {1, 1, 1}}
(gdb) p matched_dims_
$2 = {v = std::vector of length 3, capacity 64 = {1, 1, 1}}

@adofsauron
Copy link
Collaborator

[2023-05-30 03:57:03.372153] [471180] [ERROR] [parallel_hash_join.cpp:376] MSG: PrepareBeforeJoin fail, traversed_dims_[3].Intersects(matched_dims_[3]) || !compatible[true]

@adofsauron
Copy link
Collaborator

对于这个问题有一些不可思议的地方, 其中一个需要引起重视的地方就是对于物化的极为苛刻的推迟,这里做下记录

@adofsauron
Copy link
Collaborator

递归处理连接的表,将所有的表都附加到一起,中间一点物化都不做

1 similar comment
@adofsauron
Copy link
Collaborator

递归处理连接的表,将所有的表都附加到一起,中间一点物化都不做

@adofsauron
Copy link
Collaborator

对浓密树做物化,再与左节点做连接

(gdb) p aliases
$6 = std::vector of length 2, capacity 2 = {-4, -4}
(gdb) p dim_aliases
$7 = std::vector of length 2, capacity 2 = {{n = -3}, {n = -4}}


@adofsauron
Copy link
Collaborator

设计的目标有这么几个:

  1. 去除将join条件的上推,以避免将连接结果集合的扩大
  2. 将连接尽可能早的进行物化, 而非延迟进行物化

@adofsauron
Copy link
Collaborator

(gdb) p v
$3 = std::vector of length 2, capacity 64 = {0, 0}
(gdb) p sec
$4 = (Tianmu::core::DimensionVector &) @0x7fde148ff750: {v = std::vector of length 3, capacity 64 = {0, 0, 1}}

@adofsauron
Copy link
Collaborator

有以下几个因素被涉及:

  1. 左表
  2. 右表
  3. 连接的条件与坐标和右边位图的关系

@adofsauron
Copy link
Collaborator

对于条件的分析:

#4  0x00000000030097f0 in Tianmu::core::CompiledQuery::FindSourceOfParameter (this=0x7fc7fe3aa700, tab_id=..., tmp_table=..., is_group_by=@0x7fc7fe3a7273: false)
    at /root/work/stonedb-dev-202305026/storage/tianmu/optimizer/compile/compiled_query.cpp:846
#5  0x0000000002d23e5c in Tianmu::core::Query::WrapMysqlExpression (this=0x7fc7fe3aa7d0, item=0x7fc538013da0, tmp_table=..., expr=@0x7fc7fe3a7460: 0x7fc7fe3a74a0, in_where=true, 
    aggr_used=false) at /root/work/stonedb-dev-202305026/storage/tianmu/core/query_compile.cpp:808
#6  0x0000000002d02737 in Tianmu::core::Query::Item2CQTerm (this=0x7fc7fe3aa7d0, an_arg=0x7fc538013da0, term=..., tmp_table=..., filter_type=Tianmu::core::CondType::ON_INNER_FILTER, 
    negative=false, left_expr_for_subselect=0x0, oper_for_subselect=0x7fc7fe3a7a2c, base_table=...) at /root/work/stonedb-dev-202305026/storage/tianmu/core/query.cpp:1125
#7  0x0000000002d03969 in Tianmu::core::Query::ConditionNumberFromComparison (this=0x7fc7fe3aa7d0, conds=0x7fc538902590, tmp_table=..., filter_type=Tianmu::core::CondType::ON_INNER_FILTER, 
    and_me_filter=0x0, is_or_subtree=false, negative=false, can_cond_push=false) at /root/work/stonedb-dev-202305026/storage/tianmu/core/query.cpp:1336
#8  0x0000000002d05f31 in Tianmu::core::Query::ConditionNumber (this=0x7fc7fe3aa7d0, conds=0x7fc538902590, tmp_table=..., filter_type=Tianmu::core::CondType::ON_INNER_FILTER, 
    and_me_filter=0x0, is_or_subtree=false, can_cond_push=false) at /root/work/stonedb-dev-202305026/storage/tianmu/core/query.cpp:1587
#9  0x0000000002d04a2f in Tianmu::core::Query::ConditionNumber (this=0x7fc7fe3aa7d0, conds=0x7fc538006960, tmp_table=..., filter_type=Tianmu::core::CondType::ON_INNER_FILTER, 
    and_me_filter=0x0, is_or_subtree=false, can_cond_push=false) at /root/work/stonedb-dev-202305026/storage/tianmu/core/query.cpp:1440
#10 0x0000000002d0698c in Tianmu::core::Query::BuildConditions (this=0x7fc7fe3aa7d0, conds=0x7fc538006960, cond_id=..., cq=0x7fc7fe3aa700, tmp_table=..., 
    filter_type=Tianmu::core::CondType::ON_INNER_FILTER, is_zero_result=false, join_type=Tianmu::core::JoinType::JO_INNER, can_cond_push=false)
    at /root/work/stonedb-dev-202305026/storage/tianmu/core/query.cpp:1635
#11 0x0000000002d07ce7 in Tianmu::core::Query::BuildCondsIfPossible (this=0x7fc7fe3aa7d0, conds=0x7fc538006960, cond_id=..., tmp_table=..., join_type=Tianmu::core::JoinType::JO_INNER)
    at /root/work/stonedb-dev-202305026/storage/tianmu/core/query.cpp:1928
#12 0x0000000002d21ac9 in Tianmu::core::Query::AddJoins (this=0x7fc7fe3aa7d0, join=..., tmp_table=..., left_tables=std::vector of length 1, capacity 1 = {...}, 
    right_tables=std::vector of length 0, capacity 0, in_subquery=false, first_table=@0x7fc7fe3aa4af: false, for_subq_in_where=false, use_tmp_when_no_join=false)
    at /root/work/stonedb-dev-202305026/storage/tianmu/core/query_compile.cpp:423
#13 0x0000000002d20f9b in Tianmu::core::Query::AddJoins (this=0x7fc7fe3aa7d0, join=..., tmp_table=..., left_tables=std::vector of length 1, capacity 1 = {...}, 
    right_tables=std::vector of length 0, capacity 0, in_subquery=false, first_table=@0x7fc7fe3aa4af: false, for_subq_in_where=false, use_tmp_when_no_join=false)
    at /root/work/stonedb-dev-202305026/storage/tianmu/core/query_compile.cpp:341
#14 0x0000000002d26227 in Tianmu::core::Query::Compile (this=0x7fc7fe3aa7d0, compiled_query=0x7fc7fe3aa700, selects_list=0x7fc538005570, last_distinct=0x0, res_tab=0x0, ignore_limit=false, 
    left_expr_for_subselect=0x0, oper_for_subselect=0x0, ignore_minmax=false, for_subq_in_where=false) at /root/work/stonedb-dev-202305026/storage/tianmu/core/query_compile.cpp:1245
#15 0x0000000002ceb722 in Tianmu::core::Engine::Execute (this=0x61a3eb0, thd=0x7fc538000e10, lex=0x7fc538003138, result_output=0x7fc538016608, unit_for_union=0x0)
    at /root/work/stonedb-dev-202305026/storage/tianmu/core/engine_execute.cpp:472
#16 0x0000000002cea8ab in Tianmu::core::Engine::HandleSelect (this=0x61a3eb0, thd=0x7fc538000e10, lex=0x7fc538003138, result=@0x7fc7fe3aadc8: 0x7fc538016608, setup_tables_done_option=0, 
    res=@0x7fc7fe3aadc4: 0, is_optimize_after_tianmu=@0x7fc7fe3aadbc: 1, tianmu_free_join=@0x7fc7fe3aadc0: 1, with_insert=0)
    at /root/work/stonedb-dev-202305026/storage/tianmu/core/engine_execute.cpp:243
#17 0x0000000003084b08 in Tianmu::DBHandler::ha_my_tianmu_query (thd=0x7fc538000e10, lex=0x7fc538003138, result_output=@0x7fc7fe3aadc8: 0x7fc538016608, setup_tables_done_option=0, 
    res=@0x7fc7fe3aadc4: 0, is_optimize_after_tianmu=@0x7fc7fe3aadbc: 1, tianmu_free_join=@0x7fc7fe3aadc0: 1, with_insert=0)
    at /root/work/stonedb-dev-202305026/storage/tianmu/sql/ha_my_tianmu.cpp:95

@adofsauron
Copy link
Collaborator

可以理解为在执行操作序列前对表和条件之间进行重新构建

@adofsauron
Copy link
Collaborator

对比下固化的执行序列

|    36 | X_1=0:int := sql.mvc();                                                                                                                                                               |
|    27 | C_2=[3]:bat[:oid] := sql.tid(X_1=0:int, "sys":str, "b":str);                                                                                                                          |
|    22 | C_4=[4]:bat[:oid] := sql.tid(X_1=0:int, "sys":str, "c":str);                                                                                                                          |
|  5803 | X_5=[2]:bat[:int] := bat.pack(32:int, 32:int);                                                                                                                                        |
|    21 | X_6=[2]:bat[:str] := bat.pack(".b":str, ".x":str);                                                                                                                                    |
|  3902 | X_3=[2]:bat[:int] := bat.pack(0:int, 0:int);                                                                                                                                          |
|   428 | X_7=[0]:bat[:int] := bat.new(nil:int);                                                                                                                                                |
|   255 | X_8=[0]:bat[:int] := bat.new(nil:int);                                                                                                                                                |
|  4474 | X_12=[3]:bat[:int] := sql.bind(X_1=0:int, "sys":str, "b":str, "b1":str, 0:int);                                                                                                       |
|  7059 | X_11=[2]:bat[:str] := bat.pack("b1":str, "c1":str);                                                                                                                                   |
|  5066 | X_9=[2]:bat[:int] := sql.bind(X_1=0:int, "sys":str, "a":str, "a1":str, 0:int);                                                                                                        |
|  2933 | C_10=[2]:bat[:oid] := sql.tid(X_1=0:int, "sys":str, "a":str);                                                                                                                         |
|  8466 | X_13=[2]:bat[:str] := bat.pack("int":str, "int":str);                                                                                                                                 |
|    32 | X_14=[3]:bat[:int] := algebra.projection(C_2=[3]:bat[:oid], X_12=[3]:bat[:int]);                                                                                                      |
|   324 | C_15=[3]:bat[:oid] := bat.mirror(X_14=[3]:bat[:int]);                                                                                                                                 |
|  1010 | X_16=[4]:bat[:int] := sql.bind(X_1=0:int, "sys":str, "c":str, "c1":str, 0:int);                                                                                                       |
|  1554 | C_17=[2]:bat[:oid] := algebra.thetaselect(X_9=[2]:bat[:int], C_10=[2]:bat[:oid], 1:int, ">=":str); # select: sorted                                                                   |
|    28 | X_18=[2]:bat[:int] := algebra.projection(C_17=[2]:bat[:oid], X_9=[2]:bat[:int]);                                                                                                      |
|     6 | X_19=0@0:void := language.pass(X_9=[2]:bat[:int]);                                                                                                                                    |
|  1259 | C_20=[3]:bat[:oid] := algebra.thetaselect(X_16=[4]:bat[:int], C_4=[4]:bat[:oid], 3:int, "<=":str); # select: densescan v >= vl && v <= vh (canditer_next_dense)                       |
|    23 | X_21=[3]:bat[:int] := algebra.projection(C_20=[3]:bat[:oid], X_16=[4]:bat[:int]);                                                                                                     |
|     5 | X_22=0@0:void := language.pass(X_16=[4]:bat[:int]);                                                                                                                                   |
|  1912 | X_23=[1]:bat[:oid] := algebra.join(X_21=[3]:bat[:int], X_18=[2]:bat[:int], nil:BAT, nil:BAT, false:bit, nil:lng); # mergejoin_int                                                     |
|    21 | X_24=[1]:bat[:int] := algebra.projection(X_23=[1]:bat[:oid], X_21=[3]:bat[:int]);                                                                                                     |
|     8 | X_25=0@0:void := language.pass(X_21=[3]:bat[:int]);                                                                                                                                   |
|  2852 | (X_26=[1]:bat[:oid], X_27=[1]:bat[:oid]) := algebra.join(X_14=[3]:bat[:int], X_24=[1]:bat[:int], nil:BAT, nil:BAT, false:bit, nil:lng); # selectjoin; select: sorted                  |
|  2631 | C_28=[2]:bat[:oid] := algebra.difference(C_15=[3]:bat[:oid], X_26=[1]:bat[:oid], nil:BAT, nil:BAT, false:bit, false:bit, nil:lng); # leftjoin; mergejoin_void; select: dense          |
|    29 | X_29=[2]:bat[:int] := algebra.projection(C_28=[2]:bat[:oid], X_14=[3]:bat[:int]);                                                                                                     |
|   170 | X_30=[1]:bat[:int] := algebra.projection(X_26=[1]:bat[:oid], X_14=[3]:bat[:int]);                                                                                                     |
|    11 | X_31=0@0:void := language.pass(X_26=[1]:bat[:oid]);                                                                                                                                   |
|   984 | X_32=[2]:bat[:int] := algebra.project(C_28=[2]:bat[:oid], nil:int);                                                                                                                   |
|     6 | X_33=0@0:void := language.pass(C_28=[2]:bat[:oid]);                                                                                                                                   |
|    14 | X_35=0@0:void := language.pass(X_14=[3]:bat[:int]);                                                                                                                                   |
|   587 | X_34=[1]:bat[:int] := algebra.projection(X_27=[1]:bat[:oid], X_24=[1]:bat[:int]);                                                                                                     |
|    14 | X_36=0@0:void := language.pass(X_24=[1]:bat[:int]);                                                                                                                                   |
|   299 | X_37=[1]:bat[:int] := bat.append(X_8=[1]:bat[:int], X_34=[1]:bat[:int], true:bit);                                                                                                    |
|    18 | X_38=[3]:bat[:int] := bat.append(X_37=[3]:bat[:int], X_32=[2]:bat[:int], true:bit);                                                                                                   |
|  2123 | X_39=[1]:bat[:int] := bat.append(X_7=[1]:bat[:int], X_30=[1]:bat[:int], true:bit);                                                                                                    |
|    24 | X_40=[3]:bat[:int] := bat.append(X_39=[3]:bat[:int], X_29=[2]:bat[:int], true:bit);                                                                                                   |
| 28377 | barrier X_41=false:bit := language.dataflow();                                                                                                                                        |
|    28 | X_42=1:int := sql.resultSet(X_6=[2]:bat[:str], X_11=[2]:bat[:str], X_13=[2]:bat[:str], X_5=[2]:bat[:int], X_3=[2]:bat[:int], X_40=[3]:bat[:int], X_38=[3]:bat[:int]);     

@adofsauron
Copy link
Collaborator

对于嵌套内连接直接物化后进行连接

 X_23=[1]:bat[:oid] := algebra.join(X_21=[3]:bat[:int], X_18=[2]:bat[:int], nil:BAT, nil:BAT, false:bit, nil:lng); # mergejoin_int 

@adofsauron
Copy link
Collaborator

这里对条件的处理存在不同,stonedb完全是以条件为中心进行关联,而固化的操作则是以列的向量处理为依据

|  1554 | C_17=[2]:bat[:oid] := algebra.thetaselect(X_9=[2]:bat[:int], C_10=[2]:bat[:oid], 1:int, ">=":str); # select: sorted                                                                   |


@adofsauron
Copy link
Collaborator

[2023-06-01 04:59:39.838678] [657043] [ERROR] [parallel_hash_join.cpp:376] MSG: PrepareBeforeJoin fail, traversed_dims_[3].Intersects(matched_dims_[3]) || !compatible[true]

@adofsauron
Copy link
Collaborator

这里需要注意,当使用内连接物化的操作时,构建散列探测的过程将会出错

@adofsauron
Copy link
Collaborator

可以看出是先将条件从连接上推,然后是尽可能的下推到内连接的表中

@adofsauron
Copy link
Collaborator

修复完客户上一个紧急的查询错误,继续回到正题处理这个事情

@adofsauron
Copy link
Collaborator

47CCB2FE-E794-4980-96B7-24C6282DF24E

@adofsauron adofsauron assigned adofsauron and unassigned adofsauron Jun 14, 2023
@adofsauron
Copy link
Collaborator

在列存储的数据库中对于此处场景的处理与mysql/sql的处理逻辑有诸多不同, 这里仅关注比较经典的列存储数据库的设计策略,比较明显的地方有:

  1. 处理单元的组织
  2. 在处理join时在列存储数据库中更多的基于列的处理
  3. 外连接与内连接在与存储层面的交互的处理

@adofsauron adofsauron changed the title bug: 执行外连接的查询过于耗时, tianmu需要执行十四分钟, innodb执行外连接仅需要执行一分钟,但是tianmu执行内连接仅需要0.4秒 bug: 执行外连接的查询过于耗时, tianmu需要执行十四分钟, innodb执行外连接仅需要执行一分钟 Jun 27, 2023
@adofsauron
Copy link
Collaborator

对于表连接的处理mysql/sql层使用嵌套循环并将条件做处理, 对于tianmu来说这种优化起到了反效果, 对于列的处理来说,执行逻辑类似于以下伪代码, 而不能采用嵌套循环的措施进行优化

left outer join (                        
| table("sys"."b") [ "b"."b1" ] ,  
| join (                                 
| | select (                             
| | | table("sys"."a") [ "a"."a1" ] 
| | ) [ "a"."a1" >= int "1" ],           
| | select (                             
| | | table("sys"."c") [ "c"."c1" ] 
| | ) [ "c"."c1" <= int "3" ]            
| ) [ "a"."a1" = "c"."c1" ]              
) [ "b"."b1" = "c"."c1" ]                

@adofsauron
Copy link
Collaborator

+---------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table         | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| c1am_acct_day |          0 | PRIMARY         |            1 | ROW_ID      | A         |     3251257 |     NULL | NULL   |      | BTREE      |         |               |
| c1am_acct_day |          0 | IDX_AMACCTDAY02 |            1 | ACCOUNT_ID  | A         |         813 |     NULL | NULL   |      | BTREE      |         |               |
| c1am_acct_day |          0 | IDX_AMACCTDAY02 |            2 | FISCAL_DATE | A         |     2755421 |     NULL | NULL   | YES  | BTREE      |         |               |
| c1am_acct_day |          1 | IDX_AMACCTDAY01 |            1 | TENANT_ID   | A         |           9 |     NULL | NULL   |      | BTREE      |         |               |
| c1am_acct_day |          1 | IDX_AMACCTDAY03 |            1 | ACCOUNT_ID  | A         |        1524 |     NULL | NULL   |      | BTREE      |         |               |
| c1am_acct_day |          1 | IDX_AMACCTDAY04 |            1 | FISCAL_DATE | A         |        3513 |     NULL | NULL   | YES  | BTREE      |         |               |
+---------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-bug Something isn't working B-performance prio: high High priority
Projects
Status: In Progress
Development

No branches or pull requests

3 participants