Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strange activity about join query with the join key condition. #63

Open
sophie-jeong opened this issue Jan 14, 2021 · 3 comments
Open

Comments

@sophie-jeong
Copy link

Hello , Here is the my sql query

explain verbose
SELECT
a.shop_member_id
, a.mall_id
, a.shop_no
, a.member_id
, a.cmember_name
, b.email
, b.cell
FROM
shop_member as a
INNER JOIN shop_member_property b ON ( a.shop_member_id = b.shop_member_id and a.shop_member_id in ( '6d34478f-626d-d162-6758-e9fc98dcd6a3') )
WHERE 1=1
QUERY PLAN




Foreign Scan (cost=1.00..-1.00 rows=1 width=1)
Output: a.shop_member_id, a.mall_id, a.shop_no, a.member_id, a.member_name, b.email, b.cell
Relations: (shop_member a) INNER JOIN (shop_member_property b)
Remote SQL: SELECT r1.shop_member_id, r1.mall_id, r1.shop_no, r1.member_id, r1.c_member_name, r2.c_email, r2.c_cell FROM bidb.shop_member r1 ALL INNE
R JOIN bidb.shop_member_property r2 ON (TRUE) WHERE ((r2.shop_member_id = '6d34478f-626d-d162-6758-e9fc98dcd6a3')) AND ((r1.shop_member_id = '6d34478f-62
6d-d162-6758-e9fc98dcd6a3'))

(4 rows)

Received exception from server (version 20.6.4):
Code: 403. DB::Exception: Received from localhost:9000. DB::Exception: Cannot get JOIN keys from JOIN ON section: TRUE.

@HagasSaan
Copy link

Same, but works if in join clause for IN condition will have 2 items - works as expected
For previous comment if might be as

INNER JOIN shop_member_property b ON ( a.shop_member_id = b.shop_member_id and a.shop_member_id in ( '6d34478f-626d-d162-6758-e9fc98dcd6a3', 'some_unknown_shop_member_id') )

@HagasSaan
Copy link

HagasSaan commented Jul 5, 2021

How to reproduce:

On Clickhouse:

create table default.deals_1_pipedrive
(
    date                        Date,
    __account_id                String,
    __row_hash                  UInt64,
    __row_id                    UInt16,
    __key_hash                  UInt64,
    __sign                      Int8,
    __insert_date               DateTime
)
    engine = CollapsingMergeTree(__sign)
        PARTITION BY toYYYYMM(date)
        ORDER BY (date, __account_id, __row_hash, __row_id)
        SETTINGS index_granularity = 8192;


CREATE VIEW default.deals_1_pipedrive_fdw
            (
             `date` Date,
             `__account_id` String,
             `__insert_date` DateTime
                )
AS
SELECT deals_1_pipedrive.date,
       deals_1_pipedrive.__account_id,
       deals_1_pipedrive.__insert_date
FROM default.deals_1_pipedrive
         FINAL;

INSERT INTO default.deals_1_pipedrive (
    date, __account_id, __row_hash, __row_id, __key_hash, __sign, __insert_date
) VALUES (
'2019-12-13', '1234', 1751228920605290501, 0,
8193464384205566314, 1, '2021-06-07 06:33:21'
);


-- auto-generated definition
create table report_accounts
(
    id           UInt64,
    account_id   String
)
    engine = File(TabSeparated);

INSERT INTO default.report_accounts (id, account_id) VALUES (218760, '1234');

then on CH FDW

create foreign table deals_1_pipedrive_fdw
    (
        date date,
        __account_id text
        )
    server clickhouse_svr;

create foreign table report_accounts
    (
        id integer,
        account_id text
        )
    server clickhouse_svr;

Then run

select * from deals_1_pipedrive_fdw m
join report_accounts s on m.__account_id = s.account_id
where m.__account_id in ('1234');

and it will fail with Cannot get JOIN keys from JOIN ON section: TRUE

Explain verbose will give
Remote SQL: SELECT r1.date, r1.__account_id, r2.id, r2.account_id FROM "default".deals_1_pipedrive_fdw r1 ALL INNER JOIN "default".report_accounts r2 ON (TRUE) WHERE ((r2.account_id = '1234')) AND ((r1.__account_id = '1234'))

@ildus
Copy link
Owner

ildus commented Jul 5, 2021

Hi, thank you for the script to reproduce.
Unfortunately, there is nothing I can do on FDW side, since the conditions changed by postgres optimizer, and FDW gets the conditions as you can see and doesn't change them.
Just to make sure I made a special build of postgres where I disabled process_equivalence function, and it made the query working.
There could be some trick to avoid this optimization.
Another option could be a discussion with the clickhouse developers to enable ON (True) construction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants