Prefer equality when comparing values #34164
On Sqlite, with create table test (a, b); the inequality has a much shorter plan:
but it performs a full scan of the table for each row. Equal-to-not has a more complex plan:
which automatically creates a temporary index.
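For anyone who wants to reproduce the two plans locally, here is a minimal sketch using Python's built-in sqlite3 module. The exact queries from the comment above are not shown, so the self-join and the `t1.a = (not t2.b)` spelling of "equal-to-not" are assumptions modelled on the SqlServer script later in this thread; the helper name `query_plan` is hypothetical. The plan text varies between SQLite versions, so no expected output is listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (a, b)")


def query_plan(where_clause):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for a self-join of test."""
    sql = (
        "explain query plan "
        "select count(*) from test as t1, test as t2 where " + where_clause
    )
    # The detail text is the fourth column of every EXPLAIN QUERY PLAN row.
    return [row[3] for row in conn.execute(sql)]


# Assumed query shapes: inequality vs. equality against the negated column.
inequality_plan = query_plan("t1.a != t2.b")
equality_plan = query_plan("t1.a = (not t2.b)")

for label, plan in (("inequality", inequality_plan), ("equal-to-not", equality_plan)):
    print(label)
    for step in plan:
        print("   ", step)
```

Whether the second plan mentions an automatic index depends on how the SQLite library was built and configured, so treat the printed plans as something to inspect rather than a guaranteed result.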
IIUC similar results also hold true in postgres: create table test (a boolean, b boolean);
SqlServer 2022: create table test (a bit, b bit);
SET SHOWPLAN_ALL ON;
An example session showing the performance difference on Sqlite:
TL;DR: about 3 times faster for a 50/50 distribution of true/false; much faster for 5/5/90 false/true/null.
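The original session is not reproduced above, so the following is only a rough sketch of how such a comparison could be run with Python's sqlite3 module. The row count `N`, the data generator, and the helper `timed_count` are assumptions chosen to give a 50/50 split of 0 and 1 in both columns. Timings depend heavily on the machine and SQLite build, and the sketch makes no claim about reproducing the 3x figure.

```python
import sqlite3
import time

N = 1000  # assumed size; kept small so the N x N self-join finishes quickly

conn = sqlite3.connect(":memory:")
conn.execute("create table test (a, b)")
# Low bit of i goes to a, next bit to b: half zeros and half ones in each column.
conn.executemany(
    "insert into test (a, b) values (?, ?)",
    [(i & 1, (i >> 1) & 1) for i in range(N)],
)
conn.commit()


def timed_count(where_clause):
    """Run the self-join count and return (count, elapsed seconds)."""
    sql = "select count(*) from test as t1, test as t2 where " + where_clause
    start = time.perf_counter()
    (count,) = conn.execute(sql).fetchone()
    return count, time.perf_counter() - start


neq_count, neq_seconds = timed_count("t1.a != t2.b")
eq_count, eq_seconds = timed_count("t1.a = (not t2.b)")

print(f"a != b      : {neq_count} rows in {neq_seconds:.4f}s")
print(f"a = (not b) : {eq_count} rows in {eq_seconds:.4f}s")
```

Both queries must return the same count on 0/1 data; only the elapsed time is expected to differ.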
A similar experiment on SqlServer 2022:
set statistics time ON
create table test (a BIT, b BIT);
insert into test (a,b) select value & 1, CAST(value & 2 AS BIT) from generate_series(1, 10000000);
select COUNT_BIG (*) from test as t1, test as t2 where t1.a != t2.b;
--SQL Server parse and compile time:
-- CPU time = 0 ms, elapsed time = 0 ms.
--
-- SQL Server Execution Times:
-- CPU time = 4027 ms, elapsed time = 2281 ms.
select COUNT_BIG (*) from test as t1, test as t2 where t1.a = ~(t2.b);
--SQL Server parse and compile time:
-- CPU time = 0 ms, elapsed time = 0 ms.
--
-- SQL Server Execution Times:
-- CPU time = 3578 ms, elapsed time = 198 ms.
delete from test;
insert into test (a,b) select
CASE value % 20 WHEN 0 THEN 0 WHEN 1 THEN 1 END,
CASE (value / 20) % 20 WHEN 0 THEN 0 WHEN 1 THEN 1 END
from generate_series(1, 10000000);
select COUNT_BIG (*) from test as t1, test as t2 where t1.a != t2.b;
--SQL Server parse and compile time:
-- CPU time = 0 ms, elapsed time = 0 ms.
--
-- SQL Server Execution Times:
-- CPU time = 5176 ms, elapsed time = 3350 ms.
select COUNT_BIG (*) from test as t1, test as t2 where t1.a = ~(t2.b);
--SQL Server parse and compile time:
-- CPU time = 0 ms, elapsed time = 0 ms.
--
-- SQL Server Execution Times:
-- CPU time = 3682 ms, elapsed time = 199 ms.
TL;DR: SqlServer saves about 15% of the CPU time when filtering with
I see the same perf pattern on my informal test based on the script above!
Ah, I mentioned it in the comment #34166 (comment) but maybe it should also be repeated here: these benchmarks are completely synthetic and do not represent a real-world workload.
Most databases can take advantage of indexes (and in some cases even auto-generate them) when queries perform equality comparisons (instead of inequality).
In some places EFCore already tries to lean towards equality:
efcore/src/EFCore.Relational/Query/SqlExpressionFactory.cs
Lines 657 to 666 in 20edb63
In other places it disregards this and can even replace equality comparisons with inequalities:
efcore/src/EFCore.Relational/Query/SqlNullabilityProcessor.cs
Lines 1748 to 1757 in 20edb63
efcore/src/EFCore.Relational/Query/SqlNullabilityProcessor.cs
Lines 1803 to 1804 in 20edb63
It would be better to avoid converting equalities into inequalities and, when possible, convert inequalities into equalities.
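As a sanity check on the proposed rewrite, the sketch below uses SQLite (through Python's sqlite3 module) to confirm that `a != b` and `a = (not b)` agree for every combination of 0, 1 and NULL, including the cases where the result is NULL. This is only an illustration of the three-valued-logic argument, not EF Core code; the helper `evaluate` is hypothetical, and the equivalence is only claimed for boolean-like 0/1 values.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
values = (0, 1, None)  # false, true, NULL


def evaluate(expression, a, b):
    """Evaluate a SQL expression over the named parameters :a and :b."""
    (result,) = conn.execute("select " + expression, {"a": a, "b": b}).fetchone()
    return result


# Collect every (a, b) pair where the two forms disagree.
# A NULL result comes back as None, so NULL vs NULL counts as agreement.
mismatches = [
    (a, b)
    for a, b in product(values, repeat=2)
    if evaluate(":a != :b", a, b) != evaluate(":a = (not :b)", a, b)
]

print("mismatching inputs:", mismatches)
```

For non-boolean operands the two forms are not interchangeable (for example 2 != 3 is true, while 2 = (not 3) is false), so a rewrite like this only makes sense when both sides are known to be boolean.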