New IR -- WIP #24466

kasiafi · 2024-12-13T08:10:11Z

See core/trino-main/src/main/java/io/trino/sql/newir/README.md for details.

kasiafi · 2024-12-13T17:47:01Z

Example assembly printout:

IR version = 1
%0 = query() : () -> "boolean" ({
    ^query
        %1 = table_scan() : () -> "multiset(row(""f_1"" varchar(25),""f_2"" bigint))" ()
            {table_handle = "{""catalogHandle"":""tpch:normal:default"",""connectorHandle"":{""@type"":""../../plugin/trino-tpch/pom.xml:io.trino.plugin.tpch.TpchTableHandle"",""schemaName"":""tiny"",""tableName"":""nation"",""scaleFactor"":0.01,""constraint"":{""columnDomains"":[]}},""transaction"":[""../../plugin/trino-tpch/pom.xml:io.trino.plugin.tpch.TpchTransactionHandle"",""INSTANCE""]}", column_handles = "[{""@type"":""../../plugin/trino-tpch/pom.xml:io.trino.plugin.tpch.TpchColumnHandle"",""columnName"":""name"",""type"":""varchar(25)""},{""@type"":""../../plugin/trino-tpch/pom.xml:io.trino.plugin.tpch.TpchColumnHandle"",""columnName"":""regionkey"",""type"":""bigint""}]", constraint = "{""columnDomains"":[]}", update_target = "false", use_connector_node_partitioning = "false"}
        %2 = filter(%1) : ("multiset(row(""f_1"" varchar(25),""f_2"" bigint))") -> "multiset(row(""f_1"" varchar(25),""f_2"" bigint))" ({
            ^predicate (%3 : "row(""f_1"" varchar(25),""f_2"" bigint)")
                %4 = field_selection(%3) : ("row(""f_1"" varchar(25),""f_2"" bigint)") -> "bigint" ()
                    {field_name = "f_2"}
                %5 = constant() : () -> "bigint" ()
                    {constant_result = "{""type"":""bigint"",""value"":2}"}
                %6 = comparison(%4, %5) : ("bigint", "bigint") -> "boolean" ()
                    {comparison_operator = "GREATER_THAN"}
                %7 = return(%6) : ("boolean") -> "boolean" ()
                    {ir.terminal = "true"}
            })
        %8 = project(%2) : ("multiset(row(""f_1"" varchar(25),""f_2"" bigint))") -> "multiset(row(""f_1"" varchar(25)))" ({
            ^assignments (%9 : "row(""f_1"" varchar(25),""f_2"" bigint)")
                %10 = field_selection(%9) : ("row(""f_1"" varchar(25),""f_2"" bigint)") -> "varchar(25)" ()
                    {field_name = "f_1"}
                %11 = call(%10) : ("varchar(25)") -> "varchar(25)" ()
                    {resolved_function = "{""signature"":{""name"":{""catalogName"":""system"",""schemaName"":""builtin"",""functionName"":""lower""},""returnType"":""varchar(25)"",""argumentTypes"":[""varchar(25)""]},""catalogHandle"":""system:normal:system"",""functionId"":""lower(varchar(x)):varchar(x)"",""functionKind"":""SCALAR"",""deterministic"":true,""functionNullability"":{""returnNullable"":false,""argumentNullable"":[false]},""typeDependencies"":{},""functionDependencies"":[]}"}
                %12 = row(%11) : ("varchar(25)") -> "row(varchar(25))" ()
                %13 = return(%12) : ("row(varchar(25))") -> "row(varchar(25))" ()
                    {ir.terminal = "true"}
            })
        %14 = output(%8) : ("multiset(row(""f_1"" varchar(25)))") -> "boolean" ({
            ^outputFieldSelector (%15 : "row(""f_1"" varchar(25))")
                %16 = field_selection(%15) : ("row(""f_1"" varchar(25))") -> "varchar(25)" ()
                    {field_name = "f_1"}
                %17 = row(%16) : ("varchar(25)") -> "row(varchar(25))" ()
                %18 = return(%17) : ("row(varchar(25))") -> "row(varchar(25))" ()
                    {ir.terminal = "true"}
            })
            {output_names = "[""_col0""]", ir.terminal = "true"}
    })
    {ir.terminal = "true"}
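A note on reading the printout: types and attribute values appear as double-quoted strings, and a quote character inside such a string appears doubled (for example `""f_1""`). The following is a minimal sketch of that convention as inferred from the printout alone. `IrStringSketch`, `quote`, and `unquote` are hypothetical names and are not taken from the PR's printer or parser.

```java
// Hypothetical helper illustrating the quoting convention visible in the printout.
// It is an assumption based on the example output, not the PR's actual code.
final class IrStringSketch
{
    private IrStringSketch() {}

    // Wraps a value in double quotes and doubles every embedded double quote.
    static String quote(String value)
    {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Reverses quote(): strips the outer quotes and collapses doubled quotes.
    static String unquote(String printed)
    {
        if (printed.length() < 2 || !printed.startsWith("\"") || !printed.endsWith("\"")) {
            throw new IllegalArgumentException("not a quoted string: " + printed);
        }
        return printed.substring(1, printed.length() - 1).replace("\"\"", "\"");
    }
}
```

Under this reading, the type of `%15` in the printout unquotes to `row("f_1" varchar(25))`.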

kasiafi · 2025-02-07T12:14:55Z

core/trino-main/src/main/java/io/trino/sql/dialect/trino/TrinoAttributeRegistry.java

I believe that attributes should be defined at the top level of the dialect, as opposed to being part of an operation.
The IR allows mixing dialects. In particular, an operation can have attributes from another dialect. For example, the trino.query operation has the attribute ir.terminal. A dialect is therefore supposed to understand its attributes outside the context of any particular operation.
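To illustrate the point, here is a minimal sketch of a registry keyed by dialect rather than by operation. All names in it (`AttributeRegistrySketch`, `parse`, the `defaultDialect` parameter) are hypothetical and do not reflect the actual `TrinoAttributeRegistry` API. Treating an unqualified attribute name as belonging to a default dialect is also an assumption, based only on the printout showing `ir.terminal` with a prefix and `field_name` without one.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: attribute parsers are owned by a dialect, not by an operation.
final class AttributeRegistrySketch
{
    private final String defaultDialect;
    // dialect name -> attribute name -> parser from printed text to a value
    private final Map<String, Map<String, Function<String, Object>>> dialects;

    AttributeRegistrySketch(String defaultDialect, Map<String, Map<String, Function<String, Object>>> dialects)
    {
        this.defaultDialect = defaultDialect;
        this.dialects = Map.copyOf(dialects);
    }

    // Resolves a name such as "ir.terminal" without consulting the enclosing operation.
    // Assumption: a name without a dialect prefix belongs to the default dialect.
    Optional<Object> parse(String name, String printedValue)
    {
        int dot = name.indexOf('.');
        String dialect = dot < 0 ? defaultDialect : name.substring(0, dot);
        String attribute = dot < 0 ? name : name.substring(dot + 1);
        return Optional.ofNullable(dialects.get(dialect))
                .map(attributes -> attributes.get(attribute))
                .map(parser -> parser.apply(printedValue));
    }
}
```

With this shape, an operation from one dialect can carry `ir.terminal` and the lookup never needs to know which operation the attribute is attached to.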

kasiafi · 2025-02-07T12:24:58Z

core/trino-main/src/main/java/io/trino/sql/dialect/trino/TypeConstraint.java

+{
+    // Intermediate result row type.
+    // Row without fields is supported and represented as EmptyRowType.
+    // If row fields are present, they must have valid unique names.


Currently, the intermediate relation row type has field names, and the fields are referenced by name with the FieldSelection operation. This will be refactored so that the row type is anonymous and the fields are referenced by index with the FieldReference operation.

Explanation:
The query program must work correctly with the Memo and Equivalence Classes. For that purpose, the intermediate relation type must be generic. Each operation in an Equivalence Class must derive exactly the same output type, because all of them must be compatible with the downstream program.
Because of this constraint, field names in the intermediate row type aren't very useful. We would have to use generic sequential names, for example f_1, f_2, f_3... Using indexes is more concise.
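The argument can be sketched as follows: two row types that differ only in field names are not equal, while their anonymous counterparts are, so only the anonymous form lets every member of an Equivalence Class derive the same type. The records below are hypothetical stand-ins and not Trino's `RowType`. Zero-based indexing is an assumption of this sketch, since the comment does not say which base FieldReference will use.

```java
import java.util.List;

// Hypothetical stand-ins for row types; types are kept as plain strings for brevity.
record NamedField(String name, String type) {}

record NamedRow(List<NamedField> fields) {}

record AnonymousRow(List<String> fieldTypes)
{
    // Fields are addressed by position (zero-based in this sketch) instead of by name.
    String fieldType(int index)
    {
        return fieldTypes.get(index);
    }
}

final class RowTypeSketch
{
    private RowTypeSketch() {}

    // Drops field names so that rows differing only in naming derive the same type.
    static AnonymousRow anonymize(NamedRow row)
    {
        return new AnonymousRow(row.fields().stream().map(NamedField::type).toList());
    }
}
```

For example, a row named `(name, regionkey)` and a row named `(f_1, f_2)` with the same field types compare as different `NamedRow` values but anonymize to equal `AnonymousRow` values.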

kasiafi · 2025-02-07T12:26:39Z

core/trino-main/src/main/java/io/trino/sql/dialect/trino/operation/Join.java

+        if (leftCriteriaSelector.parameters().size() != 1 ||
+                !trinoType(leftCriteriaSelector.parameters().getFirst().type()).equals(relationRowType(trinoType(left.type()))) ||
+                !(trinoType(leftCriteriaSelector.getReturnedType()) instanceof RowType || trinoType(leftCriteriaSelector.getReturnedType()).equals(EMPTY_ROW))) {
+            throw new TrinoException(IR_ERROR, "invalid left criteria selector for Join operation");
+        }


This validation code is repeated across many operation classes. It will be extracted and reused.
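One possible shape for the extracted check is sketched below. It uses simplified, hypothetical types (`SketchType`, `SketchBlock`, `SelectorValidation`) instead of the PR's `Block`, `trinoType`, `relationRowType`, `EMPTY_ROW`, and `TrinoException`, and it throws `IllegalArgumentException` only to stay self-contained.

```java
import java.util.List;

// Hypothetical, simplified stand-ins for the PR's types.
interface SketchType {}

record SketchRowType(List<SketchType> fields) implements SketchType {}

record SketchEmptyRow() implements SketchType {}

record SketchScalarType(String name) implements SketchType {}

record SketchBlock(List<SketchType> parameterTypes, SketchType returnedType) {}

final class SelectorValidation
{
    private SelectorValidation() {}

    // Checks that a selector block takes exactly one parameter of the input row type
    // and returns either a row or the empty row; otherwise reports which selector failed.
    static void validateRowSelector(SketchBlock selector, SketchType inputRowType, String role, String operation)
    {
        boolean singleMatchingParameter = selector.parameterTypes().size() == 1
                && selector.parameterTypes().get(0).equals(inputRowType);
        boolean returnsRow = selector.returnedType() instanceof SketchRowType
                || selector.returnedType() instanceof SketchEmptyRow;
        if (!singleMatchingParameter || !returnsRow) {
            throw new IllegalArgumentException("invalid %s for %s operation".formatted(role, operation));
        }
    }
}
```

A call such as `validateRowSelector(leftCriteriaSelector, leftRowType, "left criteria selector", "Join")` would then replace the inline condition, with the message text matching the one in the snippet above.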

cla-bot bot added the cla-signed label Dec 13, 2024

kasiafi force-pushed the 526Abstractions branch 2 times, most recently from ed4c70f to 6ea030e Compare December 13, 2024 17:44

kasiafi force-pushed the 526Abstractions branch 14 times, most recently from 30ed4b3 to 852f33b Compare December 20, 2024 15:17

kasiafi force-pushed the 526Abstractions branch 12 times, most recently from d71a5b9 to 8f19d4a Compare December 29, 2024 10:53

kasiafi force-pushed the 526Abstractions branch 11 times, most recently from 687d453 to fdaffa9 Compare January 10, 2025 12:31

kasiafi force-pushed the 526Abstractions branch from fdaffa9 to 9050e1c Compare January 21, 2025 13:21

kasiafi force-pushed the 526Abstractions branch 5 times, most recently from 5383dcf to 0a5463b Compare February 4, 2025 14:49

kasiafi marked this pull request as ready for review February 4, 2025 14:50

kasiafi force-pushed the 526Abstractions branch 3 times, most recently from bf82af6 to 1b12725 Compare February 5, 2025 19:35

kasiafi added 2 commits February 5, 2025 21:04

New IR -- abstractions and dialects

f77b9e5

Add parser for new IR

61a2bee

kasiafi force-pushed the 526Abstractions branch from 1b12725 to 61a2bee Compare February 5, 2025 20:05

kasiafi commented Feb 7, 2025

kasiafi force-pushed the 526Abstractions branch from efa6f93 to 27649af Compare February 7, 2025 17:01

Test IR print with QueryRunner

40aeed5

kasiafi force-pushed the 526Abstractions branch from 27649af to 40aeed5 Compare February 7, 2025 19:41
