Declarative Syntax Specification

Vaivaswatha N edited this page Jan 12, 2025 · 4 revisions

Introduction

To print and parse the IR, pliron provides the Printable and Parsable traits, which can be implemented on Op, Type and Attribute objects, and even on general Rust types.

However, implementing these traits by hand can be tedious and error-prone. MLIR solves this problem via ODS (Operation Definition Specification), which allows the syntax to be specified in a table-driven manner using LLVM's TableGen.

Rather than implementing a separate tool, pliron chooses to take advantage of Rust's powerful procedural macros. This article describes the current capabilities for deriving the format of different IR entities via the #[format] macro.

Rust structs and tuples

Provided that every field of a struct (or a tuple struct) already implements the trait Format: Printable + Parsable, the #[format] macro can be applied to the struct to derive Format for it automatically.

#[format]
struct U64Wrapper {
    a: u64
}

For an instance of this struct with a: 42, the derived (generated) printer prints {a=42}.

The generated printer and parser can be inspected with cargo-expand. For this one example, the expansions are reproduced below:

Generated `Printable` and `Parsable` for `U64Wrapper`:

impl ::pliron::printable::Printable for U64Wrapper {
    fn fmt(
        &self,
        ctx: &::pliron::context::Context,
        state: &::pliron::printable::State,
        fmt: &mut ::std::fmt::Formatter<'_>,
    ) -> ::std::fmt::Result {
        ::pliron::printable::Printable::fmt(&"{", ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&"a", ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&"=", ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&self.a, ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&"}", ctx, state, fmt)?;
        Ok(())
    }
}
impl ::pliron::parsable::Parsable for U64Wrapper {
    type Arg = ();
    type Parsed = Self;
    fn parse<'a>(
        state_stream: &mut ::pliron::parsable::StateStream<'a>,
        arg: Self::Arg,
    ) -> ::pliron::parsable::ParseResult<'a, Self::Parsed> {
        use ::pliron::parsable::IntoParseResult;
        use ::combine::Parser;
        use ::pliron::input_err;
        use ::pliron::location::Located;
        let cur_loc = state_stream.loc();
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("{"))
            .parse_stream(state_stream)
            .into_result()?;
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("a"))
            .parse_stream(state_stream)
            .into_result()?;
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("="))
            .parse_stream(state_stream)
            .into_result()?;
        let a = <u64>::parser(()).parse_stream(state_stream).into_result()?.0;
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("}"))
            .parse_stream(state_stream)
            .into_result()?;
        let final_ret_value = U64Wrapper { a };
        Ok(final_ret_value).into_parse_result()
    }
}
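
The shape of the expansion above can be mirrored with the standard library alone. The following is a minimal sketch, not pliron code: print_wrapper, expect and parse_wrapper are hypothetical helpers, and whitespace is skipped before each token on the assumption that the spaced combinator behaves similarly.

```rust
/// Stand-in for the struct from the example above.
#[derive(Debug, PartialEq)]
struct U64Wrapper {
    a: u64,
}

/// Print in the default derived layout: `{`, field name, `=`, value, `}`.
fn print_wrapper(w: &U64Wrapper) -> String {
    format!("{{a={}}}", w.a)
}

/// Consume `token` from the front of `input`, skipping leading whitespace.
/// Hypothetical helper; returns the remaining input on success.
fn expect<'a>(input: &'a str, token: &str) -> Result<&'a str, String> {
    input
        .trim_start()
        .strip_prefix(token)
        .ok_or_else(|| format!("expected `{}`", token))
}

/// Parse the same layout back, one token at a time, in the order printed.
fn parse_wrapper(input: &str) -> Result<U64Wrapper, String> {
    let rest = expect(input, "{")?;
    let rest = expect(rest, "a")?;
    let rest = expect(rest, "=")?.trim_start();
    // The digits run up to the first non-digit character.
    let end = rest
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(rest.len());
    let a = rest[..end].parse::<u64>().map_err(|e| e.to_string())?;
    expect(&rest[end..], "}")?;
    Ok(U64Wrapper { a })
}
```

Printing a value and feeding the result to parse_wrapper yields the original value back, which is the round-trip property the derived Printable and Parsable implementations are meant to provide.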

Specifying a custom syntax

The #[format] macro takes a string argument to specify a custom syntax.

  1. A named variable $name specifies a named struct field.
  2. An unnamed variable $i specifies the i'th field of a tuple struct.
  3. Literals are enclosed with backticks (`).

Example:

#[format("$upper `/` $lower")]
struct IntDiv {
    upper: u64,
    lower: u64,
}

An instance of this with upper = 42 and lower = 7 prints 42/7.
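
To make the three rules concrete, here is a small sketch that splits a format string into its elements. It is an illustration only: Elem and tokenize are hypothetical names, and this is not how pliron's macro is implemented.

```rust
/// One element of a format string, following the three rules above.
#[derive(Debug, PartialEq)]
enum Elem {
    /// `$name`: a named struct field.
    Named(String),
    /// `$i`: the i'th field of a tuple struct.
    Unnamed(usize),
    /// Text enclosed in backticks.
    Literal(String),
}

/// Split a format string into elements.
/// Hypothetical helper, not pliron's own parser.
fn tokenize(spec: &str) -> Result<Vec<Elem>, String> {
    let mut elems = Vec::new();
    let mut chars = spec.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // Whitespace between elements carries no meaning here.
            c if c.is_whitespace() => continue,
            '`' => {
                // Collect everything up to the closing backtick.
                let mut lit = String::new();
                loop {
                    match chars.next() {
                        Some('`') => break,
                        Some(ch) => lit.push(ch),
                        None => return Err("unterminated literal".to_string()),
                    }
                }
                elems.push(Elem::Literal(lit));
            }
            '$' => {
                // Collect the identifier or index that follows `$`.
                let mut name = String::new();
                while let Some(&ch) = chars.peek() {
                    if ch.is_alphanumeric() || ch == '_' {
                        name.push(ch);
                        chars.next();
                    } else {
                        break;
                    }
                }
                if name.is_empty() {
                    return Err("empty variable name".to_string());
                }
                // Numeric names are positional; anything else is a field name.
                match name.parse::<usize>() {
                    Ok(i) => elems.push(Elem::Unnamed(i)),
                    Err(_) => elems.push(Elem::Named(name)),
                }
            }
            other => return Err(format!("unexpected character `{}`", other)),
        }
    }
    Ok(elems)
}
```

With this sketch, the IntDiv format string above tokenizes into a named variable upper, the literal /, and a named variable lower.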

Rust enums

The #[format] macro can derive Format for enums as well, as long as all sub-elements implement both Printable and Parsable.

On the enum itself, the #[format] macro does not take a custom syntax argument. Individual variants, however, can each carry their own #[format] with a custom format string.

#[format]
enum Enum {
    A(TypePtr<IntegerType>),
    B { one: TypePtr<IntegerType>, two: IntWrapper },

    C,
    #[format("`<` $upper `/` $lower `>`")]
    Op {
        upper: u64,
        lower: u64,
    },
}

The printed values for each variant look as follows:

  • A(builtin.int <si64>)
  • B{one=builtin.int <si64>,two={inner=builtin.int <si64>}}
  • C
  • Op<42/7>
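
The layout of these four outputs can be reproduced by hand with std::fmt::Display. The sketch below is a simplification, not the derived code: the pliron field types are replaced by u64 (an assumption made so the example compiles on its own), so only the variant-level punctuation matches the output above.

```rust
use std::fmt;

/// Simplified stand-in for the enum above: the pliron field types are
/// replaced by `u64` so that the sketch is self-contained.
enum Enum {
    A(u64),
    B { one: u64, two: u64 },
    C,
    Op { upper: u64, lower: u64 },
}

impl fmt::Display for Enum {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Tuple variant: the name, then the fields in parentheses.
            Enum::A(x) => write!(f, "A({})", x),
            // Struct variant: the name, then `field=value` pairs in braces.
            Enum::B { one, two } => write!(f, "B{{one={},two={}}}", one, two),
            // Unit variant: just the name.
            Enum::C => write!(f, "C"),
            // Variant with the custom format "`<` $upper `/` $lower `>`".
            Enum::Op { upper, lower } => write!(f, "Op<{}/{}>", upper, lower),
        }
    }
}
```

In every case the variant name is printed first, which is what lets a parser decide which variant's syntax to expect next.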

Attributes and Types

Since pliron's Attributes and Types are Rust types (structs and enums) implementing their respective traits, specifying a format for them works the same way as for the general Rust types described above, except that the format_attribute and format_type macros are used instead of format.

Examples: an attribute and a type, with the format specified.

#[def_attribute("test.my_attr")]
#[format_attribute("`<` $ty `>`")]
#[derive(PartialEq, Clone, Debug)]
struct MyAttr {
    ty: Ptr<TypeObj>,
}
impl_verify_succ!(MyAttr);

and

#[def_type("llvm.array")]
#[derive(Hash, PartialEq, Eq, Debug)]
#[format_type("`[` $size `x` $elem `]`")]
pub struct ArrayType {
    elem: Ptr<TypeObj>,
    size: u64,
}

Ops

Ops are Rust structs with just one field, a Ptr to the underlying Operation. The semantics of an Op are therefore not carried by the struct's fields but by the underlying Operation's result types, operands, regions and attributes, so the custom syntax rules for Ops are different.

Only syntaxes in which the results appear before the opid are supported:

res1, ... = opid ...

The format string specifies what comes after the opid.

  1. A named variable $name specifies a named attribute of the operation.
  2. An unnamed variable $i specifies operands[i], except when inside some directives.
  3. The "type" directive specifies that a type must be parsed. It takes one argument, which is an unnamed variable $i with i specifying result[i].
  4. The "region" directive specifies that a region must be parsed. It takes one argument, which is an unnamed variable $i with i specifying region[i].

Examples:

#[format_op("`:` type($0)")]
#[def_op("test.one_result_zero_operands")]
#[derive_op_interface_impl(ZeroOpdInterface, OneResultInterface)]
struct OneResultZeroOperandsOp {}
impl_verify_succ!(OneResultZeroOperandsOp);

This looks like res0 = test.one_result_zero_operands : builtin.int <si64>;

#[format_op("$0 `:` type($0)")]
#[def_op("test.one_result_one_operand")]
#[derive_op_interface_impl(OneOpdInterface, OneResultInterface)]
struct OneResultOneOperandOp {}
impl_verify_succ!(OneResultOneOperandOp);

This looks like res1 = test.one_result_one_operand res0 : builtin.int <si64>;
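
The four rules for Op format strings can be illustrated with a small classifier. This is a sketch under stated assumptions, not pliron's implementation: OpElem, classify and parse_op_format are hypothetical names, and elements are assumed to be separated by whitespace, so literals containing spaces are not handled.

```rust
/// One element of an Op format string, following the rules listed above.
#[derive(Debug, PartialEq)]
enum OpElem {
    /// Text enclosed in backticks.
    Literal(String),
    /// `$name`: a named attribute of the operation.
    Attr(String),
    /// `$i` outside a directive: operand i.
    Operand(usize),
    /// `type($i)`: the type of result i.
    ResultType(usize),
    /// `region($i)`: region i.
    Region(usize),
}

/// Classify one whitespace-separated piece of an Op format string.
fn classify(piece: &str) -> Result<OpElem, String> {
    // Parse the `$i` argument of a directive into its index.
    let index = |arg: &str| -> Result<usize, String> {
        arg.strip_prefix('$')
            .and_then(|d| d.parse::<usize>().ok())
            .ok_or_else(|| format!("expected `$i`, found `{}`", arg))
    };
    if let Some(lit) = piece.strip_prefix('`').and_then(|p| p.strip_suffix('`')) {
        Ok(OpElem::Literal(lit.to_string()))
    } else if let Some(arg) = piece.strip_prefix("type(").and_then(|p| p.strip_suffix(')')) {
        Ok(OpElem::ResultType(index(arg)?))
    } else if let Some(arg) = piece.strip_prefix("region(").and_then(|p| p.strip_suffix(')')) {
        Ok(OpElem::Region(index(arg)?))
    } else if let Some(var) = piece.strip_prefix('$') {
        // Numeric variables are operands; other non-empty names are attributes.
        match var.parse::<usize>() {
            Ok(i) => Ok(OpElem::Operand(i)),
            Err(_) if !var.is_empty() => Ok(OpElem::Attr(var.to_string())),
            Err(_) => Err("empty variable name".to_string()),
        }
    } else {
        Err(format!("unrecognised element `{}`", piece))
    }
}

/// Split an Op format string on whitespace and classify each piece.
fn parse_op_format(spec: &str) -> Result<Vec<OpElem>, String> {
    spec.split_whitespace().map(classify).collect()
}
```

Note how the same $0 is read in two ways in the second example's format string: as operand 0 when it stands alone, and as result 0 when it is the argument of the type directive.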