Add formatter for the parse AST #820

Open
wants to merge 71 commits into
base: master

LaurenzV
No description provided.

Contributor

JonasAlaif left a comment

I think that we really don't need all of these reformat override definitions. Please find some way to avoid most of them (see my suggestion for how).

@@ -318,7 +335,9 @@ case class PVersionedIdnUseExp(name: String, version: String, separator: String
 trait PAnyFormalArgDecl extends PNode with PUnnamedTypedDeclaration with PPrettySubnodes

 /** The declaration of an argument to a domain function. Not a `PDeclaration` as it will never clash. */
-case class PDomainFunctionArg(name: Option[PIdnDef], c: Option[PSym.Colon], typ: PType)(val pos: (Position, Position)) extends PAnyFormalArgDecl
+case class PDomainFunctionArg(name: Option[PIdnDef], c: Option[PSym.Colon], typ: PType)(val pos: (Position, Position)) extends PAnyFormalArgDecl {
+  override def reformat(implicit ctx: ReformatterContext): List[RNode] = showOption(name) <> showOption(c) <+> show(typ)

Contributor

I really have a feeling we could avoid a lot of these functions by simply using the PPrettySubnodes trait (e.g. by adding a reformat function there, or even just merging pretty and reformat).

Author

So I've gotten rid of quite a few implementations in ecbb283, let me know if it is better now. There are still quite a few implementations, but they are kind of unavoidable because a lot of nodes need custom handling.

I can try to move the Reformattable trait into PPrettySubnodes. I'm not 100% sure about merging pretty and reformat, since someone may rely on the API and I want to avoid breaking or interfering with existing code as much as possible. It would also result in an even bigger git diff, so if we do it at all, I think it would make sense to do it in a separate PR. As you want, though!

Contributor

Sorry for the slow response, I was on holiday. pretty is only used in the IDE, e.g. to display the signature of a function when a call to it is hovered, so I imagine it should match what reformat does quite closely. For example, pretty is already able to handle printing this PDomainFunctionArg: the PSym.Colon itself adds the space. I imagine reformat could be made to work the same way, with little custom handling here.
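
To illustrate the idea that the colon token carries its own space, here is a minimal sketch of a default reformat that simply concatenates the formatted subnodes. Everything in it (`Node`, `Token`, `RText`, `RSpace`, `DomainFunctionArg`, `render`) is a simplified, hypothetical stand-in and not the actual ParseAst.scala or reformatter API.

```scala
// Hypothetical, simplified stand-ins; NOT the real Viper parse AST types.
object DefaultReformatSketch {
  sealed trait RNode
  final case class RText(text: String) extends RNode
  case object RSpace extends RNode

  trait Node {
    def subnodes: List[Node]
    // Default behaviour: format a node by concatenating its formatted subnodes in order.
    def reformat: List[RNode] = subnodes.flatMap(_.reformat)
  }

  // Leaves decide about their own padding, so parents need no spacing logic.
  final case class Token(text: String, rightPad: Boolean = false) extends Node {
    def subnodes: List[Node] = Nil
    override def reformat: List[RNode] =
      RText(text) :: (if (rightPad) List(RSpace) else Nil)
  }

  // No reformat override here: the default traversal is enough.
  final case class DomainFunctionArg(name: Option[Token], colon: Option[Token], typ: Token) extends Node {
    def subnodes: List[Node] = name.toList ++ colon.toList ++ List(typ)
  }

  // Turn the flat RNode list into a string.
  def render(nodes: List[RNode]): String = nodes.map {
    case RText(t) => t
    case RSpace   => " "
  }.mkString
}
```

With a right-padded colon token, a named argument is rendered with the space after the colon without `DomainFunctionArg` knowing about it, and an unnamed argument is rendered as just its type.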

Author

So just to be clear, do you still want me to replace pretty completely in this PR? Or would it be fine to open an issue for that and, for now, just focus on merging this into PPrettySubnodes and seeing whether I can remove some more impls?

Contributor

I would like this PR to do things the way that pretty does. That means that the vast majority of classes in ParseAst.scala don't have a pretty or reformat implementation (looking over it now, I even see some overrides of pretty that are unnecessary), except for the unparseable types and 2 or 3 special pnodes maybe. If you're not sure how to achieve this, I'm happy to help out a bit.

Whether pretty is deleted once that is done is not super important. I imagine it would simply be implemented as .reformat.toString or whatever the calls are.

Author

Got it, I’ve already managed to remove a couple more overrides, I will keep trying a few more things and see how far I can get.

@@ -19,6 +22,8 @@ trait LeftSpace extends PReservedString { override def leftPad = " " }
 trait RightSpace extends PReservedString { override def rightPad = " " }
 case class PReserved[+T <: PReservedString](rs: T)(val pos: (Position, Position)) extends PNode with PLeaf {
   override def display = rs.display
+  // Need to override implementation because pretty-printing will add space padding
+  override def reformat(implicit ctx: ReformatterContext): List[RNode] = rt(rs.token)

Contributor

I would change leftPad and rightPad to be Boolean (I think currently they only are "" or " ") and then add a space to the left/right based on that. I think you can do the majority of spaces this way. You could also add an addNewline field which is set for e.g. all stmt keywords and function/method/requires etc. which adds a newline before the keyword. Lastly, I would define some things as "grouping" in the sense that they add two spaces after a newline for all contained RNodes, which would probably handle most cases of indentation (I think you already do this below, but I'm not 100% sure).
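
A rough sketch of what Boolean padding flags and a "grouping" indent could look like. The names mirror the traits in the diff above, but `Reserved`, `show`, `group` and the three example tokens are simplified assumptions for illustration, not the real `PReservedString` hierarchy.

```scala
// Hypothetical, simplified model of reserved tokens with Boolean padding flags.
object PaddingSketch {
  trait Reserved {
    def token: String
    def leftPad: Boolean = false
    def rightPad: Boolean = false
  }
  trait LeftSpace extends Reserved { override def leftPad: Boolean = true }
  trait RightSpace extends Reserved { override def rightPad: Boolean = true }

  // Example tokens; which tokens get which padding is an assumption made here.
  case object Colon extends RightSpace { val token = ":" }
  case object Implies extends LeftSpace with RightSpace { val token = "==>" }
  case object Dot extends Reserved { val token = "." }

  // Emit the token, adding a single space on each side the flags ask for.
  def show(r: Reserved): String =
    (if (r.leftPad) " " else "") + r.token + (if (r.rightPad) " " else "")

  // "Grouping": every newline inside the group is followed by two extra spaces.
  def group(inner: String): String = inner.replace("\n", "\n  ")
}
```

Here `show(Colon)` yields the colon followed by one space, and `group` applied to a body that starts with a newline indents each contained line by two spaces.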

Author

LaurenzV commented Mar 1, 2025

So, overall, I think I've managed to get rid of a bunch of overrides, thanks for the hints, they helped a lot! :D However, there still are quite a few remaining but I think it will be rather hard to remove those... :/ I'll try to add explanations for some of them below.

Some other notes:
Unfortunately, I couldn't reuse leftPad/rightPad in all instances, because there are a few cases where we need to deviate from the pretty printer. For example, the pretty printer by default wraps any binary operator in spaces, but the reformatter handles operators like + and ==> differently, so always padding with spaces messes things up. So I took the route of using the pretty printer's padding by default and overriding it for a few select cases.

Regarding some of the overridden reformats:
PGrouped, PDelimited: As you can see in the code, there are quite a few different cases that need to be distinguished, so I don't think this can easily be removed.

PFunctionType: We can't reuse the pretty impl here because it would call pretty on all children as well.

PExp/PBinExp: For those we also have different code paths depending on what exactly we are processing, so it's again hard to find a generalization for these.

PCondExp, PUnfolding, PApplying, PAsserting, etc.: For those (i.e. the ones with PExp), the difficulty is that the nesting we apply does not follow a clear general pattern, so I'm not sure how you would apply the grouping generically. For example, for unfolding you have a single expression, which is nested, grouped and then prepended with a line (which evaluates to either a space or a newline depending on the situation), while for PCondExp we have two expressions, of which the else part shouldn't get an additional layer of nesting. Always prepending a line to an expression is not desirable either: for example, a field access somewhere in your code shouldn't automatically get prepended whitespace.
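
As a rough illustration of the unfolding case, here is a minimal Wadler-style document sketch in which a `Line` becomes a space when its group fits and a newline plus indentation otherwise. All types (`Doc`, `Text`, `Line`, `Nest`, `Group`, `Concat`) and the `unfolding` helper are hypothetical stand-ins, not the reformatter's actual RNode API, and the fits check is deliberately simplified (it only looks at the group itself, not at what follows it).

```scala
// Hypothetical mini document algebra; NOT the reformatter's real RNode types.
object DocSketch {
  sealed trait Doc
  final case class Text(s: String) extends Doc
  case object Line extends Doc                      // space if the enclosing group fits, else newline
  final case class Nest(indent: Int, d: Doc) extends Doc
  final case class Group(d: Doc) extends Doc
  final case class Concat(ds: List[Doc]) extends Doc

  // Width of a document if every Line in it were rendered as a single space.
  def flatWidth(d: Doc): Int = d match {
    case Text(s)    => s.length
    case Line       => 1
    case Nest(_, x) => flatWidth(x)
    case Group(x)   => flatWidth(x)
    case Concat(ds) => ds.map(flatWidth).sum
  }

  def render(doc: Doc, width: Int): String = {
    val sb = new StringBuilder
    var col = 0
    def go(d: Doc, indent: Int, flat: Boolean): Unit = d match {
      case Text(s) => sb.append(s); col += s.length
      case Line =>
        if (flat) { sb.append(' '); col += 1 }
        else { sb.append('\n').append(" " * indent); col = indent }
      case Nest(i, x) => go(x, indent + i, flat)
      // A group is laid out flat if it fits into the remaining width.
      case Group(x)   => go(x, indent, flat || col + flatWidth(x) <= width)
      case Concat(ds) => ds.foreach(go(_, indent, flat))
    }
    go(doc, 0, flat = false)
    sb.toString
  }

  // The body is nested, grouped, and prepended with a line.
  def unfolding(pred: String, body: String): Doc = {
    val nestedBody = Group(Nest(2, Concat(List(Line, Text(body)))))
    Concat(List(Text(s"unfolding $pred in"), nestedBody))
  }
}
```

With a wide limit the body stays on the same line after a space; with a narrow limit it moves to the next line, indented by two spaces. A conditional expression would need a different shape (no extra nesting for the else part), which is why a single generic rule is hard to state.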

PIf, PWhile, etc.: Those also have somewhat conditional formatting. For example, if a while loop has invariants, then the opening bracket should always be on a newline, while if there aren't any invariants it looks nicer if it's on the same line.
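
The while-loop rule could be sketched roughly as follows. The function `formatWhile` and its string-based inputs are hypothetical simplifications for illustration and not how the reformatter represents loops.

```scala
// Hypothetical string-based sketch of the brace placement rule for while loops.
object WhileSketch {
  def formatWhile(cond: String, invariants: List[String], body: List[String]): String = {
    val header = s"while ($cond)"
    // Each invariant goes on its own indented line.
    val invs = invariants.map(inv => s"\n  invariant $inv").mkString
    // Without invariants the brace stays on the header line; with invariants it gets its own line.
    val open = if (invariants.isEmpty) " {" else "\n{"
    val stmts = body.map(stmt => s"\n  $stmt").mkString
    header + invs + open + stmts + "\n}"
  }
}
```

So the same node produces two different layouts depending on its contents, which is the kind of conditional formatting that a fixed per-token padding rule cannot express.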

PDefine: I don't think I can remove the manual space here since adding spaces to either PDelimited or even PNode will break a lot of other things.

PDomainFunction, PExp: Same here, adding a space by default to either PIdnDef or PDelimited or PBracedExp would break lots of other things.

Similar problems apply to the remaining ones.

Regarding your comments:

> I would change leftPad and rightPad to be Boolean (I think currently they only are "" or " ") and then add a space to the left/right based on that.

I changed that!

> You could also add an addNewline field which is set for e.g. all stmt keywords and function/method/requires etc. which adds a newline before the keyword.

I tried this, but unfortunately it doesn't work very well in my case. The problem is that I had to implement somewhat sophisticated logic to preserve newlines in the existing program while also respecting the intended formatting, and that logic doesn't work well if I just always prepend a newline. Just to give you an idea, if you have for example:

field x: Int
field y: Int

After reformatting, it should preferably stay the same, while if you have something like:

field y: Int



predicate StructA(this: Ref) {
  acc(this.x) && acc(this.y)
}

The reformatted result will now look like:

field y: Int

predicate StructA(this: Ref) {
  acc(this.x) && acc(this.y)
}

instead of:

field y: Int
predicate StructA(this: Ref) {
  acc(this.x) && acc(this.y)
}

Things get even more complicated if there are comments in between the members, where I need to look at the whitespace before and after the outer comments. This is why the logic for reformatting is pretty delicate, and just prepending a newline before each member unfortunately trips it up.
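
A minimal sketch of the blank-line behaviour shown in the examples above, assuming the rule is "keep a blank line between members if the source had at least one, but never more than one". The functions `separator` and `join` are hypothetical, and comments between members are not handled here.

```scala
// Hypothetical sketch: preserve at most one blank line between top-level members.
object BlankLineSketch {
  // sourceGap is the raw whitespace found between two members in the original program.
  def separator(sourceGap: String): String = {
    val newlines = sourceGap.count(_ == '\n')
    // Two or more newlines mean there was at least one blank line: keep exactly one.
    if (newlines >= 2) "\n\n" else "\n"
  }

  // Join already formatted members, using the original gaps to decide on blank lines.
  def join(members: List[String], gaps: List[String]): String = members match {
    case Nil => ""
    case first :: rest =>
      first + gaps.zip(rest).map { case (gap, member) => separator(gap) + member }.mkString
  }
}
```

Two fields on consecutive lines stay on consecutive lines, while three blank lines before a predicate collapse to one. Unconditionally prepending a newline before every member would override this decision.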

> Lastly, I would define some things as "grouping" in the sense that they add two spaces after a newline for all contained RNodes, which would probably handle most cases of indentation (I think you already do this below, but I'm not 100% sure).

Yes, I have the rne() function, which indents everything that appears in between, but as mentioned above there are a lot of edge cases, with different constructs having different preferred nesting rules, so generalizing those doesn't seem easy to me. The reason the pretty printer has far fewer of those is that it simplifies a lot, so its output isn't great in many cases and is often even syntactically broken (I tried it). For reformatting, though, it's obviously important that the output looks nice and the program stays syntactically valid.

Let me know what you think. I hope it's at least an improvement over before.
