Heya 👋
Well, it turns out that a lot of the core Commonmarker C library's behavior is a bit strange when it comes to parsing/walking the AST. As you've noticed, Commonmarker will just blatantly rewrite text as it sees fit. Wrote a list using `*`s? Too bad, Commonmarker prefers `-` after it's done parsing. I found an ancient PR to fix this specific behavior, but you already found similar defaults in other things like blockquotes and code blocks. So according to the C code, there's literally no way to just "pass through" text as it exists, which really sucks.

Given that, if the text diff needs to remain identical, it might well be that the only plausible solution is to do a `gsub` as you originally thought.

In this PR I'm suggesting an alternative approach, though it's by no means fool-proof. Here, rather than passing all unmatched text over with `node.to_commonmarker` (which sometimes rewrites the content), we'll instead keep an array of pairs, each holding the original text and its replacement, for the node types we care about. So, if headers are the only thing that needs to change, a `header` method can operate on the node, and then an iteration over the `changes` can make those substitutions. It passes all the original test cases, so if header manipulation is all that's changing, it should at least be a viable solution for this!
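
To make the idea concrete, here's a rough sketch of the pairs-then-substitute flow. It is not the code in this PR: to keep it self-contained it uses a plain regex in place of walking the Commonmarker AST, and `HEADER_PATTERN`, `collect_header_changes`, and `apply_changes` are hypothetical names made up for illustration.

```ruby
# Hypothetical sketch: a regex stands in for walking the Commonmarker AST.
# Matches ATX-style headers such as "# Title" or "## Section".
HEADER_PATTERN = /^([#]{1,6})[ \t]+(.+)$/

# Build [original, replacement] pairs for every header line.
# The caller's block receives the marker ("#", "##", ...) and the header
# text, and returns the full replacement line.
def collect_header_changes(source)
  changes = []
  source.scan(HEADER_PATTERN) do
    match = Regexp.last_match
    original = match[0]
    replacement = yield(match[1], match[2])
    changes << [original, replacement] unless replacement == original
  end
  changes
end

# Apply each pair to the untouched source. Anchoring on whole lines keeps
# "# Title" from matching inside "## Title"; the block form of sub avoids
# backslash sequences in the replacement being interpreted.
def apply_changes(source, changes)
  changes.reduce(source) do |text, (original, replacement)|
    text.sub(/^#{Regexp.escape(original)}$/) { replacement }
  end
end

source = "# Title\n\n* one\n* two\n\n## Section\n"
changes = collect_header_changes(source) { |level, text| "#{level}# #{text}" }
result = apply_changes(source, changes)
```

Because only the header lines are touched, the `*` list markers in `source` come through as written rather than being normalized. The same caveat applies as above: this is text matching, so two identical header lines, or header-looking lines inside a code block, would need extra care.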