Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional Feature: backtracking is sub-calls #375

Open
User4martin opened this issue Jan 8, 2024 · 5 comments
Open

Optional Feature: backtracking is sub-calls #375

User4martin opened this issue Jan 8, 2024 · 5 comments

Comments

@User4martin
Copy link

  IsMatching('Recursive', '^(?''A''(?:b[^b]+(?&A)?x+))', 'baaabaaaaaxx', [1,12,  1,12]);

https://regex101.com/r/g3pDSW/1

^(?'A'
  (?:
    b[^b]+
    (?&A)?  ## This sub-call will match ALL the "x", but when the outer match fails, it backtracks, and matches only ONE "x"
     x+
  )
)

TRegExpr fails to match, because it does not backtrack the sub-call.

@User4martin
Copy link
Author

It appears that recursion (rather than sub-call) (?R) or (?0)` does not backtrack.

https://regex101.com/r/T2Il6J/1
(?:b[^bx]+((?R))?x+) on `baaabaaaxx'

If it did backtrack, then IMHO it should match the whole pattern, as it does if the + after the x+ is removed: https://regex101.com/r/uEG8kF/1

There is a mention of "recursion being atomic" here https://www.rexegg.com/regex-recursion.html

@User4martin
Copy link
Author

Actually, changing the regex a bit ^(?'A'(?:b[^bx]+(?&A)?x+))
https://regex101.com/r/wwuP8H/1

  • Then in PCRE >= 7.3 the sub-call does back-tracking.
  • But in PCRE < 7.3 it does not. (As can be seen by removing the + of the x+

@User4martin User4martin changed the title Wrong/Missing backtracking is sub-calls Optional Feature: backtracking is sub-calls Jan 8, 2024
@User4martin
Copy link
Author

I would suggest to keep that as a reminder for an optional feature.

@Alexey-T
Copy link
Collaborator

Alexey-T commented Jan 8, 2024

Martin, I am out of code knowledge here. so do the change as you wish. please test it.

User4martin pushed a commit to User4martin/TRegExpr that referenced this issue Jan 8, 2024
@User4martin
Copy link
Author

User4martin commented Jan 8, 2024

If it is ok, for now just keep the issue open. Needs more research how different engines handle it, and if there is documentation. Then maybe a property controlled behaviour: current or as pcre7.3.

Solving this, would need changes that would also remove the 20 levels of recursion.

I don't plan any immediate work on it... But might later go for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants