-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement semantic diagnostics #379
Conversation
0df23fa
to
9f4a821
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, unless there's a break between compiler/library interfaces. We really need to keep those stable and migrate incrementally.
case invalidReference(Int) | ||
case duplicateNamedCapture(String) | ||
case invalidCharacterClassRangeOperand | ||
case invalidQuantifierRange(Int, Int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same enum or separate enum? (I haven't thought about it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially added them as a separate enum, but it seemed cleaner to do it this way as they share all the same logic as other parser errors for e.g printing and catch
block handling. This for example means they can use the same testing logic as other parsing tests. We could split them out in the future, but for now at least I think this is the simplest way to go.
This allows specifying whether or not to perform semantic checks on the AST. Some clients, e.g syntax coloring, only care about the syntactic structure. But other clients want errors to be emitted for e.g unsupported constructs.
Start emitting errors for unsupported constructs, and other semantic errors such as duplicate group names. Once we start emitting bytecode for regex at compile time, these errors could potentially be subsumed into the bytecode generator. But for now, implement them as a separate pass.
Begin storing source location on capture lists, and start erroring on duplicate named captures.
@swift-ci please test |
} | ||
|
||
func validateQuantification(_ quant: AST.Quantification) throws { | ||
try validateNode(quant.child) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to validate that the child isn't a zero-width assertion here? e.g. we want to reject \b+
. Tracked in #312.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented logic to check for escape sequences that aren't quantifiable, how does it look?
.extendedPictographic, .graphemeLink, .hyphen, .otherAlphabetic, | ||
.otherDefaultIgnorableCodePoint, .otherGraphemeExtended, | ||
.otherIDContinue, .otherIDStart, .otherLowercase, .otherMath, | ||
.otherUppercase, .prependedConcatenationMark: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these .other*
properties, we don't implement b/c they're included in e.g. \p{isAlphabetic}
. Do you think we should redirect people to the corresponding properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, and I can take care of that in a follow-up
case .resetStartOfMatch, .singleDataUnit, .horizontalWhitespace, | ||
.notHorizontalWhitespace, .verticalTab, .notVerticalTab, | ||
// '\N' needs to be emitted using 'emitAny'. | ||
.notNewline: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still to implement, but I'll fix up the validation then.
- Make `\h` and `\H` supported for now - Check character class ranges - Diagnose unquantifiable escape sequences
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Start emitting errors for unsupported constructs, and other semantic errors such as duplicate group names.
Once we start emitting bytecode for regex at compile time, these errors could potentially be subsumed into the bytecode generator. But for now, implement them as a separate pass.
Resolves #357
Resolves #264
Resolves #116
Resolves #312