An NFA-based regular expression engine implemented in Motoko.
You can install the regex engine using MOPS:
mops add regex
Then import it in your Motoko code:
import Regex "mo:regex";
For full documentation, visit Motoko Regex Engine Docs.
The following diagram illustrates the core components and data flow of the regex engine:
graph TD
subgraph Input
input[TEXT INPUT]
cursor[EXTERNAL CURSOR]
end
subgraph Engine Core
lexer[LEXER]
parser[PARSER]
compiler[COMPILER]
matcher[MATCHER]
api[REGEX API]
end
subgraph Outputs
tokens[Tokens]
ast[AST]
nfa[NFA]
result[MATCH RESULT]
end
input --> cursor
cursor --> lexer
lexer --> tokens
tokens --> parser
parser --> ast
ast --> compiler
compiler --> nfa
nfa --> matcher
matcher --> result
api --> lexer
api --> parser
api --> compiler
api --> matcher
- Input Processing:
  - Text Input: Raw regular expression string
  - External Cursor: Character-by-character stream processor
- Core Components:
  - Lexer: Tokenizes the input stream into meaningful regex components
  - Parser: Builds an Abstract Syntax Tree (AST) from tokens
  - Compiler: Transforms the AST into a Non-deterministic Finite Automaton (NFA)
  - Matcher: Executes pattern matching using the compiled NFA
- Intermediate Outputs:
  - Tokens: Lexical units of the regex pattern
  - AST: Tree representation of the pattern structure
  - NFA: State machine for pattern matching
  - Match Result: Final output indicating match success/failure and captures
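To make the matcher stage more concrete, here is a minimal sketch of NFA simulation in Motoko. It is not the engine's actual code: the `Transition` and `NFA` types, the `closure`, `step`, and `matches` functions, and the hand-built automaton for `ab*c` are all hypothetical names chosen for illustration, and the sketch only checks whole-input matches without captures.

```motoko
import Array "mo:base/Array";
import Debug "mo:base/Debug";

// Hypothetical types for illustration; not the library's own definitions.
type State = Nat;

type Transition = {
  source : State;
  symbol : ?Char; // null marks an epsilon transition
  target : State;
};

type NFA = {
  size : Nat; // number of states, numbered 0 .. size - 1
  start : State;
  accept : State;
  transitions : [Transition];
};

// Hand-built NFA for the pattern "ab*c" (assumed example, not compiler output).
let example : NFA = {
  size = 4;
  start = 0;
  accept = 3;
  transitions = [
    { source = 0; symbol = ?'a'; target = 1 },
    { source = 1; symbol = ?'b'; target = 1 },
    { source = 1; symbol = ?'c'; target = 2 },
    { source = 2; symbol = null; target = 3 }
  ];
};

// Extend `active` in place with every state reachable via epsilon transitions.
func closure(nfa : NFA, active : [var Bool]) {
  var changed = true;
  while (changed) {
    changed := false;
    for (t in nfa.transitions.vals()) {
      switch (t.symbol) {
        case null {
          if (active[t.source] and not active[t.target]) {
            active[t.target] := true;
            changed := true;
          };
        };
        case (?_) {};
      };
    };
  };
};

// Consume one character: follow every matching transition from the active set.
func step(nfa : NFA, active : [var Bool], c : Char) : [var Bool] {
  let next = Array.init<Bool>(nfa.size, false);
  for (t in nfa.transitions.vals()) {
    switch (t.symbol) {
      case (?s) {
        if (s == c and active[t.source]) { next[t.target] := true };
      };
      case null {};
    };
  };
  closure(nfa, next);
  next
};

// True when the whole input drives the NFA from its start to its accept state.
func matches(nfa : NFA, input : Text) : Bool {
  var active = Array.init<Bool>(nfa.size, false);
  active[nfa.start] := true;
  closure(nfa, active);
  for (c in input.chars()) {
    active := step(nfa, active, c);
  };
  active[nfa.accept]
};

Debug.print(debug_show (matches(example, "abbbc"))); // true
Debug.print(debug_show (matches(example, "abx"))); // false
```

Tracking a set of active states, rather than backtracking over one path at a time, keeps the work per input character bounded by the number of transitions. That is the usual reason to choose an NFA simulation for matching.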
This project was developed with the support of a developer grant from the DFINITY Foundation.
Your feedback is invaluable in improving this and future projects. Feel free to share your thoughts and suggestions through issues or discussions.
If you find this project valuable and would like to support my work on this and other open-source initiatives, you can send ICP donations to:
8c4ebbad19bf519e1906578f820ca4f6732ceecc1d5396e5a5713046dca251c1