Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite regex to work with arbitrary subgroup extraction #28

Open
Divide-By-0 opened this issue Apr 27, 2023 · 1 comment
Open

Rewrite regex to work with arbitrary subgroup extraction #28

Divide-By-0 opened this issue Apr 27, 2023 · 1 comment
Labels
enhancement New feature or request low

Comments

@Divide-By-0
Copy link
Member

Jern has a good suggestion for improvement:
So, the demo I and Anka built is based on the assumption that we try to match the subgroup from the minimized DFA (which is unique to each regex) However, since we label tag after it’s minimized, the tag is not 1-1 and can cause funny behavior in some cases.

Hence, to be more general, we need to start appending tag since NFA states. Now, if we literally use NFA with tags and put it in circom, it is fine tracking states, we don’t need registers. But it will take expo time due to NFA nature when extracting the tag. To collapse NFA to DFA and tracking the tag, most papers use register because it can collapse to a very succinct state machine since now you don’t need to care what index that the tag gets assigned to, you just keep the register number, and assign the index to it as the Algo reads the input.

That being said, the tagged DFA is definitely overkilled since it is literally tagged between any alphabet in regex, not the subgroup. So, I’m weakening the assumption and care only the subgroup extracting with () stuffs, and make the version using register.

At first, I thought register in circom shouldn’t be bad (and if we do that, it can easily be applied to almost all versions of tagged stuffs), but it doesn’t seem so, so I will try modify it to be ok without it, so we can stick to the transition version that u and Sora already built (I think it should work using table stuffs) but either way, the naive minimized DFA will never work when we want subgroup extraction.

Edit for https://github.com/zk-email-verify/zk-regex as well.

@Divide-By-0 Divide-By-0 transferred this issue from zkemail/zk-email-verify Oct 30, 2023
@Divide-By-0
Copy link
Member Author

This is done with a frontend at https://github.com/JernKunpittaya/full_zk_regex

Unfortunately it doesn't work all the time, but hopefully can be improved!

@Divide-By-0 Divide-By-0 added enhancement New feature or request low labels Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request low
Projects
Status: In Progress
Development

No branches or pull requests

1 participant