regex-based POC #217

Merged
merged 1 commit into from
Oct 8, 2024


masklinn
Contributor

Uses the pyo3-based ua-parser/uap-rust#3

Fixes #166

@masklinn masklinn mentioned this pull request Oct 8, 2024
@masklinn masklinn added this to the 1.0 milestone Oct 8, 2024
@masklinn masklinn merged commit 46778e7 into ua-parser:master Oct 8, 2024
23 checks passed
@masklinn masklinn deleted the regex branch October 8, 2024 18:06
@masklinn
Contributor Author

masklinn commented Oct 8, 2024

Forgot to put it in the commit message, but the regex-based parser is generally a bit faster than re2 (it loses by a hair on pgts, but is about 80% faster on test_ua and about 30% faster on test_device):

0.89s call     tests/test_core.py::test_devices[test_device-re2]
0.68s call     tests/test_core.py::test_devices[test_device-regex]
0.56s call     tests/test_core.py::test_ua[pgts_browser_list-regex]
0.54s call     tests/test_core.py::test_ua[pgts_browser_list-re2]
0.11s call     tests/test_core.py::test_ua[test_ua-re2]
0.06s call     tests/test_core.py::test_ua[test_ua-regex]
0.02s call     tests/test_core.py::test_os[test_os-re2]
0.02s call     tests/test_core.py::test_os[test_os-regex]
0.01s call     tests/test_core.py::test_ua[firefox_user_agent_strings-regex]
0.01s call     tests/test_core.py::test_ua[firefox_user_agent_strings-re2]
0.00s call     tests/test_core.py::test_os[additional_os_tests-re2]
0.00s call     tests/test_core.py::test_os[additional_os_tests-regex]
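
For reference, the percentages quoted above can be re-derived from the pasted timings. This is only a minimal sketch: it assumes the lines are pytest duration-report output in the shape shown, and the `speedups` helper and `LINE` pattern are illustrative names, not part of ua-parser.

```python
import re
from collections import defaultdict

# Timings copied from the CPython run above (only the rows that differ measurably).
DURATIONS = """\
0.89s call     tests/test_core.py::test_devices[test_device-re2]
0.68s call     tests/test_core.py::test_devices[test_device-regex]
0.56s call     tests/test_core.py::test_ua[pgts_browser_list-regex]
0.54s call     tests/test_core.py::test_ua[pgts_browser_list-re2]
0.11s call     tests/test_core.py::test_ua[test_ua-re2]
0.06s call     tests/test_core.py::test_ua[test_ua-regex]
"""

# "<seconds>s call <test id>[<dataset>-<parser>]"
LINE = re.compile(r"^(?P<secs>\d+\.\d+)s call\s+\S+\[(?P<dataset>.+)-(?P<parser>\w+)\]$")


def speedups(text, baseline, candidate):
    """Return {dataset: baseline_time / candidate_time} (hypothetical helper)."""
    times = defaultdict(dict)
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            times[m["dataset"]][m["parser"]] = float(m["secs"])
    return {
        dataset: t[baseline] / t[candidate]
        for dataset, t in times.items()
        if baseline in t and candidate in t and t[candidate] > 0
    }


ratios = speedups(DURATIONS, baseline="re2", candidate="regex")
for dataset, ratio in ratios.items():
    print(f"{dataset}: re2 time / regex time = {ratio:.2f}")
```

A ratio above 1 means regex was faster. These are single-run timings with 10ms resolution, so the smaller rows are too coarse to say much.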

The main draw is that it is compatible with pypy and graal (even though it's a cpyext), both of which have excruciatingly slow pure-python performance on the uap tests. Here's pypy:

18.55s call     tests/test_core.py::test_devices[test_device-basic]
10.94s call     tests/test_core.py::test_ua[pgts_browser_list-basic]
1.25s call     tests/test_core.py::test_ua[test_ua-basic]
0.96s call     tests/test_core.py::test_devices[test_device-regex]
0.54s call     tests/test_core.py::test_ua[pgts_browser_list-regex]
0.26s call     tests/test_core.py::test_ua[firefox_user_agent_strings-basic]
0.18s call     tests/test_core.py::test_os[test_os-basic]
0.12s call     tests/test_core.py::test_ua[test_ua-regex]
0.03s call     tests/test_core.py::test_os[test_os-regex]
0.01s call     tests/test_core.py::test_os[additional_os_tests-basic]
0.01s call     tests/test_core.py::test_ua[firefox_user_agent_strings-regex]

So not only does it deliver "native" performance there, the gain over the pure-python parser is even bigger than what the re2 parser provides on cpython.
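
To put rough numbers on the pypy gap, the basic/regex ratios can be computed from the rows above. This is a back-of-the-envelope sketch: the dictionary is hand-copied from the pypy output, and the names `PYPY_TIMINGS` and `pypy_gain` are made up for illustration.

```python
# (basic_seconds, regex_seconds) copied from the pypy run above.
# Rows at or near the 0.01s resolution floor are left out as too coarse.
PYPY_TIMINGS = {
    "test_device": (18.55, 0.96),
    "pgts_browser_list": (10.94, 0.54),
    "test_ua": (1.25, 0.12),
    "test_os": (0.18, 0.03),
}

# How many times longer the pure-python "basic" parser took than "regex".
pypy_gain = {name: basic / regex for name, (basic, regex) in PYPY_TIMINGS.items()}

for name, gain in sorted(pypy_gain.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} basic/regex = {gain:5.1f}x")
```

On these single-run figures the regex parser comes out roughly 6x to 20x faster than basic under pypy.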

Successfully merging this pull request may close these issues.

regex-based parser