Fuzzing reveals a number of parse errors #568
Another such error: clusterfuzz-testcase-minimized-bs4_fuzzer-6401239223762944 Markup: Same
This one is different from the rest: Markup: Raises an
I'm the lead developer of Beautiful Soup, which has html5lib as an optional dependency. Over the past couple of years I've gotten a number of notifications from Google's oss-fuzz project about unhandled exceptions that actually turned out to be problems in html5lib. There wasn't much I could do with these errors, but now that it looks like html5lib maintenance is picking up, I can pass them on to you. (Sorry. 😿)
I've incorporated the fuzz reports into the Beautiful Soup test suite, and the test cases themselves are here, but here's a general picture of what problems I see. In each case, I believe just parsing the bad markup is enough to trigger the error.
clusterfuzz-testcase-minimized-bs4_fuzzer-4999465949331456
Markup:
b')<a><math><TR><a><mI><a><p><a>'
Error:
clusterfuzz-testcase-minimized-bs4_fuzzer-5843991618256896
Markup:
b'-<math><sElect><mi><sElect><sElect>'
Error:
clusterfuzz-testcase-minimized-bs4_fuzzer-6241471367348224
Markup:
b'ñ<table><svg><html>'
Error:
clusterfuzz-testcase-minimized-bs4_fuzzer-6600557255327744
Markup:
b'\t<TABLE><<!>;<!><<!>.<lec><th>i><a><mat\x00\x01<mi\x00a><math>><th><mI>chardeta\xff\xff\xff\xff<><th><mI><||||||||A<select><>qu?\xbemath><th><mie>qu'
Error:
I was also recently sent a report of the problem that has already been filed here as issue #557.
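For anyone who wants to check these quickly, here is a minimal sketch of a reproduction harness. It assumes that handing the raw bytes straight to a parse function is enough to trigger each failure, as suggested above. The names `TEST_CASES`, `try_parse`, `run_all`, and `reproduce_with_html5lib` are hypothetical helpers made up for illustration; they are not part of html5lib or Beautiful Soup. The `ñ` test case is encoded as UTF-8 here, which is an assumption, since the original byte sequence is not shown.

```python
# Markup copied from the fuzz reports listed above, keyed by the numeric
# suffix of each clusterfuzz test-case name.
TEST_CASES = {
    "4999465949331456": b")<a><math><TR><a><mI><a><p><a>",
    "5843991618256896": b"-<math><sElect><mi><sElect><sElect>",
    # Assumption: the leading character is UTF-8 encoded in the real test case.
    "6241471367348224": "ñ<table><svg><html>".encode("utf-8"),
    "6600557255327744": (
        b"\t<TABLE><<!>;<!><<!>.<lec><th>i><a><mat\x00\x01<mi\x00a><math>>"
        b"<th><mI>chardeta\xff\xff\xff\xff<><th><mI><||||||||A<select><>"
        b"qu?\xbemath><th><mie>qu"
    ),
}


def try_parse(parse, markup):
    """Call parse(markup); return the exception it raises, or None on success."""
    try:
        parse(markup)
    except Exception as exc:  # deliberately broad: any unhandled error is a finding
        return exc
    return None


def run_all(parse, cases=TEST_CASES):
    """Run every test case through parse and map each name to its outcome."""
    return {name: try_parse(parse, markup) for name, markup in cases.items()}


def reproduce_with_html5lib():
    """Print one line per test case, using html5lib's top-level parse function."""
    import html5lib  # third-party; imported lazily so the helpers above stay stdlib-only

    for name, exc in run_all(html5lib.parse).items():
        status = "ok" if exc is None else f"{type(exc).__name__}: {exc}"
        print(f"{name}: {status}")
```

Calling `reproduce_with_html5lib()` with html5lib installed prints the exception type for each case that still fails. Passing a different callable to `run_all`, for example `lambda m: BeautifulSoup(m, "html5lib")`, would exercise the same markup through Beautiful Soup instead.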