SEGFAULTs and assertion failures #14

Closed
theHamsta opened this issue Oct 20, 2020 · 8 comments · Fixed by #17

theHamsta commented Oct 20, 2020

We at nvim-treesitter use this parser for editor support in Neovim. We have received reports of segfaults in this parser, and have hit them ourselves: nvim-treesitter/nvim-treesitter#602, neovim/neovim#13122. We would love to use this project, but we can't while it kills the whole editor.

https://files.gitter.im/5506b96e15522ed4b3dd5317/KR5c/after.gif (referencing scanner 253)

One way to debug this might be tree-sitter's libFuzzer support.

# working dir is tree-sitter repo (https://github.com/tree-sitter/tree-sitter)
apt install libfuzzer-12-dev # or other version
ln -s ~/projects/tree-sitter-markdown ./test/fixtures/grammars/markdown
export LIB_FUZZER_PATH=/usr/lib/llvm-12/lib/libFuzzer.a
mkdir out
# next step probably requires CC to be clang, GCC won't work
./script/build-fuzzers
./out/markdown_fuzzer

The python used by the script must be Python 2 (or edit ./script/build-fuzzers, replacing python with python2).

Below is an example of the output produced by this fuzzer.

#226964 NEW    cov: 9367 ft: 43554 corp: 2889/35Kb lim: 53 exec/s: 879 rss: 519Mb L: 52/53 MS: 1 InsertRepeatedBytes-
markdown_fuzzer: test/fixtures/grammars/markdown/src/./tree_sitter_markdown/inline_scan.cc:103: tree_sitter_markdown::Symbol tree_sitter_markdown::scn_inl(tree_sitter_markdown::Lexer &, tree_sitter_markdown::InlineDelimiterList &, tree_sitter_markdown::InlineContextStack &, tree_sitter_markdown::BlockDelimiterList &, tree_sitter_markdown::BlockContextStack &, InlineDelimiterList::Iterator &, const InlineDelimiterList::Iterator &, tree_sitter_markdown::LexedIndex &, const bool): Assertion `blk_dlms.back().sym() == SYM_LIT_LBK' failed.
==189124== ERROR: libFuzzer: deadly signal
    #0 0x52ae41 in __sanitizer_print_stack_trace (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x52ae41)
    #1 0x473758 in fuzzer::PrintStackTrace() (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x473758)
    #2 0x458f53 in fuzzer::Fuzzer::CrashCallback() (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x458f53)
    #3 0x7f2c56085baf  (/lib/x86_64-linux-gnu/libpthread.so.0+0x14baf)
    #4 0x7f2c55e9a8ca in __libc_signal_restore_set signal/../sysdeps/unix/sysv/linux/internal-signals.h:104:3
    #5 0x7f2c55e9a8ca in raise signal/../sysdeps/unix/sysv/linux/raise.c:47:3
    #6 0x7f2c55e7f863 in abort stdlib/abort.c:79:7
    #7 0x7f2c55e7f748 in __assert_fail_base assert/assert.c:92:3
    #8 0x7f2c55e91a95 in __assert_fail assert/assert.c:101:3
    #9 0x5827bc in tree_sitter_markdown::scn_inl(tree_sitter_markdown::Lexer&, tree_sitter_markdown::InlineDelimiterList&, tree_sitter_markdown::InlineContextStack&, tree_sitter_markdown::BlockDelimiterList&, tree_sitter_markdown::BlockContextStack&, std::_List_iterator<tree_sitter_markdown::InlineDelimiter>&, std::_List_iterator<tree_sitter_markdown::InlineDelimiter> const&, unsigned short&, bool) /home/stephan/projects/tree-sitter/test/fixtures/grammars/markdown/src/./tree_sitter_markdown/inline_scan.cc:103:13
    #10 0x5829fa in tree_sitter_markdown::scn_inl(tree_sitter_markdown::Lexer&, tree_sitter_markdown::InlineDelimiterList&, tree_sitter_markdown::InlineContextStack&, tree_sitter_markdown::BlockDelimiterList&, tree_sitter_markdown::BlockContextStack&) /home/stephan/projects/tree-sitter/test/fixtures/grammars/markdown/src/./tree_sitter_markdown/inline_scan.cc:38:10
    #11 0x6097d4 in (anonymous namespace)::Scanner::scan(TSLexer*, bool const*) /home/stephan/projects/tree-sitter/test/fixtures/grammars/markdown/src/scanner.cc:261:18
    #12 0x6084a2 in tree_sitter_markdown_external_scanner_scan /home/stephan/projects/tree-sitter/test/fixtures/grammars/markdown/src/scanner.cc:305:19
    #13 0x64ec30 in ts_parser__lex (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x64ec30)
    #14 0x63c4b6 in ts_parser__advance (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x63c4b6)
    #15 0x6367d1 in ts_parser_parse (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x6367d1)
    #16 0x646fe9 in ts_parser_parse_string_encoding (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x646fe9)
    #17 0x646c13 in ts_parser_parse_string (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x646c13)
    #18 0x555293 in LLVMFuzzerTestOneInput (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x555293)
    #19 0x45a6c1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x45a6c1)
    #20 0x459c0a in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x459c0a)
    #21 0x45b8b7 in fuzzer::Fuzzer::MutateAndTestOne() (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x45b8b7)
    #22 0x45c455 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x45c455)
    #23 0x44a5fb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x44a5fb)
    #24 0x473f32 in main (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x473f32)
    #25 0x7f2c55e81cb1 in __libc_start_main csu/../csu/libc-start.c:314:16
    #26 0x41f56d in _start (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x41f56d)

NOTE: libFuzzer has rudimentary signal handlers.
      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
MS: 2 CrossOver-CopyPart-; base unit: 8c7fd13d377b76f756fea15bdc24d58fa6c383a9
0x2a,0x1,0x3,0x24,0x24,0xa,0x2d,0x3a,0xa,0x3c,0x3c,0x3c,0x2a,0x1,0x3,0x24,0x24,0xa,0x2d,0x3a,0xa,0x3c,0x3c,0x3c,0x2b,0x2d,0x2b,0x2d,
*\x01\x03$$\x0a-:\x0a<<<*\x01\x03$$\x0a-:\x0a<<<+-+-
artifact_prefix='./'; Test unit written to ./crash-d4d36d06d5d8987d8eefc4fa7ea7868479ed6ea7
Base64: KgEDJCQKLToKPDw8KgEDJCQKLToKPDw8Ky0rLQ==

Here is the test input (I can also send you the file). It is probably easier to read from the report above (*\x01\x03$$\x0a-:\x0a<<<*\x01\x03$$\x0a-:\x0a<<<+-+-):

*\x01\x03$$
-:
<<<*\x01\x03$$
-:
<<<+-+-
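
For anyone who wants to recreate this crashing input without the attachment, the Base64 line of the libFuzzer report can be decoded back into the raw bytes. The following is a minimal Python sketch, assuming the Base64 string is copied verbatim from the report; the output file name crash-input.md is a hypothetical choice for this example, not something the fuzzer produces.

```python
import base64
from pathlib import Path

# Base64 payload copied from the "Base64:" line of the libFuzzer report.
CRASH_B64 = "KgEDJCQKLToKPDw8KgEDJCQKLToKPDw8Ky0rLQ=="


def decode_crash_input(b64_text: str) -> bytes:
    """Return the raw bytes of a crash input given its Base64 form."""
    return base64.b64decode(b64_text, validate=True)


def write_crash_input(b64_text: str, path: Path) -> int:
    """Write the decoded input to `path` in binary mode; return the byte count."""
    data = decode_crash_input(b64_text)
    path.write_bytes(data)
    return len(data)


if __name__ == "__main__":
    # "crash-input.md" is a hypothetical file name chosen for this example.
    count = write_crash_input(CRASH_B64, Path("crash-input.md"))
    print(f"wrote {count} bytes")
```

Writing in binary mode matters here because the input contains the control bytes 0x01 and 0x03, which are easily lost when the text is copied from a rendered page. A libFuzzer binary runs any file given as a command-line argument as a single input, so the resulting file should be usable as `./out/markdown_fuzzer crash-input.md`.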

theHamsta commented Oct 20, 2020

Another run:

markdown_fuzzer: test/fixtures/grammars/markdown/src/scanner.cc:246: bool (anonymous namespace)::Scanner::scan(TSLexer *, const bool *): Assertion `blk_dlms_.empty()' failed.
markdown_fuzzer: test/fixtures/grammars/markdown/src/scanner.cc:246: bool (anonymous namespace)::Scanner::scan(TSLexer *, const bool *): Assertion `blk_dlms_.empty()' failed.
==193883== ERROR: libFuzzer: deadly signal
    #0 0x52ae41 in __sanitizer_print_stack_trace (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x52ae41)
    #1 0x473758 in fuzzer::PrintStackTrace() (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x473758)
    #2 0x458f53 in fuzzer::Fuzzer::CrashCallback() (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x458f53)
    #3 0x7ff0c422ebaf  (/lib/x86_64-linux-gnu/libpthread.so.0+0x14baf)
    #4 0x7ff0c40438ca in __libc_signal_restore_set signal/../sysdeps/unix/sysv/linux/internal-signals.h:104:3
    #5 0x7ff0c40438ca in raise signal/../sysdeps/unix/sysv/linux/raise.c:47:3
    #6 0x7ff0c4028863 in abort stdlib/abort.c:79:7
    #7 0x7ff0c4028748 in __assert_fail_base assert/assert.c:92:3
    #8 0x7ff0c403aa95 in __assert_fail assert/assert.c:101:3
    #9 0x60b885 in (anonymous namespace)::Scanner::scan(TSLexer*, bool const*) /home/stephan/projects/tree-sitter/test/fixtures/grammars/markdown/src/scanner.cc:246:7
    #10 0x6084a2 in tree_sitter_markdown_external_scanner_scan /home/stephan/projects/tree-sitter/test/fixtures/grammars/markdown/src/scanner.cc:305:19
    #11 0x64ec30 in ts_parser__lex (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x64ec30)
    #12 0x63c4b6 in ts_parser__advance (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x63c4b6)
    #13 0x6367d1 in ts_parser_parse (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x6367d1)
    #14 0x646fe9 in ts_parser_parse_string_encoding (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x646fe9)
    #15 0x646c13 in ts_parser_parse_string (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x646c13)
    #16 0x555293 in LLVMFuzzerTestOneInput (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x555293)
    #17 0x45a6c1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x45a6c1)
    #18 0x459c0a in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x459c0a)
    #19 0x45b8b7 in fuzzer::Fuzzer::MutateAndTestOne() (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x45b8b7)
    #20 0x45c455 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x45c455)
    #21 0x44a5fb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x44a5fb)
    #22 0x473f32 in main (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x473f32)
    #23 0x7ff0c402acb1 in __libc_start_main csu/../csu/libc-start.c:314:16
    #24 0x41f56d in _start (/home/stephan/projects/tree-sitter/out/markdown_fuzzer+0x41f56d)

NOTE: libFuzzer has rudimentary signal handlers.
      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
MS: 1 CopyPart-; base unit: 14a98c5140ffd723ccb37a584434183c286457f8
0x2a,0x2e,0xa,0x2a,0x2e,0xa,0x4a,0x4a,
*.\x0a*.\x0aJJ
artifact_prefix='./'; Test unit written to ./crash-e7db3f2d23a9531e16457395f6ee00419d7743ad
Base64: Ki4KKi4KSko=
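
The two runs in this thread stop at different assertions (inline_scan.cc:103 and scanner.cc:246), so when collecting many fuzzer logs it can help to group crashes by assertion site. Below is a rough Python sketch that parses the assertion-failure line as it appears in these logs. The regular expression is an assumption based on the line shape shown in this thread, and the names `AssertionSite`, `parse_assertion` and `unique_sites` are hypothetical helpers, not part of libFuzzer or tree-sitter.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Line shape seen in the logs in this thread:
#   program: file:line: function: Assertion `expr' failed.
ASSERT_RE = re.compile(
    r"^(?P<program>[^:]+): (?P<file>[^:]+):(?P<line>\d+): "
    r"(?P<function>.*): Assertion `(?P<expr>.*)' failed\.$"
)


class AssertionSite(NamedTuple):
    file: str
    line: int
    expr: str


def parse_assertion(log_line: str) -> Optional[AssertionSite]:
    """Extract file, line and expression from one assertion-failure line."""
    match = ASSERT_RE.match(log_line.strip())
    if match is None:
        return None
    return AssertionSite(match["file"], int(match["line"]), match["expr"])


def unique_sites(log_lines: Iterable[str]) -> List[AssertionSite]:
    """Collect distinct assertion sites, preserving first-seen order."""
    seen = {}
    for line in log_lines:
        site = parse_assertion(line)
        if site is not None:
            seen.setdefault(site, None)
    return list(seen)
```

Feeding the lines of a saved fuzzer log to `unique_sites` collapses repeated messages, such as the doubled assertion line in the second run, into a single entry.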

dlukes added a commit to dlukes/dotfiles that referenced this issue Oct 20, 2020
Indentation support has landed! The definitions for Python don't exist
yet, but I'm still getting rid of the separate Python indentation plugin
to remind myself that I should check. Also, since I'm using Black for
auto-formatting, the ugly default indentation is less of an issue now.

The Treesitter playground is a really nice way to visualize the parsed
tree for a given buffer. It will teach me about parsing and help if I
ever need to create my own Treesitter queries.

Basic Markdown support is finally available, but I should probably still
wait for languagetree to settle down. Also, the Markdown parser
currently seems to be responsible for a crash:

- nvim-treesitter/nvim-treesitter#602.
- ikatyang/tree-sitter-markdown#14

So not getting rid of the Markdown syntax plugin just yet, and disabling
the markdown parser for now.

And last but not least, using the new ensure_installed option to install
the parsers.
@ikatyang ikatyang added the bug Something isn't working label Oct 21, 2020
@ikatyang ikatyang self-assigned this Oct 21, 2020

vigoux commented Nov 17, 2020

Adding my two cents: using AFL, I ended up with the following segfault, reproducible by running tree-sitter parse min.md on a file containing:

[](0 ()

vigoux commented Nov 17, 2020

@theHamsta, I found an even smaller repro file for the assertion error:

0
-:
*0
0
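
A small repro like this can be replayed in a child process so that a crash does not take down the shell or editor that started it. The following is only a sketch in Python: it assumes the file contents are exactly the four lines shown above with a trailing newline, that the tree-sitter CLI is on PATH, and that it is run from a directory where the CLI can find the Markdown grammar. The names `describe_exit` and `run_repro` are hypothetical helpers.

```python
import shutil
import signal
import subprocess
from pathlib import Path

# Assumed contents of the repro file: the four lines from the comment above,
# each terminated by a newline (the exact trailing bytes are an assumption).
REPRO_TEXT = "0\n-:\n*0\n0\n"


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable verdict."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_repro(text: str, path: Path) -> str:
    """Write `text` to `path`, run `tree-sitter parse` on it, report the outcome."""
    # Write bytes directly so no newline translation changes the input.
    path.write_bytes(text.encode("utf-8"))
    cli = shutil.which("tree-sitter")
    if cli is None:
        return "tree-sitter CLI not found on PATH"
    result = subprocess.run([cli, "parse", str(path)], capture_output=True)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # "min.md" follows the file name used earlier in this thread.
    print(run_repro(REPRO_TEXT, Path("min.md")))
```

A failed assertion would typically show up as SIGABRT and a segfault as SIGSEGV, which makes it easy to tell the two kinds of reports in this issue apart.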

ikatyang commented Jan 1, 2021

Hi, sorry for the late response. I've managed to fix all the failing cases provided in this issue. The PR (#17) is still WIP because libFuzzer still finds some errors; I'll merge it once I can't find any further errors.

vigoux commented Jan 25, 2021

Still not fixed. Here are two more inputs that cause a crash: faults.tar.gz

ikatyang commented

Some potential mitigations: nvim-treesitter/nvim-treesitter#872 (comment)

razzeee commented Mar 6, 2021

@vigoux, could you share how you set up AFL? What would I need to run?

ikatyang commented

The mitigation (#29) has been released, and the crashing cases mentioned by @vigoux have been moved to #30. The previously crashing cases are now crash-free, at the cost of the parsed tree being inaccurate. The parsed tree should become accurate again once typing has finished, provided the assumption that the crashes are caused by unfinished typing is correct. Let me know if there's any issue. Thanks, and sorry for the inconvenience.
