Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Improve and promote cpp/overflow-buffer #18837

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Feb 21, 2025

Improve and promote "Call to memory access function may overflow buffer" (cpp/overflow-buffer).

  • fix an issue where references to array expressions inside offsetof expressions were being misinterpreted as accesses.
  • promote the query onto security-extended (by increasing the severity to warning and setting the precision to medium).
  • added some test cases along the way, inspired by real world results / FPs found using MRVA.
    • I actually found three classes of false positives in the real world results, but only decided to fix one of them. For the other two, I believe it will be more challenging to determine which are TP vs FP cases. If we want to promote the query to high precision (code scanning suite) at some point, we will probably need to do something with these.

@geoffw0 geoffw0 added the C++ label Feb 21, 2025
@Copilot Copilot bot review requested due to automatic review settings February 21, 2025 19:10
@geoffw0 geoffw0 requested a review from a team as a code owner February 21, 2025 19:10

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Overview

This PR improves and promotes the "Call to memory access function may overflow buffer" query by fixing its handling of array expressions within offsetof and adjusting its severity and precision for the security-extended suite.

  • Updated change note in cpp/ql/src documenting the security-extended promotion of cpp/overflow-buffer.
  • Added a change note in cpp/ql/lib for fixing the getBufferSize predicate issue.

Reviewed Changes

File Description
cpp/ql/src/change-notes/2025-02-20-overflow-buffer.md Added change note to promote cpp/overflow-buffer to security-extended.
cpp/ql/lib/change-notes/2025-02-20-getbuffersize.md Documented the fix for getBufferSize misinterpreting array expressions.

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Tip: Copilot only keeps its highest confidence comments to reduce noise and keep you focused. Learn more

@jketema
Copy link
Contributor

jketema commented Feb 24, 2025

Did you look at the results flagged up by DCA and their quality?

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 24, 2025

DCA

  • failed due to WebKit-JavaScriptCore; it shouldn't have been run, but probably my DCA was still the build from https://github.com/github/codeql-dca/pull/2418 so it did run. Irrelevant to this PR.
  • 1164 results on dsp-testing__samate-c-cpp
    • sample were all in the bad test cases, i.e. these results look great!
  • a light scatter of results on other projects
    • most of which are of the "worth checking what's going on here" rather than "definitely wrong" type; accessing offset pointers using negative indices (update: looking again, at least some of these are TP) and clearing consecutive elements of a struct (update: on closer inspection the latter results look to be doing something unsafe after all) being common patterns. This is probably OK as we're only targetting medium precision and there isn't a huge amount of noise - but I'm going to add a couple more test cases and see if I can narrow down results a little more.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 24, 2025

I think this is ready to merge, but I'd really appreciate a second opinion on at least some of the real world results.

@jketema
Copy link
Contributor

jketema commented Feb 24, 2025

This is what I'm seeing:

  • neovim: likely a FP, but hard to tell because there's quite a lot of pointer manipulation going on
  • PHP: These look like FPs. One of the patterns here is memset(p->free_map, 0, sizeof(p->free_map) + sizeof(p->map)) Are we handling sizeof correctly (see also below)?
  • OTP: The alert in erl_bif_info.c looks like a TP. The hashing ones are difficult to follow so not sure.
  • TinyUSB: These all look like FPs of the form memset(arr, 0, sizeof(arr)). Are we handling sizeof correctly?

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 25, 2025

Thanks for taking a look.

PHP: These look like FPs. One of the patterns here is memset(p->free_map, 0, sizeof(p->free_map) + sizeof(p->map)) Are we handling sizeof correctly (see also below)?

I've just pushed a change that fixes this FP result in PHP (amongst others), and the cost of losing some TPs as well. I think this is a good tradeoff given the query is a bit noisy, and one we could improve in future by refining the existing getRootType predicate in Buffer.qll to cope with a wider range of things.

OTP: The alert in erl_bif_info.c looks like a TP. The hashing ones are difficult to follow so not sure.

The hashing ones are also eliminated by the commit I just pushed (the clear TP result remains).

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 25, 2025

TinyUSB: These all look like FPs of the form memset(arr, 0, sizeof(arr)). Are we handling sizeof correctly?

This was caused by a variable with multiple types in the database. Probably shouldn't happen, but I've hardened the query to this.

I should do another DCA run now with all the changes... [started]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants