Rust: Add generated models for standard libraries including core #18787
PR Overview
This PR adds generated models for standard Rust libraries (core, std, alloc, and proc_macro) and updates the associated tests. The key changes are:
- Updating a test annotation in dataflow/strings/main.rs from "hasTaintFlow" to "hasValueFlow".
- Removing several taint and value model entries in lang-core.model.yml to better scale with the new models.
Changes
File | Description |
---|---|
rust/ql/test/library-tests/dataflow/strings/main.rs | Updated test comment annotation to reflect new model expectations. |
rust/ql/lib/codeql/rust/frameworks/stdlib/lang-core.model.yml | Removed several model entries to adjust for increased model volume. |
Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
rust/ql/test/library-tests/dataflow/strings/main.rs:53
- Please ensure that changing the annotation from 'hasTaintFlow' to 'hasValueFlow' aligns with the updated test expectations and model semantics.
sink(s2); // $ hasValueFlow=36
rust/ql/lib/codeql/rust/frameworks/stdlib/lang-core.model.yml:7
- Review the removal of the model entry for 'crate::hint::must_use' to ensure that tests or taint propagation flows are still adequately covered.
- - ["lang:core", "crate::hint::must_use", "Argument[0]", "ReturnValue", "value", "manual"]
input = "Argument[self]" and
output = "ReturnValue" and
preservesValue = true and
model = "generated"
Previously this had a model of `""` and it seemed to be disabled/overwritten by the generated models. The generated models include a model for `clone` on `i64`, which caused the test for this method to fail. Changing the model to `generated` or `manual` fixed the problem. I just went with `generated` without worrying too much as this is temporary anyway.
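As a minimal illustration of what a value-preserving summary from `Argument[self]` to `ReturnValue` describes, the sketch below shows `clone` on an `i64` returning exactly the receiver's value, next to a derived value for contrast. The helper names `value_step` and `derived_step` are hypothetical and are not part of this PR or its tests.

```rust
/// Hypothetical helper: the result is exactly the receiver's value,
/// which is the situation a "value" summary (preservesValue = true) describes.
fn value_step(x: &i64) -> i64 {
    x.clone()
}

/// Hypothetical helper: the result is derived from the input but is not
/// the same value, which is the kind of step a "taint" summary describes.
fn derived_step(x: &i64) -> String {
    format!("id-{}", x)
}

fn main() {
    let source: i64 = 36;
    // Same value comes back out of `clone`.
    assert_eq!(value_step(&source), 36);
    // The formatted string depends on the input but is a different value.
    assert_eq!(derived_step(&source), "id-36");
}
```

Under this reading, a `clone` summary marked as value-preserving would be expected to yield value flow rather than only taint flow in a test.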
@@ -11,6 +11,8 @@ private module Input implements InputSig<Location, RustDataFlow> {
not exists(n.asExpr().getLocation())
}

predicate postWithInFlowExclude(RustDataFlow::Node n) { n instanceof Node::FlowSummaryNode }
This fixes some data flow inconsistencies otherwise introduced by the new models. Ruby and C# have the same, so I think this is appropriate.
@@ -14,7 +14,7 @@
| Macro calls - resolved | 2 |
| Macro calls - total | 2 |
| Macro calls - unresolved | 0 |
-| Taint edges - number of edges | 4 |
+| Taint edges - number of edges | 1465 |
A 366x increase in taint edges 📈 😃
Sounds great, though I wonder what they all are. I'm assuming `hello-project` is pretty basic.
DCA shows taint reach going down by 1 on the iced project. That's unexpected, but in the tests things look good, so I don't think there's much to worry about.
Those tests had started to irritate me even before we started adding generated models. Thanks for cleaning them up. 👍
This is very minor, but surprising enough that it might be worth investigating. If you download the database from DCA, you could try to narrow down which taint edges we had before the changes here but not afterwards.
Looks really good, a few points to discuss, and I should really review a few more of the models (at random)...
- ["lang:core", "<crate::result::Result>::unwrap_or", "Argument[self].Field[crate::result::Result::Ok(0)]", "ReturnValue", "value", "dfc-generated"]
- ["lang:core", "<crate::result::Result>::unwrap_or_default", "Argument[self].Field[crate::result::Result::Ok(0)]", "ReturnValue", "value", "dfc-generated"]
- ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[0].ReturnValue", "ReturnValue", "value", "dfc-generated"]
- ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[self].Field[crate::result::Result::Err(0)].Reference", "ReturnValue", "value", "dfc-generated"]
I don't see why this is true (the described method is here). Though (assuming I'm right) I doubt the model will do much harm anyway.
Nicely spotted! That model is indeed odd, both because the error value is not directly returned and because there are no references involved. The latter might be due to some mistakenly inserted reference read step.
In any case, the implementation is very simple, so I would expect the model to be accurate. I've created an internal issue for me to fix this.
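To make the flows under discussion concrete, here is a minimal sketch that mirrors the documented behaviour of `Result::unwrap_or_else`. The function `unwrap_or_else_sketch` is a hypothetical re-implementation written for illustration, not the standard library source. It suggests why a direct flow from the `Err` payload to the return value looks wrong, while the flow from the `Err` payload into the closure's parameter and from the closure's result to the return value looks right.

```rust
/// Hypothetical re-implementation mirroring the documented behaviour of
/// `Result::unwrap_or_else`; this is not the standard library source.
fn unwrap_or_else_sketch<T, E, F>(res: Result<T, E>, op: F) -> T
where
    F: FnOnce(E) -> T,
{
    match res {
        // The `Ok` payload is returned unchanged.
        Ok(value) => value,
        // The `Err` payload is handed to the closure (Argument[0].Parameter[0]);
        // whatever the closure returns (Argument[0].ReturnValue) is returned.
        Err(err) => op(err),
    }
}

fn main() {
    let ok: Result<usize, &str> = Ok(7);
    assert_eq!(unwrap_or_else_sketch(ok, |e| e.len()), 7);

    let err: Result<usize, &str> = Err("boom");
    // The error reaches the caller only through the closure's result.
    assert_eq!(unwrap_or_else_sketch(err, |e| e.len()), 4);

    // The standard library method gives the same result on this input.
    let err: Result<usize, &str> = Err("boom");
    assert_eq!(err.unwrap_or_else(|e| e.len()), 4);
}
```

In this sketch no reference is taken anywhere, which is consistent with the observation that the `.Reference` component in the generated model is unexpected.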
- ["lang:core", "<crate::result::Result>::unwrap_or_default", "Argument[self].Field[crate::result::Result::Ok(0)]", "ReturnValue", "value", "dfc-generated"]
- ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[0].ReturnValue", "ReturnValue", "value", "dfc-generated"]
- ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[self].Field[crate::result::Result::Err(0)].Reference", "ReturnValue", "value", "dfc-generated"]
- ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[self].Field[crate::result::Result::Err(0)]", "Argument[0].Parameter[0]", "value", "dfc-generated"]
On the other hand this model is perfect and I missed it in the manual models. ✨
Spotting mistakes like the one in |
Yep, I agree we should merge this ASAP, but continue discussions about possible follow-up improvements.
I think I just created some merge conflicts by merging #18701; let me know if you need any help untangling what happened there (I expect mostly it's those `.expected` files that change too often).
pack: codeql/rust-all
extensible: summaryModel
data:
- ["lang:std", "<&[u8] as crate::io::BufRead>::consume", "Argument[self].Element", "Argument[self].Reference.Reference", "value", "dfc-generated"]
This is another weird edge involving references. Based on the description I don't think `consume` should have any taint flows.
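For reference, a small sketch of what `consume` does on a `&[u8]` reader: it returns `()` and only advances the slice past the consumed bytes. The helper `remaining_after_consume` is hypothetical and exists only to make the effect observable.

```rust
use std::io::BufRead;

/// Hypothetical helper: advance a byte-slice reader by `amt` bytes and
/// return what is left. `consume` itself returns `()`.
fn remaining_after_consume(data: &[u8], amt: usize) -> &[u8] {
    let mut reader: &[u8] = data;
    // Marks `amt` bytes as read; for `&[u8]` this moves the slice start forward.
    reader.consume(amt);
    reader
}

fn main() {
    let data = b"hello world";
    // After consuming "hello " only "world" remains visible to the reader.
    assert_eq!(remaining_after_consume(data, 6), &b"world"[..]);
}
```

No data is returned from the call itself. Whether the shrinking of the receiver should count as a flow from its elements back into itself is the modelling question raised above.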
This adds generated models for some of the standard Rust libraries: `core`, `std`, `alloc`, and `proc_macro`.
We had some tests whose `.expected` output grew with the number of models or with the taint steps caused by models. That didn't scale well to the new number of models, so I've tweaked those tests.