Rust: Add generated models for standard libraries including core #18787

paldepind · 2025-02-14T13:47:34Z

This adds generated models for some of the standard Rust libraries: core, std, alloc, and proc_macro.

We had some tests whose .expected output grew with the number of models or taint steps caused by models. That didn't scale well to the new number of models, so I've tweaked those tests.

PR Overview

This PR adds generated models for standard Rust libraries (core, std, alloc, and proc_macro) and updates the associated tests. The key changes are:

Updating a test annotation in dataflow/strings/main.rs from "hasTaintFlow" to "hasValueFlow".
Removing several taint and value model entries in lang-core.model.yml to better scale with the new models.

Changes

File	Description
rust/ql/test/library-tests/dataflow/strings/main.rs	Updated test comment annotation to reflect new model expectations.
rust/ql/lib/codeql/rust/frameworks/stdlib/lang-core.model.yml	Removed several model entries to adjust for increased model volume.

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

rust/ql/test/library-tests/dataflow/strings/main.rs:53

Please ensure that changing the annotation from 'hasTaintFlow' to 'hasValueFlow' aligns with the updated test expectations and model semantics.

sink(s2); // $ hasValueFlow=36

rust/ql/lib/codeql/rust/frameworks/stdlib/lang-core.model.yml:7

Review the removal of the model entry for 'crate::hint::must_use' to ensure that tests or taint propagation flows are still adequately covered.

-      - ["lang:core", "crate::hint::must_use", "Argument[0]", "ReturnValue", "value", "manual"]


paldepind · 2025-02-17T10:32:52Z

rust/ql/lib/codeql/rust/frameworks/stdlib/Clone.qll

+    input = "Argument[self]" and
+    output = "ReturnValue" and
+    preservesValue = true and
+    model = "generated"


Previously this had a model of "" and it seemed to be disabled/overwritten by the generated models. The generated models include a model for clone on i64, which caused the test for this method to fail. Changing the model to generated or manual fixed the problem. I just went with generated without worrying too much as this is temporary anyway.
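
For context, a minimal Rust sketch of the kind of test this summary affects is shown below. The `source` and `sink` helpers are hypothetical stand-ins written for this example (only the `sink(...)` call and the `hasValueFlow` annotation style appear in the thread), so the sketch illustrates the value-preserving `Argument[self]` to `ReturnValue` flow rather than the actual test file.

```rust
// Hypothetical stand-ins for the test helpers; these bodies are assumptions
// made for this sketch and are not taken from the CodeQL test suite.
fn source(n: i64) -> i64 {
    1000 + n
}

fn sink(value: i64) -> i64 {
    println!("sink received {value}");
    value
}

// `clone` on an `i64` returns an identical copy of the receiver, which is
// what a value-preserving summary from `Argument[self]` to `ReturnValue`
// describes.
fn clone_flow() -> i64 {
    let a = source(12);
    let b = a.clone();
    sink(b) // $ hasValueFlow=12 (annotation style only; illustrative)
}

fn main() {
    // The value reaching the sink is exactly the value produced by the source.
    assert_eq!(clone_flow(), source(12));
}
```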

paldepind · 2025-02-17T10:33:28Z

rust/ql/lib/codeql/rust/dataflow/internal/DataFlowConsistency.qll

@@ -11,6 +11,8 @@ private module Input implements InputSig<Location, RustDataFlow> {
    not exists(n.asExpr().getLocation())
  }

+  predicate postWithInFlowExclude(RustDataFlow::Node n) { n instanceof Node::FlowSummaryNode }


This fixes some data flow inconsistencies otherwise introduced by the new models. Ruby and C# have the same, so I think this is appropriate.

paldepind · 2025-02-17T10:34:45Z

rust/ql/integration-tests/hello-project/summary.expected

@@ -14,7 +14,7 @@
 | Macro calls - resolved | 2 |
 | Macro calls - total | 2 |
 | Macro calls - unresolved | 0 |
-| Taint edges - number of edges | 4 |
+| Taint edges - number of edges | 1465 |


A 366x increase in taint edges 📈 😃


paldepind · 2025-02-17T12:02:48Z

DCA shows taint reach going down by 1 on the iced project. That's unexpected, but in the tests things look good, so I don't think there's much to worry about.

geoffw0 · 2025-02-24T19:14:16Z

We had some tests whose .expected output grew with the number of models or taint steps caused by models. That didn't scale well to the new number of models, so I've tweaked those tests.

Those tests have been starting to irritate me even before we started adding generated models. Thanks for cleaning them up. 👍

DCA shows taint reach going down by 1 on the iced project. That's unexpected, but in the tests things look good, so I don't think there's much to worry about.

This is very minor, but surprising; surprising enough that it might be worth investigating. If you download the database from DCA, you could try to narrow down which taint edges we had before the changes here but not afterwards.

geoffw0

Looks really good, a few points to discuss, and I should really review a few more of the models (at random)...

geoffw0 · 2025-02-24T18:50:33Z

rust/ql/integration-tests/hello-project/summary.expected

@@ -14,7 +14,7 @@
 | Macro calls - resolved | 2 |
 | Macro calls - total | 2 |
 | Macro calls - unresolved | 0 |
-| Taint edges - number of edges | 4 |
+| Taint edges - number of edges | 1465 |


Sounds great, though I wonder what they all are. I'm assuming hello-project is pretty basic.

geoffw0 · 2025-02-24T19:08:09Z

rust/ql/lib/ext/generated/rust/lang-core.model.yml

+      - ["lang:core", "<crate::result::Result>::unwrap_or", "Argument[self].Field[crate::result::Result::Ok(0)]", "ReturnValue", "value", "dfc-generated"]
+      - ["lang:core", "<crate::result::Result>::unwrap_or_default", "Argument[self].Field[crate::result::Result::Ok(0)]", "ReturnValue", "value", "dfc-generated"]
+      - ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[0].ReturnValue", "ReturnValue", "value", "dfc-generated"]
+      - ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[self].Field[crate::result::Result::Err(0)].Reference", "ReturnValue", "value", "dfc-generated"]


I don't see why this is true (the described method is here). Though (assuming I'm right) I doubt the model will do much harm anyway.

Nicely spotted! That model is indeed odd. Both because the error value is not directly returned and because there are no references involved. The latter might be due to some mistakenly inserted reference read step.

In any case, the implementation is very simple, so I would expect the model to be accurate. I've created an internal issue for me to fix this.
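
To make the expected flows concrete, here is a hand-written Rust sketch that behaves like `Result::unwrap_or_else`. The name `unwrap_or_else_sketch` is hypothetical and the body is an illustration, not the standard library source. It shows that the `Err` payload reaches the return value only by way of the closure, while the closure's result is returned directly.

```rust
// Illustrative equivalent of `Result::unwrap_or_else`; not the std source.
fn unwrap_or_else_sketch<T, E, F: FnOnce(E) -> T>(result: Result<T, E>, op: F) -> T {
    match result {
        // The `Ok` payload is returned as is.
        Ok(value) => value,
        // The `Err` payload is passed to the closure (Argument[0].Parameter[0]),
        // and the closure's result (Argument[0].ReturnValue) becomes the
        // return value. The error itself is not returned directly.
        Err(error) => op(error),
    }
}

fn main() {
    let ok: Result<usize, String> = Ok(3);
    let err: Result<usize, String> = Err("tainted".to_string());

    assert_eq!(unwrap_or_else_sketch(ok, |e| e.len()), 3);
    assert_eq!(unwrap_or_else_sketch(err.clone(), |e| e.len()), 7);

    // The sketch agrees with the standard library method on the same input.
    assert_eq!(err.unwrap_or_else(|e| e.len()), 7);
}
```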

geoffw0 · 2025-02-24T19:08:36Z

rust/ql/lib/ext/generated/rust/lang-core.model.yml

+      - ["lang:core", "<crate::result::Result>::unwrap_or_default", "Argument[self].Field[crate::result::Result::Ok(0)]", "ReturnValue", "value", "dfc-generated"]
+      - ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[0].ReturnValue", "ReturnValue", "value", "dfc-generated"]
+      - ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[self].Field[crate::result::Result::Err(0)].Reference", "ReturnValue", "value", "dfc-generated"]
+      - ["lang:core", "<crate::result::Result>::unwrap_or_else", "Argument[self].Field[crate::result::Result::Err(0)]", "Argument[0].Parameter[0]", "value", "dfc-generated"]


On the other hand this model is perfect and I missed it in the manual models. ✨

paldepind · 2025-02-25T08:16:20Z

Looks really good, a few points to discuss, and I should really review a few more of the models (at random)...

Spotting mistakes like the one in unwrap_or_else is valuable, but I would suggest we go ahead with the models in the PR as is. They already add a lot of value, and many of the flaws are from known limitations. Instead, I suggest we continuously regenerate the models when the data flow library improves and fix things when we run into problems.

geoffw0

Yep, I agree we should merge this ASAP, but continue discussions about possible follow-up improvements.

I think I just created some merge conflicts by merging #18701; let me know if you need any help untangling what happened there (I expect it's mostly those .expected files that change too often).

geoffw0 · 2025-02-25T10:48:30Z

rust/ql/lib/ext/generated/rust/lang-std.model.yml

+      pack: codeql/rust-all
+      extensible: summaryModel
+    data:
+      - ["lang:std", "<&[u8] as crate::io::BufRead>::consume", "Argument[self].Element", "Argument[self].Reference.Reference", "value", "dfc-generated"]


This is another weird edge involving references. Based on the description I don't think consume should have any taint flows.
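
As a reference point for this discussion, the following Rust sketch shows what `consume` observably does on a byte slice: it advances the slice past the first `amt` bytes and returns nothing. The helper name `consume_prefix` is hypothetical. Whether re-pointing the slice at its own tail should count as a flow from elements to the updated receiver is the open question here; the sketch takes no position on that.

```rust
use std::io::BufRead;

// Hypothetical helper: consumes `amt` bytes from the front of `data` and
// returns what is left. `consume` itself returns `()`.
fn consume_prefix(mut data: &[u8], amt: usize) -> &[u8] {
    // For `&[u8]`, this moves the slice forward by `amt` bytes.
    data.consume(amt);
    data
}

fn main() {
    let remaining = consume_prefix(b"hello", 2);
    assert_eq!(remaining, &b"llo"[..]);

    // Consuming zero bytes leaves the slice unchanged.
    assert_eq!(consume_prefix(b"hello", 0), &b"hello"[..]);
}
```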

Rust: Add generated models for standard libraries including core

925d6ac

github-actions bot added the Rust Pull requests that update Rust code label Feb 14, 2025

paldepind force-pushed the rust-core-std-models branch 2 times, most recently from 7bce170 to bfb716b Compare February 17, 2025 09:57

Rust: Adapt tests and existing models to account for generated models

0c3e8a0

paldepind force-pushed the rust-core-std-models branch from bfb716b to 0c3e8a0 Compare February 17, 2025 10:08

paldepind marked this pull request as ready for review February 17, 2025 10:20

Copilot bot review requested due to automatic review settings February 17, 2025 10:20

Copilot AI reviewed Feb 17, 2025


paldepind commented Feb 17, 2025


paldepind mentioned this pull request Feb 17, 2025

Rust: Model Result.ok and Result.err. #18777

Open

Merge branch 'main' into rust-core-std-models

b6144c2

paldepind requested a review from hvitved February 20, 2025 11:28

Merge branch 'main' into rust-core-std-models

6353dbf

geoffw0 reviewed Feb 24, 2025


geoffw0 previously approved these changes Feb 25, 2025


geoffw0 reviewed Feb 25, 2025


Merge branch 'main' into rust-core-std-models

5c99785

paldepind dismissed geoffw0’s stale review via 5c99785 February 25, 2025 14:07

geoffw0 previously approved these changes Feb 25, 2025


Rust: Accept changes

26a96d9

paldepind dismissed geoffw0’s stale review via 26a96d9 February 25, 2025 14:56

geoffw0 approved these changes Feb 25, 2025

