miri: optimize zeroed alloc #136035

Merged
merged 1 commit into from
Jan 30, 2025

Conversation

SpecificProtagonist
Contributor

@SpecificProtagonist SpecificProtagonist commented Jan 25, 2025

When allocating zero-initialized memory in MIR interpretation, rustc allocates zeroed memory, marks it as initialized and then re-zeroes it. Remove the last step.

I don't expect this to have much of an effect on performance normally, but in my case, where I'm creating a large allocation via mmap, it gets in the way.
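To make the redundancy concrete, here is a minimal sketch of the three steps described above next to the two-step version. `ToyAlloc`, `zeroed_then_rezeroed`, and `zeroed_once` are hypothetical names invented for this illustration; this is not rustc's actual `Allocation` code, only a model of the described behavior.

```rust
/// Toy stand-in for an interpreter allocation: the raw bytes plus a
/// per-byte "initialized" mask. This is NOT rustc's real `Allocation`.
pub struct ToyAlloc {
    pub bytes: Vec<u8>,
    pub init_mask: Vec<bool>,
}

/// The three-step path from the description. Returns the allocation and
/// the number of bytes written after the memory was obtained.
pub fn zeroed_then_rezeroed(len: usize) -> (ToyAlloc, usize) {
    // Step 1: obtain memory that is already zero.
    let mut bytes = vec![0u8; len];
    // Step 2: mark every byte as initialized.
    let init_mask = vec![true; len];
    // Step 3 (redundant): overwrite every byte with zero a second time.
    bytes.fill(0);
    (ToyAlloc { bytes, init_mask }, len)
}

/// The same observable result with the redundant third step dropped.
pub fn zeroed_once(len: usize) -> (ToyAlloc, usize) {
    let alloc = ToyAlloc {
        bytes: vec![0u8; len],
        init_mask: vec![true; len],
    };
    (alloc, 0)
}
```

Both functions return identical contents; only the second avoids touching every byte again, which is what matters when the allocation is very large.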

@rustbot
Collaborator

rustbot commented Jan 25, 2025

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @wesleywiser (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. In particular, to keep review lag to a minimum, PR authors and assigned reviewers should keep the review labels (S-waiting-on-review and S-waiting-on-author) up to date, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 25, 2025
@rustbot
Collaborator

rustbot commented Jan 25, 2025

Some changes occurred to the CTFE machinery

cc @rust-lang/wg-const-eval

The Miri subtree was changed

cc @rust-lang/miri

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri, @rust-lang/wg-const-eval

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

@jieyouxu
Member

r? miri

@SpecificProtagonist
Contributor Author

Sorry, I'm not sure how I closed this – misclick?

@Kobzol
Contributor

Kobzol commented Jan 25, 2025

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 25, 2025
@bors
Contributor

bors commented Jan 25, 2025

⌛ Trying commit bd28faf with merge 837b710...

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 25, 2025
…c, r=<try>

miri: optimize zeroed alloc

When allocating zero-initialized memory in MIR interpretation, rustc allocates zeroed memory, marks it as initialized and then re-zeroes it. Remove the last step.

I don't expect this to have much of an effect on performance normally, but in my case, where I'm creating a large allocation via mmap, miri is unusable without this.

There's probably a better way – with less code duplication – to implement this. Maybe adding a zero_init flag to the relevant methods, but then `Allocation::uninit` & co need a new name :)
@bors
Contributor

bors commented Jan 25, 2025

☀️ Try build successful - checks-actions
Build commit: 837b710 (837b710e5dd54b53b888f1ab109a3b93efc9a144)

@oli-obk
Contributor

oli-obk commented Jan 25, 2025

This is not gonna show up in perf. No code path outside miri is changed.

@RalfJung
Member

We should definitely explore ways to do this with less code duplication. :)
Adding a flag sounds like a good idea. The name of the method could just be Allocation::new etc?

@rust-timer
Collaborator

Finished benchmarking commit (837b710): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (primary -2.2%, secondary 2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.2% | [2.2%, 2.2%] | 2 |
| Improvements ✅ (primary) | -2.2% | [-2.2%, -2.2%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -2.2% | [-2.2%, -2.2%] | 1 |

Cycles

Results (primary -1.6%, secondary -0.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.3% | [1.3%, 1.3%] | 1 |
| Improvements ✅ (primary) | -1.6% | [-1.6%, -1.6%] | 1 |
| Improvements ✅ (secondary) | -1.6% | [-1.6%, -1.6%] | 1 |
| All ❌✅ (primary) | -1.6% | [-1.6%, -1.6%] | 1 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 771.261s -> 771.165s (-0.01%)
Artifact size: 325.82 MiB -> 325.82 MiB (0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 25, 2025
@SpecificProtagonist
Contributor Author

> Adding a flag sounds like a good idea. The name of the method could just be Allocation::new etc?

Changed 👍
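A minimal sketch of what the flag-based design could look like, for readers following along. Only the names `AllocInit` and `Allocation::new` come from this thread; the variant names `Uninit` and `Zero`, the `ToyAllocation` type, and the `read` helper are assumptions made for this example and are not taken from the actual patch.

```rust
/// How freshly allocated bytes should start out. The type name appears in
/// the diff; the variant names are assumed here for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AllocInit {
    Uninit,
    Zero,
}

/// Toy allocation: bytes plus a per-byte "initialized" mask.
pub struct ToyAllocation {
    pub bytes: Vec<u8>,
    pub init_mask: Vec<bool>,
}

impl ToyAllocation {
    /// One constructor taking a flag, instead of separate "uninit" and
    /// "zeroed" constructors that duplicate each other's code.
    pub fn new(len: usize, init: AllocInit) -> Self {
        ToyAllocation {
            // In this toy model the buffer is zero-filled either way;
            // only the mask records whether the bytes may be read.
            bytes: vec![0u8; len],
            init_mask: vec![init == AllocInit::Zero; len],
        }
    }

    /// Returns `None` for out-of-bounds or uninitialized bytes.
    pub fn read(&self, idx: usize) -> Option<u8> {
        if *self.init_mask.get(idx)? {
            Some(self.bytes[idx])
        } else {
            None
        }
    }
}
```

With a single entry point, callers pick the initial state explicitly, for example `ToyAllocation::new(4, AllocInit::Zero)`, and no second pass over the bytes is needed.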

@@ -277,6 +278,7 @@ impl<'tcx, M: Machine<'tcx>> InterpCx<'tcx, M> {
new_size: Size,
new_align: Align,
kind: MemoryKind<M::MemoryKind>,
init: AllocInit,
Member

Might be good to explain in a doc comment for the fn that this only refers to the new, grown part of the alloc; the carried-over part is as (un)initialized as the old allocation.
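The semantics described in this review comment can be sketched as follows. This is a toy model under stated assumptions: `GrowableAlloc`, its `grow` method, and the `AllocInit` variant names are hypothetical and do not reproduce the interpreter's real reallocation function.

```rust
/// Initial state for newly added bytes (variant names assumed).
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum AllocInit {
    Uninit,
    Zero,
}

/// Toy allocation: bytes plus a per-byte "initialized" mask.
pub struct GrowableAlloc {
    pub bytes: Vec<u8>,
    pub init_mask: Vec<bool>,
}

impl GrowableAlloc {
    /// Grow to `new_len` bytes.
    ///
    /// `init` applies only to the newly added tail. The carried-over
    /// prefix stays exactly as (un)initialized as it was before.
    pub fn grow(&mut self, new_len: usize, init: AllocInit) {
        assert!(new_len >= self.bytes.len(), "this sketch only grows");
        // Existing bytes and mask entries are preserved by `resize`.
        self.bytes.resize(new_len, 0);
        self.init_mask.resize(new_len, init == AllocInit::Zero);
    }
}
```

Growing an allocation whose mask is `[true, false]` to four bytes with `AllocInit::Zero` leaves the second byte uninitialized and marks only the two new bytes as initialized.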

@oli-obk
Contributor

oli-obk commented Jan 28, 2025

Please squash the commits once the final doc comment has been added

@oli-obk
Contributor

oli-obk commented Jan 28, 2025

@bors r+

@bors
Contributor

bors commented Jan 28, 2025

📌 Commit eee9df4 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 28, 2025
@RalfJung
Member

RalfJung commented Jan 28, 2025 via email

@bors
Contributor

bors commented Jan 30, 2025

⌛ Testing commit eee9df4 with merge 5e55679...

@bors
Contributor

bors commented Jan 30, 2025

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 5e55679 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 30, 2025
@bors bors merged commit 5e55679 into rust-lang:master Jan 30, 2025
7 checks passed
@rustbot rustbot added this to the 1.86.0 milestone Jan 30, 2025
@rust-timer
Collaborator

Finished benchmarking commit (5e55679): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (primary 2.6%, secondary 2.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.6% | [2.6%, 2.6%] | 1 |
| Regressions ❌ (secondary) | 2.9% | [2.9%, 2.9%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.6% | [2.6%, 2.6%] | 1 |

Cycles

Results (secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.3% | [2.3%, 2.3%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.2% | [-2.2%, -2.2%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 775.772s -> 775.404s (-0.05%)
Artifact size: 328.49 MiB -> 328.40 MiB (-0.03%)

Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
10 participants