Do not clobber the Total line in mgr:mem report #1945

Open
wants to merge 3 commits into
base: master
Changes from 1 commit
1 change: 0 additions & 1 deletion src/mem/PoolChunked.cc
@@ -437,7 +437,6 @@ MemPoolChunked::getStats(Mem::PoolStats &stats)

     clean((time_t) 555555); /* don't want to get chunks released before reporting */

-    stats.pool = this;
     stats.label = label;
     stats.meter = &meter;
     stats.obj_size = objectSize;
1 change: 0 additions & 1 deletion src/mem/PoolMalloc.cc
@@ -61,7 +61,6 @@ MemPoolMalloc::deallocate(void *obj)
 size_t
 MemPoolMalloc::getStats(Mem::PoolStats &stats)
 {
-    stats.pool = this;
     stats.label = label;
     stats.meter = &meter;
     stats.obj_size = objectSize;
12 changes: 7 additions & 5 deletions src/mem/Stats.cc
@@ -16,11 +16,6 @@ Mem::GlobalStats(PoolStats &stats)
 {
     MemPools::GetInstance().flushMeters();

-    stats.meter = &TheMeter;
-    stats.label = "Total";
-    stats.obj_size = 1;
-    stats.overhead += sizeof(MemPools);
-
     /* gather all stats for Totals */
     size_t pools_inuse = 0;
     for (const auto pool: MemPools::GetInstance().pools) {
@@ -29,5 +24,12 @@ Mem::GlobalStats(PoolStats &stats)
         stats.overhead += sizeof(Allocator *);
     }

+    // Reset PoolStats::meter, label, and obj_size data members after getStats()
+    // calls in the above loop set them. TODO: Refactor to remove these members.
Contributor Author

Refactoring mentioned in the above TODO will produce a lot of noise, so I did not include it in this bug-fixing PR.

I did remove the PoolStats::pool member because keeping a stale value would essentially preserve a part of the bug this PR is fixing, while resetting that raw pointer to nil is dangerous (some future code can easily start dereferencing it). Removing that member did not produce a lot of noise because it was barely used.

P.S. This comment does not request any PR changes.
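
For readers following the Stats.cc hunk, here is a minimal sketch of the clobbering that the new code comment describes. It assumes, as that comment states, that each per-pool getStats() call in the loop overwrites the label and obj_size of the stats object it receives. `SketchStats`, `SketchPool`, and both helper functions are hypothetical simplifications, not Squid's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for Mem::PoolStats and a pool.
struct SketchStats {
    std::string label;
    int obj_size = 0;
    std::size_t overhead = 0;
};

struct SketchPool {
    std::string label;
    int objectSize;

    // Assumption: like the real getStats(), this overwrites label and
    // obj_size in whatever stats object it is handed.
    void getStats(SketchStats &stats) const {
        stats.label = label;
        stats.obj_size = objectSize;
    }
};

// Old ordering: "Total" fields are set first and then clobbered by the loop.
inline SketchStats totalsSetBeforeLoop(const std::vector<SketchPool> &pools) {
    SketchStats stats;
    stats.label = "Total";
    stats.obj_size = 1;
    for (const auto &pool : pools) {
        pool.getStats(stats);
        stats.overhead += sizeof(SketchPool *);
    }
    return stats;
}

// PR ordering: the loop runs first, then the "Total" fields are set.
inline SketchStats totalsSetAfterLoop(const std::vector<SketchPool> &pools) {
    SketchStats stats;
    for (const auto &pool : pools) {
        pool.getStats(stats);
        stats.overhead += sizeof(SketchPool *);
    }
    stats.label = "Total";
    stats.obj_size = 1;
    return stats;
}
```

With any non-empty pool list, the first helper returns the label of the last pool visited, while the second returns "Total".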

Contributor

Please clarify in the code comment what you mean by "these members".

FTR, this particular stats variable holds the details for the entire MemPools pooling structure: the meta-pool of all pools (stats.overhead += sizeof(MemPools)), with allocations counted in bytes (stats.obj_size = 1).

Contributor Author

> Please clarify in the code comment what you mean by "these members".

Done in branch commit a644ade? If you need a different kind of clarification, please suggest replacement wording or detail what seems unclear. I am also OK with removing that TODO if you prefer to remove it.

Contributor

Well, going by that clarification, I think nobody else will be able to enact it without a much more detailed knowledge of what you in particular think is wrong with them. For now they serve a clear purpose.

So ... remove the TODO and enact it in a future PR if/when you have time to figure out the details.

Contributor Author

> remove the TODO

Done in commit 5f6793b.

+    stats.meter = &TheMeter;
+    stats.label = "Total";
+    stats.obj_size = 1;
+    stats.overhead += sizeof(MemPools);
+
     return pools_inuse;
 }
1 change: 0 additions & 1 deletion src/mem/Stats.h
@@ -17,7 +17,6 @@ namespace Mem
 class PoolStats
 {
 public:
-    Allocator *pool = nullptr;
     const char *label = nullptr;
     PoolMeter *meter = nullptr;
     int obj_size = 0;
2 changes: 1 addition & 1 deletion src/mem/old_api.cc
@@ -586,7 +586,7 @@ Mem::Report(std::ostream &stream)
         PoolStats mp_stats;
         pool->getStats(mp_stats);

-        if (mp_stats.pool->meter.gb_allocated.count > 0)
+        if (pool->meter.gb_allocated.count > 0)
Contributor

@yadij Nov 13, 2024

This exposes an old TOCTOU bug. It is important for this if-statement to check a member of mp_stats when determining whether its pool was "used" or not.

Perhaps meter->gb_allocated will work better:

Suggested change
- if (pool->meter.gb_allocated.count > 0)
+ if (mp_stats.meter->gb_allocated.count > 0)

Contributor Author

@rousskov Nov 14, 2024

> This exposes an old TOCTOU bug.

I do not see a TOCTOU bug in this code (official or PR):

  • Time Of Check: This if statement condition.
  • Time Of Use: The "true" and "false" branches of this if statement.

> It is important for this if-statement to check a member of mp_stats when determining whether its pool was "used" or not.

    stream << "Pools ever used:     " << poolCount - not_used << " (shown above)\n";

It is pretty much the opposite: In the context of this if statement, we want to know whether the pool was "ever used" rather than whether it is "currently used" (look for "ever used" reporting in code further below that relies on the results of this if statement branches; I quoted that code line above). Asking mp_stats is worse than asking the pool itself because mp_stats is problematic on several levels, including obscuring the distinction between "ever/cumulative" and "current" use statistics (e.g., mp_stats::items_inuse represents current use rather than cumulative use).
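
The "ever used" versus "currently used" distinction can be sketched with a toy meter. `SketchUsageMeter` and its members are hypothetical names chosen for illustration; they only loosely mirror the roles that gb_allocated.count (cumulative) and items_inuse (current) play in this discussion and are not Squid's PoolMeter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simplified meter, not Squid's PoolMeter.
struct SketchUsageMeter {
    std::size_t everAllocated = 0; // cumulative; never decremented
    std::size_t inUse = 0;         // current level; drops on release

    void allocate() { ++everAllocated; ++inUse; }
    void release() { if (inUse > 0) --inUse; }

    bool everUsed() const { return everAllocated > 0; }
    bool currentlyUsed() const { return inUse > 0; }
};

// Models a pool that handed out one object and then got it back.
inline SketchUsageMeter allocateThenRelease() {
    SketchUsageMeter meter;
    meter.allocate();
    meter.release();
    return meter;
}
```

After one allocation followed by one release, everUsed() remains true while currentlyUsed() is false, which is why a "Pools ever used" line needs the cumulative counter.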

> Perhaps meter->gb_allocated will work better:
>
> Suggested change
> - if (pool->meter.gb_allocated.count > 0)
> + if (mp_stats.meter->gb_allocated.count > 0)

The proposed change has no effect on this if statement condition value: The two variants use two different ways to access the same gb_allocated object because mp_stats.meter just points to pool->meter:

    stats.meter = &meter;

That gb_allocated object represents long-term (i.e. "ever used") history because MemPools::flushMeters() and flushCounters() methods preserve past gb_allocated values.

The PR condition variant is better than the suggested replacement because it is more direct, because it avoids a problematic object, and because we are computing an "ever used" condition that the pool ought to know about even when mp_stats members like items_inuse represent "current use".
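
The aliasing argument above can be checked in miniature. The sketch assumes only what the quoted line shows, namely that getStats() stores the address of the pool's own meter in the stats object. All `Alias*` types and `conditionsAgree()` are hypothetical stand-ins, not the real Squid classes.

```cpp
#include <cassert>

// Hypothetical stand-ins for the counter, meter, stats, and pool types.
struct AliasCounter { long count = 0; };
struct AliasMeter { AliasCounter gb_allocated; };
struct AliasStats { AliasMeter *meter = nullptr; };

struct AliasPool {
    AliasMeter meter;

    // Mirrors the quoted "stats.meter = &meter;" assignment.
    void getStats(AliasStats &stats) { stats.meter = &meter; }
};

// Returns true when both spellings of the condition read the same object
// and therefore produce the same answer.
inline bool conditionsAgree(AliasPool &pool) {
    AliasStats mp_stats;
    pool.getStats(mp_stats);
    const bool viaPool = pool.meter.gb_allocated.count > 0;
    const bool viaStats = mp_stats.meter->gb_allocated.count > 0;
    const bool sameObject =
        &pool.meter.gb_allocated == &mp_stats.meter->gb_allocated;
    return sameObject && viaPool == viaStats;
}
```

Under that assumption, the two conditions cannot disagree, whatever value the counter holds.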

I believe this change request about an alleged old bug is invalid (for the reasons detailed above), but even if you disagree, please dismiss your negative review (so that this arguably minimal PR fixing another bug and not changing this if statement condition can be merged) and then post a PR dedicated to fixing that alleged old bug. That way, we will be making steady progress.

Contributor

> > This exposes an old TOCTOU bug.
>
> I do not see a TOCTOU bug in this code (official or PR):
>
> * Time Of Check: This `if` statement condition.

Correct.

> * Time Of Use: The "true" and "false" branches of this `if` statement.

Not necessarily, TOU refers to the value(s) being validated by the conditional.

In this case, some values are being copied from pool to mp_stats outside the control of this if-statement. Their TOU is inside getStats(..) somewhere, with some pools being changed between the TOU and TOC.

We do not notice it because the affected pools have large counts and the difference is single digits.

> The proposed change has no effect on this if statement condition value: The two variants use two different ways to access the same gb_allocated object because mp_stats.meter just points to pool->meter:

You are right, yes. What we should do instead is have the getStats(..) call inside the if-statement branch with the new condition you made depend on pool.

if (pool->meter.gb_allocated.count > 0) {
    PoolStats mp_stats;
    pool->getStats(mp_stats);
    usedPools.emplace_back(mp_stats);
} else

Contributor Author

> > > This exposes an old TOCTOU bug.
>
> > I do not see a TOCTOU bug in this code (official or PR):
> >
> > * Time Of Check: This `if` statement condition.
>
> Correct.
>
> > * Time Of Use: The "true" and "false" branches of this `if` statement.
>
> Not necessarily, TOU refers to the value(s) being validated by the conditional.

TOU refers to use of values after the check (i.e. after TOC). That use is not limited to this if statement -- there is more value-using code after the if statement, of course.

As far as this PR is concerned, the condition is effectively unchanged and makes sense. The values used by conditional code (and code further below) are unchanged and make sense. We should not be discussing any of this in this PR!

> In this case, some values are being copied from pool to mp_stats outside the control of this if-statement.

Yes, this if statement does not care how copied values were computed. It simply uses what was computed. There is nothing wrong with that.

> Their TOU is inside getStats(..) somewhere, with some pools being changed between the TOU and TOC.

TOCTOU (a.k.a. Time Of Check To Time Of Use) bugs are bugs that happen when the value changes between TOC and TOU. Whatever use happens before TOC is irrelevant to TOCTOU. (There is also no relevant "use" inside getStats() as far as this report is concerned. "Use" happens later, when computed values are added to the report.)

For the purpose of assessing whether this if statement introduces a TOCTOU bug, any use of a value before this if statement is not relevant.
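
The TOCTOU definition used in this comment can be sketched generically. The example below is not Squid code: `SketchItemCounter` and `takeOneIfAvailable()` are hypothetical, and the "other activity" callback simply simulates something modifying the value between the check and the use.

```cpp
#include <cassert>
#include <functional>

// Hypothetical shared state that something else may modify.
struct SketchItemCounter { int items = 0; };

// Checks that an item is available, lets other activity run, then takes
// one item. Returns the remaining item count; a negative result means the
// check was invalidated between Time Of Check and Time Of Use.
inline int takeOneIfAvailable(SketchItemCounter &counter,
                              const std::function<void()> &otherActivity) {
    if (counter.items > 0) {   // Time Of Check
        otherActivity();       // the checked value may change here
        --counter.items;       // Time Of Use
    }
    return counter.items;
}
```

A change made before the check is seen by the check itself and so cannot produce a negative result; only a change made between the check and the use can.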

> We do not notice it because the affected pools have large counts and the difference is single digits.

The alleged difference you are talking about (i.e. reporting zero stats for a pool that became used after its getStats() call) is not relevant to this PR.

> > The proposed change has no effect on this if statement condition value: The two variants use two different ways to access the same gb_allocated object because mp_stats.meter just points to pool->meter:
>
> You are right, yes.

That "this if statement is effectively unchanged" fact should resolve this change request. This PR does not change anything related to this change request.

> What we should do instead is have the getStats(..) call inside the if-statement branch with the new condition you made depend on pool.

The condition is not new, as we have established above. I have not "made depend on pool" anything that had not depended on the very same pool before this PR! No use/check timings have changed.

>     if (pool->meter.gb_allocated.count > 0) {
>         PoolStats mp_stats;
>         pool->getStats(mp_stats);
>         usedPools.emplace_back(mp_stats);
>     } else

I agree that something like that should be done to avoid reporting all zeros for pools that became used after getStats(). This change has nothing to do with this PR though. Please let this PR merge and post a dedicated PR improving this aspect of this code. That other PR should also move the condition inside a pool method (easy) OR make getStats() constant (more improvements, but requires additional code adjustments). Otherwise, we should not reorder getStats() (that may compute meter.gb_allocated!) and the condition that checks meter.gb_allocated.
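
As a rough illustration of the "move the condition inside a pool method" option, here is a sketch in which stats are gathered only for pools that report having ever been used. Everything in it is hypothetical: `everUsed()`, `SketchReportPool`, `SketchReportStats`, and `collectUsedPools()` do not exist in Squid, and the sketch assumes that the "ever used" answer does not depend on a prior getStats() call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; not Squid's Mem::PoolStats or Mem::Allocator.
struct SketchReportStats {
    std::size_t items_inuse = 0;
};

class SketchReportPool {
public:
    void allocate() { ++everAllocated_; ++inUse_; }

    // Hypothetical method: answers the "ever used" question directly,
    // assuming no prior getStats() call is needed to compute it.
    bool everUsed() const { return everAllocated_ > 0; }

    void getStats(SketchReportStats &stats) { stats.items_inuse = inUse_; }

private:
    std::size_t everAllocated_ = 0;
    std::size_t inUse_ = 0;
};

// Gathers stats only for pools that were ever used and returns the number
// of pools that were never used.
inline std::size_t collectUsedPools(std::vector<SketchReportPool> &pools,
                                    std::vector<SketchReportStats> &usedPools) {
    std::size_t notUsed = 0;
    for (auto &pool : pools) {
        if (pool.everUsed()) {
            SketchReportStats stats;
            pool.getStats(stats);
            usedPools.push_back(stats);
        } else {
            ++notUsed;
        }
    }
    return notUsed;
}
```

If getStats() is what computes the cumulative counter, as the caveat above warns it may be, this reordering is only safe once the check no longer relies on that side effect.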

If you insist on adding these out of scope changes to this PR, I will instead revert PR changes that remove PoolStats::pool member, so that this code is completely unchanged. Would you prefer that?

             usedPools.emplace_back(mp_stats);
         else
             ++not_used;