Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zdb: improvements to block histograms (zdb -bb) #16999

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

intelfx
Copy link

@intelfx intelfx commented Jan 28, 2025

Motivation and Context

When trying to use zdb -bb to evaluate space requirements for a ZFS deployment
with special vdevs and special_small_blocks, I encountered several instances
of counterintuitive behavior that makes it mostly unsuitable for this task.

Description

Technically, this PR consists of three mutually independent changes, however,
since they further a common goal, I decided to submit them in a single branch.

Two changes add new flags to zdb -bb that make it easier to inspect specific
aspects of space usage in a ZFS pool:

  • zdb: add --bin=(lsize|psize|asize) arg to control histogram binning

    When counting blocks to generate block size histograms (-bb), accept a
    --bin= argument to force placing blocks into all three bins based on
    this size.

    E.g. with --bin=lsize, a block with lsize=512K, psize=128K, asize=256K
    will be placed into the "512K" bin in all three output columns. This
    way, by looking at the "512K" row the user will be able to determine
    how well was ZFS able to compress blocks of this logical size.

    Conversely, with --bin=psize, by looking at the "128K" row the user
    will be able to determine how much overhead was incurred for storage
    of blocks of this physical size.

  • zdb: add --class=(normal|special|...) to filter blocks by alloc class

    When counting blocks to generate block size histograms (-bb), accept a
    --class= argument (as a comma-separated list of either "normal",
    "special", "dedup" or "other") to only consider blocks that belong to
    these metaslab classes.

Third change alters the behavior of zdb -bb itself. As it stands now, the
binning behavior of zdb -bb is undocumented and contradicts a large body of
online discussion and advice that recommends zdb -bb to be used to estimate
vdev space requirements (in particular, space requirements for special vdevs
in conjunction with the special_small_blocks feature).

Hopefully, the new binning behavior can be deemed more generally useful than the
existing one. If not, we can add another flag to choose the binning strategy.
The explanation of the change follows.

  • zdb: adjust block histogram binning strategy

    Previously, a bin included all blocks starting from given size
    (e.g., a "4K" bin would include all blocks within the [4K; 8K) region).
    This is counter-intuitive and does not match the typical use-case of the
    block histogram (that is, to estimate disk usage considering how ZFS'
    block allocation works). In other words, if I'm looking at the "4K" row,
    I'm interested in records that fit into a 4K block.

    Hence, adjust the binning strategy such that a bin includes all blocks
    up to given size, such that e.g. a "4K" bin would include all blocks
    within the (2K; 4K] region.

How Has This Been Tested?

  • manually tested on x86_64 Linux

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

When counting blocks to generate block size histograms (`-bb`), accept a
`--bin=` argument to force placing blocks into all three bins based on
*this* size.

E.g. with `--bin=lsize`, a block with lsize=512K, psize=128K, asize=256K
will be placed into the "512K" bin in all three output columns. This
way, by looking at the "512K" row the user will be able to determine
how well was ZFS able to compress blocks of this logical size.

Conversely, with `--bin=psize`, by looking at the "128K" row the user
will be able to determine how much overhead was incurred for storage
of blocks of this physical size.

Signed-off-by: Ivan Shapovalov <[email protected]>
When counting blocks to generate block size histograms (`-bb`), accept a
`--class=` argument (as a comma-separated list of either "normal",
"special", "dedup" or "other") to only consider blocks that belong to
these metaslab classes.

Signed-off-by: Ivan Shapovalov <[email protected]>
Previously, a bin included all blocks _starting_ from given size
(e.g., a "4K" bin would include all blocks within the [4K; 8K) region).
This is counter-intuitive and does not match the typical use-case of the
block histogram (that is, to estimate disk usage considering how ZFS'
block allocation works). In other words, if I'm looking at the "4K" row,
I'm interested in records that _fit into_ a 4K block.

Adjust the binning strategy such that a bin includes all blocks _up to_
given size, such that e.g. a "4K" bin would include all blocks within
the (2K; 4K] region.

Signed-off-by: Ivan Shapovalov <[email protected]>
break;

case ARG_BLOCK_CLASSES: {
char *buf = strdupa(optarg), *tok = buf, *next;

Check failure

Code scanning / CodeQL

Call to alloca in a loop High

Stack allocation is inside a
while (...) ...
loop.
case ARG_BLOCK_CLASSES: {
char *buf = strdupa(optarg), *tok = buf, *next;

while ((next = strtok(tok, ",")) != NULL) {

Check failure

Code scanning / CodeQL

Deprecated function usage detection Error

Use strtok_r(3) instead!
@amotin amotin added the Status: Code Review Needed Ready for review and testing label Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants