Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

materialized: strip debuginfo #31207

Merged
merged 1 commit into from
Jan 27, 2025

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Jan 27, 2025

The materialized binary generates massive debuginfo: 8GB with full debuginfo and 3.5GB with limited debuginfo. (This is not the compressed size of the debuginfo on disk, but the amount of memory required to load the debuginfo into memory so that backtraces can be symbolized.)

We've historically shipped the materialized binary with full debuginfo, so that we get rich backtraces for any crashes users run into when running the emulator. Unfortunately, the full debuginfo is so large that it is itself the cause of OOMs 0, which is unacceptable. (When a console query encounters certain routine errors, like a connection failing to validate, the adapter attempts to log a backtrace, which requires loading the debuginfo.) Even the limited debuginfo size (3.5GB) is unacceptable for a Docker image that's meant to be run on developer laptops.

So, this commit adjusts the materialized image to strip all debuginfo from the binary.

If a user reports a crash with an unsymbolized backtrace, it's still possible (just painful) to manually symbolize the backtrace as long as they give us the exact version of Materialize they were running. We'll just need to manually run addr2line on each address reported in the backtrace. (We do irrevocably lose access to frames for inlined functions, but that seems tolerable.)

One silver lining here is that the Docker image will get much smaller. It's currently about 1GB. I expect this change to shave off a huge chunk of that.

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

The materialized binary generates massive debuginfo: 8GB with full
debuginfo and 3.5GB with limited debuginfo. (This is not the compressed
size of the debuginfo on disk, but the amount of memory required to load
the debuginfo into memory so that backtraces can be symbolized.)

We've historically shipped the materialized binary with full debuginfo,
so that we get rich backtraces for any crashes users run into when
running the emulator. Unfortunately, the full debuginfo is so large that
it is itself the *cause* of OOMs [0], which is unacceptable. (When a
console query encounters certain routine errors, like a connection
failing to validate, the adapter attempts to log a backtrace, which
requires loading the debuginfo.) Even the limited debuginfo size (3.5GB)
is unacceptable for a Docker image that's meant to be run on developer
laptops.

So, this commit adjusts the materialized image to strip all debuginfo
from the binary.

If a user reports a crash with an unsymbolized backtrace, it's still
*possible* (just painful) to manually symbolize the backtrace as long as
they give us the exact version of Materialize they were running. We'll
just need to manually run `addr2line` on each address reported in the
backtrace. (We do irrevocably lose access to frames for inlined
functions, but that seems tolerable.)

One silver lining here is that the Docker image will get much smaller.
It's currently about 1GB. I expect this change to shave off a huge chunk
of that.

[0]: https://materializeinc.slack.com/archives/C07FX1W1Y03/p1737414021061139
@benesch benesch requested a review from ParkMyCar January 27, 2025 21:09
@benesch
Copy link
Contributor Author

benesch commented Jan 27, 2025

cc @def- @antiguru

@benesch benesch enabled auto-merge January 27, 2025 21:25
Copy link
Member

@ParkMyCar ParkMyCar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wild how large the debug info is, but this LGTM!

benesch added a commit to benesch/materialize that referenced this pull request Jan 27, 2025
Copy link
Contributor

@def- def- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@benesch benesch merged commit 131d014 into MaterializeInc:main Jan 27, 2025
69 checks passed
@benesch benesch deleted the strip-materialized-image branch January 27, 2025 21:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants