Enable Link-Time Optimization (LTO) and codegen-units = 1 #8

Open
zamazan4ik opened this issue Feb 9, 2025 · 1 comment

Comments

@zamazan4ik

Hi!

I attended your FOSDEM talk about the project. It's great that you finally open-sourced your work!

I noticed that Link-Time Optimization (LTO) is not enabled in the project's Cargo.toml file. I suggest switching it on since it will reduce the binary size (always a good thing to have) and will likely improve the application's performance. If you want to read more about LTO and its possible modes, I recommend starting with this Rustc documentation.

I think you can enable LTO only for Release builds so as not to hurt the developer experience, since LTO adds to the compilation time. To that end, we can create a dedicated [profile.optimized-dev] profile with LTO disabled, so day-to-day development is not affected. If LTO is enabled at the Cargo profile level for the Release profile, users who install the application with cargo install will get the LTO-optimized version of the app "automatically". For an example, check the cargo-outdated Release profile. You may also be interested in other optimization options such as codegen-units = 1, which also brings improvements over the current defaults.

Basically, it can be enabled with the following lines:

[profile.release]
codegen-units = 1
lto = true
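
The dedicated profile mentioned above could look roughly like the sketch below. The name optimized-dev is only a suggestion and does not exist in the project today. Cargo requires inherits for custom profiles, and codegen-units = 16 is Cargo's default for release builds, spelled out here so the profile stays fast to compile even after [profile.release] changes.

```toml
# Hypothetical profile for day-to-day optimized builds; the name is an assumption.
[profile.optimized-dev]
inherits = "release"   # start from the release settings
lto = false            # no cross-crate LTO (Cargo still applies thin-local LTO within a crate)
codegen-units = 16     # Cargo's release default; keeps code generation parallel
```

It would be selected with cargo build --profile optimized-dev, and its artifacts land in target/optimized-dev/ rather than target/release/.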

I ran some quick tests (AMD Ryzen 5900x, Fedora 41, Rust 1.84.1, the latest version of the project at the time of writing, built with cargo build --release). Here are the results.

Release (current default) binary sizes:

  • tuxtape-server: 11 MiB
  • tuxtape-dashboard: 13 MiB
  • tuxtape-kernel-builder: 7.5 MiB
  • tuxtape-cve-parser: 6.7 MiB

Release + codegen-units = 1 + Fat LTO:

  • tuxtape-server: 7.1 MiB
  • tuxtape-dashboard: 8.5 MiB
  • tuxtape-kernel-builder: 4.7 MiB
  • tuxtape-cve-parser: 5.9 MiB

Clean build times:

  • Release (current default): 33s
  • Release + codegen-units = 1 + Fat LTO: 72s

I understand that the project is currently a PoC. I suggest enabling these optimizations (and probably some others) as early as possible so that all future tuxtape versions (a hardened PoC, an MVP, an actual product, etc.) will be optimized from day one.

Thank you.

@graysonguarino
Collaborator

Thank you very much for your contribution! (You are the first public contributor to TuxTape, so thank you again for your early support of this project.) This issue is very well researched and written, and this is a prime example of the community engagement we are seeking.

It seems the changes you're suggesting would more than double compilation time for a ~35% reduction in binary size on average, and the binaries aren't too large to begin with. That said, you are also claiming a performance improvement, which may be helpful once the project expands. Right now, there are optimizations needed in the code itself (for example, I've seen likely higher-than-necessary CPU utilization in the dashboard) that are probably bigger performance bottlenecks and wouldn't affect compile time. Regardless, I don't want to dismiss any optimization suggestions without benchmarking first.

For now, I'm going to leave this issue open as this is something we certainly should consider down the line. Once we shift into MVP and performance metrics are being benchmarked, I'll make sure to benchmark these changes and see if the tradeoffs are worthwhile.
