Improve cpu/memory (proctree wise) #4503
Open: geyslan wants to merge 25 commits into aquasecurity:main from geyslan:improv-cpu-time
github-actions bot added area/ebpf, area/testing, area/events, area/grpc and removed area/performance labels (Jan 14, 2025)
geyslan force-pushed the improv-cpu-time branch 4 times, most recently from cb59e3e to cde5a80 (January 17, 2025 00:38)
geyslan force-pushed the improv-cpu-time branch from cde5a80 to e4a750b (January 17, 2025 16:41)
Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^BenchmarkDecodeArguments$ github.com/aquasecurity/tracee/pkg/bufferdecoder -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/bufferdecoder
cpu: AMD Ryzen 9 7950X 16-Core Processor
BenchmarkDecodeArguments-32    100000000    206.3 ns/op    512 B/op    1 alloc/op
PASS
ok    github.com/aquasecurity/tracee/pkg/bufferdecoder    20.646s
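For readers unfamiliar with the flags in the command above: `-run=^$` skips regular tests, `-bench` selects benchmarks by regular expression, `-benchmem` adds the B/op and allocs/op columns, and `-benchtime=100000000x` fixes the iteration count instead of a duration. The following is a minimal sketch of a benchmark shaped to produce that kind of output. `decodeUint32` is a hypothetical stand-in and is not the `DecodeArguments` code measured in this PR.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// decodeUint32 is a hypothetical function under test: it reads a
// little-endian uint32 and reports whether the buffer was long enough.
func decodeUint32(buf []byte) (uint32, bool) {
	if len(buf) < 4 {
		return 0, false
	}
	return binary.LittleEndian.Uint32(buf), true
}

// sink keeps the compiler from optimizing the benchmarked call away.
var sink uint32

// benchmarkDecodeUint32 has the usual benchmark signature. In a _test.go
// file it would be exported as BenchmarkDecodeUint32 and run by `go test`.
func benchmarkDecodeUint32(b *testing.B) {
	buf := []byte{0x2a, 0x00, 0x00, 0x00}
	b.ReportAllocs() // report memory stats, like -benchmem does
	b.ResetTimer()   // exclude the setup above from the timing
	for i := 0; i < b.N; i++ {
		v, _ := decodeUint32(buf)
		sink = v
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of `go test`.
	res := testing.Benchmark(benchmarkDecodeUint32)
	fmt.Println(res.String(), res.MemString())
}
```

Fixing the iteration count with an `x` suffix, as the PR does, makes the total runtime of two runs directly comparable.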
It's a cosmetic change to make the code more readable.
When retrieving the event definition, there is no longer a need to call Core.IsDefined() beforehand. Validation can now be performed directly with the NotValid() method on the Definition type returned by GetEventDefinitionID() and GetEventDefinitionName(). Besides reducing lock contention, this closes the window in which the event definition could change between the check and the actual use of the definition. This also fixes a wrong logger usage in the pipeline.
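The "fetch once, then validate the returned value" pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions: `Registry`, `GetDefinitionByID`, `undefinedID` and `describe` are hypothetical names, and only the idea of a `NotValid()` method on the returned definition comes from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// Definition is a hypothetical stand-in for an event definition.
type Definition struct {
	id   int
	name string
}

// undefinedID marks the sentinel definition returned for unknown IDs.
const undefinedID = -1

// NotValid reports whether the definition is the "undefined" sentinel.
func (d Definition) NotValid() bool { return d.id == undefinedID }

// Registry is a hypothetical, simplified event registry.
type Registry struct {
	mu   sync.RWMutex
	defs map[int]Definition
}

// GetDefinitionByID takes the read lock once and returns a copy, so the
// caller validates exactly the value it is about to use.
func (r *Registry) GetDefinitionByID(id int) Definition {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if d, ok := r.defs[id]; ok {
		return d
	}
	return Definition{id: undefinedID}
}

// describe does a single lookup followed by a lock-free validity check,
// instead of an "is defined?" call followed by a second lookup.
func describe(r *Registry, id int) string {
	def := r.GetDefinitionByID(id)
	if def.NotValid() {
		return fmt.Sprintf("no event defined for id %d", id)
	}
	return def.name
}

func main() {
	r := &Registry{defs: map[int]Definition{1: {id: 1, name: "sched_process_fork"}}}
	fmt.Println(describe(r, 1))
	fmt.Println(describe(r, 42))
}
```

With a separate existence check, the lock would be taken twice and the registry could change between the two calls; returning a value that can validate itself avoids both problems.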
Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeForkProcessor$ github.com/aquasecurity/tracee/pkg/ebpf -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeForkProcessor-32    100000000    547.4 ns/op    496 B/op    5 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf    54.757s
| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 547.4     | 267.5     | 51.14%          |
| Bytes allocated (B/op)  | 496       | 0         | 100.00%         |
| Allocations per op      | 5         | 0         | 100.00%         |
| Total runtime (s)       | 54.757    | 26.763    | 51.13%          |

---

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeForkProcessor$ github.com/aquasecurity/tracee/pkg/ebpf -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeForkProcessor-32    100000000    267.5 ns/op    0 B/op    0 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf    26.763s
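The "Improvement (%)" column in these tables appears to be the relative reduction from the old value to the new one, that is `(old - new) / old * 100`. A small sketch of that arithmetic follows. The helper name `improvementPct` is an assumption, and the way the PR author rounded the figures is not stated.

```go
package main

import "fmt"

// improvementPct returns the relative reduction from oldV to newV as a
// percentage. Positive values mean newV is smaller, which is better for
// ns/op, B/op and allocs/op.
func improvementPct(oldV, newV float64) float64 {
	return (oldV - newV) / oldV * 100
}

func main() {
	// Memory rows from the fork processor table: 496 B/op -> 0 B/op and
	// 5 allocs/op -> 0 allocs/op.
	fmt.Printf("bytes:  %.2f%%\n", improvementPct(496, 0))
	fmt.Printf("allocs: %.2f%%\n", improvementPct(5, 0))
}
```

A negative result from the same formula corresponds to a regression, which is how the signed "Change (%)" column of the ArgVal table can be read (with the sign flipped).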
Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeForkProcessor$ github.com/aquasecurity/tracee/pkg/ebpf/controlplane -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf/controlplane
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeForkProcessor-32    100000000    618.2 ns/op    496 B/op    5 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf/controlplane    61.827s
| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 618.2     | 274.0     | 55.67%          |
| Bytes allocated (B/op)  | 496       | 0         | 100.00%         |
| Allocations per op      | 5         | 0         | 100.00%         |
| Total runtime (s)       | 61.827    | 27.415    | 55.67%          |

---

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeForkProcessor$ github.com/aquasecurity/tracee/pkg/ebpf/controlplane -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf/controlplane
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeForkProcessor-32    100000000    274.0 ns/op    0 B/op    0 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf/controlplane    27.415s
Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExecProcessor$ github.com/aquasecurity/tracee/pkg/ebpf -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExecProcessor-32    100000000    514.7 ns/op    500 B/op    6 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf    51.483s
| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 514.7     | 215.6     | 58.12%          |
| Bytes allocated (B/op)  | 500       | 4         | 99.20%          |
| Allocations per op      | 6         | 1         | 83.33%          |
| Total runtime (s)       | 51.483    | 21.571    | 58.12%          |

---

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExecProcessor$ github.com/aquasecurity/tracee/pkg/ebpf -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExecProcessor-32    100000000    215.6 ns/op    4 B/op    1 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf    21.571s
Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExecProcessor$ github.com/aquasecurity/tracee/pkg/ebpf/controlplane -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf/controlplane
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExecProcessor-32    100000000    649.7 ns/op    500 B/op    6 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf/controlplane    64.981s
| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 649.7     | 284.2     | 56.26%          |
| Bytes allocated (B/op)  | 500       | 4         | 99.20%          |
| Allocations per op      | 6         | 1         | 83.33%          |
| Total runtime (s)       | 64.981    | 28.435    | 56.26%          |

---

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExecProcessor$ github.com/aquasecurity/tracee/pkg/ebpf/controlplane -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf/controlplane
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExecProcessor-32    100000000    284.2 ns/op    4 B/op    1 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf/controlplane    28.435s
Disable (comment out) ExecFeed interpreter fields not used by the feeders. This removal was already started by 4a5bb5d.

---

Tracee

| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 215.6     | 168.1     | 22.03%          |
| Bytes allocated (B/op)  | 4         | 4         | 0.00%           |
| Allocations per op      | 1         | 1         | 0.00%           |
| Total runtime (s)       | 21.571    | 16.825    | 22.03%          |

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExecProcessor$ github.com/aquasecurity/tracee/pkg/ebpf -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExecProcessor-32    100000000    168.1 ns/op    4 B/op    1 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf    16.825s

---

Controller

| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 284.2     | 209.7     | 26.20%          |
| Bytes allocated (B/op)  | 4         | 4         | 0.00%           |
| Allocations per op      | 1         | 1         | 0.00%           |
| Total runtime (s)       | 28.435    | 20.983    | 26.20%          |

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExecProcessor$ github.com/aquasecurity/tracee/pkg/ebpf/controlplane -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf/controlplane
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExecProcessor-32    100000000    209.7 ns/op    4 B/op    1 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf/controlplane    20.983s
For both Tracee and Controller.

- Tracee

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExitProcessor$ github.com/aquasecurity/tracee/pkg/ebpf -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExitProcessor-32    100000000    159.9 ns/op    48 B/op    2 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf    16.001s

---

- Controller

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExitProcessor$ github.com/aquasecurity/tracee/pkg/ebpf/controlplane -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf/controlplane
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExitProcessor-32    100000000    335.5 ns/op    240 B/op    4 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf/controlplane    33.558s
Improve procTreeExitProcessor for both Tracee and Controller.

- Tracee

| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 159.9     | 95.71     | 40.14%          |
| Bytes allocated (B/op)  | 48        | 0         | 100.00%         |
| Allocations per op      | 2         | 0         | 100.00%         |
| Total runtime (s)       | 16.001    | 9.586     | 40.14%          |

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExitProcessor$ github.com/aquasecurity/tracee/pkg/ebpf -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExitProcessor-32    100000000    95.71 ns/op    0 B/op    0 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf    9.586s

---

- Controller

| Metric                  | Old Value | New Value | Improvement (%) |
|-------------------------|-----------|-----------|-----------------|
| Time per operation (ns) | 335.5     | 115.4     | 65.60%          |
| Bytes allocated (B/op)  | 240       | 0         | 100.00%         |
| Allocations per op      | 4         | 0         | 100.00%         |
| Total runtime (s)       | 33.558    | 11.553    | 65.60%          |

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^Benchmark_procTreeExitProcessor$ github.com/aquasecurity/tracee/pkg/ebpf/controlplane -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/ebpf/controlplane
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_procTreeExitProcessor-32    100000000    115.4 ns/op    0 B/op    0 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/ebpf/controlplane    11.553s
Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^BenchmarkArgVal$ github.com/aquasecurity/tracee/pkg/events/parse -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/events/parse
cpu: AMD Ryzen 9 7950X 16-Core Processor
BenchmarkArgVal/int32/valid_args-32          100000000    14.43 ns/op    0 B/op      0 allocs/op
BenchmarkArgVal/int32/invalid_val_type-32    100000000    551.7 ns/op    584 B/op    10 allocs/op
BenchmarkArgVal/int32/not_found_arg-32       100000000    499.2 ns/op    520 B/op    10 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/events/parse    106.538s
| Sub-Benchmark    | Old (ns/op) | New (ns/op) | Change (%) |
|------------------|-------------|-------------|------------|
| valid_args       | 14.43       | 13.35       | -7.48%     |
| invalid_val_type | 551.7       | 589.8       | +6.90%     |
| not_found_arg    | 499.2       | 586.0       | +17.38%    |

The valid_args case is the most relevant one, since it traverses args in a specific order. The other cases are not deterministic and are used to measure upcoming changes for the worst case.

---

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem -run=^$ -tags ebpf -bench ^BenchmarkArgVal$ github.com/aquasecurity/tracee/pkg/events/parse -benchtime=100000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/events/parse
cpu: AMD Ryzen 9 7950X 16-Core Processor
BenchmarkArgVal/int32/valid_args-32          100000000    13.35 ns/op    0 B/op      0 allocs/op
BenchmarkArgVal/int32/invalid_val_type-32    100000000    589.8 ns/op    584 B/op    10 allocs/op
BenchmarkArgVal/int32/not_found_arg-32       100000000    586.0 ns/op    520 B/op    10 allocs/op
PASS
ok    github.com/aquasecurity/tracee/pkg/events/parse    118.922s
Introducing feed pools helps to reduce dynamic stack growth and the number of allocations, which is good for performance. Changelog fields now hold pointers to the feeds instead of the feeds themselves. This aligns them with the new feed pointers and avoids dereferencing.
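One common way to pool short-lived structs in Go is `sync.Pool`. The sketch below illustrates the idea of reusing feed objects through pointers. It assumes `sync.Pool` as the mechanism, which this page does not confirm, and `ForkFeed`, `getForkFeed`, `putForkFeed` and `handleFork` are hypothetical names with made-up fields.

```go
package main

import (
	"fmt"
	"sync"
)

// ForkFeed is a hypothetical, trimmed-down feed struct.
type ForkFeed struct {
	TimeStamp uint64
	Pid       int32
	Tid       int32
}

// forkFeedPool hands out *ForkFeed values so that hot paths do not
// allocate a fresh struct for every event.
var forkFeedPool = sync.Pool{
	New: func() any { return &ForkFeed{} },
}

// getForkFeed fetches a feed from the pool (or allocates one if empty).
func getForkFeed() *ForkFeed { return forkFeedPool.Get().(*ForkFeed) }

// putForkFeed zeroes the feed before returning it, so stale data from a
// previous event can never leak into the next one.
func putForkFeed(f *ForkFeed) {
	*f = ForkFeed{}
	forkFeedPool.Put(f)
}

// handleFork fills a pooled feed, uses it, and returns it to the pool.
func handleFork(ts uint64, pid, tid int32) string {
	feed := getForkFeed()
	defer putForkFeed(feed)

	feed.TimeStamp, feed.Pid, feed.Tid = ts, pid, tid
	return fmt.Sprintf("fork pid=%d tid=%d ts=%d", feed.Pid, feed.Tid, feed.TimeStamp)
}

func main() {
	fmt.Println(handleFork(100, 10, 10))
}
```

The caller must not keep a reference to a feed after putting it back. Any consumer that needs the data beyond that point has to copy it first.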
The only ExitFeed fields handled by FeedFromExit() are TaskHash and TimeStamp, so this commit comments out the other fields, which the proctree does not use in this context.
Reuse the same TaskInfo reference, avoiding the need to take a lock to fetch it again. This also reorders the creation of the process and thread.
A mutex is a heavy lock, and it is not necessary for Thread concurrency control. This change replaces the mutex with atomic operations to reduce contention, which also reduces the memory footprint.
Introducing a signal pool helps to reduce dynamic stack growth and the number of allocations, which is good for performance.
geyslan force-pushed the improv-cpu-time branch from e4a750b to 5e3ad59 (January 17, 2025 17:34)
geyslan commented on Jan 19, 2025:
// This should never fail, as the translation used in eBPF relies on the same event definitions
- logger.Errorw("No syscall event with id %d", id)
+ logger.Errorw("No syscall event defined", "id", id)
Opportunistic fix.
Close: #4514
1. Explain what the PR does
Running a local proctree stressor (deterministic workload), we got these results:
Tracee flags:
-e sched_process_exec,sched_process_fork,sched_process_exit --proctree source=both --proctree process-cache=16384 --proctree thread-cache=32768 --proctree disable-procfs -o none
Stressor details:
8 threads with 2_000_000 ops each
314326e chore(cmd): add proctree disable-procfs
e4a750b perf(controlplane): introduce signal pool
06201c3 perf(proctree): improve Process concurrency ctrl
6ac1547 perf(proctree): change Thread concurrency control
05999ac perf(proctree): reduce lock contention
d5c51a1 chore(proctree): remove leftover
250140b chore/perf(proctree): comment out exit fields
8343a94 perf(proctree): introduce feed pools
7731e88 perf(proctree): move functions from FeedFromFork
52dfe08 perf(events): improve ArgVal
a72689e chore(events): add BenchmarkArgVal
5ef143d perf: improve procTreeExitProcessor
9462139 chore: add Benchmark_procTreeExitProcessor
1c53874 perf: remove unused ExecFeed interpreter fields
7c124c8 perf(controlplane): improve procTreeExecProcessor
b3c10d5 chore(controlplane): add Benchmark_procTreeExecProcessor
6d16d35 perf(ebpf): improve procTreeExecProcessor
d423806 chore(ebpf): add Benchmark_procTreeExecProcessor
23e0ecf perf(controlplane): improve procTreeForkProcessor
07016ee chore(controlplane): add procTreeForkProcessor bench
75ae14b perf(ebpf): improve procTreeForkProcessor
dd27132 chore(ebpf): add Benchmark_procTreeForkProcessor
582858e perf: reduce events.Core lock contention
d0b94a7 chore(bufferdecoder): set zero from def fields
592c167 chore(bufferdecode): add DecodeArguments benchmark
2. Explain how to test it
3. Other comments