-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve symbol resolution API #339
Conversation
✅ Deploy Preview for aya-rs ready!Built without sensitive environment variables
To edit notification comments on pull requests, go to your Netlify site settings. |
@dave-tucker I noticed this has some overlap with #329 - at least both changes implement parsing of |
596db35
to
63f4560
Compare
This change implements additional symbol resolving for stack traces, which improve the support for userspace stack traces. A challenge with the stack traces obtained via ebpf is that they just contain a raw address inside the virtual memory space of the process, which requires some extra steps to be translated into a function name. This change provides most of the infrastructure to provide the translation via a `DefaultResolver` type which can resolve both kernelspace and userspace functions based on a PID and address. In addition in adds a `SymbolResolver` trait that allows to customize resolving behavior and extend it further. I originally looked into just using the [StackTrace::resolve](https://docs.rs/aya/0.11.0/aya/maps/stack_trace/struct.StackTrace.html#method.resolve) function. However I noticed that the information that is at the moment stored in the StackTrace struct is not sufficient for resolving, e.g. due to missing the PID and being unable to account for the virtual -> physical memory address translation. Therefore this change uses a new resolver infrastructure. Usage via a suitable EBPF program that sends stack traces and process IDs: ```rust /// The BPF program populates this Queue with the process ID, kernel-space and user-space /// stack trace IDs. /// The former can be obtained via `bpf_get_current_pid_tgid()`, the stack /// traces via `aya_bpf::maps::StackTrace`. let mut stacks = Queue::<_, [u64; 3]>::try_from(bpf.map_mut("STACKS")?)?; let stack_traces = StackTraceMap::try_from(bpf.map_mut("STACK_TRACES")?)?; let resolver = DefaultResolver::new().unwrap(); loop { match stacks.pop(0) { Ok([pid_tgid, ktrace_id, utrace_id]) => { let tgid = pid_tgid & 0xFFFFFFFF; if let Ok(trace) = stack_traces.get(&(ktrace_id as u32), 0) { for f in trace.frames() { let mut symbol = SymbolInfo::unresolved_kernel(f.ip); resolver.resolve(&mut symbol); println!("Resolved kernel address: 0x{:x} to {:?}", f.ip, symbol); } } if let Ok(trace) = stack_traces.get(&(utrace_id as u32), 0) { for f in trace.frames() { let mut symbol = SymbolInfo::unresolved_user(tgid as _, f.ip); resolver.resolve(&mut symbol); info!("Resolved pid {}, address: 0x{:x} to {:?}", tgid, f.ip, symbol); } } } } } ``` BPF code: ```rust static STACK_TRACES: StackTrace = StackTrace::with_max_entries(10, 0); static STACKS: Queue<[u64; 3]> = Queue::with_max_entries(1024, 0); pub fn testprobe(ctx: ProbeContext) -> u32 { unsafe { let pid_tgid = bpf_get_current_pid_tgid(); let ustack = STACK_TRACES.get_stackid(&ctx, BPF_F_USER_STACK as _); let kstack = STACK_TRACES.get_stackid(&ctx, 0); match (kstack, ustack) { (Ok(kstack), Ok(ustack)) => { if let Err(e) = STACKS.push(&[pid_tgid, kstack as _, ustack as _], 0) { info!(&ctx, "Error pushing stack: {}", e); } }, _ => {} } 0 } } ``` Output when fetching stack traces of a kprobe on `sendmmsg` on a test program: ``` Resolved kernel address: 0xffffffff81a62691 to SymbolInfo { virtual_address: 18446744071589734033, object_address: Some(18446744071589734033), process_id: None, function_name: Some("__sys_sendmmsg"), object_path: None } Resolved kernel address: 0xffffffff81a62870 to SymbolInfo { virtual_address: 18446744071589734512, object_address: Some(18446744071589734512), process_id: None, function_name: Some("__x64_sys_sendmmsg"), object_path: None } Resolved kernel address: 0xffffffff81da30a3 to SymbolInfo { virtual_address: 18446744071593144483, object_address: Some(18446744071593144483), process_id: None, function_name: Some("do_syscall_64"), object_path: None } Resolved kernel address: 0xffffffff81e0007c to SymbolInfo { virtual_address: 18446744071593525372, object_address: Some(18446744071593525372), process_id: None, function_name: Some("entry_SYSCALL_64_after_hwframe"), object_path: None } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x7fccf3b2adee to SymbolInfo { virtual_address: 140518238629358, object_address: Some(1195502), process_id: Some(22158), function_name: None, object_path: Some("/usr/lib/x86_64-linux-gnu/libc-2.31.so") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd66cc to SymbolInfo { virtual_address: 94856245241548, object_address: Some(947916), process_id: Some(22158), function_name: Some("tokio::io::async_fd::AsyncFdReadyGuard<Inner>::try_io"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd43e6 to SymbolInfo { virtual_address: 94856245232614, object_address: Some(938982), process_id: Some(22158), function_name: Some("quinn_udp::imp::UdpSocket::poll_send"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570ccd31f to SymbolInfo { virtual_address: 94856245203743, object_address: Some(910111), process_id: Some(22158), function_name: Some("<quinn::endpoint::EndpointDriver as core::future::future::Future>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cc2400 to SymbolInfo { virtual_address: 94856245158912, object_address: Some(865280), process_id: Some(22158), function_name: Some("<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd0e8d to SymbolInfo { virtual_address: 94856245218957, object_address: Some(925325), process_id: Some(22158), function_name: Some("tokio::runtime::task::harness::poll_future"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd160a to SymbolInfo { virtual_address: 94856245220874, object_address: Some(927242), process_id: Some(22158), function_name: Some("tokio::runtime::task::harness::Harness<T,S>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c6ef25 to SymbolInfo { virtual_address: 94856244817701, object_address: Some(524069), process_id: Some(22158), function_name: Some("std::thread::local::LocalKey<T>::with"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7e505 to SymbolInfo { virtual_address: 94856244880645, object_address: Some(587013), process_id: Some(22158), function_name: Some("tokio::runtime::basic_scheduler::Context::run_task"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7b2fa to SymbolInfo { virtual_address: 94856244867834, object_address: Some(574202), process_id: Some(22158), function_name: Some("tokio::macros::scoped_tls::ScopedKey<T>::set"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7de7c to SymbolInfo { virtual_address: 94856244878972, object_address: Some(585340), process_id: Some(22158), function_name: Some("tokio::runtime::basic_scheduler::BasicScheduler::block_on"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c75a3d to SymbolInfo { virtual_address: 94856244845117, object_address: Some(551485), process_id: Some(22158), function_name: Some("tokio::runtime::Runtime::block_on"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c5c597 to SymbolInfo { virtual_address: 94856244741527, object_address: Some(447895), process_id: Some(22158), function_name: Some("std::sys_common::backtrace::__rust_begin_short_backtrace"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c59d44 to SymbolInfo { virtual_address: 94856244731204, object_address: Some(437572), process_id: Some(22158), function_name: Some("core::ops::function::FnOnce::call_once{{vtable.shim}}"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570ecc763 to SymbolInfo { virtual_address: 94856247297891, object_address: Some(3004259), process_id: Some(22158), function_name: Some("std::sys::unix::thread::Thread::new::thread_start"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } ```
Ping. I just rebased this on top of |
/// See [`DefaultResolver`] for exemplaric usage. | ||
pub trait SymbolResolver { | ||
/// Resolves a symbol based on it's address | ||
fn resolve(&self, symbol: &mut SymbolInfo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose this API with the intention of "you provide all the info you have, and the resolver fetches the rest". Alternatives could be:
fn resolve(&self, symbol: &SymbolInfo) -> Option<Symbolnfo>;
or
fn resolve(&self, symbol: &SymbolInfo) -> Vec<Symbolnfo>;
which don't mutate the initial info. But at the the former is also achievable with the current API, and cloning the sybol
before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this a bit confusing, wouldn't it be clearer if we had something like:
let user_resolver = FileResolver::new("/some/path")?;
let sym = user_resolver.resolve(123);
let user_resolver = PidResolver::new(42)?;
let sym = user_resolver.resolve(123);
let kernel_resolver = KernelResolver()?;
let sym = kernel_resolver.resolve(123)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment at https://github.com/aya-rs/aya/pull/339/files#r933884841 on what could be public. That could certainly also be done, but it means there's no common trait (and thereby also builtin mechnaism for composition/caching/etc) anymore. Also the question what sym
contains in those results. Is it always the same, or an individual result type per resolver?
In your example, I guess PidResolver
will not resolve a symbol but just translate the address? It is ProcMemMap
?
/// Resolves a symbol based on it's address. | ||
/// | ||
/// See [`DefaultResolver`] for exemplaric usage. | ||
pub trait SymbolResolver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably replace StackTrace::resolve
in a backward incompatible API change. It's much more powerful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agree we need a more powerful API
} | ||
|
||
/// A resolver for kernel symbols | ||
pub struct KernelSymbolResolver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once StackTrace::resolve
is removed, this resolver can also avoid kernel_symbols()
being a public function, since the resolver should be able to fulfill the whole job.
impl DefaultResolver { | ||
/// Create a new DefaultResolver | ||
pub fn new() -> Result<Self, CreateResolverError> { | ||
let caching_resolver = CachingResolver::with_capacity(8192, CombinedResolver::new()?); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong opinion about any of the LRU capacities in this change. I just thought it would be a good idea to limit the amount of metadata that the process loads - if this is used for long-running profiling.
.ok() | ||
}); | ||
|
||
// Return the last frame in the stack of frames, since the inlined functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now this part is an interesting design choice: Should the aya resolver return just the outermost frame? Or potentially all the frames? The latter might provide a bit more data for profilers that run on top of that, or for debugging purposes. But there's also more data produced - and the SymbolResolver
API would need to be changed to be able to return a stack of symbols. Not exactly sure how that API should look like, since just returning Vec<SymbolInfo>
might then have wrongly populated virtual addresses if we just copy&paste those. Maybe the API would need to be closer to resolve(&SymbolInfo) -> Vec<ResolvedSymbol>
where the latter doesn't care about PIDs and virtual addresses anymore?
Besides that producing more data obviously also has a cost.
/// The resolved function name | ||
function_name: Option<String>, | ||
/// Path to the object which defines the symbol | ||
object_path: Option<PathBuf>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those and potentially even the other fields will be shared a lot between symbols. Could be using Arc
s? But then OTOH all those strings should be relatively short and there might be more atomic refcount operations, which probably negates the benefit a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably too soon to think about this - ideally we would borrow from the underlying ELF parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this and apologies again for the delay in reviewing it!
We definitely need a better API to resolve symbols and once we settle on one we'll deprecate StackTrace::resolve
.
See my comments inline.
path::{Path, PathBuf}, | ||
}; | ||
|
||
use addr2line::{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you've noticed in the USDT PR, we use the object
crate to parse ELF and I think we should use it here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given addr2line
is a tiny wrapper around object
, what would be the win? It seems like it would just duplicate the functionality. Plus at least I don't have the domain knowledge about ELF files at the moment and time to dig into it, so if that's preferable someone else would need to look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I did not realize that :) It seems that it can do demangling too, and that's behind cargo features. I think we should default to having those features off so that we don't bring in more dependencies for people that are not going to implement profilers.
In any case we shouldn't re-export any symbols from addr2line
in the same way that we don't re-export anything from any dependencies.
|
||
/// Stores all information that was resolved for a specific symbol | ||
#[derive(Debug, Clone)] | ||
pub struct SymbolInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking, shouldn't we have different types for kernel and user symbols? Eg process_id
, object_address
and object_path
don't really apply to kernel symbols. Is there an advantage to having the same type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mostly there to have a common resolver interface. Also if someone wants to represent a full call stack covering kernel and application side, they could use Vec<SymbolInfo>
or Stack<SymbolInfo>
instead of having a hetereogenous list.
But sure, seperate types would also work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that the ebpf API can return complete user + stack traces. One would have to do some work to reassemble them, and at that point they could also convert to a unified interface.
/// The resolved function name | ||
function_name: Option<String>, | ||
/// Path to the object which defines the symbol | ||
object_path: Option<PathBuf>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably too soon to think about this - ideally we would borrow from the underlying ELF parser.
/// Resolves a symbol based on it's address. | ||
/// | ||
/// See [`DefaultResolver`] for exemplaric usage. | ||
pub trait SymbolResolver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agree we need a more powerful API
/// See [`DefaultResolver`] for exemplaric usage. | ||
pub trait SymbolResolver { | ||
/// Resolves a symbol based on it's address | ||
fn resolve(&self, symbol: &mut SymbolInfo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this a bit confusing, wouldn't it be clearer if we had something like:
let user_resolver = FileResolver::new("/some/path")?;
let sym = user_resolver.resolve(123);
let user_resolver = PidResolver::new(42)?;
let sym = user_resolver.resolve(123);
let kernel_resolver = KernelResolver()?;
let sym = kernel_resolver.resolve(123)
} | ||
|
||
/// A SymbolResolver which uses the `addr2line` Rust library | ||
pub struct Addr2LineResolver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably not the right level of abstraction. I'm not sure we should provide an API that can resolve from multiple pids and multiple object files. I think we should provide primitives and then let people could compose them as they wish. Eg, we should definitely provide the primitives to resolve from kernel, object files and pids - but should we provide a full blown API to implement a system wide profiler that can profile kernel and separate processes? I'm inclined to say no.
Generally speaking, I see aya as a high level rust wrapper for the primitives offered by the ebpf platform. Higher level abstractions that can be built using core primitives should ideally go in side crates. I do think that it would be really cool to provide a complete profiler API, but it should probably go in a aya-profiler
crate or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. Eg, we should definitely provide the primitives to resolve from kernel, object files and pids
I agree - that's why all basic resolvers are also individually available
but should we provide a full blown API to implement a system wide profiler that can profile kernel and separate processes? I'm inclined to say no.
I think yes - just because it requires a significant amount of domain knowledge to get a full stack trace for whatever values the BPF code returns otherwise. I certainly spent multiple hours on topics that are 100% unrelated to both eBPF and the application I wanted to look into before I got any useful results. I think eBPF application developers will often want to get a readable stack trace, but only the minority might be interested in how the innards of that work.
Also consider that those functions are not only interesting for profiler authors that might have a bit of a better background around call stacks - but also for all others. E.g. I might want to write a program for debugging purposes which logs the time spent on particular syscalls - and we then want to figure out which codepath actually triggered high latency. A simple resolver with a default constructor that allows to act on the address/pid values that the BPF code submits can do that job - and exactly that is available via DefaultResolver
.
I assume other tools like bpftrace
have the necessary code included for a similar reason.
It's however a fair question to ask which "intermediate" layer tools should be provided, and e.g. whether the struct Addr2LineResolver
with 2 layers of caching here really should be public - instead of being purely an implementation detail of DefaultResolver
(or DefaultUserspaceResolver` if you want to separate those).
If only building blocks and the high level resolver would be public, it would be:
KernelResolver
ProcMemMap
Addr2LineObjectContext
(potentially to be renamed)DefaultResolver
(potentially to be renamed)
And since most of those don't have common input/result types anymore, it might not be worthwhile to have the Resolver
trait anymore.
} | ||
} | ||
|
||
/// Parsed line for /proc/[pid]/maps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you've noticed in the other PR we already have proc maps parsing code we use to attach uprobes, and dave just moved it so it can be used to implement USDT probes too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. But that PR isn't merged yet - and it didn't even exist when I started writing this. My understanding is that this can be unified after whatever PR is merged first. Or even after both are merged in the form of a cleanup PR. I can remove the pub
annotations to avoid this being a breaking API change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have some proc maps parsing code here
aya/aya/src/programs/uprobe.rs
Line 192 in d968094
fn proc_maps_libs(pid: pid_t) -> Result<Vec<(String, String)>, io::Error> { |
I haven't commented on the details of the API yet, but yes we would definitely not make the proc maps code public, nor the addr2line re-exports or addr2line specific errors. The API should be small and self contained.
Thanks for tagging me, I have a lot of opinions on this topic! :) Symbol resolution can be quite a big topic, depending on how deep you want to go. I'm not sure if aya is the right place to offer a full symbol resolution API and choose which trade-offs to make; I'm thinking it might be better to offer just enough information so that absolute addresses can be mapped to relative address (along with the file path of the library), so that the user can call into a different crate to do the actual symbol resolution. I was already pondering whether I should make such a crate; I think this PR has convinced me that it would be worth doing. The following design questions come to mind:
Personally, I think I'd like aya to maintain each process's mappings over time, and have it tell me when libraries are loaded and unloaded, and when processes appear and disappear. I'll close with one piece of advice: When symbolicating stacks, make sure to treat return addresses specially, i.e. any non-topmost address in a call stack. If you get file+line information from addr2line, or if you get inlines, you need to make sure that the "lookup address" is shifted into the "call" instruction. By default, addresses in stacks are usually "return addresses", i.e. the address of the instruction which will be executed after the call returns. The common remedy is to just subtract one byte from all return addresses, i.e. ask addr2line to resolve |
Thanks for tagging me as well. I'm still catching up with my review requests. I'll have an in-depth look at this. |
As far as the #329 is concerned, the only similarity is that it requires more fields that are currently exposed from TL;DR: There should be no conflicts between this PR, and my one... and even if there were they should be trivial to resolve.
@mstange this ☝️ feels like the right approach to me.
I certainly feel like the first item would be ~easy to do. I'd already considered a lazy-init cache for proc/maps, and using something (inotify?) to keep it updated until the proc is closed. Adding a public API to register a callback when the cache is updated for a given PID would hopefully fulfil the requirement. I'm not sure how we'd handle "when process appear or disappear" without watching procfs for new pids... and how Aya would know which pids are interesting... 🤔 It might be easier to defer that to the user, and to let them register interesting PIDs with Aya via the same API as described above? |
When you call |
I tried taking a stab at resolving symbols here https://github.com/zz85/profile-bee/blob/main/profile-bee/src/symbols.rs The approach is similar to this PR but falls back to loading symbols from the ELF format. There's definitely still some rough edges (don't think it's handling relocations/offsets correctly) but putting it out here in case it's useful to anyone looking into symbol lookups |
Bringing this back up with #705: what's the preferred implementation here? We all agree on the need for a |
@Matthias247, this pull request is now in conflict and requires a rebase. |
@Matthias247, this pull request is now in conflict and requires a rebase. |
@Matthias247, this pull request is now in conflict and requires a rebase. |
Closing this since the consensus arrived at (see #740) is that symbol resolution for stack traces should exist outside of aya. |
This change implements additional symbol resolving for stack traces, which
improve the support for userspace stack traces.
A challenge with the stack traces obtained via ebpf is that they just contain
a raw address inside the virtual memory space of the process, which
requires some extra steps to be translated into a function name. This change
provides most of the infrastructure to provide the translation via a
DefaultResolver
type which can resolve both kernelspace and userspace functions based on
a PID and address. In addition in adds a
SymbolResolver
trait that allows to customizeresolving behavior and extend it further.
I originally looked into just using the StackTrace::resolve
function. However I noticed that the information that is at the moment stored
in the StackTrace struct is not sufficient for resolving, e.g. due to missing the PID
and being unable to account for the virtual -> physical memory address translation.
Therefore this change uses a new resolver infrastructure.
Usage via a suitable EBPF program that sends stack traces and process IDs:
BPF code:
Output when fetching stack traces of a kprobe on
sendmmsg
on a test program:This change isdata:image/s3,"s3://crabby-images/d0bb7/d0bb7f7625ca5bf5c3cf7a2b7a514cf841ab8395" alt="Reviewable"