Improve symbol resolution API #339

Matthias247 · 2022-07-04T02:49:15Z

This change implements additional symbol resolving for stack traces, which
improve the support for userspace stack traces.

A challenge with the stack traces obtained via ebpf is that they just contain
a raw address inside the virtual memory space of the process, which
requires some extra steps to be translated into a function name. This change
provides most of the infrastructure to provide the translation via a DefaultResolver
type which can resolve both kernelspace and userspace functions based on
a PID and address. In addition in adds a SymbolResolver trait that allows to customize
resolving behavior and extend it further.

I originally looked into just using the StackTrace::resolve
function. However I noticed that the information that is at the moment stored
in the StackTrace struct is not sufficient for resolving, e.g. due to missing the PID
and being unable to account for the virtual -> physical memory address translation.
Therefore this change uses a new resolver infrastructure.

Usage via a suitable EBPF program that sends stack traces and process IDs:

/// The BPF program populates this Queue with the process ID, kernel-space and user-space
/// stack trace IDs.
/// The former can be obtained via `bpf_get_current_pid_tgid()`, the stack
/// traces via `aya_bpf::maps::StackTrace`.
let mut stacks = Queue::<_, [u64; 3]>::try_from(bpf.map_mut("STACKS")?)?;
let stack_traces = StackTraceMap::try_from(bpf.map_mut("STACK_TRACES")?)?;
let resolver = DefaultResolver::new().unwrap();

loop {
    match stacks.pop(0) {
        Ok([pid_tgid, ktrace_id, utrace_id]) => {
            let tgid = pid_tgid & 0xFFFFFFFF;

            if let Ok(trace) = stack_traces.get(&(ktrace_id as u32), 0) {
                for f in trace.frames() {
                    let mut symbol = SymbolInfo::unresolved_kernel(f.ip);
                    resolver.resolve(&mut symbol);
                    println!("Resolved kernel address: 0x{:x} to {:?}", f.ip, symbol);
                }
            }

            if let Ok(trace) = stack_traces.get(&(utrace_id as u32), 0) {
                for f in trace.frames() {
                    let mut symbol = SymbolInfo::unresolved_user(tgid as _, f.ip);
                    resolver.resolve(&mut symbol);
                    info!("Resolved pid {}, address: 0x{:x} to {:?}", tgid, f.ip, symbol);
                }
            }
        }
    }
}

BPF code:

static STACK_TRACES: StackTrace = StackTrace::with_max_entries(10, 0);

static STACKS: Queue<[u64; 3]> = Queue::with_max_entries(1024, 0);

pub fn testprobe(ctx: ProbeContext) -> u32 {
    unsafe {
        let pid_tgid = bpf_get_current_pid_tgid();
        let ustack = STACK_TRACES.get_stackid(&ctx, BPF_F_USER_STACK as _);
        let kstack = STACK_TRACES.get_stackid(&ctx, 0);

        match (kstack, ustack) {
            (Ok(kstack), Ok(ustack)) => {
                if let Err(e) = STACKS.push(&[pid_tgid, kstack as _, ustack as _], 0) {
                    info!(&ctx, "Error pushing stack: {}", e);
                }
            },
            _ => {}
        }

        0
    }
}

Output when fetching stack traces of a kprobe on sendmmsg on a test program:

Resolved kernel address: 0xffffffff81a62691 to SymbolInfo { virtual_address: 18446744071589734033, object_address: Some(18446744071589734033), process_id: None, function_name: Some("__sys_sendmmsg"), object_path: None }
Resolved kernel address: 0xffffffff81a62870 to SymbolInfo { virtual_address: 18446744071589734512, object_address: Some(18446744071589734512), process_id: None, function_name: Some("__x64_sys_sendmmsg"), object_path: None }
Resolved kernel address: 0xffffffff81da30a3 to SymbolInfo { virtual_address: 18446744071593144483, object_address: Some(18446744071593144483), process_id: None, function_name: Some("do_syscall_64"), object_path: None }
Resolved kernel address: 0xffffffff81e0007c to SymbolInfo { virtual_address: 18446744071593525372, object_address: Some(18446744071593525372), process_id: None, function_name: Some("entry_SYSCALL_64_after_hwframe"), object_path: None }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x7fccf3b2adee to SymbolInfo { virtual_address: 140518238629358, object_address: Some(1195502), process_id: Some(22158), function_name: None, object_path: Some("/usr/lib/x86_64-linux-gnu/libc-2.31.so") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd66cc to SymbolInfo { virtual_address: 94856245241548, object_address: Some(947916), process_id: Some(22158), function_name: Some("tokio::io::async_fd::AsyncFdReadyGuard<Inner>::try_io"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd43e6 to SymbolInfo { virtual_address: 94856245232614, object_address: Some(938982), process_id: Some(22158), function_name: Some("quinn_udp::imp::UdpSocket::poll_send"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570ccd31f to SymbolInfo { virtual_address: 94856245203743, object_address: Some(910111), process_id: Some(22158), function_name: Some("<quinn::endpoint::EndpointDriver as core::future::future::Future>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cc2400 to SymbolInfo { virtual_address: 94856245158912, object_address: Some(865280), process_id: Some(22158), function_name: Some("<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd0e8d to SymbolInfo { virtual_address: 94856245218957, object_address: Some(925325), process_id: Some(22158), function_name: Some("tokio::runtime::task::harness::poll_future"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd160a to SymbolInfo { virtual_address: 94856245220874, object_address: Some(927242), process_id: Some(22158), function_name: Some("tokio::runtime::task::harness::Harness<T,S>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c6ef25 to SymbolInfo { virtual_address: 94856244817701, object_address: Some(524069), process_id: Some(22158), function_name: Some("std::thread::local::LocalKey<T>::with"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7e505 to SymbolInfo { virtual_address: 94856244880645, object_address: Some(587013), process_id: Some(22158), function_name: Some("tokio::runtime::basic_scheduler::Context::run_task"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7b2fa to SymbolInfo { virtual_address: 94856244867834, object_address: Some(574202), process_id: Some(22158), function_name: Some("tokio::macros::scoped_tls::ScopedKey<T>::set"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7de7c to SymbolInfo { virtual_address: 94856244878972, object_address: Some(585340), process_id: Some(22158), function_name: Some("tokio::runtime::basic_scheduler::BasicScheduler::block_on"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c75a3d to SymbolInfo { virtual_address: 94856244845117, object_address: Some(551485), process_id: Some(22158), function_name: Some("tokio::runtime::Runtime::block_on"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c5c597 to SymbolInfo { virtual_address: 94856244741527, object_address: Some(447895), process_id: Some(22158), function_name: Some("std::sys_common::backtrace::__rust_begin_short_backtrace"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c59d44 to SymbolInfo { virtual_address: 94856244731204, object_address: Some(437572), process_id: Some(22158), function_name: Some("core::ops::function::FnOnce::call_once{{vtable.shim}}"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }
02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570ecc763 to SymbolInfo { virtual_address: 94856247297891, object_address: Some(3004259), process_id: Some(22158), function_name: Some("std::sys::unix::thread::Thread::new::thread_start"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") }

This change is

netlify · 2022-07-04T02:49:30Z

✅ Deploy Preview for aya-rs ready!

Built without sensitive environment variables

Name	Link
🔨 Latest commit	`9ee2776`
🔍 Latest deploy log	https://app.netlify.com/sites/aya-rs/deploys/62e572985cde7c000807bb42
😎 Deploy Preview	https://deploy-preview-339--aya-rs.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

Matthias247 · 2022-07-04T02:57:30Z

@dave-tucker I noticed this has some overlap with #329 - at least both changes implement parsing of /proc/[pid]/maps - which can probably be in a single place. Maybe also with symbol resolving, since I think a part of that is also implemented in #329, but in a more low level fashion than here (this PR just asks addr2line instead of going more low level). I really just wanted to get some basic stack traces working, and used the tools that seemed to require the least effort for this. I don't have a ton of experience with ELF files and no strong opinion on how it should look like :-)

This change implements additional symbol resolving for stack traces, which improve the support for userspace stack traces. A challenge with the stack traces obtained via ebpf is that they just contain a raw address inside the virtual memory space of the process, which requires some extra steps to be translated into a function name. This change provides most of the infrastructure to provide the translation via a `DefaultResolver` type which can resolve both kernelspace and userspace functions based on a PID and address. In addition in adds a `SymbolResolver` trait that allows to customize resolving behavior and extend it further. I originally looked into just using the [StackTrace::resolve](https://docs.rs/aya/0.11.0/aya/maps/stack_trace/struct.StackTrace.html#method.resolve) function. However I noticed that the information that is at the moment stored in the StackTrace struct is not sufficient for resolving, e.g. due to missing the PID and being unable to account for the virtual -> physical memory address translation. Therefore this change uses a new resolver infrastructure. Usage via a suitable EBPF program that sends stack traces and process IDs: ```rust /// The BPF program populates this Queue with the process ID, kernel-space and user-space /// stack trace IDs. /// The former can be obtained via `bpf_get_current_pid_tgid()`, the stack /// traces via `aya_bpf::maps::StackTrace`. let mut stacks = Queue::<_, [u64; 3]>::try_from(bpf.map_mut("STACKS")?)?; let stack_traces = StackTraceMap::try_from(bpf.map_mut("STACK_TRACES")?)?; let resolver = DefaultResolver::new().unwrap(); loop { match stacks.pop(0) { Ok([pid_tgid, ktrace_id, utrace_id]) => { let tgid = pid_tgid & 0xFFFFFFFF; if let Ok(trace) = stack_traces.get(&(ktrace_id as u32), 0) { for f in trace.frames() { let mut symbol = SymbolInfo::unresolved_kernel(f.ip); resolver.resolve(&mut symbol); println!("Resolved kernel address: 0x{:x} to {:?}", f.ip, symbol); } } if let Ok(trace) = stack_traces.get(&(utrace_id as u32), 0) { for f in trace.frames() { let mut symbol = SymbolInfo::unresolved_user(tgid as _, f.ip); resolver.resolve(&mut symbol); info!("Resolved pid {}, address: 0x{:x} to {:?}", tgid, f.ip, symbol); } } } } } ``` BPF code: ```rust static STACK_TRACES: StackTrace = StackTrace::with_max_entries(10, 0); static STACKS: Queue<[u64; 3]> = Queue::with_max_entries(1024, 0); pub fn testprobe(ctx: ProbeContext) -> u32 { unsafe { let pid_tgid = bpf_get_current_pid_tgid(); let ustack = STACK_TRACES.get_stackid(&ctx, BPF_F_USER_STACK as _); let kstack = STACK_TRACES.get_stackid(&ctx, 0); match (kstack, ustack) { (Ok(kstack), Ok(ustack)) => { if let Err(e) = STACKS.push(&[pid_tgid, kstack as _, ustack as _], 0) { info!(&ctx, "Error pushing stack: {}", e); } }, _ => {} } 0 } } ``` Output when fetching stack traces of a kprobe on `sendmmsg` on a test program: ``` Resolved kernel address: 0xffffffff81a62691 to SymbolInfo { virtual_address: 18446744071589734033, object_address: Some(18446744071589734033), process_id: None, function_name: Some("__sys_sendmmsg"), object_path: None } Resolved kernel address: 0xffffffff81a62870 to SymbolInfo { virtual_address: 18446744071589734512, object_address: Some(18446744071589734512), process_id: None, function_name: Some("__x64_sys_sendmmsg"), object_path: None } Resolved kernel address: 0xffffffff81da30a3 to SymbolInfo { virtual_address: 18446744071593144483, object_address: Some(18446744071593144483), process_id: None, function_name: Some("do_syscall_64"), object_path: None } Resolved kernel address: 0xffffffff81e0007c to SymbolInfo { virtual_address: 18446744071593525372, object_address: Some(18446744071593525372), process_id: None, function_name: Some("entry_SYSCALL_64_after_hwframe"), object_path: None } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x7fccf3b2adee to SymbolInfo { virtual_address: 140518238629358, object_address: Some(1195502), process_id: Some(22158), function_name: None, object_path: Some("/usr/lib/x86_64-linux-gnu/libc-2.31.so") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd66cc to SymbolInfo { virtual_address: 94856245241548, object_address: Some(947916), process_id: Some(22158), function_name: Some("tokio::io::async_fd::AsyncFdReadyGuard<Inner>::try_io"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd43e6 to SymbolInfo { virtual_address: 94856245232614, object_address: Some(938982), process_id: Some(22158), function_name: Some("quinn_udp::imp::UdpSocket::poll_send"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570ccd31f to SymbolInfo { virtual_address: 94856245203743, object_address: Some(910111), process_id: Some(22158), function_name: Some("<quinn::endpoint::EndpointDriver as core::future::future::Future>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cc2400 to SymbolInfo { virtual_address: 94856245158912, object_address: Some(865280), process_id: Some(22158), function_name: Some("<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd0e8d to SymbolInfo { virtual_address: 94856245218957, object_address: Some(925325), process_id: Some(22158), function_name: Some("tokio::runtime::task::harness::poll_future"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570cd160a to SymbolInfo { virtual_address: 94856245220874, object_address: Some(927242), process_id: Some(22158), function_name: Some("tokio::runtime::task::harness::Harness<T,S>::poll"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c6ef25 to SymbolInfo { virtual_address: 94856244817701, object_address: Some(524069), process_id: Some(22158), function_name: Some("std::thread::local::LocalKey<T>::with"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7e505 to SymbolInfo { virtual_address: 94856244880645, object_address: Some(587013), process_id: Some(22158), function_name: Some("tokio::runtime::basic_scheduler::Context::run_task"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7b2fa to SymbolInfo { virtual_address: 94856244867834, object_address: Some(574202), process_id: Some(22158), function_name: Some("tokio::macros::scoped_tls::ScopedKey<T>::set"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c7de7c to SymbolInfo { virtual_address: 94856244878972, object_address: Some(585340), process_id: Some(22158), function_name: Some("tokio::runtime::basic_scheduler::BasicScheduler::block_on"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c75a3d to SymbolInfo { virtual_address: 94856244845117, object_address: Some(551485), process_id: Some(22158), function_name: Some("tokio::runtime::Runtime::block_on"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c5c597 to SymbolInfo { virtual_address: 94856244741527, object_address: Some(447895), process_id: Some(22158), function_name: Some("std::sys_common::backtrace::__rust_begin_short_backtrace"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570c59d44 to SymbolInfo { virtual_address: 94856244731204, object_address: Some(437572), process_id: Some(22158), function_name: Some("core::ops::function::FnOnce::call_once{{vtable.shim}}"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } 02:46:16 [INFO] ayatest: [ayatest/src/main.rs:98] Resolved pid 22158, address: 0x564570ecc763 to SymbolInfo { virtual_address: 94856247297891, object_address: Some(3004259), process_id: Some(22158), function_name: Some("std::sys::unix::thread::Thread::new::thread_start"), object_path: Some("/mnt/c/Users/matth/Code/rust/quinn/target/release/bulk") } ```

Matthias247 · 2022-07-30T18:09:45Z

Ping. I just rebased this on top of main. I will also add a few comments on my thoughts

Matthias247 · 2022-07-30T18:12:12Z

aya/src/symbols.rs

+/// See [`DefaultResolver`] for exemplaric usage.
+pub trait SymbolResolver {
+    /// Resolves a symbol based on it's address
+    fn resolve(&self, symbol: &mut SymbolInfo);


I chose this API with the intention of "you provide all the info you have, and the resolver fetches the rest". Alternatives could be:

fn resolve(&self, symbol: &SymbolInfo) -> Option<Symbolnfo>;

or

fn resolve(&self, symbol: &SymbolInfo) -> Vec<Symbolnfo>;

which don't mutate the initial info. But at the the former is also achievable with the current API, and cloning the sybol before.

I find this a bit confusing, wouldn't it be clearer if we had something like:

let user_resolver = FileResolver::new("/some/path")?; let sym = user_resolver.resolve(123); let user_resolver = PidResolver::new(42)?; let sym = user_resolver.resolve(123); let kernel_resolver = KernelResolver()?; let sym = kernel_resolver.resolve(123)

See the comment at https://github.com/aya-rs/aya/pull/339/files#r933884841 on what could be public. That could certainly also be done, but it means there's no common trait (and thereby also builtin mechnaism for composition/caching/etc) anymore. Also the question what sym contains in those results. Is it always the same, or an individual result type per resolver?

In your example, I guess PidResolver will not resolve a symbol but just translate the address? It is ProcMemMap?

Matthias247 · 2022-07-30T18:13:39Z

aya/src/symbols.rs

+/// Resolves a symbol based on it's address.
+///
+/// See [`DefaultResolver`] for exemplaric usage.
+pub trait SymbolResolver {


This should probably replace StackTrace::resolve in a backward incompatible API change. It's much more powerful.

We agree we need a more powerful API

Matthias247 · 2022-07-30T18:14:24Z

aya/src/symbols.rs

+}
+
+/// A resolver for kernel symbols
+pub struct KernelSymbolResolver {


Once StackTrace::resolve is removed, this resolver can also avoid kernel_symbols() being a public function, since the resolver should be able to fulfill the whole job.

Matthias247 · 2022-07-30T18:17:45Z

aya/src/symbols.rs

+impl DefaultResolver {
+    /// Create a new DefaultResolver
+    pub fn new() -> Result<Self, CreateResolverError> {
+        let caching_resolver = CachingResolver::with_capacity(8192, CombinedResolver::new()?);


I have no strong opinion about any of the LRU capacities in this change. I just thought it would be a good idea to limit the amount of metadata that the process loads - if this is used for long-running profiling.

Matthias247 · 2022-07-30T18:22:54Z

aya/src/symbols.rs

+                            .ok()
+                    });
+
+                    // Return the last frame in the stack of frames, since the inlined functions


Now this part is an interesting design choice: Should the aya resolver return just the outermost frame? Or potentially all the frames? The latter might provide a bit more data for profilers that run on top of that, or for debugging purposes. But there's also more data produced - and the SymbolResolver API would need to be changed to be able to return a stack of symbols. Not exactly sure how that API should look like, since just returning Vec<SymbolInfo> might then have wrongly populated virtual addresses if we just copy&paste those. Maybe the API would need to be closer to resolve(&SymbolInfo) -> Vec<ResolvedSymbol> where the latter doesn't care about PIDs and virtual addresses anymore?

Besides that producing more data obviously also has a cost.

Matthias247 · 2022-07-30T18:24:53Z

aya/src/symbols.rs

+    /// The resolved function name
+    function_name: Option<String>,
+    /// Path to the object which defines the symbol
+    object_path: Option<PathBuf>,


those and potentially even the other fields will be shared a lot between symbols. Could be using Arcs? But then OTOH all those strings should be relatively short and there might be more atomic refcount operations, which probably negates the benefit a bit.

I think it's probably too soon to think about this - ideally we would borrow from the underlying ELF parser.

alessandrod

Thanks for doing this and apologies again for the delay in reviewing it!

We definitely need a better API to resolve symbols and once we settle on one we'll deprecate StackTrace::resolve.

See my comments inline.

alessandrod · 2022-07-30T19:06:56Z

aya/src/symbols.rs

+    path::{Path, PathBuf},
+};
+
+use addr2line::{


As you've noticed in the USDT PR, we use the object crate to parse ELF and I think we should use it here too

Given addr2line is a tiny wrapper around object, what would be the win? It seems like it would just duplicate the functionality. Plus at least I don't have the domain knowledge about ELF files at the moment and time to dig into it, so if that's preferable someone else would need to look into it.

Oh, I did not realize that :) It seems that it can do demangling too, and that's behind cargo features. I think we should default to having those features off so that we don't bring in more dependencies for people that are not going to implement profilers.

In any case we shouldn't re-export any symbols from addr2line in the same way that we don't re-export anything from any dependencies.

alessandrod · 2022-07-30T19:10:10Z

aya/src/symbols.rs

+
+/// Stores all information that was resolved for a specific symbol
+#[derive(Debug, Clone)]
+pub struct SymbolInfo {


I'm thinking, shouldn't we have different types for kernel and user symbols? Eg process_id, object_address and object_path don't really apply to kernel symbols. Is there an advantage to having the same type?

It's mostly there to have a common resolver interface. Also if someone wants to represent a full call stack covering kernel and application side, they could use Vec<SymbolInfo> or Stack<SymbolInfo> instead of having a hetereogenous list.

But sure, seperate types would also work.

I don't think that the ebpf API can return complete user + stack traces. One would have to do some work to reassemble them, and at that point they could also convert to a unified interface.

alessandrod · 2022-07-30T19:10:56Z

aya/src/symbols.rs

+    /// The resolved function name
+    function_name: Option<String>,
+    /// Path to the object which defines the symbol
+    object_path: Option<PathBuf>,


I think it's probably too soon to think about this - ideally we would borrow from the underlying ELF parser.

alessandrod · 2022-07-30T19:11:19Z

aya/src/symbols.rs

+/// Resolves a symbol based on it's address.
+///
+/// See [`DefaultResolver`] for exemplaric usage.
+pub trait SymbolResolver {


We agree we need a more powerful API

alessandrod · 2022-07-30T19:15:45Z

aya/src/symbols.rs

+/// See [`DefaultResolver`] for exemplaric usage.
+pub trait SymbolResolver {
+    /// Resolves a symbol based on it's address
+    fn resolve(&self, symbol: &mut SymbolInfo);


I find this a bit confusing, wouldn't it be clearer if we had something like:

let user_resolver = FileResolver::new("/some/path")?; let sym = user_resolver.resolve(123); let user_resolver = PidResolver::new(42)?; let sym = user_resolver.resolve(123); let kernel_resolver = KernelResolver()?; let sym = kernel_resolver.resolve(123)

alessandrod · 2022-07-30T19:23:52Z

aya/src/symbols.rs

+}
+
+/// A SymbolResolver which uses the `addr2line` Rust library
+pub struct Addr2LineResolver {


I think this is probably not the right level of abstraction. I'm not sure we should provide an API that can resolve from multiple pids and multiple object files. I think we should provide primitives and then let people could compose them as they wish. Eg, we should definitely provide the primitives to resolve from kernel, object files and pids - but should we provide a full blown API to implement a system wide profiler that can profile kernel and separate processes? I'm inclined to say no.

Generally speaking, I see aya as a high level rust wrapper for the primitives offered by the ebpf platform. Higher level abstractions that can be built using core primitives should ideally go in side crates. I do think that it would be really cool to provide a complete profiler API, but it should probably go in a aya-profiler crate or something.

. Eg, we should definitely provide the primitives to resolve from kernel, object files and pids

I agree - that's why all basic resolvers are also individually available

but should we provide a full blown API to implement a system wide profiler that can profile kernel and separate processes? I'm inclined to say no.

I think yes - just because it requires a significant amount of domain knowledge to get a full stack trace for whatever values the BPF code returns otherwise. I certainly spent multiple hours on topics that are 100% unrelated to both eBPF and the application I wanted to look into before I got any useful results. I think eBPF application developers will often want to get a readable stack trace, but only the minority might be interested in how the innards of that work.

Also consider that those functions are not only interesting for profiler authors that might have a bit of a better background around call stacks - but also for all others. E.g. I might want to write a program for debugging purposes which logs the time spent on particular syscalls - and we then want to figure out which codepath actually triggered high latency. A simple resolver with a default constructor that allows to act on the address/pid values that the BPF code submits can do that job - and exactly that is available via DefaultResolver.

I assume other tools like bpftrace have the necessary code included for a similar reason.

It's however a fair question to ask which "intermediate" layer tools should be provided, and e.g. whether the struct Addr2LineResolver with 2 layers of caching here really should be public - instead of being purely an implementation detail of DefaultResolver (or DefaultUserspaceResolver` if you want to separate those).

If only building blocks and the high level resolver would be public, it would be:

KernelResolver

ProcMemMap

Addr2LineObjectContext (potentially to be renamed)

DefaultResolver (potentially to be renamed)

And since most of those don't have common input/result types anymore, it might not be worthwhile to have the Resolver trait anymore.

alessandrod · 2022-07-30T19:25:29Z

aya/src/symbols.rs

+    }
+}
+
+/// Parsed line for /proc/[pid]/maps


As you've noticed in the other PR we already have proc maps parsing code we use to attach uprobes, and dave just moved it so it can be used to implement USDT probes too.

Sure. But that PR isn't merged yet - and it didn't even exist when I started writing this. My understanding is that this can be unified after whatever PR is merged first. Or even after both are merged in the form of a cleanup PR. I can remove the pub annotations to avoid this being a breaking API change.

We already have some proc maps parsing code here

aya/aya/src/programs/uprobe.rs

Line 192 in d968094

fn proc_maps_libs(pid: pid_t) -> Result<Vec<(String, String)>, io::Error> {

. Whatever PR is merged first (I haven't reviewed the USDT PR yet), the code should be shared.

I haven't commented on the details of the API yet, but yes we would definitely not make the proc maps code public, nor the addr2line re-exports or addr2line specific errors. The API should be small and self contained.

alessandrod · 2022-07-30T19:35:57Z

@mstange @kakkoyun you've built profilers with aya, tagging you in case you have any inputs

mstange · 2022-07-30T20:40:53Z

Thanks for tagging me, I have a lot of opinions on this topic! :)

Symbol resolution can be quite a big topic, depending on how deep you want to go. I'm not sure if aya is the right place to offer a full symbol resolution API and choose which trade-offs to make; I'm thinking it might be better to offer just enough information so that absolute addresses can be mapped to relative address (along with the file path of the library), so that the user can call into a different crate to do the actual symbol resolution. I was already pondering whether I should make such a crate; I think this PR has convinced me that it would be worth doing.

The following design questions come to mind:

When do you read proc maps? Whenever symbols are requested, or do you maintain a cached representation? Do you handle libraries being loaded and unloaded during program observation? Perf does the following: It reads proc maps once at the start of recording, and then sets perf_event_attr.mmap2 = true so that it receives PERF_RECORD_MMAP2 records for any newly loaded libraries. Then it can map absolute addresses to the right library at any point during recording.
Which type of "relative address" do you store? When you map an absolute address to a relative address, there are two types of relative addresses you might choose: file offsets or "stated virtual memory addresses" (SVMAs). The former is easy to compute from the proc maps info, but the latter is what addr2line wants. This is a source of bugs because, for many libraries, the two end up being the same, so you don't notice the discrepancy until you run into a library in which the two are different. For example, libxul.so (from Firefox) is one where the two are different; libxul.so's .text section has a virtual memory address which is different from the section's file offset. Perf stores relative addresses as file offsets (it calls them "rip", i.e. "relative instruction pointer") and then resolves those to SVMAs (which it calls "objdump addresses") only whenever it calls out to addr2line / objdump. Samply stores SVMAs; it has to look at the library's section information before it can compute any relative addresses.
Where do you look for symbol information? Only in the actual loaded binary, or do you search other places as well? Do you support split dwarf? Do you use debuginfod to pull additional debug files from the web?
How do you handle libraries which have incomplete symbol information? You may want to generate placeholder symbols based on function start addresses from unwind information. If you don't do this, you may end up with many misleading function names for unrelated addresses.
Do you handle proc mountinfo information? I think this is needed when observing ubuntu snap builds such as the snap Firefox build. Those load many libraries from the snap file system, so you can't just trust the paths in the proc maps.

Personally, I think I'd like aya to maintain each process's mappings over time, and have it tell me when libraries are loaded and unloaded, and when processes appear and disappear.

I'll close with one piece of advice: When symbolicating stacks, make sure to treat return addresses specially, i.e. any non-topmost address in a call stack. If you get file+line information from addr2line, or if you get inlines, you need to make sure that the "lookup address" is shifted into the "call" instruction. By default, addresses in stacks are usually "return addresses", i.e. the address of the instruction which will be executed after the call returns. The common remedy is to just subtract one byte from all return addresses, i.e. ask addr2line to resolve return_address - 1. If you don't do this, you can end up with wrong line numbers and wrong inline stacks.
Oh, also, please don't discard inline stack data. Yes, the callee-most function of an inline stack on its own is usually not very useful, but the outer function on its own is often not very useful either. For best results you really want to keep the entire inline stack for each address.

kakkoyun · 2022-08-01T11:08:05Z

Thanks for tagging me as well. I'm still catching up with my review requests. I'll have an in-depth look at this.

dave-tucker · 2022-08-02T11:30:01Z

As far as the #329 is concerned, the only similarity is that it requires more fields that are currently exposed from /proc/$pid/maps - we need the VMAs for find_by_offset, which yields the proc map entry for a given relative address, so you can compute an absolute address. All ELF parsing is related to the special USDT notes format, which contains the instruction pointer(s) used to attach to. No symbol resolution is required.

TL;DR: There should be no conflicts between this PR, and my one... and even if there were they should be trivial to resolve.

Symbol resolution can be quite a big topic, depending on how deep you want to go. I'm not sure if aya is the right place to offer a full symbol resolution API and choose which trade-offs to make; I'm thinking it might be better to offer just enough information so that absolute addresses can be mapped to relative address (along with the file path of the library), so that the user can call into a different crate to do the actual symbol resolution. I was already pondering whether I should make such a crate; I think this PR has convinced me that it would be worth doing.

@mstange this ☝️ feels like the right approach to me.
I know very little about stack traces and symbol resolution, but given my recent experience with /proc/$pid/maps I'll weigh in on a few points below...

Personally, I think I'd like aya to maintain each process's mappings over time, and have it tell me when libraries are loaded and unloaded, and when processes appear and disappear..

I certainly feel like the first item would be ~easy to do. I'd already considered a lazy-init cache for proc/maps, and using something (inotify?) to keep it updated until the proc is closed. Adding a public API to register a callback when the cache is updated for a given PID would hopefully fulfil the requirement.

I'm not sure how we'd handle "when process appear or disappear" without watching procfs for new pids... and how Aya would know which pids are interesting... 🤔 It might be easier to defer that to the user, and to let them register interesting PIDs with Aya via the same API as described above?

mstange · 2022-08-02T15:38:05Z

When you call perf_event_open, you can request event records for new processes, i.e. "task" (fork) records. And you can also request "comm" records for process renames / execs. These records are written into the perf event's mmap buffer, but I'm not really sure how this buffer interacts with the buffer that has the PerfEventArray output - or if that's even the exact same buffer.
Here's some code which requests these records: https://github.com/mstange/samply/blob/da71321c597d4499730976923a7d0695147030b7/samply/src/linux/perf_event.rs#L313-L315
You can parse these records with https://docs.rs/linux-perf-event-reader/.

zz85 · 2022-10-18T06:25:56Z

I tried taking a stab at resolving symbols here https://github.com/zz85/profile-bee/blob/main/profile-bee/src/symbols.rs

The approach is similar to this PR but falls back to loading symbols from the ELF format. There's definitely still some rough edges (don't think it's handling relocations/offsets correctly) but putting it out here in case it's useful to anyone looking into symbol lookups

addisoncrump · 2023-08-01T21:09:29Z

Bringing this back up with #705: what's the preferred implementation here? We all agree on the need for a SymbolResolver trait, but no consensus. Is the memory mapping monitor popular enough to justify a sample implementation? (I note that it particularly misses cases where the executable starts before our program is loaded)

mergify · 2023-09-14T23:51:00Z

@Matthias247, this pull request is now in conflict and requires a rebase.

mergify · 2024-02-06T12:36:14Z

@Matthias247, this pull request is now in conflict and requires a rebase.

mergify · 2025-01-01T16:41:12Z

@Matthias247, this pull request is now in conflict and requires a rebase.

dave-tucker · 2025-01-31T14:42:27Z

Closing this since the consensus arrived at (see #740) is that symbol resolution for stack traces should exist outside of aya.

Matthias247 force-pushed the symbols branch 2 times, most recently from 596db35 to 63f4560 Compare July 4, 2022 18:02

Matthias247 force-pushed the symbols branch from 63f4560 to 9ee2776 Compare July 30, 2022 18:04

Matthias247 commented Jul 30, 2022

View reviewed changes

alessandrod requested changes Jul 30, 2022

View reviewed changes

alessandrod changed the title ~~Userspace symbol resolving via addr2line~~ Improve symbol resolution API Jul 30, 2022

addisoncrump mentioned this pull request Aug 1, 2023

Segment out SymbolResolver to a trait #715

Merged

mergify bot added the aya This is about aya (userspace) label Sep 14, 2023

mergify bot added the needs-rebase label Sep 14, 2023

mergify bot removed the needs-rebase label Jan 1, 2025

mergify bot added the needs-rebase label Jan 1, 2025

dave-tucker closed this Jan 31, 2025

mergify bot removed the needs-rebase label Jan 31, 2025

Improve symbol resolution API #339

Improve symbol resolution API #339

Conversation

Matthias247 commented Jul 4, 2022 • edited by dave-tucker Loading

netlify bot commented Jul 4, 2022 • edited Loading

✅ Deploy Preview for aya-rs ready!

Matthias247 commented Jul 4, 2022

Matthias247 commented Jul 30, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alessandrod left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alessandrod commented Jul 30, 2022

mstange commented Jul 30, 2022 • edited Loading

kakkoyun commented Aug 1, 2022

dave-tucker commented Aug 2, 2022

mstange commented Aug 2, 2022

zz85 commented Oct 18, 2022

addisoncrump commented Aug 1, 2023 • edited Loading

mergify bot commented Sep 14, 2023

mergify bot commented Feb 6, 2024

mergify bot commented Jan 1, 2025

dave-tucker commented Jan 31, 2025

Matthias247 commented Jul 4, 2022 •

edited by dave-tucker

Loading

netlify bot commented Jul 4, 2022 •

edited

Loading

mstange commented Jul 30, 2022 •

edited

Loading

addisoncrump commented Aug 1, 2023 •

edited

Loading