feat(aya): Add task storage map type (in the user-space) #1161

vadorovsky · 2025-02-02T21:59:08Z

Task storage is a type of map which uses task_struct kernel type as a key. When the task (process) stops, the corresponding entry is automatically removed.

This change add support only in the user-space and tests the functionality with a C program.

This change is

netlify · 2025-02-02T21:59:25Z

✅ Deploy Preview for aya-rs-docs ready!

Built without sensitive environment variables

Name	Link
🔨 Latest commit	`c43d457`
🔍 Latest deploy log	https://app.netlify.com/sites/aya-rs-docs/deploys/67a384e2008d360008231e91
😎 Deploy Preview	https://deploy-preview-1161--aya-rs-docs.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

dave-tucker · 2025-02-03T12:16:41Z

test/integration-test/src/tests/task_storage.rs

+    let store: TaskStorage<_, u32> =
+        TaskStorage::try_from(ebpf.map_mut("task_storage").unwrap()).unwrap();
+
+    let mut child = Command::new("sleep").arg("1").spawn().unwrap();


There is no sleep binary in our test userspace so this will fail.
You might want to consider using fork instead.
https://docs.rs/nix/latest/nix/unistd/fn.fork.html

Yeah, I noticed. I'm also trying to figure out other options:

using thread::spawn

using some other LSM hook and looking up for the pid of the testsuite

Do we need to fork? Can't we just spawn a thread?

We can't, unfortunately.

Even though in Linux threads are just processes with shared resources, they share PID as well. Threads have the same PID and different TID. Proof:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=d65191a69be10325a543cad9c138e9ea

use std::thread; fn main() -> anyhow::Result<()> { let handles = (0..10).map(|i| thread::spawn(move || { let pid = unsafe { libc::getpid() }; let tid = unsafe { libc::gettid() }; println!("thread {i}: pid: {pid}, tid: {tid}") })); for handle in handles { handle.join().unwrap(); } Ok(()) }

thread 0: pid: 13, tid: 33 thread 1: pid: 13, tid: 34 thread 2: pid: 13, tid: 35 thread 3: pid: 13, tid: 36 thread 4: pid: 13, tid: 37 thread 5: pid: 13, tid: 38 thread 6: pid: 13, tid: 39 thread 7: pid: 13, tid: 40 thread 8: pid: 13, tid: 41 thread 9: pid: 13, tid: 42

The only way to retrieve the entry from a task storage in the user-space is to use the pidfd - a file descriptor corresponding to a process differentiated by PID. There is nothing like that for TID.

I'm glad that you asked that question though, because it made me think of something I didn't consider before. On the eBPF side, my example creates/updates the entry every time we hit security_task_alloc, which happens for threads as well. task_struct should be unique for threads. Is the task storage on eBPF side smart enough to differentiate threads or does it differentiate only processes? 🤔 Also, if it does differentiate threads on eBPF side, it would mean inconsistency between eBPF and user-space. I need to check...

Why do you need to differentiate? Just store the thread ID in the map.

You're missing the point how task storage works. You can't just arbitrarily pick the key type.

In the eBPF code, you can only use task_struct as a key. bpf_task_storage_get and bpf_task_storage_delete take a task_struct pointer.

But in the user-space, you have to use the pidfd as a key. Nothing else works. See these examples:

https://elixir.bootlin.com/linux/v6.13.1/source/tools/testing/selftests/bpf/map_tests/task_storage_map.c
https://elixir.bootlin.com/linux/v6.13.1/source/tools/testing/selftests/bpf/prog_tests/task_local_storage.c#L283

In all of them, you need to issue the pidfd_open syscall, providing PID as an argument, and then use the returned file descriptor as the key.

Unfortunately, there is no concept of tidfd or anything which would let you use a TID as a key.

I will try to make it more clear in the docs.

Why can't you store something in the map on thread creation and use pidfd of the process to fetch it from the map?

dave-tucker

Thanks for this! I know you're still working through the integration test failures but adding some review comments too.

dave-tucker · 2025-02-03T12:27:26Z

aya/src/maps/task_storage.rs

+impl<T: Borrow<MapData>, V: Pod> TaskStorage<T, V> {
+    pub(crate) fn new(map: T) -> Result<Self, MapError> {
+        let data = map.borrow();
+        check_kv_size::<c_int, V>(data)?;


Other map types use:

Suggested change

check_kv_size::<c_int, V>(data)?;

check_kv_size::<u32, V>(data)?;

If you can't use u32 - although I'm not sure why not - a comment to explain would be helpful.

dave-tucker · 2025-02-03T12:29:26Z

aya/src/maps/task_storage.rs

+            TaskStorage::<_, u16>::try_from(&map),
+            Err(MapError::InvalidValueSize {
+                size: 2,
+                expected: 4


Coming back to earlier comment - if you're expecting c_int to always be a 4 byte value, you may as well use u32.

dave-tucker · 2025-02-03T12:30:45Z

aya/src/sys/pid_fd.rs

+use libc::pid_t;
+use std::os::fd::{AsRawFd, RawFd};
+/// A file descriptor of a process.
+pub(crate) struct PidFd(RawFd);


Suggested change

pub(crate) struct PidFd(RawFd);

pub(crate) struct PidFd(MockableFd);

or OwnedFd. We shouldn't be dealing with RawFd directly.

dave-tucker · 2025-02-03T12:33:03Z

aya/src/sys/pid_fd.rs

+use crate::sys::{SysResult, Syscall};
+use libc::pid_t;
+use std::os::fd::{AsRawFd, RawFd};
+/// A file descriptor of a process.


You should document that this is already in Rust nightly:
https://doc.rust-lang.org/std/os/linux/process/struct.PidFd.html

We'll want to migrate once it stablizes

dave-tucker · 2025-02-03T12:37:59Z

aya/src/sys/pid_fd.rs

+impl PidFd {
+    pub(crate) fn open(pid: u32, flags: u32) -> SysResult<Self> {
+        let pid_fd = pidfd_open(pid, flags)?;
+        Ok(Self(pid_fd as i32))


Per previous comment, you'll need to create an OwnedFd from the syscall.

tamird

Reviewed 9 of 10 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @dave-tucker and @vadorovsky)

aya/src/sys/pid_fd.rs line 1 at r2 (raw file):

use crate::sys::{SysResult, Syscall};

do we really need a file for this? seems like overkill to me

aya/src/sys/pid_fd.rs line 12 at r2 (raw file):

    }
}
impl AsRawFd for PidFd {

this is a smell

aya/src/sys/pid_fd.rs line 17 at r2 (raw file):

    }
}
impl Drop for PidFd {

you'd get this for free if you used OwnedFd or MockableFd.

tamird · 2025-02-03T17:14:54Z

test/integration-test/src/tests/task_storage.rs

+    let store: TaskStorage<_, u32> =
+        TaskStorage::try_from(ebpf.map_mut("task_storage").unwrap()).unwrap();
+
+    let mut child = Command::new("sleep").arg("1").spawn().unwrap();


Do we need to fork? Can't we just spawn a thread?

Task storage is a type of map which uses `task_struct` kernel type as a key. When the task (process) stops, the corresponding entry is automatically removed. This change add support only in the user-space and tests the functionality with a C program.

mergify bot added aya This is about aya (userspace) test A PR that improves test cases or CI labels Feb 2, 2025

vadorovsky force-pushed the task-storage-user-space branch from 66bdc42 to d57b707 Compare February 3, 2025 05:04

dave-tucker reviewed Feb 3, 2025

View reviewed changes

dave-tucker requested changes Feb 3, 2025

View reviewed changes

vadorovsky force-pushed the task-storage-user-space branch from d57b707 to 7d45628 Compare February 3, 2025 13:29

tamird requested a review from dave-tucker February 3, 2025 17:14

tamird requested changes Feb 3, 2025

View reviewed changes

vadorovsky force-pushed the task-storage-user-space branch 2 times, most recently from d6c7825 to 4b526d7 Compare February 5, 2025 15:30

vadorovsky force-pushed the task-storage-user-space branch from 4b526d7 to c43d457 Compare February 5, 2025 15:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(aya): Add task storage map type (in the user-space) #1161

feat(aya): Add task storage map type (in the user-space) #1161

vadorovsky commented Feb 2, 2025 •

edited by tamird

Loading

netlify bot commented Feb 2, 2025 •

edited

Loading

dave-tucker Feb 3, 2025

vadorovsky Feb 3, 2025

tamird Feb 3, 2025

vadorovsky Feb 3, 2025

tamird Feb 3, 2025

vadorovsky Feb 4, 2025 •

edited

Loading

tamird Feb 4, 2025

dave-tucker left a comment

dave-tucker Feb 3, 2025

dave-tucker Feb 3, 2025

dave-tucker Feb 3, 2025

dave-tucker Feb 3, 2025

dave-tucker Feb 3, 2025

tamird left a comment

tamird Feb 3, 2025

	check_kv_size::<c_int, V>(data)?;
	check_kv_size::<u32, V>(data)?;

	pub(crate) struct PidFd(RawFd);
	pub(crate) struct PidFd(MockableFd);

feat(aya): Add task storage map type (in the user-space) #1161

Are you sure you want to change the base?

feat(aya): Add task storage map type (in the user-space) #1161

Conversation

vadorovsky commented Feb 2, 2025 • edited by tamird Loading

netlify bot commented Feb 2, 2025 • edited Loading

✅ Deploy Preview for aya-rs-docs ready!

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vadorovsky Feb 4, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dave-tucker left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tamird left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vadorovsky commented Feb 2, 2025 •

edited by tamird

Loading

netlify bot commented Feb 2, 2025 •

edited

Loading

vadorovsky Feb 4, 2025 •

edited

Loading