[DRAFT] in_ebpf: initial version #9406

Draft
wants to merge 1 commit into
base: master

niedbalski
Collaborator

@niedbalski niedbalski commented Sep 20, 2024

This is a proposal for a proof of concept (POC) of an eBPF ingestor plugin. It uses libbpf to load and attach an existing eBPF program and pulls events from a fixed-size ring buffer. These events are then fed into the log ingestion pipeline.

The event types are predefined in the fluent-bit codebase, and the eBPF program must follow these definitions when submitting events to the ring buffer. In the future, these definitions need to be serialized and extensible, so that we can move toward compatibility with other eBPF collectors.

Additionally, I've added a fallback option to pass strings as event payloads without needing a specific event type.
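
One way the user-space side could tell the two payload kinds apart is by record size: a structured record is `sizeof` the event struct, while the fallback is a bare string buffer. The sketch below is illustrative only. `user_event`, `classify_record` and `event_type_name` are hypothetical names, the struct layout mirrors the example eBPF program in this description, the size-based check is an assumption rather than the plugin's confirmed logic, and the "network" and "process" labels are guesses (only "filesystem" and "unknown" appear in the sample output).

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENT_LEN 128

/* Mirrors the layout the example eBPF program writes to the ring buffer. */
struct user_event {
    uint32_t pid;
    uint32_t event_type;
    char data[MAX_EVENT_LEN];
};

enum record_kind { RECORD_STRUCTURED, RECORD_RAW_STRING };

/* Assumption: a record whose size matches the struct is a structured
 * event; any other size is treated as a raw string payload. */
static enum record_kind classify_record(size_t size)
{
    return size == sizeof(struct user_event) ? RECORD_STRUCTURED
                                             : RECORD_RAW_STRING;
}

/* Map the numeric event type to a label for the output record.
 * Only "filesystem" and "unknown" are confirmed by the sample output. */
static const char *event_type_name(uint32_t event_type)
{
    switch (event_type) {
    case 0:  return "filesystem";
    case 1:  return "network";   /* assumed label */
    case 2:  return "process";   /* assumed label */
    default: return "unknown";
    }
}
```

With the example program, a structured record is 136 bytes and a raw string record is 128 bytes, so the two never collide; a different `MAX_EVENT_LEN` or struct layout would need a more explicit tag.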

To build with the plugin enabled:

cmake -D FLB_IN_EBPF=ON .

An example configuration is:

[INPUT]
    Name              ebpf
    bpf_object_file   ./ebpf_program.o
    bpf_program_name  handle_fs_event
    ringbuf_map_name  events

[INPUT]
    Name              ebpf
    bpf_object_file   ./ebpf_program.o
    bpf_program_name  handle_execve_event
    ringbuf_map_name  events

[OUTPUT]
    Name stdout
    Match *

[SERVICE]
    log_level trace

An example eBPF program used with this configuration:

#include <linux/types.h>

#include <bpf/bpf_helpers.h>
#include <linux/bpf.h>


struct trace_entry {
  short unsigned int type;
  unsigned char flags;
  unsigned char preempt_count;
  int pid;
};

struct trace_event_raw_sys_enter {
  struct trace_entry ent;
  long int id;
  long unsigned int args[6];
  char __data[0];
};


#define MAX_EVENT_LEN 128

// Event types enum
enum event_type {
    EVENT_FILESYSTEM = 0,
    EVENT_NETWORK = 1,
    EVENT_PROCESS = 2
};

// Base event structure sent by eBPF
struct flb_ebpf_event {
    __u32 pid;
    __u32 event_type;            // Event type as an enum
    char data[MAX_EVENT_LEN];     // Event-specific data (filename, network info, etc.)
};


// Define the ring buffer map
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 8192);
} events SEC(".maps");

// Hook for file open (Filesystem event)
SEC("tracepoint/syscalls/sys_enter_openat")
int handle_fs_event(struct trace_event_raw_sys_enter *ctx) {
    struct flb_ebpf_event *event;
    const char *filename = (const char *)ctx->args[1];

    // Reserve space in the ring buffer
    event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
    if (!event) {
        return 0;
    }

    // Fill event data (structured event)
    event->pid = bpf_get_current_pid_tgid() >> 32;
    event->event_type = EVENT_FILESYSTEM;
    bpf_probe_read_user_str(event->data, MAX_EVENT_LEN, filename);

    // Submit the structured event
    bpf_ringbuf_submit(event, 0);
    return 0;
}

// Function to send just a string (Raw String event)
SEC("tracepoint/syscalls/sys_enter_execve")
int handle_execve_event(struct trace_event_raw_sys_enter *ctx) {
    char *event;
    const char *cmd = (const char *)ctx->args[0];

    // Reserve space in the ring buffer for the string
    event = bpf_ringbuf_reserve(&events, MAX_EVENT_LEN, 0);
    if (!event) {
        return 0;
    }

    // Send the raw string (command)
    bpf_probe_read_user_str(event, MAX_EVENT_LEN, cmd);

    // Submit the raw string
    bpf_ringbuf_submit(event, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

To compile this program, you need clang installed on your system; then run:

clang -D__TARGET_ARCH_X86_64 -g -O2 -target bpf -c ebpf_program_example.c -o ebpf_program.o

With the sample configuration, the following output is produced:

[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.0] initializing
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.0] storage_strategy='memory' (memory only)
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.0] eBPF program 'handle_fs_event' loaded successfully from object file './ebpf_program.o' with ring buffer 'events'
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.1] initializing
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.1] storage_strategy='memory' (memory only)
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.1] eBPF program 'handle_execve_event' loaded successfully from object file './ebpf_program.o' with ring buffer 'events'
[2024/09/20 18:05:47] [ info] [sp] stream processor started
[2024/09/20 18:05:47] [ info] [output:stdout:stdout.0] worker #0 started
[0] ebpf.0: [[1726848348.381941693, {}], {"pid"=>71947, "event_type"=>"filesystem", "event_data"=>"./ebpf_program.o"}]
[1] ebpf.0: [[1726848348.382495832, {}], {"pid"=>71947, "event_type"=>"filesystem", "event_data"=>"/sys/kernel/debug/tracing/events/syscalls/sys_enter_execve/id"}]
[2] ebpf.0: [[1726848348.382551540, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.pressure"}]
[3] ebpf.0: [[1726848348.382586076, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.current"}]
[4] ebpf.0: [[1726848348.382610182, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.min"}]
[5] ebpf.0: [[1726848348.382634648, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.low"}]
[6] ebpf.0: [[1726848348.382657849, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.swap.current"}]
[7] ebpf.0: [[1726848348.382679632, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.stat"}]

Unknown events

[0] ebpf.0: [[1726848417.031536658, {}], {"event_type"=>"unknown", "event_data"=>"/usr/bin/ps"}]
[0] ebpf.0: [[1726848420.076250428, {}], {"event_type"=>"unknown", "event_data"=>"/usr/bin/cmake"}]
[0] ebpf.0: [[1726848422.176789034, {}], {"event_type"=>"unknown", "event_data"=>"/usr/bin/top"}]

^C[2024/09/20 18:07:16] [engine] caught signal (SIGINT)
[2024/09/20 18:07:18] [ warn] [engine] service will shutdown in max 5 seconds
[2024/09/20 18:07:18] [ info] [input] pausing ebpf.0
[2024/09/20 18:07:19] [ info] [engine] service has stopped (0 pending tasks)
[2024/09/20 18:07:19] [ info] [input] pausing ebpf.0
[2024/09/20 18:07:19] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/09/20 18:07:19] [ info] [output:stdout:stdout.0] thread worker #0 stopped

@cosmo0920
Contributor

I'm working on an Ubuntu 22.04 box, so I needed to point clang at the architecture-dependent header files:

$ clang -D__TARGET_ARCH_X86_64 -g -O2 -target bpf -c ebpf_program_example.c -o ebpf_program.o -I /usr/include/x86_64-linux-gnu/  

@cosmo0920 cosmo0920 left a comment

In the current code base, I was also concerned about how libbpf is linked:

$ ldd bin/fluent-bit
	linux-vdso.so.1 (0x00007ffeee7be000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x000078338b5c9000)
	libyaml-0.so.2 => /lib/x86_64-linux-gnu/libyaml-0.so.2 (0x000078338b5a8000)
	libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x000078338a139000)
	libbpf.so.0 => /lib/x86_64-linux-gnu/libbpf.so.0 (0x000078338a0ea000)
	libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x000078338a046000)
	libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x0000783389c00000)
	libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x0000783389b59000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x0000783389b3d000)
	libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x0000783389a6e000)
	libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x0000783389a53000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000783389a33000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000783389800000)
	/lib64/ld-linux-x86-64.so.2 (0x000078338b6e9000)
<snip>

This suggests that libbpf is linked as a shared object, so fluent-bit is not tainted by a non-Apache license such as a GNU-style license.

CMakeLists.txt Outdated
This is an initial proposal for a POC of an eBPF ingestor
plugin. It adds the ability to load and attach to
an existing eBPF program and consume events from a fixed-size
ring buffer; those events are then ingested into the log
ingestion buffer.

Event types are known and defined in the fluent-bit codebase, and
the eBPF program has to follow them when submitting events
to the ring buffer. In the future this must be serialized and
become an extensible part of the project as we make progress towards
compatibility with other eBPF collectors.

Also, I've implemented a fallback to allow strings to be passed as the
payload of the event, without following a specific event type.

Signed-off-by: Jorge Niedbalski <[email protected]>
}

/* Find the BPF program by its name */
struct bpf_program *prog = bpf_object__find_program_by_name(ctx->obj, bpf_prog_name);

We use the old style of variable declarations (declared up front rather than at first use), so this declaration needs to move to around here: https://github.com/fluent/fluent-bit/pull/9406/files#diff-cfd08bb24498894b88fb371270031942876b24aae69176d146e26006d7710157R169-R173
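
As a minimal illustration of that style, the sketch below declares every local at the top of the function and assigns values afterwards. `find_name` is a hypothetical stand-in used only to show the layout; it is not the plugin's code and does not call libbpf.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the index of `wanted` in `names`, or -1. */
static int find_name(const char **names, size_t count, const char *wanted)
{
    /* Declarations first ... */
    size_t i;
    int found;

    /* ... statements afterwards. */
    found = -1;
    for (i = 0; i < count; i++) {
        if (strcmp(names[i], wanted) == 0) {
            found = (int) i;
            break;
        }
    }
    return found;
}
```

Applied to the lines under review, the pointer would be declared with the other locals and assigned at the point where the lookup currently happens.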

}

/* Attach the BPF program to the tracepoint */
struct bpf_link *link = bpf_program__attach(prog);

Ditto.

Comment on lines +293 to +294
int poll_seconds = ctx->poll_ms / 1000;
int poll_nanoseconds = (ctx->poll_ms % 1000) * 1000000;

Ditto.
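
For the millisecond split on lines +293 to +294, a declare-first version might look like the sketch below. It assumes `poll_ms` is a non-negative millisecond count; `poll_interval` and `ms_to_interval` are hypothetical names, not part of the PR.

```c
/* Hypothetical container for a split polling interval. */
struct poll_interval {
    long seconds;
    long nanoseconds;
};

/* Split a millisecond interval into whole seconds plus the
 * remainder expressed in nanoseconds. Assumes poll_ms >= 0. */
static struct poll_interval ms_to_interval(int poll_ms)
{
    struct poll_interval iv;

    iv.seconds = poll_ms / 1000;
    iv.nanoseconds = (long) (poll_ms % 1000) * 1000000L;
    return iv;
}
```

For example, a `poll_ms` of 1500 splits into 1 second and 500000000 nanoseconds.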

.name = "ebpf",
.description = "eBPF input plugin",
.cb_init = in_ebpf_init,
.cb_pre_run = NULL,

As a future enhancement, it would be nice to check that the eBPF object exists in the pre_run callback.
IIRC, this callback is used for the prerequisite checks when reloading.
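
A rough sketch of such a check, assuming a plain readability test on the configured path is enough. `ebpf_object_readable` is a hypothetical helper that a pre_run callback could call; how it would be wired into the plugin is not shown and is left as an assumption.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: return 0 if the object file exists and is
 * readable by the current process, -1 otherwise. */
static int ebpf_object_readable(const char *path)
{
    if (path == NULL || path[0] == '\0') {
        return -1;
    }
    return access(path, R_OK) == 0 ? 0 : -1;
}
```

This only confirms that the file can be read; it does not validate that the file is a loadable BPF object, which would still be caught at load time.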

Labels
docs-required ok-package-test Run PR packaging tests