Trouble Accessing Bytes Member in Custom Module #306

Open
LloydLabs opened this issue Feb 12, 2025 · 7 comments

LloydLabs commented Feb 12, 2025

Hi folks,

I'm having trouble using a bytes member of a custom module within my rule: I can't seem to compare the value to any sort of byte array. The custom module parses the data and populates the corresponding field.

Here is my protobuf definition for the field:

message TestBody {
    optional bytes body = 1;
}

I have confirmed it is populating this fine, and I can match against normal strings. I seem to have trouble comparing this field to any sort of binary data. The body field data is transformed within the module (from base64), so in my tests, it isn't a simple $bytes:

import "custom_module"

rule test_01
{
    strings:
        $bytes = { FC 48 83 E4 F0 EB 33 5D 8B 45 00 48 83 C5 04 8B }
    condition:
        // 1 - test against hex C-string - FAIL
        custom_module.body contains "\xfc\x48\x83\xe4\xf0\xeb\x33\x5d\x8b\x45\x00\x48\x83\xc5\x04\x8b" or
        // 2 - test against $bytes - FAIL
        custom_module.body == $bytes or
        // 3 - test explicitly against $bytes content - FAIL
        custom_module.body == {FC 48 83 E4 F0 EB 33 5D 8B 45 00 48 83 C5 04 8B} or
        // 4 - test string comparison with different data - OK
        custom_module.body contains "foobar"
}

Is there a way I can compare the body byte content against an arbitrary set of actual bytes? And if so, how can I use keywords such as startswith or even in for byte arrays to compare this field?

Thank you all for the continued hard work on yara-x and the porting from the previous version,
Best,
Lloyd.

plusvic commented Feb 13, 2025

Both 1 and 4 should work. YARA treats string and bytes as equivalent types: from the YARA standpoint the two are interchangeable, and the difference between them exists only on the Rust side.

I've added some test cases in e3e2ba5 to make sure that string operations work with arbitrary raw bytes, and everything seems fine.
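
Since escaped string literals are the form that works against bytes fields, one way to avoid hand-transcribing a hex pattern is to generate the literal from it. A minimal Python sketch, assuming a pattern with plain hex bytes only (no wildcards or jumps); `hex_to_yara_literal` is a hypothetical helper, not part of yara-x:

```python
def hex_to_yara_literal(hex_pattern: str) -> str:
    """Turn 'FC 48 83' (optionally wrapped in braces) into '\\xfc\\x48\\x83'."""
    # Drop the surrounding braces of a YARA hex pattern, if present.
    cleaned = hex_pattern.strip().strip("{}").strip()
    # bytes.fromhex skips the spaces between byte pairs.
    raw = bytes.fromhex(cleaned)
    # Emit one \xNN escape per byte.
    return "".join(f"\\x{b:02x}" for b in raw)


literal = hex_to_yara_literal("{ FC 48 83 E4 F0 EB 33 5D 8B 45 00 48 83 C5 04 8B }")
print(f'custom_module.body contains "{literal}"')
```

The printed line can be pasted into a rule condition as-is, which removes the manual retyping step.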

plusvic commented Feb 14, 2025

@LloydLabs did you manage to find out why it wasn't working for you? Can I close this issue?

@LloydLabs

Hey @plusvic, thanks for getting back to me. OK, I'll take a look back at the first case and debug it - because it certainly wasn't working as expected before. Should have a response in a few hours.

LloydLabs commented Feb 15, 2025

The first one doesn't appear to work. I've added some debug statements to see what is going on:

if let Some(body) = value.get("banner").and_then(Value::as_str) {
    let engine = base64::engine::general_purpose::STANDARD;
    match engine.decode(body) {
        Ok(decoded_body) => {
            // NOTE: temporarily for debugging purposes
            for byte in &decoded_body {
                print!("{:02X} ", byte);
            }
            println!();
            
            tcp.body = Some(decoded_body);
        }
        Err(e) => {
            eprintln!("Failed to decode base64 body: {}", e);
        }
    }
}

I'm using this short script within yara_x Python to test the rules, although this should not affect the matching at all:

import yara_x

rules = yara_x.compile(
    """
import "custom_module"

rule match_body_start {
    condition:
        custom_module.tcp.body startswith "\x01\x02\x03\x04"
}
"""
)

with open("input.json", "rb") as f:
    results = rules.scan(f.read())
    print(results.matching_rules[0].identifier)

This outputs:

01 02 03 04 # <- from the above, wherein we walk over the Vec<>
    print(results.matching_rules[0].identifier)
          ~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: tuple index out of range

As can be seen, those bytes are there. However, the rule does not match them with the escaped hex string. Any help would be appreciated; I'm really unsure what is going on here.
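
Note that the `IndexError` in the output above comes from the test script rather than from the scanner: `matching_rules` is empty when nothing matches, so indexing `[0]` fails. A hedged sketch of a safer way to report results; `matching_identifiers` is a hypothetical helper that relies only on the `matching_rules` and `identifier` attributes used in the script above, and the `SimpleNamespace` object is a stand-in for a real scan result:

```python
from types import SimpleNamespace


def matching_identifiers(results):
    """Return the identifiers of all matching rules (empty list if none matched)."""
    return [rule.identifier for rule in results.matching_rules]


# Stand-in for a scan result with no matches, for illustration only.
no_match = SimpleNamespace(matching_rules=())
print(matching_identifiers(no_match) or "no rules matched")
```

With this, a non-matching rule prints a message instead of raising, which makes it easier to tell a scan failure from a plain non-match.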

@LloydLabs

For some reason the hex-string \xFC\x48\x83\xE4\xF0\xEB\x33\x5D\x88\x45\x00\x48\x83\xC5\x04\x8B fails to match, but \x01\x00\x02\x03\x04 does work (01 00 02 03 04).

Module output from the debug print above confirms the bytes that it is populating the field with are correct: FC 48 83 E4 F0 EB 33 5D 8B 45 00 48 83 C5 04 8B. I am really unsure as to what is going on.

plusvic commented Feb 17, 2025

I'm not sure if that was a typo, but you said:

> For some reason the hex-string \xFC\x48\x83\xE4\xF0\xEB\x33\x5D\x88\x45\x00\x48\x83\xC5\x04\x8B fails to match

However, notice the 9th byte in the string (after 0x33 0x5D): it's 0x88 instead of 0x8B.
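
A discrepancy like this is easy to miss by eye. As a rough sketch, the bytes printed by the module can be compared against the bytes written in the rule programmatically; `first_mismatch` is a hypothetical helper, and the two byte strings are copied from the comments in this thread:

```python
def first_mismatch(expected: bytes, actual: bytes):
    """Return (1-based position, expected byte, actual byte) of the first
    difference, or None if the two sequences are identical."""
    for position, (e, a) in enumerate(zip(expected, actual), start=1):
        if e != a:
            return position, e, a
    if len(expected) != len(actual):
        # One sequence is a prefix of the other; report where the shorter one ends.
        return min(len(expected), len(actual)) + 1, None, None
    return None


# Bytes reported by the module's debug print.
module_dump = bytes.fromhex("FC 48 83 E4 F0 EB 33 5D 8B 45 00 48 83 C5 04 8B")
# Bytes as written in the escaped string that failed to match.
rule_literal = b"\xFC\x48\x83\xE4\xF0\xEB\x33\x5D\x88\x45\x00\x48\x83\xC5\x04\x8B"

result = first_mismatch(module_dump, rule_literal)
if result is not None:
    position, e, a = result
    print(f"byte {position}: module has 0x{e:02X}, rule has 0x{a:02X}")
```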

plusvic commented Feb 24, 2025

@LloydLabs any update on this?
