Partial register and scattered arguments should be using the `partialregister_t` instead of `phrase_t`. #195

arizvisa · 2024-07-23T20:38:20Z

The function.args.storage function and its peer (function.arg.storage) supports the ALOC_REG1, ALOC_RREL, and ALOC_DIST argument locations using a phrase_t. This is wrong. The semantics of phrase_t are essentially "register+offset" whereas ALOC_REG1 is used to represent a field that is specified at a specific byte within a register. Thus, the type that these functions return should actually be partialregister_t. The partialregister_t class was introduced by commit 89dd9fe in order to represent an interval spread through a single register. If I recall correctly, I hacked this together in order to support some parts of ALOC_DIST that I had encountered.

To elaborate, the boundaries for ALOC_REG1 are determined by the "regoff" field and the size for the type, whereas the "regidx" field contains the register being described. The ALOC_RREL location type, despite not having any of IDA's magic syntax (represented via "name@<[reg+offset]>") should represent a "register+offset" (hence, correct in returning a phrase_t). Presently, the ALOC_STATIC type returns a straight-up integer. Returning an integer for ALOC_STATIC, however, might be incorrect as technically it has an address and a size (as per the type) which allows it to fit into a proper location_t...sorta like ALOC_STACK.

One thing that's worth noting is that the documentation for the "REG^OFFSET.SIZE" syntax (SetType) specifies that all these values should be in bytes. I noticed, however, that you can use "int@<eax^31.1>" or "int@<eax^30.2>" in a 32-bit 8.4 database (without complaints), whereas "int@<eax^31.2>", "int@<eax^32.1>", and "int@<eax^32.0> all produce an error (the latter is probably due to the 0-size). Despite this, when analysis is turned on, IDA corrects the "SIZE" to the size of the type. Perhaps Hex-Rays is planning to change the semantics of this from bytes to bits in the future in order to facilitate work on the decompiler, or perhaps it is just a bug. I haven't tested how this works with bitfield types or anything (to confirm), but this is probably worth a support@ email to confirm...

The text was updated successfully, but these errors were encountered:

arizvisa · 2024-07-23T20:41:21Z

Fixing this issue will require updating the implementation of tinfo.location from the base/_interface.py module, but will also require correcting the function.type.*.storage functions which seem to explicitly assume that a tuple is a "register+offset" pair. So to fix this, the function.type.*.storage implementation will probably need to avoid this assumption and perform a type check to differentiate what is being returned.

arizvisa added the bug label Jul 23, 2024

arizvisa self-assigned this Jul 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Partial register and scattered arguments should be using the `partialregister_t` instead of `phrase_t`. #195

Partial register and scattered arguments should be using the `partialregister_t` instead of `phrase_t`. #195

arizvisa commented Jul 23, 2024 •

edited

Loading

arizvisa commented Jul 23, 2024 •

edited

Loading

Partial register and scattered arguments should be using the partialregister_t instead of phrase_t. #195

Partial register and scattered arguments should be using the partialregister_t instead of phrase_t. #195

Comments

arizvisa commented Jul 23, 2024 • edited Loading

arizvisa commented Jul 23, 2024 • edited Loading

Partial register and scattered arguments should be using the `partialregister_t` instead of `phrase_t`. #195

Partial register and scattered arguments should be using the `partialregister_t` instead of `phrase_t`. #195

arizvisa commented Jul 23, 2024 •

edited

Loading

arizvisa commented Jul 23, 2024 •

edited

Loading