Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simp_compose_and_mask improvements #1347

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
38 changes: 29 additions & 9 deletions miasm/expression/simplifications_common.py
Original file line number Diff line number Diff line change
Expand Up @@ -1753,19 +1753,39 @@ def simp_compose_and_mask(_, expr):
if not arg2.is_int():
return expr
int2 = int(arg2)
if (int2 + 1) & int2 != 0:
return expr
mask_size = int2.bit_length() + 7 // 8
mask_size = (int2.bit_length() + 7) // 8 * 8
Copy link
Contributor

@serpilliere serpilliere Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why is there a +7 here
Can you explain?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the mask size is rounded up to the next multiple of 8. i think this is not actually needed, you could just use int2.bit_length() instead, but this may result in more complex expressions because of this line in the loop:

            if arg.is_int() and offset < mask_size < offset+arg.size:
                arg = ExprSlice(arg, 0, mask_size-offset)

this might be the result without rounding up:

{X 0 8, Y 8 16} & 0x7FFF -> {x 0 8, Y & 0x7F 8 15, 0 15 16}

with rounding up:

{X 0 8, Y 8 16} & 0x7FFF -> {X 0 8, Y 8 16} & 0x7FFF

Copy link
Contributor

@serpilliere serpilliere Sep 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. The trick is that in Miasm, ExprCompose can have, and have components of different sizes. Not only 8 bit arguments.
So maybe we will have a problem here. I think we can distinguish 2 big cases.

  • The first one, in which this simplification will be very interesting is the case where we make elements "disappear". (the example you gave)
  • The second one, in which we will propagate the mask into the compose, and this propagation will only create multiple new mask for each (or most) sub arguments. In this case, maybe we don't want to propagate the mask

So maybe we can use this metric to do or not, this simplification
So the code may be:

  • do the simplification as you propose
  • once it's done, see if the number of ExprCompose arguments is reduced (compared to the original one).
  • if yes, return this expression
  • if no, return the original expression
    This way, the simplification should not generate a "more complex" expression (in term of number of operations)

I am aware of the fact that it may do some work for nothing, but hey, the 'simplification' problem is hard 😄

What do you think of this proposition?

Copy link
Author

@tly000 tly000 Sep 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is a good idea, but it would be better to check if the resulting ExprCompose is simpler than the old one. i.e. if parts of the expression were replaced by constants. for me the whole reason for this simplification is to get rid of unneeded ExprIds to reduce the dependencies of the expression.

if int2 == int(arg1.mask):
return arg1
out = []
mask_needed = False
for offset, arg in arg1.iter_args():
if offset == mask_size:
return ExprCompose(*out).zeroExtend(expr.size)
elif mask_size > offset and mask_size < offset+arg.size and arg.is_int():
out.append(ExprSlice(arg, 0, mask_size-offset))
return ExprCompose(*out).zeroExtend(expr.size)
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, maybe here, instead of a break, juste append a ExprInt(0, arg.size)
This way, you generate generate the ExprCompose with the final elements, and don't need to have special cases at the end of the test.
More over, if I don't mess up, you may result with

{XXXX, 0, 8, ExprInt(0, 8), 8, 16, ExprInt(0, 16), 16, 32} 

But this will be simplified by another rule into:

{XXX, 0, 8, ExprInt(0, 24), 8, 32}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is true

else:
out.append(arg)
return expr
if arg.is_int() and offset < mask_size < offset+arg.size:
arg = ExprSlice(arg, 0, mask_size-offset)

arg_mask = (int(arg.mask) << offset)
if int2 & arg_mask != 0:
out.append(arg)
if int2 & arg_mask != arg_mask:
mask_needed = True
elif mask_size > offset + arg.size:
out.append(ExprInt(0, arg.size))

if mask_size <= offset + arg.size:
break

if len(out) == 0:
return ExprInt(0, expr.size)
else:
size = sum(arg.size for arg in out)
if size != expr.size:
out.append(ExprInt(0, expr.size - size))
result = ExprCompose(*out)
if mask_needed:
result = result & arg2
return result

def simp_bcdadd_cf(_, expr):
"""bcdadd(const, const) => decimal"""
Expand Down
13 changes: 13 additions & 0 deletions test/expression/simplifications.py
Original file line number Diff line number Diff line change
Expand Up @@ -368,6 +368,19 @@ def check(expr_in, expr_out):
ExprInt(0x1, 32),
ExprInt(0x0, 32))
),
(ExprCompose(a[:8],b[:8],c[:8],d[:8])
&
ExprInt(0xA000B000, 32),
ExprCompose(ExprInt(0,8), b[:8], ExprInt(0,8), d[:8]) &
ExprInt(0xA000B000, 32)
),

(ExprCompose(a[:8],b[:8],c[:8],d[:8])
&
ExprInt(0xFF00FF00, 32),
ExprCompose(ExprInt(0,8), b[:8], ExprInt(0,8), d[:8])
),

(ExprCompose(a[:16], b[:16])[8:32],
ExprCompose(a[8:16], b[:16])),
((a >> ExprInt(16, 32))[:16],
Expand Down