IAssassinSetrtnInvoke

PnkFelix made an experimental peephole for optimizing the instruction sequence:


setrtn L
invoke n
.align 4
L:

to just


setrtn/invoke n
.align 4
L:

When he first looked at it, he didn't think it would be possible to use the x86 call instruction to implement this, because the call instruction pushes the return address onto the stack.
- (On Sparc, setrtn/invoke uses the delay slot in a clever way to get around this. Or at least maybe it does; the peephole seems to have been disabled...)
However, Will pointed out to PnkFelix that one can get around this problem by introducing a level of indirection: don't call directly to the target, but call to a short instruction sequence that stores the return address and then jumps to the target.
- The problem here became "where do I put this short instruction sequence?"

PnkFelix decided to put the instruction sequence at the end of the bytevector for the code segment. We don't always put it there; only if we actually make a setrtn/invoke call during the assembly.

However, the way things work out, this means that there are cases where the non-peepholed version generates smaller code than the peepholed version. Namely, if we only have one occurrence of the above pattern, then the peepholed version occupies 8 bytes more than the non-peepholed version.
- Here are the actual equations for calculating the expected instruction size, where K is the number of occurrences of the above pattern:
  - nonpeep: K*52
  - peep: K*40+20
So it might be worth revisiting this choice, perhaps finding some way to conditionally assemble the setrtn/invoke based on how many we expect to encounter in the code segment.
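
To make the trade-off concrete, here is a small Python sketch of the two size equations and the break-even point. It is only an illustration: the function names are made up for this page, the sizes are assumed to be in bytes, and the shared 20-byte sequence is assumed to be emitted only when K is at least 1 (as described above). It is not taken from the assembler itself.

```python
# Constants come from the size equations on this page (assumed to be bytes).
NONPEEP_PER_SITE = 52   # setrtn + invoke, emitted separately at each site
PEEP_PER_SITE = 40      # fused setrtn/invoke at each site
PEEP_SHARED_STUB = 20   # indirection sequence at the end of the code segment


def nonpeep_size(k):
    """Total size for k occurrences of the pattern without the peephole."""
    return k * NONPEEP_PER_SITE


def peep_size(k):
    """Total size for k occurrences with the peephole.

    The shared sequence is only counted when at least one
    setrtn/invoke is actually assembled.
    """
    if k == 0:
        return 0
    return k * PEEP_PER_SITE + PEEP_SHARED_STUB


def peephole_pays_off(k):
    """True when the peepholed code is strictly smaller."""
    return peep_size(k) < nonpeep_size(k)


# Smallest number of occurrences at which the peephole wins on size.
break_even = next(k for k in range(1, 100) if peephole_pays_off(k))
```

With these numbers the peephole costs 8 extra bytes at K = 1 and saves 12 more bytes for every additional occurrence, so it pays off on size whenever the pattern occurs at least twice in a code segment. A conditional scheme would need to know K before choosing an encoding, for example by counting occurrences in a pass before assembly.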
