IntelsOptimizationRules

Felix S. Klock II edited this page Jul 28, 2013 · 1 revision

Intel has a list of optimization rules it provides in:

PnkFelix can't copy the lists wholesale, as that would violate the copyright. Many of the rules are fairly obvious anyway (e.g. make basic blocks contiguous to avoid unnecessary branches). So instead he'll summarize the interesting or non-intuitive ones.

  • Compiler Rule 2: Use the setcc and cmov instructions to eliminate unpredictable conditional branches, but do not do this for predictable branches, and do not use them to eliminate all unpredictable conditional branches.
  • Compiler Rule 3: Backward conditional branches are predicted as taken; forward conditional branches are predicted as not taken. Code accordingly.
  • Compiler Rule 4: Match calls with returns. (More specifically, match near calls with near returns, and far calls with far returns.)
    • We now do this about as well as one could expect, thanks to keeping the stack cache pointer in ESP and doing some acrobatics at points that control can return to (annotated with .cont).
  • Compiler Rule 10: Do not put more than four branches in 16-byte chunks.
  • Compiler Rule 11: Do not put more than two end loop branches in a 16-byte chunk.
  • Compiler Rule 26: If (hopefully read-only) data must occur on the same page as code, avoid placing it immediately after an indirect jump.
    • This is relevant because Lars implemented exceptions by putting the exception code in the instruction stream and having the exception handler change the return address to point after the code. This may have been a really bad idea...
  • Compiler Rule 29: All branch targets should be 16-byte aligned.
    • Is this feasible for us? Does it include targets like the start of our codevectors?
  • Compiler Rule 42: inc and dec instructions should be replaced with an add or sub instruction. (See manual for why.)
    • This is relevant because we use inc and dec as a code-size-minimizing sequence to load the constants 1 or -1.
  • Compiler Rule 56: ...this one is poorly written; I think it's trying to say "favor removing redundant loads over cleverness with doing arithmetic on memory source operands."
  • Compiler Rule 57: Give preference to adding a register to memory instead of adding memory to a register. Give preference to adding a register to memory over the sequence of a load, an add with a register destination, and a store of the result.
  • Compiler Rule 61: Avoid putting explicit references to ESP in a sequence of stack operations (pop, push, call, ret).
    • At one point we used ESP as our globals pointer but now we do not (see changeset:4000), so this rule may not matter so much.
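
The 16-byte rules above (10 and 29) come down to simple address arithmetic, which the following Python sketch illustrates. It is not taken from our code generator: the helper names `padding_to_align` and `crowded_chunks` and the sample addresses are hypothetical, and it assumes that a branch belongs to the 16-byte chunk containing its first byte, which may not be exactly how Intel counts.

```python
from collections import Counter

CHUNK = 16  # chunk size in bytes referred to by Rules 10 and 29


def padding_to_align(addr, alignment=CHUNK):
    """Bytes of padding needed before addr so that it lands on an alignment boundary."""
    return -addr % alignment


def crowded_chunks(branch_addrs, limit=4, chunk=CHUNK):
    """Map chunk start address -> branch count, for chunks holding more than `limit` branches.

    Assumption: each branch is attributed to the chunk containing its first byte.
    """
    counts = Counter(addr - addr % chunk for addr in branch_addrs)
    return {start: n for start, n in counts.items() if n > limit}


# Hypothetical branch addresses: five fall in the chunk starting at 0x100.
sample = [0x100, 0x102, 0x104, 0x106, 0x108, 0x120]
crowded = crowded_chunks(sample)
```

With the sample addresses, `crowded` flags only the chunk at 0x100, and `padding_to_align(0x1003)` gives 13, the number of filler bytes needed to push a branch target at 0x1003 up to 0x1010. The same check with `limit=2` could be applied to a list of loop-end branches for Rule 11.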