Skip to content

Latest commit

 

History

History
540 lines (439 loc) · 21.5 KB

cfi_backward.adoc

File metadata and controls

540 lines (439 loc) · 21.5 KB

Backward-edge control-flow integrity

A shadow stack is a second stack used to push the link register if it needs to be spilled to make a new procedure call. Leaf functions do not need to spill the link register.

The shadow stack, similar to the regular stack, grows downwards, i.e. from higher addresses to lower addresses. Each entry on the shadow stack is XLEN wide and holds the link register value. The ssp points to the top of the shadow stack, i.e. address of the last element pushed on the shadow stack.

Note

Compilers when generating code for a CFI enabled program must protect the link register, e.g. x1 and/or x5, from arbitrary modification by not emitting unsafe code sequences.

Backward-edge CFI instruction encoding

The backward-edge CFI extension introduces the following instructions for shadow stack operations. All instructions are encoded using the SYSTEM major opcode and using the mop.r and mop.rr instructions defined by the Zimops extension.

(mop.r) 31 20 19 15 14 12 11 7 6 0

mnemonic

1.00..0111..

rs1

func3

rd

opcode

sspush

100000011100

src

100

0

1110011

sspop

100000011100

0

100

dst

1110011

ssprr

100000011101

0

100

dst

1110011

12

5

3

5

7

Encoding rs1 as x0 is not supported for sspush. Encoding rd as x0 is not supported for sspop.

(mop.rr) 31 25 24 20 19 15 14 12 11 7 6 0

mnemonic

1.00..1

rs2

rs1

func3

rd

opcode

ssamoswap

1000001

src

addr

100

dst

1110011

sschkra

1000101

1

5

100

0

1110011

7

5

5

3

5

7

Encoding rd as x0 is not supported for ssamoswap.

When menvcfg.CFIE is 0, then Zisslpcfi is not enabled for privilege modes less than M and backward-edge CFI is not enabled at privilege levels less than M.

When V=0 and menvcfg.CFIE is 1, then backward-edge CFI is enabled at S-mode. When V=0 and menvcfg.CFIE is 1, then backward-edge CFI is enabled at U-mode if mstatus.UBCFIE is 1.

When henvcfg.CFIE is 0, then Zisslpcfi is not enabled for use when V=1.

When V=1 and menvcfg.CFIE and henvcfg.CFIE are both 1, then backward-edge CFI is enabled in VS-mode. When V=1 and menvcfg.CFIE and henvcfg.CFIE are both 1, then backward-edge CFI is enabled in VU-mode if vsstatus.UBCFIE is 1.

The term xBCFIE is used to determine if backward-edge CFI is enabled at a privilege level x and is defined as follows:

xBCFIE determination
if ( privilege == M-mode )
    xBCFIE = 1
else if ( menvcfg.CFIE == 1 && V == 0 && privilege == S-mode )
    xBCFIE = 1
else if ( menvcfg.CFIE == 1 && V == 0 && privilege == U-mode )
    xBCFIE = mstatus.UBCFIE
else if ( menvcfg.CFIE == 1 && henvcfg.CFIE == 1 & V == 1 && privilege == S-mode )
    xBCFIE = 1
else if ( menvcfg.CFIE == 1 && henvcfg.CFIE == 1 & V == 1 && privilege == U-mode )
    xBCFIE = vsstatus.UBCFIE
else
    xBCFIE = 0

When backward-edge CFI is not enabled(xBCFIE = 0):

  • The instructions defined for backward-edge CFI revert to their Zimops defined behavior and write 0 to [rd].

Note

The use of shadow stacks at U-mode must be explicitly enabled per application. Explicit enable for user mode applications allows such an application to invoke shared libraries that may have shadow stack instructions even when the application itself has backward-edge CFI not enable. The shadow stack instructions invoked in the context of this application revert to their Zimops defined behavior.

When Zisslpcfi is enabled, the use of backward-edge CFI is always enabled for use at S-mode. However, it is benign to use an operating system that has not been compiled with shadow stack instructions. Such an operating system that does not use backward-edge CFI for S-mode execution may still enable the backward-edge CFI use by U-mode.

When Zisslpcfi is implemented, the use of backward-edge CFI is always enabled at M-mode. However, it is benign to use M-mode firmware that has not been compiled with shadow stack instructions.

Note

When programs using shadow stack instructions are installed on a machine that supports the CFI extensions but the operating system installed does not enable the CFI extensions, the program continues to function due to Zimops defined behavior of writing 0 to [rd] and not causing an illegal-instruction exception.

When programs using shadow stack instructions are installed on a machine that does not support the CFI extensions but support the Zimops extension, the program continues to function due to Zimops defined behavior of writing 0 to [rd] and not causing an illegal-instruction exception.

On machines that do not support Zimops extension, the instructions cause an illegal-instruction exception. Installing programs that use the shadow stack instructions on such machines is not supported.

Push to and Pop from the shadow stack

A push operation is defined as decrement of the ssp by XLEN followed by a write of the link register at the new top of the shadow stack. A pop operation is defined as a XLEN wide read from the current top of the shadow stack followed by an increment of the ssp.

To push a link register on the shadow stack, the CFI extension provides sspush instruction. To pop a link register from the shadow stack, the CFI extension provides sspop instruction.

Note

Programs operating in shadow stack mode store the return address to the data stack as well as the shadow stack in the function prologue. Such programs when returning from the function load the link register from the data stack and load a shadow copy of the link register from the shadow stack. The two values are then compared (using the sschkra instruction that is discussed later). If the values do not match it is indicative of a shadow stack violation and causes an illegal-instruction exception. The function prologue and epilog of a function using shadow stacks is as follows:

function_entry:
    sspush x1      # push link register x1 on shadow stack
     :
     :
    sspop x5       # pop from shadow stack into x5
    sschkra x5, x1 # compare link register x1 to shadow
                   # link register x5; faults if not same
    ret

When x1 is used by the ABI as the link register, the x5 may be used to load the shadow link register value from the shadow stack. Alternatively, if the ABI uses x5 as the link register, then the x1 register may be used as the shadow link register.

A leaf function i.e. a function that does not itself make function calls does not need to push the link register to the shadow stack or pop it from the shadow stack. The return value may be held in the link register itself for the duration of the function execution.

In the RVI psABI, the x1 register is designated as the link register and x5 register is designated as a temporary register. SInce the shadow link register is loaded at the tail of the function, prior to return, the x5 register being used as the shadow link register does not impose a burden on the compiler as the x5 register, being a temporary register that is not preserved across a call, is usually free for use at the tail of the function.

Programs operating in the control stack mode may store the return address only to the shadow stack in the function prologue. Such functions when returning from the function load the link register value from the shadow stack. Such programs may either use the x1 or x5 register, depending on their ABI, as the link register. The hardware return-address prediction stacks detect the use of x1/x5 as the rd and x1/x5 as the source for a JALR instruction to infer if the JALR is used for a call or a return semantic for the purposes of prediction. To support both options the CFI extension provides the x1 and x5 variants of the shadow stack load and store.

sspop performs a load identically to the existing LOAD instruction with the difference that the base is implicitly ssp, the width is implicitly XLEN. sspush performs a store identically to the existing STORE instruction with the difference that the base is implicitly ssp, the width is implicitly XLEN.

The sspush and sspop require the virtual address in ssp to have a shadow stack attribute (see Shadow Stack Memory Protection).

If the virtual address in ssp is not XLEN aligned then the instructions cause a load or store/AMO address-misaligned exception.

Note

Misaligned accesses to shadow stack are not required and enforcing alignment is more secure to detect errors in the program.

The operation of the sspush instructions is as follows:

sspush operation
If (xBCFIE = 1)
    [ssp] = [ssp] - (XLEN/8)   # decrement ssp by XLEN/8
   *[ssp] = [src]              # Store src value to address in ssp
else
    [dst] = 0
endif`

The operation of the sspop instructions is as follows:

sspop operation
If (xBCFIE = 1)
    dst   = *[ssp]             # Load dst from address in ssp
    [ssp] = [ssp] + (XLEN/8)   # increment ssp by XLEN/8
else
    [dst] = 0;
endif
Note

Store to load forwarding is a common technique employed by high performance processor implementations. CFI implementations may restrict forwarding from a non-shadow-stack store to a sspop instruction. A non-shadow-stack store causes a fault if done to a page mapped as a shadow stack. However such determination may be delayed till the PTE has been examined and thus may be used to transiently forward the data from such stores to a sspop.

Note

A common operation performed on stacks is to unwind them to support constructs like setjmp/longjmp, C++ exception handling, etc. A program that uses shadow stacks must unwind the shadow stack in addition to the stack used to store data. The unwind function must verify that it does not accidentally unwind past the bounds of the shadow stack. Shadow stacks are expected to be bounded on each end using guard pages i.e. pages that do not have a shadow stack attribute. To detect if the unwind occurs past the bounds of the shadow stack the unwind may be done in maximal increments of 4 KiB and testing for the ssp to be still pointing to a shadow stack page or has unwound into the guard page. The following examples illustrate use of backward-edge CFI instructions to unwind a shadow stack.

setjmp() {
    :
    :
    // read and save top of stack pointer to jmp_buf
    asm(“ssprr %0” : “=r”(cur_ssp):);
    jmp_buf->saved_ssp = cur_ssp;
    :
    :
}
longjmp() {
    :
    // Read current shadow stack pointer and
    // compute number of call frames to unwind
    asm(“ssprr %0” : “=r”(cur_ssp):);
    // Skip the unwind if backward-edge CFI not enabled
    asm(“beqz %0, 1f” : “=r”(cur_ssp):);
    num_unwind = jmp_buf->saved_ssp - cur_ssp;
    // Unwind the frames in a loop
    while ( num_unwind > 0 ) {
        step = ( num_unwind >= 4096 ) ? 4096 : num_unwind;
        cur_ssp += step;
        num_unwind -= step;
        // write the ssp register with unwound value
        asm(“csrw %0, $ssp_csr_num” : “=r”(cur_ssp):);
        // Test if unwound past the shadow stack bounds
        asm(“sspush x5”);
        asm(“sspop  x5”);
    }
1f:
    :
}

Read ssp into a register

The ssprr instruction is provided to move the contents of ssp to the destination register.

The operation of the ssprr instructions is as follows:

ssprr operation
If (xBCFIE = 1)
    [dst] = [ssp]
else
    [dst] = 0;
endif
Note

The property of Zimops writing 0 to the rd when the extension using Zimops is not present or not enabled may be used by such functions to skip over unwind actions by dynamically detecting if the backward-edge CFI extension is enabled.

An example sequence such as the following may be used:

    ssprr t0                  # mv ssp to t0
    beqz bcfi_not_enabled     # zero is not a valid shadow stack
                              # pointer by convention
    # Shadow stacks enabled
    :
    :
bcfi_not_enabled:

Verifying return address

Programs operating with a shadow stack push the return address onto the data stack as well as the shadow stack in the function prologue. Such programs when returning from the function pop the link register from the data stack and pop a shadow copy of the link register from the shadow stack. The two values are then compared. If the values do not match it is indicative of a corruption of the return address variable and the program causes an illegal instruction exception.

When x1 is used by the ABI as the link register, the x5 may be used to hold the shadow link register value from the shadow stack. Alternatively, if the ABI uses x5 as the link register, then the x1 register may be used as the shadow link register.

A sschkra instruction is provided to perform the comparison.

The operation of the sschkra instruction is as follows:

sschkra operation
If (xBCFIE = 1)
   if [x1] != [x5]
       Raise illegal-instruction exception
   endif
else
    [dst] = 0;
endif

Atomic Swap from a shadow stack location

The CFI extension defines an ssamoswap instruction to atomically swap the XLEN bits of src register with XLEN bits on the shadow stack at address in addr and store the value from address in src into register dst.

The ssamoswap is always sequentially consistent and cannot be reordered with earlier or later memory operations from the same hart.

The ssamoswap requires the virtual address in addr to have a shadow stack attribute (see Shadow Stack Memory Protection).

If the virtual address is not XLEN aligned then the instructions cause a store/AMO address-misaligned exception.

The operation of the ssamoswap instructions is as follows:

ssamoswap operation
If (xBCFIE = 1)
    Perform the following atomically with sequential consistency
        [dst]  = *[addr]
       *[addr] = [src]
else
    [dst] = 0;
endif
Note

Stack switching is a common operation in user programs as well as supervisor programs. When a stack switch is performed the stack pointer of the currently active stack is saved into a context data structure and the new stack is made active by loading a new stack pointer from a context data structure.

When shadow stacks are enabled for a program, the program needs to additionally switch the shadow stack pointer. The pointer to the top of the deactivated shadow stack if held in a context data structure may be susceptible to memory corruption vulnerabilities. To protect the pointer value the program may then store it at the top of the shadow stack itself and thus create a checkpoint.

An example sequence to store and restore the shadow stack pointer is as follows:

# The a0 register holds the pointer to top of new shadow
# to switch to. The current ssp is first pushed on the current
# shadow stack and the ssp is restored from new shadow stack
save_shadow_stack_pointer:
    ssprr  x5                   # read ssp and push value onto
    sspush x5                   # shadow stack. The [ssp] now
                                # holds ssp+8. Save away x5
                                # into a context structure to
                                # restore later.
restore_shadow_stack_pointer:
    ssamoswap t0, a0, x0        # t0=*[ssp] and *[ssp]=0
    addi      t0, t0, (XLEN/8)  # t0+XLEN/8 must match to a0
    bnez      t0, a0, crash     # if not crash program
    csrw      ssp, t0           # setup new ssp

Further the program may enforce an invariant that a shadow stack can be active only on one hart by using the ssamoswap when performing the restore from the checkpoint such that the checkpointed data is zeroed as part of the restore sequence and multiple hart attempt to restore the checkpointed data only one of them succeeds.

Shadow Stack Memory Protection

To protect shadow stack memory the memory is associated with a new page type - Shadow Stack (SS) page - in the page tables.

When the Smepmp extension is supported the PMP configuration registers are enhanced to support a shadow stack memory region for use by M-mode.

Virtual-Memory system extension for Shadow Stack

The shadow stack memory is protected using page table attributes such that it cannot be stored to by instructions other than sspush and amossswap. The sspop instruction can only load from shadow stack memory.

The shadow stack can be read using all instructions that load from memory.

The encoding R=0, W=1, and X=0, is defined to mean a shadow stack page. When menvcfg.CFI=0, this encoding continues to be reserved. When V=1 and henvcfg.CFI=0, this encoding continues to be reserved at VS and VU.

The following faults may occur:

  1. If the accessed page is a shadow stack page

    1. Stores other than sspush and ssamoswap cause write/AMO access faults.

    2. Instructions fetch causes a page fault

  2. if the accessed page is not a shadow stack page

    1. ssamoswap and sspush cause a store/AMO access fault

    2. sspop causes a load access fault

To support these rules, the virtual address translation process specified in section 4.3.2 of the Privileged Specification cite:[PRIV] is modified as follows:

  1. If pte.v = 0 or if any bits of encodings that are reserved for future standard use are set within pte, stop and raise a page-fault exception corresponding to the original access type. The encoding pte.xwr = 010b is not reserved if menvcfg.CFI is 1 or if V=1 and henvcfg.CFI is 1.

  2. Otherwise, the PTE is valid. If pte.r = 1 or pte.w = 1 or pte.x = 1, go to step 5. Otherwise, this PTE is a pointer to the next level of the page table. Let i = i - 1. If i < 0, store and raise a page-fault exception corresponding to the original access type. Otherwise, let a = pte.ppn x PAGESIZE and go to step 2.

  3. A leaf PTE has been found. If the memory access is by a shadow stack instruction and pte.xwr != 010b then cause an access-violation exception corresponding to the access type. If the memory acccess is a store/AMO and pte.xwr == 010b then cause an store/AMO access-violation. If the requested memory access is not allowed by the pte.r, pte.w, pte.x, and pte.u bits, given the current privilege mode and the value of the SUM and MXR fields of the mstatus register, stop and raise a page-fault exception corresponding to the original access type.

The U and SUM bit enforcement is performed normally for shadow stack instruction initiated memory accesses.

Svpbmt extension and Svnapot extensions are supported for shadow stack pages.

Note

Operating systems should protect against writeable non-shadow-stack alias virtual-addresses mappings being created to the shadow stack physical memory.

The G-stage address translation and protections are not affected by the shadow stack extension. When G-stage page tables are active, the ssamoswap, and sspop instructions require the G-stage page table mapping the accessed memory to have read permission and the ssamoswap and sspush instructions require write permission. The SS encoding in the G-stage PTE remains reserved.

Note

A future extension may define shadow stack encoding the G-stage page table to support use cases such as a hypervisor enforcing shadow stack protections for virtual-supervisor.

Note

All loads are allowed to read the shadow stack. The shadow stack only holds a copy of the link register as saved on the regular stack. The ability to read the shadow stack is useful for debug, performance profiling, and other use cases.

PMP extension for shadow stack

When privilege mode is less than M, the PMP region accessed by sspush and ssamoswap must provide write permission and the PMP region accessed by sspop must provide read permission.

The M-mode memory accesses by sspush and ssamoswap instructions test for write permission in the matching PMP entry when permission checking is required.

The M-mode memory accesses by sspop instruction tests for read permission in the matching PMP entry when permission checking is required.

When the Smepmp extension is implemented, a new WARL field sspmp is defined in bits 8:3 of the mseccfg CSR to configrue a PMP entry as the shadow stack memory region for M-mode accesses.

When mseccfg.MML is 1, the sspmp field is read-only else it may be written.

When sspmp field is implemented and mseccfg.MML is 1 the following rules are additionally enforcecd for M-mode memory accesses:

  • sspush, sspop, and ssamoswap instructions must match PMP entry sspmp.

  • Write by instructions other than sspush and ssamoswap that match PMP entry sspmp cause an access violation exception.

Note

The PMP region used for the M-mode shadow stack is expected to be made inaccessible for U- and S-mode read and write accesses. Allowing write access violates the integrity of the shadow stack and allowing read access may lead to disclosure of M-mode return addresses.