Skip to content

Commit

Permalink
Klee maze example (lifting-bits#369)
Browse files Browse the repository at this point in the history
* In progress. Working on an example of using KLEE on a Maze, but with the maze program being compiled to x86, amd64, and aarch64.

* Making lots of progress on getting lifting and runnning an aarch64 maze program on amd64, but using --explicit_args. The key thing I'm working through right now is a jump offset table, but where the offset is a block pc, rather than a table base. Also adding various bits of code here and there to making runnning with klee more directly doable, and working on a debugging facility to track down when the emulated program counter gets out of sync with the original program.

* Fixed a subtle @page and @PAGEOFF-related reference bug on AArch64. Partially disabled the special jump offset table handling I had in table.py, as it doesn't (yet) handle the shifted table values. However, I still have the code there, so that it can recognize that a basic block address is used as a possible offset, so that I can remove the block address as a reference, which permits a new heuristic on the C++ side to work. On the C++ side, when there's a jump instruction that isn't associated with a cross-reference flow, I try to auto-augment it with addition switch cases, targeting blocks with no predecessors (as present in the CFG). This seems to work reasonably well.

* Improved the scripts and updated the READMEs.

* Minor rephrase

* Minor rephrase
  • Loading branch information
pgoodman authored Jan 14, 2018
1 parent bc17c70 commit 8a9856a
Show file tree
Hide file tree
Showing 25 changed files with 1,074 additions and 297 deletions.
274 changes: 83 additions & 191 deletions .gdbinit
Original file line number Diff line number Diff line change
Expand Up @@ -6,128 +6,37 @@ define print-rip
dont-repeat
end

define print-reg-state-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf " emulated native\n"
printf "rip 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2408)), $rip
printf "rax 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2152)), $rax
printf "rbx 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2168)), $rbx
printf "rcx 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2184)), $rcx
printf "rdx 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2200)), $rdx
printf "rsi 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2216)), $rsi
printf "rdi 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2232)), $rdi
printf "rbp 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2264)), $rbp
printf "rsp 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2248)), $rsp
printf "r8 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2280)), $r8
printf "r9 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2296)), $r9
printf "r10 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2312)), $r10
printf "r11 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2328)), $r11
printf "r12 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2344)), $r12
printf "r13 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2360)), $r13
printf "r14 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2376)), $r14
printf "r15 0x%016lx 0x%016lx\n", *((unsigned long long *)($rptr + 2392)), $r15
dont-repeat
end

define addr-of-xmm0-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm0) = 0x%016lx\n", $rptr + 16
dont-repeat
end

define addr-of-xmm1-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm1) = 0x%016lx\n", $rptr + 80
dont-repeat
end

define addr-of-xmm2-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm2) = 0x%016lx\n", $rptr + 144
dont-repeat
end

define addr-of-xmm3-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm3) = 0x%016lx\n", $rptr + 208
dont-repeat
end

define addr-of-xmm4-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm4) = 0x%016lx\n", $rptr + 272
dont-repeat
end

define addr-of-xmm5-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm5) = 0x%016lx\n", $rptr + 336
dont-repeat
end

define addr-of-xmm6-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm6) = 0x%016lx\n", $rptr + 400
dont-repeat
end

define addr-of-xmm7-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm7) = 0x%016lx\n", $rptr + 464
dont-repeat
end

define addr-of-xmm8-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm8) = 0x%016lx\n", $rptr + 528
dont-repeat
end

define addr-of-xmm9-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm9) = 0x%016lx\n", $rptr + 592
dont-repeat
end

define addr-of-xmm10-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm10) = 0x%016lx\n", $rptr + 656
dont-repeat
end

define addr-of-xmm11-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm11) = 0x%016lx\n", $rptr + 720
dont-repeat
end

define addr-of-xmm12-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm12) = 0x%016lx\n", $rptr + 784
dont-repeat
end

define addr-of-xmm13-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm13) = 0x%016lx\n", $rptr + 848
dont-repeat
end

define addr-of-xmm14-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm14) = 0x%016lx\n", $rptr + 912
dont-repeat
end
set $__rax_offset = 2216
set $__flags_offset = 2064
set $__xmm0_offset = 16

define addr-of-xmm15-64
define print-reg-state-amd64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&(RegState::xmm15) = 0x%016lx\n", $rptr + 976
dont-repeat
end

define print-flags-64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
set $flptr = (char *) ($rptr + 0x810)
printf " emulated native\n"
set $__rax_ptr = $rptr + $__rax_offset
printf "rip 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 16 * 16)), $rip
printf "rax 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 0 * 16)), $rax
printf "rbx 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 1 * 16)), $rbx
printf "rcx 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 2 * 16)), $rcx
printf "rdx 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 3 * 16)), $rdx
printf "rsi 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 4 * 16)), $rsi
printf "rdi 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 5 * 16)), $rdi
printf "rsp 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 6 * 16)), $rsp
printf "rbp 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 7 * 16)), $rbp
printf "r8 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 8 * 16)), $r8
printf "r9 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 9 * 16)), $r9
printf "r10 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 10 * 16)), $r10
printf "r11 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 11 * 16)), $r11
printf "r12 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 12 * 16)), $r12
printf "r13 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 13 * 16)), $r13
printf "r14 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 14 * 16)), $r14
printf "r15 0x%016lx 0x%016lx\n", *((unsigned long long *)($__rax_ptr + 15 * 16)), $r15
dont-repeat
end

define print-flags-amd64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
set $flptr = (char *) ($rptr + $__flags_offset)
printf "eflags ["
if $flptr[1]
printf "CF "
Expand All @@ -154,122 +63,105 @@ define print-flags-64
dont-repeat
end

define print-reg-state-32
set $rptr = ((unsigned (*)(void))__mcsema_debug_get_reg_state)()
printf " emulated native\n"
printf "eip 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2408)), $eip
printf "eax 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2152)), $eax
printf "ebx 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2168)), $ebx
printf "ecx 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2184)), $ecx
printf "edx 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2200)), $edx
printf "esi 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2216)), $esi
printf "edi 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2232)), $edi
printf "ebp 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2264)), $ebp
printf "esp 0x%08lx 0x%016lx\n", *((unsigned long long *)($rptr + 2248)), $esp
dont-repeat
define print-flags-x86
print-flags-amd64
end


define addr-of-rip
define print-reg-state-x86
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rip = 0x%016lx\n", $rptr + 2408
dont-repeat
end

define addr-of-rax
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rax = 0x%016lx\n", $rptr + 2152
dont-repeat
end

define addr-of-rbx
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rbx = 0x%016lx\n", $rptr + 2168
dont-repeat
end

define addr-of-rcx
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rcx = 0x%016lx\n", $rptr + 2184
dont-repeat
end

define addr-of-rdx
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rdx = 0x%016lx\n", $rptr + 2200
printf " emulated native\n"
set $__rax_ptr = $rptr + $__rax_offset
printf "eip 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 16 * 16)), (unsigned) $pc
printf "eax 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 0 * 16)), $eax
printf "ebx 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 1 * 16)), $ebx
printf "ecx 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 2 * 16)), $ecx
printf "edx 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 3 * 16)), $edx
printf "esi 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 4 * 16)), $esi
printf "edi 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 5 * 16)), $edi
printf "esp 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 6 * 16)), $esp
printf "ebp 0x%08x 0x%08x\n", *((unsigned *)($__rax_ptr + 7 * 16)), $ebp
dont-repeat
end

define addr-of-rsi
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rsi = 0x%016lx\n", $rptr + 2216
dont-repeat
end
set $__x0_offset = 544

define addr-of-rdi
define print-reg-state-aarch64
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rdi = 0x%016lx\n", $rptr + 2232
printf "\temulated\n"
set $__x0_ptr = $rptr + $__x0_offset
set $__i = 0
while $__i < 31
printf "x%d\t0x%016lx\t&x%d = 0x%lx\n", $__i, *((unsigned long long *)($__x0_ptr + $__i * 16)), $__i, $__x0_ptr + $__i * 16
set $__i = $__i + 1
end

printf "sp\t0x%016lx\n", *((unsigned long long *)($__x0_ptr + $__i * 16))
set $__i = $__i + 1
printf "pc\t0x%016lx\t&pc = 0x%lx\n", *((unsigned long long *)($__x0_ptr + $__i * 16)), $__x0_ptr + $__i * 16
dont-repeat
end

define addr-of-rbp
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rbp = 0x%016lx\n", $rptr + 2264
dont-repeat
end

define addr-of-rsp
define addr-of-rip
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&rsp = 0x%016lx\n", $rptr + 2248
set $__rax_ptr = $rptr + $__rax_offset
printf "&rip = 0x%016lx\n", $__rax_ptr + 16 * 16
dont-repeat
end


define addr-of-r8
define addr-of-rax
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r8 = 0x%016lx\n", $rptr + 2280
set $__rax_ptr = $rptr + $__rax_offset
printf "&rax = 0x%016lx\n", $__rax_ptr + 0 * 16
dont-repeat
end

define addr-of-r9
define addr-of-rbx
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r9 = 0x%016lx\n", $rptr + 2296
set $__rax_ptr = $rptr + $__rax_offset
printf "&rbx = 0x%016lx\n", $__rax_ptr + 1 * 16
dont-repeat
end

define addr-of-r10
define addr-of-rcx
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r10 = 0x%016lx\n", $rptr + 2312
set $__rax_ptr = $rptr + $__rax_offset
printf "&rcx = 0x%016lx\n", $__rax_ptr + 2 * 16
dont-repeat
end

define addr-of-r11
define addr-of-rdx
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r11 = 0x%016lx\n", $rptr + 2328
set $__rax_ptr = $rptr + $__rax_offset
printf "&rdx = 0x%016lx\n", $__rax_ptr + 3 * 16
dont-repeat
end

define addr-of-r12
define addr-of-rsi
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r12 = 0x%016lx\n", $rptr + 2344
set $__rax_ptr = $rptr + $__rax_offset
printf "&rsi = 0x%016lx\n", $__rax_ptr + 4 * 16
dont-repeat
end

define addr-of-r13
define addr-of-rdi
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r13 = 0x%016lx\n", $rptr + 2360
set $__rax_ptr = $rptr + $__rax_offset
printf "&rdi = 0x%016lx\n", $__rax_ptr + 5 * 16
dont-repeat
end

define addr-of-r14
define addr-of-rsp
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r14 = 0x%016lx\n", $rptr + 2376
set $__rax_ptr = $rptr + $__rax_offset
printf "&rsp = 0x%016lx\n", $__rax_ptr + 6 * 16
dont-repeat
end

define addr-of-r15
define addr-of-rbp
set $rptr = ((unsigned long long (*)(void))__mcsema_debug_get_reg_state)()
printf "&r15 = 0x%016lx\n", $rptr + 2392
set $__rax_ptr = $rptr + $__rax_offset
printf "&rbp = 0x%016lx\n", $__rax_ptr + 7 * 16
dont-repeat
end

3 changes: 3 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -105,3 +105,6 @@ install(

install(CODE
"execute_process(COMMAND python ${PROJECT_SOURCE_DIR}/tools/setup.py install -f --prefix=${CMAKE_INSTALL_PREFIX})")

add_subdirectory(examples)

5 changes: 4 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ Why would anyone translate binaries *back* to bitcode?

* **Binary Patching And Modification**. Lifting to LLVM IR lets you cleanly modify the target program. You can run obfuscation or hardening passes, add features, remove features, rewrite features, or even fix that pesky typo, grammatical error, or insane logic. When done, your new creation can be recompiled to a new binary sporting all those changes. In the [Cyber Grand Challenge](https://blog.trailofbits.com/2015/07/15/how-we-fared-in-the-cyber-grand-challenge/), we were able to use McSema to translate challenge binaries to bitcode, insert memory safety checks, and then re-emit working binaries.

* **Symbolic Execution with KLEE**. [KLEE](https://klee.github.io/) operates on LLVM bitcode, usually generated by providing source to the LLVM toolchain. McSema can lift a binary to LLVM bitcode, [permitting KLEE to operate on previously unavailable targets](https://blog.trailofbits.com/2014/12/04/close-encounters-with-symbolic-execution-part-2/).
* **Symbolic Execution with KLEE**. [KLEE](https://klee.github.io/) operates on LLVM bitcode, usually generated by providing source to the LLVM toolchain. McSema can lift a binary to LLVM bitcode, [permitting KLEE to operate on previously unavailable targets](https://blog.trailofbits.com/2014/12/04/close-encounters-with-symbolic-execution-part-2/). See our [walkthrough](examples/Maze/README.md) showing how to run KLEE on a symbolic maze.

* **Re-use existing LLVM-based tools**. KLEE is not the only tool that becomes available for use on bitcode. It is possible to run LLVM optimization passes and other LLVM-based tools like [libFuzzer](http://llvm.org/docs/LibFuzzer.html) on [lifted bitcode](docs/UsingLibFuzzer.md).

Expand Down Expand Up @@ -71,8 +71,11 @@ sudo apt-get install \
git \
cmake \
python2.7 python-pip \
wget \
build-essential \
gcc-multilib g++-multilib \
libtinfo-dev \
lsb-release \
realpath

sudo pip install --upgrade pip
Expand Down
18 changes: 18 additions & 0 deletions examples/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# Copyright (c) 2017 Trail of Bits, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

project(examples)
cmake_minimum_required (VERSION 3.2)

add_executable(maze Maze/Maze.c)
2 changes: 2 additions & 0 deletions examples/Maze/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
bc/*
klee-*
Loading

0 comments on commit 8a9856a

Please sign in to comment.