This repository contains the following useful items related to the AMDGPU ISA assembler:
- amdphdrs: utility to convert ELF produced by llvm-mc into AMD Code Object (v1)
- examples/asm-kernel: example of AMDGPU kernel code
- examples/gfx8/ds_bpermute: transfer data between lanes in a wavefront with ds_bpermute_b32
- examples/gfx8/dpp_reduce: calculate prefix sum in a wavefront with DPP instructions
- examples/gfx8/s_memrealtime: use s_memrealtime instruction to create a delay
- examples/gfx8/s_memrealtime_inline: inline assembly in OpenCL kernel version of s_memrealtime
- examples/api/assemble: use LLVM API to assemble a kernel
- examples/api/disassemble: use LLVM API to disassemble a stream of instructions
- bin/sp3_to_mc.pl: script to convert some AMD sp3 legacy assembler syntax into LLVM MC
- examples/sp3: examples of convertible sp3 code
At the time of this writing (February 2016), an LLVM trunk build and the latest ROCR runtime are needed.
LLVM trunk (May or later) uses lld as the linker and produces AMD Code Object (v2).
A top-level CMakeLists.txt is provided to build everything included. The following CMake variables should be set:
- HSA_DIR (default /opt/hsa/bin): path to ROCR Runtime
- LLVM_DIR: path to LLVM build directory
To build everything, create a build directory and run cmake and make:
mkdir build
cd build
cmake -DLLVM_DIR=/srv/git/llvm.git/build ..
make
Examples that require clang are built only if clang is built as part of LLVM.
The following llvm-mc command line produces the ELF object asm.o from the assembly source asm.s:
llvm-mc -arch=amdgcn -mcpu=fiji -filetype=obj -o asm.o asm.s
The contents of the .text section can be extracted after assembling to a code object:
llvm-mc -arch=amdgcn -mcpu=fiji -filetype=obj -o asm.o asm.s
objdump -h asm.o | grep .text | awk '{print "dd if='asm.o' of='asm' bs=1 count=$[0x" $3 "] skip=$[0x" $6 "]"}' | bash
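The one-liner above relies on the column layout of objdump -h, where the third field of a section row is its size and the sixth is its file offset, both in hexadecimal. A minimal sketch that makes those two fields explicit follows. The helper name text_extent is hypothetical and not part of this repository, and the GNU objdump column order (Idx, Name, Size, VMA, LMA, File off, Algn) is assumed.

```shell
# Hypothetical helper (not part of this repository).
# Reads `objdump -h` output on stdin and prints "<size> <offset>" of the
# .text section in decimal.
text_extent() {
    # Match the section name exactly and pick the Size and File off columns.
    fields=$(awk '$2 == ".text" { print $3, $6; exit }')
    [ -n "$fields" ] || return 1
    # Split the two hex fields into $1 and $2.
    set -- $fields
    # Convert both hex values to decimal.
    echo "$((0x$1)) $((0x$2))"
}

# Usage, assuming asm.o was produced by the llvm-mc command above:
#   set -- $(objdump -h asm.o | text_extent)
#   dd if=asm.o of=asm bs=1 count="$1" skip="$2"
```

Unlike grep .text, the exact match on the name column does not pick up other sections whose names merely contain "text".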
The following command line may be used to dump the contents of a code object:
llvm-objdump -disassemble -mcpu=fiji asm.o
The output includes a text disassembly of the .text section.
The following command line may be used to disassemble a raw instruction stream (without ELF structure):
hexdump -v -e '/1 "0x%02X "' asm | llvm-mc -arch=amdgcn -mcpu=fiji -disassemble
Here, hexdump prints the contents of the file as hexadecimal bytes (0x.. form), which llvm-mc then consumes.
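To see the intermediate text form that llvm-mc receives, the same conversion can be sketched with POSIX od and awk. This is a substitute for hexdump, not what the repository uses. The helper name bytes_to_hex is hypothetical, and the sketch emits lowercase hex digits, which llvm-mc is assumed to accept in the same way as the uppercase form.

```shell
# Hypothetical helper (not part of this repository).
# Reads raw bytes on stdin and prints them as "0xNN " tokens on one line.
bytes_to_hex() {
    # od prints every byte (-v) as two hex digits (-tx1) with no offsets (-An);
    # awk prefixes each byte with 0x and joins everything on a single line.
    od -An -v -tx1 | awk '{ for (i = 1; i <= NF; i++) printf "0x%s ", $i }'
}

# Usage, assuming asm holds the raw .text bytes extracted earlier:
#   bytes_to_hex < asm | llvm-mc -arch=amdgcn -mcpu=fiji -disassemble
```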
To assemble a kernel using the LLVM API, refer to examples/api/assemble.
To disassemble a stream of instructions using the LLVM API, refer to examples/api/disassemble.
Note that normally the standard lld and Code Object version 2 should be used, since version 2 is closer to the standard ELF format.
amdphdrs (now obsolete) is a complementary utility that can be used to produce AMDGPU Code Object version 1.
For example, given assembly source in asm.s, the following will assemble it and link using amdphdrs:
llvm-mc -arch=amdgcn -mcpu=fiji -filetype=obj -o asm.o asm.s
amdphdrs asm.o asm.co
SP3 supports a proprietary set of macros/tools. The sp3_to_mc.pl script attempts to translate them into the GAS syntax understood by llvm-mc.
LLVM AMDGPU:
flat_atomic_cmpswap v7, v[9:10], v[7:8]
SP3:
flat_atomic_cmpswap v[7:8], v[9:10], v[7:8]
LLVM AMDGPU:
flat_atomic_swap_x2 v[0:1], v[0:1], v[2:3] glc
SP3:
flat_atomic_swap_x2 v[0:1], v[0:1], v[2:3]