Object format
The tenyr object format is in flux, so the best source of documentation is the code in src/obj.h and src/obj.c. A brief description of the elements contained in a serialised tenyr object as of version 0 follows.
For the sake of conciseness, any reference to a "word" in the following text can be taken to mean "an unsigned 32-bit quantity representable by the uint32_t type" unless otherwise specified. A recurrent theme in the tenyr object format is the consistent sizing of fields to fit into 32-bit words. Except for strings (used in symbol and relocation names), all fields are 32 bits (one word) wide.
Every list in the tenyr object format is preceded by a word specifying the number of elements in that list.
The entry point to a linked tenyr object is fixed at 0x1000. There is no way to specify an entry point in the version 0 object format, although the entry point can be configured at load time in the simulator.
Every tenyr object, as of version 0, consists of four main parts:
- the header
- a list of records
- a list of symbols
- a list of relocations
The tenyr object header consists of three main parts:
- the magic string "TOV"
- a version byte (currently only version \0 is valid)
- a flags word (currently no object-wide flags are defined)
The magic string and version byte can be considered to be a single one-word-wide field as well. Thus, the header consists of two words, which can be represented by the following C structure:
struct {
    char magic[3];
    uint8_t version;
    uint32_t flags;
} header;
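As an illustration of the two-word header, the following is a minimal sketch of validating it from a byte buffer. The names obj_header and parse_header are hypothetical and are not taken from src/obj.h. The byte order of the flags word is not stated in this document, so the sketch assumes it is stored in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of the version 0 header (illustrative names). */
struct obj_header {
    char     magic[3];
    uint8_t  version;
    uint32_t flags;
};

/*
 * Validate and copy out the header found at the start of buf.
 * Returns 0 on success, or -1 if the buffer is shorter than two words,
 * the magic string is not "TOV", or the version byte is not 0.
 */
int parse_header(const unsigned char *buf, size_t len, struct obj_header *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 8)                      /* header is exactly two 32-bit words */
        return -1;
    if (memcmp(buf, "TOV", 3) != 0)   /* magic string occupies bytes 0..2 */
        return -1;
    if (buf[3] != 0)                  /* only version 0 is currently valid */
        return -1;

    memcpy(out->magic, buf, 3);
    out->version = buf[3];
    /* ASSUMPTION: the flags word is stored in host byte order. */
    memcpy(&out->flags, buf + 4, sizeof out->flags);
    return 0;
}
```

Copying field by field with memcpy, rather than casting the buffer to a structure pointer, avoids relying on the buffer's alignment.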
The first list, coming directly after the header, is a list of records (a "record" is a region of serialised contiguous memory). It consists of a count word followed by count record structures. Each record structure consists of an addr word, specifying the base address of the record, a size word, specifying the number of data words in the record, and size data words. Thus, the records list can be represented by the following pseudo-C structure:
uint32_t count;
struct {
    uint32_t addr;
    uint32_t size;
    uint32_t data[size];
} records[count];
There is no required relationship between the values of the size member of different records; thus, it is not possible to predict the offset of a particular record without having read all previous records. This is a design tradeoff for simplicity, and may be addressed in a subsequent object format revision (with a different version number).
As of this writing, all tenyr objects produced by the assembler and linker contain exactly one record, though this should not be depended upon.
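Because each record carries its own size, locating a given record means walking every record before it. The following is a minimal sketch of that walk over a records list that has already been read into an array of words. The function name find_record and its interface are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Locate record number `which` (0-based) in a records list held in `words`,
 * where words[0] is the count word. On success, stores the word index of the
 * record's addr field in *offset and returns 0. Returns -1 if the record does
 * not exist or the list is truncated before reaching it.
 */
int find_record(const uint32_t *words, size_t nwords, uint32_t which,
                size_t *offset)
{
    assert(words != NULL && offset != NULL);

    if (nwords < 1 || which >= words[0])
        return -1;

    size_t pos = 1;                       /* first record follows the count */
    for (uint32_t i = 0; ; i++) {
        if (nwords - pos < 2)             /* need at least addr and size */
            return -1;
        if (i == which) {
            *offset = pos;
            return 0;
        }
        uint32_t size = words[pos + 1];
        if (size > nwords - pos - 2)      /* data would run past the buffer */
            return -1;
        pos += 2 + (size_t)size;          /* skip addr, size and data words */
    }
}
```

For a list holding a 3-word record at 0x1000 followed by a 1-word record at 0x2000, the second record is found at word index 6, which can only be computed after reading the first record's size.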
The second list, coming directly after the records list, is a list of symbols (a "symbol" is a label exported by an object for linking purposes). It consists of a count word followed by count symbol structures. Each symbol structure consists of a flags word, specifying flags for that symbol (currently no symbol-specific flags are defined), a name string of SYMBOL_LEN 8-bit characters, and a value word specifying the value of the symbol, which is generally its address relative to the beginning of the object (defined as 0x0). Thus, the symbols list can be represented by the following pseudo-C structure:
uint32_t count;
struct {
    uint32_t flags;
    char name[SYMBOL_LEN];
    uint32_t value;
} symbols[count];
The value of SYMBOL_LEN is currently 32, which includes the required \0 terminator, for a maximum symbol length of 31 8-bit characters.
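The 31-character limit and the fixed per-symbol size can be sketched as follows. The names obj_symbol, symbol_set_name and symbols_list_bytes are hypothetical. Zero-filling the unused tail of the name field is an assumption, as this document only requires the \0 terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYMBOL_LEN 32   /* current value, including the \0 terminator */

/* Hypothetical in-memory form of one symbol entry (illustrative names). */
struct obj_symbol {
    uint32_t flags;
    char     name[SYMBOL_LEN];
    uint32_t value;
};

/*
 * Store name in sym->name. Returns 0 on success, or -1 if the name is
 * longer than SYMBOL_LEN - 1 characters and so cannot be terminated.
 */
int symbol_set_name(struct obj_symbol *sym, const char *name)
{
    assert(sym != NULL && name != NULL);

    size_t len = strlen(name);
    if (len > SYMBOL_LEN - 1)
        return -1;

    /* ASSUMPTION: unused bytes of the name field are zero-filled. */
    memset(sym->name, 0, sizeof sym->name);
    memcpy(sym->name, name, len);
    return 0;
}

/*
 * Size in bytes of a serialised symbols list: one count word, then for each
 * symbol a flags word, SYMBOL_LEN name bytes and a value word.
 */
size_t symbols_list_bytes(uint32_t count)
{
    return 4 + (size_t)count * (4 + SYMBOL_LEN + 4);
}
```

Unlike records, every symbol entry has the same size, so the offset of any symbol within the list can be computed directly from its index.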
The third and final list, coming directly after the symbols list, is a list of relocations (a "relocation" is an outstanding modification to be made to the object once it is loaded into memory, in order to render symbol references valid). It consists of a count word followed by count relocation structures. Each relocation structure consists of a flags word, specifying flags for that relocation (currently no relocation-specific flags are defined), a name string of SYMBOL_LEN 8-bit characters specifying the name of the referenced symbol, an addr word specifying the offset into this object of the word to update when relocations are done, and a width word specifying the width in bits of the portion of the target word to update, starting from the least-significant bit. Thus, the relocations list can be represented by the following pseudo-C structure:
uint32_t count;
struct {
    uint32_t flags;
    char name[SYMBOL_LEN];
    uint32_t addr;
    uint32_t width;
} relocations[count];
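The role of the width word can be illustrated with a small sketch that updates only the low width bits of a target word and leaves the remaining bits untouched. The helper names reloc_mask and reloc_patch are hypothetical. How the new field value is derived from the referenced symbol is not described in this document, so the sketch simply takes it as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low `width` bits of a word (width may be 0..32). */
uint32_t reloc_mask(uint32_t width)
{
    assert(width <= 32);
    /* Shifting a 32-bit value by 32 is undefined in C, so handle it apart. */
    return width >= 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;
}

/*
 * Return `word` with its low `width` bits replaced by the low `width` bits
 * of `field`. ASSUMPTION: `field` is the already-computed value to store;
 * its derivation from the symbol is outside the scope of this sketch.
 */
uint32_t reloc_patch(uint32_t word, uint32_t field, uint32_t width)
{
    uint32_t mask = reloc_mask(width);
    return (word & ~mask) | (field & mask);
}
```

For example, patching 0xabcd0000 with the field 0x1234 at a width of 12 yields 0xabcd0234: only the low 12 bits change, and the bits of the field above the width are discarded.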
For more information on how relocations work, see the linker document.