Benedikt Geßele edited this page Apr 16, 2015 · 3 revisions


Interfacing to the Generated C Code

The input files passed to the GDSL compiler may contain export declarations that tell the compiler which functions to make publicly available in the emitted code. Beyond the user-exported functions, each emitted GDSL program contains a few functions that comprise the run-time of a decoder. This section details this run-time and generically describes how a GDSL function can be called from C.

The Interface to the Run-Time

The run-time of a GDSL C library contains a few type declarations that are the same across all decoders. These types resemble their counterparts in GDSL programs:

type         usage
obj_t        a pointer to a heap-allocated object
state_t      a pointer to the state of the run-time
int_t        the integer type used in the decoder
string_t     a pointer to a C string
vec_data_t   the contents of a bit vector
vec_t        a bit-vector structure, containing size and content
con_tag_t    a constructor tag, used in algebraic data types

Not all of these types are relevant when interfacing with the run-time. We briefly detail each type in turn.

the obj_t type

A decoder function returns a pointer to an algebraic data type that represents the abstract syntax tree (AST) of the decoded instruction. This pointer has the type obj_t. The bytes under this pointer contain a con_tag_t that identifies the constructor of the data type, followed by the argument of the constructor. Rather than accessing the values on the heap directly, it is more convenient to pass this pointer to a pretty-printing function written in GDSL that turns the heap structure into a string of type string_t, which is a synonym for char*. A more general approach is to reproduce the AST in C by calling a GDSL function that takes a set of C function pointers and calls the one that matches the obj_t value.

the state_t type

The run-time retains certain information about the heap and the current monadic state in a value of type state_t. A value of this type must therefore be passed to every GDSL function as the first argument.

the int_t and string_t types

These types represent integers and strings. Currently, int_t is always a 64-bit integer so that all possible constants can be represented in this type. The string_t type is a synonym for char*. A value of type string_t is only returned by the run-time function gdsl_merge_rope described below.

the vec_data_t and vec_t types

Bit vectors with explicit size are represented by the following C structure:

    struct vec {
      unsigned int size;
      vec_data_t data;
    };

    typedef struct vec vec_t;

Here, vec_data_t is the type of the bit-vector content. It is currently a 64-bit integer. Thus, it is not possible to write GDSL programs with bit vectors that are larger than 64 bits.
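As a minimal sketch of how a caller might fill this structure, the snippet below repeats the declaration and adds a small constructor function. It assumes that vec_data_t is uint64_t, in line with the 64-bit limit stated above; the helper make_vec is hypothetical and not part of the generated code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: vec_data_t is a 64-bit unsigned integer; the generated
   header may spell the type differently. */
typedef uint64_t vec_data_t;

struct vec {
  unsigned int size;
  vec_data_t data;
};
typedef struct vec vec_t;

/* Hypothetical helper (not part of the GDSL run-time): build a bit vector
   of `size` bits, keeping only the low `size` bits of `bits`. */
static vec_t make_vec(unsigned int size, vec_data_t bits) {
  vec_t v;
  assert(size <= 64); /* larger vectors cannot be represented */
  v.size = size;
  /* Shifting a 64-bit value by 64 is undefined in C, so the full width
     is handled separately. */
  v.data = (size == 64) ? bits : (bits & ((UINT64_C(1) << size) - 1));
  return v;
}
```

Masking the data to the stated size keeps stray high bits from leaking into a comparison of two vectors of the same size.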

the con_tag_t type

This is the type of the tag used to distinguish different constructors. For algebraic data types whose constructors take no arguments, an obj_t pointer can be cast to a con_tag_t pointer, and the value it points to can be compared with the CON_xx constants that are defined for each constructor xx in a decoder.
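The cast described above might look as follows. This is only a sketch: the underlying types chosen for obj_t and con_tag_t, the constants CON_EXAMPLE_A and CON_EXAMPLE_B, and the helpers tag_of and describe are all assumptions made for illustration; a real decoder header supplies its own types and CON_xx constants.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch: the underlying types and the two tag
   constants are made up; the generated header defines the real ones. */
typedef void *obj_t;
typedef int64_t con_tag_t;

enum { CON_EXAMPLE_A = 1, CON_EXAMPLE_B = 2 };

/* Hypothetical helper: read the constructor tag stored at the start of
   a heap object. */
static con_tag_t tag_of(obj_t o) {
  assert(o != 0);
  return *(con_tag_t *)o;
}

/* Hypothetical helper: dispatch on the tag. */
static const char *describe(obj_t o) {
  switch (tag_of(o)) {
    case CON_EXAMPLE_A: return "A";
    case CON_EXAMPLE_B: return "B";
    default:            return "unknown";
  }
}
```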

functions of the run-time

A decoder can be compiled with the --prefix= option, in which case each of the following functions is emitted with the given prefix instead of gdsl. However, if GDSL_NO_PREFIX is defined before including the header file of a decoder, the standard gdsl prefix is used for all functions. We present all functions with the standard gdsl prefix.

state_t gdsl_init(void): initializing the library

An initial run-time state can be obtained by calling gdsl_init. The returned state should be freed by calling gdsl_destroy. Note that several states can be active at the same time, making it possible to run GDSL programs in several threads.

void gdsl_set_code(state_t s, char* buf, size_t buf_len, size_t base): set the code buffer

This function sets the buffer that instructions are read from. The buf parameter points to the buffer, which is buf_len bytes long. The parameter base denotes the address that the built-in GDSL function ip_get returns when called before any bytes have been consumed from the buffer, i.e., it denotes the starting address of the buffer.

size_t gdsl_get_ip_offset(state_t s): Query the offset of the current IP relative to base

Query the number of consumed bytes or, equivalently, the current IP relative to the base address of the buffer.

int_t gdsl_seek(state_t s, size_t i): Set the offset within the buffer

Set the current code position to the given offset. The offset is relative to the beginning of the buffer, that is, gdsl_seek(s, gdsl_get_ip_offset(s)) is a no-op for all valid states s.
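The interplay of the three buffer functions can be sketched as below. The struct state, the function bodies, and the return convention of gdsl_seek (0 on success, 1 when the offset lies outside the buffer) are stand-ins assumed here so that the snippet compiles without a generated decoder; only the signatures are taken from this page. The helpers skip_two_bytes and peek_consumed are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t int_t;

/* Stand-in run-time (assumption); with a real decoder, include its
   generated header instead of this part. */
struct state { char *start; char *cur; char *end; size_t base; };
typedef struct state *state_t;

static void gdsl_set_code(state_t s, char *buf, size_t buf_len, size_t base) {
  s->start = s->cur = buf;
  s->end = buf + buf_len;
  s->base = base;
}

static size_t gdsl_get_ip_offset(state_t s) {
  return (size_t)(s->cur - s->start);
}

static int_t gdsl_seek(state_t s, size_t i) {
  if (i > (size_t)(s->end - s->start)) return 1; /* assumed error value */
  s->cur = s->start + i;
  return 0;
}

/* Stand-in for an exported decoder function that consumes two bytes. */
static void skip_two_bytes(state_t s) { s->cur += 2; }

/* Hypothetical helper: run `probe`, report how many bytes it consumed,
   then rewind to the position the buffer had before the call. */
static size_t peek_consumed(state_t s, void (*probe)(state_t)) {
  size_t before = gdsl_get_ip_offset(s);
  probe(s);
  size_t consumed = gdsl_get_ip_offset(s) - before;
  int_t rc = gdsl_seek(s, before);
  assert(rc == 0);
  (void)rc;
  return consumed;
}
```

Saving the offset and seeking back to it is one way to decode the same bytes a second time, for example with a different exported function.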

jmp_buf* gdsl_err_tgt(state_t s): Set an exception handler

An exception handler must be installed by calling setjmp with the argument returned by this function. If an exception occurs, control will return from setjmp with value 1 if there are no more bytes in the input buffer or with value 2 if there has been an error (e.g. pattern match failure). In both cases, an error message can be retrieved using gdsl_get_error_message.

char* gdsl_get_error_message(state_t s): Query an error message

Retrieve the error message after an exception has been raised.
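The handler protocol described above can be sketched as follows. The struct state, the bodies of the two run-time functions, and failing_decode are stand-ins assumed here so that the snippet compiles on its own; only the signatures of gdsl_err_tgt and gdsl_get_error_message and the meaning of the values 1 and 2 come from this page. The wrapper run_guarded is hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Stand-in run-time (assumption); replace with the generated header. */
struct state { jmp_buf err_tgt; char *err_msg; };
typedef struct state *state_t;

static jmp_buf *gdsl_err_tgt(state_t s) { return &s->err_tgt; }
static char *gdsl_get_error_message(state_t s) { return s->err_msg; }

/* Stand-in for an exported function that fails with a generic error. */
static void failing_decode(state_t s) {
  static char msg[] = "pattern match failure";
  s->err_msg = msg;
  longjmp(s->err_tgt, 2);
}

/* Hypothetical wrapper: returns 0 on success, 1 if the input buffer ran
   out, and 2 on any other error; *msg receives the error message. */
static int run_guarded(state_t s, void (*body)(state_t), char **msg) {
  assert(s != NULL && body != NULL);
  /* setjmp is used as the whole controlling expression of the switch,
     which is one of the contexts the C standard permits. */
  switch (setjmp(*gdsl_err_tgt(s))) {
    case 0:
      body(s);
      return 0;
    case 1:
      if (msg) *msg = gdsl_get_error_message(s);
      return 1;
    default:
      if (msg) *msg = gdsl_get_error_message(s);
      return 2;
  }
}
```

The handler has to be installed before the first call into the decoder, and the function that called setjmp must still be active when the exception is raised.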

void gdsl_reset_heap(state_t s): Discard all allocated data

Reset the heap. Objects returned by exported functions are no longer valid after a call to this function. This function does not necessarily deallocate all of the heap.

string_t gdsl_merge_rope(state_t s, obj_t rope): turn a rope into a C string

Allocate a buffer on the heap and emit the given rope into it. A rope is a structure that allows the construction of efficient pretty printers. The rope data structure is defined in specifications/basis/prelude.ml. Returns a pointer to the buffer on the heap. Note that the C string is allocated on the GDSL heap and is therefore only valid as long as the heap is not reset.
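Because the returned string lives on the GDSL heap, a caller that needs it beyond the next heap reset has to copy it. The sketch below illustrates this under heavy assumptions: the stand-in state holds a fixed buffer as its "heap", a "rope" is simply a C string, and rope_to_owned_string is a hypothetical helper. Only the signatures of gdsl_merge_rope and gdsl_reset_heap are taken from this page.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void *obj_t;
typedef char *string_t;

/* Stand-in run-time (assumption): the heap is a fixed buffer and a rope
   is a plain C string. The real rope type is defined in GDSL. */
struct state { char heap[64]; };
typedef struct state *state_t;

static string_t gdsl_merge_rope(state_t s, obj_t rope) {
  strncpy(s->heap, (const char *)rope, sizeof s->heap - 1);
  s->heap[sizeof s->heap - 1] = '\0';
  return s->heap;
}

static void gdsl_reset_heap(state_t s) {
  memset(s->heap, 0, sizeof s->heap);
}

/* Hypothetical helper: copy the merged string into malloc'ed memory so
   that it survives gdsl_reset_heap. The caller frees the result. */
static char *rope_to_owned_string(state_t s, obj_t rope) {
  string_t on_heap = gdsl_merge_rope(s, rope);
  size_t n = strlen(on_heap) + 1;
  char *copy = malloc(n);
  assert(copy != NULL);
  memcpy(copy, on_heap, n);
  return copy;
}
```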

void gdsl_destroy(state_t s): Free the state of the decoder

Frees the heap and the decoder state.
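Putting the run-time functions together, a typical session might look like the sketch below. Everything between the stand-in markers is assumed so that the snippet compiles without a generated decoder: the layout of struct state, the function bodies, and example_decode, which plays the role of an exported decoder function. The call order (init, set code, install handler, decode, reset heap, destroy) follows the descriptions on this page; count_instructions is a hypothetical driver.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- Stand-in run-time (assumption); use the generated header ---- */
struct state { jmp_buf err_tgt; char *buf; size_t len, pos, base; };
typedef struct state *state_t;

static state_t gdsl_init(void) { return calloc(1, sizeof(struct state)); }
static void gdsl_set_code(state_t s, char *buf, size_t buf_len, size_t base) {
  s->buf = buf; s->len = buf_len; s->pos = 0; s->base = base;
}
static jmp_buf *gdsl_err_tgt(state_t s) { return &s->err_tgt; }
static void gdsl_reset_heap(state_t s) { (void)s; }
static void gdsl_destroy(state_t s) { free(s); }

/* Stand-in exported function: consumes one byte per "instruction" and
   raises the end-of-input exception (value 1) on an exhausted buffer. */
static void example_decode(state_t s) {
  if (s->pos >= s->len) longjmp(s->err_tgt, 1);
  s->pos++;
}
/* ---- End of stand-in ---- */

/* Hypothetical driver: decode until the input is exhausted and return
   the number of instructions, or -1 on any other error. */
static long count_instructions(char *buf, size_t len, size_t base) {
  state_t s = gdsl_init();
  assert(s != NULL);
  /* volatile: the counter changes between setjmp and longjmp and is
     read afterwards. */
  volatile long count = 0;
  long result;
  gdsl_set_code(s, buf, len, base);
  switch (setjmp(*gdsl_err_tgt(s))) {
    case 0:
      for (;;) {
        example_decode(s);
        count++;
        gdsl_reset_heap(s); /* decoded objects are no longer needed */
      }
    case 1:
      result = count;
      break;
    default:
      result = -1;
      break;
  }
  gdsl_destroy(s);
  return result;
}
```

Resetting the heap after each instruction is a design choice of this sketch; a caller that keeps decoded objects around would reset less often.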

Interfacing to exported GDSL functions

Any GDSL function that appears in an export clause will have a corresponding C function in the generated C code. The following transformations apply:

  1. every C function takes the run-time state of type state_t as its first argument
  2. monadic and non-monadic functions become indistinguishable in the emitted C code, that is, they all take the state_t as parameter which contains not only the run-time state but also the monadic state
  3. a vector parameter or return value whose size is always fixed will be turned into an int_t type containing the relevant bits of the vector
  4. an algebraic data type that has only constructors without arguments is turned into an int_t containing the tag of the constructor
  5. a function parameter is turned into a C function pointer unless some GDSL function requires that the parameter is a closure on the heap; in the latter case, it is advisable to change the GDSL program so that no function requiring values from its environment is passed as parameter
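To make rules 1, 3 and 4 of the list above concrete, the sketch below shows what the C side of an exported function might look like. The function gdsl_example_classify, its body, the constants CON_SHORT_FORM and CON_LONG_FORM, and the layout of struct state are all hypothetical; the actual names and types depend on the GDSL program and on the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t int_t;

/* Stand-in state (assumption); the real one is opaque to the caller. */
struct state { int unused; };
typedef struct state *state_t;

/* Made-up tags for a hypothetical GDSL type that has two constructors
   without arguments. */
enum { CON_SHORT_FORM = 0, CON_LONG_FORM = 1 };

/* Stand-in for what the compiler might emit for a hypothetical exported
   function that takes an 8-bit vector and returns that type: the state
   comes first (rule 1), the vector arrives as an int_t (rule 3), and the
   result is the constructor tag as an int_t (rule 4). */
static int_t gdsl_example_classify(state_t s, int_t opcode) {
  (void)s;
  return (opcode & 0x80) ? CON_LONG_FORM : CON_SHORT_FORM;
}

/* Caller side: pass only the relevant bits of the vector. */
static int is_long_form(state_t s, unsigned char byte) {
  assert(s != NULL);
  return gdsl_example_classify(s, (int_t)byte) == CON_LONG_FORM;
}
```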

The conversion from, say, fixed-size bit vectors to integers is an optimization in the back-end of the compiler that may or may not be possible, depending on the input GDSL program. To make the generated C code more robust against changes in the GDSL program, the export statement can be told the desired C type of the function. See the description of the GDSL language syntax.
