-
Notifications
You must be signed in to change notification settings - Fork 47
3. Device Modeling Language, version 1.4
This chapter describes the Device Modeling Language (DML), version 1.4. It will help to have read and understood the object model in the previous chapter before reading this chapter.
DML is not a general-purpose programming language, but a modeling language, targeted at writing Simics device models. The algorithmic part of the language is similar to ISO C; however, the main power of DML is derived from its simple object-oriented constructs for defining and accessing the static data structures that a device model requires, and the automatic creation of bindings to Simics.
Furthermore, DML provides syntax for bit-slicing, which much simplifies the
manipulation of bit fields in integers; new
and
delete
operators for allocating and deallocating
memory; a basic try
/throw
mechanism
for error handling; built-in log
and
assert
statements; and a powerful metaprogramming
facility using templates and in each
statements.
Most of the built-in Simics-specific logic is implemented directly in
DML, in standard library modules that are automatically imported; the
dmlc
compiler itself contains as little knowledge as possible
about the specifics of Simics.
A major difference from C is that names do not generally need to be defined before their first use. This is quite useful, but might sometimes appear confusing to C programmers.
-
Character encoding
-
DML source files are written using UTF-8 encoding. Non-ASCII characters are only allowed in comments and in string literals. Unicode BiDi control characters (U+2066 to U+2069 and U+202a to U+202e) are not allowed. String values are still handled as byte arrays, which means that a string value written with a literal of three characters may actually create an array of more than three bytes.
-
Reserved words
-
All ISO/ANSI C reserved words are reserved words in DML (even if currently unused). In addition, the C99 and C++ reserved words
restrict
,inline
,this
,new
,delete
,throw
,try
,catch
, andtemplate
are also reserved in DML. The C++ reserved wordsclass
,namespace
,private
,protected
,public
,using
, andvirtual
, are reserved in DML for future use; as are identifiers starting with an underscore (_
).The following words are reserved specially by DML:
after
,assert
,call
,cast
,defined
,each
,error
,foreach
,in
,is
,local
,log
,param
,saved
,select
,session
,shared
,sizeoftype
,typeof
,undefined
,vect
,where
,async
,await
,with
, andstringify
. -
Identifiers
-
Identifiers in DML are defined as in C; an identifier may begin with a letter or underscore, followed by any number of letters, numbers, or underscores. Identifiers that begin with an underscore (
_
) are reserved by the DML language and standard library and should not be used. -
Constant Literals
-
DML has literals for strings, characters, integers, booleans, and floating-point numbers. The integer literals can be written in decimal (
01234
), hexadecimal (0x12af
), or binary (0b110110
) form.Underscores (
_
) can be used between digits, or immediately following the0b
,0x
prefixes, in integer literals to separate groups of digits for improved readability. For example,123_456
,0b10_1110
,0x_eace_f9b6
are valid integer constants, whereas_78
,0xab_
are not.String literals are surrounded by double quotes (
"
). To include a double quote or a backslash (\
) in a string literal, precede them with a backslash (\"
and\\
, respectively). Newline, carriage return, tab and backspace characters are represented by\n
,\r
,\t
and\b
. Arbitrary byte values can be encoded as\x
followed by exactly two hexadecimal digits, such as\x1f
. Such escaped byte values are restricted to 00-7f for strings containing Unicode characters above U+007F.Character literals consist of a pair of single quotes (
'
) surrounding either a single printable ASCII character or one of the escape sequences\'
,\\
,\n
,\r
,\t
or\b
. The value of a character literal is the character's ASCII value. -
Comments
-
C-style comments are used in DML. This includes both in-line comments (
/*
...*/
) and comments that continue to the end of the line (//
...).
DML employs a very simple module system, where a module is any source file
that can be imported using the import
directive.
Such files may not contain a [device
declaration], but otherwise look like
normal DML source files. The imported files are merged into the main model
as if all the code was contained in a single file (with some exceptions). This
is similar to C preprocessor #include
directives, but in DML each imported
file must be possible to parse in isolation, and may contain declarations (such
as bitorder
) that are only effective for that file.
Also, DML imports are automatically idempotent, in the sense that importing the
same file twice does not yield any duplicate definitions.
The import hierarchy has semantic significance in DML: If a module defines some method or parameter declarations that can be overridden, then only files that explicitly import the module are allowed to override these declarations. It is however sufficient to import the module indirectly via some other module. For instance, if A.dml contains a default declaration of a method, and B.dml wants to override it, then B.dml must either import A.dml, or some file C.dml that in turn imports A.dml. Without that import, it is an error to import both A.dml and B.dml in the same device.
A DML source file describes both the structure of the modeled device and the actions to be taken when the device is accessed.
A DML source file defining a device starts with a language version declaration
and a device declaration. After that, any number of parameter declarations,
methods, data fields, object declarations, or global declarations can be
written. A DML file intended to be imported (by an import
statement in another DML file) has the same layout except
for the device declaration.
Every DML source file should contain a version declaration, on the form
dml 1.4;
. The version
declaration allows the dmlc
compiler to select the proper
versions of the DML parser and standard libraries to be used for the
file. A file can not
import a file with a different language version than its own.
The version declaration must be the first declaration in the file, possibly preceded by comments. For example:
// My Device
dml 1.4;
...
Every DML source file that contains a device declaration is a DML model, and defines a Simics device class with the specified name. Such a file may import other files, but only the initial file may contain a device declaration.
The device declaration must be the first proper declaration in the file, only preceded by comments and the language version declaration. For example:
/*
* My New Device
*/
dml 1.4;
device my_device;
...
DML is structured around an object model, where each DML model describes a single device object, which can contain a number of member objects. Each member object can in its turn have a number of members of its own, and so on in a nested fashion.
Every object (including the device itself) can also have methods, which implement the functionality of the object, and parameters, which are members that describe static properties of the object.
Each object is of a certain object type, e.g., bank
or
register
. There is no way of adding user-defined object types in
DML; however, each object is in general locally modified by adding (or
overriding) members, methods and parameters - this can be viewed as
creating a local one-shot subtype for each object.
A DML model can only be instantiated as a whole: Individual objects can not be instantiated standalone; instead, the whole hierarchy of objects is instantiated atomically together with the model. This way, it is safe for sibling objects in the hierarchy to assume each other's existence, and any method can freely access state from any part of the object hierarchy.
Another unit of instantiation in DML is the template. A template contains a reusable block of code that can be instantiated in an object, which can be understood as expanding the template's code into the object.
Many parts of the DML object model are automatically mapped onto the Simics configuration object model; most importantly, the device object maps to a Simics configuration class, such that configuration objects of that class correspond to instances of the DML model, and the attribute and interface objects of the DML model map to Simics attributes and interfaces associated with the created Simics configuration class (See Simics Model Builder User's Guide for details.)
A device is made up of a number of member objects and methods, where any object may contain further objects and methods of its own. Many object types only make sense in a particular context and are not allowed elsewhere:
-
There is exactly one
device
object. It resides on the top level. -
Objects of type
bank
,port
orsubdevice
may only appear as part of adevice
orsubdevice
object. -
Objects of type
implement
may only appear as part of adevice
,port
,bank
, orsubdevice
object. -
Objects of type
register
may only appear as part of abank
. -
Objects of type
field
may only appear as part of aregister
. -
Objects of type
connect
may only appear as part of adevice
,subdevice
,bank
, orport
object. -
Objects of type
interface
may only appear directly below aconnect
object. -
Objects of type
attribute
may only appear as part of adevice
,bank
,port
,subdevice
, orimplement
object. -
Objects of type
event
may appear anywhere except as part of afield
,interface
,implement
, or anotherevent
. -
Objects of type
group
are neutral: Any object may contain agroup
object, and agroup
object may contain any object that its parent object may contain, with the exception that agroup
cannot contain an object of typeinterface
orimplement
.
Parameters (shortened as "param
") are a kind of object members that describe
expressions. During compilation, any parameter reference will be expanded to
the definition of that parameter. In this sense, parameters are similar to
macros, and indeed have some usage patterns in common - in particular,
parameters are typically used to represent constant expressions.
Like macros, no type declarations are necessary for parameters, and every usage of a parameter will re-evaluate the expression. Unlike macros, any parameter definition must be a syntactically valid expression, and every unfolded parameter expression is always evaluated using the scope in which the parameter was defined, rather than the scope in which the parameter is referenced.
Parameters cannot be dynamically updated at run-time; however, a parameter can be declared to allow it being overridden by later definitions - see Parameter Declarations.
Within DML's built-in modules and standard library, parameters are used to
describe static properties of objects, such as names, sizes, and offsets. Many
of these are overridable, allowing some properties to be configured by users.
For example, every bank object has a byte_order
parameter that controls the
byte order of registers within that bank. By default, this parameter is defined
to be "little-endian"
- but by overriding it, users may specify the byte
order on a bank-by-bank basis.
Methods are object members providing implementation of the functionality of the
object. Although similar to C functions, DML methods can have any number of
input parameters and return values. DML methods also support a basic exception
handling mechanism using throw
and try
.
In-detail description of method declarations are covered in a separate section.
The device defined by a DML model corresponds directly to a Simics configuration object, i.e., something that can be included in a Simics configuration.
In DML's object hierarchy, the device object is represented by the top-level scope.
The DML file passed to the DML compiler must start with a device
declaration following the language version specification:
dml 1.4; device name;
A device
declaration may not appear anywhere else, neither in the
main file or in imported files. Thus, the device declaration is
limited to two purposes:
- to give a name to the configuration class registered with Simics
- to declare which DML file is the top-level file in a DML model
A register bank (or simply bank) is an abstraction that is used to
group registers in DML, and to expose these to the outside
world. Registers are exposed to the rest of the simulated system
through the Simics interface io_memory
, and exposed to scripting and
user interfaces through the register_view
, register_view_read_only
,
register_view_catalog
and bank_instrumentation_subscribe
Simics interfaces.
It is possible to define bank arrays to model a row of similar banks. Each element in the bank array is a separate configuration object in Simics, and can thus be individually mapped in a memory space.
Simics configuration objects for bank instances are named like the bank but
with a .bank
prefix. For instance, if a device model has a declaration bank regs[i < 2]
on top level, and a device instance is named dev
in Simics, then
the two banks are represented in Simics by configuration objects named
dev.bank.regs[0]
and dev.bank.regs[1]
.
A register is an object that contains an integer value. Normally, a register corresponds to a segment of consecutive locations in the address space of the bank; however, it is also possible (and often useful) to have registers that are not mapped to any address within the bank. All registers must be part of a register bank.
Every register has a fixed size, which is
an integral, nonzero number of 8-bit bytes. A single register cannot
be wider than 8 bytes. The size of the register is given by the
size
parameter,
which can be specified either by a normal parameter assignment, as in
register r1 {
param size = 4;
...
}
or, more commonly, using the following short-hand syntax:
register r1 size 4 {
...
}
which has the same meaning. The default size is provided by the
register_size
parameter of the containing register bank, if that is defined.
There are multiple ways to manipulate the value of a register: the simplest
approach is to make use of the val
member of registers, as in:
log info: "the value of register r1 is %d", r1.val;
or
++r1.val;
For more information, see Section Register Objects.
For a register to be mapped into the internal address space of the
containing bank, its starting address within the bank must be given by
setting the
offset
parameter. The address range occupied by the register is then from
offset
to offset
+ size
- 1. The offset
can be specified by a normal parameter assignment, as in
register r1 {
param offset = 0x0100;
...
}
or using the following short-hand syntax:
register r1 @ 0x0100 {
...
}
similar to the size
parameter above. Usually, a normal
read/write register does not need any additional specifications apart
from the size and offset, and can simply be written like this:
register r1 size 4 @ 0x0100;
or, if the bank contains several registers of the same size:
bank b1 {
param register_size = 4;
register r1 @ 0x0100;
register r2 @ 0x0104;
...
}
The translation from the bank address space to the actual value of the
register is controlled by the byte_order
parameter. When it is set to
"little-endian"
(the default), the lowest address, i.e., that
defined by offset
, corresponds to the least significant byte in
the register, and when set to "big-endian"
, the lowest address
corresponds to the most significant byte in the register.
An important thing to note is that registers do not have to be mapped at all. This may be useful for internal registers that are not directly accessible from software. By using an unmapped register, you can get the advantages of using register, such as automatic checkpointing and register fields. This internal register can then be used from the implementations of other registers, or other parts of the model.
Historically, unmapped registers were commonly used to store simple device state, but this usage is no longer recommended — Saved Variables should be preferred if possible. Unmapped registers should only be used if saved variables do not fit a particular use case.
To make a register unmapped, set the offset to unmapped_offset
or use the standard template unmapped
:
register r is (unmapped);
For every register, an attribute of integer type is automatically added
to the Simics configuration class generated from the device model. The
name of the attribute corresponds to the hierarchy in the DML model;
e.g., a register named r1
in a bank named bank0
will
get a corresponding attribute named bank0_r1
.
The register value is automatically saved when Simics creates a checkpoint,
unless the configuration
parameter indicates otherwise.
The value of a register is stored in a member named val
. E.g., the r1
register will store its value in r1.val
. This is normally the value that is
saved in checkpoints; however, checkpointing is defined by the get
and set
methods, so if they are overridden, then some other value can be saved instead.
Real hardware registers often have a number of fields with separate meaning. For example, the lowest three bits of the register could be a status code, the next six bits could be a set of flags, and the rest of the bits could be reserved.
To make this easy to express, a register
object can
contain a number of field
objects. Each field is defined
to correspond to a bit range of the containing register.
The value of a field is stored in the corresponding bits of the containing
register's storage. The easiest way to access the value of a register or field
is to use the get
and set
methods.
The read and write behaviour of registers and fields is in most cases controlled by instantiating templates. There are three categories of templates:
-
Registers and fields where a read or write just updates the value with no side-effects, should use the
read
andwrite
templates, respectively. -
Custom behaviour can be supplied by instantiating the
read
orwrite
template. The template leaves a simple methodread
(orwrite
) abstract; custom behaviour is provided by overriding the method. There is also a pair of templatesread_bits
andwrite_bits
, which similarly provide abstract methodsread_bits
andwrite_bits
. These functions have some extra parameters, making them less convenient to use, but they also offer some extra information about the access. -
There are many pre-defined templates with for common specialized behaviour. The most common ones are
unimpl
, for registers or fields whose behaviour has not yet been implemented, andread_only
for registers or fields that cannot be written.
A register or field can often instantiate two templates, one for reads
and one for writes; e.g., read
to supply a read method
manually, and read_only
to supply a standard write method. If
a register with fields instantiates a read or write template, then the
register will use that behaviour instead of descending into
fields. For instance, if a register instantiates
the read_only
template, then all writes will be captured, and
only reads will descend into its fields.
The register described above could be modeled as follows, using the default little-endian bit numbering.
bank b2 {
register r0 size 2 @ 0x0000 {
field status @ [2:0];
field flags @ [8:3];
field reserved @ [15:9];
}
...
}
Note that the most significant bit number is always the first number (to
the left of the colon) in the range, regardless of whether little-endian
or big-endian bit numbering is used. (The bit numbering convention used
in a source file can be selected by a bitorder
declaration.)
The value of the field can be accessed by using the get
and set
methods, e.g.:
log info: "the value of the status field is %d", r0.status.get();
An attribute
object in DML represents a
Simics configuration object attribute of the device. As mentioned above, Simics
attributes are created automatically for register
and
connect
objects to allow external inspection and modification;
explicit attribute
objects can be used to expose additional data. There are
mainly three use cases for explicit attributes:
-
Exposing a parameter for the end-user to configure or tweak. Such attributes can often be required in order to instantiate a device, and they usually come with documentation.
-
Exposing internal device state, required for checkpointing to work correctly. Most device state is usually saved in registers or saved variables, but attributes may sometimes be needed to save non-trivial state such as FIFOs.
-
Attributes can also be created as synthetic back-doors for additional control or inspection of the device. Such attributes are called pseudo attributes, and are not saved in checkpoints.
An attribute is basically a name with an associated pair of get
and set
functions. The type of the value read and written through the get/set functions
is controlled by the type
parameter. More information about configuration
object attributes can be found in Simics Model Builder User's Guide.
The init
template and associated method is often
useful together with attribute
objects to initialize any associated state.
Four standard templates are provided for attributes: bool_attr
, int64_attr
,
uint64_attr
and double_attr
. They provide overridable get
and set
methods, and store the attribute's value in a session variable named val
,
using the corresponding type. For example, if int64_attr
is used in the
attribute a
, then one can access it as follows:
log info: "the value of attribute a is %d", dev.a.val;
These templates also provide an overridable implementation of
init()
that initializes the val
session variable.
The value that val
is initialized to is controlled by the init_val
parameter, whose default definition simply causes val
to be zero-initialized.
Defining init_val
is typically the most convenient way of initializing any
attribute instantiating any one of the these templates — however,
overriding the default init()
implementation with a custom one may still be
desirable in certain cases. In particular, the definition of init_val
must be
constant, so a custom init()
implementation is necessary if val
should be
initialized to a non-constant value.
Note that using an attribute object purely to store and checkpoint simple internal device state is not recommended; prefer Saved Variables for such use cases.
A connect
object
is a container for a reference to an
arbitrary Simics configuration object. An attribute with the same name
as the connect is added to the Simics configuration class generated
from the device model. This attribute can be assigned a value of type
"Simics object".
A connect
declaration is very similar to a simple
attribute
declaration, but specialized to handle
connections to other objects.
Typically, the connected object is expected to implement one or more
particular Simics interfaces, such as signal
or ethernet_common
(see
Simics Model Builder User's Guide for details). This is described
using interface
declarations inside the
connect
.
Initialization of the connect (i.e., setting the object reference) is
done from outside the device, usually in a Simics configuration
file. Just like other attributes, the parameter
configuration
controls whether the value must
be initialized when the object is created, and whether it is
automatically saved when a checkpoint is created.
The actual object pointer, which is of type
conf_object_t*
is stored in a session
member called obj
. This means that to access the current
object pointer in a connect called otherdev, you need to
write otherdev.obj
.
If the configuration
parameter is not required
,
the object pointer may have a null value, so any code that tries to
use it must check if it is set first.
This is an example of how a connect can be declared and used:
connect plugin {
param configuration = "optional";
}
method mymethod() {
if (plugin.obj)
log info: "The plugin is connected";
else
log info: "The plugin is not connected";
}
In order to use the Simics interfaces that a connected object
implements, they must be declared within the connect
.
This is done through interface
objects.
These name the expected interfaces and may also specify additional properties.
An important property of an interface object is whether or not a
connected object is required to implement the interface. This
can be controlled through the interface parameter required
,
which is true
by default. Attempting to connect an object
that does not implement the required interfaces will cause a runtime
error. The presence of optional interfaces can be verified by testing
if the val
member of the interface object has a null
value.
By default, the C type of the Simics interface corresponding to a
particular interface object is assumed to be the name of the object
itself with the string "_interface_t"
appended. (The C type is
typically a typedef
:ed name for a struct containing function
pointers).
The following is an example of a connect with two interfaces, one of which is not required:
connect plugin {
interface serial_device;
interface rs232_device { param required = false; }
}
Calling interface functions is done in the same way as any C function is called, but the first argument which is the target object pointer is omitted.
The serial_device
used above has a function with the
following definition:
int (*write)(conf_object_t *obj, int value);
This interface function is called like this in DML:
method call_write(int value) {
local int n = plugin.serial_device.write(value);
// code to check the return value omitted
}
When a device needs to export a Simics interface, this is specified by an
implement
object, containing the methods that implement
the interface. The name of the object is also used as the name of the
Simics interface registered for the generated device, and the names and
signatures of the methods must correspond to the C functions of the
Simics interface. (A device object pointer is automatically added as the
first parameter of the generated C functions.)
In most cases, a device exposes interfaces by adding implement
object as
subobjects of named port
objects. A port object often represents a
hardware connection
The C type of the Simics interface is assumed to be the
value of the object's name
parameter (which defaults to the name of
the object itself), with the string "_interface_t"
appended. The C
type is typically a typedef
:ed name for a struct containing function
pointers.
For example, to implement the ethernet_common
Simics interface, we can write:
implement ethernet_common {
method frame(const frags_t *frame, eth_frame_crc_status_t crc_status) {
...
}
}
This requires that ethernet_common_interface_t
is defined as a struct type
with a field frame
with the function pointer type
void (*)(conf_object_t *, const frags_t *, eth_frame_crc_status_t)
.
Definitions of all standard Simics interface types are available as DML files named like the corresponding C header files;
for instance, the ethernet_common
interface can be imported as follows:
import "simics/devs/ethernet.dml"
An event object is an encapsulation of a Simics event that can be posted on a processor time or step queue. The location of event objects in the object hierarchy of the device is not important, so an event object can generally be placed wherever it is most convenient.
An event has a built-in post
method, which inserts the
event in the default queue associated with the device. An event also
defines an abstract method event
, which the user must
implement. That method is called when the event is triggered.
An event must instantiate one of six predefined
templates: simple_time_event
, simple_cycle_event
,
uint64_time_event
, uint64_cycle_event
,
custom_time_event
or custom_cycle_event
. The choice
of template affects the signature of the post
and event
methods: In a time event, the delay is specified
as a floating-point value, denoting number of seconds, while in a
cycle event, the delay is specified in CPU cycles.
A posted event may have data associated with it. This data is given to
the post
method and is provided to the event
callback method. They type of data depends on the template used: No
data is provided in simple events, and in uint64 events it is provided
as a uint64 parameter. In custom events, data is provided as
a void *
parameter, and extra
methods get_event_info
set_event_info
and destroy
must be provided in order to provide proper
checkpointing of the event.
Objects of type attribute
, connect
, event
, field
, register
, bank
,
port
and subdevice
can be organized into groups. A group is a neutral
object, which can be used just for namespacing, or to help structuring an array
of a collection of objects. Groups may appear anywhere, but are most commonly
used to group registers: If a bank has a sequence of blocks, each containing
the same registers, it can be written as a group array. In the following
example eight homogeneous groups of registers are created, resulting in
8×6 instances of register r3
.
bank regs {
param register_size = 4;
group blocks[i < 8] {
register r1 @ i * 32 + 0;
register r2 @ i * 32 + 4;
register r3[j < 6] @ i * 32 + 8 + j * 4;
}
}
Another typical use of group
is in combination with a
template for the group that contains common registers and
more that are shared between several groups, as in the following
example.
template weird {
param group_offset;
register a size 4 @ group_offset is (read, write);
register b size 4 @ group_offset + 4 is (read, write) {
method read() -> (uint64) {
// When register b is read, return a
return a.val;
}
}
}
bank regs {
group block_a is (weird) { param group_offset = 128; }
group block_b is (weird) { param group_offset = 1024; }
}
In addition, groups can be nested.
bank regs {
param register_size = 4;
group blocks[i < 8] {
register r1 @ i * 52 + 0;
group sub_blocks[j < 4] {
register r2 @ i * 52 + j * 12 + 4;
register r3[k < 3] @ i * 52 + j * 12 + k * 4 + 8;
}
}
}
Banks, ports and subdevices can be placed inside groups; in this case, the
Simics configuration object that represents the bank, port or subdevice will be
placed under a namespace object; for instance, if a device with group g { bank regs; }
is instantiated as dev
, then the bank is represented by an object
dev.g.bank.regs
, where g
and bank
are both namespace
objects.
As groups have no special properties or restrictions, they can be used as a tool for building abstractions — in particular in combination with templates.
For example, a template can be used to create an abstraction for finite state machine objects, by letting users create FSMs by declaring group objects instantiating that template. FSM states can also be represented through a template instantiated by groups.
The following demonstrates a simple example of how such an abstraction may be implemented:
// Template for finite state machines
template fsm is init {
saved fsm_state curr_state;
// The initial FSM state.
// Must be defined by any object instantiating this template.
param init_fsm_state : fsm_state;
shared method init() default {
curr_state = init_fsm_state;
}
// Execute the action associated to the current state
shared method action() {
curr_state.action();
}
}
// Template for states of an FSM. Such states must be sub-objects
// of an FSM.
template fsm_state {
param parent_fsm : fsm;
param parent_fsm = cast(parent, fsm);
// The action associated to this state
shared method action();
// Transitions the parent FSM to this state
shared method set() {
parent_fsm.curr_state = cast(this, fsm_state);
}
}
These templates can then be used as follows:
group main_fsm is fsm {
param init_fsm_state = cast(init_state, fsm_state);
group init_state is fsm_state {
method action() {
log info: "init_state -> second_state";
// Transition to second_state
second_state.set();
}
}
group second_state is fsm_state {
method action() {
log info: "second_state -> final_state";
// Transition to final_state
final_state.set();
}
}
group final_state is fsm_state {
method action() {
log info: "in final_state";
}
}
}
method trigger_fsm() {
// Execute the action of main_fsm's current state.
main_fsm.action();
}
An interface port is a structural element that groups implementations of one or more interfaces. When one configuration object connects to another, it can connect to one of the target object's ports, using the interfaces in the port. This way, the device model can expose different interfaces to different objects.
Sometimes a port is as simple as a single pin, implementing
the signal
interface, and sometimes it can be more complex,
implementing high-level communication interfaces.
It is also possible to define port arrays that are indexed with an integer parameter, to model a row of similar connectors.
In Simics, a port is represented by a separate configuration object, named like
the port but with a .port
prefix. For instance, if a device model has a
declaration port p[i<2]
on top level, and a device instance is named dev
in
Simics, then the two ports are represented in Simics by objects named
dev.port.p[0]
and dev.port.p[1]
.
A subdevice is a structural element that represents a distinct subsystem of the
device. Like a group
, a subdevice can be used to group a set of related
banks, ports and attributes, but a subdevice is presented to the end-user as a
separate configuration object. If a subdevice contains attribute
or connect
objects, or saved
declarations, then the corresponding configuration
attributes appears as members of the subdevice object rather than the device.
template name { ... }
Defines a template, a piece of code that can be reused in multiple locations. The body of the template contains a number of declarations that will be added to any object that uses the template.
Templates are imported into an object declaration body using
is
statements, written as
is name;
for example:
field F {
is A;
}
It is also possible to use templates when declaring an object, as in
field F is (name1, name2);
These can be used in any context where an object declaration may be written, and
has the effect of expanding the body of the template at the point of the is
.
Using is
together with object declarations is typically more idiomatic than
the standalone is
object statement; however, the latter is useful in order
to instantiate templates in the top-level device object, and also for use in
conjunction with in each
declarations; for example:
register r {
in each field {
is A;
}
field F1 @ [7:6];
...
}
If two templates define methods or parameters with the same name, then the
template instantiation hierarchy is used to deduce which method overrides the
other: If one template B instantiates another template A, directly or
indirectly, then methods from B override methods from A. Note, however, that
overrides can only happen on methods and parameters that are declared default
.
Example:
template A {
method hello() default {
log info: "hello";
}
}
template B is A {
// this method overrides the
// method from A
method hello() default {
default();
log info "world";
}
}
See Resolution of Overrides for a formal specification of override rules.
Each template defines a type, which is similar to a class in an object oriented language like Java. The type allows you to store references to a DML object in a variable. Some, but not all, top-level declarations inside a template appear as members of the template type. A template type has the following members:
-
All session and saved variables declared within the template. E.g., the declaration
session int val;
gives a type memberval
. -
All declarations of typed parameters, further discussed below. E.g., the declaration
param foo : uint64;
gives a type memberfoo
. -
All method declarations declared with the
shared
keyword, further discussed below. E.g., the declarationshared method fun() { ... }
gives a type memberfun
, which can be called. -
Every
shared
hook declared within the template. E.g. the declarationshared hook(int, bool) h;
gives a type memberh
. -
All type members of inherited templates. E.g., the declaration
is simple_time_event;
adds two type memberspost
andnext
, sincepost
andnext
are members of thesimple_time_event
template type. -
The
templates
member, which permits template-qualified method implementation calls to theshared
method implementations of the template type's ancestor templates.
Template members are dereferenced using the .
operator,
much like struct members.
A template's type is named like the template, and an object
reference can be converted to a value using the cast
operator. For instance, a reference to the
register regs.r0
can be created and used as follows
(all register objects automatically implement the template register
):
local register x = cast(regs.r0, register);
x.val = 14; // sets regs.r0.val
Two values of the same template type can be compared for equality, and are considered equal when they both reference the same object.
A value of a template type can be upcast to an ancestor template type; for example:
local uint64_attr x = cast(attr, uint64_attr);
local attribute y = cast(x, attribute);
In addition, a value of any template type can be cast to the template type
object
, even if object
is not an ancestor of the template.
If a method is declared in a template, then one copy of the method will appear in each object where the template is instantiated; therefore, the method can access all methods and parameters of that object. This is often convenient, but comes with a cost; in particular, if a template is instantiated in many objects, then this gives unnecessarily large code size and slow compilation. To address this problem, a method can be declared shared to operate on the template type rather than the implementing object. The implementation of a shared method is compiled once and shared between all instances of the template, rather than duplicated between instances.
Declaring a method as shared imposes restrictions on its
implementation, in particular which symbols it is permitted to access:
Apart from symbols in the global scope, a shared method may only
access members of the template's type; it is an error to access any
other symbols defined by the template. Members can be referenced
directly by name, or as fields of the automatic this
variable. When accessed in the scope of the shared method's body,
the this
variable evaluates to a value whose type is the
template's type.
Example:
template base {
// abstract method: must be instantiated in sub-template or object
shared method m(int i) -> (int);
shared method n() -> (int) default { return 5; }
}
template sub is base {
// override
shared method m(int i) -> (int) default {
return i + this.n();
}
}
If code duplication is not a concern, it is possible to define a shared method whose implementation is not subject to above restrictions while still retaining the benefit of having the method be a member of the template type. This is done by defining the implementation separately from the declaration of the shared method, for example:
template get_qname {
shared method get_qname() -> (const char *);
method get_qname() -> (const char *) {
// qname is an untyped parameter, and would thus not be accessible
// within a shared implementation of get_qname()
return this.qname;
}
}
A typed parameter declaration is a parameter declaration form that may only appear within templates:
param name : type;
A typed parameter declaration adds a member to the template type with the same name as the specified parameter, and with the specified type. That member is associated with the specified parameter, in the sense that the definition of the parameter is used as the value of the template type member.
A typed parameter declaration places a number of requirements on the named parameter:
-
The named parameter must be defined (through a regular parameter declaration). This can be done either within the template itself, within sub-templates, or within individual objects instantiating the template.
-
The parameter definition must be a valid expression of the specified type.
-
The parameter definition must be free of side-effects, and must not rely on the specific device instance of the DML model — in particular, the definition must be independent of device state.
This essentially means that the definition must be a constant expression, except that it may also make use of device-independent expressions whose values are known to be constant. For example, index parameters,
each
-in
expressions, and object references cast to template types are allowed. It is also allowed to reference other parameters that obey this rule.Examples of expressions that may not be used include method calls and references to
session
/saved
variables. -
The parameter definition must not contain calls to independent methods.
Typed parameters are most often used to allow a shared method defined within the template to access parameters of the template. For example:
template max_val_reg is write {
param max_val : uint64;
shared method write(uint64 v) {
if (v > max_val) {
log info: "Ignoring write to register exceeding max value %u",
max_val;
} else {
default(v);
}
}
}
bank regs {
register reg[i < 2] size 8 @0x0 + i*8 is max_val_reg {
param max_val = 128 * (i + 1) - 1;
}
}
A parameter declaration has the general form "param name
specification;
", where specification
is
either "= expr
" or "default expr
".
For example:
param offset = 8;
param byte_order default "little-endian";
A default value is overridden by an assignment (=
). There can
be at most one assignment for each parameter.
Typically, a default value for a parameter is specified in a
template, and the programmer may then choose to override it where the
template is used.
The specification
part is in fact optional; if omitted, it means that the
parameter is declared to exist (and must be given a value, or the model will
not compile). This is sometimes useful in templates, as in:
template constant is register {
param value;
method get() -> (uint64) {
return value;
}
}
so that wherever the template constant
is used, the programmer
is also forced to define the parameter value
. E.g.:
register r0 size 2 @ 0x0000 is (constant) {
param value = 0xffff;
}
Note that simply leaving out the parameter declaration from the template definition can have unwanted effects if the programmer forgets to specify its value where the template is used. At best, it will only cause a more obscure error message, such as "unknown identifier"; at worst, the scoping rules will select an unrelated definition of the same parameter name.
Also note that a parameter declaration without definition is redundant if a typed parameter declaration for that parameter already exists, as that already enforces that the parameter must be defined.
Note: You may see the following special form in some standard library files:
param name auto;
for example,
param parent auto;
This is used to explicitly declare the built-in automatic parameters, and should never be used outside the libraries.
The type system in DML builds on the type system in C, with a few
modifications. There are eight kinds of data types. New names for
types can also be assigned using a typedef
declaration.
-
Integers
-
Integer types guarantee a certain minimum bit width and may be signed or unsigned. The basic integer types are named
uint1
,uint2
, ...,uint64
for the unsigned types, andint1
,int2
, ...,int64
for the signed types. Note that the size of the integer type is only a hint and the type is guaranteed to be able to hold at least that many bits. Assigning a value that would not fit into the type is undefined, thus it is an error to assume that values will be truncated. For bit-exact types, refer tobitfields
andlayout
.The familiar integer types
char
andint
are available as aliases forint8
andint32
, respectively. The C keywordsshort
,signed
andunsigned
are reserved words in DML and not allowed in type declarations.The types
size_t
anduintptr_t
,long
,uint64_t
,int64_t
, are defined as in C. The typeslong
,uint64_t
andint64_t
are provided mainly for compatibility with third party libraries; they are needed because they are incompatible with the corresponding Simics types (uint64
, etc) on some platforms. -
Endian integers
-
Endian integer types hold similar values as integer types, but in addition have the following attributes:
-
They are guaranteed to be stored in the exact number of bytes required for their bitsize, without padding.
-
They have a defined byte order.
-
They have a natural alignment of 1 byte.
Endian integer types are named after the integer type with which they share a bitsize and sign but in addition have a
_be_t
or_le_t
suffix, for big-endian and little-endian integers, respectively. The full list of endian types is:int8_be_t int8_le_t uint8_be_t uint8_le_t int16_be_t int16_le_t uint16_be_t uint16_le_t int24_be_t int24_le_t uint24_be_t uint24_le_t int32_be_t int32_le_t uint32_be_t uint32_le_t int40_be_t int40_le_t uint40_be_t uint40_le_t int48_be_t int48_le_t uint48_be_t uint48_le_t int56_be_t int56_le_t uint56_be_t uint56_le_t int64_be_t int64_le_t uint64_be_t uint64_le_t
These types can be transparently used interchangeably with regular integer types, values of one type will be coerced to the other as needed. Note that operations on integers will always produce regular integer types, even if all operands are of endian integer type.
-
-
Floating-point numbers
-
There is only one floating-point type, called
double
. It corresponds to the C typedouble
. -
Booleans
-
The boolean type
bool
has two values,true
andfalse
. -
Arrays
-
An array is a sequence of elements of another type, and works as in C.
-
Pointers
-
Pointers to types, work as in C. String literals have the type
const char *
. A pointer has undefined meaning if the pointer target type is an integer whose bit-width is neither 8, 16, 32, nor 64. -
Structures
-
A
struct
type defines a composite type that contains named members of different types. DML makes no assumptions about the data layout in struct types, but see the layout types below for that. Note that there is no struct label as in C, and struct member declarations are permitted to refer to types that are defined further down in the file. Thus, new struct types can always be declared using the following syntax:typedef struct { member declarations } name;
-
Layouts
-
A layout is similar to a struct in many ways. The important difference is that there is a well-defined mapping between a layout object and the underlying memory representation, and layouts may specify that in great detail.
A basic layout type looks like this:
layout "big-endian" { uint24 x; uint16 y; uint32 z; }
By casting a pointer to a piece of host memory to a pointer of this layout type, you can access the fourth and fifth byte as a 16-bit unsigned integer with big-endian byte order by simply writing
p->y
.The allowed types of layout members in a layout type declaration are integers, endian integers, other layout types, bitfields (see below), and arrays of these.
The byte order declaration is mandatory, and is either
"big-endian"
or"little-endian"
.Access of layout members do not always provide a value of the type used for the member in the declaration. Bitfields and integer members (and arrays of similar) are translated to endian integers (or arrays of such) of similar size, with endianness matching the layout. Layout and endian integer members are accessed normally.
-
Bitfields
-
A bitfield type works similar to an integer type where you use bit slicing to access individual bits, but where the bit ranges are assigned names. A
bitfields
declaration looks like this:bitfields 32 { uint3 a @ [31:29]; uint16 b @ [23:8]; uint7 c @ [7:1]; uint1 d @ [0]; }
The bit numbering is determined by the
bitorder
declaration in the current file.Accessing bit fields is done as with a struct or layout, but the whole bitfield can also be used as an unsigned integer. See the following example:
local bitfields 32 { uint8 x @ [7:0] } bf; bf = 0x000000ff; bf.x = bf.x - 1; local uint32 v = bf;
Serializable types are types that the DML compiler knows how to serialize and
deserialize for the purposes of checkpointing. This is important for the use of
saved
variables and the after
statement.
All primitive non-pointer data types (integers, floating-point types, booleans, etc.) are considered serializable, as is any struct, layout, or array type consisting entirely of serializable types. Template types and hook reference types are also considered serializable.
Any type not fitting the above criteria is not considered serializable:
in particular, any pointer type is not considered serializable, nor is any
extern
struct type; the latter is because it's
impossible for the compiler to ensure it's aware of all members of the struct
type.
Methods are similar to C functions, but also have an implicit
(invisible) parameter which allows them to refer to the current device
instance, i.e., the Simics configuration object representing the device.
Methods also support exception handling in DML, using try
and
throw
. The body of the method is a compound statement in an
extended subset of C.
It is an error to have more than one method declaration using the same
name within the same scope.
A DML method can have any number of return values, in contrast to C
functions which have at most one return value. DML methods do not use
the keyword void
— an empty pair of parentheses always
means "zero parameters". Furthermore, lack of return value can even be
omitted. Apart from this, the parameter declarations of a method are
ordinary C-style declarations.
For example,
method m1() -> () {...}
and
method m1() {...}
are equivalent, and define a method that takes no input parameters and returns nothing.
method m2(int a) -> () {...}
defines a method that takes a single input parameter, and also returns nothing.
method m3(int a, int b) -> (int) {
return a + b;
}
defines a method with two input parameters and a single return value. A method that has a return value must end with a return statement.
method m4() -> (int, int) {
...;
return (x, y);
}
has no input parameters, but two return values.
A method that can throw an exception must declare so, using
the throws
keyword:
method m5(int x) -> (int) throws {
if (x < 0)
throw;
return x * x;
}
A parameter or method can now be overridden more than once.
When there are multiple declarations of a parameter, then the template and import hierarchy are used to deduce which declaration to use: A declaration that appears in a block that instantiates a template will override any declaration in that template, and a declaration that appears in a file that imports another file will override any declaration from that file. The declarations of one parameter must appear so that one declaration overrides all other declarations of he parameter; otherwise the declaration is considered ambiguous and an error is signalled.
Examples: A file common.dml
might contain:
param num_banks default 2;
bank banks[num_banks] {
...
}
Your device my-dev.dml
can then contain:
device my_dev;
import "common.dml";
// overrides the declaration in common.dml
param num_banks = 4;
The assignment in my-dev.dml
takes precedence,
because my-dev.dml
imports common.dml
.
Another example: The following example gives an compile error:
template my_read_constant {
param value default 0;
...
}
template my_write_constant {
param value default 0;
...
}
bank b {
// ERROR: Two declarations exist, and neither takes precedence
register r is (my_read_constant, my_write_constant);
}
The conflict can be resolved by declaring the parameter a third time, in a location that overrides both the conflicting declarations:
bank b {
register r is (my_read_constant, my_write_constant) {
param value default 0;
}
}
Furthermore, an assignment (=
) of a parameter may not be
overridden by another declaration.
If more than one declaration of a method appears in the same object, then the template and import hierarchies are used to deduce the override order. This is done in a similar way to how parameters are handled:
-
A method declaration that appears in a block that instantiates a template will override any declaration from that template
-
A method declaration that appears in a file that imports another file will override any declaration from that file.
-
The declarations of one method must appear so that one declaration overrides all other declarations of the method; otherwise the declaration is considered ambiguous and an error is signalled.
-
A method can only be overridden by another method if it is declared
default
.
Note: An overridable built-in method is defined by a template
named as the object type. So, if you want to write a template that
overrides the read
method of a register, and want to make
your implementation overridable, then your template must explicitly
instantiate the register
template using a statement is register;
.
In DML, a method call looks much like in C, with some exceptions. For instance,
(a, b) = access(...);
calls the method 'access' in the same object, assigning the return values to
variables a
and b
.
If one method overrides another, it is possible to refer to the overridden
method from within the body of the overriding method using the identifier
default
:
x = default(...);
In addition to default
, there exists the templates
member of
objects which allows for
calling the particular implementation of a method as provided by a specified
template. This is particularly useful when default
can't be used due to the
method overriding implementations provided by multiple hierarchically unrelated
templates, such that default
can't be unambiguously resolved (see Resolution
of overrides.) Unlike default
, templates
can also
be used even outside the body of the overriding method.
DML supports compound initializer syntax for the arguments of called methods, meaning arguments of struct-like types can be constructed using {...}. For example:
typedef struct {
int x;
int y;
} struct_t;
method copy_struct(struct_t *tgt, struct_t src) {
*tgt = src
}
method m() {
local struct_t s;
copy_struct(&s, {1, 4});
copy_struct(&s, {.y = 1, .x = 4});
copy_struct(&s, {.y = 1, ...}); // Partial designated initializer
}
This syntax can't be used for variadic arguments or inline arguments.
Methods can also be defined as inline, meaning that at
least one of the input arguments is declared inline
instead
of declaring a type. The method body is re-evaluated every time it is
invoked, and when a constant is passed for an inline argument, it will
be propagated into the method as a constant.
Inline methods were popular in previous versions of the language, when constant folding across methods was a useful way to reduce the size of the compiled model. DML 1.4 provides better ways to reduce code size, and inline methods remain mainly for compatibility reasons.
In DML 1.4, methods can be exported
using the
export
declaration.
In DML 1.4, method references can be converted to function pointers using
&
.
Methods that do not rely on the particular instance of the device model may
be declared independent
:
independent method m(...) -> (...) {...}
Exported independent methods do not have the input
parameter corresponding to the device instance, allowing them to be called
in greater number of contexts. The body of independent methods may not contain
statements or expressions that rely on the device instance in any way; for
example, session
or saved
variables may not be referenced, after
and log
statements may not be used, and non-independent
methods may not be called.
Within a template, shared
independent methods may be declared.
When independent methods are used as callbacks, it can sometimes be desirable to
mutate device state. In order to do this safely, device state should be mutated
within a method not declared independent
, which can called from independent
methods through the use of &
.
Device state should not be mutated directly within an independent method as this
could cause certain Simics breakpoints to not function correctly; for example,
an independent method should not mutate a session variable through a pointer to
that variable.
Independent methods may also be declared startup
, which causes them to be
called when the model is loaded into the simulation, before any device is
created. In order for this to be possible, independent startup methods
may not
have any return values nor be declared throw
s. In addition, independent
startup methods may not be declared with an overridable definition due to
technical limitations — this restriction can be worked around by having an
independent startup method call an overridable independent method. Note that
abstract shared
independent startup methods are allowed.
The order in which independent startup methods are implicitly called at model load is not defined, with the exception that independent startup methods not declared memoized are called before any independent startup methods that are.
Independent startup methods may also be declared memoized
. Unlike regular
independent startup
methods, independent startup memoized
methods may
— indeed, are required to — have return values and/or be
declared throws
.
After the first call of a memoized method, all subsequent calls for the simulation session return the results of the first call without executing the body of the method. If a memoized method call throws, then subsequent calls will throw without executing the body.
The first call to an independent startup memoized method will typically be the one implicitly performed at model load, but it may also occur beforehand (for example, if the method is called as part of another independent startup method).
Result caching is shared across all instances of the device model. This mechanism can be used to compute device-independent data which is then shared across all instances of the device model.
The results of shared
memoized methods are cached per template instance, and
are not shared across all objects instantiating the template.
(Indirectly) recursive memoized method calls are not allowed; the result of such a call is a run-time critical error.
A session declaration creates a number of named storage locations for arbitrary run-time values. The names belongs to the same namespace as objects and methods. The general form is:
session declarations = initializer;
where = initializers
is optional
and declarations
is a variable declaration similar to C, or
a sequence of such declarations; for example,
session int id = 1;
session bool active;
session double table[4] = {0.1, 0.2, 0.4, 0.8};
session (int x, int y) = (4, 3);
session conf_object_t *obj;
In the absence of explicit initializer expressions, a default "all zero" initializer will be applied to each declared object.
Note that the number of initializers — together given as a tuple — must match the number of declared variables. In addition, the number of elements in each initializer must match with the number of elements or fields of the type of the declared session variable. This also implies that each sub-element, if itself being a compound data structure, must also be enclosed in braces.
C99-style designated initializers are supported for struct
, layout
, and
bitfields
types:
typedef struct { int x; struct { int i; int j; } y; } struct_t;
session struct_t s = { .x = 1, .y = { .i = 2, .j = 3 } }
Unlike C, partial initialization is not allowed implicitly; a designated
initializer for each member must be specified.
However, partial initialization can be done explicitly through the use of
trailing ...
syntax:
session struct_t s = { .y = { .i = 2, ... }, ... }
Also unlike C, designator lists are not supported, and designated initializers for arrays are not supported.
Note: Previously session
variables were known as data
variables.
A saved declaration creates a named storage location for an arbitrary run-time value, and automatically creates an attribute that checkpoints this variable. Saved variables can be declared in object or statement scope, and the name will belong to the namespace of other declarations in that scope. The general form is:
saved declaration = initializer;
where = initializer
is optional
and declaration
is similar to a C variable
declaration; for example,
saved int id = 1;
saved bool active;
saved double table[4] = {0.1, 0.2, 0.4, 0.8};
In the absence of explicit initializer expression, a default "all zero" initializer will be applied to the declared object.
Note that the number of elements in the initializer must match with the number of elements or fields of the type of the saved variable. This also implies that each sub-element, if itself being a compound data structure, must also be enclosed in braces.
C99-style designated initializers are supported for struct
, layout
, and
bitfields
types:
typedef struct { int x; struct { int i; int j; } y; } struct_t;
saved struct_t s = { .x = 1, .y = { .i = 2, .j = 3 } }
Unlike C, partial initialization is not allowed implicitly; a designated
initializer for each member must be specified.
However, partial initialization can be done explicitly through the use of
trailing ...
syntax:
session struct_t s = { .y = { .i = 2, ... }, ... }
Also unlike C, designator lists are not supported, and designated initializers for arrays are not supported.
In addition, the types of saved declaration variables are currently restricted to primitive data types, or structs or arrays containing only data types that could be saved. Such types are called serializable.
Note: Saved variables are primarily intended for making checkpointable
state easier. For configuration, attribute
objects should
be used instead. Additional data types for saved declarations are planned to
be supported.
hook(msgtype1, ... msgtypeN) name;
A hook declaration defines a named object member to which suspended computations may be attached for execution at a later point. By sending a message through the hook, every computation suspended on the hook will become detached from the hook, and then executed — receiving the message as data. Computations suspended on a hook are executed in order of least recently attached; in other words, FIFO semantics.
Currently, the only computations that can be suspended and attached to hooks are
single method calls, which is done through the use of the after
statement. This will later be expanded upon: hooks will play
a central role in the future introduction of coroutines, as hooks will serve
as the primitive mechanism through which coroutines suspend themselves and
become resumed.
Every hook has an associated list of message component types, specified during declaration through the (msgtype1, ... msgtypeN) syntax. This specifies what form of data is sent and received via the hook. Any number of message component types can be given, including zero, in which case a message sent via the hook has no associated data.
Example declarations:
// Hook with no associated message component types
hook() h1;
// Hook with a single message component type
hook(int) h2;
// Hook with two message component types
hook(int *, bool) h3;
Beyond suspending computations on it, a hook h has two associated operations:
-
h.send(msg1, ... msgN)
Sends a message through the hook, with message components msg1 through msgN. The number of message components must match the number of message component types of the hook, and each message component must be compatible with the corresponding message component type of the hook.
send
is asynchronous: the message will only be sent — and suspended computations executed — once all current device entries on the call stack have been completed. It is exactly equivalent to after: h.send_now(msg1, ... msgN), except it's not possible to prevent the message from being sent viacancel_after()
. For more information, see Immediate After Statements.Like immediate after statements, pointers to stack-allocated data must not be passed as message components to a
send
. If you must use pointers to stack-allocated data, thensend_now
should be used instead ofsend
. If you want the message to be delayed to avoid ordering bugs, create a method which wraps thesend_now
call together with the declarations of the local variable(s) which are pointed to, and then use an immediate after statement (after: m(...)
) to delay the call to that method. -
h.send_now(msg1, ... msgN)
Sends a message through the hook, with message components msg1 through msgN. The number of message components must match the number of message component types of the hook, and each message component must be compatible with the corresponding message component type of the hook.
send_now
is synchronous: every computation suspended on the hook will execute beforesend_now
completes.send_now
returns the number of suspended computations that were successfully resumed from the message being sent. Currently, every suspended computation is guaranteed to successfully be resumed unless cancelled by a preceding computation resumed by thesend_now
. This will not remain true in the future: coroutines are planned to be able to reject a message and reattach themselves to the hook. -
h.suspended
Evaluates to the number of computations currently suspended on the hook.
References to hooks are valid run-time values: a reference to a hook with message component types msgtype1 through msgtypeN will have the hook reference type hook(msgtype1, ... msgtypeN). This means hook references can be stored in variables, and can be passed around as method arguments or return values. In fact, hook references are even serializable.
Two hook references of the same hook reference type can be compared for equality, and are considered equal when they both reference the same hook.
method send_now_checked_no_data(hook() h) { local uint64 resumed = h.send_now(); if (resumed == 0) { log error: "Unhandled message to hook"; } }method send_now_checked_int(hook(int) h, int x) { local uint64 resumed = h.send_now(x); if (resumed == 0) { log error: "Unhandled message to hook"; } }
The general form of an object declaration is "type
name extras is (template, ...) desc {
... }
" or "type name extras is
(template, ...) desc;
", where type
is an object type such as bank
, name
is an
identifier naming the object, and extras
is optional
special notation which depends on the object type. The is
(template, ...)
part is optional and will make the object
inherit the named templates. The surrounding parenthesis can be omitted if
only one template is inherited. The desc
is an optional
string constant giving a very short summary of the object. It can consist
of several string literals concatenated by the '+' operator. Ending the
declaration with a semicolon is equivalent to ending with an empty
pair of braces. The body (the section within the braces) may
contain parameter declarations, methods, session
variable declarations, saved variable declarations,
in each declarations and
object declarations.
For example, a register
object may be declared as
register r0 @ 0x0100 "general-purpose register 0";
where the "@ offset
" notation is particular for the
register
object type; see below for details.
Using is (template1, template2)
is equivalent to
using is
statements in the body, so the following two
declarations are equivalent:
register r0 @ 0x0100 is (read_only,autoreg);
register r0 @ 0x0100 {
is read_only;
is autoreg;
}
An object declaration with a desc
section, on the form
type name ... desc { ... }
is equivalent to defining the parameter desc
, as in
type name ... { param desc = desc; ... }
In the following sections, we will leave out desc
from
the object declarations, since it is always optional. Another parameter,
documentation
(for which there is no short-hand), may also be
defined for each object, and for some object types it is used to give a
more detailed description.
See Section Universal Templates
for details.)
If two object declarations with the same name occur within the same containing object, and they specify the same object type, then the declarations are concatenated; e.g.,
bank b { register r size 4 { ...body1... } ... register r @ 0x0100 { ...body2... } ... }
is equivalent to
bank b { register r size 4 @ 0x0100 { ...body1... ...body2... } ... }
However, it is an error if the object types should differ.
Most object types (bank
, register
,
field
,
group
, attribute
, connect
,
event
, and port
) may be used
in arrays. The general form of an object array declaration is
type name[var < size]... extras { ... }
Here each [var < size]
declaration defines
a dimension of resulting array. var defines the name of the
index in that dimension, and size defines the size of the dimension.
Each variable defines a parameter in the object scope, and thus must
be unique.
The size must be a compile time constant. For instance,
register regs[i < 16] size 2 {
param offset = 0x0100 + 2 * i;
...
}
or written more compactly
register regs[i < 16] size 2 @ 0x0100 + 2 * i;
defines an array named regs
of 16 registers (numbered from 0 to
15) of 2 bytes each, whose offsets start at 0x0100.
See Section Universal Templates
for details about arrays and index parameters.
The size specification of an array dimension may be replaced with ...
if the
size has already been defined by a different declaration of the same object
array. For example, the following is valid:
register regs[i < 16][j < ...] size 2 @ 0x0100 + 16 * i + 2 * j;
register regs[i < ...][j < 8] is (read_only);
Note that in Simics 5, port
arrays and bank
arrays
cannot be multi-dimensional.
The following sections give further details on declarations for object types that have special conventions.
The general form of a register
declaration is
register name size n @ d is (templates) { ... }
Each of the "size n
", "@ d
", and "is
(templates)
" sections is optional, but if present, they must
be specified in the above order.
-
A declaration
register name size n ... { ... }
is equivalent to
register name ... { param size = n; ... }
-
A declaration
register name ... @ d ... { ... }
is equivalent to
register name ... { param offset = d; ... }
The general form of a field
object
declaration is
field name @ [highbit:lowbit] is (templates) { ... }
or simply
field name @ [bit] ... { ... }
specifying a range of bits of the containing register, where the syntax
[bit]
is short for [bit:bit]
.
Both the "@ [...]
" and the is (templates)
sections are optional; in fact, the "[...]
" syntax is merely a
much more convenient way of defining the (required) field parameters
lsb
and msb
.
For a range of two or more bits, the first (leftmost) number always indicates the most significant bit, regardless of the bit numbering scheme used in the file. This corresponds to how bit fields are usually visualized, with the most significant bit to the left.
The bits of a register are always numbered from zero to n - 1, where n is the width of the register. If the default little-endian bit numbering is used, the least significant bit has index zero, and the most significant bit has index n - 1. In this case, a 32-bit register with two fields corresponding to the high and low half-words may be declared as
register HL size 4 ... {
field H @ [31:16];
field L @ [15:0];
}
If instead big-endian bit numbering is selected in the file, the most significant bit has index zero, and the least significant bit has the highest index. In that case, the register above may be written as
register HL size 4 ... {
field H @ [0:15];
field L @ [16:31];
}
This is useful when modeling a system where the documentation uses big-endian bit numbering, so it can be compared directly to the model.
It is also possible to conditionally include or exclude one or more object declarations, depending on the value of a boolean expression. This is especially useful when reusing source files between several similar models that differ in some of the details.
The syntax is very similar to the #if
statements
used in methods.
#if (enable_target) {
connect target (
interface signal;
}
}
One difference is that the braces are required. It is also possible to add else branches, or else-if branches.
#if (modeltype == "Mark I") {
...
} #else #if (modeltype == "Mark II" {
...
} #else {
...
}
The general syntax is
#if ( conditional ) { object declarations ... } #else #if ( conditional ) { object declarations ... } ... #else { object declarations ... }
The conditional is an expression with a constant boolean value. It may reference parameters declared at the same level in the object hierarchy, or in parent levels.
The object declarations are any number of declarations of objects, session
variables, saved variables, methods, or other #if
statements, but not
parameters, is
statements, or in each
statements . When the conditional is
true
(or if it's the else branch of a false conditional), the object
declarations are treated as if they had appeared without any surrounding #if.
So the two following declarations are equivalent:
#if (true) {
register R size 4;
} #else {
register R size 2;
}
is equivalent to
register R size 4;
In Each declarations are a convenient mechanism to apply a pattern to a group of objects. The syntax is:
in each
(template-name
, ...) {
body
}
where template-name
is the name of a template
and body
is a list of object statements, much like the body
of a template. The statements in body
are expanded in any
subobjects that instantiate the template template-name
,
either directly or indirectly. If more than
one template-name
is given, then the body will be expanded
only in objects that instantiate all the listed templates.
The in each
construct can be used as a convenient way to
express when many objects share a common property. For
example, a bank can contain the following to conveniently set the size
of all its registers:
in each register { param size = 2; }
Declarations in an in each
block will override any
declarations in the extended template. Furthermore, declarations in
the scope that contains an in each
statement, will override
declarations from that in each
statement. This can be used
to define exceptions for the in each
rule:
bank regs {
in each register { param size default 2; }
register r1 @ 0;
register r2 @ 2;
register r3 @ 4 { param size = 1; }
register r4 @ 5 { param size = 1; }
}
An in each
block is only expanded in subobjects; the
object where the in each
statement is present is
unaffected even if it instantiates the extended template.
An in each
statement with multiple template names can be used
to cause a template to act differently depending on context:
template greeting { is read; }
template field_greeting is write {
method write(uint64 val) {
log info: "hello";
}
}
in each (greeting, field) { is field_greeting; }
template register_greeting is write {
method write(uint64 val) {
log info: "world";
}
}
in each (greeting, register) { is register_greeting; }
bank regs {
register r0 @ 0 {
// logs "hello" on write
field f @ [0] is (greeting);
}
// logs "world" on write
register r1 @ 4 is (greeting);
}
The following sections describe the global declarations in DML. These can only occur on the top level of a DML model, i.e., not within an object or method. Unless otherwise noted, their scope is the entire model.
import filename;
Imports the contents of the named file. filename must be a string
literal, such as "utility.dml"
. The -I
option to the
dmlc
compiler can be used to specify directories to be searched
for import files.
If filename starts with ./
or ../
, the
compiler disregards the -I
flag, and the path is instead
interpreted relative to the directory of the importing file.
Note that imported files are parsed as separate units, and use their own language version and bit order declarations. A DML 1.4 file is not allowed to import a DML 1.2 file, but a DML 1.2 file may import a DML 1.4 file.
Templates may only be declared on the top level, and the syntax and semantics for such declarations have been described previously.
Templates share the same namespace as types, as each template declaration defines a corresponding template type of the same name. It is illegal to define a template whose name conflicts with that of another type.
bitorder order;
Selects the default bit numbering scheme to be used for interpreting
bit-slicing expressions and bit field declarations in the file. The
order
is one of the identifiers le
or
be
, implying little-endian or big-endian, respectively. The
little-endian numbering scheme means that bit zero is the least
significant bit in a word, while in the big-endian scheme, bit zero is
the most significant bit.
A bitorder
declaration should be placed before any other
global declaration in each DML-file, but must follow immediately after
the device
declaration if such one is present.
The scope of the declaration is the whole of the file it
occurs in. If no bitorder
declaration is present in a file, the
default bit order is le
(little-endian). The bitorder does not
extend to imported files; for example, if a file containing a
declaration "bitorder be;
" imports a file with no bit order
declaration, the latter file will still use the default le
order.
constant name = expr;
Defines a named constant.
expr
must be a constant-valued expression.
Parameters have a similar behaviour as constants but are more
powerful, so constants are rarely useful. The only advantage of
constants over parameters is that they can be used in typedef
declarations.
loggroup name;
Defines a log group, for use in log
statements.
More generally,
the identifier name
is bound to an unsigned integer
value that is a power of 2, and can be used anywhere in C context; this
is similar to a constant
declaration, but the value is
allocated automatically so that all log groups are represented by
distinct powers of 2 and can be combined with bitwise or.
A maximum of 63 log groups may be declared per device (61 excluding the built-in
Register_Read
and Register_Write
log groups.)
typedef declaration; extern typedef declaration;
Defines a name for a data type.
When the extern
form is used, the type is assumed to exist in
the C environment. No definition of the type is added to the
generated C code, and the generated C code blindly assume that the
type exists and has the given definition.
An extern typedef
declaration may not contain a layout
or
endian int
type.
If a struct
type appears within an extern typedef
declaration, then DMLC will assume that there is a corresponding C
type, which has members of given types that can be accessed with
the .
operator. No assumptions are made on completeness or
size; so the C struct may have additional fields, or it might be
a union
type. An empty member list is even allowed; this can
make sense for opaque structs. DML variables of extern
struct type are
initialized such that any members of the C struct which are unknown to DML are
initialized to 0.
Nested struct definitions are permitted in an extern typedef
declaration, but an inner struct type only supports member access; it
cannot be used as a standalone type. For instance, if you have:
extern typedef struct {
struct { int x; } inner;
} outer_t;
then you can declare local outer_t var;
and access the member
var.inner.x
, but the inner type is unknown to DML so you cannot
declare a variable local typeof var.inner *inner_p;
.
extern declaration;
Declares an external identifier, similar to a C extern
declaration; for example,
extern char *motd;
extern double table[16];
extern conf_object_t *obj;
extern int foo(int x);
extern int printf(const char *format, ...);
Multiple extern
declarations for the same identifier are permitted as long as
they all declare the same type for the identifier.
header %{
...
%}
Specifies a section of C code which will be included verbatim in the
generated C header file for the device. There must be no whitespace
between the %
and the corresponding brace in the %{
and %}
markers. The contents of the header section are not
examined in any way by the dmlc
compiler; declarations made
in C code must also be specified separately in the DML code proper.
This feature should only be used to solve problems that cannot easily be handled directly in DML. It is most often used to make the generated code include particular C header files, as in:
header %{
#include "extra_defs.h"
%}
The expanded header block will appear in the generated C file, which
usually is in a different directory than the source DML
file. Therefore, when including a file with a relative path, the C
compiler will not automatically look for the .h
file in
the directory of the .dml
file, unless a corresponding
-I
flag is passed. To avoid this problem, DMLC inserts a C
macro definition to permit including a companion header
file. For instance, if the
file /path/to/hello-world.dml
includes a header block,
then the macro DMLDIR_HELLO_WORLD_H
is defined as the
string "/path/to/hello-world.h"
within this header
block. This allows the header block to contain #include DMLDIR_HELLO_WORLD_H
, as a way to include hello-world.h
without passing -I/path/to
to the C compiler.
DMLC only defines one such macro in each header block, by taking the
DML file name and substituting the .dml
suffix
for .h
. Furthermore, the macro is undefined after the
header. Hence, the macro can only be used to access one specific
companion header file; if other header files are desired, then
#include
directives can be added to the companion header
file, where relative paths are expanded as expected.
See also footer
declarations, below.
footer %{
...
%}
Specifies a piece of C code which will be included verbatim at the end
of the generated code for the device. There must be no whitespace
between the %
and the corresponding brace in the %{
and %}
markers. The contents of the footer section are not
examined in any way by the dmlc
compiler.
This feature should only be used to solve problems that cannot easily be
handled directly in DML. See also header
declarations, above.
export method as name;
Exposes a method specified by method
to other C modules within the
same Simics module under the name name
with external linkage. Note
that inline methods, shared methods, methods that throw, methods with
more than one return argument, and methods declared inside object
arrays cannot be exported. It is sometimes possible to write wrapper
methods that call into non-exportable methods to handle such cases,
and export the wrapper instead.
Exported methods are rarely used; it is better to use Simics interfaces for communication between devices. However, exported methods can sometimes be practical in tight cross-language integrations, when the implementation of one device is split between one DML part and one C/C++ part.
Example: the following code in DML:
method my_method(int x) { ... }
export my_method as "my_c_function";
will export my_method
as a C function with external linkage,
using the following signature:
void my_c_function(conf_object_t *obj, int x);
The conf_object_t *obj
parameter corresponds to the device instance, and is
omitted when the referenced method is independent.
This section describes in detail the rules for how DML handles when there are multiple definitions of the same parameter or method. A less technical but incomplete description can be found in the section on templates.
-
Each declaration in every DML file is assigned a rank. The set of ranks form a partial order, and are defined as follows:
- The top level of each file has a rank.
- Each template definition has a rank.
- The block in an
in each
declaration has a rank. - If one object declaration has rank R, then any subobject
declaration inside it, also those inside an
#if
block, has rank R. -
param
andmethod
declarations has the rank of the object they are declared within. This includes shared methods. - If an object declaration contains
is T
, then that object declaration has higher rank than the body of the templateT
. - If one file F1 imports another file F2, then the top level of F1 has higher rank than the top level of F2.
- A declaration has higher rank than the block of any
in each
declaration it contains. - An
in each
block has higher rank than the templates it applies to - If there are three declarations D1, D2 and D3, where D1 has higher rank than D2 and D2 has higher rank than D3, then D1 has higher rank than D3.
- A declaration may not have higher rank than itself.
-
In a set of
method
orparam
declarations that declare the same object in the hierarchy, then we say that one declaration dominates the set if it has higher rank than all other declarations in the set. Abstractparam
declarations (param name;
orparam name : type;
) and abstract method definitions (method name(args...);
) are excluded here; they cannot dominate a set, and a dominating declaration in a set does not need to have higher declaration than any abstractparam
ormethod
declaration in the set. -
There may be any number of untyped abstract definitions of a parameter (
param name;
). -
There may be at most one typed abstract definition of a parameter (
param name : type;
) -
There may be at most one abstract shared definition of a method. Any other shared definition of this method must have higher rank than the abstract definition, but any rank is permitted for non-shared definitions. For instance:
template a { method m() default {} } template b { shared method m() default {} } template aa is a { // OK: overrides non-shared method shared method m(); } template bb is b { // Error: abstract shared definition overrides non-abstract shared method m(); }
-
When there is a set of declarations of the same a
method
orparam
object in the hierarchy, then there must be (exactly) one of these declarations that dominates the set; it is an error if there is not. -
If there is a
method
orparam
that is not declareddefault
, then it must dominate the set of declarations of that method or parameter; it is an error if it does not. -
In the above two rules, "the set of declarations" of an object does not include declarations that are disabled through an
#if
statement, or definitions that appear in a template that never is instantiated in an object. However, the rules do also apply to shared method declarations in templates, regardless whether the templates are used. For instance:template t1 { method a() {} shared method b() {} } template t2 is t1 { // OK, as long as t2 never is instantiated method a default {} // Error, even if t2 is unused shared method b() default {} }
-
If the set of declarations D1, D2, ..., Dn of a method M is dominated by the declaration Dn, then:
- If there is a k, 1 ≤ k ≤ n-1, such that Dk dominates
the set D1, ..., Dn-1, then the symbol
default
refers to the method implementation of Dk within the scope of the method implementation of Dn. - If not, then
default
is an illegal value within the method implementation of Dn.
- If there is a k, 1 ≤ k ≤ n-1, such that Dk dominates
the set D1, ..., Dn-1, then the symbol
It follows that:
-
The following code is illegal, because it would otherwise give T a higher rank than itself:
template T { #if (p) { group g is T { param p = false; } } }
-
Cyclic imports are not permitted, for the same reason.
-
If an object is declared twice on the top level in the same file, then both declarations have the same rank. Thus, the following declarations of the parameter
p
count as conflicting, because neither has a rank that dominates the other:bank b { register r { param p default 3; } } bank b { register r { param p = 4; } }
The algorithmic language used to express method bodies in DML is an extended
subset of ISO C, with some C++ extensions such as new
and delete
. The
DML-specific statements and expressions are described in Sections
Method Statements and Expressions.
DML defines the following additional built-in data types:
-
int1
, ...,int64
,uint1
, ...,uint64
-
Signed and unsigned specific-width integer types. Widths from 1 to 64 are allowed.
-
bool
-
The generic boolean datatype, consisting of the values
true
andfalse
. It is not an integer type, and the only implicit conversion is touint1
DML also supports the non-standard C extension
typeof(expr)
operator, as provided by some modern C
compilers such as GCC.
DML deviates from the C language in a number of ways:
-
All integer arithmetic is performed on 64-bit numbers in DML, and truncated to target types on assignment. This is similar to how arithmetic would work in C on a platform where the
int
type is 64 bits wide (though in DML,int
is an alias ofint32
). Similarly, all floating-point arithmetic is performed on thedouble
type.For instance, consider the following:
local int24 x = -3; local uint32 y = 2; local uint64 sum = x + y;
In C, the expression
x + y
would cast both operands up to unsigned 32-bit integers before performing a 32-bit addition; overflow gives the result is 232 - 1, which is promoted without sign extension into a 64-bit integer before stored in thesum
variable. In DML, both operands are instead promoted to 64-bit signed integers, so the addition evaluates to -1, which is stored as 264 - 1 in thesum
variable.Formally, if any of the two operands of an arithmetic binary operator (including bitwise operators) has the type
uint64
, then both operands are promoted intouint64
before the operation; otherwise, both operands are promoted intoint64
before the operation. If any operand has floating-point type, then both operands are promoted into thedouble
type. -
Comparison operators (
==
,!=
,<
,<=
,>
and>=
) do not promote signed integers to unsigned before comparison. Thus, unlike in C, the following comparison yieldstrue
:int32 x = -1; uint64 val = 0; if (val > x) { ... }
-
The shift operators (
<<
and>>
) have well-defined semantics when the right operand is large: Shifting by more than 63 bits gives zero (-1 if the left operand is negative). Shifting a negative number of bits is an error. -
Division by zero is an error.
-
Signed overflow in arithmetic operations (
+
,-
,*
,/
,<<
) is well-defined. The overflow value is calculated assuming two's complement representation; i.e., the result is the unique value v such that v ≡ r (mod 264), where r is the result of operation using arbitrary precision arithmetic. -
Local variable declarations must use the keyword local, session, or saved; as in
method m() { session int call_count = 0; saved bool called = false; local int n = 0; local float f; ... }
Session and saved variables have a similar meaning to static variables as in C: they retain value over function calls. However, such variables in DML are allocated per device object, and are not globally shared between device instances.
Unlike C, multiple simultaneous variable declaration and initialization is done through tuple syntax:
method m() { local (int n, bool b) = (0, true); local (float f, void *p); ... }
-
Plain C functions (i.e., not DML methods) can be called using normal function call syntax, as in
f(x)
.In order to call a C function from DML, three steps are needed:
-
In order for DML to recognize an identifier as a C function, it must be declared in DML, using an
extern
declaration. -
In order for the C compiler to recognize the identifier when compiling generated C code, a function declaration must also be declared in a
header
section, or in a header file included from this section. -
In order for the C linker to resolve the symbol, a function definition must be present, either in a separate C file or in a header or
footer
section.
foo.c
int foo(int i) { return ~i + 1; }
foo.h
int foo(int i);
bar.dml
// tell DML that these functions are available extern int foo(int); extern int bar(int); header %{ // tell generated C that these functions are available #include "foo.h" int bar(int); // defined in the DML footer section %} footer %{ int bar(int i) { return -i; } %}
Makefile
SRC_FILES=foo.c bar.dml
-
-
Assignments (
=
) are required to be separate statements. You are still allowed to assign multiple variables in one statement, as in:i = j = 0;
-
Multiple simultaneous assignment can be performed in one statement through tuple syntax, allowing e.g. the following:
(i, j) = (j, i);
However, such assignments are not allowed to be chained.
-
If a method can throw exceptions, or if it has more than one return argument, then the call must be a separate statement. If it has one or more return values, these must be assigned. If a method has multiple return arguments, these are enclosed in a parenthesis, as in:
method divmod(int x, int y) -> (int, int) { return (x / y, x % y); } ... (quotient, remainder) = divmod(17, 5);
-
Type casts must be written as
cast(expr, type)
. -
Comparison operators and logical operators produce results of type
bool
, not integers. -
Conditions in
if
,for
,while
, etc. must be proper booleans; e.g.,if (i == 0)
is allowed, andif (b)
is allowed ifb
is a boolean variable, butif (i)
is not, ifi
is an integer. -
The
sizeof
operator can only be used on lvalue expressions. To take the size of a datatype, thesizeoftype
operator must be used. -
Comma-expressions are only allowed in the head of
for
-statements, as infor (i = 10, k = 0; i > 0; --i, ++k) ...
-
delete
andthrow
can only be used as statements in DML, not as expressions. -
throw
does not take any argument, andcatch
cannot switch on the type or value of an exception. -
Type declarations do not allow the use of
union
. However, theextern typedef
construct can be used to achieve the same result. For example, consider the union data type declared in C as:typedef union { int i; bool b; } u_t;
The data type can be exposed in DML as follows:
header %{ typedef union { int i; bool b; } u_t; %} extern typedef struct { int i; bool b; } u_t;
This will make
u_t
look like a struct to DML, but since union and struct syntax is identical in C, the C code generated from uses ofu_t
will work correctly together with the definition from theheader
declaration.
All ISO C statements are available in DML, and have the same semantics as in C. Like ordinary C expressions, all DML expressions can also be used in expression-statements.
DML adds the following statements:
target1 [= target2 = ...] = initializer; (target1, target2, ...) = initializer;
Assign values to targets according to an initializer. Unlike C, assignments are not expressions, and the right-hand side can be any initializer — such as compound initializers ({...}) for struct-like types.
The first form is chaining assignments. The initializer is executed once and the value it evaluates to is assigned to each target.
The second form is multiple simultaneous assignment. The initializer describes multiple values — one for each target. This can be done either through:
- Providing an initializer for each target through tuple syntax, e.g.:
(a, i) = (false, 4);
- Performing a method call where each target is a return value recipient, e.g.:
method m() -> (bool, int) {
...
}
(a, i) = m();
Targets are updated simultaneously, meaning it's possible to e.g. swap the contents of variables through the following:
(a, b) = (b, a)
local type identifier [= initializer]; local (type1 identifier1, type2 identifier2, ...) [= initializer];
Declares one or multiple local variables in the current scope. The right-hand side is an initializer, meaning, for example, that compound initializers ({...}) can be used.
The initializer must provide the exact number of values needed to initialize the variables, and they must be of compatible type. Multiple values can be provided either through:
- Providing an initializer for each variable through tuple syntax, e.g.:
local (bool a, int i) = (false, 4);
- Performing a method call where each return value initializes a variable, e.g.:
method m() -> (bool, int) {
...
}
local (bool a, int i) = m();
In the absence of explicit initializer expressions, a default "all zero" initializer will be applied to each declared object.
session type identifier [= initializer]; session (type1 identifier1, type2 identifier2, ...) [= (initializer1, initializer2, ...)];
Declares one or multiple session variables in the current scope. Note that initializers of such variables are evaluated once when initializing the device, and thus must be a compile-time constant.
saved type identifier [= initializer]; sabed (type1 identifier1, type2 identifier2, ...) [= (initializer1, initializer2, ...)];
Declares one or multiple saved variables in the current scope. Note that initializers of such variables are evaluated once when initializing the device, and thus must be a compile-time constant.
return [initializer];
Returns from method with the value(s) specified by the argument. Unlike C, the argument is an initializer, meaning, for example, return values of struct-like type can be constructed using {...}.
The initializer must provide the exact number of values corresponding as the return values of the method, and they must be of compatible type. Multiple values can be provided either through:
- Providing an initializer for each return value through tuple syntax, e.g.:
method m() -> (bool, int) {
return (false, 4);
}
- Performing a method call and propagating the return values:
method n() -> (bool, int) {
return m();
}
delete expr;
Deallocates the memory pointed to by the result of evaluating
expr
. The memory must have been allocated with the
new
operator, and must not have been deallocated previously.
Equivalent to delete
in C++; however, in DML, delete
can only be used as a statement, not as an expression.
try protected-stmt catch handle-stmt
Executes protected-stmt
; if that completes normally,
the whole try
-statement completes normally. Otherwise,
handle-stmt
is executed. This is similar to exception
handling in C++, but in DML there is only one kind of exception. Note
that Simics C-exceptions are not handled. See also throw
.
throw;
Throws (raises) an exception, which may be caught by a
try
-statement. This is
similar to throw
in C++, but in DML it is not possible to
specify a value to be thrown. Furthermore, in DML,
throw
is a statement, not an expression.
If an exception is not caught inside a method body, then the method
must be declared as throws
, and the exception is propagated
over the method call boundary.
(d1, ... dM) = method(e1, ... eN);
A DML method is called similarly as a C function, with the exception that you
must have assignment destinations according to the number of return values of
the method. Here a DML method is called with input arguments e1
, ... eN
,
assigning return values to destinations d1
, ... dM
. The destinations are
usually variables, but they can be arbitrary L-values (even bit slices) as long
as their types match the method signature.
If the method has no return value, the call is simply expressed as:
p(...);
A method with exactly one return value can also be called in any expression, unless it is an inline method, or a method that can throw exceptions. For example:
method m() -> (int) { ... }
...
if (m() + i == 13) { ... }
A method call (even if it is throwing or has multiple return values) can be used as an initializer in any context that accepts non-constant initializers; i.e., in assignment statements (as shown above), local variable declarations, and return statements. For example:
// declare multiple variables, and initialize them from one method call
local (int i, uint8 j) = m(e1);
// Propagate all return values from a method call as the return values of the
// caller.
return m(e1)
Every object, as well as every template type, has a
templates
member to allow for calling particular implementations of that
object's methods, as opposed to only the final overriding implementations
that are reachable directly. Specifically, templates
allows for invoking any
particular implementation as provided by a specified template instantiated by
the object. Such invocations are called template-qualified method
implementation calls, and are made as follows:
template t {
method m() default {
log info: "implementation from 't'"
}
}
template u is t {
method m() default {
log info: "implementation from 'u'"
}
}
group g is u {
method m() {
log info: "final implementation"
}
}
method call_ms() {
// Logs "final implementation"
g.m();
// Logs "implementation from 'u'"
g.templates.u.m();
// Logs "implementation from 't'"
g.templates.t.m();
}
Template-qualified method implementation calls are primarily meant as a way
for an overriding method to reference overridden implementations, even when
the implementations are provided by hierarchically unrelated templates such that
default
can't be used (see Resolution of
overrides.) In particular, this typically allows for
ergonomically resolving conflicts introduced when multiple orthogonal templates
are instantiated, as long as all conflicting implementations are overridable,
and one of the following is true:
- The implementations can be combined together by calling each one of them, as long as that can be done without risking e.g. side-effects being duplicated.
- The implementations can be combined by choosing one particular template's implementation to invoke (typically the one most complex), and then adding code around that implementation call in order to replicate the behaviour of the implementations of the other templates. Ideally, the other templates would provide methods that may be leveraged so that their behaviour may be replicated without the need for excessive boilerplate.
The following is an example of the first case:
template alter_write is write {
method write(uint64 written) {
default(alter_write(written));
}
method alter_write(uint64 curr, uint64 written) -> (uint64);
}
template gated_write is alter_write {
method write_allowed() -> (bool) default {
return true;
}
method alter_write(uint64 curr, uint64 written) -> (uint64) default {
return write_allowed() ? written : curr;
}
}
template write_1_clears is alter_write {
method alter_write(uint64 curr, uint64 written) -> (uint64) default {
return curr & ~written;
}
}
template gated_write_1_clears is (gated_write, write_1_clears) {
method alter_write(uint64 curr, uint64 written) default {
local uint64 new = this.templates.write_1_clears.alter_write(
curr, written);
return this.templates.gated_write.alter_write(curr, new);
}
}
// Resolve the conflict introduced whenever the two orthogonal templates are
// instantiated by also instantiating gated_write_1_clears when that happens
in each (gated_write, write_1_clears) { is gated_write_1_clears; }
The following is an example of the second case:
template very_complex_register is register {
method write_register(uint64 written, uint64 enabled_bytes,
void *aux) default {
... // An extremely complicated implementation
}
}
template gated_register is register {
method write_allowed() -> (bool) default {
return true;
}
method on_write_attempted_when_not_allowed() default {
log spec_viol: "%s was written to when not allowed", qname;
}
method write_register(uint64 written, uint64 enabled_bytes,
void *aux) default {
if (write_allowed()) {
default(written, enabled_bytes, aux);
} else {
on_write_attempted_when_not_allowed();
}
}
}
template very_complex_gated_register is (very_complex_register,
gated_register) {
// No sensible way to combine the two implementations by calling both.
// Even if there were, calling both implementations would cause each field
// of the register to be written to multiple times, potentially duplicating
// side-effects, which is undesirable.
// Instead, very_complex_register is chosen as the base implementation
// called, and the behaviour of gated_register is replicated around that
// call.
method write_register(uint64 written, uint64 enabled_bytes,
void *aux) default {
if (write_allowed()) {
this.templates.very_complex_register.write_register(
written, enabled_bytes, aux);
} else {
on_write_attempted_when_not_allowed();
}
}
}
in each (gated_register, very_complex_register) {
is very_complex_gated_register;
}
A template-qualified method implementation call is resolved by using
the method implementation provided to the object by the named template.
If no such implementation is provided (whether it be because the template does
not specify one, or specifies one which is not provided to the object due to its
definition being eliminated by an #if
), then the
ancestor templates of the named template are recursively searched for the
highest-rank (most specific) implementation provided by them. If the ancestor
templates provide multiple hierarchically unrelated implementations, then the
choice is ambiguous and the call will be rejected by the compiler. In this case,
the modeller must refine the template-qualified method implementation call to
name the ancestor template whose implementation they would like to use.
A template-qualified method implementation call done via a value of template
type functions differently compared to compile-time
object references. In particular, this.templates
within the bodies of shared
methods functions differently. The specified template must be an ancestor
template of the value's template type, the object template, or the
template type itself; furthermore, the specified template must provide or
inherit a shared
implementation of the named method. It is not sufficient
that the method is simply declared shared
such that it is part of the
template type: the implementation itself must also be shared
. For more
information, see the documentation of the ENSHAREDTQMIC
error
message.
after ...: method(e1, ... eN);
The after
statement sets up the given method call (the callback) such that
it will be performed with the provided arguments at a specified point in the
future. There are three different forms of the after
statement, syntactically
determined through what appears before the :
— each form corresponds
to different specifications of at what future point the method should be called.
A method call suspended using an after
statement will be performed at most
once per execution of the after
statement; it will not recur. If it's
desirable to have a suspended method call recur, then the called method must
itself make use of after
to set up a method call to itself.
The referenced method must be a regular or independent
method with no return values. It may not be a C function, or a shared
method. The only exception to this is that the send_now
operation of hooks is also supported for use as a callback.
All method calls suspended via an after
statement are associated with the
object that contains the method containing the statement. It is possible to
cancel all suspended method calls associated with an object through that
object's cancel_after()
method, as provided by the object
template.
Note: We plan to extend the after
statement to allow for users to
explicitly state what objects the suspended method call is to be associated
with.
after scalar unit: method(e1, ... eN);
In this form, the specified point in the future is given through a time delay
(in simulated time, measured in the specified time unit) relative to the time
when the after delay statement is executed. The currently supported time units
are s
for seconds (with type double
), ps
for picoseconds
(with type uint64
), and cycles
for cycles (with type uint64
).
Every argument to the called method is evaluated at the time the after
statement is executed, and stored so that they may be used when the method call
is to be performed. In order to allow the suspended method call to be
represented in checkpoints, every input parameter of the method must be of
serializable type. This means that after delay
statements cannot be used with methods that e.g. have pointer input parameters.
unless the arguments for those input parameters are message component parameters
of the after
.
Example:
after 0.1 s: my_callback(1, false);
The after delay statement is equivalent to creating a named event
object with an event-method that performs the specified call, and posting
that event at the given time, with associated data corresponding to the
provided arguments.
after hookref[-> (msg1, ... msgN)]: method(e1, ... eM);
In this form, the suspended method call is bound to the hook specified by hookref. The point in the future when the method call is executed is thus the next time a message is sent through the specified hook.
The binding syntax -> (msg1, ... msgN) is used to bind each component of the message received to a corresponding identifier, called a message component parameter. These message component parameters can be used as arguments of the called method, thus propagating the contents of the message to the method call.
Every argument to the called method which isn't a message component parameter is
evaluated at the time the after
statement is executed, and stored so that they
may be used when the method call is to be performed. In order to allow the
suspended method call to be represented in checkpoints, every input parameter of
the method must be of serializable type, unless that
input parameter receives a message component. This means that hook-bound after
statements cannot be used with methods that e.g. have pointer input parameters,
unless the arguments for those input parameters are message component parameters
of the after
.
Example use:
hook(int, float) h;
method my_callback(int i, float f, bool b) {
...
}
method m() {
after h -> (x, y): my_callback(x, y, false);
}
method send_message() {
// Assuming m() has been called once before, this 'send_now' will result in
// `my_callback(1, 3.7, false)` being called.
h.send_now(1, 3.7);
}
If the hook has only one message component, the syntax -> msg
can be used instead, and if the hook has no message components, then the binding
syntax can be entirely omitted. Any message component parameter can be used for
any number of arguments, but cannot be used as anything but a direct argument.
For example, using the definitions of h
and my_callback
as above, the
following use of after
is valid:
after h -> (x, y): my_callback(x, x, false)
Note that the first message component is used multiple times, and the second is not used at all.
In contrast, the following use of after
is invalid:
after h -> (x, y): my_callback(i, y + 1.5, false)
as the message component parameter y
is used, but not as a direct argument.
after: method(e1, ... eN);
In this form, the specified point in the future is when control is given back to the simulation engine such that the ongoing simulation of the current processor may progress, and would otherwise be ready to move onto the next cycle. This happens after all entries to devices on the call stack have been completed.
Immediate after statements are most useful to avoid ordering bugs. It can be used to delay a method call until all current lines of execution into the device have been completed, and the device is guaranteed to be in a consistent state where it is ready to handle the method call.
Semantically, the immediate after statement is very close to
after 0 cycles: ...
, but has a number of advantages. In general, the immediate
after statement is designed to execute the callback as promptly as possible
while satisfying the semantics stated above, while after 0 cycles: ...
is not.
In particular, in Simics, callbacks delayed via after 0 cycles
are always
bound to the clock associated with the device instance, which is not always
that of the processor currently under simulation — in such cases the
simulated processor may progress indefinitely without the posted callback being
executed. The immediate after statement does not have this issue.
In addition, if an immediate after statement is executed while the
simulation is stopped (due to a device entry such as an attribute get/set
performed from a script/CLI) then the callback is registered as work,
thus guaranteeing that it is called before the simulation starts again.
Within a particular device instance, method calls suspended by immediate after statements are executed in order of least recently suspended; in other words, FIFO semantics. The order in which method calls suspended by immediate after statements are executed across multiple device instances is not defined.
Within an immediate after statement, every argument provided to the called
method is evaluated at the time the after
statement is executed, and stored so
that they may be used when the method call is to be performed. Unlike the other
forms of after
statements, the input parameters of the method are never
required to be of serializable type, meaning pointers can be passed as arguments
to the callback. But beware: pointers to stack-allocated data (pointers to
or within local
variables) must never be passed as arguments. The
stack with which the after
statement is executed is not preserved, so any
pointers to stack-allocated data will point to invalid data by the time the
callback is called. The DML compiler has some checks in place to warn about the
most obvious cases where pointers to stack-allocated data are provided as
arguments, but it is unable to detect all cases. It is ultimately the modeller's
responsibility to ensure it doesn't happen.
To detail a scenario exemplifying the kind of issues that immediate after may be leveraged to solve, consider the following device, which needs to communicate with a manager device and receive permission in order to perform a particular action. Its function is simple: once prompted, the device will raise a signal to the manager in order to request permission, and waits for it to respond with an acknowledgement. Once received, the device lowers the signal to the manager and performs the action it just received permission for. In order to implement the asynchronous logic needed for this, a simple FSM is used.
param STATE_IDLE = 0;
param STATE_EXPECTING_ACK = 1;
saved int curr_state = STATE_IDLE;
port manager_link {
connect manager {
interface signal;
}
implement signal {
method signal_raise() {
on_acknowledgement();
}
}
}
method request_permission_for_action() {
if (curr_state != STATE_IDLE) {
log error: "Request already in progress";
return;
}
manager_link.manager.signal.signal_raise();
curr_state = STATE_EXPECTING_ACK;
}
method on_acknowledgement() {
if (curr_state != STATE_EXPECTING_ACK) {
log spec_viol: "Received ack when not expecting it";
return;
}
manager_link.manager.signal.signal_lower();
perform_permission_gated_action();
curr_state = STATE_IDLE;
}
This device has a subtle bug: it can't handle if the manager responds to the
signal_raise()
call synchronously.
The FSM transitions to the state capable of handling the acknowledgement
only after the signal_raise()
call returns, so if the manager responds
synchronously — as part of the signal_raise()
call — then
on_acknowledgement
will be called while the device still considers itself to
be in its idle state.
This bug can be solved in numerous ways — the most obvious is to
transition the state before making the signal_raise()
call — but
immediate after provides a solution which doesn't require carefully managing
the FSM's logic, by delaying the call to on_acknowledgement
until the device
is done with all other logic.
implement signal {
method signal_raise() {
after: on_acknowledgement();
}
}
This guarantees that the FSM is able to finish its current line of execution and properly transition itself to its new state before it's asked to manage any response of manager, even if the manager responds synchronously.
log log-type[, level [ then subsequent_level ] [, groups] ]: format-string, e1, ..., eN;
Outputs a formatted string to the Simics logging facility. The string
following the colon is a normal C printf
format string,
optionally followed by one or more arguments separated by commas. (The
format string should not contain the name of the device, or the type of
the message, e.g., "error:..."; these things are automatically prefixed.)
Either both of level
and groups
may be
omitted, or only the latter; i.e., if groups
is
specified, then level
must also be given explicitly.
A Simics user can configure the logging facility to show only specific messages, by matching on the three main properties of each message:
-
The
log-type
specifies the general category of the message. The value must be one of the identifiersinfo
,warning
,error
,critical
,spec_viol
, orunimpl
. -
The
level
specifies at what verbosity level the log messages are displayed. The value must be an integer from 1 to 4; if omitted, the default level is 1. The different levels have the following meaning:-
Important messages (displayed at the normal verbosity level)
-
High level informative messages (like mode changes and important events)
-
Medium level information (the lowest log level for SW development)
-
Debugging level with low level model detail (Mainly used for model development)
-
-
If
subsequent_level
is specified, then all logs after the first issued will be on the levelsubsequent_level
. You are allowed to specify asubsequent_level
of 5, meaning no logging after the initial log. -
The
groups
argument is an integer whose bit representation is used to select which log groups the message belongs to. If omitted, the default value is 0. The log groups are specific for the device, and must be declared using theloggroup
device-level declaration. For example, a DML source file containing the declarationsloggroup good; loggroup bad; loggroup ugly;
could also contain a log statement such as
log info, 2, (bad | ugly): "...";
(note the
|
bitwise-or operator), which would be displayed if the user chooses to view messages from groupbad
orugly
, but not if only groupgood
is shown.Groups allow the user to create arbitrary classifications of log messages, e.g., to indicate things that occur in different states, or in different parts of the device, etc. The two log groups
Register_Read
andRegister_Write
are predefined by DML, and are used by several of the built-in methods.
The format-string
should be one or several string
literals concatenated by the '+' operator, all optionally surrounded
by round brackets.
See also Simics Model Builder User's Guide, section "Logging", for further details.
assert expr;
Evaluates expr
. If the result is true
, the
statement has no effect; otherwise, a runtime-error is generated.
expr
must have type bool
.
error [string];
Attempting to compile an error
statement causes the compiler to
generate an error, using the specified string as error message. The
string may be omitted; in that case, a default error message is used.
The string
, if present, should be one or several
string literals concatenated by the '+' operator, all optionally
surrounded by round brackets.
foreach identifier in (expr) statement
The foreach
statement repeats its body (the
statement
part) once for each element given by expr
.
The identifier
is used to refer to the current element
within the body.
DML currently only supports foreach
iteration on values of sequence
types
— which are created through Each-In expressions.
The continue
statement can be used within a foreach
loop to continue to the
next element, and the break
statement can be used to exit the loop.
#foreach identifier in (expr) statement
In this alternative form the expr
is required
to be a DML compile-time constant,
and the loop is completely unrolled by the DML compiler.
This can be combined with tests on the value of
identifier
within the body, which will be evaluated at
compile time.
DML currently only supports #foreach
iteration on compile-time list
constants.
For example:
#foreach x in ([3,2,1]) {
#if (x == 1) foo();
#else #if (x == 2) bar();
#else #if (x == 3) baz();
#else error "out of range";
}
would be equivalent to
baz();
bar();
foo();
Only #if
can be used to make such selections; switch
or
if
statements are not evaluated at compile time. (Also
note the use of error
above to catch any compile-time
mistakes.)
The break
statement can be used within a #foreach
loop to exit it.
select identifier in (expr) where (cond-expr) statement else default-statement
The select
statement resembles a C switch
statement and is very similar
to the foreach
statement, but executes the statement
exactly once for the
first matching element of those given by expr
, i.e., for the first element
such that cond-expr
is true
; or if no element matches, it executes the
default-statement
.
#select identifier in (expr) where (cond-expr) statement #else default-statement
In this alternative form the expr
is required to be a DML
compile-time constant, and
cond-expr
can only depend on compile-time constants, apart
from identifier
. The selection will then be performed by
the DML compiler at compile-time, and code will only be generated for
the selected case.
DML currently only supports #select
iteration on compile-time list
constants.
Note: The select
statement has been temporarily removed from DML 1.4 due
to semantic issues, and only the #select
form may currently be used.
The select
statement will be reintroduced in the near future.
#if (condition) { true_body } #else { false_body }
The #if
statement resembles a C if
statement. The difference
being that the #if
statement must have a constant-valued
condition and the statement is evaluated at compile-time.
The true_body of the #if
is only processed
if the condition evaluates to true
,
and will be dead-code eliminated otherwise.
Similarly, the #else
statement can immediately follow the body of an
#if
statement and the false_body will only be processed
if the condition in the preceding #if
evaluates to
false
.
All ISO C operators are available in DML, except for certain limitations
on the comma-operator, the sizeof
operator, and type casts; see
Section Comparison to C/C++. Operators have the same
precedences and semantics as in C
DML adds the following expressions:
undefined
The constant undefined
is an abstract compile-time
only value, mostly used as a default for parameters that are
intended to optionally be overridden. The undefined
expression may only
appear as a parameter value and as argument to
the defined expr
test (see below).
identifier
To reference something in the DML object structure, members may be
selected using
.
and ->
as in C. (However, most objects in the DML
object structure are proper substructures selected with the .
operator.) For example,
this.size # a parameter
dev.bank1 # a bank object
bank1.r0.hard_reset # a method
The DML object structure is a compile-time construction; references to certain objects are not considered to be proper values, and result in compile errors if they occur as standalone expressions.
Some DML objects are proper values, while others are not:
-
session
/saved
variables are proper values -
Composite object references (to
bank
,group
,register
, etc.) are not proper values unless cast to a template type. -
Inside an object array, the index variable (named
i
by default) may evaluate to an unknown index if accessed from a location where the index is not statically known. For instance, ingroup g[i < 4] { #if (i == 0) { ... } }
, the#if
statement is invoked once, statically, across all indices, meaning that thei
reference is an unknown index, and will yield a compile error. -
A reference to a
param
member is a proper value only if the parameter value is a proper value: A parameter value can be a reference to an object, an object array, a list, theundefined
expression, or a static index (discussed above), in which case the parameter is not allowed as a standalone expression. -
When the object structure contains an array of objects, e.g.
register r[4] { ... }
, then a reference to the array itself (i.e.r
as opposed tor[0]
), is not considered a proper value.
If a DML object is not a proper value, then a reference to the object will give a compile error unless it appears in one of the following contexts:
-
As the left operand of the
.
operator -
As the definition of a
param
-
As a list element in a compile-time list
-
As the operand of the
defined
operator -
A
method
object may be called -
An object array may appear in an index expression
array[index]
-
An unknown index may be used as an index to an object array; in the resulting object reference, the corresponding index variable of the object array will have an unknown value.
It is possible to retrieve a function pointer for a method by using the prefix
operator &
with a reference to that method. The methods this is possible with
are subject to the same restrictions as with the export
object
statement: it's not possible to retrieve a function
pointer to any inline method, shared method, method that throws, method with
more than one return argument, or method declared inside an object array.
For example, with the following method in DML:
method my_method(int x) { ... }
then the expression &my_method
will be a function pointer of type:
void (*)(conf_object_t *, int);
The conf_object_t *
parameter corresponds to the device instance, and is
omitted when the referenced method is independent.
Note that due to the precedence rules of &
, if you want to immediately call a
method reference converted to a function pointer, then you will need to wrap
parentheses around the converted method reference. An example of where this may
be useful is in order to call a non-independent method from within an
independent method:
independent method callback(int i, void *aux) {
local conf_object_t *obj = aux;
(&my_method)(obj, i);
}
new type new type[count]
Allocates a chunk of memory large enough for a value of the specified type. If the second form is used, memory for count values will be allocated. The result is a pointer to the allocated memory. (The pointer is never null; if allocation should fail, the Simics application will be terminated.)
When the memory is no longer needed, it should be deallocated using a
delete
statement.
cast(expr, type)
Type casts in DML must be written with the above explicit cast
operator, for syntactical reasons.
Semantically, cast(expr, type)
is equivalent to
the C expression (type) expr
.
sizeoftype type
The sizeof
operator in DML can only be used on expressions,
not on types, for syntactical reasons. To take the size of a datatype,
the sizeoftype
operator must be used, as in
int size = sizeoftype io_memory_interface_t;
Semantically, sizeoftype type
is equivalent to the C
expression sizeof (type)
.
DML does not know the sizes of all types statically; DML usually regards a
sizeoftype
expression as non-constant and delegates size calculations
to the C compiler. DML does evaluate the sizes of integer types, layout types,
and constant-sized arrays thereof, as constants.
defined expr
This compile-time test evaluates to false
if
expr
has the value undefined
, and to
true
otherwise.
An expression each
-in
is available to traverse all objects
that implement a specific template. This can be used as a generic hook
mechanism for a specific template, e.g. to implement custom reset patterns.
For example, the following can be used to reset all registers in the bank
regs
:
foreach obj in (each hard_reset_t in (regs)) {
obj.hard_reset();
}
An each
-in
expression can currently only be used for
iteration in a foreach
statement. The expression's type
is sequence(template-name)
.
An each
-in
expression searches recursively in the
object hierarchy for objects implementing the template, but once it
finds such an object, it does not continue searching inside that
subobject. Recursive traversal can be achieved by letting the
template itself contain a method that descends into subobjects; the
implementation of hard_reset
in utility.dml
demonstrates how this can be done.
The order in which objects are given by a specific each
-in
expression is not
defined, except for that it is deterministic. That is, for a particular choice
of template X
and object Y
in an each X in (Y)
expression, for a
particular iteration of the device model, and for the particular DMLC build
used, the order in which objects are given by that expression is guaranteed to
be consistent.
[e1, ..., eN]
A list is a compile-time only value, and is an ordered sequence
of zero or more compile-time constant values. Lists are in particular
used in combination with foreach
and select
statements.
A list expression may only appear in the following contexts:
-
As the list to iterate over in a
#foreach
or#select
statement -
As the value in a
param
orconstant
declaration -
As a list element in another compile-time list
-
In an index expression,
list[index]
-
As the operand of the
defined
operator
list.len sequence.len object-array.len value-array.len
Used to obtain the length of a list
,
sequence
, object-array
,
or value-array
expression.
This expression is constant for each form but sequence
expressions.
The value-array
form can only be used with arrays of known
constant size: it can't be used with pointers, arrays of unknown size,
or variable-length arrays.
expr[e1:e2] expr[e1:e2, bitorder] expr[e1] expr[e1, bitorder]
If expr
is of integer type, then the above
bit-slicing syntax can be used in DML to simplify extracting or
updating particular bit fields of the integer. Bit slice syntax can be
used both as an expression producing a value, or as the target of an
assignment (an L-value), e.g., on the left-hand side of an =
operator.
Both e1
and e2
must be integers. The
syntax expr[e1]
is a short-hand for
expr[e1:e1]
(but only evaluating
e1
once).
The bitorder
part is optional, and selects the bit
numbering scheme (the "endianness") used to interpret the values of
e1
and e2
. If present, it must be one
of the identifiers be
or le
, just as in the
bitorder
device-level declaration. If no
bitorder
is given in the expression, the global bit
numbering (as defined by the bitorder
declaration) is used.
The first bit index e1
always indicates the most
significant bit of the field,
regardless of the bit numbering scheme. If the default
little-endian bit numbering is used, the least significant bit of the
integer has index zero, and the most significant bit of the integer has
index n - 1, where n is the width of the integer type.
If big-endian bit numbering is used, e.g., due to a bitorder be;
declaration in the file, or using a specific local bit
numbering as in expr[e1:e2, be]
, then
the bit corresponding to the little-endian bit number n - 1 has
index zero, and the least significant bit has the index n - 1,
where n is the bit width of expr
. Note that
big-endian numbering is illegal if expr
isn't a simple
expression with a well-defined bit width. This means that only local
variables, method parameters, device variables (registers, data etc),
and explicit cast expressions are allowed. For little-endian
numbering, any expressions are allowed, since there is never any doubt
that bit 0 is the least significant bit.
If the bit-slicing expression results in a zero or negative sized range of bits, the behavior is undefined.
stringify(expr)
Translates the value of expr
(which must be a
compile-time constant) into a string constant. This is similar to the
use of #
in the C preprocessor, but is performed on the level
of compile time values, not tokens. The result is often used with the
+
string operator.
expr1 + expr2
If both expr1
and expr2
are compile-time
string constants, the expression expr1 + expr2
concatenates the two strings at compile time. This is often used in
combination with the #
operator, or to break long lines for
source code formatting purposes.
condition #? expr1 #: expr2
Similar to the C conditional
expression, with the difference
that the condition must have a constant value and the
expression is evaluated at compile-time.
expr1 is only processed if the
condition is true
and expr2 is only processed
if condition is false
, so an expression like
false #? 1/0 #: 0
is equivalent to 0
.