www.digitalmars.com

D Programming Language 1.0


Last update Sun Dec 30 20:34:43 2012

Application Binary Interface

A D implementation that conforms to the D ABI (Application Binary Interface) will be able to generate libraries, DLLs, etc., that can interoperate with D binaries built by other implementations.

C ABI

The C ABI referred to in this specification means the C Application Binary Interface of the target system. C and D code should be freely linkable together; in particular, D code shall have access to the entire C ABI runtime library.

Endianness

The endianness (byte order) of the layout of the data will conform to the endianness of the target machine. The Intel x86 CPUs are little endian, meaning that the value 0x0A0B0C0D is stored in memory as: 0D 0C 0B 0A.
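As a minimal sketch of the byte order described above, the following D snippet views a 32 bit value as raw bytes. The helper name bytesOf is hypothetical, the code assumes a current D compiler rather than a D 1.0 one, and the check only applies on a little endian target such as x86.

```d
// Return the in-memory bytes of a 32 bit value, lowest address first.
// bytesOf is an illustrative helper, not part of any library.
ubyte[4] bytesOf(uint value)
{
    ubyte[4] result;
    ubyte* p = cast(ubyte*) &value;
    foreach (i; 0 .. 4)
        result[i] = p[i];
    return result;
}

unittest
{
    version (LittleEndian)
    {
        // On a little endian target the least significant byte comes first.
        ubyte[4] expected = [0x0D, 0x0C, 0x0B, 0x0A];
        assert(bytesOf(0x0A0B0C0D) == expected);
    }
}
```

The unittest block can be run by compiling with the -unittest switch.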

Basic Types

bool    8 bit byte with the values 0 for false and 1 for true
byte    8 bit signed value
ubyte   8 bit unsigned value
short   16 bit signed value
ushort  16 bit unsigned value
int     32 bit signed value
uint    32 bit unsigned value
long    64 bit signed value
ulong   64 bit unsigned value
cent    128 bit signed value
ucent   128 bit unsigned value
float   32 bit IEEE 754 floating point value
double  64 bit IEEE 754 floating point value
real    implementation defined floating point value, for x86 it is 80 bit IEEE 754 extended real
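The sizes in the table can be checked at compile time. The sketch below is an illustration under the assumption of a current D compiler; cent and ucent are left out because compilers may not implement them, and real is only bounded from below because its size is implementation defined.

```d
// Compile-time checks of the sizes listed in the table above.
static assert(bool.sizeof   == 1);
static assert(byte.sizeof   == 1 && ubyte.sizeof  == 1);
static assert(short.sizeof  == 2 && ushort.sizeof == 2);
static assert(int.sizeof    == 4 && uint.sizeof   == 4);
static assert(long.sizeof   == 8 && ulong.sizeof  == 8);
static assert(float.sizeof  == 4);
static assert(double.sizeof == 8);

// bool uses the values 0 for false and 1 for true.
static assert(cast(ubyte) false == 0 && cast(ubyte) true == 1);

// real is implementation defined, so only a lower bound is checked here.
static assert(real.sizeof >= double.sizeof);
```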

Delegates

Delegates are fat pointers with two parts:

Delegate Layout
offset property contents
0 .ptr context pointer
ptrsize .funcptr pointer to function

The context pointer can be a class this reference, a struct this pointer, a pointer to a closure (nested functions) or a pointer to an enclosing function's stack frame (nested functions).
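The two parts of a delegate can be inspected through the .ptr and .funcptr properties. The following is a sketch only: the Counter struct is a hypothetical example type, and the raw reinterpretation at the end assumes the layout given in the table above.

```d
// Counter is a hypothetical type used only for this illustration.
struct Counter
{
    int count;
    int next() { return ++count; }
}

unittest
{
    Counter c;
    int delegate() dg = &c.next;

    // A delegate is two pointers wide.
    static assert(dg.sizeof == 2 * (void*).sizeof);

    // .ptr is the context pointer, here the struct's this pointer.
    assert(dg.ptr is &c);
    // .funcptr is the address of the member function.
    assert(cast(void*) dg.funcptr is cast(void*) &Counter.next);

    // Reinterpret the delegate as two raw pointers:
    // the context pointer at offset 0, the function pointer at offset ptrsize.
    void*[2] raw = *cast(void*[2]*) &dg;
    assert(raw[0] is dg.ptr);
    assert(raw[1] is cast(void*) dg.funcptr);

    // Calling through the delegate updates the original struct.
    assert(dg() == 1 && c.count == 1);
}
```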

Structs

Conforms to the target's C ABI struct layout.

Classes

An object consists of:

Class Object Layout
size property contents
ptrsize .__vptr pointer to vtable
ptrsize .__monitor monitor
... ... super's non-static fields and super's interface vptrs, from least to most derived
... named fields non-static fields
ptrsize...   vptrs for any interfaces implemented by this class, in left to right, most to least derived order

The vtable consists of:

Virtual Function Pointer Table Layout
size contents
ptrsize pointer to instance of ClassInfo
ptrsize... pointers to virtual member functions

Casting a class object to an interface consists of adding the offset of the interface's corresponding vptr to the address of the base of the object. Casting an interface ptr back to the class type it came from involves getting the correct offset to subtract from it from the object.Interface entry at vtbl[0]. Adjustor thunks are created and pointers to them stored in the method entries in the vtbl[] in order to set the this pointer to the start of the object instance corresponding to the implementing method.

An adjustor thunk looks like:

  ADD EAX,offset
  JMP method

The interfaces along the leftmost side of the inheritance graph all share their vptrs; this is the single inheritance model. Every time the inheritance graph forks (for multiple inheritance), a new vptr is created and stored in the class instance. Every time a virtual method is overridden, a new vtbl[] must be created with the updated method pointers in it.

The class definition:

class XXXX {
  ....
};

Generates the following:

Interfaces

An interface is a pointer to a pointer to a vtbl[]. The vtbl[0] entry is a pointer to the corresponding instance of the object.Interface class. The rest of the vtbl[1..$] entries are pointers to the virtual functions implemented by that interface, in the order that they were declared.

A COM interface differs from a regular interface in that there is no object.Interface entry in vtbl[0]; the entries vtbl[0..$] are all the virtual function pointers, in the order that they were declared. This matches the COM object layout used by Windows.

Arrays

A dynamic array consists of:

Dynamic Array Layout
offset property contents
0 .length array dimension
size_t.sizeof .ptr pointer to array data

A dynamic array is declared as:

type[] array;

whereas a static array is declared as:

type[dimension] array;

Thus, a static array always has the dimension statically available as part of the type, and so it is implemented as in C. Static arrays and dynamic arrays can easily be converted back and forth to each other.
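The layout and the conversions can be illustrated with a short sketch. RawArray is a hypothetical mirror of the dynamic array layout, not a type from the runtime, and the code assumes a current D compiler.

```d
// Mirror of the dynamic array layout above. RawArray is a
// hypothetical name used only for this illustration.
struct RawArray
{
    size_t length; // offset 0: array dimension
    void*  ptr;    // offset size_t.sizeof: pointer to array data
}

unittest
{
    int[3] fixed = [10, 20, 30];   // static array, dimension is in the type
    int[] dyn = fixed[];           // static to dynamic: slice the whole array

    static assert(fixed.sizeof == 3 * int.sizeof);
    static assert(dyn.sizeof == RawArray.sizeof);

    // Reinterpret the dynamic array as its two-word representation.
    RawArray raw = *cast(RawArray*) &dyn;
    assert(raw.length == 3);
    assert(raw.ptr is fixed.ptr);

    // Dynamic back to static: copy a slice of matching length.
    int[3] copy = dyn[0 .. 3];
    assert(copy == fixed);
}
```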

Associative Arrays

Associative arrays consist of a pointer to an opaque, implementation defined type. The current implementation is contained in and defined by internal/aaA.d.

Reference Types

D has reference types, but they are implicit. For example, classes are always referred to by reference; this means that class instances can never reside on the stack or be passed as function parameters.

When passing a static array to a function, the result, although declared as a static array, will actually be a reference to a static array. For example:

int[3] abc;

Passing abc to functions results in these implicit conversions:

void func(int[3] array); // actually <reference to><array[3] of><int>
void func(int* p);       // abc is converted to a pointer
                         // to the first element
void func(int[] array);  // abc is converted to a dynamic array

Name Mangling

D accomplishes typesafe linking by mangling a D identifier to include scope and type information.

MangledName:
    _D QualifiedName Type
    _D QualifiedName M Type

QualifiedName:
    SymbolName
    SymbolName QualifiedName

SymbolName:
    LName
    TemplateInstanceName

The M means that the symbol is a function that requires a this pointer.

Template Instance Names have the types and values of their parameters encoded into them:

TemplateInstanceName:
     __T LName TemplateArgs Z

TemplateArgs:
    TemplateArg
    TemplateArg TemplateArgs

TemplateArg:
    T Type
    V Type Value
    S LName

Value:
    n
    Number
    i Number
    N Number
    e HexFloat
    c HexFloat c HexFloat
    Width Number _ HexDigits
    A Number Value...
    S Number Value...

HexFloat:
    NAN
    INF
    NINF
    N HexDigits P Exponent
    HexDigits P Exponent

Exponent:
    N Number
    Number

HexDigits:
    HexDigit
    HexDigit HexDigits

HexDigit:
    Digit
    A
    B
    C
    D
    E
    F

n
    is for null arguments.
Number
    is for positive numeric literals (including character literals).
N Number
    is for negative numeric literals.
e HexFloat
    is for real and imaginary floating point literals.
c HexFloat c HexFloat
    is for complex floating point literals.
Width Number _ HexDigits
    Width is whether the characters are 1 byte (a), 2 bytes (w) or 4 bytes (d) in size. Number is the number of characters in the string. The HexDigits are the hex data for the string.
A Number Value...
    An array or associative array literal. Number is the length of the array. Value is repeated Number times for a normal array, and 2 * Number times for an associative array.
S Number Value...
    A struct literal. Value is repeated Number times.

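Two of the Value forms above can be sketched as small encoders. The function names mangleIntValue and mangleStringValue are hypothetical, only 1 byte character strings are handled, and the use of upper case hex digits is an assumption taken from the HexDigit grammar.

```d
import std.conv : to;
import std.format : format;

// Integer literal: Number when non-negative, N Number when negative.
string mangleIntValue(long v)
{
    if (v < 0)
        return "N" ~ to!string(cast(ulong)(-v));
    return to!string(v);
}

// String literal of 1 byte characters: Width (a), Number, _, HexDigits.
string mangleStringValue(string s)
{
    string result = "a" ~ to!string(s.length) ~ "_";
    foreach (char ch; s)
        result ~= format("%02X", cast(ubyte) ch);
    return result;
}

unittest
{
    assert(mangleIntValue(42) == "42");
    assert(mangleIntValue(-7) == "N7");
    assert(mangleStringValue("hi") == "a2_6869");
}
```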
Name:
    Namestart
    Namestart Namechars

Namestart:
    _
    Alpha

Namechar:
    Namestart
    Digit

Namechars:
    Namechar
    Namechar Namechars

A Name is a standard D identifier.

LName:
    Number Name

Number:
    Digit
    Digit Number

Digit:
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9

An LName is a name preceded by a Number giving the number of characters in the Name.
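The LName and MangledName rules can be sketched as follows. The helper names lname, qualifiedName and mangle are hypothetical, and the Type part is passed in as an already mangled string rather than computed.

```d
import std.conv : to;

// LName: the identifier preceded by its length in decimal.
string lname(string name)
{
    return to!string(name.length) ~ name;
}

// QualifiedName: the LName of each scope, outermost first.
string qualifiedName(string[] parts)
{
    string result;
    foreach (part; parts)
        result ~= lname(part);
    return result;
}

// MangledName: _D QualifiedName [M] Type.
// typeCode is an already mangled Type string supplied by the caller.
string mangle(string[] parts, string typeCode, bool needsThis = false)
{
    return "_D" ~ qualifiedName(parts) ~ (needsThis ? "M" : "") ~ typeCode;
}

unittest
{
    assert(lname("foo") == "3foo");
    // A function bar in module foo, taking an int and returning void.
    assert(mangle(["foo", "bar"], "FiZv") == "_D3foo3barFiZv");
    // The same shape for a member function that requires a this pointer.
    assert(mangle(["foo", "Baz", "bar"], "FiZv", true) == "_D3foo3Baz3barMFiZv");
}
```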

Type Mangling

Types are mangled using a simple linear scheme:

Type:
    Shared
    Const
    Immutable
    Wild
    TypeArray
    TypeStaticArray
    TypeAssocArray
    TypePointer
    TypeFunction
    TypeIdent
    TypeClass
    TypeStruct
    TypeEnum
    TypeTypedef
    TypeDelegate
    TypeNone
    TypeVoid
    TypeByte
    TypeUbyte
    TypeShort
    TypeUshort
    TypeInt
    TypeUint
    TypeLong
    TypeUlong
    TypeFloat
    TypeDouble
    TypeReal
    TypeIfloat
    TypeIdouble
    TypeIreal
    TypeCfloat
    TypeCdouble
    TypeCreal
    TypeBool
    TypeChar
    TypeWchar
    TypeDchar
    TypeTuple

Shared:
    O Type

Const:
    x Type

Immutable:
    y Type

Wild:
    Ng Type

TypeArray:
    A Type


TypeStaticArray:
    G Number Type

TypeAssocArray:
    H Type Type

TypePointer:
    P Type

TypeFunction:
    CallConvention Arguments ArgClose Type

CallConvention:
    F       // D
    U       // C
    W       // Windows
    V       // Pascal
    R       // C++



Arguments:
    Argument
    Argument Arguments

Argument:
    Type
    J Type     // out
    K Type     // ref
    L Type     // lazy

ArgClose:
    X     // variadic (T t...) style
    Y     // variadic (T t, ...) style
    Z     // not variadic

TypeIdent:
    I LName

TypeClass:
    C LName

TypeStruct:
    S LName

TypeEnum:
    E LName

TypeTypedef:
    T LName

TypeDelegate:
    D TypeFunction

TypeNone:
    n

TypeVoid:
    v

TypeByte:
    g

TypeUbyte:
    h

TypeShort:
    s

TypeUshort:
    t

TypeInt:
    i

TypeUint:
    k

TypeLong:
    l

TypeUlong:
    m

TypeFloat:
    f

TypeDouble:
    d

TypeReal:
    e

TypeIfloat:
    o

TypeIdouble:
    p

TypeIreal:
    j

TypeCfloat:
    q

TypeCdouble:
    r

TypeCreal:
    c

TypeBool:
    b

TypeChar:
    a

TypeWchar:
    u

TypeDchar:
    w

TypeTuple:
    B Number Arguments
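Because the scheme is linear, a type can be decoded with a small recursive function. The sketch below handles only the basic types, arrays, static arrays, associative arrays and pointers; decodeType is a hypothetical name, and the key-then-value order assumed for H Type Type is not stated by the grammar above.

```d
import std.ascii : isDigit;

// Decode one Type from the front of s, consuming what it reads.
// Only a subset of the grammar above is handled in this sketch.
string decodeType(ref string s)
{
    if (s.length == 0)
        throw new Exception("unexpected end of mangled type");
    immutable char c = s[0];
    s = s[1 .. $];
    switch (c)
    {
        case 'v': return "void";
        case 'g': return "byte";
        case 'h': return "ubyte";
        case 's': return "short";
        case 't': return "ushort";
        case 'i': return "int";
        case 'k': return "uint";
        case 'l': return "long";
        case 'm': return "ulong";
        case 'f': return "float";
        case 'd': return "double";
        case 'e': return "real";
        case 'b': return "bool";
        case 'a': return "char";
        case 'u': return "wchar";
        case 'w': return "dchar";
        case 'A': return decodeType(s) ~ "[]";   // TypeArray: A Type
        case 'P': return decodeType(s) ~ "*";    // TypePointer: P Type
        case 'G':
        {
            // TypeStaticArray: G Number Type
            size_t n = 0;
            while (n < s.length && isDigit(s[n]))
                ++n;
            string dim = s[0 .. n];
            s = s[n .. $];
            return decodeType(s) ~ "[" ~ dim ~ "]";
        }
        case 'H':
        {
            // TypeAssocArray: H Type Type
            // Assumption: the key type comes first, then the value type.
            string key = decodeType(s);
            string value = decodeType(s);
            return value ~ "[" ~ key ~ "]";
        }
        default:
            throw new Exception("unsupported type code: " ~ c);
    }
}

unittest
{
    string m = "APi";
    assert(decodeType(m) == "int*[]");
    assert(m.length == 0);   // the whole string was consumed
    m = "G3i";
    assert(decodeType(m) == "int[3]");
    m = "HAai";
    assert(decodeType(m) == "int[char[]]");
}
```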

Function Calling Conventions

The extern (C) and extern (D) calling conventions match the C calling convention used by the supported C compiler on the host system, except that the extern (D) calling convention for Windows x86 is as described here.

Register Conventions

Return Value

Parameters

The parameters to the non-variadic function:

	foo(a1, a2, ..., an);

are passed as follows:

a1
a2
...
an
hidden
this

where hidden is present if needed to return a struct value, and this is present if needed as the this pointer for a member function or the context pointer for a nested function.

The last parameter is passed in EAX rather than being pushed on the stack if the following conditions are met:

Parameters are always pushed as multiples of 4 bytes, rounding upwards, so the stack is always aligned on 4 byte boundaries. They are pushed most significant first. out and ref are passed as pointers. Static arrays are passed as pointers to their first element. On Windows, a real is pushed as a 10 byte quantity, a creal is pushed as a 20 byte quantity. On Linux, a real is pushed as a 12 byte quantity, a creal is pushed as two 12 byte quantities. The extra two bytes of pad occupy the ‘most significant’ position.
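The rounding rule for ordinary parameters can be written as a one-line helper. This is a sketch only: pushedSize is a hypothetical name, and it does not model the real and creal cases called out above.

```d
// Round a parameter's size up to the next multiple of 4 bytes.
// pushedSize is an illustrative helper, not part of any library.
size_t pushedSize(size_t size)
{
    return (size + 3) & ~cast(size_t) 3;
}

unittest
{
    assert(pushedSize(byte.sizeof)  == 4);
    assert(pushedSize(short.sizeof) == 4);
    assert(pushedSize(int.sizeof)   == 4);
    assert(pushedSize(long.sizeof)  == 8);
}
```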

The callee cleans the stack.

The parameters to the variadic function:

	void foo(int p1, int p2, int[] p3...)
	foo(a1, a2, ..., an);

are passed as follows:

p1
p2
a3
hidden
this

The variadic part is converted to a dynamic array and the rest is the same as for non-variadic functions.
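At the source level, the conversion of the variadic part to a dynamic array looks like this. The function sum is a hypothetical example with the same parameter shape as foo above.

```d
// Typesafe variadic function: the trailing arguments arrive as one int[].
int sum(int p1, int p2, int[] p3...)
{
    int total = p1 + p2;
    foreach (x; p3)
        total += x;
    return total;
}

unittest
{
    assert(sum(1, 2) == 3);             // p3 is an empty array
    assert(sum(1, 2, 3, 4, 5) == 15);   // p3 holds 3, 4 and 5
    int[] rest = [3, 4, 5];
    assert(sum(1, 2, rest) == 15);      // an existing array can be passed directly
}
```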

The parameters to the variadic function:

	void foo(int p1, int p2, ...)
	foo(a1, a2, a3, ..., an);

are passed as follows:

an
...
a3
a2
a1
_arguments
hidden
this

The caller is expected to clean the stack. _argptr is not passed; it is computed by the callee.

Exception Handling

Windows

Conforms to the Microsoft Windows Structured Exception Handling conventions.

Linux, FreeBSD and OS X

Uses static address range/handler tables. It is not compatible with the ELF/Mach-O exception handling tables. The stack is walked assuming it uses the EBP/RBP stack frame convention. The EBP/RBP convention must be used for every function that has an associated EH (Exception Handler) table.

For each function that has exception handlers, an EH table entry is generated.

EH Table Entry
field description
void* pointer to start of function
DHandlerTable* pointer to corresponding EH data
uint size in bytes of the function
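A lookup over such address range tables might be sketched as below. The struct and field names (EHTableEntry, fptr, handlerTable, fsize) and the function findEntry are hypothetical; only the field order and types follow the EH Table Entry table, and the actual runtime implementation may differ.

```d
// Opaque here; its layout is not needed for the lookup.
struct DHandlerTable;

// Field names are hypothetical; the field order and types
// follow the EH Table Entry table above.
struct EHTableEntry
{
    void* fptr;                   // pointer to start of function
    DHandlerTable* handlerTable;  // pointer to corresponding EH data
    uint fsize;                   // size in bytes of the function
}

// Find the entry whose address range [fptr, fptr + fsize) contains address.
// Returns null when no function with an EH table covers the address.
const(EHTableEntry)* findEntry(const(EHTableEntry)[] table, const(void)* address)
{
    foreach (ref entry; table)
    {
        auto start = cast(size_t) entry.fptr;
        auto addr = cast(size_t) address;
        if (addr >= start && addr < start + entry.fsize)
            return &entry;
    }
    return null;
}

unittest
{
    // Synthetic entries with made-up addresses, for illustration only.
    EHTableEntry[2] table;
    table[0] = EHTableEntry(cast(void*) 0x1000, null, 0x20);
    table[1] = EHTableEntry(cast(void*) 0x2000, null, 0x10);

    assert(findEntry(table[], cast(void*) 0x1010) is &table[0]);
    assert(findEntry(table[], cast(void*) 0x200F) is &table[1]);
    assert(findEntry(table[], cast(void*) 0x1020) is null); // one past the end
}
```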

The EH table entries are placed into the following special segments, which are concatenated by the linker.

EH Table Segment
Operating System Segment Name
Windows FI
Linux .deh_eh
FreeBSD .deh_eh
OS X __deh_eh, __DATA

The rest of the EH data can be placed anywhere; it is immutable.

DHandlerTable
field description
void* pointer to start of function
uint offset of ESP/RSP from EBP/RBP
uint offset from start of function to return code
uint number of entries in DHandlerInfo[]
DHandlerInfo[] array of handler information

DHandlerInfo
field description
uint offset from function address to start of guarded section
uint offset of end of guarded section
int previous table index
uint if != 0 offset to DCatchInfo data from start of table
void* if not null, pointer to finally code to execute

DCatchInfo
field description
uint number of entries in DCatchBlock[]
DCatchBlock[] array of catch information

DCatchBlock
field description
ClassInfo catch type
uint offset from EBP/RBP to catch variable
void* catch handler code

Garbage Collection

The interface to this is found in phobos/internal/gc.

Runtime Helper Functions

These are found in phobos/internal.

Module Initialization and Termination

All the static constructors for a module are aggregated into a single function, and a pointer to that function is inserted into the ctor member of the ModuleInfo instance for that module.

All the static destructors for a module are aggregated into a single function, and a pointer to that function is inserted into the dtor member of the ModuleInfo instance for that module.

Unit Testing

All the unit tests for a module are aggregated into a single function, and a pointer to that function is inserted into the unitTest member of the ModuleInfo instance for that module.
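From the programmer's side, the aggregation is visible as follows. This is a small sketch with a hypothetical module-level variable initOrder; it assumes a current D compiler and is compiled with -unittest so that the unit test runs after module construction.

```d
// Module-level state used to observe the order of initialization.
int[] initOrder;

static this() { initOrder ~= 1; }    // first static constructor in the module
static this() { initOrder ~= 2; }    // second one, aggregated with the first

static ~this() { initOrder = null; } // static destructors run at termination

unittest
{
    // Unit tests run after module construction, so both constructors
    // have already executed, in lexical order.
    assert(initOrder == [1, 2]);
}
```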

Symbolic Debugging

D has types that are not represented in existing C or C++ debuggers. These are dynamic arrays, associative arrays, and delegates. Representing these types as structs causes problems because function calling conventions for structs are often different than that for these types, which causes C/C++ debuggers to misrepresent things. For these debuggers, they are represented as a C type which does match the calling conventions for the type. The dmd compiler will generate only C symbolic type info with the -gc compiler switch.

Types for C Debuggers
D type C representation
dynamic array unsigned long long
associative array void*
delegate long long
dchar unsigned long

For debuggers that can be modified to accept new types, the following extensions help them fully support the types.

Codeview Debugger Extensions

The D dchar type is represented by the special primitive type 0x78.

D makes use of the Codeview OEM generic type record indicated by LF_OEM (0x0015). The format is:

Codeview OEM Extensions for D
D Type             Leaf Index  OEM Identifier  recOEM  num indices  type index  type index
field size         2           2               2       2            2           2
dynamic array      LF_OEM      OEM             1       2            @index      @element
associative array  LF_OEM      OEM             2       2            @key        @element
delegate           LF_OEM      OEM             3       2            @this       @function

Where:

OEM 0x42
index type index of array index
key type index of key
element type index of array element
this type index of context pointer
function type index of function

These extensions can be pretty-printed by obj2asm.

The Ddbg debugger supports them.

Dwarf Debugger Extensions

The following leaf types are added:

Dwarf Extensions for D
D type Identifier Value Format
dynamic array DW_TAG_darray_type 0x41 DW_AT_type is element type
associative array DW_TAG_aarray_type 0x42 DW_AT_type is element type, DW_AT_containing_type is key type
delegate DW_TAG_delegate_type 0x43 DW_AT_type is function type, DW_AT_containing_type is ‘this’ type

These extensions can be pretty-printed by dumpobj.

The ZeroBUGS debugger supports them.

Note that these Dwarf extensions have been removed as they conflict with recent gcc additions.




