Using Assembly Language Functions
This chapter describes how to call assembly language functions from both C and C++ and how to create an interface to assembly language modules. It explains conventions for function return values, register usage, and data alignment at the assembly language level.
Conventions for both 16- and 32-bit compilations are covered. When describing register usage and contents, the name of the corresponding 32-bit register appears in parentheses after the name of a 16-bit register.
For information about the advantages of writing assembly language code inline, instead of assembling it separately, see Using the Inline Assembler.
What's in This Chapter
- How to call assembly language code from C.
- How DMC++ object files are organized.
- Function return values and register usage.
- How to create assembly language routines with C linkage.
Implications of Type-Safe Linkage
Type-safe linkage affects how you call assembly language functions from a C++ program. You cannot use the standard C-to-assembly language interface for C++ functions for the following reasons:
- Type-safe linkage means that the compiler internally modifies (mangles) the names of all C++ functions, decorating them with type information.
- The name mangling algorithm uses special characters that are not valid for C identifiers.
- DMC++ takes advantage of the requirement to distinguish C functions from C++ functions and uses a more efficient parameter-passing method in C++ functions than in C functions (while still allowing functions with variable numbers of arguments).
- C++ class member functions that are not static have an additional hidden argument, the this pointer.
The Easy Way to Call Assembly Code from C++
In many cases in which you want to call an assembly language routine from a C++ function, you can use the following method, which does not require you to worry about function-naming or parameter-passing conventions:

    extern "C" { int assembler_routine(int x); }

This method tells the C++ compiler through its function prototype that your assembly language routine uses C linkage. This is the easiest method of specifying C linkage and does not involve any change to the naming of the assembly language routine.
For complete information, see the section "Creating Routines With C++ Linkage" in this chapter.
Using existing assembly language modules
If you already have some assembly language routines written for use with DMC or Microsoft C, you can almost certainly use them with DMC++. However, you will need an ANSI C standard header file containing the function prototypes for these routines, and you will need to modify it to declare the functions as having C linkage.
The best method of specifying C linkage is to enclose the prototypes of your assembly language functions in the braces of an extern "C" {} statement, as shown in the section "The Easy Way to Call Assembly Code from C++" above. The advantage is that you can use the same routine with both C and C++ modules.
Provided you include the header file containing the function prototypes in all the source files that use the assembly language routines, you will not even have to reassemble their code.
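A header that is shared between C and C++ modules is commonly guarded with the standard `__cplusplus` macro, so that only the C++ compiler sees the extern "C" wrapper. The sketch below assumes a hypothetical header name (asm_routines.h) and reuses the assembler_routine prototype from the earlier section. The C++ body of assembler_routine is only a stand-in so that the sketch compiles without an assembler; in a real build the symbol would come from the assembled object file.

```cpp
// asm_routines.h (hypothetical file name): prototypes for assembly routines.
#ifdef __cplusplus
extern "C" {                    // seen only by C++ compilers: request C linkage
#endif

int assembler_routine(int x);

#ifdef __cplusplus
}
#endif

// Stand-in definition (assumption for this sketch only). Normally this
// symbol is supplied by the assembled module, not by C++ source.
extern "C" int assembler_routine(int x)
{
    return x + 1;               // placeholder behavior, chosen arbitrarily
}

// A C++ caller uses the routine like any other function.
int call_from_cpp(int value)
{
    return assembler_routine(value);
}
```

Because the wrapper disappears when the header is compiled as C, the same file can be included unchanged from both kinds of source file.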
Similarly, when calling a C++ function from an assembly language routine, declare that function as having C linkage in your C++ program. The only exception to this rule is member functions, which cannot be given C linkage.
Organization of Object Files
Digital Mars .com files are not the same as those produced by other compilers. In most other compilers, CS==SS==DS for .com files, and the entire size of the program, plus stack and heap, must be less than 64KB. In Digital Mars .com files, only the size of the code plus DGROUP areas must be less than 64KB. Considerably larger .com programs can thus be created. Also, the only difference between a Digital Mars Tiny model program and a Small model program is how it is linked.
In all but the Tiny model, the STACK segment is set to 128 bytes in length. This is enough to allow the operating system to start up the program. Code in the C++ startup module, c.asm, then allocates a full stack elsewhere. The 128 bytes are subsequently used to store the program command line so that it is addressable using the DS register. In the Tiny model, the STACK segment is zero bytes in length.
All BSS segments are cleared to 0 by the startup module, regardless of the memory model in use.
For the Tiny, Small, and Medium models, there are two schemes for allocation of the near heap. These schemes are selected by the value of the global variable _okbigbuf. For more information on memory allocation, see "Choosing a memory model" in Compiling Code.
Layout of Assembly Language Modules
To work with DMC++, assembly language code must be divided into code and data segments. Executable code and functions callable from C or C++ go into the code segment. Static and global data declarations go into the data segment.
The pseudo-ops for defining the code and data segments are different for each memory model. Therefore, use the macros begcode, endcode, begdata, and enddata defined in macros.asm. The general layout for an .asm source file is:

    INCLUDE MACROS.ASM      ;define memory model macros
                            ;EXTRN statements for C/C++
                            ;functions to call go here
    begdata                 ;define start of data
                            ;EXTRN statements for
                            ;external data globals go here
    enddata                 ;define end of data segment
    begcode modulename      ;define start of code
                            ;executable code goes here
    endcode modulename      ;define end of code segment
    END                     ;define end of module

Function Return Values for 16-Bit Models
For the 16-bit memory models (Tiny, Small, Medium, Compact, and Large), near pointers, ints, unsigned ints, and shorts are returned in AX. Chars are returned in AL. Longs and unsigned longs are returned in DX, AX, where DX contains the most significant 16 bits and AX contains the least significant 16 bits. Far pointers are returned in DX, AX, where DX has the segment portion and AX has the offset.
When C linkage is in effect, floats are returned in DX, AX, and doubles are returned in AX, BX, CX, DX, where AX contains the most significant 16 bits and DX contains the least significant. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both of these techniques are reentrant.
1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in DX, AX. With larger structures, the method used depends on the linkage of the function. With C linkage, a function that returns a structure actually returns a pointer to the structure, which is held in the static data segment. This means that C functions that return structures are not reentrant. With C++ linkage, the compiler creates a temporary copy on the stack and returns a pointer to it, which is reentrant.
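The DX, AX convention for 32-bit values can be illustrated in portable C++. The sketch below is not compiler output; the type DxAx and the helpers split_long and join_long are names invented here to show which half of a long (or which part of a far pointer) lands in which register.

```cpp
#include <cstdint>

// Register pair as seen by the caller of a 16-bit function returning a long.
struct DxAx {
    std::uint16_t dx;   // most significant 16 bits (segment of a far pointer)
    std::uint16_t ax;   // least significant 16 bits (offset of a far pointer)
};

// Split a 32-bit value the way a 16-bit function returns it.
constexpr DxAx split_long(std::uint32_t value)
{
    return DxAx{ static_cast<std::uint16_t>(value >> 16),
                 static_cast<std::uint16_t>(value & 0xFFFFu) };
}

// Reassemble the value on the caller's side.
constexpr std::uint32_t join_long(DxAx regs)
{
    return (static_cast<std::uint32_t>(regs.dx) << 16) | regs.ax;
}
```

An assembly routine returning the long 0x12345678 would therefore leave 1234h in DX and 5678h in AX before executing ret.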
Function Return Values for 32-Bit Models
Near pointers, ints, unsigned ints, longs, and unsigned longs are returned in EAX. Chars are returned in AL; shorts are returned in AX. Far pointers are returned in DX, EAX, where DX contains the segment and EAX contains the offset. long longs are returned in EDX, EAX.
When C linkage is in effect, floats are returned in EAX and doubles in EDX, EAX, where EDX contains the most significant 32 bits and EAX contains the least significant 32 bits. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both these techniques are reentrant.
1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in EAX. With larger structures, the compiler creates a temporary copy of the variable on the stack and returns a (reentrant) pointer to it.
For 32-bit C++ code, where a struct has no constructors or destructors declared for it, 1-byte structs are returned in AL, 2-byte structs in AX, 4-byte structs in EAX, and 8-byte structs in EDX, EAX.
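The size rules for struct return values in 32-bit C++ code can be summarized as a small lookup. The function name struct_return_location below is hypothetical, and the sketch covers only the sizes the text names. It treats anything larger than 8 bytes as returned through a temporary, which is an assumption based on the "larger structures" rule above, and makes no claim about sizes 3, 5, 6, and 7.

```cpp
#include <cstddef>
#include <string>

// Where the return value of a struct with no constructors or destructors
// is found in 32-bit C++ code, according to the sizes listed in the text.
std::string struct_return_location(std::size_t size_in_bytes)
{
    switch (size_in_bytes) {
    case 1:  return "AL";
    case 2:  return "AX";
    case 4:  return "EAX";
    case 8:  return "EDX:EAX";      // EDX holds the upper half
    default: break;
    }
    if (size_in_bytes > 8)
        return "pointer to temporary on stack";   // assumption, see lead-in
    // Sizes 3, 5, 6 and 7 are not described in the text.
    return "not specified";
}
```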
Warning: In previous versions of DMC++, small structs without constructors in 32-bit C++ code were passed through a hidden pointer to the return value. The change described above was made for compatibility with Microsoft. Due to this change, if you build part of an application with the current version of DMC++, you need to rebuild all of the application; otherwise, crash bugs could be introduced.
Register usage and data alignment for 16-bit models
When interfacing to 16-bit memory models, assembly language functions can change the values in AX, BX, CX, DX, or ES. Functions must preserve the values in SI, DI, BP, SP, SS, CS, and DS. The direction flag must always be set to forward.
Data should be aligned along 16-bit boundaries to maximize speed on 16-bit buses.
Register usage and data alignment for 32-bit models
When interfacing to 32-bit memory models, assembly language functions can change the values in EAX, ECX, EDX, or ES. Functions must preserve the values in EBX, ESI, EDI, EBP, ESP, SS, FS, CS, and DS. The direction flag must always be set to forward.
Data should be aligned along 32-bit boundaries to maximize speed on 32-bit buses.
Macros in macros.asm
There are macros defined in macros.asm that aid in the development of memory model-independent assembly language files. The macros are:
Macro | Description |
---|---|
begcode | Define start of code segment |
endcode | Define end of code segment |
begdata | Define start of initialized data segment |
enddata | Define end of initialized data segment |
SIZEPTR | Default pointer size in bytes (2 for Tiny, Small, Medium models, 4 for Compact, Large, Phar Lap, and DOSX models) |
P | Offset of first parameter from BP (EBP) |
SPTR | Non-zero if pointers are near by default (Tiny, Small, Medium, Phar Lap, and DOSX memory models) |
LPTR | Non-zero if pointers are far by default (Compact and Large memory models) |
LCODE | Non-zero if large code (Medium or Large memory models) |
SSeqDS | Non-zero if SS == DS |
ESeqDS | Non-zero if ES == DS |
uses | Pushes registers that must be saved |
unuse | Pops saved registers |
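As a quick reference, the model-dependent values from the table can be collected into one lookup. The sketch below is a plain C++ restatement, not part of macros.asm: the struct ModelInfo and the function find_model are invented names, and the P values (4, 6, and 8) are the ones this chapter gives in the Small model example.

```cpp
#include <cstring>

// One row per memory model, restating the macro values from the table.
struct ModelInfo {
    const char *name;
    int  sizeptr;   // SIZEPTR: default pointer size in bytes
    int  p;         // P: offset of first parameter from BP (EBP)
    bool sptr;      // SPTR: pointers are near by default
    bool lptr;      // LPTR: pointers are far by default
    bool lcode;     // LCODE: large code (far calls)
};

constexpr ModelInfo kModels[] = {
    { "Tiny",     2, 4, true,  false, false },
    { "Small",    2, 4, true,  false, false },
    { "Medium",   2, 6, true,  false, true  },
    { "Compact",  4, 4, false, true,  false },
    { "Large",    4, 6, false, true,  true  },
    { "Phar Lap", 4, 8, true,  false, false },
    { "DOSX",     4, 8, true,  false, false },
};

// Return the row for a model name, or nullptr if the name is unknown.
const ModelInfo *find_model(const char *name)
{
    for (const ModelInfo &model : kModels)
        if (std::strcmp(model.name, name) == 0)
            return &model;
    return nullptr;
}
```

Reading the rows side by side shows why code that hard-codes 4[BP] works in the Small model but reads the wrong stack slot in the Medium and Large models.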
Creating Routines with C Linkage
Calling an assembly language routine directly from a C function is much easier than calling an assembly language routine from C++.
Subroutine linkage
The BP register (EBP for 32-bit compilations) is dedicated to pointing to the current stack frame. A subroutine with C linkage is called by pushing the arguments onto the stack from right to left; then the subroutine is called. The called subroutine saves the old BP (EBP) on the stack, sets BP (EBP) to point to it, allocates space on the stack for all local variables, and pushes SI (ESI) and DI (EDI) if they are needed by the function. 32-bit code must also save the EBX register. The body of the subroutine is then executed.
The subroutine returns by popping EBX (32-bit code only), DI (EDI) and SI (ESI), deallocating space on the stack for the local variables, popping off the old value of BP (EBP), and returning. The calling code then removes the parameters from the stack.
Organization of the stack frame
The stack frame of a function is the current state of the stack and variables in it at a given point in the execution of the function. The table below shows the normal organization of the stack frame.
Stack contents | Register that points here |
---|---|
High memory | |
Previous stack frame | |
Parameters | |
Return address | |
Old value of BP (EBP) | BP (EBP) |
Local variables and temporaries | |
SI (ESI) | |
DI (EDI) | SP (ESP) |
Low memory | |
The stack grows downward (toward lower addresses).
Small model example
The example below shows a short C++ program that calls an assembly language function using C linkage. This function sets the cursor position to the coordinates x, y. All macros are expanded, and the calling function is translated to assembly language to further show how the compiler translates a function with C linkage. The utility to translate the function is obj2asm.exe.
Here is the C++ program:

    // essential!
    extern "C" void gotoxy(int x, int y);   // normally in a header file

    int main()
    {
        gotoxy(10, 20);     // set cursor position at ROW 10, COL 20
        return 1;
    }

After compiling the C++ program to an object file, use the utility OBJ2ASM to produce the assembly language equivalent below:

    _TEXT   segment
    _main:
            mov     AX,014h     ; move 20 into AX
            push    AX          ; push on stack (2 BYTES)
            mov     AX,0Ah      ; move 10 into AX
            push    AX          ; push on stack (2 BYTES)
            callm   _gotoxy     ; call gotoxy() function
            add     SP,4        ; adjust stack ptr. (4 BYTES)
            ret
    _TEXT   ends

or for a 32-bit memory model:

    _TEXT   segment
    _main:
            push    014h        ; push 20 on stack (4 BYTES)
            push    0Ah         ; push 10 on stack (4 BYTES)
            callm   _gotoxy     ; call gotoxy() function
            add     ESP,8       ; adjust stack ptr. (8 BYTES)
            ret
    _TEXT   ends

Since the function gotoxy has been defined as using C linkage, the variables are pushed on the stack from right to left.
First, the column (20) is pushed on the stack. Next, the row (10) is pushed on the stack. Finally, the call to gotoxy is made, pushing the instruction pointer (IP) on the stack. Note that the 32-bit version pushes the parameters directly onto the stack, whereas the 16-bit version first moves them into AX; this is one of the advantages of generating 32-bit code. The table below shows how the stack looks immediately after the call to _gotoxy(10,20), before the called function has set up its own stack frame:
High memory | |
SP+4 (ESP+8) | 20 |
SP+2 (ESP+4) | 10 |
SP+0 (ESP+0) | IP (EIP) return address |
Low memory |
The compiler prepends an underscore to the function names main and gotoxy, giving _main and _gotoxy. The _TEXT segment is the CODE segment.
The assembly language function below uses a set of utility macros for MASM 5.0 and above that are supplied with the compiler and normally installed in the INCLUDE directory. All macros are defined in macros.asm. The 32-bit version is selected when the symbol DOS386 is defined.

    include macros.asm          ; pull in defs of macros
    begcode gotoxy              ; define start of code seg called gotoxy
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ; C++ interface routine, C linkage.
    ; Puts cursor at row, col.
    ; Usage:
    ;   void gotoxy(int row, int col);
    IFNDEF DOS386
            public  _gotoxy     ; make gotoxy global
    _gotoxy proc    near        ; define start of func
            push    BP          ; save old stack frame
            mov     BP,SP       ; set BP to point to old BP
            mov     DH,P[BP]    ; DH = row
            mov     DL,P[BP+2]  ; DL = col
            mov     AH,2        ; BIOS function set cursor pos.
            xor     BX,BX       ; page 0
            int     10h         ; BIOS video interrupt
            pop     BP          ; restore old BP
            ret                 ; return to caller
    _gotoxy endp                ; define end of func
    ELSE
            public  _gotoxy     ; make gotoxy global
    _gotoxy proc    near        ; define start of func
            push    EBP         ; save old stack frame
            mov     EBP,ESP     ; set EBP to point to old EBP
            uses    <EBX>       ; saves registers that are used
            mov     DH,P[EBP]   ; DH = row
            mov     DL,P[EBP+4] ; DL = col
            mov     AH,2        ; BIOS function set cursor pos.
            xor     EBX,EBX     ; page 0
            int     10h         ; BIOS video interrupt
            unuse   <EBX>       ; note reverse order
            pop     EBP         ; restore old EBP
            ret                 ; return to caller
    _gotoxy endp                ; define end of func
    ENDIF
    endcode gotoxy              ; define end of code seg
            end

The assembly language function begins by first pushing BP (EBP) onto the stack and moving the stack pointer into BP (EBP). This permits access to the variables pushed onto the stack by the calling function.
This is done by using BP (EBP) to point to offsets within the stack. In the example above, MOV DH, P[BP] (MOV DH, P[EBP]) obtains the row number from the stack and places it in DH. P is the offset from BP (EBP) to the first parameter on the stack; it expands to 4 for the Tiny, Small, and Compact models; 6 for the Medium and Large models; and 8 for the Phar Lap and DOSX models. The table below shows the variables and their positions on the stack.
High memory | |
BP+6 (EBP+12) | 20 |
BP+4 (EBP+8) | 10 |
BP+2 (EBP+4) | IP (EIP) return address |
BP+0 (EBP+0) | Previous BP (EBP) |
Low memory |
After completing the function, restore BP (EBP) and return to the calling function. The ret pops IP (EIP) off the stack, and execution resumes at the instruction following the callm _gotoxy. That instruction is ADD SP, 4 (ADD ESP, 8), which resets the stack pointer to the position it occupied before the parameters were pushed.
The above example pertains to the Tiny, Small, Compact, and the 32-bit models. If the Large or Medium memory models are used, the far call also pushes CS onto the stack. This changes the position of the variables on the stack to those shown below:
High memory | |
BP+8 | 20 |
BP+6 | 10 |
BP+4 | CS return segment |
BP+2 | IP return address |
BP+0 | Previous BP |
Low memory |
Using the P macro (defined in macros.asm) compensates for these differences.
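The offsets in the two 16-bit tables can be checked with a small simulation. The sketch below models a 16-bit stack in portable C++; the class Stack16 and the function read_gotoxy_args are invented for illustration, and the values used for the return address, the caller's CS, and the old BP are arbitrary placeholders.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal model of a 16-bit stack: SP decreases by 2 on every push.
class Stack16 {
public:
    Stack16() : mem_(256, 0), sp_(256) {}

    void push(std::uint16_t word) {
        sp_ -= 2;
        mem_[sp_]     = static_cast<std::uint8_t>(word & 0xFF);  // low byte
        mem_[sp_ + 1] = static_cast<std::uint8_t>(word >> 8);    // high byte
    }
    std::uint16_t word_at(std::size_t address) const {
        return static_cast<std::uint16_t>(mem_[address] |
                                          (mem_[address + 1] << 8));
    }
    std::size_t sp() const { return sp_; }

private:
    std::vector<std::uint8_t> mem_;
    std::size_t sp_;
};

struct Args { std::uint16_t row; std::uint16_t col; };

// Simulate the call gotoxy(10, 20) followed by the function prologue.
// far_call selects the Medium/Large case, where CS is pushed as well.
Args read_gotoxy_args(bool far_call)
{
    Stack16 stack;
    stack.push(20);                     // last argument is pushed first
    stack.push(10);                     // first argument
    if (far_call)
        stack.push(0x1234);             // caller's CS (placeholder value)
    stack.push(0x0100);                 // return IP (placeholder value)
    stack.push(0xBEEF);                 // push BP (placeholder old BP)
    const std::size_t bp = stack.sp();  // mov BP,SP
    const std::size_t p  = far_call ? 6 : 4;   // value of the P macro
    return Args{ stack.word_at(bp + p), stack.word_at(bp + p + 2) };
}
```

In both cases the row is read from P[BP] and the column from P[BP+2]; only the value of P changes, which is why the macro keeps the source identical across models.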
Model-independent example
This example illustrates an assembly language routine to implement the following C function. The routine is written to make it assemble correctly for any memory model:

    // C++ MODULE
    extern int var1;
    int var2;
    extern "C" int func1(int *p, int a);    // essential!

    int func2(int *pa, int a)
    {
        int b;
        *pa = b;
        var2 = b + var1 + func1(&b, a);
        return a - var2;
    }

Here is the corresponding assembly language module:

    ; Assembler MODULE
    include MACROS.ASM
    IFNDEF DOS386
    begdata                             ; define start of data seg
            extrn   _var1:word
    _var2   dw      0                   ; allocate var2
    enddata                             ; end of data segment
    IF LCODE                            ; if large code model
            extrn   _func1:far          ; then far function
    ELSE
            extrn   _func1:near         ; else near function
    ENDIF
    begcode func2
            public  _func2              ; make func2 global
    IF LCODE
    _func2  proc    far                 ; define function func2
    ELSE
    _func2  proc    near                ; define function func2
    ENDIF
            push    BP                  ; save old frame pointer
            mov     BP,SP               ; set new frame pointer
            sub     SP,2                ; create room for b
            mov     AX,-2[BP]           ; AX = b
    IF SPTR                             ; if small memory model
            mov     BX,P[BP]            ; BX = pa
            mov     [BX],AX             ; *pa = b
    ELSE                                ; else large memory model
            les     BX,P[BP]            ; ES:BX = pa
            mov     ES:[BX],AX          ; *pa = b
    ENDIF
            push    P+SIZEPTR[BP]       ; push a onto stack
    IF LPTR                             ; if far pointers
            push    SS                  ; push segment of b
    ENDIF
            lea     AX,-2[BP]           ; AX = offset of b
            push    AX
            call    _func1              ; call func1(&b, a)
            add     SP,SIZEPTR+2        ; restore the stack
            add     AX,_var1            ; func1 returned result in AX
            add     AX,-2[BP]           ; AX = b + var1 + func1(&b, a)
            mov     _var2,AX            ; var2 = AX
            mov     AX,P+SIZEPTR[BP]    ; AX = a
            sub     AX,_var2            ; AX = a - var2
            mov     SP,BP               ; dump local variables
            pop     BP                  ; restore old frame pointer
            ret                         ; AX has return value
    _func2  endp                        ; end of function func2
    endcode func2                       ; end of code segment
    ELSE
    begdata                             ; start of data seg
            extrn   _var1:dword
    _var2   dd      0                   ; allocate var2
    enddata                             ; end of data segment
            extrn   _func1:near         ; near function
    begcode func2
            public  _func2              ; make func2 global
    _func2  proc    near                ; define function func2
            push    EBP                 ; save old frame pointer
            mov     EBP,ESP             ; set new frame pointer
            sub     ESP,4               ; create room for b
            uses    <EBX>               ; preserve EBX
            mov     EAX,-4[EBP]         ; EAX = b
            mov     EBX,P[EBP]          ; EBX = pa
            mov     [EBX],EAX           ; *pa = b
            push    P+SIZEPTR[EBP]      ; push a onto stack
            lea     EAX,-4[EBP]         ; EAX = offset of b
            push    EAX
            call    near ptr _func1     ; call func1(&b, a)
            add     ESP,SIZEPTR+4       ; restore the stack
            add     EAX,_var1           ; func1 returned result in EAX
            add     EAX,-4[EBP]         ; EAX = b + var1 + func1(&b, a)
            mov     _var2,EAX           ; var2 = EAX
            mov     EAX,P+SIZEPTR[EBP]  ; EAX = a
            sub     EAX,_var2           ; EAX = a - var2
            unuse   <EBX>               ; restore EBX
            mov     ESP,EBP             ; dump local variables
            pop     EBP                 ; restore old frame ptr.
            ret                         ; EAX has return value
    _func2  endp                        ; end of function func2
    endcode func2                       ; end of code segment
    ENDIF
            END                         ; end of module

EXTRN statements for code should be outside the begcode/endcode pairs; otherwise, the linker can report fixup errors when using the Medium or Large models.
Creating Routines With C++ Linkage
In almost all cases, it is better to use C linkage for assembly language functions that will be called from C++ code. This ensures compatibility with future versions of DMC++ and other compilers and avoids the problems associated with subtle differences in C++ calling conventions in different situations.
Digital Mars recommends writing assembly language functions inline or using C linkage (that is, declaring them as extern "C"), rather than using C++ linkage. If you must use C++ linkage, see the book Microsoft Object Mapping Specification for implementation details.
Running MASM
When you call MASM, the include file macros.asm sets up macros, depending on which memory model you desire. You indicate the memory model by defining a symbol on the command line:

    MASM /MX /DI8086? module;

where ? is one of S, M, C, L, or V, corresponding to the appropriate memory model. The Small model is the default. You can see this by looking at the file macros.asm. Do not define I8086T for Tiny model programs; use I8086S instead. (Remember that the only difference between Tiny and Small programs is how they are linked, not how they are compiled or assembled.) For 32-bit programs, define the symbol as /DDOS386.
The /MX switch is necessary so that all global names are case sensitive. Do not use the /ML switch; it causes some versions of MASM to assemble 8087 opcodes incorrectly.
The /R switch enables the assembling of 8087 opcodes.
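The rules for choosing the command-line symbol can be written down as a small helper, for example in a build script generator. The function masm_command below is hypothetical and only formats the command line described above. The letter 'X' for a 32-bit build is a convention invented for this sketch, not a MASM or DMC++ option.

```cpp
#include <string>

// Build the MASM command line for a module.
// model: 'T', 'S', 'M', 'C', 'L' or 'V' for the 16-bit models, or 'X'
// (invented for this sketch) to request the 32-bit DOS386 define.
std::string masm_command(char model, const std::string &module)
{
    std::string define;
    if (model == 'X') {
        define = "/DDOS386";
    } else {
        if (model == 'T')       // Tiny is assembled exactly like Small
            model = 'S';
        define = std::string("/DI8086") + model;
    }
    // /MX keeps global names case sensitive, as the text requires.
    return "MASM /MX " + define + " " + module + ";";
}
```

For instance, masm_command('T', "gotoxy") yields the same command as for the Small model, reflecting that Tiny and Small differ only at link time.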
DMC++ offers built-in support for MASM (Versions 5.0 and higher; Version 5.1 is recommended). If a file argument to the compiler ends in .asm, the compiler tries to assemble it with MASM. If you specify a memory model, the compiler passes the appropriate define to MASM. The compiler passes -g, -D, -v, and -I options to MASM as the corresponding MASM switches.
Support for 386ASM
DMC also supports the Phar Lap assembler, 386ASM. To assemble test.asm using 386ASM, use:
dmc -mp test
Using Register Variables
DMC++ defines the following register variables:
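Extended register | Word register | High byte register | Low byte register |
---|---|---|---|
_EAX | _AX | _AH | _AL |
_EBX | _BX | _BH | _BL |
_ECX | _CX | _CH | _CL |
_EDX | _DX | _DH | _DL |
_ESI | _SI | | |
_EDI | _DI | | |
_EBP | _BP | | |
_ESP | _SP | | |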
The extended registers are not available in 16-bit compilations.
The register variables have the following types:
Register | Type |
---|---|
Byte registers | unsigned char |
Word registers | unsigned short |
Extended registers | unsigned long |
Keep the following limitations in mind when you use register variables:
- Problems can arise when your code evaluates an expression after assigning a result to a register variable. This is because using register variables does not prevent the compiler from using these registers.
- Specifying the -A (ANSI compliance) compiler option disables register variables.
- You cannot take the address of a register variable.
In most instances, code is more robust and portable if you use inline assembly language code instead of register variables. For information, see Using the Inline Assembler.
Using the __emit__ Function
The __emit__ function lets you insert inline machine instructions into your program in byte pairs. Although of limited usefulness in writing large routines (use the inline assembler instead), the __emit__ function is comparable to the inline assembler for implementing simple functions.
Note: The __emit__ function replaces the asm() function supported in Zortech 3.1.
Calls to __emit__ have the form:
__emit__(arg1, arg2, ...);
The type of each argument determines the number of bytes stored, with this exception: If the argument is of type int and has a value in the range 0 to 255, only one byte is stored. Therefore, to store sizeof(int) bytes, cast the argument to unsigned:
__emit__(1,(unsigned) 23,6);
or use the u postfix:
__emit__(1,23u, 6);
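The byte-count rule can be modeled in portable C++ to see how many bytes a given call would store. The sketch below does not call __emit__; emit_sim, emit_one, and store are hypothetical helpers that append bytes to a vector instead of the instruction stream, and sizeof(int) is that of the host compiler rather than of a 16-bit DMC++ build.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append the low `count` bytes of value, least significant byte first
// (x86 is little-endian).
inline void store(Bytes &out, unsigned long long value, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

// General rule: the argument's type decides how many bytes are stored.
template <typename T>
void emit_one(Bytes &out, T value)
{
    store(out, static_cast<unsigned long long>(value), sizeof(T));
}

// Exception: an int in the range 0 to 255 is stored as a single byte.
inline void emit_one(Bytes &out, int value)
{
    const bool fits_in_byte = (value >= 0 && value <= 255);
    store(out, static_cast<unsigned long long>(value),
          fits_in_byte ? 1 : sizeof(int));
}

inline void emit_sim(Bytes &) {}    // no arguments left

// Process the arguments from left to right, like __emit__(arg1, arg2, ...).
template <typename First, typename... Rest>
void emit_sim(Bytes &out, First first, Rest... rest)
{
    emit_one(out, first);
    emit_sim(out, rest...);
}
```

Under this model, the arguments 1 and 6 each contribute one byte, while 23u contributes sizeof(unsigned) bytes, which is the effect the cast and the u postfix are meant to achieve.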