Last update Sun Sep 8 21:42:01 2013

Choosing a Memory Model

Digital Mars C++ is a comprehensive development system for the Intel 8086 family of processors. This chapter explains how to choose an appropriate memory model, so that you can create everything from small command line utilities to the largest and most complex applications.

Overview of Memory Models

Choosing a memory model means making choices among meeting minimum system requirements, maximizing code efficiency, and gaining access to every available memory location. If you don't specify any particular memory model, the Digital Mars compilers use the Win32 model. To compile for DOS or Win16, a memory model must be selected.

To use the Small memory model, you don't need to know anything about compiler switches or configuring the IDDE. And when you use a debugger, small model addresses are easy to interpret. However, if your program requires more than 64KB of memory to store its code or its data, you must choose a different memory model.

For Programs Under 640KB

If your program's total size is under 640KB, you should choose one of the memory models in Table 7-1 below. These are the real mode memory models. Since all the processors used in IBM PCs and compatibles can run in real mode, programs compiled with these models can run on all PCs.

The Tiny memory model creates .com programs. The Small model creates .exe programs.

Table 7-1 Choosing a real mode memory model

Code          Data          Use this model
under 64KB    under 64KB    Small (-ms) or Tiny (-mt)
over 64KB     under 64KB    Medium (-mm)
under 64KB    over 64KB     Compact (-mc)
over 64KB     over 64KB     Large (-ml)

For faster and more efficient code use the memory model that gives you the best fit for your program. For example, using the Large memory model when another model would suffice makes your program slower than it has to be because more data is referenced using both a segment and an offset. For information on how data is stored in the various memory models, see the section "How Data Is Stored" later in this chapter. For information on making your program as efficient as possible, see the section "Fine-Tuning with Mixed Model Programming" later in this chapter.
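
The choice in Table 7-1 can be sketched as a small C function. This is only an illustration: real_mode_model_letter() and SEGMENT_BYTES are hypothetical names, not part of Digital Mars C++, and the sketch assumes that code or data of exactly 64KB still fits in one segment.

```c
/* One 8086 segment holds 64KB. */
#define SEGMENT_BYTES 65536UL

/*
 * Hypothetical helper: returns the letter that follows -m for the real mode
 * memory model Table 7-1 suggests, given code and data sizes in bytes.
 * Assumption: a size of exactly 64KB is treated as fitting in one segment.
 */
char real_mode_model_letter(unsigned long code_bytes, unsigned long data_bytes)
{
    int big_code = code_bytes > SEGMENT_BYTES;
    int big_data = data_bytes > SEGMENT_BYTES;

    if (big_code && big_data)
        return 'l';   /* Large   (-ml) */
    if (big_code)
        return 'm';   /* Medium  (-mm) */
    if (big_data)
        return 'c';   /* Compact (-mc) */
    return 's';       /* Small   (-ms); Tiny (-mt) also fits here */
}
```

For example, a program with 200,000 bytes of code and 20,000 bytes of data yields 'm', the Medium model.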

For Programs Over 640KB

If your program is over 640KB, that restricts what machines will be able to run it. Ultimately, for large DOS programs, you must choose between performance and portability to older systems.

Running on 8086/8088 machines and later

When using DOS on a machine with an 8086 or 8088 CPU, more than 640KB of memory is not accessible. If your program has a large amount of data, consider using handle pointers, described in Using Handle Pointers.

Note: Digital Mars C++ no longer supports the Virtual Code Management system or the -v memory model.

For users compiling for an 80286 "minimum configuration," the Digital Mars C++ compiler still supports the Rational Systems 16-bit DOS extender (-mr), available separately from Rational Systems.

Running under DOS on 80386 machines and later

If your DOS program will run only on the 80386 and 80486, it can operate in 32-bit protected mode, which lets you access up to 4GB of RAM.

Note: 4GB is a theoretical maximum. Under DOS or XMS, the maximum extended memory is limited by BIOS function calls to just under 64MB. The DOSX 32-bit DOS extender can handle up to 3GB, because it allocates 3/4 of the available extended memory. There is no system that supports 4GB of memory at this time.

To run in 32-bit protected mode, you need a 32-bit DOS extender. The DOSX memory model (-mx) is compatible with the DOSX 386 DOS extender. The Phar Lap memory model (-mp) is compatible with the Phar Lap 32-bit DOS extender, available from Phar Lap.

Note: Digital Mars's OPTLINK linker can link DOSX programs, but does not support linking Phar Lap programs.

For related information, see DOS32 (DOSX) Programming Guidelines.

Memory Models for Windows 3.x Programs

Since Windows is itself a form of DOS extender, the DOSX or Phar Lap memory models cannot be used for Windows programs. You can compile a Windows 3 application with the Small, Medium, Compact, or Large memory models. Digital Mars recommends compiling Windows applications with the Large model because it minimizes the problems associated with mixed model programming. Windows 3.0 and later eliminate any advantage to using the Medium memory model.

Compile all the files in a Windows application with the same (preferably Large) memory model if possible, or explicitly declare a type for each pointer in a function prototype. If you are mixing near and far data references, make sure that all declarations match their corresponding definitions, or hard-to-find bugs can result. For more information see "Fine-Tuning with Mixed Model Programming" below.

Note: Since Digital Mars's Large memory model does not place data into far data segments by default, Large model programs compiled with Digital Mars C++ can be multiple-instance applications.

Memory Models for Windows 95 and NT

To create a program for a 32-bit operating system like Windows NT, you need a memory model that can reference a flat 32-bit address space (where CS, DS, SS, and ES all map onto a single memory area). Digital Mars C++ supports a 32-bit flat address space with the NT (-mn) memory model. For more information, see Win32 Programming Guidelines.

Note: The compiler ignores the keywords __far, __huge, __interrupt, __loadds, and __handle when compiling with the -mn memory model. You can tell the compiler to ignore these keywords for any compilation with the -NF compiler option.

How Data Is Stored

How your program stores data depends on whether it is a 16- or 32-bit program.

16-bit programs

Real mode programs can run on the 8086 and 8088 processors that were in the original IBM PCs and compatibles. In real mode DOS programs, code and data are stored in 64KB segments. DOS limits programs to 640KB of memory, including both code and data.

For the most part, programs written for the 8086 architecture use two types of references: near and far.

Near code and data references

A near reference refers to a function or data object (or a pointer to a function or data object) that is within the current segment. It is 16 bits long and contains an offset into the current data segment (or the stack segment) if it's a data pointer, or into the current code segment if it's a function pointer.

Far code and data references

A far reference refers to a function or data object (or a pointer to a function or data object) that is in a different segment than the current one. It is 32 bits long and contains a 16-bit quantity called the segment, which identifies the memory segment where the code or data is stored, and a 16-bit quantity called the offset, which contains the location of the code or data in that segment. (The 8088 and 8086 have a 20-bit address bus. Therefore, they actually use a 20-bit segment address, which is obtained by shifting the 16-bit segment value four bits to the left. This 20-bit value is added to the offset to reference an actual memory location.)
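
The address calculation described above can be written out in portable C. This is a sketch for illustration only: physical_address() is a hypothetical name, and it models the 8086/8088 behavior of wrapping at the 1MB limit of the 20-bit address bus.

```c
/*
 * Real mode address translation on the 8086/8088 (illustrative helper,
 * not a library function): shift the 16-bit segment left four bits to get
 * the 20-bit segment address, add the 16-bit offset, and keep 20 bits.
 */
unsigned long physical_address(unsigned short segment, unsigned short offset)
{
    unsigned long base = ((unsigned long)segment & 0xFFFFUL) << 4;

    /* The 20-bit address bus discards any carry beyond 1MB. */
    return (base + ((unsigned long)offset & 0xFFFFUL)) & 0xFFFFFUL;
}
```

Note that many segment and offset pairs name the same location: 1234:0010 and 1235:0000 both translate to 0x12350.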

Choosing a memory model changes how the compiler stores addresses to functions and data. If the model can handle less than a segment's worth of code or data, it uses near pointers to reference them. If the model can handle more than a segment's worth of code or data, it uses far pointers to reference them.

Accessing code or data with a near reference is much quicker than accessing it with a far reference. When you use a far reference, your program must first find the segment and then find the code or data within that segment. When you use a near reference, your program only needs to find the code or data. For a faster program, choose the memory model that lets you make as many near references as possible.

Memory models and segmentation

Choosing a memory model does not change how the compiler segments your code. You choose the segment in which to store code and data with the __far and __huge keywords, as described in "Fine-Tuning with Mixed Model Programming" later in this chapter. The compiler and linker automatically segment your code. You can fine-tune how the compiler and linker segment your code with the techniques described in Compiling Code.

32-bit programs

In 32-bit protected mode programs (those compiled with the DOSX, Phar Lap, or NT memory model), near pointers are 32 bits long and far pointers are 48 bits long. With these models, your programs can access up to 4GB of RAM, all through near references.

In 32-bit applications, far pointers are used only for special purposes like accessing video memory. Therefore, you should not typically use pointer modifiers in 32-bit programs.

Sizes of data types and pointer types

The table below lists the sizes of the base data types and pointer types in all Digital Mars C++ memory models.

Table 7-2 Data and pointer types and sizes

Data/Pointer Type   Size in 16-bit compilations         Size in 32-bit compilations
                    (T, S, M, C, L, and V models)       (X, P, F, and N models)
char                signed 8 bits                       signed 8 bits
signed char         signed 8 bits                       signed 8 bits
unsigned char       unsigned 8 bits                     unsigned 8 bits
short               signed 16 bits                      signed 16 bits
unsigned short      unsigned 16 bits                    unsigned 16 bits
int                 signed 16 bits                      signed 32 bits
unsigned            unsigned 16 bits                    unsigned 32 bits
long                signed 32 bits                      signed 32 bits
unsigned long       unsigned 32 bits                    unsigned 32 bits
float               32 bits floating                    32 bits floating
double              64 bits floating                    64 bits floating
long double         64 bits floating                    64 bits floating, 80 bits N model
__near pointer      16-bit segment offset               32-bit segment offset
__far pointer       16-bit segment and 16-bit offset    16-bit segment and 32-bit offset
__huge pointer      16-bit segment and 16-bit offset    16-bit segment and 32-bit offset
__ss pointer        16-bit segment offset               32-bit segment offset
__cs pointer        16-bit segment offset               32-bit segment offset
__handle pointer    16-bit segment and 16-bit offset    16-bit segment and 32-bit offset

Fine-Tuning with Mixed Model Programming

Digital Mars C++ lets you mix memory models within a program by using the __near, __far, __cs, __ss, and __huge keywords. These keywords permit you to fine-tune how your program uses memory.

Note: The __near, __far, and __huge keywords are not part of the ANSI C language and are used only in operating systems with segmented memory. Code that uses them is not portable. In addition, they are of limited usefulness when creating 32-bit applications.

Creating large data structures with far data in 16-bit programs

In all the 16-bit memory models, the compiler puts all static and global variables into a single data segment (called DGROUP) that can only contain 64KB. With far data, you can put a particular data structure into a data segment of its own. However, that data structure cannot be larger than 64KB.

To declare a data structure to be far, put the __far keyword immediately before the identifier, like this:

int __far array[10000];
struct ABC __far table[600] = { .... };

Access far data with array syntax:

array[301] = 32;
table[258] = an_abc_struct;

The compiler creates a segment name for the data structure from the source file name and the variable name.

By default, the compiler uses far data in the Compact and Large memory models. When you use the __far keyword with a data declaration, the compiler starts a new data segment and puts the rest of the data in the file into that segment.
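
Because __far is not portable, one common approach is to hide it behind a macro so the same source still builds with compilers that lack the keyword. The sketch below is an illustration under stated assumptions: SEGMENTED_16BIT is a hypothetical flag that you would define yourself for 16-bit segmented builds, and FAR_DATA, SAMPLE_COUNT, samples, and sum_samples() are made-up names.

```c
/*
 * SEGMENTED_16BIT is a hypothetical flag, defined by you (for example on
 * the compiler command line) only when building a 16-bit segmented program.
 */
#ifdef SEGMENTED_16BIT
#define FAR_DATA __far   /* place the object in a data segment of its own */
#else
#define FAR_DATA         /* other targets: the keyword disappears */
#endif

#define SAMPLE_COUNT 2000

/* The keyword goes immediately before the identifier, as in the text. */
int FAR_DATA samples[SAMPLE_COUNT];

/* Sum the first n entries using ordinary array syntax. */
long sum_samples(int n)
{
    long total = 0;
    int i;

    for (i = 0; i < n && i < SAMPLE_COUNT; i++)
        total += samples[i];
    return total;
}
```

Code that uses the array does not change between builds; only the macro definition does.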

Portably declaring large arrays in 16-bit compilations

It is frequently necessary to declare arrays larger than 64K in size. For instance:

char array[100000];      // 100K bytes
double values[10][1000]; // 10*1000*8 = 80K bytes

To portably declare arrays greater than 64K in 16-bit compilations, you can construct an array of pointers to arrays, where each unit is less than 64K in size. Using this technique, the above arrays would be declared as:

char *array[2];
#define array(i) (array[(i) & 1][(i) >> 1])

double *values[10];

Code that declares large arrays using pointers must be compiled in one of the large data models (Compact or Large). The storage that the pointers refer to cannot be allocated statically; you need to call calloc() to allocate it and initialize it to all zeros:

  int i;
  for (i = 0; i < 2; i++)
    array[i] = (char *) calloc(100000/2,sizeof(char));

  for (i = 0; i < sizeof(values)/sizeof(values[0]); i++)
    values[i] = (double *) calloc(1000, sizeof(double));
  . . .

To access an element of array[], instead of array[i], use this syntax:

long i;
array(i) = array(i + 10);

Note that the macro can be used either as an lvalue or as an rvalue. Similarly, for values:

int i, j;
values[i][j] = values[i][j] + 6.7;

Most of the time you won't need to deallocate the memory used for the arrays, if they are used for the duration of program execution; the operating system will deallocate the storage when the program terminates.

The methods described above are not only portable to ANSI C and to 32 bits, they can also be faster than using __huge.
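
Putting the pieces together, a minimal sketch of the split-array technique might look like the following. The names big, BIG(), big_alloc(), and big_free() are chosen for this illustration (the text above reuses the name array for both the pointer table and the macro), and the allocation check is an addition that the fragments above omit.

```c
#include <stdlib.h>

#define BIG_SIZE  100000L   /* total number of elements, as in the text */
#define BIG_PARTS 2         /* sub-arrays, each smaller than 64K */

static char *big[BIG_PARTS];

/* Element i lives in sub-array (i & 1) at index (i >> 1). */
#define BIG(i) (big[(i) & 1][(i) >> 1])

/* Allocate zero-filled sub-arrays. Returns 0 on success, -1 on failure. */
int big_alloc(void)
{
    int p;

    for (p = 0; p < BIG_PARTS; p++) {
        big[p] = (char *) calloc((size_t)(BIG_SIZE / BIG_PARTS), sizeof(char));
        if (big[p] == NULL) {
            /* Release whatever was already allocated. */
            while (p-- > 0) {
                free(big[p]);
                big[p] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Release the sub-arrays; optional if they live until the program exits. */
void big_free(void)
{
    int p;

    for (p = 0; p < BIG_PARTS; p++) {
        free(big[p]);
        big[p] = NULL;
    }
}
```

After big_alloc() succeeds, BIG(i) works on either side of an assignment for any long i from 0 through 99999, for example BIG(99999L) = 'z';. Splitting on the low bit of the index keeps the macro down to one mask and one shift.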

Declaring class objects as far data

In the Small, Tiny, and Medium memory models, you cannot declare as far any class objects that you create with new. In this example, the first declaration causes an error, but the second does not:

AClassA __far *a1 = new(AClassA); // ERROR
AClassA __far a2;                 // OK

In the other 16-bit memory models, you can declare any class object as far data. In the 32-bit models, you cannot declare class objects as far data.

Using __near and __far functions

When you compile a program with the Medium or Large memory models, by default the compiler uses far pointers for function addresses. However, if you know that a function is used only by other functions that are in the same code segment, you can declare it __near, so that the compiler will access it with near pointers.

The __near keyword is especially useful with static functions, that is, functions that are used only within the file where they're defined. Since the compiler by default puts all of a file's functions into the same code segment, you can declare any static function as near. However, you should not declare global functions as __near.

In the example below, walktree() is a recursive static near function. The program saves a significant amount of time by using a near instead of a far address. A near call pushes only a 16-bit return address on the stack for each call, where a far call pushes a 32-bit one (segment and offset).

typedef struct NODE
{
  int value;
  struct NODE *left;
  struct NODE *right;
} node;

/* Use a near function for
 * the recursive part */
static int __near walktree(node *n)
{
  if (!n)   /* an empty subtree ends the recursion */
    return 0;
  return (n->value + walktree(n->left) + walktree(n->right));
}

/* Calculate sum of all nodes in the tree */
int calcsum(node *n)
{
  return walktree(n);
}

Note: You cannot declare a static function whose address you take as near and then attempt to call it as a far function.

You rarely need to declare functions with the __far keyword. Programs that use the Medium or Large memory models use far pointers by default and programs that use the Small and Compact memory models don't contain multiple code segments. The only exception is a Small, Tiny, or Compact program that runs under Windows and uses a dynamic link library (or DLL). The functions in the DLL are in a separate code segment and must be declared far.

Using huge pointers

A huge pointer is similar to a far pointer. It is 32 bits long and can point to any location in memory. You declare data to be huge by substituting the __huge keyword for __far.

Huge pointers offer three advantages over far pointers:

Because of the extra overhead associated with huge pointer arithmetic, you should use huge pointers only for data objects larger than 64KB. Do not use huge pointers in 32-bit code. The keyword __huge is ignored in compilations using the NT (-mn) memory model.
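
To see where that overhead comes from, the following portable C sketch simulates the difference between far-style and huge-style pointer arithmetic in real mode. It is a model under stated assumptions, not the code the compiler generates: linear(), far_add(), and huge_add() are hypothetical names, and the sketch assumes real mode, where moving 64KB forward means adding 0x1000 to the segment.

```c
/* Linear address of a real mode segment:offset pair (20-bit address bus). */
unsigned long linear(unsigned long seg, unsigned long off)
{
    return (((seg & 0xFFFFUL) << 4) + (off & 0xFFFFUL)) & 0xFFFFFUL;
}

/*
 * Far-style add: only the 16-bit offset changes, so the result wraps
 * around inside the same 64KB segment.
 */
unsigned long far_add(unsigned seg, unsigned off, unsigned long n)
{
    unsigned long new_off = ((unsigned long)off + n) & 0xFFFFUL;

    return linear(seg, new_off);
}

/*
 * Huge-style add: each carry out of the offset advances the segment by
 * 0x1000 paragraphs (64KB). This extra step is the added cost.
 */
unsigned long huge_add(unsigned seg, unsigned off, unsigned long n)
{
    unsigned long sum = (unsigned long)off + (n & 0xFFFFFUL);
    unsigned long new_seg = (unsigned long)seg + (sum >> 16) * 0x1000UL;

    return linear(new_seg, sum & 0xFFFFUL);
}
```

Starting at 2000:FFF0 and adding 0x20, the far-style result wraps to linear address 0x20010, while the huge-style result crosses the segment boundary and reaches 0x30010.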

Note: Digital Mars C++ does not support the Huge memory model. That is, a pointer whose type is unspecified cannot be made huge by default.

Using handle pointers

Handle pointers are a Digital Mars C++ extension to the far pointer type that supports virtual memory management. You use handle pointers to access expanded memory (EMS or LIM) in 16-bit programs.

Like far pointers, handle pointers are 32 bits long in 16-bit applications. They let a data structure use as much as 16KB of memory, and let your program use as much as 16MB.

Note: The keyword __handle is ignored in compilations using the NT (-mn) memory model.

See Using Handle Pointers for more information.

Using __ss pointers

You use __ss pointers to point to objects on the stack. In the Tiny, Small, Compact, Medium, and Large memory models, __ss is a 16-bit offset. In the DOSX and Phar Lap memory models, it is a 32-bit offset.

__ss pointers work like near pointers; the difference is that their segment address is set to the stack segment instead of the data segment. Thus __ss pointers are relative to the SS segment register, while near pointers are relative to the DS segment register.

If SS==DS (which is TRUE in the Tiny, Small, and Medium memory models), there is no difference between __ss pointers and near pointers. In the Compact and Large models, or whenever you set SS!= DS with the w qualifier to the -m compiler option (as for DLLs or ROM-based code), __ss can only be used to point to parameters and automatic variables, while near pointers can only point to static and global data.

Storing data in the code segment

Digital Mars C++ lets you store data in the code segment with the keyword __cs. Use __cs as you do __far. For example:
int __cs x = 3;         // x in code segment
char __cs ca[] = "abc"; // ca[] in code segment
char __cs *pc = "abc";  // "abc" in code segment, pc in data segment

char __cs * __cs pc2 = "def"; // "def" and pc2 both go in code segment
char __cs * cpa[] = {"abc", "def"};
	// "abc" and "def" are in code segment
	// array of pointers cpa is in data segment

void func ()
{ char __cs *p;
  p = "xyz";          // "xyz" in code segment
}

Advantages to storing data in the code segment

Some of the significant advantages to storing data in the code segment are:

In 32-bit memory models, placing data in the code segment is rarely advantageous unless read-only protection is desired.

Potential problems

When using the __cs keyword, keep the following potential problems in mind: