The D Programming Language

by Walter Bright

D is an advanced systems programming language. It is designed to appeal to C and C++ programmers who need a more powerful language that has a much lower complexity and hence is easier to master. D is multiparadigm, and looks and feels very much like C and C++. It offers opportunities for advanced compilers to generate more efficient code than is possible for C/C++, while supporting facilities that reduce the probability of program bugs.

Why D?

Refactoring

C++ has been around for 20 years now. C++ has largely succeeded in adding enormous capability to C while retaining backwards compatibility with it. But with 20 years of experience comes the opportunity to reflect on how one might engineer a language that retains C++'s strengths, adds modern features, and removes its weaknesses and more troublesome aspects.

Difficulty in adding modern new features

The longer a language has been evolving, the harder it gets to add new features. Each new feature adds an unanticipated layer on top of old ones, and has to do so in a way that breaks no legacy code. Eventually, it takes forever to add an insignificant improvement. The C++ 'export' is an extreme example of this effect, taking a reported 3 man years to implement and delivering little apparent benefit. A more mundane indication of this problem is that C++ was standardized 5 years ago and conformant compilers are only now emerging.

While C++ is pioneering generic programming practice, it lags behind in other modern techniques such as Contract Programming, modules, automated testing, and automatic memory management. It's very difficult to add these while still supporting legacy code.

Brief Tour

D looks a great deal like C and C++, so much so that the canonical hello world program is nearly identical:

	import std.c.stdio;

	int main()
	{
	    printf("Hello world\n");
	    return 0;
	}

Look and feel is very much like C and C++

Many years ago in grammar school, we were shown a film about a researcher who wore special goggles that turned the world upside down. He wore them continuously such that his brain never saw the world right side up. After 2 weeks, his brain suddenly righted that upside down view. Then, the researcher took the goggles off. The film darkly warned the viewer to not try that ourselves!

I am so, so used to C/C++ syntax that I feel like that poor guy when faced with a new and improved language that also turns the syntax inside out (or so it looks to me). Frankly, I rarely give such languages a chance even when the feature set looks intriguing. D doesn't take that route; its syntax is as comfortable to C/C++ programmers as an old shoe. Functions, statements, expressions, operator precedence, integral promotions: it's all there, pretty much unchanged. The world is right side up, it's just got brighter colors and sharper focus!

Since D is so similar to C/C++, the rest of the article will focus on characteristics that distinguish it.

Binary compatibility with C

C is the lingua franca of the programming world. Most new languages accept the inevitable and reluctantly provide some sort of (usually barbaric) interface to C tacked on as an afterthought. Not so with D. D supports every one of C's data types as a native type (and not even all C compilers support all C99 types). D can directly call any C function, pass any C type, and accept any C return value as naturally as doing it in C. Additionally, D can access any function in the C library, including malloc and free. There are no shims, compatibility layers, or separately compiled DLL's needed.
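
As a rough sketch of what calling C directly looks like, the following uses malloc, free, and abs from the C library. Declaring the three functions by hand with extern (C), rather than importing them from a library module, is an assumption made here to keep the example self-contained, and sumViaCHeap is a hypothetical name.

	// C library functions, declared by hand for this sketch (assumption:
	// a real program would import these declarations from a library module)
	extern (C) void* malloc(size_t size);
	extern (C) void free(void* p);
	extern (C) int abs(int n);

	// fills a C heap buffer with abs(0), abs(-1), ... and sums it
	int sumViaCHeap(int n)
	{
	    int* p = cast(int*) malloc(n * int.sizeof);
	    if (!p)
		return -1;		// allocation failed
	    int total = 0;
	    for (int i = 0; i < n; i++)
	    {	p[i] = abs(-i);
		total += p[i];
	    }
	    free(p);			// memory from malloc is released by hand
	    return total;
	}

	void main()
	{
	    assert(sumViaCHeap(4) == 6);	// 0 + 1 + 2 + 3
	}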

There is no direct D interface to C++, but the two languages can talk to each other because they both support C interfaces.

Retains and even expands access to low level programming

Real programs sometimes need to get down to the metal. D offers the usual methods for that with pointers, casting, and unions. It extends access by having an inline assembler as part of the language, rather than a vendor specific add-on that is incompatible with every other vendor's. If you're not convinced of the need for this, take a tour through the Linux kernel source code.

Clean separation between lexical analysis, syntactic analysis, and semantic analysis.

At first blush, this feature would seem to be irrelevant to D programmers, seeming to be just compiler implementation arcana. But the positive effects of it do filter down to many indirect benefits to the programmer such as: fewer compiler bugs, faster development of compilers, ease of building syntax-directed editors, ease of building source code analysis tools, fewer funky exceptional rules in learning the language, etc.

For the kinds of troubles this avoids, see the infamous "template angle bracket hack" in C++. Add up all the time highly paid professionals have spent reading about it, writing about it, explaining it to other programmers, proposing fixes for it, and multiply by their hourly cost.

Modules

Every source file in D is a module with its own namespace. Other modules can be imported, which is a symbolic include rather than a textual one. Modules can be precompiled. There is no need to craft a 'header' file for a corresponding source file, the module serves both functions.

No forward declarations needed!

Names at the global level can be forward referenced without needing a special declaration:

	void foo() { bar(); }	// forward reference OK
	void bar() { }

Oddly, C++ allows such forward referencing for class members but not outside classes.

No preprocessor

C++ has added many features that obsolete parts of the preprocessor, but the language is still heavily dependent on it. Even Boost, representative of modern C++ code and thinking, is remarkably tied to the arcana of advanced preprocessor tricks. D provides syntactical elements that obsolete the rest of it - such as modules, nested functions, version statements, debug statements, and foreach iteration.

Unicode

Let's face it, the future is not with ASCII, EBCDIC, or code pages. It's Unicode. D can handle Unicode from front to back - source code, string literals, and library code all speak Unicode. There isn't any vagueness about what a multibyte is in D, or what size a wchar_t is. UTF-8, UTF-16, and UTF-32 strings are all properly supported. Other encoding schemes are handled by putting a filter on input and output.

Interfaces

The object-oriented inheritance model is single inheritance enhanced with interfaces. Interfaces are essentially classes with only abstract virtual functions as members. Users of Microsoft's COM programming model will recognize interfaces as being a natural fit with COM classes.
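
A minimal sketch of single inheritance combined with interfaces might look like the following. All of the names (Shape, Scalable, Base, Square) are hypothetical and chosen only for illustration.

	interface Shape    { int area(); }
	interface Scalable { void scale(int factor); }

	class Base { int id; }

	// exactly one base class, but any number of interfaces
	class Square : Base, Shape, Scalable
	{
	    int side;

	    this(int side) { this.side = side; }

	    int area() { return side * side; }
	    void scale(int factor) { side *= factor; }
	}

	void main()
	{
	    Square sq = new Square(3);
	    sq.scale(2);
	    Shape s = sq;		// a Square can be used wherever a Shape is expected
	    assert(s.area() == 36);
	}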

Function in, out, inout parameters

Function in parameters are values passed in to a function. out parameters are references to variables that are to be initialized and assigned a value by the function. inout parameters are references to variables whose values can be read and modified by the function.

	void ham()
	{   int x = 1, y = 2, z = 3;

	    eggs(x, y, z);
	}

	void eggs(in int a, out int b, inout int c)
	{
	    a += 1;	// x is not affected
	    b = 4;	// y is set to 4
	    c *= 3;	// z is set to 9
	}

Being able to specify all three variations makes for more self-documenting code, better code generation, and fewer bugs.

Automatic memory management

Also known as garbage collection, this enables programmers to focus on algorithms rather than the tedious details of determining the owner of a block of memory, when that block can be deleted, and finding/plugging all the memory leaks. Contrary to common wisdom, automatic memory management can result in smaller and faster programs than explicit memory management. It certainly results in higher programmer productivity. The current D memory manager uses a classic conservative mark/sweep collector, although more advanced ones like a generational collector are certainly possible.

RAII

Automatic memory management techniques are good for managing a resource that is relatively cheap and plentiful, like memory, but are not good for managing scarce and expensive resources, such as file handles. For those, RAII (Resource Acquisition Is Initialization) techniques are supported in an equivalent manner to the way destructors in C++ are executed when a variable goes out of scope. RAII can be either attached to all instances of a particular class, or to a particular instance of any class.

Explicit memory allocation

Automatic memory management isn't a panacea for every memory management problem. Explicit memory allocation is available either by manually calling C's malloc/free, or by overloading new and delete operators for a class. Stack based allocation via alloca() is available as well.

Arrays

Arrays are enhanced from being little more than an alternative syntax for a pointer into first class objects. Key to making this work is that arrays always know their length, and the contents of the arrays are managed by the automatic memory manager. Arrays can be sliced, sorted, and reversed. Both rectangular and jagged arrays are represented.

Array bounds checking is performed, eliminating a very common and mundane cause of bugs (infamously called 'buffer overflow' bugs). Bounds checking, being a runtime check, can be expensive, and so can be optionally removed for release builds.
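
To make the length and slicing behavior concrete, here is a small sketch. The helper name makeTens is hypothetical; the point is only that an array carries its length with it and that a slice refers to the same storage as the array it was taken from.

	// returns a garbage collected array holding 0, 10, 20, ...
	int[] makeTens(int n)
	{
	    int[] a = new int[n];
	    for (int i = 0; i < a.length; i++)
		a[i] = i * 10;
	    return a;
	}

	void main()
	{
	    int[] a = makeTens(5);		// 0, 10, 20, 30, 40
	    int[] s = a[1 .. 4];		// slice: elements 1, 2 and 3

	    assert(s.length == 3);
	    assert(s[0] == 10 && s[2] == 30);

	    s[0] = 99;				// written through the slice
	    assert(a[1] == 99);

	    // a[5] = 0;	would fail the bounds check in a build that keeps it
	}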

Associative Arrays

Also known as dictionaries or maps, associative arrays are key/value pairs where the key and the value can be of arbitrary types. For example, a keyword/value setup would look like:

	int[ char[] ] keywords;
	char[] abc;
	...
	keywords["foo"] = 3;	// initialize table
	...
	if (keywords[abc] == 3)	// look up keyword
	    ...

Or, a sparse array of integers would be:

	int[int] sa;
	sa[1] = 3;
	sa[1000] = 16;

Symbol table

Symbols are looked up using natural lexical scoping rules. All module level symbols are available, even if they only appear lexically after the reference. There is no separate tag name space, there is no two phase lookup, there is no ADL (Argument Dependent Lookup), and there is no separation between point of definition and point of instantiation. D has true modules, each module has its own symbol table.

Nested functions

Lexical nesting of functions is nothing more than embedding one function inside another:

	int foo()
	{
	    int x = 3;

	    int bar(int z) { return x + z; }

	    return bar(2) * bar(3);
	}

Nested functions can access local variables in the lexically enclosing scope. Nested functions can be inline expanded by a competent compiler. They have a surprising number of uses, starting with eliminating another common use of the C++ preprocessor to factor out common code within a function.

Delegates

Function pointers are pointers that contain the address of a function. Delegates are function pointers combined with a context pointer. A delegate for a class member function would contain the address of the member function and the 'this' pointer. A delegate for a nested function contains the address of the nested function and a pointer to the stack frame of the lexically enclosing function. Delegates are a simpler and more powerful replacement for the C++ pointer-to-member.
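
The two kinds of delegate described above can be sketched as follows. Counter, twice, and bump are hypothetical names; the same delegate type accepts both a bound member function and a nested function.

	class Counter
	{
	    int count;
	    int add(int n) { count += n; return count; }
	}

	// calls the delegate twice and returns the second result
	int twice(int delegate(int) dg)
	{
	    dg(1);
	    return dg(1);
	}

	void main()
	{
	    Counter c = new Counter;
	    int delegate(int) dg = &c.add;	// member function plus 'this'
	    assert(twice(dg) == 2);
	    assert(c.count == 2);

	    int total = 10;
	    int bump(int n) { total += n; return total; }
	    assert(twice(&bump) == 12);		// nested function plus stack frame
	    assert(total == 12);
	}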

Function literals

Extending the idea of nested functions brings us function literals, which are just nested functions without a name. Function literals are a handy way to pass a function to some generic function, such as an integrator:

	double parabola(double x, double y)
	{
	    return integrate(0.0, x,
		delegate double (double x) { return y * x; });
	}

	double integrate(double x1, double x2, double delegate(double dx) f)
	{   double sum = 0;
	    double dx = 0.0001;
	    for (double x = x1; x < x2; x += dx)
		sum += f(x) * dx;
	    return sum;
	}

Exception handling

D adopts the try-catch-finally paradigm for exception handling. Having a finally clause means that the occasional create-a-dummy-class-just-to-get-a-destructor programming pattern is no longer necessary. The try-catch works much as it does in C++, except that catches are restricted to catching only class references, not arbitrary types.

Although not required by the semantics of the language, D adopts the pattern of using exceptions to signal error conditions rather than using error return codes.
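
A short sketch of try-catch-finally, with errors signalled by an exception rather than a return code. The names checkedDouble, attempt, and attempts are hypothetical.

	int attempts;		// counts how many times the finally clause has run

	int checkedDouble(int v)
	{
	    if (v < 0)
		throw new Exception("negative input");
	    return v * 2;
	}

	int attempt(int v)
	{
	    int result = -1;
	    try
	    {
		result = checkedDouble(v);
	    }
	    catch (Exception e)		// only class references can be caught
	    {
		result = 0;
	    }
	    finally
	    {
		attempts++;		// runs whether or not an exception was thrown
	    }
	    return result;
	}

	void main()
	{
	    assert(attempt(3) == 6);
	    assert(attempt(-1) == 0);
	    assert(attempts == 2);
	}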

Contract Programming

DbC (Contract Programming) is a technique pioneered by Bertrand Meyer. Contracts are assertions that must be true at specified points in a program. Contracts range from simple asserts to class invariants, function entry preconditions, function exit postconditions, and how they are affected by polymorphism.

Typical documentation for code is either wrong, out of date, misleading, or absent. Contracts in many ways substitute for documentation, and since they cannot be ignored and are verified automatically, they have to be kept right and up to date.

DbC is a significant cornerstone of improving the reliability of programs.

DbC can be done in C++, but to do it fully winds up looking a lot like implementing polymorphism in C. Building the constructs for it into the language makes it easy and natural, and hence much more likely to be used.
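
As an illustration of function entry preconditions and exit postconditions, consider this sketch of an integer square root. The function name isqrt is hypothetical. It is written for a current D compiler, which introduces the function body with the keyword do; compilers contemporary with this article spelled that keyword body.

	int isqrt(int x)
	in
	{
	    assert(x >= 0);				// precondition on the argument
	}
	out (result)
	{
	    assert(result * result <= x);		// postconditions on the
	    assert((result + 1) * (result + 1) > x);	// returned value
	}
	do
	{
	    int r = 0;
	    while ((r + 1) * (r + 1) <= x)
		r++;
	    return r;
	}

	void main()
	{
	    assert(isqrt(16) == 4);
	    assert(isqrt(17) == 4);
	    assert(isqrt(0) == 0);
	}

The contracts sit next to the code they describe, so a caller can read the in and out blocks as a checked statement of what the function expects and guarantees.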

Unit testing

Like DbC, building unit testing capability into the language makes it easier to use, and hence more likely that it will be used. Putting the unit tests for a module right in the same source as the module has great benefits in verifying that tests were actually written, keeps the tests from getting lost, and helps ensure that the tests actually get run.

In my experience both with and without unit tests, the code where I've used them has wound up being much more reliable, even in the presence of an external test suite. This may seem obvious, but if it's so obvious, why do we rarely see unit tests in production code? D's presumption is that the problem is the lack of a consistent, portable, easy, language supported unit test system.
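
A minimal sketch of a unit test living in the same source file as the code it tests. The function sum is hypothetical. The unittest block is assumed to be compiled in and run before main() only when unit tests are enabled at build time (the -unittest switch in the reference compiler).

	int sum(int[] a)
	{
	    int total = 0;
	    foreach (int v; a)
		total += v;
	    return total;
	}

	unittest
	{
	    int[] a = new int[3];
	    a[0] = 1;
	    a[1] = 2;
	    a[2] = 3;
	    assert(sum(a) == 6);
	    assert(sum(null) == 0);	// an empty array sums to zero
	}

	void main()
	{
	}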

Minor feature improvements

D fleshes out the major features with a number of minor ones that serve to just smooth things out:

No need for -> operator

There is no ambiguity between a pointer to an object and the object itself, so there is no need to use a separate operator for the former:

	    struct X { int a; }
	    ...
	    X x;
	    X* px;
	    ...
	    x.a;     // access X.a
	    px.a;    // access X.a

Anonymous structs/unions for member layout

It's not necessary to provide dummy names for struct/union members when laying out a complex struct:

	    struct Foo
	    {	int x;
		union
		{  int a;
		   struct { int b, c; }
		   double d;
		}
	    }

A comparable nested layout in C++ typically ends up giving the inner struct and union members names, as in:

	    struct Foo
	    {   int x;
		union
		{   int a;
		    struct { int b, c; } s;
		    double d;
		} u;
	    };

Struct member alignment control

Controlling alignment is a common issue with mapping structs onto existing data structures. D provides an alignment attribute obviating the need for incompatible compiler extensions.

	    struct Foo
	    {
		align (4) char c;
		int a;
	    }

Easier declaration syntax

Ever tried to declare in C a pointer to a pointer to an array of pointers to ints?

	    int *(**p[]);

This gets even more complex when adding function pointers in to the mix. D adopts a simple right-to-left declaration syntax:

	    int*[]** p;

Unsigned right shift >>>

To get an unsigned right shift in C, as opposed to a signed right shift, the left operand must be an unsigned type. This is accomplished by casting the left operand to an equivalently unsigned type.

	    int x;
	    ...
	    int i = (unsigned)x >> 3;

The problem comes in when dealing with a typedef'd type for the left operand and in C there's no way to reliably determine what is the correct unsigned type to cast it to. (In C++ one could write a traits template library to do it, but it seems a weighty workaround for something so simple.)

Having a separate unsigned right shift operator, >>>, eliminates this subtle source of bugs.
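
The difference between the two shifts can be sketched in D as follows. The function names arithmeticShift and logicalShift are hypothetical; no cast to an unsigned type is needed.

	int arithmeticShift(int x, int n) { return x >> n; }	// copies of the sign bit shift in
	int logicalShift(int x, int n)    { return x >>> n; }	// zeros shift in

	void main()
	{
	    int x = -16;				// bit pattern 0xFFFFFFF0
	    assert(arithmeticShift(x, 2) == -4);
	    assert(logicalShift(x, 2) == 0x3FFF_FFFC);
	}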

Embedded _ in numeric literals

Ever been faced with a numeric literal like 18446744073709551615? Quick, how big is it? If you're like me, you put a pencil point on the literal and carefully count out the digits. But there's a better way. Taking a page from the usual way of dealing with this, putting commas at every 3 digits, D allows _ to be embedded into numeric literals, yielding 18_446_744_073_709_551_615. 18 quintillion. Not a big feature, but it helps expose subtle transcription errors like a dropped digit.
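
A quick sketch showing that the underscores are purely visual and do not change the value of a literal. The UL suffix is used here only to make the type of the large literal explicit.

	void main()
	{
	    assert(1_000_000 == 1000000);
	    assert(18_446_744_073_709_551_615UL == ulong.max);
	    assert(0xFFFF_FFFF == uint.max);	// works in hex literals too
	}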

Imaginary suffixes

Imaginary floating point literals naturally have an 'i' suffix, as in:

	    cdouble c = 6 + 7i; // initialize complex double c

as opposed to the C99:

	    double complex c = 6 + 7 * I;

or the C++:

	    complex<double> c = complex<double>(6, 7);

WYSIWYG strings

While embedded escape sequences are a must, What-You-See-Is-What-You-Get is a nice thing to have for string literals. D offers both kinds, the traditional escaped "" string literal and the r"" WYSIWYG literal. The latter is particularly useful when entering regular expressions:

		"y\\B\\w"	// regular strings
		r"y\B\w"	// WYSIWYG strings

and for Windows filesystem names:

		file("\\dm\\include\\stdio.h")	// regular strings
		file(r"\dm\include\stdio.h")	// WYSIWYG strings
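
As a small check that the two spellings above denote the same strings, and that escape sequences are left alone inside an r"" literal:

	void main()
	{
	    assert("y\\B\\w" == r"y\B\w");
	    assert("\\dm\\include\\stdio.h" == r"\dm\include\stdio.h");

	    assert("a\tb".length == 3);		// \t is one tab character
	    assert(r"a\tb".length == 4);	// backslash and t are kept as written
	}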

X strings

Hex dumps often come in the form of:

		00 0A E3 DC 86 73 7E 7E

Putting them into a form acceptable to C:

		0x00, 0x0A, 0xE3, 0xDC, 0x86, 0x73, 0x7E, 0x7E,

or:

		"\x00\x0A\xE3\xDC\x86\x73\x7E\x7E"

This can get tedious and error prone when there's a lot of it. D has the x string, where hex data can be simply wrapped with double quotes, leaving the whitespace intact:

		x"00 0A E3 DC 86 73 7E 7E"

Debug conditionals

Most non-trivial applications can be built with a 'debug' build and a 'release' build. The debug build often adds in extra code for printing things, extra checks, etc. The debug conditional is a simple way to turn these extra statements and code on and off:

	    debug (FooLogging) printf("checking foo\n");

means if the 'FooLogging' debug version is being built, compile in the printf statement.

Versioning

It's a rare application that doesn't have some ability to generate multiple versions. But since D has eliminated the preprocessor, where #if is used to generate multiple versions, it is replaced by the version statement.

	    version (Windows)
	    {
		... windows version ...
	    }
	    else version (linux)
	    {
		... linux version ...
	    }
	    else
	    {
		static assert(0);	// unsupported system
	    }

Deprecation

Library routines in active use inevitably evolve, and inevitably some of the routines will become obsolete. But existing code may still rely on them. This is a constant problem. D offers a 'deprecated' attribute for declarations:

		deprecated int foo() { ... }

If foo() is referenced in the code, the compiler can (optionally) diagnose it as an error. This makes it easy for program maintainers to purge code of obsolete and deprecated dependencies without requiring tedious manual inspection.

Deprecated is an ideal tool for library vendors to use to mark functions that are obsolete and may be removed in future versions.

Switch strings

Switch statements are extended to be able to select a case based on string contents:

	    int main(char[][] args)
	    {
		foreach (char[] arg; args)
		{   switch (arg)
		    {	case "-h": printHelpMessage(); break;
			case "-x": setXOption(); break;
			default: printf("Unrecognized option %.*s\n", arg); break;
		    }
		}
		...
	    }

The string values can also be string constants, but like integer cases, they cannot be string variables.

Module initializers

Module initializers are code that needs to get executed before control is passed to main(). They are like static constructors, except at the module level:

	    module foo;

	    static this()
	    {
		... initialization code ...
	    }

All the module initializers are collected together by the runtime and executed upon program startup. The order they are run is controlled by how modules import other modules - the module initializers of all the imported modules must be completed before the importing module initializer can be run. The runtime detects any cycles in this and will abort if the rule cannot be followed.
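
A typical use is filling in a lookup table before anything else in the program needs it. In this sketch the table name squares is hypothetical.

	int[] squares;		// module level table, filled in before main() runs

	static this()
	{
	    squares = new int[10];
	    for (int i = 0; i < squares.length; i++)
		squares[i] = i * i;
	}

	void main()
	{
	    // the initializer has already completed by the time main() starts
	    assert(squares.length == 10);
	    assert(squares[7] == 49);
	}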

Static asserts

Static asserts are like regular asserts, except that they are evaluated at compile time. If they fail, then the compilation is stopped with an error:

	    T x;
	    ...
	    static assert(x.size == (int*).size);

Default initializers

All local variables are initialized to their default values if an initializer is not provided:

	    int x = 3;	// x is initialized to 3
	    int y;	// y is initialized to y.init, which is 0
	    double d;	// d is initialized to d.init, which is NAN

Typedefs can have their own unique default initializer:

	    typedef int T = 4;
	    T t;	// t is initialized to T.init, which is 4

Even class and struct members are initialized to their defaults. No more bugs with forgetting to add an initializer to the constructor.

	    class Foo
	    {	int x;
		int y = 4;
		int z;
		this() { z = 5; }
	    }

	    Foo f = new Foo();
	    printf("%d, %d, %d\n", f.x, f.y, f.z); // prints 0, 4, 5

Synchronized

Since multiprocessor and multithreaded computing environments are becoming ubiquitous, improved support for multithreading in the language is helpful. D offers synchronized methods, synchronization on a per object basis, and synchronized critical sections as language primitives:

	    synchronized
	    {
		...	// critical section - only one thread at a time here
	    }

	    synchronized (o)
	    {
		...	// only one thread using Object o at a time here
	    }

	    class C
	    {
		// only one thread can use this.foo() at a time
		synchronized int foo();

		// only one thread at a time can execute bar()
	        synchronized static int bar();
	    }

Nested comments

Ever want to comment out a block of code regardless of whether it contains comments or not? Comments that can nest can do it. They are delineated by /+ +/.
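
For instance, in this sketch (the function name answer is hypothetical) the disabled block contains both an ordinary comment and another nesting comment, and the outer /+ +/ still encloses all of it:

	int answer()
	{
	    /+ disabled block
		/* an ordinary comment inside */
		return 0;	/+ and a nested one +/
	    +/
	    return 42;
	}

	void main()
	{
	    assert(answer() == 42);
	}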

Const means constant

Const is not a type modifier in D, it is a storage class. Hence, the value of a const cannot change. A const declaration can be put in read-only storage, and the optimizer can assume its value never changes. This is unlike C/C++, where since const is a type modifier, the value of a reference to a const can legally change.

Advanced Features

Operator overloading

Operator overloading in D is based on the idea of enabling the use of operators with their ordinary meaning on user defined types. To that end, operator overloads keep the same semantics of the operator on built-in types. For example, the operator overload for + retains its commutativity; (a + b) is the same as (b + a). This means that only one operator overload can be used for both (a + b) and (b + a), if a and b are different types. For another example, operator overload opCmp() takes care of (a<b), (b<a), (a<=b), (b<=a), (a>b), (b>a), (a>=b), (b>=a). Similarly, opEquals() takes care of (a==b), (b==a), (a!=b), and (b!=a). Hence, far less code needs to be written to create a user defined arithmetic type, with the added appeal of not having to worry that >= may be overloaded in a bizarrely different way than <.

Operator overloads for non-commutative operators like / have a special 'reverse' operator overload. This means that the C++ asymmetry of having member forward overloads and global reverse overloads is not necessary. Global operator overloads are not necessary, and eliminating them removes the requirement for ADL. Not having ADL implies that templates can be in imported modules without having complicated symbol lookup semantics.
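
As a sketch of one opCmp() serving all of the relational operators, consider a hypothetical Release struct ordered by major and then minor number. The names Release and makeRelease, and the exact opCmp signature used here, are assumptions for illustration.

	struct Release
	{
	    int major, minor;

	    // negative, zero, or positive, in the manner of C's strcmp
	    int opCmp(Release rhs)
	    {
		if (major != rhs.major)
		    return major < rhs.major ? -1 : 1;
		if (minor != rhs.minor)
		    return minor < rhs.minor ? -1 : 1;
		return 0;
	    }
	}

	Release makeRelease(int major, int minor)
	{
	    Release r;
	    r.major = major;
	    r.minor = minor;
	    return r;
	}

	void main()
	{
	    Release a = makeRelease(1, 2);
	    Release b = makeRelease(1, 10);

	    // all four operators are routed through the single opCmp above
	    assert(a < b);
	    assert(b > a);
	    assert(a <= b);
	    assert(!(a >= b));
	}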

Foreach iteration

Foreach is a generalized way to access each element of a collection. Many languages implement some form of foreach, falling into two categories: the first requires some form of linearized access to the collection, the second relies on two coroutines. D takes a unique third route. The elements of a collection can be accessed in a sequence defined by an opApply function in the class; no need for linearization, creation of specialized iterators, or the problems of coroutines.

Furthermore, the body of the foreach does not have to be a separate function. It can be a collection of arbitrary statements just like the body of a standard for loop or a while loop. It looks like:

	foreach (T v; collection)
	{
	    ... statements ...
	}

where T is the type of the elements and collection is the object being iterated over. The foreach body is executed once for each element of the collection, with each element assigned in turn to v.
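
To show how a class defines its own iteration sequence, here is a sketch of a hypothetical Countdown class with an opApply function. The compiler hands the foreach body to opApply as the delegate dg. The by-reference parameter is written ref, the spelling accepted by current D compilers; compilers contemporary with this article spelled it inout.

	class Countdown
	{
	    int start;

	    this(int start) { this.start = start; }

	    // calls the foreach body (dg) once per element; a nonzero result
	    // from dg means the body executed a break or return
	    int opApply(int delegate(ref int) dg)
	    {
		for (int i = start; i > 0; i--)
		{   int v = i;		// pass a copy so the body cannot alter i
		    int result = dg(v);
		    if (result)
			return result;
		}
		return 0;
	    }
	}

	int sumDown(int start)
	{
	    int total = 0;
	    foreach (int v; new Countdown(start))
		total += v;		// ordinary statements, not a separate function
	    return total;
	}

	void main()
	{
	    assert(sumDown(4) == 10);	// 4 + 3 + 2 + 1
	}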

Templates

Generic programming is a huge advance in programming methodology, and much, if not nearly all of it, was pioneered by C++. Some simple generic programming has filtered out into other languages, but taking its cue from C++, D fully embraces the trail blazed by its older brother.

In particular, D supports class templates, function templates, partial specialization, explicit specialization, and partial ordering. Template parameters can be types, values, or other templates.

But it isn't just a boring clone of C++ templates. D corrects some of the root problems and extends template technology in several directions.

Template problems corrected:

Angle brackets are not used to enclose template arguments:

		Foo<int>	// C++ template instantiation
		Foo!(int)	// D template instantiation

The angle brackets cause much internal grief in the lexical, syntactic, and semantic phases of compilation because of the ambiguity with the less than, greater than, and right shift operators. Using an unambiguous syntax eliminates a raft of problems, special case rules, conflicting interpretations of those rules, incompatible extensions, and plain old bugs. It is no longer necessary to insert spaces, typename and template keywords in strategic spots to get the parser to parse it correctly.

Since D has a module system rather than #include, separate compilation of templates follows naturally from the symbol table rules of imports. There is no need for an export keyword or any of the grief trying to implement it. It comes for free.

Another aspect of the module system is that there are no template declarations, there are only template definitions. Just import the module containing the template definition needed, and the language takes care of the rest.

D templates do not recognize a difference between point of instantiation and point of definition when dealing with forward references. All forward references are visible.

Closely related templates, rather than being defined separately with no obvious connection between them, can be declared as one template with its own scope:

		template Foo(T)
		{
		    struct bar { T t; }

		    void func(T t) { ... }
		}
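
A complete, if contrived, sketch of a template used as a scope for related declarations follows. The names Pair, Both, and larger are hypothetical; both members are reached through the same instantiation, Pair!(int).

	template Pair(T)
	{
	    struct Both { T first; T second; }

	    T larger(Both p)
	    {
		return p.first > p.second ? p.first : p.second;
	    }
	}

	void main()
	{
	    Pair!(int).Both p;
	    p.first = 3;
	    p.second = 7;
	    assert(Pair!(int).larger(p) == 7);

	    Pair!(double).Both q;	// a second, independent instance
	    q.first = 2.5;
	    q.second = 1.5;
	    assert(Pair!(double).larger(q) == 2.5);
	}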

There is no need to provide a declaration of a member template within a class, and then provide the definition of it outside the class, and its associated complex and tedious syntax. Member templates are always defined in place.

Template Extensions:

A template can be its own scope, with all declarations within that scope being 'templated'. This means that anything that can be declared can be templated, not just classes and functions but typedefs and enums.

Class templates (in fact, all templates) are overloadable.

The much discussed 'typeof' is supported as a standard part of the language. Similar facilities for type detection and manipulation are first class aspects of D, rather than needing macros or trait templates. For example, common properties of types can be accessed directly:

		T t;
		t.max;	// maximum value
		t.min;	// minimum value
		t.init;	// default initializer
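
With a concrete type substituted for T, these facilities can be exercised as in this sketch:

	void main()
	{
	    int i;
	    typeof(i) j;			// j is declared with the type of i
	    assert(j == int.init);		// and default initialized like any int

	    assert(int.max == 2147483647);
	    assert(int.min == -int.max - 1);
	    assert(ubyte.min == 0 && ubyte.max == 255);
	}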

Any non-local symbol can be passed as a template parameter. This includes templates, specific template instances, module names, typedefs, aliases, functions, etc.

Inline assembler

The ultimate in performance programming is only achievable with inline assembler, so it's fitting that D rounds out its support for bare metal programming with a standardized inline assembler. Inline assembly also provides access to specialized CPU instructions like LOCK, makes it easy to do multi-precision arithmetic that requires access to the carry and overflow flags, etc.

Standardizing the inline assembler means that one's inline assembler code will be portable from compiler to compiler (as long as it's on the same CPU). This is quite unlike the C/C++ world, where every compiler vendor extends the language with an inline assembler incompatible with all the others.

The D runtime library proves the worth of a standardized inline assembler by implementing many routines in inline assembly; the code is identical between the Windows and Linux versions.

Advantages of D

Simpler and faster to learn

Refactoring C and C++ enables D to offer the equivalent power while eliminating a great many of the special case rules and awkward syntax required for legacy code compatibility. D offers much more available power built in to the core language, while being a far less complicated language. This makes for short learning curves, and quicker mastery of the language.

There's no need for multiple warning levels in D, a "D Puzzle Book", a lint program, a "Bug Of The Month", or a world's leading expert on preprocessor token concatenation.

Retains investment in existing C code

Rather than trying to compile existing C code, which would require D to carry on with all the legacy decisions of C, instead D assumes existing C code will be compiled with a C compiler, and connects directly to it at the object code level. D supports all C types (more than C++ does!), common calling conventions (including printf's), and runtime library functions, so C code need never realize it is interfacing to D code.

There's no need to convert C code to D; new functionality can be written in D and linked in to the existing C object code. Indeed, some parts of the standard library - e.g. std.zlib, std.recls - are widely used C and C++ libraries that have been incorporated with no changes, only D declaration files were created for the libraries' C interfaces.

More portable

D locks down many undefined or implementation defined behaviors of C and C++, thereby making code more predictable and portable. Some of these are:

char is unsigned. With C/C++, it is hard to verify that code using chars is portable without actually trying it on one compiler where chars are signed and another where they are unsigned. D doesn't have that problem; chars are unsigned.
Integral types are fixed sizes. No more need for those endless typedef'ing schemes one sees in C/C++ code to get a fixed integral type size.
Floating point is IEEE 754. This means that NaN's and infinities work, rounding is well behaved, and comparisons when one operand is a NaN are handled properly.
Order of module initialization is specified.
Source text is Unicode, not some unspecified multibyte encoding.
wchar is a 2 byte UTF-16 character, not 2 bytes on one machine and 4 bytes on another.
It's far easier to write a compiler for D, meaning that there will be better conformance of various D implementations to the D standard.
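
Several of the guarantees listed above can be written down as checks that hold on every platform. This is only a sketch; .sizeof is assumed here as the spelling of the size property.

	void main()
	{
	    // integral and character types have fixed sizes
	    static assert(byte.sizeof == 1 && short.sizeof == 2);
	    static assert(int.sizeof == 4 && long.sizeof == 8);
	    static assert(wchar.sizeof == 2 && dchar.sizeof == 4);

	    // char is unsigned
	    assert(char.min == 0 && char.max == 255);

	    // IEEE 754 behavior of NaNs and infinities
	    double nan = double.nan;
	    assert(nan != nan);			// a NaN is unequal even to itself
	    assert(!(nan < 1.0) && !(nan >= 1.0));
	    assert(double.infinity > double.max);
	}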

More robust

With the ever increasing size of programs, and ever increasing cost of bugs in shipping code, anything that can be done to improve program robustness before it ships will pay big dividends.

D starts with ensuring that no variables or object members get left uninitialized - a common source of random bugs in C/C++. It follows up with the replacement of arbitrary pointers with arrays and array bounds checking. (No more buffer overflows overwriting the stack.)

Next, and most important, are Contract Programming and unit testing. Let's face it, test and verification of code is often clumsily done at the last minute, or not done at all. (How many times have you seen the source to a shipping program with no tests or verification code at all?)

The answer isn't to force programmers to write test and verification code, it's to make it easy to do it and manage it. Easy enough to tip the balance and make adding such code as much a matter of course as adding comments. Having the test and verify code right there in the source along with the algorithm code also brings to bear peer pressure and management pressure to add it in. Having test (unit test) and verify (Contract Programming) code in with the algorithm will become as normal and expected as adding in explanatory comments.

With the difference that the test and verify code actually runs.
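As an illustration, here is a minimal sketch of a function that carries its own contracts and unit test. It assumes a current D compiler, where the function body after the contracts is introduced by the keyword do; the function maxOf is hypothetical and not taken from any library.

```d
import std.stdio;

/// Hypothetical example: returns the largest element of a non-empty array.
int maxOf(const int[] values)
in
{
    // Precondition: the caller must supply at least one element.
    assert(values.length > 0, "maxOf needs at least one element");
}
out (result)
{
    // Postcondition: no element exceeds the returned value.
    foreach (v; values)
        assert(result >= v);
}
do
{
    int best = values[0];
    foreach (v; values[1 .. $])
        if (v > best)
            best = v;
    return best;
}

// Compiled in and run before main when the program is built with -unittest.
unittest
{
    assert(maxOf([7]) == 7);
    assert(maxOf([3, 9, 4]) == 9);
    assert(maxOf([-5, -2, -8]) == -2);
}

void main()
{
    writeln(maxOf([3, 9, 4]));
}
```

The contracts and the test sit next to the algorithm they describe, so a reader sees the intended behavior and the implementation in one place.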

Pointers

While D supports generalized pointer operations, pointers are not necessary for nearly all routine programming tasks. They are replaced by function out and inout parameters, first class arrays, automatic memory management, and implicit class object dereferencing. While under the hood these constructs still use pointers, the notorious brittleness of pointer code isn't there anymore.
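The following sketch shows these replacements side by side. It assumes a current D compiler, in which the parameter storage class that this article calls inout is spelled ref; the names splitTime, addBonus, and Counter are hypothetical.

```d
import std.stdio;

/// Hypothetical helper: returns two results through out parameters
/// instead of through pointers.
void splitTime(int total, out int minutes, out int seconds)
{
    minutes = total / 60;
    seconds = total % 60;
}

/// A ref parameter lets the callee update the caller's variable in place.
void addBonus(ref int score, int bonus)
{
    score += bonus;
}

class Counter
{
    int value;
    void increment() { ++value; }
}

void main()
{
    int m, s;
    splitTime(125, m, s);           // no address-of operator at the call site
    assert(m == 2 && s == 5);

    int score = 10;
    addBonus(score, 5);
    assert(score == 15);

    auto c = new Counter;           // class objects are handled by reference
    auto same = c;                  // copies the reference, not the object
    c.increment();                  // members are reached with '.', not '->'
    same.increment();
    assert(c.value == 2);

    int[] a = [1, 2, 3, 4, 5];
    int[] middle = a[1 .. 4];       // a slice carries its own length
    assert(middle.length == 3 && middle[0] == 2);
    writeln("no explicit pointers were needed");
}
```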

Better code generation

D compilers can potentially generate better code than equivalent C/C++ code because:

Compiler has semantic access to modules and potentially the entire program, rather than being restricted to a single source file and its #include'd headers.
Compiler can look at forward references. For example, a function appearing lexically after its use can still be inlined. Any function is potentially inlineable - it doesn't need to be declared as inline.
Higher level constructs that enable better optimization with simpler compilers. For example, D's foreach is built into the language, making it simple for the compiler to emit an optimized loop traversal. C and C++ compilers, on the other hand, need to do some fairly advanced analysis to detect loops, determine the loop counters, etc.
Contracts (from asserts, Contract Programming) must evaluate to true, so optimizers can mine them for more information about data. Other language guarantees about the state of data, such as array bounds checking, can be used by the optimizer.
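Two of the points above, forward references and foreach, can be seen in a few lines. This is a sketch assuming a current D compiler, and the function names are hypothetical. Whether a particular compiler actually inlines square or optimizes the loop is up to that compiler; the code shows only that the language permits it.

```d
import std.stdio;

void main()
{
    int[] values = [1, 2, 3, 4];
    // sumOfSquares and square are defined further down in the file;
    // no prototype or forward declaration is needed.
    writeln(sumOfSquares(values));
}

/// Hypothetical helper: adds up the squares of all elements.
int sumOfSquares(const int[] values)
{
    int total = 0;
    // foreach states the traversal directly: there is no loop counter
    // or end condition for the compiler to reverse-engineer.
    foreach (v; values)
        total += square(v);
    return total;
}

/// Small function that is a candidate for inlining without being marked inline.
int square(int x)
{
    return x * x;
}
```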

Straightforward symbol table

This is one of the indirect advantages of D. Having a simple, straightforward symbol lookup scheme results in better, more accurate, and more quickly produced compilers. It also makes new features easier to add. Implementing a correct D compiler is not meant to be a test of programming virtuosity. Having correct and reliable compilers quickly put in the hands of programmers is to the benefit of both the compiler vendors and the programmers.

Open source reference implementation

The source code to the D front end implementation is available under both the GPL and Artistic License.

Example

This D program reads a text file, and counts the number of occurrences of each word. It illustrates some features of D:

Its close look-and-feel similarity to C.
Use of imports rather than #include.
New declaration syntax.
Default initialization.
Ability to seamlessly call C functions such as printf.
Use of an associative array (dictionary) as a simple symbol table.
Using foreach to iterate through different kinds of collections.
Use of arrays.
Use of array slicing (args[1 .. args.length]).

	import std.c.stdio;
	import std.file;

	int main (char[][] args)
	{
	    int w_total;
	    int l_total;
	    int c_total;
	    int[char[]] dictionary;

	    printf("   lines   words   bytes file\n");
	    foreach (char[] arg; args[1 .. args.length])
	    {
		char[] input;
		int w_cnt, l_cnt, c_cnt;
		int inword;
		int wstart;

		input = cast(char[])std.file.read(arg);

		for (int j = 0; j < input.length; j++)
		{   char c;

		    c = input[j];
		    if (c == '\n')
			++l_cnt;
		    if (c >= '0' && c <= '9')
		    {
			// A digit may continue a word but never starts one.
		    }
		    else if (c >= 'a' && c <= 'z' ||
			c >= 'A' && c <= 'Z')
		    {
			if (!inword)
			{
			    wstart = j;
			    inword = 1;
			    ++w_cnt;
			}
		    }
		    else if (inword)
		    {   char[] word = input[wstart .. j];

			dictionary[word]++;
			inword = 0;
		    }
		    ++c_cnt;
		}
		if (inword)
		{   char[] w = input[wstart .. input.length];
		    dictionary[w]++;
		}
		printf("%8d%8d%8d %.*s\n", l_cnt, w_cnt, c_cnt, arg);
		l_total += l_cnt;
		w_total += w_cnt;
		c_total += c_cnt;
	    }

	    if (args.length > 2)
	    {
		printf("--------------------------------------\n%8d%8d%8d total\n",
		    l_total, w_total, c_total);
	    }

	    printf("--------------------------------------\n");

	    foreach (char[] word1; dictionary.keys.sort)
	    {
		printf("%3d %.*s\n", dictionary[word1], word1);
	    }
	    return 0;
	}

D Community

There is a large and active D community. Discussion groups are at news.digitalmars.com. Many open source D projects are linked to from www.digitalmars.com/d/dlinks.html.

References

The D Programming Language specification: www.digitalmars.com/d/
D newsgroups: news.digitalmars.com
Design By Contract: "Object-Oriented Software Construction" by Bertrand Meyer
C++ Boost: www.boost.org

Acknowledgements

The following people are just a few of the many who have contributed to the D language project with ideas, code, expertise, inspiration and moral support:

Bruce Eckel, Eric Engstrom, Jan Knepper, Helmut Leitner, Lubomir Litchev, Pavel Minayev, Paul Nash, Pat Nelson, Burton Radons, Tim Rentsch, Fabio Riccardi, Bob Taniguchi, John Whited, Matthew Wilson, Peter Zatloukal