www.digitalmars.com

D Programming Language 1.0


Last update Sun Dec 30 20:34:43 2012

D 1.0 FAQ

The same questions keep cropping up, so the obvious thing to do is prepare a FAQ.

D 2.0 FAQ

C++ FAQ


Why the name D?

The original name was the Mars Programming Language. But my friends kept calling it D, and I found myself starting to call it D. The idea of D being a successor to C goes back at least as far as 1988, as in this thread.


Where can I get a D compiler?

Right here.


Is there a linux port of D?

Yes, the D compiler includes a linux version.


Is there a GNU version of D?

Yes, David Friedman has integrated the D frontend with GCC.


How do I write my own D compiler for CPU X?

Burton Radons has written a back end that you can use as a guide.


Where can I get a GUI library for D?

Since D can call C functions, any GUI library with a C interface is accessible from D. Various D GUI libraries and ports can be found at AvailableGuiLibraries.


Where can I get an IDE for D?

Try Elephant, Poseidon, or LEDS.


What about templates?

D now supports advanced templates.


Why emphasize implementation ease?

Isn't ease of use for the user of the language more important? Yes, it is. But a vaporware language is useless to everyone. The easier a language is to implement, the more robust implementations there will be. In C's heyday, there were 30 different commercial C compilers for the IBM PC. Not many made the transition to C++. In looking at the C++ compilers on the market today, how many years of development went into each? At least 10 years? Programmers waited years for the various pieces of C++ to get implemented after they were specified. If C++ were not so enormously popular, it's doubtful that very complex features like multiple inheritance, templates, etc., would ever have been implemented.

I suggest that if a language is easier to implement, then it is likely also easier to understand. Isn't it better to spend time learning to write better programs than language arcana? If a language can capture 90% of the power of C++ with 10% of its complexity, I argue that is a worthwhile tradeoff.


Why is printf in D?

printf is not part of D; it is part of C's standard runtime library, which is accessible from D. D's standard runtime library has std.stdio.writefln, which is as powerful as printf but much easier to use.


Will D be open source?

The front end for D is open source, and the source comes with the compiler. The runtime library is completely open source. David Friedman has integrated the D frontend with GCC to create gdc, a completely open source implementation of D.


Why fall through on switch statements?

Many people have asked for a requirement that there be a break between cases in a switch statement, arguing that C's behavior of silently falling through is the cause of many bugs.

D doesn't change this for the same reason that the integral promotion rules and operator precedence rules were kept the same: to make code that looks the same as in C operate the same. If it had subtly different semantics, it would cause frustratingly subtle bugs.


Why should I use D instead of Java?

D is distinct from Java in purpose, philosophy and reality. See this comparison.

Java is designed to be write once, run everywhere. D is designed for writing efficient native system apps. Although D and Java share the notion that garbage collection is good and multiple inheritance is bad <g>, their different design goals mean the languages have very different feels.


Doesn't C++ support strings, etc. with STL?

In the C++ standard library are mechanisms for doing strings, dynamic arrays, associative arrays, bounds checked arrays, and complex numbers.

Sure, all this stuff can be done with libraries, following certain coding disciplines, etc. But you can also do object oriented programming in C (I've seen it done). Isn't it incongruous that something like strings, supported by the simplest BASIC interpreter, requires a very large and complicated infrastructure to support? Just the implementation of a string type in STL is over two thousand lines of code, using every advanced feature of templates. How much confidence can you have that this is all working correctly, how do you fix it if it is not, what do you do with the notoriously inscrutable error messages when there's an error using it, how can you be sure you are using it correctly (so there are no memory leaks, etc.)?

D's implementation of strings is simple and straightforward. There's little doubt about how to use it, no worries about memory leaks, error messages are to the point, and it isn't hard to see if it is working as expected or not.


Can't garbage collection be done in C++ with an add-on library?

Yes, I use one myself. It isn't part of the language, though, and requires some subverting of the language to make it work. Using gc with C++ isn't for the standard or casual C++ programmer. Building it into the language, like in D, makes it practical for everyday programming chores.

GC isn't that hard to implement, either, unless you're building one of the more advanced ones. But a more advanced one is like building a better optimizer - the language still works 100% correctly even with a simple, basic one. The programming community is better served by multiple implementations competing on quality of code generated rather than by which corners of the spec are implemented at all.


Can't unit testing be done in C++ with an add-on library?

Sure. Try one out and then compare it with how D does it. It'll be quickly obvious what an improvement building it into the language is.
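
As a rough illustration of the built-in approach, here is a minimal sketch. The function square is hypothetical and exists only for this example; the unittest block sits next to the code it checks and runs before main() when the program is compiled with dmd's -unittest switch.

```d
// Hypothetical function under test; the name is illustrative only.
int square(int x)
{
    return x * x;
}

// Compiled in and run before main() only when unit tests are enabled
// (for example with dmd's -unittest switch).
unittest
{
    assert(square(0) == 0);
    assert(square(3) == 9);
    assert(square(-4) == 16);
}

void main()
{
}
```

No test framework, registration macros, or separate test runner are involved; the tests are simply part of the source file.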

Why have an asm statement in a portable language?

An asm statement allows assembly code to be inserted directly into a D function. Assembler code will obviously be inherently non-portable. D is intended, however, to be a useful language for developing systems apps. Systems apps almost invariably wind up with system dependent code in them anyway, so inline asm isn't much different. Inline asm will be useful for things like accessing special CPU instructions, accessing flag bits, special computational situations, and super optimizing a piece of code.

Before the C compiler had an inline assembler, I used external assemblers. There was constant grief because many, many different versions of the assembler were out there, the vendors kept changing the syntax of the assemblers, there were many different bugs in different versions, and even the command line syntax kept changing. What it all meant was that users could not reliably rebuild any code that needed assembler. An inline assembler provided reliability and consistency.


What is the point of 80 bit reals?

More precision enables more accurate floating point computations to be done, especially when adding together large numbers of small real numbers. Prof. Kahan, who designed the Intel floating point unit, has an eloquent paper on the subject.
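
To get a feel for the difference, the following sketch (written for a current D2 compiler, which is an assumption) prints the mantissa width of each floating point type and then adds the same small step a million times in double and in real. The helper accumulate is hypothetical. No output is shown because it depends on the target: on x86, real is the 80 bit type, while on some other platforms real may be no wider than double.

```d
import std.stdio : writefln;

// Hypothetical helper: adds `step` to a running total `count` times,
// doing all the arithmetic in type T.
T accumulate(T)(T step, size_t count)
{
    T total = 0;
    foreach (i; 0 .. count)
        total += step;
    return total;
}

void main()
{
    // mant_dig is the number of mantissa bits each type carries.
    writefln("double mantissa bits: %s", double.mant_dig);
    writefln("real   mantissa bits: %s", real.mant_dig);

    // The same sum of many small values, at two precisions.
    writefln("double sum: %.20g", accumulate!double(0.1, 1_000_000));
    writefln("real   sum: %.20g", accumulate!real(0.1L, 1_000_000));
}
```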

How do I do anonymous struct/unions in D?

import std.stdio;

struct Foo
{
    union { int a; int b; }
    struct { int c; int d; }
}

void main()
{
    writefln(
      "Foo.sizeof = %d, a.offset = %d, b.offset = %d, c.offset = %d, d.offset = %d",
      Foo.sizeof,
      Foo.a.offsetof,
      Foo.b.offsetof,
      Foo.c.offsetof,
      Foo.d.offsetof);
}

How do I get printf() to work with strings?

In C, the normal way to printf a string is to use the %s format:
char s[8];
strcpy(s, "foo");
printf("string = '%s'\n", s);
Attempting this in D, as in:
char[] s;
s = "foo";
printf("string = '%s'\n", s);
usually results in garbage being printed, or an access violation. The cause is that in C, strings are terminated by a 0 character, and the %s format prints until a 0 is encountered. In D, strings are not 0 terminated; the size is determined by a separate length value. So, strings are printf'd using the %.*s format:
char[] s;
s = "foo";
printf("string = '%.*s'\n", s);

which will behave as expected. Remember, though, that printf's %.*s will print until the length is reached or a 0 is encountered, so D strings with embedded 0's will only print up to the first 0.

Of course, the easier solution is to just use std.stdio.writefln, which works correctly with D strings.
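
For comparison, here is a minimal sketch of both approaches. It is written for a current D2 compiler, which is an assumption: the module names differ from D 1.0, and a D2 compiler is assumed to want the length and pointer passed to printf explicitly rather than the array itself. The helper describe is hypothetical and only there to make the formatted result easy to check.

```d
import core.stdc.stdio : printf;
import std.format : format;
import std.stdio : writefln;

// Hypothetical helper: builds the same text that the calls below print.
string describe(const(char)[] s)
{
    return format("string = '%s'", s);
}

void main()
{
    string s = "foo";

    // writefln takes the D string as is; no 0 terminator is needed.
    writefln("string = '%s'", s);

    // With printf, %.*s consumes an int length followed by a char pointer.
    printf("string = '%.*s'\n", cast(int) s.length, s.ptr);

    assert(describe(s) == "string = 'foo'");
}
```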


Why are floating point values default initialized to NaN rather than 0?

A floating point value, if no explicit initializer is given, is initialized to NaN (Not A Number):
double d;	// d is set to double.nan
NaNs have the interesting property that whenever a NaN is used as an operand in a computation, the result is a NaN. Therefore, NaNs will propagate and appear in the output whenever a computation made use of one. This implies that a NaN appearing in the output is an unambiguous indication of the use of an uninitialized variable.

If 0.0 were used as the default initializer for floating point values, its effect could easily go unnoticed in the output, and so if the default initializer was unintended, the bug may go unrecognized.

The default initializer value is not meant to be a useful value, it is meant to expose bugs. NaN fills that role well.

But surely the compiler can detect and issue an error message for variables used that are not initialized? Most of the time, it can, but not always, and what it can do is dependent on the sophistication of the compiler's internal data flow analysis. Hence, relying on such is unportable and unreliable.

Because of the way CPUs are designed, there is no NaN value for integers, so D uses 0 instead. It doesn't have the advantages of error detection that NaN has, but at least errors resulting from unintended default initializations will be consistent and therefore more debuggable.
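
Here is a small sketch of how this plays out, written for a current D2 compiler (an assumption; std.math.isNaN is the D2 spelling). The function buggyAverage is hypothetical and deliberately forgets to initialize its accumulator.

```d
import std.math : isNaN;
import std.stdio : writeln;

// Hypothetical function with a deliberate bug: sum has no initializer,
// so it starts out as double.nan rather than 0.0.
double buggyAverage(double[] values)
{
    double sum;
    foreach (v; values)
        sum += v;           // NaN + anything is still NaN
    return sum / values.length;
}

void main()
{
    double result = buggyAverage([1.0, 2.0, 3.0]);

    // The NaN has propagated all the way to the result.
    writeln("result is NaN: ", isNaN(result));

    // Integers have no NaN, so they default to 0.
    int count;
    assert(count == 0);
}
```

Had sum started at 0.0, the missing initializer would have gone unnoticed here; with NaN, the bad value shows up in whatever the program prints.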


Why is overloading of the assignment operator not supported?

Overloading of the assignment operator for structs is supported in D 2.0.


The '~' is not on my keyboard?

On PC keyboards, hold down the [Alt] key and press the 1, 2, and 6 keys in sequence on the numeric pad. That will generate a '~' character.


Can I link in C object files created with another compiler?

DMD produces OMF (Microsoft Object Module Format) object files while other compilers such as VC++ produce COFF object files. DMD's output is designed to work with DMC, the Digital Mars C compiler, which also produces object files in OMF format.

The OMF format that DMD uses is a Microsoft defined format based on an earlier Intel designed one. Microsoft at one point decided to abandon it in favor of a Microsoft defined variant on COFF.

Using the same object format doesn't mean that any C library in that format will successfully link and run. There is a lot more compatibility required - such as calling conventions, name mangling, compiler helper functions, and hidden assumptions about the way things work. If DMD produced Microsoft COFF output files, there is still little chance that they would work successfully with object files designed and tested for use with VC. There were a lot of problems with this back when Microsoft's compilers did generate OMF.

Having a different object file format helps identify library files that were not tested to work with DMD. If they were not tested, weird problems would result even if one successfully managed to link them together. It really takes an expert to get a binary built with a compiler from one vendor to work with the output of another vendor's compiler.

That said, the linux version of DMD produces object files in the ELF format which is standard on linux, and it is specifically designed to work with the standard linux C compiler, gcc.

There is one case where using existing C libraries does work - when those libraries come in the form of a DLL conforming to the usual C ABI interface. The linkable part of this is called an "import library", and Microsoft COFF format import libraries can be successfully converted to DMD OMF using the coff2omf tool.


Why not support regular expression literals with the /foo/g syntax?

There are two reasons:

  1. The /foo/g syntax would make it impossible to separate the lexer from the parser, as / is the divide token.
  2. There are already 3 string types; adding the regex literals would add 3 more. This would proliferate through much of the compiler, debugger info, and library, and is not worth it.

Why is the D front end written in C++ rather than D?

The front end is in C++ in order to interface to the existing gcc and dmd back ends. It's also meant to be easily interfaced to other existing back ends, which are likely written in C++. The D implementation of DMDScript, which performs better than the C++ version, shows that there is no problem writing a professional quality compiler in 100% D.


Why aren't all Digital Mars programs translated to D?

There is little benefit to translating a complex, debugged, working application from one language to another. But new Digital Mars apps are implemented in D.


When should I use a foreach loop rather than a for?

Is it just performance or readability?

By using foreach, you are letting the compiler decide on the optimization rather than worrying about it yourself. For example - are pointers or indices better? Should I cache the termination condition or not? Should I rotate the loop or not? The answers to these questions are not easy, and can vary from machine to machine. Like register assignment, let the compiler do the optimization.

for (int i = 0; i < foo.length; i++)
or:
for (int i = 0; i < foo.length; ++i)
or:
for (T* p = &foo[0]; p < &foo[length]; p++)
or:
T* pend = &foo[length];
for (T* p = &foo[0]; p < pend; ++p)
or:
T* pend = &foo[length];
T* p = &foo[0];
if (p < pend)
{
	do
	{
	...
	} while (++p < pend);
}
and, of course, should I use size_t or int?
for (size_t i = 0; i < foo.length; i++)
Let the compiler pick!
foreach (v; foo)
	...

Note that we don't even need to know what the type T needs to be, thus avoiding bugs when T changes. I don't even have to know if foo is an array, or an associative array, or a struct, or a collection class. This will also avoid the common fencepost bug:

for (int i = 0; i <= foo.length; i++)

And it also avoids the need to manually create a temporary if foo is a function call.

The only reason to use a for loop is if your loop does not fit in the conventional form, like if you want to change the termination condition on the fly.
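
To illustrate the point about not needing to know what foo is, here is a minimal sketch for a current D2 compiler (an assumption). The template function total is hypothetical; the same foreach body is used unchanged for an array and for an associative array.

```d
import std.stdio : writeln;

// Hypothetical helper: sums the elements of anything foreach can walk.
// Neither the container type nor the loop mechanics are spelled out.
int total(C)(C foo)
{
    int sum = 0;
    foreach (v; foo)
        sum += v;
    return sum;
}

void main()
{
    int[] arr = [1, 2, 3];
    int[string] table = ["a": 10, "b": 20];

    writeln(total(arr));    // iterates the array elements
    writeln(total(table));  // iterates the associative array's values
}
```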


Why doesn't D have an interface to C++ as well as C?

Attempting to have D interface with C++ is nearly as complicated as writing a C++ compiler, which would destroy the goal of having D be a reasonably easy language to implement. People with an existing C++ code base that they must work with are stuck with C++ (they can't move it to any other language, either).

There are many issues that would have to be resolved in order for D code to call some arbitrary C++ code that is presumed to be unmodifiable. This list certainly isn't complete; it's just meant to show the scope of the difficulties involved.

  1. D source code is unicode, C++'s is ASCII with code pages. Or not. It's unspecified. This impacts the contents of string literals.
  2. std::string cannot deal with multibyte UTF.
  3. C++ has a tag name space. D does not. Some sort of renaming would have to happen.
  4. C++ code often relies on compiler specific extensions.
  5. C++ has namespaces. D has modules. There is no obvious mapping between the two.
  6. C++ views source code as one gigantic file (after preprocessing). D sees source code as a hierarchy of modules and packages.
  7. Enum name scoping rules behave differently.
  8. C++ code, despite decades of attempts to replace macro features with inbuilt ones, relies more heavily than ever on layer after layer of arbitrary macros. There is no D analog for token pasting or stringizing.
  9. Macro names have global scope across #include files, but are local to the gigantic source files.
  10. C++ has arbitrary multiple inheritance and virtual base classes. D does not.
  11. C++ does not distinguish between in, out and ref (i.e. inout) parameters.
  12. The C++ name mangling varies from compiler to compiler.
  13. C++ throws exceptions of arbitrary type, not just descendants of Object.
  14. C++ overloads based on const and volatile. D does not.
  15. C++ overloads operators in significantly different ways - for example, operator[]() overloading for lvalue and rvalue is based on const overloading and a proxy class.
  16. C++ overloads operators like < completely independently of >.
  17. C++ overloads indirection (operator*).
  18. C++ does not distinguish between a class and a struct object.
  19. The vtbl[] location and layout is different between C++ and D.
  20. The way RTTI is done is completely different. C++ has no classinfo.
  21. D does not allow overloading of assignment.
  22. D does not have constructors or destructors for struct objects.
  23. D does not have two phase lookup, nor does it have Koenig (ADL) lookup.
  24. C++ relates classes with the 'friend' system, D uses packages and modules.
  25. C++ class design tends to revolve around explicit memory allocation issues, D's do not.
  26. D's template system is very different.
  27. C++ has 'exception specifications'.
  28. C++ has global operator overloading.
  29. C++ name mangling depends on const and volatile being type modifiers. Since D does not have const and volatile type modifiers, there is no straightforward way to infer the C++ mangled identifier from a D type.

The bottom line is that language features affect the design of the code. C++ designs just don't fit with D. Even if you could find a way to automatically adapt between the two, the result would be about as enticing as the left side of a Honda welded to the right side of a Camaro.


Why doesn't D use reference counting for garbage collection?

Reference counting has its advantages, but some severe disadvantages:

The proposed C++ shared_ptr<>, which implements ref counting, suffers from all these faults. I haven't seen a heads up benchmark of shared_ptr<> vs mark/sweep, but I wouldn't be surprised if shared_ptr<> turned out to be a significant loser in terms of both performance and memory consumption.

That said, D may in the future optionally support some form of ref counting, as rc is better for managing scarce resources like file handles.


Isn't garbage collection slow and non-deterministic?

Yes, but all dynamic memory management is slow and non-deterministic, including malloc/free. If you talk to the people who actually do real time software, they don't use malloc/free precisely because they are not deterministic. They preallocate all data. However, the use of GC instead of malloc enables advanced language constructs (especially, more powerful array syntax), which greatly reduce the number of memory allocations which need to be made. This can mean that GC is actually faster than explicit management.
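
Array slicing is one plausible example of such a construct (that reading is an assumption, not something the answer above spells out). In this sketch for a current D2 compiler, the hypothetical function middle returns part of an array without allocating or copying anything. Because the GC keeps the underlying memory alive for as long as any slice refers to it, nobody has to decide who owns the data or make a defensive copy.

```d
// Hypothetical helper: returns everything but the first and last element.
// Assumes data holds at least two elements.
int[] middle(int[] data)
{
    return data[1 .. data.length - 1];  // a view into data, not a copy
}

void main()
{
    int[] data = [1, 2, 3, 4, 5];
    int[] mid = middle(data);

    // The slice shares storage with the original array.
    assert(mid.length == 3);
    assert(mid.ptr == data.ptr + 1);

    // Writing through the slice is visible in the original.
    mid[0] = 20;
    assert(data[1] == 20);
}
```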




