In A Module Far, Far Away Part 1

July 8, 2009

In working on the design of the D programming language, a topic that fascinates me is how the language can help a user find and eliminate programming bugs. The tug of war is between the programmer adding loads of explicit annotations that can be checked by the compiler, and the more productive method of throwing the code together and relying on runtime testing to sort out problems. If there’s too much of the former, the language gets characterized as bondage and discipline and will only be used if one’s contract requires it; if there’s too little, an awful lot of time is lost in debugging.

My job as a language designer is to find ways for the language to help without requiring B & D annotations, or at least to approach the sweet spot between those two extremes.

Here we’ll talk about what happens when there’s a declaration of X in module A, and X is used far, far away in module B, and then the declaration of X is changed. What happens in B? In order of preference, these can happen:

1. B adapts to the changes and works correctly
2. The compiler complains when compiling B that changes need to be made
3. The program fails in B at runtime with a reasonable message
4. The program crashes
5. The program launches nuclear missiles

The further up that hierarchy we can push things, the better off things are (better meaning higher productivity). Relying on manual code reviews isn’t a great solution, because A and B will likely be reviewed independently, missing the dependency between them. (I call this kind of issue a non-local bug, as opposed to a local bug which is entirely contained in one module.)

To illustrate with a simple C example, suppose X is an array declaration in module A:

float X[10];

and in module B we have the loop:

for (int i = 0; i < 10; i++) {
   float s = X[i];
   ...
}

Later, the declaration in A is changed to:

float X[5];

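and now B reads past the end of the array, which in C is undefined behavior; at best it fails at runtime. The solution is to compute the length from the declaration: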

for (int i = 0; i < sizeof(X)/sizeof(X[0]); i++) {
   float s = X[i];
   ...
}

and B seamlessly adapts to any changes in the length of the array, provided the full declaration of X, dimension included, is visible when compiling B. But now let’s change the element type of X:

double X[10];

In B, the declaration:

float s = X[i];

causes an implicit conversion from double to float, which may produce quite unintended results (after all, the type was changed from float to double for a reason). In C, we can deal with this with a typedef:

typedef float X_element_t;
X_element_t X[10];

for (int i = 0; i < sizeof(X)/sizeof(X[0]); i++) {
   X_element_t s = X[i];
   ...
}

This works, but it relies on the programmer having the discipline to use the convention consistently, and it’s extra work. The compiler really isn’t very helpful here. In D, we can write it as:

float[10] X;

for (int i = 0; i < X.length; i++) {
   auto s = X[i];
   ...
}

The auto declaration tells the compiler to infer the type from the type of the initializer expression. (The dimension of the X array is picked up by the convenient .length property.)
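The loop above can be fleshed out into a small runnable sketch. Both modules are collapsed into one file here, and the helper name countPositive and the initial value 2.5 are made up for illustration.

```d
import std.stdio;

// Stand-in for the declaration in module A. The initializer 2.5 is an
// arbitrary illustrative value copied into every element.
float[10] X = 2.5;

// Stand-in for code in module B (hypothetical helper): counts the
// positive elements without naming X's length or element type.
size_t countPositive()
{
    size_t n = 0;
    for (size_t i = 0; i < X.length; i++)
    {
        auto s = X[i];   // s takes whatever element type X has
        if (s > 0)
            n++;
    }
    return n;
}

void main()
{
    writeln(countPositive(), " of ", X.length, " elements are positive");
}
```

Changing the declaration to, say, double[5] X = 2.5; should require no edits in countPositive.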

So far, our module B is doing a lovely job of adapting to changes in the declaration of X. But wait! What if a function is called in module B?

void foo(float e) { ... }
...
foo(X[i]);

We’re back to that doggone conversion to float. Can we get the parameter type to be inferred from the argument type as well? Sure, make the type a parameter:

void foo(T)(T e) { ... }

Functions can have two parameter lists: the first lists the type parameters, which are inferred from the types of the arguments supplied to the parameters in the second list.
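To see the inference at work, here is a minimal sketch. The helper typeNameOf is hypothetical; it only reports which type the compiler picked for T at each call.

```d
import std.stdio;

// Hypothetical helper: T is inferred from the argument at each call,
// so no conversion to a hard-coded parameter type takes place.
string typeNameOf(T)(T e)
{
    return T.stringof;
}

void main()
{
    float[10]  a = 0;
    double[10] b = 0;
    writeln(typeNameOf(a[0]));   // T is float for this call
    writeln(typeNameOf(b[0]));   // T is double for this call
}
```

The call sites are identical apart from the argument, which is the point: if the element type of the array changes, the function follows along.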

Where else can we infer types (and hence adapt to changes) rather than specify them?

Wherever a type is needed, a type can be inferred from the type of any expression using the typeof construct:

short x,y;
typeof(x + y) f; // f is of type int

(Due to type conversions in expressions, the type of an addition may be different from the type of either of its operands.)
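As a sketch of how typeof helps with the far-away declaration, typeof(X[0]) can stand in for the X_element_t typedef from the C version, except that the compiler derives it from X itself. The helper largest and the initial value 1.25 are assumptions made for this example.

```d
import std.stdio;

// Stand-in for the declaration in module A (value is illustrative).
float[10] X = 1.25;

// Hypothetical helper in module B: finds the largest element.
// typeof(X[0]) follows the element type of X automatically.
typeof(X[0]) largest()
{
    typeof(X[0]) best = X[0];
    for (size_t i = 1; i < X.length; i++)
    {
        if (X[i] > best)
            best = X[i];
    }
    return best;
}

void main()
{
    short x = 30_000, y = 30_000;
    typeof(x + y) f = x + y;   // f is int, so the sum fits
    writeln(typeof(f).stringof, " ", f);
    writeln(typeof(largest()).stringof, " ", largest());
}
```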

The return type of a function can be accessed from within its body:

int foo()
{
    typeof(return) e;  // e is of type int
    return e;
}
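One place this is handy is an accumulator whose type should track the function's signature. The sketch below assumes a hypothetical helper named average; only typeof(return) is the feature being shown.

```d
import std.stdio;

// Hypothetical helper: the accumulator is declared with typeof(return),
// so changing the return type in the signature changes it as well.
double average(float[] values)
{
    typeof(return) total = 0;   // double, taken from the signature
    if (values.length == 0)
        return 0;
    for (size_t i = 0; i < values.length; i++)
        total += values[i];
    return total / values.length;
}

void main()
{
    float[4] data = [1.0f, 2.0f, 3.0f, 4.0f];
    writeln(average(data[]));
}
```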

Even the return type of a function can be inferred:

auto foo()
{
    return 3.0;  // return type of foo() is double
}
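Tied back to the array in module A, an inferred return type lets an accessor follow the element type of X. This is a sketch under assumptions: firstOfX is a made-up name, and both modules are collapsed into one file.

```d
import std.stdio;

// Stand-in for the declaration in module A (value is illustrative).
float[10] X = 0.5;

// Hypothetical accessor in module B: the return type is inferred from
// X[0], so it is float today and follows any change to X's element type.
auto firstOfX()
{
    return X[0];
}

void main()
{
    auto v = firstOfX();
    writeln(typeof(v).stringof, " ", v);
}
```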

These capabilities of D go a long way toward reducing the errors in module B if the declarations in A change. In part 2, we’ll look at some more features along these lines.

If you want to learn more about how real compilers work, I am hosting a seminar in the fall on compiler construction.
