www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Cool D project: Remove cycle detection code from runtime

For those who wish to have a deeper understanding of D's runtime and/or 
binary files, there is a project I think would be a nice fun challenge.

Currently, when D builds, it uses the linker to assemble a table of 
ModuleInfo structures, each of which define the pieces of the module 
that the runtime needs to be aware of. One of these things is the static 
constructors and destructors, both shared (run at startup) and unshared 
(run at thread start). Note that this is a simplification, I'm not 
actually sure how the compiler and linker assemble this stuff.

Because we don't have control over the linker, and indeed the compiler 
cannot tell which modules will be included in the final binary, we must 
run an interesting, yet annoying, check every time you run a D 
executable -- topological sorting of the module constructors.

The reason is simple, let's say you have 2 modules:

chicken.d:
import egg;

static int x;

static this()
{
    x = egg.x + 1;
}

egg.d:
import chicken;

static int x;
static this()
{
    x = chicken.x + 1;
}

Which x came first? The answer is easy -- it depends on the order the 
modules are linked (obviously)!

To avoid such a disastrous situation, the runtime can fetch dependency 
data out of the executable (stored by the compiler/linker), and do a 
proper determination of the ordering for the static constructors. If, 
for instance, chicken stopped importing egg, or vice versa, then the 
static constructors would have a well-defined order. If instead there is 
a cyclic dependency, the runtime will halt execution and print an error 
informing you of the cycle.

The solution for detecting cycles is quite interesting (and recently I 
discovered that the cycle detection algorithm was broken, and I've 
re-implemented the cycle detection, this time with a test case [1]), but 
one very annoying piece of this is that the cycle detection has to run 
every time the executable is run, or every time a shared object is 
loaded. This is a sheer waste of computing power -- we are sorting the 
same static data every time we load because we lack the hooks to the 
linker to make it happen at compile time. There's not much we can do to 
improve the linker, but we CAN monkey around with the executable file :)

I propose that we (well, mostly someone other than me) write a tool that 
reads the generated linked code, extracts the relevant module 
information, sorts the data per the same algorithm as defined in 
druntime, then writes the ordering into a reserved index space (for both 
shared and TLS construction), or returns an error if there is a cycle. 
Then the runtime can be modified to detect if this tool has been run and 
avoid doing the cycle detection code. The tool will become part of the 
build normally done by your makefile. Druntime will use a flag written 
by the tool to determine that a pre-sort has been done.

This requires knowledge of executable/object format, knowledge of the 
layout of the module info, knowledge of DMD (you have to properly output 
the space for the reserved spaces), knowledge of druntime's runtime 
intialization, and knowledge of graph algorithms (specifically 
topological sorting and cycle detection).

Any hackers out there want to try this? You get to close an old bug if 
so: https://issues.dlang.org/show_bug.cgi?id=2457

-Steve

[1] https://github.com/dlang/druntime/pull/1602
Jun 30 2016