digitalmars.D - Re: Port a benchmark to D?

bearophile <bearophileHUGS lycos.com> Jun 04 2011

bearophile <bearophileHUGS lycos.com> Jun 04 2011

bearophile <bearophileHUGS lycos.com> writes:

UnionFindNode struct instances probably doesn't need a memory pool, it's
already allocated in (small) arrays.

I have not found a simple enough way to allocate SimpleLoop struct instances
with a memory pool.

I have not found simple ways to use most of the C++ optimization hints of the
original paper. The paper is badly written and its contents are worse. I think
they have written this as an internal benchmark and only later have created a
paper. I think they have used an old CPU because their PC farms often use
similar old CPUs.

I have created a D version a bit de-optimized, that uses classes instead of
structs for the classes that are instantiated only few times. Generally the
original C++ code is not good: it's low level where it doesn't need to be low
level, and it wastes memory and time where it matters (like in tiny data
structures with lot of overhead.) I do not generally write code like that.

I don't have desire to work further on this code.

I'd like a Set with a very handy API (copied from Python) in Phobos.

I'd like the built-in associative arrays to have an optimization for very small
number of items. Python dicts have an optimization for 8 items or less.

I have done no attempts to tune the GC. I confirm that if you disable the GC
and perform a collection only once in a while, the code gets faster. Regarding
the improvements of the D GC, I remember I have found an optimization for LDC1
lot of time ago, but I don't remember it well. I think the GC scans the static
memory too, etc, this slows down the GC.

Google seems to care a lot about compilation times too. The D code compiles in
about 1 second or less with full optimization. The C++0x code compiled with G++
doesn't come even close to that.


D code compiled with DMD 2.053 with:
dmd -O -release -inline -w -L/STACK:5000000
Using -g doesn't crash my code on Windows, so I suggest someone to minimize the
code to find this bug.
I also suggest someone to minimize the code for the 64 version to find the 64
bit DMD bug.

C++0x code compiled with G++ 4.6 with:
g++ -std=c++0x -Ofast -Wall -Wextra -s

With LDC and GDC I suggest to look much better for the correct switches.
Othrerwise this benchmark becomes even more meaningless (it's already not so
meaningful. Google has not released the full C++ code).


I have created a D version that uses a TinySet, following one of the C++
optimization hints, one of the D versions use it, but I don't currently have a
good PC to actually take the timings:


import std.typetuple: TypeTuple;
extern(C) pure nothrow void exit (in int status);

template Range(int stop) {
    static if (stop <= 0)
        alias TypeTuple!() Range;
    else
        alias TypeTuple!(Range!(stop-1), stop-1) Range;
}

struct TinySet(T, int maxLength) {
    T[maxLength] data;
    size_t length = 0; // readonly property

    void add(T x) {
        /*static*/ foreach (i; Range!maxLength)
            if (x == data[i])
                return;
        if (length < maxLength) {
            data[this.length] = x;
            length++;
        } else
            exit(1);
    }

    T[] opSlice() {
        return data[0 .. this.length];
    }
}


C++0x code a bit modified, uses unordered_map instead of silly map:
http://codepad.org/9NEL9yPo

D code with structs and classes (no manual deallocations), less ugly, probably
this is the version to show to people:
http://codepad.org/Xqg0Y26S

Experimental D code with TinySet, with structs and classes (no manual
deallocations):
http://codepad.org/4ATOvFsV

D code with only structs (no manual deallocations):
http://codepad.org/SePFDAaT

D code with only classes:
http://codepad.org/k0DUzP3D

Bye,
bearophile

Jun 04 2011

bearophile <bearophileHUGS lycos.com> writes:

     void add(T x) {
         /*static*/ foreach (i; Range!maxLength)
             if (x == data[i])
                 return;


Sorry, this code is wrong, it has to iterate from 0 to length.

If some code doesn't have unittests then it's broken.

Bye,
bearophile

Jun 04 2011

D Programming

C/C++ Programming

Other

digitalmars.D - Re: Port a benchmark to D?