digitalmars.D - Work done so far

bearophile (294/294) Jan 20 2008 Hello, in this long post I explain what's the current situation with the...

Matti Niemenmaa (27/42) Jan 21 2008 This is still pretty impressive. Results from a sort benchmark (and the

Robert Fraser (4/7) Jan 21 2008 Mergesort generally does fewer compares than quicksort (that's why it's

Christopher Wright (4/12) Jan 21 2008 There are in-place merge sorts, though I'm not sure how costly it is to

bearophile (9/14) Jan 21 2008 I see. Notes:

Matti Niemenmaa (24/35) Jan 21 2008 Also true, this is what Andrei does in 2.0's std.algorithm, as torhu sai...

bearophile (7/17) Jan 21 2008 It beats quicksort in the common case of partially ordered input, I thin...

torhu (8/10) Jan 21 2008 Did you have look at Andrei's new sort template in std.algorithm? It
Sean Kelly (7/15) Jan 21 2008 The ticket you linked has revised numbers for the Tango sort, if anyone
bearophile (5/7) Jan 22 2008 Note that last line, I may need to improve it, because generally tt isn'...

BCS (3/12) Jan 21 2008 do you have them in SVN somewhere? If not I would be willing to let youp...

bearophile (5/7) Jan 22 2008 I don't have them in SVN. (In the meantime I have fixed fastSort and add...

BCS (3/15) Jan 22 2008 scapple has Phobos only code as well. In fact All of /my/ code is Phobos...

bearophile <bearophileHUGS lycos.com> writes:

Hello, in this long post I explain what's the current situation with the D libs
I am developing (I call them just "d libs" for the lack of a better name).

http://www.fantascienza.net/leonardo/so/libs_d.zip

I am showing the current version (2.48, but I update them almost daily. In the
zip there is the "meta" package too, not written by me, that is used by just a
bit of d libs code) of the code now because their growth is now slow enough,
they already contain most of the things I need, and because Walter and Sean
have just said to me:

Walter>If you have a faster qsort, and want to contribute it, please do so!
Same goes for other faster routines you've written.<

Sean Kelly>I'd be interested in seeing that.  I've been able to beat the D sort
for some data sets and match it in others, but not beat it across the board.<

In those d libs ("func" module) there is the fastSort() too I was talking
about, plus lot of other things [note: that sort of mine is really simple, and
it looks much similar to the sort already present in Phobos, but it run faster
to me. In the code you can find some speed tests too. It's a quicksort plus a
Simple Insertion Sort. But if you need a faster sort I suggest to adapt
something much better than the naive 60-lines long sort I have written, like
the Timsort used by Python. I have tested the speed of my code only in few
situations, not long and lot]. It's about 388 KB of clean D code, with full
unittests and ddocs. They are mostly functions, but there are few small classes
too.

I post them as open source hoping to see part of that stuff added to Phobos (or
Tango, if you want, but they may need some time to be ported). They run on D
v1.025, so they may need some little changes to run with D v2.x. But so far
Python developers have refused all my offered improvements to the Python std
lib, so my hopes of helping D std lib are nearly zero already :o) I'll use part
of this stuff anyway, just as I use my bag of tricks in developing my Python
code.

Probably among those things there are functions/classes may be just
nice-looking toys of little practical usage (example: on D v2.x xrange() is
probably nearly useless), few ones are just different ways of doing things
already present in Phobos, but I think (many) other things can be useful.

Those d libs are a mixed bag of things, but behind more than 70% of those
things there is a common rationale: I think D is a multi-level language. I have
seen I can use D to write in a low-level style (like using ASM or programming
in a low-level-C-like style), or I can use it with a style more similar to
Python code.

Usually (but not always) D programs written in lower level style are faster,
but if you have used a lot both lower level language (C, Delphi, etc) and
scripting languages (Python, Ruby, Perl) (and maybe some Functional language
too (Scheme, etc)) you know that there's a big need for both levels of coding.

You can write all the program in a low-level style, but it often requires lot
of coding time, lot of code, and lot of debugging. On the other hand you can
write most programs in Python, but sometimes you feel the need to have a
language that produces faster code, that allows you to access raw memory bytes
to implement your data structure filled with pointers, etc.

Many programs I have written have some "hot spots", critical parts, that must
be fast, and the are often quite larger parts of the program that are I/O
bound, GUI stuff, or parts that process little data, etc. So I have felt the
need of a language that allows me to write the critical parts in a low-level
style, while allowing me to write higher-level code in all the other parts of
the program. D and Lisp are among the few languages that I think allow such
multi-level style of programming without forcing you to use another language
(well, D you use ASM, that's another language, but it's built-in, you don't
need another compiler to compile C/Pyrex/Cython/ShedSkin extensions like in
Python). (I am sure there are other kinds of programs that need to be uniformly
fast. For such programs a single-level language may be enough). (Probably you
can do multi-level programming in C++ too, but I think C++ is a quite complex
language). If you have used only C/C++/Delphi then probably you need more time
to understand why a higher level of coding is often very very positive.

D (like D. v2.x) may allow such higher-level of coding adding some things to
the core language. But I have tried to add some things using the language
itself. The results aren't perfectly fine because some things are missing
(example: D doesn't offer a simple way to scan two lazy iterables in parallel),
some syntax isn't nice, but maybe some of the things I have written may be
useful, easy and nice-looking enough.

So most of the things you can find in those d libs are meant to be used in
places where running speed (and memory) isn't the main concern (but usually I
have coded them as fast as possible). Where you need max running speed you can
just use C-style constructs. But if you have to write a 10-lines D script to
process some text, if you have to write a part of code that's not meant to be
maximally fast, etc, they may be useful. So you hopefully end writing those
parts in less time, with less code and less bugs.

Those d libs contain mixed things (about 130, and they can be combined in many
ways):
- They contain some higher level functions you can find in Python. Python has
its faults, and I don't believe it's the best possible language, but I believe
its design has many nice things, that I have tried to copy. So I am not ashamed
by the fact that many of the things you can find in those d libs are so much
similar to Python syntax (and I think D already contains lot of things copied
from Python).
- Some support of lazy generators, especially some of the things you can find
in the standard Python module itertools:
http://docs.python.org/lib/module-itertools.html
- Some fast mathematical/string functions missing in Phobos that I need to use.

As I've said, some of the things you can find in those d libs may end being
just useless toys (and probably I'll move/remove them away when I am convinced
that I'll not use them, this d lib is just few months old, so I have used it
only few hundred times) but they are (or they want to be, failing) easy to use
and safe. If they are hard to use, or they need too much accesses to the docs
to be used, then they have missed most of their purpose, because they are meant
for the parts of the code where you want max *coding&debugging* speed instead
of running speed.


For example, here are the functions/classes of the main module (func.d), other
things can be found in the other modules:
AA             create an AA of given types
AAKeyType
AAValType
all            true if all items of the given iterable are true
any            true if one or more items of the given iterable are true
apply          return result of function applied to given arguments
array          copy of items of iterable class, array, etc
arrayMul       return given array multiplied 'times' times
ArrayType      return the type of the base elements of an n-dimensional array
ArrayType1     like ArrayType, but it goes down just 1 level.
autoAssign     to be used with a mixin, to assign some attributes inside class
constructors
azip           zip arrays
BaseType       combines IterType, ArrayType1 and AAKeyType
BaseTypedef    template, gives the base type of type type-defined one or more
times
cinterval      closed arithmetic progressions of chars
clear          quickly clear an AA/dynamic array in-place, and return it too
DA             create a dynamic array of given type
DeconstArrType if given a static array return the type of a dynamc array with
the same item type
dict           builds an AA from given keys and values
doTable        return array with the results of fun called with no arguments
range times
dup            return the (dynamic) duplicate of the given AA/array.
equalAA        true if the two given AAs are equal
fastSort       faster than the .sort builtin
filter         to filter an iterable
filterAA       to filter an AA
flatten        return the flattened 1-dimensional array of the elements.
floatLen       len of float array
frequency      return AA, its keys are unique items, the values are their
frequencies
fromKeys       builds an AA from given keys and value
groupBy        like Python grouby
intersectionAA return an AA with (key,val) pairs present in both input AAs
intLen         len of int array
IsAA
IsArray
IsCallable
IsDynamicArray
isIn           true if item is into iterable
isInSub        true if a subseq is into iterable
IsIterable
IsPointer
IsStaticArray
IsStruct
IterType       given the type of an iterable class, return the type of the
items it yields.
joinArrays
leftRotate     rotate array to the left
len            len of any iterable, lazy ones too
Lets1          generate assigment instructions with a mixin
Lets2          generate assigment instructions with a mixin
Lets3          generate assigment instructions with a mixin
map            to map a function on one more arrays
map2           to map a function to a single iterable with foreach(x, y; items)
max
maxs           return all max among arguments
min
mins           return all min among arguments
moduleName     given __FILE__ return the name of the module it is called in
partition      return given iterable divided into blocks
peek           return an element of the given array/AA/iterable object
posMax         return index of max into given iterable
posMin         return index of min into given iterable
put            alias of writef
putr           alias of writefln
Raises         return true if given lazy expression raises specified exceptions
range
rightRotate    rotate array to the right
SeriesGen1     generates a natural series from min to max, for mixin, 1
replacement
SeriesGen2     generates a natural series from min to max, for mixin, 2
replacements
SeriesGen3     generates a natural series from min to max, for mixin, 3
replacements
SeriesGen4     generates a natural series from min to max, for mixin, 4
replacements
sorted         return sorted iterable, according to key() too
sortedAA       return sorted AA, according to key() too
splitArray     splits according to the given delimiter
stableSort     accepts a sorting predicate too
StaticEval     to execute a function at compile time
staticSort     compile-time sort
stdinRead      quickly read all stdin into a string
str            alias of format
strLen         len of string array
Struct         defines the type of a struct that contains given types
StructArr      like struct, but the input types are arrays (takes tye type of
their elements)
structCmp      cmp among structs
sum            to sum the items of an iterable
toStruct       puts given values into a struct
Tuple          to define a tuple.
xcycle         return an infinite generator object that yields the input items
in a cycle
xmap           iterative version of map
xmap2          iterative version of map2
xrange
Xrepeat        yields 'el' 'times' times, or forever
xsplitArray    lazy version of splitArray
xstdin         stdin wrapper, yields stdin lines or reads it all
xzip           lazy zip fit for "foreach"
zip            zips different arrays, return struct array


Many other things can be found in the other modules, d.math, d.string,
d.random, etc.


Some usage examples of my D libs, mostly coming from the unittests. This shows
just a part (~40-50%) of the function/classes that are present, the module
d.func alone contains lot of things (few things are copied from Tango, like the
Kiss RND generator, few little templates, etc, so far I have used this d libs
only for personal use, but if necessary I can remove those things, they are all
labeled):

import d.all;

void main() {
    // tinyvector module example, they are quite fast
    alias TinyVector!(float, 3) TyV;
    auto v1 = new TyV;
    auto v2 = new TyV;
    putr(str(v1)); // prints <[nan, nan, nan]>
    v1.set(10, 20, 30);
    v2.set(1, 2, 3);
    putr(v1.sqrLength(), ' ', v1.dot(v2)); // prints 1400.0 140.0

    // comb module example, a lazy iterable combinations
    foreach(p; xpermutations("abc", false))
        put(p, ' '); // prints abc bac cab acb bca cba
    putr();

    // random module example
    seed(10);
    putr(shuffled("abcdefghil"[])); // prints iecbgldfah

    // math module example, very fast approximate 1/sqrt
    putr(appInvSqrt(2)); // prints 0.70693
    putr(countBitUint(0b1010_0100__0100_0101__0000_0010__0001_0001));
    // prints 9

    // string module examples
    // Smart and numerically-aware sort:
    putr(numSorted(["aa35", "Aa6", "aA261"], true)); // prints ["Aa6", "aa35",
"aA261"]

    // func module, for functional-style programming and more
    int add3(int x, int y, int z) { return x + y + z; }
    putr(apply(&add3, 2, 3, 4)); // prints 9
    // lists iterable objects and copies arrays/AAs/etc:
    putr(array("ab cd ef".xsplit())); // prints ["ab", "cd", "ef"]
    // Helpers to create AAs:
    putr(dict("abc", "def")); // prints ['a': 'd', 'b': 'e', 'c': 'f']
    putr(fromKeys("ab", 5)); // prints ['a': 5, 'b': 5]
    // More functional stuff
    int neg(int x) { return -x; }
    putr(map(&neg, xrange(3))); // prints [0, -1, -2]
    int sum3(int x, int y, int z) { return x + y + z; }
    putr(map(&sum3, [100, 200], [10, 20, 30], [1, 2, 3])); // prints [111, 222]
    // lazy version too (here "listed" anyway)
    putr(array(xmap((int x){return -x;}, [3, 1, -5, 11]))); // prints [-3, -1,
5, -11]
    int[] data2 = [33, -12, 0, 15];
    putr(filter((int x){ return x > 0; },data2)); // prints [33, 15]
    int[char[]] aa = ["aa":2, "BB":1, "cc":-2, "DD":0, "ee":5];
    bool fil(string k, int v) { return k > "DD" && v < 2; }
    putr(filterAA(&fil, aa)); // ["cc": -2]
    int[] ve1 = [1, 2, 3, 4, 5, 6];
    int[] ve2 = [3, 4, 5, 6, 7];
    int[] ve3 = [-1, -2, -3, -4];
    putr(azip(ve1, ve2, ve3, ve1));
    // prints [[1, 3, -1, 1], [2, 4, -2, 2], [3, 5, -3, 3], [4, 6, -4, 4]]
    char[][] az1 = ["hello", "how", "are"];
    auto az2 = [70, 60, 50, 40, 30];
    float[] az3 = [1.9, 2.8, 3.7, 4.6, 5.5, 6.4];
    auto az4 = "abcdefghi";
    putr(zip(az1, az2, az3, az4));
    // prints [<"hello", 70, 1.9, 'a'>, <"how", 60, 2.8, 'b'>, <"are", 50, 3.7,
'c'>]
    putr(range(3, 10)); // prints [3, 4, 5, 6, 7, 8, 9]
    // lazy too:
    putr(array(xrange(3, 10))); // prints [3, 4, 5, 6, 7, 8, 9]
    putr(cinterval('g', 'a', -1)); // prints gfedcba
    putr(sum([1.5, 2.5, 3.5], 2.0)); // prints 9.5
    // lazyly too:
    putr(sum(xrange(5), 0)); // prints 10
    putr(max(1, 2, 3)); // prints 3
    putr(max([1, 2, 3, 4, 5, 6])); // prints 6
    putr(max(range(10))); // prints 9
    putr(min([-1, -2, -3, -4, -5, -6], &abs)); // prints -1
    // find all max/min:
    putr(maxs([[-1,-2], [-1,-3], [-1,-3], [-1,-2]], (int[] a){return
map(&abs,a);}));
    // prints: [[-1, -3], [-1, -3]]
    // to find the index of max/min:
    putr(posMax([1, 2, 7, 4, 5, 6])); // prints: 2
    // all()/any() (they work on iterable objects, AAs, etc):
    class IsNegative { bool opCall(int x) { return x < 0; } }
    putr(all([-1:0, -2:0, -3:0], new IsNegative)); // prints: true
    // sorting with key function.
    // There is a sortedAA too that works on keys&values
    putr(sorted([10, 1, -20, 0, 1], &abs)); // prints: [0, 1, 1, 10, -20]
    putr(flatten([[[1], [2]], [[3], [4], [5, 6]]]));
    // prints: [1, 2, 3, 4, 5, 6]
    int[] ap = [1, 2, 3 ,4 ,5, 6, 7];
    putr(partition(ap, 3)); // prints: [[1, 2, 3], [4, 5, 6]]
    // iterable objects too:
    putr(partition(xrange(7))); // prints: [[0, 1], [2, 3], [4, 5]]
    putr(isIn('f', "felipe")); // prints: true
    putr(isInSub([1.1, 1.2], [-1.5, 2.3, 1.6, 1.1, 1.2]));
    // prints: true
    putr(frequency([4, 5, 4, 5, 4])); // prints: [4: 3, 5: 2]
    // lazily cycle any kind of iterable
    // The doTable builds an array calling a callable a certain number of times
    auto c1 = xcycle("Abcd");
    putr(doTable(&c1.next, 17)); // prints: AbcdAbcdAbcdAbcdA
    // lazily groups items of any kind of iterable according to given key():
    bool myisupper(char c) { return c >= 'A' && c <= 'Z'; }
    putr(array(groupBy("ALLOrachENEPE", &myisupper)));
    // prints: ["ALLO", "rach", "ENEPE"]
    putr(array(groupBy([1, 1, 1, 2, 2, 3, 3, 3])));
    // prints [[1, 1, 1], [2, 2], [3, 3, 3]]
    char[][int] aa1 = [1:"aa", 2:"bb", 3:"cc", 4:"dd"];
    char[][int] aa2 = [1:"aa", 2:"BB", -3:"cc"];
    // Intersection among AAs:
    putr(intersectionAA(aa1, aa2)); // prints: [1: "aa"]
    // There is xstdin() that allows to read all the stdin very quicky,
    // or lazily iterate on its lines.
    // cmp among structs:
    struct Sc2 { int x, y;}
    putr(structCmp(Sc2(7, 7), Sc2(7, 8))); // prints -1
    // == among AAs:
    putr(equalAA(['a':2, 'b':3], ['a':2, 'b':3])); // prints: true
}


Those functions are designed to be fast and very flexible, they usually contain
code to manage AA, arrays, strings and iterable objects in the faster way.

One of the most complex parts of those d libs are str/repr/put/putr of the
d.string module. putr() is a function like writefln(), but it doesn't perform
the triky formatting (as write/writeln of D 2.x), and it prints most things
*way* better than writefln (it's slower too, so if you need a simpler faster
printing you can use writefln. putr/put aren't meant to replace
writefln/writef), fixing lot and lot of the bad ways it prints things (you can
find many examples in the unittests of the putr/put). The "put"/"putr" names
too are shorter and simpler to write. Maybe you don't like some of the design
choices present inside that put/putr (like the automatic struct printing), but
I think they are nice :-)

Here are few examples of putr usage (like in Python inside arrays, AAs, etc it
uses repr instead of str, so it escapes chars, etc):

import d.string: putr;

void main() {
    putr([1, 2, 3]); // [1, 2, 3]
    putr(["1", "2", "3"]); // ["1", "2", "3"]
    putr("hel\"lo"); // hel"lo
    putr("hel\"lo"); // hel"lo
    putr(["he\"llo"]); // ["he\"llo"]
    putr([["Ab", "c"], ["D", "ef"]]); // [["Ab", "c"], ["D", "ef"]]
    putr([1:"aa", 2:"ba", 3:"bb"]); // [1: "aa", 2: "ba", 3: "bb"]

    struct S3 { int x; float y; int z;}
    putr(S3(2, 7.1, 8)); // <2, 7.1, 8>

    union U { int x; char c; float f; }
    U u;
    u.x = 100;
    putr(u); // {100}

    putr(["ab", "ba"]); // ["ab", "ba"]
    putr(['a':'d','b':'e']); // ['a': 'd', 'b': 'e']

    char[int] aa_empty;
    putr(aa_empty); // AA!(int, char)

    class Cl0 { int a; this(int b){a=b;} }
    Cl0 cl0;
    putr(cl0); // null
    cl0 = new Cl0(100);
    putr(cl0); // test.main.Cl0

    putr(cast(creal)-53-55i); // -53.0-55.0i
    putr(cast(cdouble)-7-0i); // -7.0+0.0i
    putr(cast(idouble)-5i); // -5.0i

    typedef int T;
    T t = 10;
    putr(t); // 10

    auto vv = [null, null, null];
    putr(vv); // [null, null, null]

    putr([`"hello"`]); // ["\"hello\""]
    putr(["`hello`"]); // ["`hello`"]

    putr("abc\0\25de"); // abc..de
    putr(["abc\0\25de"]); // ["abc\x00\x15de"]
}


To compare, if you replace that first line with this one:

import std.stdio: putr = writefln;

to use writefln, and you comment the lines that print those structs/enums, you
obtain this output:

[1,2,3]
[1,2,3]
hel"lo
hel"lo
[he"llo]
[[Ab,c],[D,ef]]
[1:[a,a],2:[b,a],3:[b,b]]
<error>
<error>
[ab,ba]
[a:d,b:e]
[]
null
test2.main.Cl0
-53+-55i
-7+0i
-5
10
[0000,0000,0000]
["hello"]
[`hello`]


Explaining the rationale behind every function/class requires lot of space, so
I can't explain everything here, I am open to questions (I can give my
not-spam-shielded email address too, if necessary). Most of those things are
similar to Python ones, so you can find the rationale in the Python docs, but
there are few differences, like in the groupBy (it doesn't return a lazy
iterable of tuples that contain as first element the head and as second element
an iterable to the group, but the group here is a plain array to increase usage
simplicity). The sorted/sortedAA functions take a key function instead of a cmp
function like most sorting things you can find in D because for the purposes I
was talking about it's simpler and shorter to use (but it requires more RAM).

New versions of the d lib will follow, I am currently adding xpartition() to
func.d, that is quite hairy to write (but very easy to use). And I am writing a
fast Deque class too (it's slower than the Deque of the C++ STL just because
DMD doesn't perform method inlining at all in my benchmarks).

Bye,
bear hugs,
bearophile

Jan 20 2008

Matti Niemenmaa <see_signature for.real.address> writes:

bearophile wrote:
 Walter>If you have a faster qsort, and want to contribute it, please do so!
 Same goes for other faster routines you've written.<
 
 Sean Kelly>I'd be interested in seeing that.  I've been able to beat the D
 sort for some data sets and match it in others, but not beat it across the
 board.<
 
 In those d libs ("func" module) there is the fastSort() too I was talking
 about, plus lot of other things [note: that sort of mine is really simple,
 and it looks much similar to the sort already present in Phobos, but it run
 faster to me. In the code you can find some speed tests too. It's a quicksort
 plus a Simple Insertion Sort. But if you need a faster sort I suggest to
 adapt something much better than the naive 60-lines long sort I have written,
 like the Timsort used by Python. I have tested the speed of my code only in
 few situations, not long and lot].

This is still pretty impressive. Results from a sort benchmark (and the 
benchmark itself, which requires Tango) are attached. Your code is quite short 
but still knocks the socks off the built-in sort.

It appears to do a lot of comparing, though---and thus, when made to use a 
custom comparing function instead of the built-in "<" operator it quickly slows 
down. It's still much better than the built-in sort (especially since the 
built-in sort can't even take a custom comparator), but for a more general sort 
it's not that great. Unless I completely messed things up when converting it to 
use a comparing function, that is. Which is entirely possible.

One other change I had to make is to add "j &&" to this line in _aqs, because 
when cmp was a function "return a > b" instead of "return a < b" it would
result 
in array bounds errors:
do j--; while (j && cmp(a, v[j]));

I see that Sean has changed the Tango sort to use introspective sort as well,
so 
the results for that don't apply any more 
(http://www.dsource.org/projects/tango/ticket/571) but they're included anyway.

Key for understanding the numbers:

For each algorithm, the first column set is for random, the second for already 
sorted, the third for sorted and then reversed, and the fourth for "median-of-3 
killer" data, specially crafted to bring worst-case behaviour in median-of-3 
quicksorts. The Tango sort used to hit a stack overflow and thus isn't tested 
with them.

In each column set, the first column is the average, the second the minimum,
and 
the third the maximum time taken.

-- 
E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi

Jan 21 2008

Robert Fraser <fraserofthenight gmail.com> writes:

Matti Niemenmaa wrote:
 It appears to do a lot of comparing, though---and thus, when made to use 
 a custom comparing function instead of the built-in "<" operator it 
 quickly slows down.

Mergesort generally does fewer compares than quicksort (that's why it's 
the default sorting algorithm for objects, but not primitives, in Java), 
although it has the cost of copying & allocation there.

Jan 21 2008

Christopher Wright <dhasenan gmail.com> writes:

Robert Fraser wrote:
 Matti Niemenmaa wrote:
 It appears to do a lot of comparing, though---and thus, when made to 
 use a custom comparing function instead of the built-in "<" operator 
 it quickly slows down.

 
 Mergesort generally does fewer compares than quicksort (that's why it's 
 the default sorting algorithm for objects, but not primitives, in Java), 
 although it has the cost of copying & allocation there.

There are in-place merge sorts, though I'm not sure how costly it is to 
do that. One implementation I see right now will cost O(n * n * log n) 
in the worst case, which is clearly unacceptable.

Jan 21 2008

bearophile <bearophileHUGS lycos.com> writes:

Matti Niemenmaa:
 It appears to do a lot of comparing, though---and thus, when made to use a 
 custom comparing function instead of the built-in "<" operator it quickly
slows 
 down. It's still much better than the built-in sort (especially since the 
 built-in sort can't even take a custom comparator), but for a more general
sort 
 it's not that great.

I see. Notes:
- Two or more sorts can be kept, one for native data, one for more complex
sorting, etc. (That's what I do in my d libs)
- Some of the situations where you use cmp() can be managed with a string
mixin, maybe this can speed things up some.
- A better support of inlining by DMD may change the situation some.
- In many simple situations where speed and data size are less important a
key() function is simpler to use than cmp(). See sorted/sortedAA in d libs. If
uou use a key() you often don't need a cmp() too.
- Try adapting Timsort used by Python, probably it's not easy to find something
better: http://py-stablesort.sourceforge.net/

Bye,
bearophile

Jan 21 2008

Matti Niemenmaa <see_signature for.real.address> writes:

bearophile wrote:
 Notes:
 - Two or more sorts can be kept, one for native data, one for more complex
 sorting, etc. (That's what I do in my d libs)

True.

 - Some of the situations where you use cmp() can be managed with a string
 mixin, maybe this can speed things up some.

Also true, this is what Andrei does in 2.0's std.algorithm, as torhu said. And 
it probably does speed things up.

 - A better support of inlining by DMD may change the situation some.

In some cases, yes, but in this general case of passing a function pointer I 
don't think it can help.

 - In many simple situations where speed and data size are less important a
 key() function is simpler to use than cmp(). See sorted/sortedAA in d libs.
 If uou use a key() you often don't need a cmp() too.

If speed is less important you probably don't care that much about the
algorithm 
  being used either. ;-)

 - Try adapting Timsort used by Python, probably it's not easy to find
 something better: http://py-stablesort.sourceforge.net/

I haven't tested timsort but I gather its main selling point is that it's 
stable. As such I don't think it'll be faster than an average quicksort 
(although I could be completely wrong, of course), since, as far as I can tell, 
it's a highly optimized version of merge sort.

As the docs say: "an adaptive, stable, natural mergesort, modestly called
timsort". The situations in which one needs a stable sort are, IMHO, rare
enough 
that it should be considered a special case and thereby a separate
function/method.

I'm willing to concede without evidence that timsort is among the fastest
stable 
sorts, but I'll admit I'd be a bit surprised if even a hand-optimized C 
implementation working on arrays would beat stdlib's qsort() or your algorithm, 
for instance.

I can't test it since the original code uses the Python API instead of sorting
a 
void* or the like. If you find a more portable implementation or feel like 
porting it yourself, feel free to benchmark it and see how it performs.

-- 
E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi

Jan 21 2008

bearophile <bearophileHUGS lycos.com> writes:

Matti Niemenmaa:
 If speed is less important you probably don't care that much about the
algorithm 
   being used either. ;-)

This doesn't change the fact that in many situations in your code you want to
use a key() instead of a cmp().


 I haven't tested timsort but I gather its main selling point is that it's 
 stable. As such I don't think it'll be faster than an average quicksort 
 (although I could be completely wrong, of course), since, as far as I can
tell, 
 it's a highly optimized version of merge sort.

It beats quicksort in the common case of partially ordered input, I think. With
quite varied definition of "partially". Plus other situations too, like partial
reverse ordering, etc. In real world situations you usually have partially
ordered data, or you have to merge two partially sorted sequences, etc. In such
very common cases Timsort is probably among the best. And it's fast enough in
the other cases too. You can take a look at the comparison count too.


 The situations in which one needs a stable sort are, IMHO, rare enough 
 that it should be considered a special case and thereby a separate
 function/method.

In my d libs there is a stable sort too. A stable sort is really useful in lot
of situations, like sorting structs (multi type tuples). And Timosort was
faster anyway than the not-stable sort used before it... I presume it's good
mostly when you have to sort objects with a custom comparator (unlike the
fastSort I have written in my D libs).


 If you find a more portable implementation

The site I have shown you has the most independent version of it you can find,
I think. Jython too uses it (and I have heard that recently the Sun JavaVM may
use it too, but I am not sure).

Bye,
bearophile

Jan 21 2008

torhu <no spam.invalid> writes:

Matti Niemenmaa wrote:
  > This is still pretty impressive. Results from a sort benchmark (and the
 benchmark itself, which requires Tango) are attached. Your code is quite short 
 but still knocks the socks off the built-in sort.

Did you have look at Andrei's new sort template in std.algorithm?  It 
was added in 2.008. I don't know if he plans to optimize it further or 
not.  It can be used like this:

import std.algorithm, std.functional;

sort!(less)(array);     // predicate function from std.functional
sort!("a < b")(array);  // comparison inlined by string mixin

Jan 21 2008

Sean Kelly <sean f4.ca> writes:

Matti Niemenmaa wrote:
 
 Key for understanding the numbers:
 
 For each algorithm, the first column set is for random, the second for
 already sorted, the third for sorted and then reversed, and the fourth
 for "median-of-3 killer" data, specially crafted to bring worst-case
 behaviour in median-of-3 quicksorts. The Tango sort used to hit a stack
 overflow and thus isn't tested with them.

The ticket you linked has revised numbers for the Tango sort, if anyone
is interested.  Also, I think it's worth noting that the built-in sort
routine actually does call a function for comparison (TypeInfo.cmp) --
there's simply no way to override it for a particular type.  Given this,
the average performance of the built-in sort is really excellent.


Sean

Jan 21 2008

bearophile <bearophileHUGS lycos.com> writes:

Matti Niemenmaa:

     TyItem t, tt;

...
         tt = (j > inf) ? v[j - 1] : 0;

Note that last line, I may need to improve it, because generally tt isn't a
number. Next revision.

Bye,
bearophile

Jan 22 2008

BCS <ao pathlink.com> writes:

Reply to bearophile,

 Hello, in this long post I explain what's the current situation with
 the D libs I am developing (I call them just "d libs" for the lack of
 a better name).
 
 http://www.fantascienza.net/leonardo/so/libs_d.zip
 
 I am showing the current version (2.48, but I update them almost
 daily.
 

do you have them in SVN somewhere? If not I would be willing to let youput 
them in scrapple.

Jan 21 2008

bearophile <bearophileHUGS lycos.com> writes:

BCS Wrote:
 do you have them in SVN somewhere? If not I would be willing to let you put 
 them in scrapple.

I don't have them in SVN. (In the meantime I have fixed fastSort and added the
partition/xpartition, now I'll work on Deque).

Thank you for the offer, I think the main problem is that those libs use
Phobos, so to be ported to Tango they require some work. (A smaller problem is
that put/putr/str/repr are meant as alternative to writefln/writef, so they may
be at odd with the way Tango prints things).

Bye,
bearophile

Jan 22 2008

BCS <BCS pathlink.com> writes:

bearophile wrote:
 BCS Wrote:
 
do you have them in SVN somewhere? If not I would be willing to let you put 
them in scrapple.

 
 
 I don't have them in SVN. (In the meantime I have fixed fastSort and added the
partition/xpartition, now I'll work on Deque).
 
 Thank you for the offer, I think the main problem is that those libs use
Phobos, so to be ported to Tango they require some work. (A smaller problem is
that put/putr/str/repr are meant as alternative to writefln/writef, so they may
be at odd with the way Tango prints things).
 
 Bye,
 bearophile

scapple has Phobos only code as well. In fact All of /my/ code is Phobos 
only. (tango.scrapple is a different project.)

Jan 22 2008

D Programming

C/C++ Programming

Other

digitalmars.D - Work done so far