
digitalmars.D.learn - How should I sort a doubly linked list the D way?

Mirjam Akkersdijk <noreply noreply.com> writes:
Hello there,
If I had a DLL, how would I sort it judging by the node contents, 
the D way?

In C if I were to sort a piece of malloc'd memory pointing to 
node pointers, I would write my compare function and let qsort 
sort it out. In D, I tried to use std.algorithm's sort 
functionality to no avail, because my guess would be it only 
operates on arrays. This opens up another can of worms as I am 
not fond of VLAs and D desperately wants to know what the size is 
at compile time.

Let's say we have this trivial implementation:

Node* get_node(DLL* list, uint index);

struct Node {
   int x;
   Node* prev, next;
};

struct DLL {
   Node* head;
   uint size;
};

Node** nodes = cast(Node**) malloc(list.size * (Node*).sizeof);

and I would like to sort based on Node.t, how should I tackle it, 
preferably without resorting to C libraries?
Aug 13 2019
Mirjam Akkersdijk <noreply noreply.com> writes:
On Tuesday, 13 August 2019 at 09:48:52 UTC, Mirjam Akkersdijk 
wrote:
 Hello there,
 If I had a DLL, how would I sort it judging by the node 
 contents, the D way?

 [...]
 Node.t
Node.x, my bad
Aug 13 2019
Daniel Kozak <kozzi11 gmail.com> writes:
You can make an array from it

On Tue, Aug 13, 2019 at 12:05 PM Mirjam Akkersdijk via
Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
 On Tuesday, 13 August 2019 at 09:48:52 UTC, Mirjam Akkersdijk
 wrote:
 Hello there,
 If I had a DLL, how would I sort it judging by the node
 contents, the D way?

 [...]
 Node.t
 Node.x, my bad
Aug 13 2019
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, August 13, 2019 3:48:52 AM MDT Mirjam Akkersdijk via 
Digitalmars-d-learn wrote:
 Hello there,
 If I had a DLL, how would I sort it judging by the node contents,
 the D way?

 In C if I were to sort a piece of malloc'd memory pointing to
 node pointers, I would write my compare function and let qsort
 sort it out. In D, I tried to use std.algorithm's sort
 functionality to no avail, because my guess would be it only
 operates on arrays. This opens up another can of worms as I am
 not fond of VLAs and D desperately wants to know what the size is
 at compile time.

 Let's say we have this trivial implementation:

 Node* get_node(DLL* list, uint index);

 struct Node {
    int x;
    Node* prev, next;
 };

 struct DLL {
    Node* head;
    uint size;
 };

 Node** nodes = cast(Node**) malloc(list.size * (Node*).sizeof);

 and I would like to sort based on Node.t, how should I tackle it,
 preferably without resorting to C libraries?
std.algorithm.sort operates on random-access ranges, not just dynamic arrays, and there's no need to know the size at compile time. I don't know why you would think that arrays would need to know their size at compile time unless you've only been using static arrays.

However, a doubly-linked list definitely isn't random-access, so it wouldn't work with std.algorithm.sort, and I don't think that Phobos has anything which would be able to sort it. There might be a solution somewhere on code.dlang.org, but I don't know of any that I could point you to. Either way, if there's a solution floating around that you can use to sort a doubly-linked list in place, it's not an official one.

- Jonathan M Davis
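As a minimal sketch of the runtime-length point (the names here are invented for the example): a dynamic array's length is an ordinary runtime value, and std.algorithm.sort accepts it directly.

```
import std.algorithm : sort;

void main()
{
    size_t n = 5;             // length known only at runtime; no VLA needed
    auto data = new int[](n); // a dynamic array allocated with that length

    foreach (i, ref e; data)  // fill it in descending order
        e = cast(int)(n - i);

    data.sort();              // a dynamic array is a random-access range
    assert(data == [1, 2, 3, 4, 5]);
}
```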
Aug 13 2019
ikod <geller.garry gmail.com> writes:
On Tuesday, 13 August 2019 at 09:48:52 UTC, Mirjam Akkersdijk 
wrote:
 Hello there,
 If I had a DLL, how would I sort it judging by the node 
 contents, the D way?

 In C if I were to sort a piece of malloc'd memory pointing to 
 node pointers, I would write my compare function and let qsort 
 sort it out. In D, I tried to use std.algorithm's sort 
 functionality to no avail, because my guess would be it only 
 operates on arrays. This opens up another can of worms as I am 
 not fond of VLAs and D desperately wants to know what the size 
 is at compile time.
You can check std.container.rbtree, which will build a sorted "list" for you on every Node insert. Again, not sure how this can be done at compile time.
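A minimal sketch of that approach, with an invented Item type and ordering predicate (the third template argument opts in to duplicate keys):

```
import std.container.rbtree : RedBlackTree;
import std.stdio : writeln;

struct Item
{
    int x;
}

void main()
{
    // the tree keeps its elements ordered on every insert;
    // `true` allows duplicate x values
    auto tree = new RedBlackTree!(Item, (a, b) => a.x < b.x, true);
    tree.insert(Item(3));
    tree.insert(Item(1));
    tree.insert(Item(2));
    writeln(tree[]); // iterates in sorted order: x = 1, 2, 3
}
```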
Aug 13 2019
bachmeier <no spam.net> writes:
On Tuesday, 13 August 2019 at 09:48:52 UTC, Mirjam Akkersdijk 
wrote:

 I would write my compare function and let qsort sort it out.
I'm confused by this statement. Are you referring to the qsort in C's stdlib? I had never heard of using that to sort a linked list, so I searched, and it is not possible.

https://cboard.cprogramming.com/c-programming/64520-sorting-linked-list-qsort.html

https://stackoverflow.com/questions/7091685/doesnt-qsort-c-library-function-work-on-linked-lists: "qsort works on plain arrays of equally sized data, not linked lists"
Aug 13 2019
Mirjam Akkersdijk <noreply noreply.com> writes:
On Tuesday, 13 August 2019 at 12:34:46 UTC, bachmeier wrote:
 I'm confused by this statement. Are you referring to the qsort 
 in C's stdlib? I had never heard of using that to sort a linked 
 list, so I searched, and it is not possible.
Ah yes, maybe I should have elaborated. In C, you can just create an array

(1) struct Node* nodes[buf_size];

and then loop over it, filling it with

(2) nodes[i] = (DLLbuf + i * sizeof(struct Node));

subsequently calling our dear friend from stdlib

(3) qsort(nodes, buf_size, sizeof(struct Node*), buf_compare_function);

You can then rechain all the nodes together, but that's a bit out of scope for this thread unless you really want me to work it out :)

But I was looking for a more D-specific approach to solve this, as it feels archaic to do it the C way. If anyone has a better one, please let me know.
Aug 13 2019
Sebastiaan Koppe <mail skoppe.eu> writes:
On Tuesday, 13 August 2019 at 09:48:52 UTC, Mirjam Akkersdijk 
wrote:
 and I would like to sort based on Node.t, how should I tackle 
 it, preferably without resorting to C libraries?
Convert the nodes into a D array, sort the array with nodes.sort!"a.x < b.x" and then iterate the array and repair the next/prev pointers.
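A minimal sketch of this suggestion, reusing the Node definition from the original post but taking the head pointer directly (sortNodes and the hand-built test list are invented for the example):

```
import std.algorithm : sort;

struct Node
{
    int x;
    Node* prev, next;
}

// collect the nodes into an array, sort it, then relink;
// returns the new head
Node* sortNodes(Node* head)
{
    Node*[] nodes;
    for (auto n = head; n !is null; n = n.next)
        nodes ~= n;
    if (nodes.length == 0)
        return null;

    nodes.sort!"a.x < b.x";

    // repair the prev/next pointers in sorted order
    foreach (i, n; nodes)
    {
        n.prev = i == 0 ? null : nodes[i - 1];
        n.next = i + 1 == nodes.length ? null : nodes[i + 1];
    }
    return nodes[0];
}

void main()
{
    // build a tiny list by hand: 3 <-> 1 <-> 2
    auto a = new Node(3), b = new Node(1), c = new Node(2);
    a.next = b; b.prev = a;
    b.next = c; c.prev = b;

    auto head = sortNodes(a);
    assert(head.x == 1 && head.next.x == 2 && head.next.next.x == 3);
}
```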
Aug 13 2019
Mirjam Akkersdijk <noreply noreply.com> writes:
On Tuesday, 13 August 2019 at 14:04:45 UTC, Sebastiaan Koppe 
wrote:
 On Tuesday, 13 August 2019 at 09:48:52 UTC, Mirjam Akkersdijk 
 wrote:
 and I would like to sort based on Node.t, how should I tackle 
 it, preferably without resorting to C libraries?
 Convert the nodes into a D array, sort the array with 
 nodes.sort!"a.x < b.x" and then iterate the array and repair 
 the next/prev pointers.
Thank you, this is what I eventually chose to do. It's also fairly fast, though doesn't the nature of the dynamic array slow it down a bit? Since I already know the size of the array (but just not at compile time), can't I define an array of fixed size, which is dependent on the input of the function?
Aug 13 2019
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, August 13, 2019 11:33:10 AM MDT Mirjam Akkersdijk via 
Digitalmars-d-learn wrote:
 On Tuesday, 13 August 2019 at 14:04:45 UTC, Sebastiaan Koppe 
 wrote:
 On Tuesday, 13 August 2019 at 09:48:52 UTC, Mirjam Akkersdijk 
 wrote:
 and I would like to sort based on Node.t, how should I tackle
 it, preferably without resorting to C libraries?
 Convert the nodes into a D array, sort the array with 
 nodes.sort!"a.x < b.x" and then iterate the array and repair 
 the next/prev pointers.
 Thank you, this is what I eventually chose to do. It's also 
 fairly fast, though doesn't the nature of the dynamic array 
 slow it down a bit? Since I already know the size of the array 
 (but just not at compile time), can't I define an array of 
 fixed size, which is dependent on the input of the function?
In D, static arrays must know their length at compile time. Dynamic arrays can have their length determined at runtime. I don't know why a dynamic array would slow anything down here though - especially if you just allocate it with the correct length. Appending each element individually instead of allocating the full dynamic array and then setting each element would certainly be slower, but once the dynamic array is fully allocated, as far as accessing elements goes, accessing a dynamic array and a static array should be basically the same. The only real difference is that one would have its elements on the stack, whereas the other would have them on the heap.

Even if you used a static array, you'd have to slice it to pass to sort, which would mean that you'd be passing it a dynamic array. A dynamic array is basically just a struct with a pointer and a length, e.g.

struct DynamicArray(T)
{
    size_t length;
    T* ptr;
}

and accessing its elements is the same as you'd get with a pointer in C, except for the fact that in @safe code, the bounds are checked when indexing elements.

The only thing about dynamic arrays that can get slow is if you're doing a lot of appending to them, because then it has to keep checking to see whether it can expand in place or has to allocate a new block of memory and copy the elements. In that respect, it's similar to C++'s std::vector.

- Jonathan M Davis
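A small sketch of that distinction (names invented for the example): slicing a static array yields exactly the pointer + length view described above.

```
import std.algorithm : sort;

void main()
{
    // static array: length fixed at compile time, elements on the stack
    int[4] stat = [4, 2, 3, 1];

    // slicing yields a dynamic array (pointer + length) that refers
    // to the same stack storage; nothing is copied
    int[] view = stat[];
    assert(view.ptr == &stat[0] && view.length == 4);

    // sort receives that slice, so it works the same either way
    view.sort();
    assert(stat[] == [1, 2, 3, 4]);

    // dynamic array: length chosen at runtime, elements on the GC heap
    size_t n = 4;
    auto dyn = new int[](n);
    assert(dyn.length == n);
}
```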
Aug 13 2019
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Aug 13, 2019 at 12:12:28PM -0600, Jonathan M Davis via
Digitalmars-d-learn wrote:
 On Tuesday, August 13, 2019 11:33:10 AM MDT Mirjam Akkersdijk via 
 Digitalmars-d-learn wrote:
[...]
 Thank you, this is what I eventually chose to do. It's also fairly
 fast, though doesn't the nature of the dynamic array slow it down a
 bit? Since I already know the size of the array (but just not at
 compile time), can't I define an array of fixed size, which is
 dependent on the input of the function?
 In D, static arrays must know their length at compile time. 
 Dynamic arrays can have their length determined at runtime. I 
 don't know why a dynamic array would slow anything down here 
 though - especially if you just allocate it with the correct 
 length.
[...]

When it comes down to fine points concerning performance like here, my default response is: are you sure this is an *actual* performance hit? Did you run a profiler to measure the actual runtimes?

Static arrays have the advantage that the size is known at compile time, so the compiler can generate better code (by hard-coding the length, optimizing the memory layout, etc.). But when you pass static arrays around, you have to either slice them, in which case they become identical to dynamic arrays anyway, or pass them on the stack, which in some cases can actually slow things down (you have to copy a potentially large amount of data to/from the stack when passing arguments to a function). Passing a dynamic array (or a slice of a static array) is very fast: it's just a pointer + size pair.

Dynamic arrays could potentially incur performance hits if you're appending to them (may need a GC allocation). Static arrays don't have this problem -- but then you can't append to a static array at all. :-P

Anyway, splitting hairs over whether to use static arrays or dynamic arrays sounds like premature optimization to me. Before worrying about potential performance problems with using dynamic arrays, always use a profiler to see if the bottleneck is actually in that part of the code. Otherwise it's just a waste of time and energy to "optimize" the code when it isn't even anywhere near the real bottleneck(s) in the program. Profile, profile, profile. Until then, don't sweat it over optimization.

T

-- 
Mediocrity has been pushed to extremes.
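In that spirit, a throwaway measurement sketch using std.datetime.stopwatch (the two candidate strategies and the iteration counts are invented for the example):

```
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

enum N = 10_000;

void viaAppend()
{
    int[] a;
    foreach (i; 0 .. N)
        a ~= i; // grows as it goes
}

void viaPrealloc()
{
    auto a = new int[](N); // allocated once up front
    foreach (i; 0 .. N)
        a[i] = i;
}

void main()
{
    // run each candidate 100 times and compare total durations
    auto results = benchmark!(viaAppend, viaPrealloc)(100);
    writeln("append:   ", results[0]);
    writeln("prealloc: ", results[1]);
}
```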
Aug 13 2019
Ali Çehreli <acehreli yahoo.com> writes:
On 08/13/2019 10:33 AM, Mirjam Akkersdijk wrote:
 On Tuesday, 13 August 2019 at 14:04:45 UTC, Sebastiaan Koppe wrote:
 Convert the nodes into a D array, sort the array with 
 nodes.sort!"a.x < b.x" and then iterate the array and repair 
 the next/prev pointers.
If possible, I would go further and ditch the linked list altogether: Just append the nodes to an array and then sort the array. It has been shown in research, conference presentations, and in personal code to be the fastest option in most (or all) cases.
 doesn't the nature of the dynamic array slow it down a bit?
Default bounds checking is going to cost a tiny bit, which you can turn off after development with a compiler flag. (I still wouldn't.) The only other option that would be faster is an array that's sitting on the stack, created with alloca. But it's only for cases where the thread will not run out of stack space and the array is not going to be used after the function returns.
 can't I define an array of fixed size, which is dependent on the input
 of the function?
arr.length = number_of_elements;

All elements will be initialized to the element's default value, which happens to be null for pointers. (If we are back to linked list Node pointers.)

However, I wouldn't bother with setting length either, as the cost of automatic array resizing is amortized, meaning that it won't hurt the O(1) algorithmic complexity in the general case. In the GC case that D uses, it will be even better: if the GC knows that the neighboring memory block is free, it will just add that to the dynamic array's capacity without moving elements to the new location.

Summary: Ditch the linked list and put the elements into an array. :)

Ali
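A small sketch of both points (the counts are arbitrary): setting .length default-initializes the new elements, and append cost is amortized because capacity grows in jumps.

```
void main()
{
    // setting .length default-initializes the new elements
    int*[] ptrs;
    ptrs.length = 3;
    assert(ptrs[0] is null); // null for pointers

    // appends trigger a growth step only when spare capacity runs out
    int[] a;
    size_t growEvents;
    size_t cap = a.capacity;

    foreach (i; 0 .. 10_000)
    {
        a ~= i;
        if (a.capacity != cap) // capacity jumped: the block grew
        {
            ++growEvents;
            cap = a.capacity;
        }
    }
    // far fewer growth steps than appends: amortized O(1) per append
    assert(growEvents < 10_000 / 10);
}
```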
Aug 13 2019
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Aug 13, 2019 at 11:28:35AM -0700, Ali Çehreli via Digitalmars-d-learn
wrote:
[...]
 Summary: Ditch the linked list and put the elements into an array. :)
[...]

+1. The linked list may have been faster 20 years ago, before the advent of modern CPUs with caching hierarchies and memory access predictors. These days, with CPU multi-level caches and memory access predictors, in-place arrays are often the best option for performance, up to a certain size. In general, avoid indirection where possible, in order to avoid defeating the memory access predictor and to reduce the number of cache misses.

T

-- 
"If you're arguing, you're losing." -- Mike Thomas
Aug 13 2019
Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 13 August 2019 at 18:54:58 UTC, H. S. Teoh wrote:
 On Tue, Aug 13, 2019 at 11:28:35AM -0700, Ali Çehreli via 
 Digitalmars-d-learn wrote: [...]
 Summary: Ditch the linked list and put the elements into an 
 array. :)
 [...]
 +1. The linked list may have been faster 20 years ago, before 
 the advent of modern CPUs with caching hierarchies and memory 
 access predictors. These days, with CPU multi-level caches and 
 memory access predictors, in-place arrays are often the best 
 option for performance, up to a certain size. In general, avoid 
 indirection where possible, in order to avoid defeating the 
 memory access predictor and reduce the number of cache misses.

 T
I saw a Bjarne Stroustrup talk where he benchmarked that, for n > 1, std::vector was a lot faster than a linked list for all supported operations. I don't know how clever the caching strategies on a modern processor are about pointer chasing, but to my knowledge the only way of getting a cache-efficient linked list would be to effectively have a very contiguous allocator (which obviously defeats the purpose of using a list in the first place).

Found it: https://www.youtube.com/watch?v=YQs6IC-vgmo
Aug 13 2019
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Aug 13, 2019 at 08:42:37PM +0000, Max Haughton via Digitalmars-d-learn
wrote:
 On Tuesday, 13 August 2019 at 18:54:58 UTC, H. S. Teoh wrote:
[...]
 These days, with CPU multi-level caches and memory access
 predictors, in-place arrays are often the best option for
 performance, up to a certain size.  In general, avoid indirection
 where possible, in order to avoid defeating the memory access
 predictor and reduce the number of cache misses.
[...]
 I saw a Bjarne Stroustrup talk where he benchmarked that the for n >
 1, std::vector was a lot faster than a linked list for all supported
 operations. I don't know how clever the caching strategies are on a
 modern processor (Pointer chasing), but to my knowledge the only way
 of getting a cache efficient linked list would be to effectively have
 a very contiguous allocator (Which obviously defeats the purpose of
 using a list in the first place)
 
 Found it: https://www.youtube.com/watch?v=YQs6IC-vgmo
A contiguous allocator doesn't help after the list has undergone a large number of insertions/deletions, because of fragmentation.

You can still get a cache-efficient linked list by chunking. I.e., instead of a linked list of individual elements, group the elements into blocks (arrays) that are then linked into a linked list. Then you can still insert/delete elements cheaply by updating only the affected block, and only occasionally you need to delete a block or allocate a new one. As long as the block size is suitably chosen, the list will be cache-coherent most of the time and enjoy the associated performance boosts from the cache hierarchy. For small lists, it will be equivalent to using a single array. For very large lists, you'll also reap the cheap insert/delete flexibility of a linked list.

To minimize space wastage from unused slots in the blocks, you could rebalance the list based on some desired utilization factor (e.g., if <50% of a block is used, redistribute elements among adjacent blocks). Redistribution should be pretty cache-coherent (except perhaps for some corner cases), because it mostly involves linear copying of elements from one block to another.

T

-- 
I am not young enough to know everything. -- Oscar Wilde
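A minimal sketch of that chunking idea (an "unrolled" linked list); the block capacity and names are arbitrary, and rebalancing is left out:

```
// each block stores up to blockCapacity elements contiguously
enum blockCapacity = 64;

struct Block(T)
{
    T[blockCapacity] elems; // elements stored contiguously
    size_t used;            // filled slots in this block
    Block!T* prev, next;
}

struct UnrolledList(T)
{
    Block!T* head, tail;

    // append at the back; a new block is allocated only when the
    // current tail block is full
    void pushBack(T value)
    {
        if (tail is null || tail.used == blockCapacity)
        {
            auto b = new Block!T;
            b.prev = tail;
            if (tail !is null)
                tail.next = b;
            else
                head = b;
            tail = b;
        }
        tail.elems[tail.used++] = value;
    }
}

void main()
{
    UnrolledList!int list;
    foreach (i; 0 .. 200)
        list.pushBack(i); // 200 elements end up in 4 blocks

    assert(list.head.elems[0] == 0);
    assert(list.tail.used == 200 - 3 * 64); // last block holds 8
}
```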
Aug 13 2019
Mirjam Akkersdijk <noreply noreply.com> writes:
On Tuesday, 13 August 2019 at 21:11:41 UTC, H. S. Teoh wrote:
 A contiguous allocator doesn't help after the list has 
 undergone a large number of insertions/deletions, because of 
 fragmentation.
Fragmentation should not be an issue; I insist on keeping a DLL as the base structure in my application, creating an array just for the sorting operations. No new nodes are added and no nodes are permanently removed.
 You can still get a cache-efficient linked list by chunking. 
 [...]
I very much appreciate this thought, as I was thinking of adding such kind of cache to my implementation. It's still in the early stages and I have yet to see when this would make a considerable impact. I will remember this.
Aug 13 2019
Sebastiaan Koppe <mail skoppe.eu> writes:
On Tuesday, 13 August 2019 at 18:28:35 UTC, Ali Çehreli wrote:
 On Tuesday, 13 August 2019 at 14:04:45 UTC, Sebastiaan Koppe
 wrote:
 Convert the nodes into a D array, sort the array with 
 nodes.sort!"a.x < b.x" and then iterate the array and repair 
 the next/prev pointers.

 If possible, I would go further and ditch the linked list 
 altogether: Just append the nodes to an array and then sort the 
 array. It has been shown in research, conference presentations, 
 and in personal code to be the fastest option in most (or all) 
 cases.
Yeah, very likely. Although in this case there already is an array; it's going to be close :)
Aug 13 2019
Mirjam Akkersdijk <noreply noreply.com> writes:
On Tuesday, 13 August 2019 at 18:28:35 UTC, Ali Çehreli wrote:
 can't I define an array of fixed size, which is dependent on 
 the input of the function?

 arr.length = number_of_elements;

 All elements will be initialized to the element's default 
 value, which happens to be null for pointers. (If we are back 
 to linked list Node pointers.) However, I wouldn't bother with 
 setting length either, as the cost of automatic array resizing 
 is amortized, meaning that it won't hurt the O(1) algorithmic 
 complexity in the general case. In the GC case that D uses, it 
 will be even better: if the GC knows that the neighboring 
 memory block is free, it will just add that to the dynamic 
 array's capacity without moving elements to the new location.
For the application I am writing, it makes very much sense to use a DLL. The approach I am using right now is (as stated in the first post):

void bsort(DLL* list)
{
    Node*[] nodes;

    foreach (i; 0 .. list.size)
        nodes ~= get_node(list, i);

    nodes.sort!("a.x < b.x");

    // (unrelated data structure cleanup goes here)
}

My concern was that using a dynamic array, continuously appending to it despite the known size, would greatly impact overall performance. My goal is to make this operation as fast as possible, the D way. Per Teoh's advice to test out the difference, I tried setting the length first and iterating through it with `nodes[i] = ...`, and the implementation did not run noticeably faster (10 trials). For now, I consider this a success and would like to thank everyone in the thread for assistance and valuable insights.

Though, it left me with some semi-offtopic questions unanswered:

(1) Ali, do you mean that from an optimization viewpoint, it's better to keep appending like `nodes ~= ...` instead of setting the length first? I would like to have some more background on that claim.

(2) I watched the talk and read the paper [1] to be sure. Many contributors to this thread claim (C++) vectors are nearly always the preferred choice compared to DLLs, but how come they are so much faster than DLLs, given they have nearly the same functionality?

(3) And I sense this is very closely related to what Ali is on about?

[1] http://www.stroustrup.com/Software-for-infrastructure.pdf
Aug 13 2019
Mike Parker <aldacron gmail.com> writes:
On Tuesday, 13 August 2019 at 22:12:23 UTC, Mirjam Akkersdijk 
wrote:

 Though, it left me with some semi-offtopic questions unanswered:

 (1) Ali, do you mean that from an optimization viewpoint, it's 
 better to keep appending like `nodes ~= ...` instead of setting 
 the length first? I would like to have some more background on 
 that claim.
He's not saying it will be better, but that the cost will be negligible. On every append, the GC allocates new memory only if the current capacity == 0. When it does allocate, it allocates more space than it actually needs based on the current length of the array, so you don't actually have an allocation on every append. Also, when the adjacent memory block is free and can hold the additional elements, the allocation consists of extending the array's memory block and no copying is performed, making the cost of the allocation even cheaper than it could be.

Anyway, you can use `reserve` when appending rather than setting the length. This will allocate enough memory without default-initializing anything, and you don't have to worry about bounds checking:

int[] ints;
ints.reserve(100);

foreach (i; 0 .. 100)
{
    ints ~= i;
}

See Steve Schveighoffer's array article for details:

https://dlang.org/articles/d-array-article.html
Aug 13 2019
Ali Çehreli <acehreli yahoo.com> writes:
On 08/13/2019 03:12 PM, Mirjam Akkersdijk wrote:

 For the application I am writing, it makes very much sense to use a DLL.
Yes, Dynamically Linked Libraries can be useful. Oh wait... You mean Doubly-Linked List. :o)
 I tried setting the length first and iterating through it with `nodes[i]
 = ...`
As Mike Parker said, it should be .reserve.
 and the implementation did not run noticeably faster (10 trials),
That's the most important part: You profiled and it was the same. (Although profiling is tricky business as well; it can be difficult to measure the operation that you are interested in.)

The performance will depend on many factors:

* First, if there aren't many nodes, it shouldn't matter. (That's why I would choose the simpler arrays.)

* Doubly-linked list nodes carry two extra pointers per element. As a general rule, the larger the data, the slower the data structure.

* Pointer chasing through 'next' pointers will make the program jump around in memory. This will be slower compared to the consecutive elements that arrays provide. But if processing the elements causes pointer chasing anyway, then it shouldn't matter much. Another topic to read up on and watch is "data-oriented programming". This talk by Mike Acton is eye opening: https://www.youtube.com/watch?v=rX0ItVEVjHc

* Is data appended and removed throughout the life of the program, or are the nodes set once in the beginning and then only used?

* Are elements accessed randomly or always in sequence? (If the latter, modern CPUs promise that arrays will be better.)

* etc.

Ali
Aug 14 2019
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 13 August 2019 at 18:28:35 UTC, Ali Çehreli wrote:
 On 08/13/2019 10:33 AM, Mirjam Akkersdijk wrote:
 On Tuesday, 13 August 2019 at 14:04:45 UTC, Sebastiaan Koppe
 wrote:
 Convert the nodes into a D array, sort the array with 
 nodes.sort!"a.x < b.x" and then iterate the array and repair 
 the next/prev pointers.

 If possible, I would go further and ditch the linked list 
 altogether: Just append the nodes to an array and then sort the 
 array. It has been shown in research, conference presentations, 
 and in personal code to be the fastest option in most (or all) 
 cases.
 doesn't the nature of the dynamic array slow it down a bit?
 Default bounds checking is going to cost a tiny bit, which you 
 can turn off after development with a compiler flag. (I still 
 wouldn't.) The only other option that would be faster is an 
 array that's sitting on the stack, created with alloca. But 
 it's only for cases where the thread will not run out of stack 
 space and the array is not going to be used after the function 
 returns.
 can't I define an array of fixed size, which is dependent on 
 the input of the function?

 arr.length = number_of_elements;

 All elements will be initialized to the element's default 
 value, which happens to be null for pointers. (If we are back 
 to linked list Node pointers.) However, I wouldn't bother with 
 setting length either, as the cost of automatic array resizing 
 is amortized, meaning that it won't hurt the O(1) algorithmic 
 complexity in the general case. In the GC case that D uses, it 
 will be even better: if the GC knows that the neighboring 
 memory block is free, it will just add that to the dynamic 
 array's capacity without moving elements to the new location.

 Summary: Ditch the linked list and put the elements into an 
 array. :)
There are several reasons why arrays are nowadays faster than doubly-linked lists:

- Pointer chasing can hardly be parallelized and defeats prefetching. Each pointer load may cost the full latency to memory (hundreds of cycles). On a multiprocessor machine it may also trigger a lot of coherency traffic.

- On 64-bit systems, 2 pointers cost 16 bytes. If the payload is small, more memory goes to the pointers than to the data.

- When looping over an array, the out-of-order (OoO) machinery can parallelize execution beyond loop limits.

- Reduced allocation, i.e. allocation is done in bulk => faster GC for D.

It is only when there are a lot of external references to the payload in the list that using an array may become too unwieldy, i.e. if moving an element in memory requires the update of other pointers outside of the list.
Aug 14 2019
Johannes Loher <johannes.loher fg4f.de> writes:
On 13.08.19 at 11:48, Mirjam Akkersdijk wrote:
 Hello there,
 If I had a DLL, how would I sort it judging by the node contents, the D
 way?
 
 In C if I were to sort a piece of malloc'd memory pointing to node
 pointers, I would write my compare function and let qsort sort it out.
 In D, I tried to use std.algorithm's sort functionality to no avail,
 because my guess would be it only operates on arrays. This opens up
 another can of worms as I am not fond of VLAs and D desperately wants to
 know what the size is at compile time.
 
 Let's say we have this trivial implementation:
 
 Node* get_node(DLL* list, uint index);
 
 struct Node {
   int x;
   Node* prev, next;
 };
 
 struct DLL {
   Node* head;
   uint size;
 };
 
 Node** nodes = cast(Node**) malloc(list.size * (Node*).sizeof);
 
 and I would like to sort based on Node.t, how should I tackle it,
 preferably without resorting to C libraries?
I'd do something along the lines of this:

```
import std;

struct Node(T)
{
private:
    typeof(this)* prev, next;

public:
    T data;
}

struct DoublyLinkedList(T)
{
private:
    Node!T* head = null;
    Node!T* tail = null;

public:
    void pushFront(T t)
    {
        auto n = new Node!T;
        if (head != null)
            head.prev = n;
        else
            tail = n;
        n.data = t;
        n.next = head;
        head = n;
    }

    T front() const @property
    {
        return head.data;
    }

    void popFront()
    {
        head = head.next;
        if (head != null)
            head.prev = null;
        else
            tail = null;
    }

    void pushBack(T t)
    {
        auto n = new Node!T;
        if (tail != null)
            tail.next = n;
        else
            head = n;
        n.data = t;
        n.prev = tail;
        tail = n;
    }

    T back() const @property
    {
        return tail.data;
    }

    void popBack()
    {
        tail = tail.prev;
        if (tail != null)
            tail.next = null;
        else
            head = null;
    }

    bool empty() const @property
    {
        return head == null;
    }

    auto nodes() @property
    {
        static struct NodePointerRange
        {
        private:
            Node!T* head;

        public:
            Node!T* front() @property
            {
                return head;
            }

            void popFront()
            {
                head = head.next;
            }

            bool empty() const @property
            {
                return head == null;
            }
        }

        return NodePointerRange(head);
    }
}

auto doublyLinkedList(R)(R r) if (isInputRange!R)
{
    DoublyLinkedList!(ElementType!R) list;
    foreach (e; r)
        list.pushBack(e);
    return list;
}

auto doublyLinkedListFromNodePointerRange(R)(R r)
        if (isInputRange!R && isPointer!(ElementType!R)
            && isInstanceOf!(Node, PointerTarget!(ElementType!R)))
{
    DoublyLinkedList!(TemplateArgsOf!(PointerTarget!(ElementType!R))) list;
    foreach (n; r)
    {
        n.prev = list.tail;
        if (list.tail != null)
            list.tail.next = n;
        else
            list.head = n;
        list.tail = n;
    }
    if (list.tail != null)
        list.tail.next = null;
    return list;
}

void main()
{
    auto list = doublyLinkedList([5, 3, 7, 24, 1]);
    // looks natural but allocates new nodes
    auto sortedList = list.array.sort.doublyLinkedList;
    sortedList.each!writeln;

    writeln;

    auto anotherList = doublyLinkedList([54, 23, 8]);
    // looks a bit uglier but reuses the already allocated nodes
    auto anotherSortedList = anotherList.nodes
        .array
        .sort!((n1, n2) => n1.data < n2.data)
        .doublyLinkedListFromNodePointerRange;
    anotherSortedList.each!writeln;
}
```

Modulo bugs of course ;)

This uses the GC. If you want to avoid it in favor of malloc (or std.experimental.allocator), you need to add `free`s (and possibly reference counting) accordingly.
Aug 13 2019