digitalmars.D.learn - Looking for documentation of D's lower-level aspects.

Sean Silva <chisophugis gmail.com> writes:

I have just finished reading Alexandrescu's The D Programming Language, but it
hardly seems to talk about how to use D as a stand-in for C/C++ at all, i.e.,
the part of D that doesn't depend on a runtime or garbage collector.

It's not that I have anything against those niceties---it all has its place;
it's just that I would like to learn how to use D as a replacement for C/C++,


like having the safe D subset though, since it allows awesomeness to happen at
compile-time (this is one of my favorite things about D, actually).

As an example of the lack of coverage in this area, I just found this line in
the source code of SList in std.container:

     ahead = n._next;

<https://github.com/D-Programming-Language/phobos/blob/master/std/container.d#L895>

The C/C++ equivalent of this is `ahead = n->next;`, or equivalently `ahead =
(*n).next;`. This is a difference in semantics from C/C++ with respect to the
`.`---it seems like D turns pointer to struct property accesses into property
access with indirection. Nowhere that I can recall in Alexandrescu's book talked
about this, but it's a really big deal!
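
To make the observation above concrete, here is a minimal sketch of member access through a pointer in D. The `Node` struct and the `sum` function are hypothetical names made up for illustration (they are not taken from std.container); the only point is that `.` works on a pointer to a struct the way `->` does in C/C++. As far as I know, this applies to a single level of indirection only.

```d
// Hypothetical singly linked node, loosely modeled on the `_next` field above.
struct Node
{
    int payload;
    Node* _next;
}

// Sum the payloads by walking the list through raw pointers.
int sum(Node* head)
{
    int total = 0;
    // `n._next` on a Node* reads the field through the pointer,
    // i.e. it means the same as `(*n)._next`.
    for (Node* n = head; n !is null; n = n._next)
        total += n.payload;
    return total;
}
```

Switching `head` from a pointer to a plain struct value would leave the `n.payload` style of access unchanged, which is the maintenance benefit this syntax buys.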

Can I get some pointers (no pun intended) to resources about how to use D in
this role? I'm primarily interested in writing optimal (i.e. equivalent to
hand-coded C) generic data structures (like STL, for instance), and this entails
knowing how to use D at the lowest (well, just above asm) level. I'm sure that D
has a means to do this, almost surely better than C++, but I can't seem to find
any documentation about how to go about it.

Oct 17 2011

Trass3r <un known.com> writes:

      ahead = n._next;

 The C/C++ equivalent of this is `ahead = n->next;`, or equivalently  
 `ahead = (*n).next;`. This is a difference in semantics from C/C++ with  
 respect to the `.`---it seems like D turns pointer to struct property  
 accesses into property access with indirection.

Yes. It was really dumb to introduce `->` in C back then, because you can't
easily change from a pointer to a class to a real class without editing
all the places where it is accessed.
D chose the sane and safer way of letting the compiler figure out what to
do.

 Nowhere that I can recall in Alexandrescu's book talked about this, but  
 it's a really big deal!

I can't recall where I read about it back then, but I did know it soon  
after I had started learning D.
Some of the differences from C/C++ are explained here:
http://www.d-programming-language.org/ctod.html
Though it could use an overhaul.

As for getting rid of the GC, it is theoretically possible.
But nobody has put much effort into making it work yet (because the only
application platform is still x86/64).
I guess it will become necessary though once D conquers ARM (we can  
generate code for it with LDC/GDC but druntime isn't ready).

Oct 18 2011

Jacob Carlborg <doob me.com> writes:

On 2011-10-18 12:50, Trass3r wrote:
 ahead = n._next;

 The C/C++ equivalent of this is `ahead = n->next;`, or equivalently
 `ahead = (*n).next;`. This is a difference in semantics from C/C++
 with respect to the `.`---it seems like D turns pointer to struct
 property accesses into property access with indirection.

 Yes. It was really dumb to introduce that in C back then cause you can't
 easily change from a pointer to a class to a real class without editing
 all places where it is accessed.
 D chose the sane and safer way of letting the compiler figure out what
 to do.

 Nowhere that I can recall in Alexandrescu's book talked about this,
 but it's a really big deal!

 I can't recall where I read about it back then, but I did know it soon
 after I had started learning D.
 Some of the differences to C/C++ are explained there:
 http://www.d-programming-language.org/ctod.html
 Though it could use an overhaul.

 As for getting rid of the GC, it is theoretically possible.
 But nobody has put much effort into making it work yet (cause the only
 application platform is still x86/64).
 I guess it will become necessary though once D conquers ARM (we can
 generate code for it with LDC/GDC but druntime isn't ready).

The GC can be replaced at link time. Tango contains an example of a GC 
that uses malloc, which should work with druntime as well. Another option is 
to not link a GC at all; then there will be linker errors when using 
something that needs the GC.

-- 
/Jacob Carlborg

Oct 18 2011

Jesse Phillips <jessekphillips+D gmail.com> writes:

Sean Silva Wrote:

 I have just finished reading Alexandrescu's The D Programming Language, but it
 doesn't seem to talk at all about how to use D as a stand-in for C/C++ almost
 at all. E.g., the part of D that doesn't depend on a runtime or garbage
 collector.

There isn't any real documentation on this kind of stuff. Instead, you can find
some links that try to teach D to C programmers.

http://www.prowiki.org/wiki4d/wiki.cgi?ComingFrom/PlainC

Also, in terms of programming without a GC, very little has been done on this
front to make it easy. There has been talk about a compiler switch to help with
it and a stubbed GC. Right now D isn't ready to be used in this fashion, imo,
but someone is more than welcome to run with it and make it a reality.

But doing low-level things with the GC is still possible and similar to C.

Oct 18 2011

Sean Silva <chisophugis gmail.com> writes:

== Quote from Jesse Phillips (jessekphillips+D gmail.com)'s article
 Right now D isn't ready to be used in this fashion


It looks like there's a more-or-less functional kernel written in D (and
pretty well documented too):
http://wiki.xomb.org/index.php?title=Main_Page

Oct 19 2011

Trass3r <un known.com> writes:

On 20.10.2011 at 00:06, Sean Silva <chisophugis gmail.com> wrote:
 == Quote from Jesse Phillips (jessekphillips+D gmail.com)'s article
 Right now D isn't ready to be used in this fashion

 It looks there's a more-or-less functional kernel written in D (and
 pretty well documented too):
 http://wiki.xomb.org/index.php?title=Main_Page

"You do not need the Tango standard library to compile XOmB as it contains  
its own standard calls and runtime."

Oct 19 2011

Jacob Carlborg <doob me.com> writes:

On 2011-10-20 00:59, Trass3r wrote:
 On 20.10.2011 at 00:06, Sean Silva <chisophugis gmail.com> wrote:
 == Quote from Jesse Phillips (jessekphillips+D gmail.com)'s article
 Right now D isn't ready to be used in this fashion

 It looks there's a more-or-less functional kernel written in D (and
 pretty well documented too):
 http://wiki.xomb.org/index.php?title=Main_Page

 "You do not need the Tango standard library to compile XOmB as it
 contains its own standard calls and runtime."

There's not much choice if you develop a kernel.

-- 
/Jacob Carlborg

Oct 19 2011

Sean Silva <chisophugis gmail.com> writes:

== Quote from Trass3r (un known.com)'s article
 On 20.10.2011 at 00:06, Sean Silva <chisophugis gmail.com> wrote:
 == Quote from Jesse Phillips (jessekphillips+D gmail.com)'s article
 Right now D isn't ready to be used in this fashion

 It looks there's a more-or-less functional kernel written in D (and
 pretty well documented too):
 http://wiki.xomb.org/index.php?title=Main_Page

 "You do not need the Tango standard library to compile XOmB as it contains
 its own standard calls and runtime."

that was my point ... that it seems that D *is* "ready to be used in
this fashion" ;)

Oct 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, October 22, 2011 01:20:05 Sean Silva wrote:
 == Quote from Trass3r (un known.com)'s article
 On 20.10.2011 at 00:06, Sean Silva <chisophugis gmail.com> wrote:
 == Quote from Jesse Phillips (jessekphillips+D gmail.com)'s article
 Right now D isn't ready to be used in this fashion

 It looks there's a more-or-less functional kernel written in D (and
 pretty well documented too):
 http://wiki.xomb.org/index.php?title=Main_Page

 "You do not need the Tango standard library to compile XOmB as it contains
 its own standard calls and runtime."

 that was my point ... that it seems that D *is* "ready to be used in
 this fashion" ;)

You _can_ use D with little or no GC, but you have to be very careful. A good 
chunk of the standard library would be completely unusable without the GC 
(primarily anything which might allocate or append to an array), you have to 
be very careful when using arrays (since appending to them wouldn't work, and 
you have to worry about who owns an array so that slices don't result in 
memory leaks or in you using a slice which has already been freed), and there 
are some cases where something might allocate using the GC when you don't 
expect it. For instance,

int[3] a = [1, 2, 3];

currently allocates a dynamic array which is then copied into the static array. 
It shouldn't allocate like that, and it _will_ be fixed so that it doesn't, but 
for now it does.

Generally, I'd say that the best way to deal with D is to just not worry about 
the GC until you profile your code and see that it's a problem. If you don't need 
inheritance and so are using primarily structs rather than classes, you often 
don't need to allocate much on the heap. If you're not doing a lot with 
classes, the primary thing on the heap would be arrays (including strings). 
But if you're smart about avoiding unnecessary allocations, the abilities that 
the GC gives you with arrays (such as concatenation and the ability to use 
slices without worrying about how many references to it there are) are well 
worth it.

Essentially, as long as you avoid constantly allocating stuff on the heap, the 
GC shouldn't cause you much trouble.
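
As a rough illustration of the ownership concern described above, the following sketch allocates an array with C's `malloc` instead of the GC. The helper names `allocInts`, `freeInts` and `sumOfSquares` are hypothetical; `malloc` and `free` come from druntime's `core.stdc.stdlib` bindings. It assumes the caller never appends to the slice, since `~=` on such a slice would fall back to a GC allocation.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical helper: allocate `count` ints with C's malloc, bypassing the GC.
int[] allocInts(size_t count)
{
    auto p = cast(int*) malloc(count * int.sizeof);
    if (p is null)
        return null;
    int[] block = p[0 .. count]; // slicing a pointer gives a bounds-checked view
    block[] = 0;                 // malloc'd memory is uninitialized, so clear it
    return block;
}

// The owner must free the block; slices still pointing into it dangle afterwards.
void freeInts(int[] block)
{
    free(block.ptr);
}

// Fill a manually managed buffer with i*i and add the values up.
int sumOfSquares(size_t count)
{
    int[] buf = allocInts(count);
    scope (exit) freeInts(buf);  // released on every path out of the function
    foreach (i, ref x; buf)
        x = cast(int)(i * i);
    int total = 0;
    foreach (x; buf)
        total += x;
    return total;
}
```

The `scope (exit)` line is what replaces the GC here: it ties the lifetime of the block to one function, which is exactly the "who owns this array" question the post raises.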

- Jonathan M Davis

Oct 21 2011

Sean Silva <chisophugis gmail.com> writes:

 You _can_ use D with no to minimal GC, but you have to be very careful. A good
 chunk of the standard library would be completely unusable without the GC
 (primarily anything which might allocate or append to an array), you have to
 be very careful when using arrays (since appending to them wouldn't work, and
 you have to worry about who owns an array so that slices don't result in
 memory leaks or in you using a slice which has already been freed), and there
 are some cases where something might allocate using the GC when you don't
 expect it. For instance,
 int[3] a = [1, 2, 3];
 currently allocates a dynamic array which is then copied into the static array.
 It shouldn't allocate like that, and it _will_ be fixed so that it doesn't, but
 for now it does.
 Generally, I'd say that the best way to deal with D is to just not worry about
 the GC until you profile your code and see that it's a problem. If you don't need
 inheritance and so are using primarily structs rather than classes, you often
 don't need to allocate much on the heap. If you're not doing a lot with
 classes, the primary thing on the heap would be arrays (including strings).
 But if you're smart about avoiding unnecessary allocations, the abilities that
 the GC gives you with arrays (such as concatenation and the ability to use
 slices without worrying about how many references to it there are) are well
 worth it.
 Essentially, as long as you avoid constantly allocating stuff on the heap, the
 GC shouldn't cause you much trouble.
 - Jonathan M Davis

I don't have anything against GC or just having the GC hang around. In fact, it
is GC and D's "nice" dynamic arrays that let it do so much cool stuff at compile
time (since they are safe), which is one of the biggest wins of D IMO.

For me, it's not so much a matter of performance as that I personally prefer to
have a coherent interface to all my containers, so even though it is a bit of
syntactic overhead, I would prefer `Vector!int` and `List!int` to `int[]` and
`List!int`. Built-in niceties are great, but for real work I prefer to have a
coherent library. E.g. my C++ code using STL has *far* fewer bugs than my Python
code, and I'm equally knowledgeable about both languages, maybe Python a bit
more since I've used it longer (that STL achieves raw C performance is another
big benefit). For me, it's the same reason that it's preferable to have `sort()`
be a free function (within a coherent algorithms library like std.algorithm)
rather than a method of the built-in arrays (I noticed that D has `array.sort`,
but I think it must be a carry-over from the past before D had std.algorithm; or
maybe it is for compile-time programming?).

Oct 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, October 22, 2011 04:12:36 Sean Silva wrote:
 You _can_ use D with no to minimal GC, but you have to be very careful.
 A good chunk of the standard library would be completely unusable
 without the GC (primarily anything which might allocate or append to an
 array), you have to be very careful when using arrays (since appending
 to them wouldn't work, and you have to worry about who owns an array so
 that slices don't result in memory leaks or you using a slice which has
 already been freed), and there are some cases where something might
 allocate using the GC when you don't expect it. For instance,
 int[3] a = [1, 2, 3];
 currently allocates a dynamic array which is then copied into the static
 array. It shouldn't allocate like that, and it _will_ be fixed so that
 it doesn't, but for now it does.
 Generally, I'd say that the best way to deal with D is to just not worry
 about the GC until you profile your code and see that it's a problem.
 If you don't need inheritance and so are using primarily structs rather
 than classes, you often don't need to allocate much on the heap. If
 you're not doing a lot with classes, the primary thing on the heap
 would be arrays (including strings). But if you're smart about avoiding
 unnecessary allocations, the abilities that the GC gives you with
 arrays (such as concatenation and the ability to use slices without
 worrying about how many references to it there are) are well worth it.
 Essentially, as long as you avoid constantly allocating stuff on the
 heap, the GC shouldn't cause you much trouble.
 - Jonathan M Davis

 
 I don't have anything against GC or just having the GC hang around. In fact,
 it is GC and D's "nice" dynamic arrays that let it do so much cool stuff at
 compile time (since they are safe), which is one of the biggest wins of D
 IMO.
 
 For me, it's not so much a matter of performance as that I personally prefer
 to have a coherent interface to all my containers, so even though it is a
 bit of syntactic overhead, I would prefer `Vector!int` and `List!int` than
 `int[]` and `List!int`. Built-in niceties are great, but for real work I
 prefer to have a coherent library. E.g. my C++ code using STL has *far*
 fewer bugs than my Python code, and I'm equally knowledgeable about both
 languages, maybe Python a bit more since I've used it longer (that STL
 achieves raw C performance is another big benefit). For me, it's the same
 reason that it's preferable to have `sort()` be a free function (within a
 coherent algorithms library like std.algorithm) rather than a method of the
 built-in arrays (I noticed that D has `array.sort`, but I think it must be
 a carry-over from the past before D had std.algorithm; or maybe it is for
 compile-time programming?).

The built-in sort on arrays is going away. std.algorithm.sort should be used 
instead. I'm afraid that I don't understand what your comments on the STL have 
to do with the GC though. And stuff in Phobos (such as std.algorithm) is very 
much like the STL - only it uses ranges instead of iterators. The main item 
lacking is a comprehensive list of containers, and the main reason that 
std.container doesn't have more yet is because the custom allocator stuff is 
still being sorted out, and Andrei doesn't want to implement them all and then 
have to change them to work with custom allocators. So, once that's done, I 
don't know what the STL would really give you that Phobos doesn't.
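
For reference, a small sketch of the free-function style recommended above. The wrapper names `sortedAscending` and `sortedDescending` are hypothetical; `std.algorithm.sort` itself operates in place on any random-access range, including a built-in array.

```d
import std.algorithm : sort;

// Return an ascending copy of the input using std.algorithm.sort.
int[] sortedAscending(const int[] input)
{
    int[] copy = input.dup;
    sort(copy);                    // in place, default predicate "a < b"
    return copy;
}

// Return a descending copy by passing a custom predicate.
int[] sortedDescending(const int[] input)
{
    int[] copy = input.dup;
    sort!((a, b) => a > b)(copy);  // ordering supplied as a template argument
    return copy;
}
```

Because the predicate is a template argument, the comparison can be inlined, which is the same mechanism that lets STL algorithms match hand-written C.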

- Jonathan M Davis

Oct 21 2011

Sean Silva <chisophugis gmail.com> writes:

== Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 The built-in sort on arrays is going away. std.algorithm.sort should be used
 instead. I'm afraid that I don't understand what your comments on the STL have
 to do with the GC though. And stuff in Phobos (such as std.algorithm) is very
 much like the STL - only it uses ranges instead of iterators. The main item
 lacking is a comprehensive list of containers, and the main reason that
 std.container doesn't have more yet is because the custom allocator stuff is
 still being sorted out, and Andrei doesn't want to implement them all and then
 have to change them to work with custom allocators. So, once that's done, I
 don't know what the STL would really give you that Phobos doesn't.
 - Jonathan M Davis

I don't doubt anything that you just said. But as you said, Phobos
*currently* doesn't have what I want, which is an issue if I am
wanting to develop code now or soon. The path of least resistance in
the interim is to just implement some familiar STL containers in such
a way that they work with std.algorithm, and just use them.
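
A minimal sketch of what "work with std.algorithm" might look like for a home-grown container: the container hands out a range from `opSlice`, and the range supplies `empty`, `front` and `popFront`. `IntList` and its members are hypothetical names for illustration only, and this version still allocates its nodes with the GC via `new` to keep the example short.

```d
import std.algorithm : equal, map;
import std.range.primitives : isInputRange;

// Hypothetical minimal singly linked list of ints.
struct IntList
{
    private static struct Node { int value; Node* next; }
    private Node* head;

    // Prepend a value (GC-allocated node, for brevity).
    void insertFront(int value) { head = new Node(value, head); }

    // The range type: a lightweight cursor over the nodes.
    static struct Range
    {
        private Node* current;
        @property bool empty() const { return current is null; }
        @property int front() const { return current.value; }
        void popFront() { current = current.next; }
    }

    // `list[]` yields a range over all elements, as std.container does.
    Range opSlice() { return Range(head); }
}

// Compile-time check that the cursor satisfies the input range interface.
static assert(isInputRange!(IntList.Range));
```

With that in place, calls such as `list[].map!(x => x * 2)` or `equal(list[], [1, 2, 3])` should work without the algorithms knowing anything about the list.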

Also, do you know whether any of Andrei's thoughts on the directions
he wants to take std.container are available online? I'm just curious
because I've seen mention of an article "Sealed Containers" he was
going to write, but it seems that it fell through. I'm just interested
because if I'm going to be writing containers from scratch, I would
like to take some of his ideas and put them into practice, both to be
forward-looking to what std.container will eventually have, and to
maybe help out by field-testing some of the ideas.

Oct 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, October 22, 2011 17:04:57 Sean Silva wrote:
 I don't doubt anything that you just said. But as you said, Phobos
 *currently* doesn't have what I want, which is an issue if I am
 wanting to develop code now or soon. The path of least resistance in
 the interim is to just implement some familiar STL containers in such
 a way that they work with std.algorithm, and just use them.
 
 Also, do you know whether any of Andrei's thoughts on the directions
 he wants to take std.container are available online? I'm just curious
 because I've seen mention of an article "Sealed Containers" he was
 going to write, but it seems that it fell through. I'm just interested
 because if I'm going to be writing containers from scratch, I would
 like to take some of his ideas and put them into practice, both to be
 forward-looking to what std.container will eventually have, and to
 maybe help out by field-testing some of the ideas.

The primary intended API is listed at the top of std.container. The function 
names and the worst-case complexity that each is supposed to have are listed 
there. The main planned changes that I'm aware of are to

1. Make all containers final classes (Array and SList are currently 
reference-counted structs).

2. Add support for custom allocators (likely done at runtime rather than as 
template arguments, but the custom allocator situation is still being sorted 
out).

3. Add more containers.

How that relates to sealed containers, I don't know. I don't remember the 
details on them, so I don't remember if what we have is sealed or not, let 
alone whether what we're going to have is sealed or not. If you want 
additional container classes in the interim, I'd suggest checking out 
dcollections ( http://dsource.org/projects/dcollections ). The implementation 
for std.container.RedBlackTree came from there.

- Jonathan M Davis

Oct 22 2011

Sean Silva <chisophugis gmail.com> writes:

== Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 On Saturday, October 22, 2011 17:04:57 Sean Silva wrote:
 The main planned changes that I'm aware of are to
 1. Make all containers final classes (Array and SList are currently
 reference-counted structs).

What is the rationale for making them classes, if they are going to be
final?

Oct 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, October 23, 2011 00:01:42 Sean Silva wrote:
 == Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 On Saturday, October 22, 2011 17:04:57 Sean Silva wrote:
 The main planned changes that I'm aware of are to
 1. Make all containers final classes (Array and SList are currently
 reference-counted structs).

 What is the rationale for making them classes, if they are going to be
 final?

They're all supposed to be reference types. Initially, we went with structs, 
but it was eventually decided to just go with classes instead, and making them 
final makes it so that the compiler can make their functions non-virtual 
and therefore more efficient.
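
To illustrate the difference being discussed, here is a small sketch contrasting a struct (value type) with a final class (reference type). `ValueStack`, `RefStack` and the two functions are hypothetical stand-ins, not the std.container types.

```d
// Hypothetical value-type container: copied when passed by value.
struct ValueStack
{
    int[4] items;   // storage lives inside the struct itself
    size_t count;
    void push(int x) { items[count++] = x; }
}

// Hypothetical reference-type container: variables hold a reference.
final class RefStack
{
    int[4] items;
    size_t count;
    // `final` on the class means this method cannot be overridden,
    // so the call does not need virtual dispatch.
    void push(int x) { items[count++] = x; }
}

// Receives a copy, so the caller's stack is left untouched.
size_t pushViaCopy(ValueStack s) { s.push(1); return s.count; }

// Receives a reference, so the caller's stack is modified.
size_t pushViaRef(RefStack s) { s.push(1); return s.count; }
```

After `pushViaCopy(v)` the caller's `v.count` is still 0, while after `pushViaRef(r)` the caller's `r.count` is 1.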

- Jonathan M Davis

Oct 22 2011

Sean Silva <chisophugis gmail.com> writes:

== Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 On Sunday, October 23, 2011 00:01:42 Sean Silva wrote:
 They're all supposed to be reference types.

What prompted the decision for that? Doesn't that incur an extra heap
allocation for the containers, and an extra level of indirection? I mean,
with value-semantics like STL containers, you can use it like a value, e.g.
as a local in a function, and have no overhead, but if you want to wrap a
class around it and have it by reference, then you can, and it is no less
efficient than if the containers were written that way. But if the
containers are already by reference, you can't return them to having value
semantics without adding even more indirection and inefficiency.

Oct 22 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, October 23, 2011 03:25:48 Sean Silva wrote:
 == Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 On Sunday, October 23, 2011 00:01:42 Sean Silva wrote:
 They're all supposed to be reference types.

 What prompted the decision for that? Doesn't that incur an extra heap
 allocation for the containers, and an extra level of indirection? I mean,
 with value-semantics like STL containers, you can use it like a value, e.g.
 as a local in a function, and have no overhead, but if you want to wrap a
 class around it and have it by reference, then you can, and it is no less
 efficient than if the containers were written that way. But if the
 containers are already by reference, you can't return them to having value
 semantics without adding even more indirection and inefficiency.

I'd have to go digging through the newsgroup archives to give all of the 
reasons but it basically comes down to the fact that people very rarely want 
to pass containers by value, and the fact that C++ makes it so that the 
default behavior of passing containers is to copy them is a frequent source of 
bugs. In C++, containers are almost always passed by reference with & or * 
rather than passing them by value, and if you forget to make the function take 
or return via reference or pointer, it's a major performance hit. So IIRC, 
Andrei thought that the fact that C++ containers are value types was a big 
mistake of C++.

- Jonathan M Davis

Oct 22 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 23.10.2011 at 06:04, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 I'd have to go digging through the newsgroup archives to give all of the
 reasons but it basically comes down to the fact that people very rarely want
 to pass containers by value, and the fact that C++ makes it so that the
 default behavior of passing container is to copy them is a frequent source of
 bugs. In C++, containers are almost always passed by reference with & or *
 rather than passing them by value, and if you forget to make the function take
 or return via reference or pointer, it's a major performance hit. So IIRC,
 Andrei thought that the fact that C++ containers are value types was a big
 mistake of C++.

 - Jonathan M Davis

It is also what people not coming from C++ expect and I honestly think the  
containers should be more newbie friendly. RedBlackTree was a struct, that  
could not exist without at least one element in it. So instead of  
allocating it in the constructor of a class and using it in one of its  
methods, wherever you add an element you have to check if you have to  
create it first.
If the community is really split that much amongst people who want the  
most low-level containers as structs and people who want easy to use  
classes, then we actually need two container modules in Phobos. It is not  
done by saying, "you can wrap them in a class". Because then they are not  
easy to use in the first place.

Maybe we have so many arguments over how stuff should be done, because D  
can be used from scripting to writing OS kernels. There are probably three  
or more audiences with their expectations and in a few years we will see  
Tango resurrected by one of the thirds. Shouldn't those people with a very  
clear idea on how a container should be implemented, just implement their  
own or look for third party libraries and those who know little about  
containers can just use some wishy-washy container that supports most  
common needs (can create empty, is by reference, has removal method, ...)?  
After all most of the performance stems from the used algorithm, not from  
going more low-level.

- Marco

Nov 09 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 23.10.2011 7:25, Sean Silva wrote:
 == Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 On Sunday, October 23, 2011 00:01:42 Sean Silva wrote:
 They're all supposed to be reference types.

 What prompted the decision for that? Doesn't that incur an extra heap
 allocation for the containers, and an extra level of indirection? I mean,
 with value-semantics like STL containers, you can use it like a value, e.g.
 as a local in a function, and have no overhead,

AFAIK, with e.g. a local vector<T>, only a small amount of info is stored on 
the stack (ptr, length, capacity, whatever else). It still allocates its 
elements on the heap. Plus there are ways to place a class instance on the 
stack or anywhere else (e.g. some scratch memory page) if you really need to; 
look for emplace in Phobos.
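
A rough sketch of the emplace approach mentioned above, assuming the class overload of `std.conv.emplace` that takes a `void[]` chunk. `Counter` and `bumpOnStack` are hypothetical names; the buffer is declared as an array of pointers purely so that it is suitably aligned for this particular class, which is an assumption that would need revisiting for classes with stricter alignment.

```d
import std.conv : emplace;

// Hypothetical class used only for illustration.
final class Counter
{
    int n;
    this(int start) { n = start; }
    void bump() { ++n; }
}

// Construct a Counter inside a stack buffer instead of on the GC heap.
int bumpOnStack(int start)
{
    enum size = __traits(classInstanceSize, Counter);
    // Pointer-sized elements keep the buffer pointer-aligned.
    void*[(size + (void*).sizeof - 1) / (void*).sizeof] buffer = void;

    // Runs the constructor in place; the reference points into `buffer`.
    Counter c = emplace!Counter(buffer[], start);
    c.bump();
    immutable result = c.n;
    destroy(c);   // finalize the object before the buffer goes out of scope
    return result;
}
```

The reference `c` must not escape the function, since the memory behind it disappears with the stack frame.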

 but if you want to wrap a class around it and have it by reference, then you
 can, and it is no less efficient than if the containers were written that
 way. But if the containers are already by reference, you can't return them
 to having value semantics without adding even more indirection and
 inefficiency.

Less efficient is a moot point.
When you do iteration and other stuff you'd use a range, like you'd use 
iterators in C++. A range gets stack/register-allocated pointers directly 
to the data (or close to it, depending on the container), so the only extra 
cost of a reference type compared to a value type is the first indirect access 
to construct the range, and it's negligible.


-- 
Dmitry Olshansky

Oct 23 2011

Sean Silva <chisophugis gmail.com> writes:

== Quote from Dmitry Olshansky (dmitry.olsh gmail.com)'s article
 Less efficient is a moot point.
 When you do iteration and other stuff you'd use range, like you'd use
 iterators in c++. Range gets stack/register allocated pointers directly
 to data (or close to it, that depends on container) so the only extra
 cost in reference type compared to value is the first indirect access to
 construct range and it's negligible.

The problem isn't the speed of iteration, it's the extra heap traffic
that is involved. I mean, for your average app this isn't going to
matter; those are the apps that can just as easily be written in

that is pretty much always written in C/C++).

For example, if you look in the LLVM source tree, you'll see that they
bend over backwards to avoid heap allocations. For example, in some
cases, std::vector causes too much heap traffic so they have
SmallVector which preallocates a certain amount of storage *inside* of
the object itself in order to avoid heap traffic if the number of
elements doesn't exceed some predetermined amount. Even still, LLVM
uses std::vector all over the place, and I've never seen a
std::vector embedded in a class by reference; it is always held by
value precisely because then you don't do an unnecessary heap
allocation.

I'm positive that if std::vector involved a heap allocation for the
vector object itself, llvm would basically have rewritten a heap-less
vector object, just like they have done in the more extreme case for
SmallVector. But the thing is that in D, it is possible to write an
easy-to-use vector for which it is a one-liner to switch between GC
heap-allocated vector object, by-value vector, preallocated internal
vector (like SmallVector), and beyond!
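
As a sketch of the SmallVector idea described above (not LLVM's actual implementation), the struct below keeps up to `N` elements inside the object and only falls back to the C heap when that is exceeded. All names are hypothetical, copying is disabled to keep it short, and it is restricted to element types without pointers so that the malloc'd block never needs to be scanned by the GC.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;
import std.traits : hasIndirections;

// Vector with N elements of inline storage and a malloc fallback.
struct SmallVector(T, size_t N)
{
    static assert(N > 0, "inline capacity must be positive");
    static assert(!hasIndirections!T, "sketch supports plain value types only");

    private T[N] inlineStore; // storage embedded in the object itself
    private T* heap;          // null while the inline storage is in use
    private size_t len;
    private size_t cap = N;

    @disable this(this);      // no copies, to keep ownership trivial

    ~this() { if (heap !is null) free(heap); }

    @property size_t length() const { return len; }
    @property bool onHeap() const { return heap !is null; }

    // Append one element, growing onto the heap only when needed.
    void put(T value)
    {
        if (len == cap)
            grow();
        data()[len++] = value;
    }

    ref T opIndex(size_t i)
    {
        assert(i < len, "index out of range");
        return data()[i];
    }

    // Pointer to whichever storage is currently active.
    private T* data() { return heap !is null ? heap : inlineStore.ptr; }

    // Double the capacity, moving the elements into a fresh malloc'd block.
    private void grow()
    {
        immutable newCap = cap * 2;
        auto p = cast(T*) malloc(newCap * T.sizeof);
        assert(p !is null, "out of memory");
        memcpy(p, data(), len * T.sizeof);
        if (heap !is null)
            free(heap);
        heap = p;
        cap = newCap;
    }
}
```

Changing the inline capacity is then a matter of editing one template argument, e.g. `SmallVector!(int, 4)` versus `SmallVector!(int, 64)`, which is roughly the kind of one-line switch the post has in mind.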

I admit, I'm very biased because my use case for D is low-level
systems programming a la C/C++, so naturally I want a standard library
that will not compromise on aspects that are important for the kinds
of programs that I write. Nonetheless, if the goal is to have "good
enough" containers, then it doesn't matter, but if the goal is to have
"truly optimal" containers (as I think it should be; D is certainly
powerful enough to pull it off elegantly), then it does matter.

Oct 23 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 24.10.2011 5:57, Sean Silva wrote:
 == Quote from Dmitry Olshansky (dmitry.olsh gmail.com)'s article
 Less efficient is a moot point.
 When you do iteration and other stuff you'd use range, like you'd use
 iterators in c++. Range gets stack/register allocated pointers directly
 to data (or close to it, that depends on container) so the only extra
 cost in reference type compared to value is the first indirect access to
 construct range and it's negligible.

 The problem isn't the speed of iteration, it's the extra heap traffic
 that is involved. I mean, for your average app this isn't going to
 matter; those are the apps that can just as easily be written in

 that is pretty much always written in C/C++).

While we are going into very specialized territory, I must note that
there will never be a clear winner in this battle.
E.g. a null reference saves you from one notable overhead: default
initialization and checking for it. And while it was some time ago, I do
remember that some std vectors preallocated space on the *heap* for around
10 elements; this might not be true now, of course.

 For example, if you look in the LLVM source tree, you'll see that they
 bend over backwards to avoid heap allocations. For example, in some
 cases, std::vector causes too much heap traffic so they have
 SmallVector which preallocates a certain amount of storage *inside* of
 the object itself in order to avoid heap traffic if the number of
 elements doesn't exceed some predetermined amount.

Like I said before, you can place class instances wherever you want;
in this case it's most likely a pool, a free list, or stack space. The only
difference is that manual memory management is not that common in D yet
and is fully explicit, done on a case-by-case basis (no class allocators
etc.). And the "small string optimization" is not going away either, that's
for sure.
Seeing that you are concerned with memory layouts and such, you might as
well add some thought power to push the allocator design forward. IIRC
David Simcha is working on a version 2 proposal for it.

 Even still, LLVM
 uses std::vector all over the place, and I've never seen a
 std::vector embedded in a class by reference; it is always held by
 value precisely because then you don't do an unnecessary heap
 allocation.

And let me guess: these objects are passed by... reference? And why?
Because they are too big as value types, which sort of defeats the value
type doctrine here.

 I'm positive that if std::vector involved a heap allocation for the
 vector object itself, llvm would basically have rewritten a heap-less
 vector object, just like they have done in the more extreme case for
 SmallVector. But the thing is that in D, it is possible to write an
 easy-to-use vector for which it is a one-liner to switch between GC
 heap-allocated vector object, by-value vector, preallocated internal
 vector (like SmallVector), and beyond!

I note that heap/no heap is a completely orthogonal matter; the default,
however, is safe and reasonably fast. And IMO changing storage most of the
time involves some work behind the scenes to make it a "one-liner".

 I admit, I'm very biased because my use case for D is low-level
 systems programming a la C/C++, so naturally I want a standard library
 that will not compromise on aspects that are important for the kinds
 of programs that I write. Nonetheless, if the goal is to have "good
 enough" containers, then it doesn't matter, but if the goal is to have
 "truly optimal" containers (as I think it should be; D is certainly
 powerful enough to pull it off elegantly), then it does matter.

Aye, being a C++ turncoat myself I can understand this sentiment ;)

-- 
Dmitry Olshansky

Oct 24 2011

bearophile <bearophileHUGS lycos.com> writes:

Sean Silva:

 in some
 cases, std::vector causes too much heap traffic so they have
 SmallVector which preallocates a certain amount of storage *inside* of
 the object itself in order to avoid heap traffic if the number of
 elements doesn't exceed some predetermined amount.

I expect Phobos to eventually gain such a data structure too, if it's so useful.
It doesn't look hard to implement.


 Even still, LLVM
 uses std::vector all over the place, and I've never seen a
 std::vector embedded in a class by reference; it is always held by
 value precisely because then you don't do an unnecessary heap
 allocation.

If the vector needs to grow you can't store it all in the class instance. If it
can't grow, in D you use a fixed-size array inside the class instance. So the
situation seems the same as C++, or better.
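
The C++ side of that comparison can be checked with a small sketch. The
names FixedHolder, GrowableHolder and stored_inside are made up for this
example; it only tests whether the element storage lies within the bytes
of the owning object.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// A fixed-size array member: the elements are part of the object itself.
struct FixedHolder {
    std::array<int, 8> items{};
};

// A growable vector member: the object holds only bookkeeping (pointers),
// the elements live in a separate heap buffer.
struct GrowableHolder {
    std::vector<int> items;
};

// Returns true if address p falls within the bytes occupied by obj.
template <typename T>
bool stored_inside(const T& obj, const void* p) {
    auto begin = reinterpret_cast<std::uintptr_t>(&obj);
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    return addr >= begin && addr < begin + sizeof(T);
}
```

For a FixedHolder f, stored_inside(f, f.items.data()) holds; for a
GrowableHolder g with at least one element, stored_inside(g, g.items.data())
does not, because the buffer has to be able to move when the vector grows.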


 I admit, I'm very biased because my use case for D is low-level
 systems programming a la C/C++, so naturally I want a standard library
 that will not compromise on aspects that are important for the kinds
 of programs that I write. Nonetheless, if the goal is to have "good
 enough" containers, then it doesn't matter, but if the goal is to have
 "truly optimal" containers (as I think it should be; D is certainly
 powerful enough to pull it off elegantly), then it does matter.

Phobos data structures are meant to be "very good" in their implementation. But
every data structure is the result of a balance of several trade-offs. Phobos
and D try to lead to correct code, because a fast but wrong program is useless.
Even C++ STL is not the fastest possible, so people write other libraries:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html

Bye,
bearophile

Oct 24 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 24.10.2011, 03:57 Uhr, schrieb Sean Silva <chisophugis gmail.com>:

 For example, if you look in the LLVM source tree, you'll see that they
 bend over backwards to avoid heap allocations. For example, in some
 cases, std::vector causes too much heap traffic so they have
 SmallVector which preallocates a certain amount of storage *inside* of
 the object itself in order to avoid heap traffic if the number of
 elements doesn't exceed some predetermined amount. [...]
 But the thing is that in D, it is possible to write an
 easy-to-use vector for which it is a one-liner to switch between GC
 heap-allocated vector object, by-value vector, preallocated internal
 vector (like SmallVector), and beyond!

Then write your special vector class. I'm sure you can optimize it more  
than anything that Phobos could ever offer. My point is that you know  
exactly what you want for your special use case and something in a  
standard library should not be biased or deviate too much from the 'book  
version' of a container. If anything I would support the inclusion of Judy  
Arrays, or other specialized data structures, that have a name that rings  
a bell.

Nov 09 2011
