digitalmars.D - Escaping the Tyranny of the GC: std.rcstring, first blood
- Andrei Alexandrescu (39/39) Sep 14 2014 Walter, Brad, myself, and a couple of others have had a couple of quite
- Vladimir Panteleev (19/21) Sep 14 2014 An unrelated question, but how will reference counting work with
- Andrei Alexandrescu (11/28) Sep 14 2014 At least for the time being, bona fide class objects with refcounted
- Vladimir Panteleev (6/11) Sep 15 2014 RefCounted currently does not work at all with class objects.
- Andrei Alexandrescu (8/18) Sep 15 2014 Yes, we should define RefCounted for classes as well. (Sorry, I was
- deadalnix (18/18) Sep 14 2014 I don't want to be the smart ass that did nothing and complains
- Andrei Alexandrescu (33/51) Sep 14 2014 I've got to give it to you - it's rare to get a review on a design that
- Rikki Cattermole (56/60) Sep 14 2014 A few ideas:
- Andrei Alexandrescu (9/10) Sep 14 2014 The idea is we want to write things like:
- Jakob Ovrum (6/9) Sep 14 2014 The following works fine:
- Andrei Alexandrescu (2/11) Sep 15 2014 Yah, sorry for the confusion. -- Andrei
- Rikki Cattermole (4/15) Sep 14 2014 Yeah I thought so.
- Jakob Ovrum (16/19) Sep 15 2014 It should support appending single code units:
- John Colvin (4/48) Sep 15 2014 Why not open this up to all slices of immutable value type
- Andrei Alexandrescu (3/4) Sep 15 2014 That will come in good time. For now I didn't want to worry about
- monarch_dodra (30/35) Sep 15 2014 ***Blocker thoughts***
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (7/32) Sep 15 2014 Another perfect use case for borrowing...
- monarch_dodra (4/11) Sep 15 2014 Right, but RCString already has the RA primitives (and
- Andrei Alexandrescu (9/34) Sep 15 2014 I think a @system unsafeSlice() property would be needed indeed.
- Kagamin (5/6) Sep 15 2014 Then range primitives should move to std.range or where they are
- Wyatt (7/14) Sep 15 2014 I certainly would. If I wanted a GC string from an RCString, I'd
- Jacob Carlborg (4/22) Sep 15 2014 Yes, most likely. How about "gcDup" or something like that.
- Robert burner Schadek (13/18) Sep 15 2014 I haven't found a single lock, is single threading by design or
- Jakob Ovrum (3/15) Sep 15 2014 There's no use of `shared`, so all data involved is TLS.
- Robert burner Schadek (2/7) Sep 15 2014 Then it must be made sure that send and receive work properly.
- Jakob Ovrum (4/12) Sep 15 2014 They do. They only accept shared or immutable arguments (or
- Robert burner Schadek (4/9) Sep 15 2014 compiler says no: concurrency.d(554): Error: static assert
- Jakob Ovrum (29/39) Sep 15 2014 Yes, that was my point. std.concurrency handles it correctly -
- Robert burner Schadek (6/17) Sep 15 2014 Yes, you must be able to get a RCString from one thread to the
- Andrei Alexandrescu (3/10) Sep 15 2014 I think shared(RCString) should be supported. Unique!T is, of course,
- Sean Kelly (5/15) Sep 15 2014 Probably because RCString is only logically immutable--it
- Andrei Alexandrescu (8/27) Sep 15 2014 Good idea.
- Rainer Schuetze (9/12) Sep 15 2014 Please also consider usage with const and immutable:
- Andrei Alexandrescu (8/22) Sep 15 2014 Hmmm, good point. That's a bug. Immutable postblit and dtors should use
- Rainer Schuetze (12/37) Sep 15 2014 Hmm, seems fine when I try it. It feels like a bug in the type system,
- Andrei Alexandrescu (8/39) Sep 15 2014 The conversion relies on pure constructors. As I noted in the opening
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (3/6) Sep 15 2014 http://www.gotw.ca/gotw/045.htm
- po (2/6) Sep 15 2014 I don't see how that link answers Andrei's question? He just
- Ola Fosheim Gr (5/12) Sep 15 2014 As I understand the issue it works if you make sure to transfer
- po (7/11) Sep 15 2014 Ah, I think I follow.
- Ola Fosheim Gr (10/17) Sep 15 2014 I think you need to either have multiple shared_ptr objects
- Andrei Alexandrescu (8/22) Sep 15 2014 No, and it needn't. The article is not that good. In C++, if a thread
- Rainer Schuetze (7/32) Sep 15 2014 Huuh? So you must not read a reference to a ref-counted object that
- Andrei Alexandrescu (8/43) Sep 16 2014 I meant: by the time the smart pointer got to the thread, its reference
- Rainer Schuetze (19/65) Sep 16 2014 Here is a link with a discussion, links and code:
- Kagamin (6/8) Sep 16 2014 A slice is two words, concurrently reading and writing them is
- po (2/9) Sep 16 2014 Alright sounds sensible enough, it does seem like you would have
- Rainer Schuetze (5/18) Sep 15 2014 This describes the scenario I meant in the ARC discussions.
- Ola Fosheim Grostad (3/6) Sep 15 2014 Modern x86 has 128 bit CAS instruction too: lock cmpxchg16b
- Sean Kelly (3/9) Sep 16 2014 ... which I really need to add to core.atomic.
- Andrei Alexandrescu (2/10) Sep 16 2014 Yes please. -- Andrei
- Rainer Schuetze (20/28) Sep 15 2014 Here's an example:
- Andrei Alexandrescu (18/47) Sep 16 2014 Not sure whether that's a bug or feature :o). In fact I'm not even
- Rainer Schuetze (9/63) Sep 21 2014 There is already bug report for this:
- Timon Gehr (13/19) Sep 21 2014 This change is unsound.
- Sean Kelly (14/23) Sep 15 2014 To be fair, you still have to be a bit careful here or things
- Andrei Alexandrescu (5/31) Sep 15 2014 That's news to me. Perhaps it's weak pointer management they need to
- Rainer Schuetze (4/8) Sep 15 2014 Do you have any benchmarks to share? Last time I measured, the GC is
- bearophile (9/12) Sep 15 2014 An alternative design solution is to follow the Java way, leave
- Andrei Alexandrescu (5/14) Sep 15 2014 Again, it's become obvious that a category of users will simply refuse
- bearophile (18/21) Sep 15 2014 Is adding reference counted strings to D going to add a
- Andrei Alexandrescu (8/26) Sep 15 2014 Increasing the standard library with good artifacts is important. So is
- Manu via Digitalmars-d (6/25) Sep 22 2014 I still think most of those users would accept RC instead of GC. Why not
- Andrei Alexandrescu (4/8) Sep 22 2014 For class objects that's what's going to happen indeed.
- Manu via Digitalmars-d (7/15) Sep 22 2014 How so? In what instances are complicated templates superior to a langua...
- Andrei Alexandrescu (7/23) Sep 22 2014 It just works out that way. I don't know exactly why. In fact I have an
- Manu via Digitalmars-d (32/65) Sep 22 2014 The trouble with library types like RefCounted!, is that they appear to ...
- Andrei Alexandrescu (11/34) Sep 23 2014 That's at most a syntactic issue but not a conceptual one. We have
- deadalnix (4/11) Sep 22 2014 I think a library solution + intrinsic for increment/decrement
- Manu via Digitalmars-d (10/19) Sep 22 2014 Right, that's pretty much how I imagined it too. Like ranges, where fore...
- Dmitry Olshansky (9/30) Sep 23 2014 In my imagination it would be along the lines of
- Manu via Digitalmars-d (9/44) Sep 23 2014 Problem with this is you can't make a refcounted int[] without
- Dmitry Olshansky (10/52) Sep 24 2014 Is that a problem at all? Why should int[] some how become ref-counted.
- Andrei Alexandrescu (3/6) Sep 23 2014 So that would be a pointer type or a value type? Is there copy on write
- Dmitry Olshansky (5/11) Sep 24 2014 It would be an intrusively counted type with pointer somewhere in the
- Andrei Alexandrescu (9/19) Sep 24 2014 Then that would be confusing seeing as structs are value types. What
- Manu via Digitalmars-d (20/40) Sep 24 2014 I think the way I imagine refcounting is the opposite of what you're
- Andrei Alexandrescu (11/21) Sep 24 2014 Whatever syntax I like? Awesome! How about:
- Manu via Digitalmars-d (32/54) Sep 24 2014 No they don't. It's not a ref counting mechanism, the compiler can't
- Andrei Alexandrescu (20/82) Sep 24 2014 D's copy semantics are different from C++'s.
- Andrei Alexandrescu (4/6) Sep 24 2014 s/bullshitting/fumbling/
- Manu via Digitalmars-d (80/172) Sep 24 2014 Elaborate? I could be doing anything to the effect of refcounting, and
- Andrei Alexandrescu (49/49) Sep 24 2014 On 9/24/14, 7:16 PM, Manu via Digitalmars-d wrote:
- Manu via Digitalmars-d (3/51) Sep 24 2014 I'm afk (on a phone) for 2 days, but I'll get back to this.
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (10/30) Sep 25 2014 For one, it allows to turn copy+destruction into a move, e.g. for
- Walter Bright (9/16) Sep 24 2014 I think Microsoft's C++/CLI tried that mixed pointer approach, and it wa...
- Manu via Digitalmars-d (20/33) Sep 25 2014 a disaster. I don't have personal knowledge of this.
- Walter Bright (2/2) Sep 25 2014 Your double posting is baaaaack!
- Walter Bright (7/11) Sep 25 2014 Consider that people complain a lot about annotations. See the other thr...
- bearophile (29/34) Sep 25 2014 It's much better to try to quantify how much this "a lot" means,
- Marco Leise (26/39) Sep 28 2014 t was a
- bearophile (7/13) Sep 28 2014 I consider the tracking of memory ownership more important for D
- Paulo Pinto (12/43) Sep 28 2014 Depends on how you look at it.
- Paulo Pinto (21/41) Sep 25 2014 Why it was a disaster? Microsoft is still using it.
- Dmitry Olshansky (10/30) Sep 26 2014 Read that as
- Andrei Alexandrescu (13/36) Sep 26 2014 Consider:
- Foo (4/16) Sep 27 2014 a.x == 42
- Dmitry Olshansky (19/40) Sep 27 2014 There is no implicit ref-count. opInc may just as well create a file on
- Andrei Alexandrescu (2/5) Sep 27 2014 I literally have no idea what you are discussing here. -- Andrei
- Dmitry Olshansky (10/15) Sep 27 2014 That proposed scheme is completely abstract as to what exactly adding X
- Andrei Alexandrescu (5/20) Sep 27 2014 What is "that proposed scheme?" I feel like I'm watching a movie
- Andrei Alexandrescu (2/23) Sep 27 2014 So then when does the counter get ever incremented? -- Andrei
- Foo (9/33) Sep 27 2014 increment: by postblit call
- Dmitry Olshansky (56/94) Sep 27 2014 Okay it serves no good for me to make these tiny comments while on the g...
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (43/105) Sep 27 2014 This cannot be stressed enough.
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (3/11) Sep 27 2014 Yepp, it's possible, it turned out to work quite naturally:
- Dmitry Olshansky (16/119) Sep 27 2014 You must be missing something big. Ref-counting ain't singular thing,
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (38/115) Sep 28 2014 Ok, you're right about the different possible implementations.
- Andrei Alexandrescu (7/11) Sep 27 2014 I still don't understand what "this feature" is after reading your long
- Dmitry Olshansky (6/20) Sep 27 2014 Compiler is aware that opInc and opDec are indeed ref-countinng ops,
- Andrei Alexandrescu (5/26) Sep 27 2014 You give marginal details but still don't describe the thing. When are
- Dmitry Olshansky (81/107) Sep 28 2014 The key point is that the change is small, that's why (maybe) it's hard
- Andrei Alexandrescu (10/36) Sep 28 2014 I think I get it now. It has a huge overlap with stuff that we already
- Dmitry Olshansky (6/8) Sep 28 2014 Fair enough. Personally I suspect compilers would have to go a long way
- Andrei Alexandrescu (2/23) Sep 23 2014 That won't work. Sorry, it has too many holes to enumerate! -- Andrei
- Walter Bright (3/5) Sep 24 2014 Intrinsics are unnecessary. The compiler is perfectly capable of recogni...
- Ola Fosheim Grostad (5/16) Sep 25 2014 Yes, inc/dec intrinsic is needed to support TSX. I.e. You dont
- Jakob Ovrum (11/12) Sep 15 2014 The test on line 267 fails on a 32-bit build:
- Marco Leise (5/13) Sep 15 2014 https://issues.dlang.org/show_bug.cgi?id=5063 >.<
- Sean Kelly (3/8) Sep 15 2014 So slicing an RCString doesn't increment its refcount?
- Andrei Alexandrescu (2/11) Sep 15 2014 It does. -- Andrei
- Sean Kelly (3/19) Sep 15 2014 Oops, I was looking at the opSlice for Large, not RCString.
- Dicebot (6/20) Sep 17 2014 Ironically, strings have been probably least of my GC-related
- Andrei Alexandrescu (6/10) Sep 17 2014 Simplest is "I want to use D without a GC and suddenly the string
- Piotrek (25/38) Sep 17 2014 I think the biggest gc=(partially?)off customers are game makers:
- Dicebot (16/28) Sep 19 2014 Well this is exactly what I don't understand. Strings we have
- Andrei Alexandrescu (10/36) Sep 19 2014 It does affect management, i.e. you don't know when to free the buffer
- Dicebot (19/41) Sep 20 2014 I see where you are going at. A bit hard to imagine how it fits
- Andrei Alexandrescu (4/17) Sep 20 2014 I understand. RC strings will work just fine. Compared to interlocked
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (5/9) Sep 20 2014 Can someone explain why?
- Dmitry Olshansky (7/14) Sep 21 2014 Not spontaneously :)
- Ola Fosheim Grostad (6/10) Sep 21 2014 That will have to change if Go is a target. To get full load you
- Dicebot (14/24) Sep 21 2014 It doesn't ring a bell to me. For several reasons:
- Dmitry Olshansky (12/21) Sep 21 2014 Go is not a target. The fixed concurrency model the have is not the
- Ola Fosheim Grostad (5/11) Sep 21 2014 Caches are not a big deal when you wait for io.
- Paulo Pinto (7/14) Sep 21 2014 Since when Go is a competitor in the webspace?
- Ola Fosheim Grostad (2/3) Sep 21 2014 Since people who create high throughput servers started using it?
- Paulo Pinto (6/9) Sep 21 2014 Which people? A few Silicon Valley startups, besides Google?
- Ola Fosheim Grostad (4/14) Sep 21 2014 I am not keeping track, but e.g.
- Googler Lurker (3/12) Sep 21 2014 Go fizzled inside google but granted has traction outside of
- Ola Fosheim Grostad (10/12) Sep 22 2014 Don't be such a coward, show your face and publish you real name.
- Dmitry Olshansky (25/31) Sep 22 2014 This statement doesn't make any sense taken in isolation. It lacks way
- Ola Fosheim Grostad (14/44) Sep 25 2014 If you porocess and compress a large dataset in one fiber you
- Ola Fosheim Grostad (2/2) Sep 25 2014 Analysis of Go growth / usage.
- Dmitry Olshansky (4/6) Sep 27 2014 Google was popular last time I heard, so does their language.
- Dmitry Olshansky (25/70) Sep 27 2014 So do not. Large dataset is not something a single thread should do
- Sean Kelly (9/14) Sep 23 2014 I don't understand what you're getting at. Nothing in D locks
- Dmitry Olshansky (9/20) Sep 22 2014 E-hm Go is hardly the top dog in the web space. Java and JVM crowd like
- Andrei Alexandrescu (6/25) Sep 22 2014 I agree. It does have legs however. We should learn a few things from
- Dmitry Olshansky (12/39) Sep 22 2014 Well in short term that would mean..
- Wyatt (5/10) Sep 23 2014 Go also shows the viability of a fixup tool for minor automated
- Andrei Alexandrescu (3/14) Sep 23 2014 Yah, we definitely should have one of our mythical lieutenants on that.
- Wyatt (5/7) Sep 23 2014 I distinctly remember someone offering to write one and being
- Andrei Alexandrescu (2/8) Sep 23 2014 The offer was in the context of a feature that was being rejected. -- An...
- Meta (7/9) Sep 23 2014 Walter *has* said before that he's uncomfortable with tools that
- Andrei Alexandrescu (3/11) Sep 23 2014 I've been at a conference where Pike spoke about gofix. He convinced me
- ketmar via Digitalmars-d (10/12) Sep 23 2014 i can't understand this, though. any big project using SCM nowdays, and
- deadalnix (3/13) Sep 23 2014 We are in 2014, and we have good source control. That will be
- Kagamin (5/9) Sep 21 2014 Only isolated cluster can safely migrate between threads. D has
- Ola Fosheim Grostad (9/12) Sep 21 2014 This can easily be borked if built in RC does not provide
- Kagamin (6/12) Sep 22 2014 Isolated data is single-threaded w.r.t. concurrent access. What
- "Nordlöw" (11/12) Sep 20 2014 I'm testing your RCstring right now in my code to see how much
- "Nordlöw" (20/21) Sep 20 2014 I'm guessing
- Andrei Alexandrescu (4/22) Sep 20 2014 No.
- "Nordlöw" (15/21) Sep 20 2014 Good idea :)
- Andrei Alexandrescu (7/17) Sep 20 2014 Ballpark would be probably 1.1-2.5x. But there's of course a bunch of
- Andrei Alexandrescu (3/12) Sep 20 2014 Oh in fact this.small.hashOf is incorrect anyway because it hashes
- Andrei Alexandrescu (4/15) Sep 20 2014 Yah, that's the one.
- "Nordlöw" (9/10) Sep 20 2014 Calling writeln(rcstring) in a module other than rcstring.d
- "Nordlöw" (8/9) Sep 22 2014 You implementation seems to hold water at least in my tests and
- Andrei Alexandrescu (3/8) Sep 22 2014 Awesome, thanks for doing this. How did you measure and what results did...
- "Nordlöw" (12/24) Sep 24 2014 I just checked that I didn't get any segfaults :)
- Andrei Alexandrescu (3/21) Sep 24 2014 Thanks for this work! -- Andrei
- "Nordlöw" (3/4) Sep 24 2014 So the pro must be (de)allocation speed then, I suppose?
- Andrei Alexandrescu (10/13) Sep 24 2014 The pro is twofold:
- "Nordlöw" (3/10) Sep 24 2014 Ok, thanks.
- "Nordlöw" (7/8) Sep 24 2014 BTW: If I want to construct my network once and destroy it all in
- "Nordlöw" (53/56) Sep 24 2014 Further,
Walter, Brad, myself, and a couple of others have had a couple of quite exciting ideas regarding code that is configurable to use the GC or alternate resource management strategies. One thing that became obvious to us is we need to have a reference counted string in the standard library. That would be usable with applications that want to benefit from comfortable string manipulation whilst using classic reference counting for memory management.

I'll get into more details into the mechanisms that would allow the stdlib to provide functionality for both GC strings and RC strings; for now let's say that we hope and aim for swapping between these with ease. We hope that at one point people would be able to change one line of code, rebuild, and get either GC or RC automatically (for Phobos and their own code).

The road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of an almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice:

http://dpaste.dzfl.pl/817283c163f5

For now RCString supports only immutable char as element type. That means you can't modify individual characters in an RCString object but you can take slices, append to it, etc. - just as you can with string. A compact reference counting scheme is complemented with a small buffer optimization, so performance should be fairly decent.

Somewhat surprisingly, pure constructors and inout took good care of qualified semantics (you can convert a mutable to an immutable string and back safely). I'm not sure whether semantics there are a bit too lax, but at least for RCString they turned out to work beautifully and without too much fuss.

The one wrinkle is that you need to wrap string literals "abc" with explicit constructor calls, e.g. RCString("abc"). This puts RCString on a lower footing than built-in strings and makes swapping configurations a tad more difficult.

Currently I've customized RCString with the allocation policy, which I hurriedly reduced to just one function with the semantics of realloc. That will probably change in a future pass; the point for now is that allocation is somewhat modularized away from the string workings.

So, please fire away. I'd appreciate it if you used RCString in lieu of string and note the differences. The closer we get to parity in semantics, the better.

Thanks,

Andrei
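For a feel of the general shape being discussed — a refcounted heap buffer plus a small-buffer optimization behind a struct with postblit and destructor — here is a minimal sketch. It is not the draft's actual code; the member names, the 16-byte small buffer, and the free-on-zero policy are all illustrative assumptions.

---
import core.stdc.stdlib : free;

struct MiniRCString
{
    static struct Buf
    {
        size_t refs;
        size_t length;
        // character payload follows this header in the same allocation
    }

    union
    {
        Buf* large;             // heap representation, reference counted
        char[16] small;         // small-buffer optimization, no allocation
    }
    ubyte tag = 0;              // 0..16 => length of the small string; 255 => large

    this(this)                  // postblit: copying a large string bumps the count
    {
        if (tag == 255 && large !is null)
            ++large.refs;
    }

    ~this()                     // the last owner frees the buffer deterministically
    {
        if (tag == 255 && large !is null && --large.refs == 0)
            free(large);
    }
}

void main()
{
    MiniRCString s;             // starts out small and empty
    auto t = s;                 // copying a small string touches no count at all
    assert(t.tag == 0);
}
---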
Sep 14 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
> The road there is long, but it starts with the proverbial first step.

An unrelated question, but how will reference counting work with classes? I've recently switched my networking library's raw memory wrapper to reference counting (previously it just relied on the GC for cleanup), and it was going well until I've hit a brick wall.

The thing with reference counting is it doesn't seem to make sense to do it half-way. Everything across the ownership chain must be reference counted, because otherwise the non-ref-counted link will hold on to its ref-counted child objects forever (until the next GC cycle). In my case, the classes in my applications were holding on to my reference-counted structs, and wouldn't let go until they were eventually garbage-collected.

I can't convert the classes to structs because I need their inheritance/polymorphism, and I don't see an obvious way to refcount classes (RefCounted explicitly does not support classes). Am I overlooking something?
Sep 14 2014
On 9/14/14, 7:52 PM, Vladimir Panteleev wrote:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:At least for the time being, bona fide class objects with refcounted members will hold on to them until they're manually freed or a GC cycle comes about. We're thinking of a number of schemes for reference counted objects, and we think a bottom-up approach to design would work well here: try a simple design and assess its limitations. In this case, it would be great if you tried to use RefCounted with your class objects and figure out what its limitations are. Thanks, AndreiThe road there is long, but it starts with the proverbial first step.An unrelated question, but how will reference counting work with classes? I've recently switched my networking library's raw memory wrapper to reference counting (previously it just relied on the GC for cleanup), and it was going well until I've hit a brick wall. The thing with reference counting is it doesn't seem to make sense to do it half-way. Everything across the ownership chain must be reference counted, because otherwise the non-ref-counted link will hold on to its ref-counted child objects forever (until the next GC cycle). In my case, the classes in my applications were holding on to my reference-counted structs, and wouldn't let go until they were eventually garbage-collected. I can't convert the classes to structs because I need their inheritance/polymorphism, and I don't see an obvious way to refcount classes (RefCounted explicitly does not support classes). Am I overlooking something?
Sep 14 2014
On Monday, 15 September 2014 at 03:52:34 UTC, Andrei Alexandrescu wrote:We're thinking of a number of schemes for reference counted objects, and we think a bottom-up approach to design would work well here: try a simple design and assess its limitations. In this case, it would be great if you tried to use RefCounted with your class objects and figure out what its limitations are.RefCounted currently does not work at all with class objects. This is explicitly indicated in RefCounted's template constraint. Are you saying we should try to make RefCounted work with classes or something else?
Sep 15 2014
On 9/15/14, 11:30 AM, Vladimir Panteleev wrote:On Monday, 15 September 2014 at 03:52:34 UTC, Andrei Alexandrescu wrote:Yes, we should define RefCounted for classes as well. (Sorry, I was confused.) Extending to class types should be immediate at least in the first approximation. Then we can stand back and take a look at the advantages and liabilities. Could someone please initiate that work? Thanks, AndreiWe're thinking of a number of schemes for reference counted objects, and we think a bottom-up approach to design would work well here: try a simple design and assess its limitations. In this case, it would be great if you tried to use RefCounted with your class objects and figure out what its limitations are.RefCounted currently does not work at all with class objects. This is explicitly indicated in RefCounted's template constraint. Are you saying we should try to make RefCounted work with classes or something else?
Sep 15 2014
I don't want to be the smart ass that did nothing and complains about what other did, but I'll be it anyway. It doesn't look very scalable to me to implement various versions of modules with various memory management schemes. Inevitably, these will have different subtle variation in semantic, different set of bugs, it is twice as many work to maintain and so on. Have you tried to explore solution where an allocator is passed to functions (as far as I can tell, this wasn't very successful in C++, but D greater metaprogramming capabilities may offer better solutions than C++'s) ? Another option is to use output ranges. This look like an area that is way underused in D. It looks like it is possible for the allocation policy to be part of the output range, and so we can let users decide without duplication bunch of code. Finally, concepts like isolated allow the compiler to insert free in the generated code in a safe manner. In the same way, it is possible to remove a bunch of GC allocation by sticking some passes in the middle of the optimizer (
Sep 14 2014
On 9/14/14, 9:50 PM, deadalnix wrote:
> I don't want to be the smart ass that did nothing and complains about
> what other did, but I'll be it anyway.
>
> It doesn't look very scalable to me to implement various versions of
> modules with various memory management schemes. Inevitably, these will
> have different subtle variation in semantic, different set of bugs, it
> is twice as many work to maintain and so on.

I've got to give it to you - it's rare to get a review on a design that hasn't been described yet :o). There is no code duplication.

> Have you tried to explore solution where an allocator is passed to
> functions (as far as I can tell, this wasn't very successful in C++,
> but D greater metaprogramming capabilities may offer better solutions
> than C++'s) ? Another option is to use output ranges. This look like an
> area that is way underused in D. It looks like it is possible for the
> allocation policy to be part of the output range, and so we can let
> users decide without duplication bunch of code.

I've been thinking for a long time about these:

1. Output ranges;
2. Allocator objects;
3. Reference counting and encapsulations thereof.

Each has a certain attractiveness, particularly when thought of in the context of stdlib which tends to use limited, confined allocation patterns. Took me a while to figure there's some red herring tracking there. I probably half convinced Walter too. The issue is these techniques seem to overlap a lot, but in fact the overlap is rather thin.

In fact, output ranges are rather limited: they only fit the bill when (a) only output needs to be allocated, and (b) output is produced linearly. Outside these applications, there's simply no use.

As soon as thought enters more complex applications, the lure of allocators becomes audible. Pass an allocator into the algorithm, they say, and you've successfully pushed up memory allocation policy from the algorithm into the client. The reality is allocators are low-level, unstructured devices that allocate memory but are not apt at managing it beyond blindly responding to client calls "allocate this much memory, now take it back". The many subtleties associated with actual _management_ of memory via reference counting (evidence: http://dpaste.dzfl.pl/817283c163f5) are completely lost on allocators.

I am convinced that we need to improve the lot of people who want to use the stdlib without a garbage collector, or with minimal use of it (more on that later). To do so, it is obvious we need good alternative abstractions, and reference counting is an obvious contender.

> Finally, concepts like isolated allow the compiler to insert free in
> the generated code in a safe manner. In the same way, it is possible to
> remove a bunch of GC allocation by sticking some passes in the middle
> of the optimizer ()

Andrei
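To make the output-range point concrete, here is a small example of the pattern in question; `describe` and the other names are made up for illustration, not taken from Phobos or the draft:

---
import std.array : appender;
import std.format : formattedWrite;

// Output is produced linearly into a caller-supplied sink, so the caller
// decides where the memory comes from (GC array, preallocated buffer, file, ...).
void describe(Sink)(ref Sink sink, int value)
{
    formattedWrite(sink, "value = %s", value);
}

void main()
{
    auto buf = appender!string();   // one possible sink: a GC-backed appender
    describe(buf, 42);
    assert(buf.data == "value = 42");
}
---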
Sep 14 2014
> The one wrinkle is that you need to wrap string literals "abc" with
> explicit constructor calls, e.g. RCString("abc"). This puts RCString on
> a lower footing than built-in strings and makes swapping configurations
> a tad more difficult.

A few ideas:

import std.traits : isSomeString;

auto refCounted(T)(T value) if (isSomeString!T) {
    static if (is(T == string))
        return new RCXString!(immutable char)(value);
    //....
    static assert(0);
}

static assert("abc".refCounted == "abc");

Wrapper type scenario. May look nicer. Other which would require a language change of:

struct MyType {
    string value;
    alias value this;

    this(string value) {
        this.value = value;
    }
}

static assert("abc".MyType == "abc");

*shudder* does remind me a little too much of the Jade programming language with its casts like that.

There is one other thing which I don't think people are taking too seriously is my idea of using with statement to swap out e.g. the GC during runtime.

with(myAllocator) {
    Foo foo = new Foo; // calls the allocator to assign memory for new instance of Foo
} // tell allocator to free foo

with(myAllocator) {
    Foo foo = new Foo; // calls the allocator to assign memory for new instance of Foo
    myFunc(foo);
}
// if myFunc modifies foo or if myFunc passes foo to another function then:
//     tell GC it has to free it when able to
// otherwise:
//     tell allocator to free foo

class MyAllocator : Allocator {
    void opWithIn(string file = __MODULE__, int line = __LINE__, string function = ?) {
        GC.pushAllocator(this);
    }

    void opWithOut(string file = __MODULE__, int line = __LINE__, string function = ?) {
        GC.popAllocator();
    }
}

By using the with statement this is possible:

void something() {
    with(new RCAllocator) {
        string value = "Hello World!"; // allocates memory via RCAllocator
    } // frees here
}
Sep 14 2014
On 9/14/14, 9:51 PM, Rikki Cattermole wrote:static assert("abc".refCounted == "abc");The idea is we want to write things like: String s = "abc"; and have it be either refcounted or "classic" depending on the definition of String. With a user-defined String, you need: String s = String("abc"); or auto s = String("abc"); Andrei
Sep 14 2014
On Monday, 15 September 2014 at 05:50:36 UTC, Andrei Alexandrescu wrote:and have it be either refcounted or "classic" depending on the definition of String. With a user-defined String, you need: String s = String("abc");The following works fine: RCString s = "abc"; It will call RCString.this with "abc". The problem is passing string literals or slices to functions that receive RCString.
Sep 14 2014
On 9/14/14, 10:55 PM, Jakob Ovrum wrote:On Monday, 15 September 2014 at 05:50:36 UTC, Andrei Alexandrescu wrote:Yah, sorry for the confusion. -- Andreiand have it be either refcounted or "classic" depending on the definition of String. With a user-defined String, you need: String s = String("abc");The following works fine: RCString s = "abc"; It will call RCString.this with "abc". The problem is passing string literals or slices to functions that receive RCString.
Sep 15 2014
On 15/09/2014 5:51 p.m., Andrei Alexandrescu wrote:On 9/14/14, 9:51 PM, Rikki Cattermole wrote:Yeah I thought so. Still I think the whole with statement would be a better direction to go, but what ever. I'll drop it.static assert("abc".refCounted == "abc");The idea is we want to write things like: String s = "abc"; and have it be either refcounted or "classic" depending on the definition of String. With a user-defined String, you need: String s = String("abc"); or auto s = String("abc"); Andrei
Sep 14 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
> So, please fire away. I'd appreciate it if you used RCString in lieu of
> string and note the differences. The closer we get to parity in
> semantics, the better.

It should support appending single code units:

---
import std.stdio : writeln;

alias String = RCString;

void main()
{
    String s = "abc";
    s ~= cast(char)'0';
    s ~= cast(wchar)'0';
    s ~= cast(dchar)'0';
    writeln(s); // abc000
}
---

Works with C[], fails with RCString. The same is true for concatenation.
Sep 15 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
> [...]

Why not open this up to all slices of immutable value type elements?
Sep 15 2014
On 9/15/14, 1:51 AM, John Colvin wrote:Why not open this up to all slices of immutable value type elements?That will come in good time. For now I didn't want to worry about indirections, constructors, etc. -- Andrei
Sep 15 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
> So, please fire away. I'd appreciate it if you used RCString in lieu of
> string and note the differences. The closer we get to parity in
> semantics, the better.
>
> Thanks, Andrei

***Blocker thoughts*** (unless I'm misunderstood)

- Does not provide Forward range iteration that I can find. This makes it unusable for algorithms: find (myRCString, "hello"); //Nope Also, adding "save" to make it forward might not be a good idea, since it would also mean it becomes an RA range (which it isn't).

- Does not provide any way to (even "unsafely") extract a raw array. Makes it difficult to interface with existing functions. It would also be important for "RCString aware" functions to be properly optimized (eg memchr for searching etc...)

- No way to "GC-dup" the RCString. giving "dup"/"idup" members on RCstring, for when you really just need to revert to pure un-collected GC.

Did I miss something? It seems actually *doing* something with an RCString is really difficult.

***Random implementation thought:***

"size_t maxSmall = 23" is (IMO) gratuitous: It can only lead to non-optimization and binary bloat. We'd end up having incompatible RCStrings, which is bad. At the very least, I'd say make it a parameter *after* the "realloc" function (as arguably, maxSmall depends on the allocation scheme, and not the other way around). In particular, it seems RCBuffer does not depend on maxSmall, so it might be possible to move that out of RCXString.

***Extra thoughts***

There have been requests for non auto-decoding strings. Maybe this would be a good opportunity for "RCXUString" ?
Sep 15 2014
On Monday, 15 September 2014 at 09:50:30 UTC, monarch_dodra wrote:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:No, RA is not implied by forward.So, please fire away. I'd appreciate it if you used RCString in lieu of string and note the differences. The closer we get to parity in semantics, the better. Thanks, Andrei***Blocker thoughts*** (unless I'm misunderstood) - Does not provide Forward range iteration that I can find. This makes it unuseable for algorithms: find (myRCString, "hello"); //Nope Also, adding "save" to make it forward might not be a good idea, since it would also mean it becomes an RA range (which it isn't).- Does not provide any way to (even "unsafely") extract a raw array. Makes it difficult to interface with existing functions. It would also be important for "RCString aware" functions to be properly optimized (eg memchr for searching etc...)Another perfect use case for borrowing...***Extra thoughts*** There have been requests for non auto-decoding strings. Maybe this would be a good opportunity for "RCXUString" ?Yes. I'm surprised by this proposal, because I thought Walter was totally opposed to a dedicated string type. If it now becomes acceptable, it's a good opportunity for moving away for auto-decoding.
Sep 15 2014
On Monday, 15 September 2014 at 13:15:28 UTC, Marc Schütz wrote:Right, but RCString already has the RA primitives (and hasLength), it's only missing ForwardRange traits to *also* become RandomAccess.- Does not provide Forward range iteration that I can find. This makes it unuseable for algorithms: find (myRCString, "hello"); //Nope Also, adding "save" to make it forward might not be a good idea, since it would also mean it becomes an RA range (which it isn't).No, RA is not implied by forward.
Sep 15 2014
On 9/15/14, 2:50 AM, monarch_dodra wrote:
> - Does not provide Forward range iteration that I can find. This makes
> it unusable for algorithms: find (myRCString, "hello"); //Nope
> Also, adding "save" to make it forward might not be a good idea, since
> it would also mean it becomes an RA range (which it isn't).

If we move forward with this type, traits will recognize it as isSomeString.

> - Does not provide any way to (even "unsafely") extract a raw array.
> Makes it difficult to interface with existing functions. It would also
> be important for "RCString aware" functions to be properly optimized
> (eg memchr for searching etc...)

I think a @system unsafeSlice() property would be needed indeed.

> - No way to "GC-dup" the RCString. giving "dup"/"idup" members on
> RCstring, for when you really just need to revert to pure un-collected GC.

Nice. But then I'm thinking, wouldn't people think .dup produces another RCString?

> Did I miss something? It seems actually *doing* something with an
> RCString is really difficult.

Yah it's too tightly wound right now, but that's the right way!

> ***Random implementation thought:***
>
> "size_t maxSmall = 23" is (IMO) gratuitous: It can only lead to
> non-optimization and binary bloat. We'd end up having incompatible
> RCStrings, which is bad. At the very least, I'd say make it a parameter
> *after* the "realloc" function (as arguably, maxSmall depends on the
> allocation scheme, and not the other way around).

I think realloc will disappear.

> In particular, it seems RCBuffer does not depend on maxSmall, so it
> might be possible to move that out of RCXString.
>
> ***Extra thoughts***
>
> There have been requests for non auto-decoding strings. Maybe this
> would be a good opportunity for "RCXUString" ?

For now I was aiming at copying string's semantics.

Andrei
Sep 15 2014
On Monday, 15 September 2014 at 14:44:53 UTC, Andrei Alexandrescu wrote:For now I was aiming at copying string's semantics.Then range primitives should move to std.range or where they are now. By default string iterates over its array elements, which is char in this case.
Sep 15 2014
On Monday, 15 September 2014 at 14:44:53 UTC, Andrei Alexandrescu wrote:On 9/15/14, 2:50 AM, monarch_dodra wrote:I certainly would. If I wanted a GC string from an RCString, I'd probably reach for std.conv for clarity's sake. e.g. RCString foo = "banana!"; string bar = to!string(foo); -Wyatt- No way to "GC-dup" the RCString. giving "dup"/"idup" members on RCstring, for when you really just need to revert to pure un-collected GC.Nice. But then I'm thinking, wouldn't people think .dup produces another RCString?
Sep 15 2014
On 2014-09-15 16:45, Andrei Alexandrescu wrote:On 9/15/14, 2:50 AM, monarch_dodra wrote:Yes, most likely. How about "gcDup" or something like that. -- /Jacob Carlborg- Does not provide Forward range iteration that I can find. This makes it unuseable for algorithms: find (myRCString, "hello"); //Nope Also, adding "save" to make it forward might not be a good idea, since it would also mean it becomes an RA range (which it isn't).If we move forward with this type, traits will recognize it as isSomeString.- Does not provide any way to (even "unsafely") extract a raw array. Makes it difficult to interface with existing functions. It would also be important for "RCString aware" functions to be properly optimized (eg memchr for searching etc...)I think a system unsafeSlice() property would be needed indeed.- No way to "GC-dup" the RCString. giving "dup"/"idup" members on RCstring, for when you really just need to revert to pure un-collected GC.Nice. But then I'm thinking, wouldn't people think .dup produces another RCString?
Sep 15 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:The road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of a almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice: http://dpaste.dzfl.pl/817283c163f5I haven't found a single lock, is single threading by design or is thread-safety on your todo? Could you transfer this into phobos and make it work with the functions in std.string, it would be a shame if they wouldn't work out of the box when this gets merged. I haven't seen anything that should prevent using the functions of std.string except isSomeString but that should be no problem to fix. This is sort of personal to me as most of my PR are in std.string and I sort of aspire to become the LT for std.string ;-) I would assume RCString should be faster than string, so could you provide a benchmark of the two.
Sep 15 2014
On Monday, 15 September 2014 at 09:53:28 UTC, Robert burner Schadek wrote:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:There's no use of `shared`, so all data involved is TLS.The road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of a almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice: http://dpaste.dzfl.pl/817283c163f5I haven't found a single lock, is single threading by design or is thread-safety on your todo?
Sep 15 2014
On Monday, 15 September 2014 at 10:13:28 UTC, Jakob Ovrum wrote:On Monday, 15 September 2014 at 09:53:28 UTC, Robert burnerThen it must be made sure that send and receive work properly.I haven't found a single lock, is single threading by design or is thread-safety on your todo?There's no use of `shared`, so all data involved is TLS.
Sep 15 2014
On Monday, 15 September 2014 at 11:53:15 UTC, Robert burner Schadek wrote:On Monday, 15 September 2014 at 10:13:28 UTC, Jakob Ovrum wrote:They do. They only accept shared or immutable arguments (or arguments with no mutable indirection).On Monday, 15 September 2014 at 09:53:28 UTC, Robert burnerThen it must be made sure that send and receive work properly.I haven't found a single lock, is single threading by design or is thread-safety on your todo?There's no use of `shared`, so all data involved is TLS.
Sep 15 2014
On Monday, 15 September 2014 at 12:11:14 UTC, Jakob Ovrum wrote:compiler says no: concurrency.d(554): Error: static assert "Aliases to mutable thread-local data not allowed." I used the std.concurrency exampleThey do. They only accept shared or immutable arguments (or arguments with no mutable indirection).There's no use of `shared`, so all data involved is TLS.Then it must be made sure that send and receive work properly.
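A minimal reproduction of the check involved, for reference; the RCString line is commented out and assumes the draft type from the dpaste link is in scope:

---
import std.concurrency;

void worker()
{
    auto s = receiveOnly!string();   // fine: string has no mutable indirection
}

void main()
{
    auto tid = spawn(&worker);
    tid.send("hello");               // passes std.concurrency's static check
    // tid.send(RCString("hello"));  // rejected with "Aliases to mutable
    //                               // thread-local data not allowed."
}
---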
Sep 15 2014
On Monday, 15 September 2014 at 12:47:08 UTC, Robert burner Schadek wrote:On Monday, 15 September 2014 at 12:11:14 UTC, Jakob Ovrum wrote:Yes, that was my point. std.concurrency handles it correctly - there's no unsafe memory sharing going on with RCString's implementation. If you are suggesting we somehow make this work so it can be a drop-in replacement for `string`: I don't think that should be implicitly supported. One method would be to support shared(RCString). This isn't very practical for this use-case, as since atomic reference counting is super slow, you wouldn't want to be using shared(RCString) throughout your program. So you'd have to make a copy on each side (unshared -> shared, then send, then shared -> unshared) which is one copy more than necessary and would still require support for shared(RCString) which is non-trivial. Another option would be to hardcode support for RCString in std.concurrency. This would make the copy hidden, which would go against good practices concerning arrays in D, and not very useful for nogc if the copy has to be a GC copy. Additionally, RCString's interface would need to be compromised to allow constructing from an existing buffer somehow. Maybe the right solution involves integration with std.typecons.Unique. Passing an instance of Unique!T to another thread is something std.concurrency should support, and RCString could be given a method that returns Unique!RCString if the reference count is 1 and errors otherwise. Unique's current implementation would have to be overhauled to carry its payload in-situ instead of on the GC heap like it currently does, but that's something we should do regardless.compiler says no: concurrency.d(554): Error: static assert "Aliases to mutable thread-local data not allowed." I used the std.concurrency exampleThey do. They only accept shared or immutable arguments (or arguments with no mutable indirection).There's no use of `shared`, so all data involved is TLS.Then it must be made sure that send and receive work properly.
Sep 15 2014
On Monday, 15 September 2014 at 13:13:34 UTC, Jakob Ovrum wrote:If you are suggesting we somehow make this work so it can be a drop-in replacement for `string`:Yes, you must be able to get a RCString from one thread to the next.I don't think that should be implicitly supported.Well, it should be at least supported in phobos. How is another matter.Maybe the right solution involves integration with std.typecons.Unique. Passing an instance of Unique!T to another thread is something std.concurrency should support, and RCString could be given a method that returns Unique!RCString if the reference count is 1 and errors otherwise. Unique's current implementation would have to be overhauled to carry its payload in-situ instead of on the GC heap like it currently does, but that's something we should do regardless.Sounds good.
Sep 15 2014
On 9/15/14, 6:13 AM, Jakob Ovrum wrote:One method would be to support shared(RCString). This isn't very practical for this use-case, as since atomic reference counting is super slow, you wouldn't want to be using shared(RCString) throughout your program. So you'd have to make a copy on each side (unshared -> shared, then send, then shared -> unshared) which is one copy more than necessary and would still require support for shared(RCString) which is non-trivial.I think shared(RCString) should be supported. Unique!T is, of course, also worth exploring. -- Andrei
Sep 15 2014
On Monday, 15 September 2014 at 12:47:08 UTC, Robert burner Schadek wrote:On Monday, 15 September 2014 at 12:11:14 UTC, Jakob Ovrum wrote:Probably because RCString is only logically immutable--it contains unions of mutable and immutable members to simplify construction.compiler says no: concurrency.d(554): Error: static assert "Aliases to mutable thread-local data not allowed." I used the std.concurrency exampleThey do. They only accept shared or immutable arguments (or arguments with no mutable indirection).There's no use of `shared`, so all data involved is TLS.Then it must be made sure that send and receive work properly.
Sep 15 2014
On 9/15/14, 2:53 AM, Robert burner Schadek wrote:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:Currently shared strings are not addressed.The road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of a almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice: http://dpaste.dzfl.pl/817283c163f5I haven't found a single lock, is single threading by design or is thread-safety on your todo?Could you transfer this into phobos and make it work with the functions in std.string, it would be a shame if they wouldn't work out of the box when this gets merged. I haven't seen anything that should prevent using the functions of std.string except isSomeString but that should be no problem to fix.Good idea.This is sort of personal to me as most of my PR are in std.string and I sort of aspire to become the LT for std.string ;-)Oooh, nice!I would assume RCString should be faster than string, so could you provide a benchmark of the two.Good idea. It likely won't be faster for the most part (unless it uses realloc and realloc is a lot faster than GC.realloc). Designs based on RCString will, however, have a tighter memory footprint. Andrei
Sep 15 2014
On 15.09.2014 07:49, Andrei Alexandrescu wrote:
>> I haven't found a single lock, is single threading by design or is
>> thread-safety on your todo?
>
> Currently shared strings are not addressed.

Please also consider usage with const and immutable:

* both will disallow changing the reference count without casting
* immutable means implicitly shared between threads, so you'll have to make RCString thread-safe even if shared isn't explicitly supported.

Unfortunately, I've yet to see an efficient thread-safe implementation of reference counting (i.e. without locks). VC used to have reference counted strings, but moved away from it. Maybe it doesn't pull its own weight in the face of the small-string-optimization.
Sep 15 2014
On 9/15/14, 8:58 AM, Rainer Schuetze wrote:On 15.09.2014 07:49, Andrei Alexandrescu wrote:I think these work fine. If not, please send examples.Please also consider usage with const and immutable: * both will disallow changing the reference count without castingI haven't found a single lock, is single threading by design or is thread-safety on your todo?Currently shared strings are not addressed.* immutable means implicitely shared between threads, so you'll have to make RCString thread-safe even if shared isn't explicitly supported.Hmmm, good point. That's a bug. Immutable postblit and dtors should use atomic ops.Unfortunately, I've yet to see an efficient thread-safe implementation of reference counting (i.e. without locks).No locks needed, just interlocked ++/--.VC used to have reference counted strings, but moved away from it. Maybe it doesn't pull its own weight in the face of the small-string-optimization.The reason of C++ strings moving away from refcounting is not strongly related to interlocked refcounting being slow. Andrei
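A sketch of the interlocked ++/-- in question, using core.atomic on a shared count field; Buf, retain and release are illustrative names only, not the draft's API:

---
import core.atomic : atomicOp;

struct Buf
{
    shared size_t refs = 1;   // the only field that must be touched atomically
}

void retain(Buf* b)  { atomicOp!"+="(b.refs, 1); }
bool release(Buf* b) { return atomicOp!"-="(b.refs, 1) == 0; } // true => free it

void main()
{
    auto b = new Buf;
    retain(b);                // a copy was made somewhere
    assert(!release(b));      // one owner left
    assert(release(b));       // count hit zero; the caller would free here
}
---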
Sep 15 2014
On 15.09.2014 09:22, Andrei Alexandrescu wrote:On 9/15/14, 8:58 AM, Rainer Schuetze wrote:Hmm, seems fine when I try it. It feels like a bug in the type system, though: when you make a copy of const(RCXString) to some RCXString, it removes the const from the referenced RCBuffer struct mbuf!?On 15.09.2014 07:49, Andrei Alexandrescu wrote:I think these work fine. If not, please send examples.Please also consider usage with const and immutable: * both will disallow changing the reference count without castingI haven't found a single lock, is single threading by design or is thread-safety on your todo?Currently shared strings are not addressed.Eager reference counting with atomics is not thread safe. See the discussions about automatic reference counting.* immutable means implicitely shared between threads, so you'll have to make RCString thread-safe even if shared isn't explicitly supported.Hmmm, good point. That's a bug. Immutable postblit and dtors should use atomic ops.Unfortunately, I've yet to see an efficient thread-safe implementation of reference counting (i.e. without locks).No locks needed, just interlocked ++/--.Yes, they did not care for thread safety back then. IIRC they had no small-buffer-optimization. With that, reference counting only kicks in with large strings. If we need a lock on these for proper reference counting, it's still better than making a copy including a global lock by the allocator. RainerVC used to have reference counted strings, but moved away from it. Maybe it doesn't pull its own weight in the face of the small-string-optimization.The reason of C++ strings moving away from refcounting is not strongly related to interlocked refcounting being slow.
Sep 15 2014
On 9/15/14, 9:56 AM, Rainer Schuetze wrote:On 15.09.2014 09:22, Andrei Alexandrescu wrote:The conversion relies on pure constructors. As I noted in the opening post, I also think there's something too lax in there. If you have a reduced example that shows a type system breakage without cast, please submit.On 9/15/14, 8:58 AM, Rainer Schuetze wrote:Hmm, seems fine when I try it. It feels like a bug in the type system, though: when you make a copy of const(RCXString) to some RCXString, it removes the const from the referenced RCBuffer struct mbuf!?On 15.09.2014 07:49, Andrei Alexandrescu wrote:I think these work fine. If not, please send examples.Please also consider usage with const and immutable: * both will disallow changing the reference count without castingI haven't found a single lock, is single threading by design or is thread-safety on your todo?Currently shared strings are not addressed.I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun? AndreiEager reference counting with atomics is not thread safe. See the discussions about automatic reference counting.* immutable means implicitely shared between threads, so you'll have to make RCString thread-safe even if shared isn't explicitly supported.Hmmm, good point. That's a bug. Immutable postblit and dtors should use atomic ops.Unfortunately, I've yet to see an efficient thread-safe implementation of reference counting (i.e. without locks).No locks needed, just interlocked ++/--.
Sep 15 2014
On Monday, 15 September 2014 at 17:23:32 UTC, Andrei Alexandrescu wrote:I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun?http://www.gotw.ca/gotw/045.htm
Sep 15 2014
I don't see how that link answers Andrei's question? He just compares different methods of implementing COW.I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun?http://www.gotw.ca/gotw/045.htm
Sep 15 2014
On Monday, 15 September 2014 at 18:08:31 UTC, po wrote:As I understand the issue it works if you make sure to transfer ownership explicitly before the other thread gains access? Maybe this is more clear: http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-countingI don't see how that link answers Andrei's question? He just compares different methods of implementing COW.I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun?http://www.gotw.ca/gotw/045.htm
Sep 15 2014
As I understand the issue it works if you make sure to transfer ownership explicitly before the other thread gains access? Maybe this is more clear: http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-countingAh, I think I follow. So in C++ terms: It basically requires either a global shared_ptr, or that you passed one around by reference between threads. And that you then killed it in one thread at the exact moment the control block was read in another thread. That blog post discusses a solution, I wonder if that is implemented in C++'s shared_ptr?
Sep 15 2014
On Monday, 15 September 2014 at 19:43:42 UTC, po wrote:Ah, I think I follow. So in C++ terms: It basically requires either a global shared_ptr, or that you passed one around by reference between threads. And that you then killed it in one thread at the exact moment the control block was read in another thread. That blog post discusses a solution, I wonder if that is implemented in C++'s shared_ptr?I think you need to either have multiple shared_ptr objects (owned by threads) or use atomic_* in cpp? If I got this right for regular RC you have to increment the refcount before handing it to the other thread who is then responsible for decrementing it, but if the reference is obtained through a global datastructure you need the strong semantics in the blog post at 1024cores since you need to increase the count to take (thread) ownership of it before accessing it? (I could be wrong.)
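For contrast, the explicit hand-off described above looks roughly like this; Node and the counting details are illustrative, and the free-on-zero step is elided:

---
import core.atomic : atomicOp;
import core.thread : Thread;

struct Node { shared size_t refs = 1; int payload; }

void main()
{
    auto n = new Node;
    atomicOp!"+="(n.refs, 1);        // take a reference *before* the other
                                     // thread can see the pointer
    auto t = new Thread({
        // ... use n.payload ...
        atomicOp!"-="(n.refs, 1);    // receiver drops its reference
    });
    t.start();
    atomicOp!"-="(n.refs, 1);        // sender drops its own reference
    t.join();
    // whichever side decremented to zero would free n here
}
---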
Sep 15 2014
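A sketch of the hand-off protocol being described, in D terms: the sender counts the receiver's reference before publishing the pointer, so the count can never be seen dropping to zero by a thread that still holds a live reference. All names here are illustrative, not taken from the prototype.

    import core.atomic;

    struct Node { uint refs; int payload; }

    // Sender: pin the object for the receiver *before* it can be seen.
    void handOff(shared(Node)* n, void delegate(shared(Node)*) publish)
    {
        atomicOp!"+="(n.refs, 1); // the receiver's reference is counted up front
        publish(n);               // only now may the other thread obtain the pointer
    }

    // Receiver: use the object, then drop the reference it was handed.
    void done(shared(Node)* n)
    {
        if (atomicOp!"-="(n.refs, 1) == 0)
        {
            // Last owner frees. With the protocol above this cannot race with an
            // increment, because every reader was counted before it got the pointer.
        }
    }

The "differential" scheme from the 1024cores article is only needed when that ordering cannot be guaranteed, i.e. when the reference is fished out of a shared global that another thread may be rewriting at the same time.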
On 9/15/14, 12:43 PM, po wrote:No, and it needn't. The article is not that good. In C++, if a thread must increment a reference counter while it's going to zero due to another thread, that's 100% a programming error, not a concurrency error. That's a well-known and well-studied problem. As an aside, searching the net for differential reference counting yields pretty much only this article. AndreiAs I understand the issue it works if you make sure to transfer ownership explicitly before the other thread gains access? Maybe this is more clear: http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-countingAh, I think I follow. So in C++ terms: It basically requires either a global shared_ptr, or that you passed one around by reference between threads. And that you then killed it in one thread at the exact moment the control block was read in another thread. That blog post discusses a solution, I wonder if that is implemented in C++'s shared_ptr?
Sep 15 2014
On 15.09.2014 21:49, Andrei Alexandrescu wrote:On 9/15/14, 12:43 PM, po wrote:Huuh? So you must not read a reference to a ref-counted object that might get changed in another thread? Maybe you mean destruction of the shared pointer? Please note that the scenario is also described by Richard Jones in his 2nd edition of the "Handbook of Garbage Collection" (see algorithm 18.2 "Eager reference counting with CompareAndSwap is broken").No, and it neeedn't. The article is not that good. In C++, if a thread must increment a reference counter while it's going to zero due to another thread, that's 100% a programming error, not a concurrency error. That's a well known and well studied problem. As an aside, searching the net for differential reference counting yields pretty much only this article.As I understand the issue it works if you make sure to transfer ownership explicitly before the other thread gains access? Maybe this is more clear: http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-countingAh, I think I follow. So in C++ terms: It basically requires either a global shared_ptr, or that you passed one around by reference between threads. And that you then killed it in one thread at the exact moment the control block was read in another thread. That blog post discusses a solution, I wonder if that is implemented in C++'s shared_ptr?
Sep 15 2014
On 9/15/14, 10:22 PM, Rainer Schuetze wrote:On 15.09.2014 21:49, Andrei Alexandrescu wrote:I didn't say that.On 9/15/14, 12:43 PM, po wrote:Huuh? So you must not read a reference to a ref-counted object that might get changed in another thread?No, and it neeedn't. The article is not that good. In C++, if a thread must increment a reference counter while it's going to zero due to another thread, that's 100% a programming error, not a concurrency error. That's a well known and well studied problem. As an aside, searching the net for differential reference counting yields pretty much only this article.As I understand the issue it works if you make sure to transfer ownership explicitly before the other thread gains access? Maybe this is more clear: http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-countingAh, I think I follow. So in C++ terms: It basically requires either a global shared_ptr, or that you passed one around by reference between threads. And that you then killed it in one thread at the exact moment the control block was read in another thread. That blog post discusses a solution, I wonder if that is implemented in C++'s shared_ptr?Maybe you mean destruction of the shared pointer?I meant: by the time the smart pointer got to the thread, its reference count has increased already.Please note that the scenario is also described by Richard Jones in his 2nd edition of the "Handbook of Garbage Collection" (see algorithm 18.2 "Eager reference counting with CompareAndSwap is broken").I agree such a problem may occur in code generated automatically under the wraps for high-level languages, but not with shared_ptr (or COM objects etc). Andrei
Sep 16 2014
On 16.09.2014 00:44, Andrei Alexandrescu wrote:On 9/15/14, 10:22 PM, Rainer Schuetze wrote:Here is a link with a discussion, links and code: https://groups.google.com/forum/#!topic/comp.programming.threads/6mXgQEiAOW8 It seems there were multiple patents claiming invention of that technique.On 15.09.2014 21:49, Andrei Alexandrescu wrote:On 9/15/14, 12:43 PM, po wrote:No, and it neeedn't. The article is not that good. In C++, if a thread must increment a reference counter while it's going to zero due to another thread, that's 100% a programming error, not a concurrency error. That's a well known and well studied problem. As an aside, searching the net for differential reference counting yields pretty much only this article.As I understand the issue it works if you make sure to transfer ownership explicitly before the other thread gains access? Maybe this is more clear: http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-countingAh, I think I follow. So in C++ terms: It basically requires either a global shared_ptr, or that you passed one around by reference between threads. And that you then killed it in one thread at the exact moment the control block was read in another thread. That blog post discusses a solution, I wonder if that is implemented in C++'s shared_ptr?This works if you use message passing. The issue exists for "shared(shared_ptr!T)". It might be bad style, but that is a convention not enforced by the language. Incidentally, Herb Sutter used "shared_ptr<T>" as a means to implement lock-free linked lists in his talk at the CppCon. To avoid issues, the list head has to be "atomic<shared_ptr<T>>". Which currently needs a lock to do an assignment for similar reasons. ;-o He said there might be ways around that...Huuh? So you must not read a reference to a ref-counted object that might get changed in another thread?I didn't say that.Maybe you mean destruction of the shared pointer?I meant: by the time the smart pointer got to the thread, its reference count has increased already.I agree it is worse if the mechanism is hidden by the system, pretending you are dealing with a single pointer. I'm not yet ready to accept it doesn't exist elsewhere. Coming back to RCString, immutable(RCString) does not have this problem, because it must not be modified by any thread. Working with shared(RCString) isn't supported without a lot of overloads, so you'll have to synchronize externally and cast away shared.Please note that the scenario is also described by Richard Jones in his 2nd edition of the "Handbook of Garbage Collection" (see algorithm 18.2 "Eager reference counting with CompareAndSwap is broken").I agree such a problem may occur in code generated automatically under the wraps for high-level languages, but not with shared_ptr (or COM objects etc).
Sep 16 2014
On Tuesday, 16 September 2014 at 05:22:15 UTC, Rainer Schuetze wrote:Huuh? So you must not read a reference to a ref-counted object that might get changed in another thread?A slice is two words, concurrently reading and writing them is not thread-safe in current GC model too, as another thread can get in between writing length and ptr fields, so it's not a new behavior.
Sep 16 2014
No, and it needn't. The article is not that good. In C++, if a thread must increment a reference counter while it's going to zero due to another thread, that's 100% a programming error, not a concurrency error. That's a well-known and well-studied problem. As an aside, searching the net for differential reference counting yields pretty much only this article. AndreiAlright, sounds sensible enough; it does seem like you would have to write some crappy code to trigger it.
Sep 16 2014
On 15.09.2014 11:31, Ola Fosheim Grøstad wrote:On Monday, 15 September 2014 at 18:08:31 UTC, po wrote:This describes the scenario I meant in the ARC discussions. Thanks for the link, I didn't know a solution exists. I'll have to study the "differential" approach to see if it works for our case and at what cost it comes...As I understand the issue it works if you make sure to transfer ownership explicitly before the other thread gains access? Maybe this is more clear: http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-countingI don't see how that link answers Andrei's question? He just compares different methods of implementing COW.I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun?http://www.gotw.ca/gotw/045.htm
Sep 15 2014
On Monday, 15 September 2014 at 23:41:27 UTC, Rainer Schuetze wrote:Thanks for the link, I didn't know a solution exists. I'll have to study the "differential" approach to see if it works for our case and at what cost it comes...Modern x86 has 128 bit CAS instruction too: lock cmpxchg16b
Sep 15 2014
On Tuesday, 16 September 2014 at 04:19:45 UTC, Ola Fosheim Grostad wrote:On Monday, 15 September 2014 at 23:41:27 UTC, Rainer Schuetze wrote:... which I really need to add to core.atomic.Thanks for the link, I didn't know a solution exists. I'll have to study the "differential" approach to see if it works for our case and at what cost it comes...Modern x86 has 128 bit CAS instruction too: lock cmpxchg16b
Sep 16 2014
On 9/16/14, 7:22 AM, Sean Kelly wrote:On Tuesday, 16 September 2014 at 04:19:45 UTC, Ola Fosheim Grostad wrote:Yes please. -- AndreiOn Monday, 15 September 2014 at 23:41:27 UTC, Rainer Schuetze wrote:.... which I really need to add to core.atomic.Thanks for the link, I didn't know a solution exists. I'll have to study the "differential" approach to see if it works for our case and at what cost it comes...Modern x86 has 128 bit CAS instruction too: lock cmpxchg16b
Sep 16 2014
On 15.09.2014 10:24, Andrei Alexandrescu wrote:Here's an example:

    module module2;

    struct S
    {
        union
        {
            immutable(char)* iptr;
            char* ptr;
        }
    }

    void main()
    {
        auto s = immutable(S)("hi".ptr);
        S t = s;
        t.ptr[0] = 'A';
    }

It seems the union is hiding the fact that there are mutable references. Only the first field is verified when copying the struct. Is this by design? (typeof(s.ptr) is "immutable(char*)")Hmm, seems fine when I try it. It feels like a bug in the type system, though: when you make a copy of const(RCXString) to some RCXString, it removes the const from the referenced RCBuffer struct mbuf!?The conversion relies on pure constructors. As I noted in the opening post, I also think there's something too lax in there. If you have a reduced example that shows a type system breakage without cast, please submit.
Sep 15 2014
On 9/15/14, 4:49 PM, Rainer Schuetze wrote:On 15.09.2014 10:24, Andrei Alexandrescu wrote:Not sure whether that's a bug or feature :o). In fact I'm not even kidding. The "it's a bug" view is obvious. The "it's a feature" view goes by the reasoning: if you're using a union, it means you plan to do gnarly things with the type system anyway, so the compiler may as well tread carefully around you. Through a rather interesting coincidence, I was talking to Walter during the weekend about the idiom: union { immutable T data; T mdata; } which I found useful for things like incrementing the reference counter for non-mutable data. I was discussing how it would be cool if the compiler recognized the construct and did something interesting about it. It seems it already does. AndreiHere's an example: module module2; struct S { union { immutable(char)* iptr; char* ptr; } } void main() { auto s = immutable(S)("hi".ptr); S t = s; t.ptr[0] = 'A'; } It seems the union is hiding the fact that there are mutable references. Only the first field is verified when copying the struct. Is this by design? (typeof(s.ptr) is "immutable(char*)")Hmm, seems fine when I try it. It feels like a bug in the type system, though: when you make a copy of const(RCXString) to some RCXString, it removes the const from the referenced RCBuffer struct mbuf!?The conversion relies on pure constructors. As I noted in the opening post, I also think there's something too lax in there. If you have a reduced example that shows a type system breakage without cast, please submit.
Sep 16 2014
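Recast as the refcount use case Andrei mentions, the idiom looks roughly like this (field names invented); whether the copy in bump() should be accepted without a cast is exactly what the rest of the subthread questions, so treat this as a sketch of the compiler behaviour under discussion rather than a blessed pattern.

    struct RC
    {
        union
        {
            immutable(uint)* icount; // first member: the view const/immutable code sees
            uint* count;             // second member: mutable alias to the same counter
        }
    }

    void bump(const RC rc)
    {
        RC copy = rc;   // accepted today: only the first union member is checked
        ++*copy.count;  // the counter behind a const handle gets incremented
    }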
On 16.09.2014 17:38, Andrei Alexandrescu wrote:On 9/15/14, 4:49 PM, Rainer Schuetze wrote:There is already bug report for this: https://issues.dlang.org/show_bug.cgi?id=12885 It also references the issue why this has been changed pretty recently: https://issues.dlang.org/show_bug.cgi?id=11257 I'm on the fence whether this is convenient or makes it too easy to break const "guarantees". It seems strange that you can modify a const-reference only after you make a copy of the "pointer". ATM I'd prefer seeing an explicite cast for that.On 15.09.2014 10:24, Andrei Alexandrescu wrote:Not sure whether that's a bug or feature :o). In fact I'm not even kidding. The "it's a bug" view is obvious. The "it's a feature" view goes by the reasoning: if you're using a union, it means you plan to do gnarly things with the type system anyway, so the compiler may as well tread carefully around you. Through a rather interesting coincidence, I was talking to Walter during the weekend about the idiom: union { immutable T data; T mdata; } which I found useful for things like incrementing the reference counter for non-mutable data. I was discussing how it would be cool if the compiler recognized the construct and did something interesting about it. It seems it already does. AndreiHere's an example: module module2; struct S { union { immutable(char)* iptr; char* ptr; } } void main() { auto s = immutable(S)("hi".ptr); S t = s; t.ptr[0] = 'A'; } It seems the union is hiding the fact that there are mutable references. Only the first field is verified when copying the struct. Is this by design? (typeof(s.ptr) is "immutable(char*)")Hmm, seems fine when I try it. It feels like a bug in the type system, though: when you make a copy of const(RCXString) to some RCXString, it removes the const from the referenced RCBuffer struct mbuf!?The conversion relies on pure constructors. As I noted in the opening post, I also think there's something too lax in there. If you have a reduced example that shows a type system breakage without cast, please submit.
Sep 21 2014
On 09/21/2014 11:53 AM, Rainer Schuetze wrote:It also references the issue why this has been changed pretty recently: https://issues.dlang.org/show_bug.cgi?id=11257 I'm on the fence whether this is convenient or makes it too easy to break const "guarantees". It seems strange that you can modify a const-reference only after you make a copy of the "pointer". ATM I'd prefer seeing an explicite cast for that.This change is unsound.

    import std.variant;

    void foo(const(Algebraic!(int*, const(int)*)) x) @safe
    {
        Algebraic!(int*, const(int)*) y = x;
        *y.get!(int*)() = 2;
    }

    void main() @safe
    {
        auto x = Algebraic!(int*, const(int)*)(new int);
        assert(*x.get!(int*)() == 0); // pass
        foo(x); // passed as const, so shouldn't change
        assert(*x.get!(int*)() == 2); // pass!
    }
Sep 21 2014
On Monday, 15 September 2014 at 16:22:01 UTC, Andrei Alexandrescu wrote:On 9/15/14, 8:58 AM, Rainer Schuetze wrote:To be fair, you still have to be a bit careful here or things could be optimized such that data is seen to disappear or change when it's not expected to. The original boost::shared_ptr used an atomic integer as an internal refcount, and that's probably a good template for how to do RC here. The newer implementation is a lot fancier with spinlocks and such, I believe, and is a lot more complicated. Also... this is why I'm not over-fond of having immutable being implicitly shared. Being unable to create an efficient RCString that I know is thread-local (the normal case) kind of stinks. Maybe there can be a template parameter option along these lines?* immutable means implicitely shared between threads, so you'll have to make RCString thread-safe even if shared isn't explicitly supported.Hmmm, good point. That's a bug. Immutable postblit and dtors should use atomic ops.Unfortunately, I've yet to see an efficient thread-safe implementation of reference counting (i.e. without locks).No locks needed, just interlocked ++/--.
Sep 15 2014
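One way the template-parameter idea could look, purely as a sketch (not Sean's or Andrei's design): choose plain or atomic counting at instantiation time.

    import core.atomic;

    struct Count(bool threadSafe)
    {
        static if (threadSafe) { shared uint refs = 1; }
        else                   { uint refs = 1; }

        void retain()
        {
            static if (threadSafe) atomicOp!"+="(refs, 1);
            else                   ++refs;
        }

        bool release() // true when this was the last reference
        {
            static if (threadSafe) return atomicOp!"-="(refs, 1) == 0;
            else                   return --refs == 0;
        }
    }

    alias LocalCount  = Count!false; // cheap counting for thread-local strings
    alias SharedCount = Count!true;  // pays for lock-prefixed inc/dec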
On 9/15/14, 11:28 AM, Sean Kelly wrote:On Monday, 15 September 2014 at 16:22:01 UTC, Andrei Alexandrescu wrote:That's news to me. Perhaps it's weak pointer management they need to address?On 9/15/14, 8:58 AM, Rainer Schuetze wrote:To be fair, you still have to be a bit careful here or things could be optimized such that data is seen to disappear or change when it's not expected to. The original boost::shared_ptr used an atomic integer as an internal refcount, and that's probably a good template for how to do RC here. The newer implementation is a lot fancier with spinlocks and such, I believe, and is a lot more complicated.* immutable means implicitely shared between threads, so you'll have to make RCString thread-safe even if shared isn't explicitly supported.Hmmm, good point. That's a bug. Immutable postblit and dtors should use atomic ops.Unfortunately, I've yet to see an efficient thread-safe implementation of reference counting (i.e. without locks).No locks needed, just interlocked ++/--.Also... this is why I'm not over-fond of having immutable being implicitly shared. Being unable to create an efficient RCString that I know is thread-local (the normal case) kind of stinks. Maybe there can be a template parameter option along these lines?Non-immutable and non-shared RCStrings are the ticket. Andrei
Sep 15 2014
On 15.09.2014 07:49, Andrei Alexandrescu wrote:Do you have any benchmarks to share? Last time I measured, the GC was quite a bit faster than manual memory management with the C runtime on Win32, and on par on Win64.I would assume RCString should be faster than string, so could you provide a benchmark of the two.Good idea. It likely won't be faster for the most part (unless it uses realloc and realloc is a lot faster than GC.realloc).
Sep 15 2014
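For anyone who wants to reproduce such numbers, a small harness along these lines compares GC appending against a C-runtime realloc buffer (the comparison Rainer mentions); the figures will vary by platform, and the prototype under review can be dropped in as a third candidate once its API settles.

    import core.stdc.stdlib : free, malloc, realloc;
    import std.datetime : benchmark;
    import std.stdio : writefln;

    enum N = 100_000;

    void gcAppend()
    {
        string s;
        foreach (i; 0 .. N) s ~= 'x';
    }

    void cAppend()
    {
        size_t len, cap = 16;
        auto p = cast(char*) malloc(cap);
        foreach (i; 0 .. N)
        {
            if (len == cap) { cap *= 2; p = cast(char*) realloc(p, cap); }
            p[len++] = 'x';
        }
        free(p);
    }

    void main()
    {
        auto r = benchmark!(gcAppend, cAppend)(10);
        writefln("GC append: %s ms, C realloc append: %s ms", r[0].msecs, r[1].msecs);
    }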
Andrei Alexandrescu:Walter, Brad, myself, and a couple of others have had a couple of quite exciting ideas regarding code that is configurable to use the GC or alternate resource management strategies.An alternative design solution is to follow the Java way: leave the D strings as they are, and avoid making a mess of user D code. The Java GC and runtime contain numerous optimizations for the management of strings, like the recently introduced string de-duplication at run-time: https://blog.codecentric.de/en/2014/08/string-deduplication-new-feature-java-8-update-20-2 Bye, bearophile
Sep 15 2014
On 9/15/14, 3:30 AM, bearophile wrote:Andrei Alexandrescu:Again, it's become obvious that a category of users will simply refuse to use a GC, either for the right or the wrong reasons. We must make D eminently usable for them. AndreiWalter, Brad, myself, and a couple of others have had a couple of quite exciting ideas regarding code that is configurable to use the GC or alternate resource management strategies.An alternative design solution is to follow the Java way, leave the D strings as they are, and avoid to make a mess of user D code. Java GC and runtime contain numerous optimizations for the management of strings, like the recently introduced string de-duplication at run-time: https://blog.codecentric.de/en/2014/08/string-deduplication-new-feature-java-8-update-20-2
Sep 15 2014
Andrei Alexandrescu:Again, it's become obvious that a category of users will simply refuse to use a GC, either for the right or the wrong reasons. We must make D eminently usable for them.Is adding reference-counted strings to D going to add a significant amount of complexity for the programmers? As usual your judgement is better than mine, but surely the increase in complexity of the D language and its usage must be considered in this rcstring discussion. So far I have not seen this point discussed enough in this thread. D is currently quite complex, so I prefer enhancements that simplify the code (like tuples), or that make it safer (this mostly means type system improvements, like provably correct tracking of memory areas and lifetimes, or stricter types for array indexes, or better means to detect errors at compile time with more compile-time introspection for function/ctor arguments), or features that have a limited scope and don't increase the general code complexity much (like the partial type inference patch created by Kenji). Bye, bearophile
Sep 15 2014
On 9/15/14, 8:07 AM, bearophile wrote:Andrei Alexandrescu:Time will tell, but I don't think so.Again, it's become obvious that a category of users will simply refuse to use a GC, either for the right or the wrong reasons. We must make D eminently usable for them.Is adding reference counted strings to D going to add a significant amount of complexity for the programmers?As usual your judgement is better than mine, but surely the increase in complexity of D language and its usage must be considered in this rcstring discussion. So far I have not seen this point discussed enough in this thread.Increasing the standard library with good artifacts is important. So is making it more generic by (in this case) expanding the kinds of strings it supports.D is currently quite complex, so I prefer enhancements that simplify the code (like tuples), or that make it safer (this mostly means type system improvements, like eprovably correct tracking of memory areas and lifetimes, or stricter types for array indexes, or better means to detect errors at compile-times with more compile-time introspection for function/ctor arguments), or features that have a limited scope and don't increase the general code complexity much (like the partial type inference patch created by Kenji).I think most people exclude the library when discussing the complexity of a language. Andrei
Sep 15 2014
On 16 September 2014 00:51, Andrei Alexandrescu via Digitalmars-d < digitalmars-d puremagic.com> wrote:On 9/15/14, 3:30 AM, bearophile wrote:I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling.Andrei Alexandrescu: Walter, Brad, myself, and a couple of others have had a couple ofAgain, it's become obvious that a category of users will simply refuse to use a GC, either for the right or the wrong reasons. We must make D eminently usable for them.quite exciting ideas regarding code that is configurable to use the GC or alternate resource management strategies.An alternative design solution is to follow the Java way, leave the D strings as they are, and avoid to make a mess of user D code. Java GC and runtime contain numerous optimizations for the management of strings, like the recently introduced string de-duplication at run-time: https://blog.codecentric.de/en/2014/08/string-deduplication-new-feature- java-8-update-20-2
Sep 22 2014
On 9/22/14, 8:03 PM, Manu via Digitalmars-d wrote:I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant?A combo approach language + library delivers the most punch.Library RC can't really optimise well, RC requires language support to elide ref fiddling.For class objects that's what's going to happen indeed. Andrei
Sep 22 2014
On 23 September 2014 14:41, Andrei Alexandrescu via Digitalmars-d < digitalmars-d puremagic.com> wrote:On 9/22/14, 8:03 PM, Manu via Digitalmars-d wrote:How so? In what instances are complicated templates superior to a language RC type? Library RC can't really optimise well, RC requires language support toI still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant?A combo approach language + library delivers the most punch.Where is this discussion? Last time I raised it, it was fiercely shut down and dismissed.elide ref fiddling.For class objects that's what's going to happen indeed.
Sep 22 2014
On 9/22/14, 9:53 PM, Manu via Digitalmars-d wrote:On 23 September 2014 14:41, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote: On 9/22/14, 8:03 PM, Manu via Digitalmars-d wrote: I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? A combo approach language + library delivers the most punch. How so? In what instances are complicated templates superior to a language RC type?It just works out that way. I don't know exactly why. In fact I have an idea why, but conveying it requires building a bunch of context.Library RC can't really optimise well, RC requires language support to elide ref fiddling. For class objects that's what's going to happen indeed. Where is this discussion? Last time I raised it, it was fiercely shut down and dismissed.Consider yourself vindicated! (Not really, the design will be different from what you asked.) The relevant discussion is entitled "RFC: reference counted Throwable", and you've already participated to it :o). Andrei
Sep 22 2014
On 23 September 2014 15:37, Andrei Alexandrescu via Digitalmars-d < digitalmars-d puremagic.com> wrote:On 9/22/14, 9:53 PM, Manu via Digitalmars-d wrote:The trouble with library types like RefCounted!, is that they appear to be conceptually backwards to me. RefCounted!T suggests that T is a parameter to RefCounted, ie, RefCounted is the significant object, not 'T', which is what I actually want. T is just some parameter... I want a ref-counted T, not a T RefCounted, if that makes sense. When we have T* or T[], we don't lose the 'T'-ness of the object, we're just appending a certain type of pointer, and I really think that RC should be applied the same way. All these library solutions make T into something else, and that has a tendency to complicate generic code in my experience. In most cases, templates are used to capture some type of thing, but in these RefCounted style cases, it's backwards, it effectively obscures the type. We end out with inevitable code like is(T == RefCounted!U, U) to get U from T, which is the thing we typically want to know about, and every instance of a template like this must be special-cased; they can't be rolled into PointerTarget!T, or other patterns like Unqual!T can't affect these cases (not applicable here, but the reliable pattern is what I refer to). I guess I'm saying, RC should be a type of pointer, not a type of thing... otherwise generic code that deals with particular things always seems to run into complications when it expects particular things, and gets something that looks like a completely different sort of thing. Library RC can't really optimise well, RC requires languageOn 23 September 2014 14:41, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote: On 9/22/14, 8:03 PM, Manu via Digitalmars-d wrote: I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? A combo approach language + library delivers the most punch. How so? In what instances are complicated templates superior to a language RC type?It just works out that way. I don't know exactly why. In fact I have an idea why, but conveying it requires building a bunch of context.I see. I didn't really get that from that thread, but I only skimmed it quite quickly, since I missed most of the action. I also don't think I ever insisted on a particular design, I asked to have it *explored* (I think I made that point quite clearly), and suggested a design that made sense to me. The idea was shut down in principle, no competing design's explored. I'm very happy to see renewed interest in the topic :)support to elide ref fiddling. For class objects that's what's going to happen indeed. Where is this discussion? Last time I raised it, it was fiercely shut down and dismissed.Consider yourself vindicated! (Not really, the design will be different from what you asked.) The relevant discussion is entitled "RFC: reference counted Throwable", and you've already participated to it :o).
Sep 22 2014
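For concreteness, the unwrapping boilerplate being complained about tends to look like the template below; RC here is a stand-in wrapper with the same shape as std.typecons.RefCounted (the real one carries an extra template parameter), and the helper name is invented.

    // Stand-in for a RefCounted-style wrapper.
    struct RC(T) { T payload; alias payload this; }

    // Generic code that wants "the T" has to peel the wrapper off by hand:
    template Unwrap(T)
    {
        static if (is(T == RC!U, U)) // the is(T == RefCounted!U, U) pattern from above
            alias Unwrap = U;
        else
            alias Unwrap = T;
    }

    static assert(is(Unwrap!(RC!int) == int));
    static assert(is(Unwrap!(int[]) == int[]));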
On 9/22/14, 11:44 PM, Manu via Digitalmars-d wrote:On 23 September 2014 15:37, Andrei Alexandrescu via Digitalmars-d The trouble with library types like RefCounted!, is that they appear to be conceptually backwards to me. RefCounted!T suggests that T is a parameter to RefCounted, ie, RefCounted is the significant object, not 'T', which is what I actually want. T is just some parameter... I want a ref-counted T, not a T RefCounted, if that makes sense. When we have T* or T[], we don't lose the 'T'-ness of the object, we're just appending a certain type of pointer, and I really think that RC should be applied the same way.That's at most a syntactic issue but not a conceptual one. We have things like Array!T and nobody blinks an eye.All these library solutions make T into something else, and that has a tendency to complicate generic code in my experience. In most cases, templates are used to capture some type of thing, but in these RefCounted style cases, it's backwards, it effectively obscures the type.As it should. You wouldn't want RefCounted!int to be the same as int*.We end out with inevitable code like is(T == RefCounted!U, U) to get U from T, which is the thing we typically want to know about, and every instance of a template like this must be special-cased; they can't be rolled into PointerTarget!T, or other patterns like Unqual!T can't affect these cases (not applicable here, but the reliable pattern is what I refer to).It turns out a class type is a good candidate for embedding ref countedness in its type; in contrast, RefCounted can be slapped on any value type. That's how we plan to make it to work.I guess I'm saying, RC should be a type of pointer, not a type of thing...It /is/ a pointer. The longer name for it would be RefCountedPointer. It has pointer semantics. If you're looking for pointer syntax as well, that would be a bummer. Andrei
Sep 23 2014
On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote:I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling.I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option.
Sep 22 2014
On 23 September 2014 16:19, deadalnix via Digitalmars-d < digitalmars-d puremagic.com> wrote:On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote:Right, that's pretty much how I imagined it too. Like ranges, where foreach makes implicit calls to contractual methods, there would also be a contract for refcounted objects, and the compiler will emit implicit calls to inc/dec if they exist? That should eliminate 'RefCounted', you would only need to provide opInc()/opDec() and rc fiddling calls would be generated automatically? Then we can preserve the type of things, rather than obscuring them in layers of wrapper templates...I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling.I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option.
Sep 22 2014
23-Sep-2014 10:47, Manu via Digitalmars-d wrote:On 23 September 2014 16:19, deadalnix via Digitalmars-d <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote: On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote: I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling. I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option. Right, that's pretty much how I imagined it too. Like ranges, where foreach makes implicit calls to contractual methods, there would also be a contract for refcounted objects, and the compiler will emit implicit calls to inc/dec if they exist?In my imagination it would be along the lines of ARC:

    struct MyCountedStuff { void opInc(); void opDec(); }

That should eliminate 'RefCounted', you would only need to provide opInc()/opDec() and rc fiddling calls would be generated automatically?Non-intrusive ref-counts are useful too. And not everybody is thrilled by writing inc/dec code again and again.Then we can preserve the type of things, rather than obscuring them in layers of wrapper templates...This would be intrusive ref-counting, which may be more efficient. -- Dmitry Olshansky
Sep 23 2014
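What the intrusive flavour looks like with today's language, where postblit and the destructor do by hand what the proposed opInc/opDec hooks would let the compiler insert; all names are illustrative and error handling is omitted.

    import core.stdc.stdlib : free, malloc;

    struct Intrusive
    {
        static struct Payload { uint refs; int value; } // the count lives inside the object
        private Payload* p;

        static Intrusive make(int value)
        {
            auto r = Intrusive(cast(Payload*) malloc(Payload.sizeof));
            *r.p = Payload(1, value);
            return r;
        }

        this(this)  // what a compiler-recognized opInc would replace
        {
            if (p) ++p.refs;
        }

        ~this()     // what a compiler-recognized opDec would replace
        {
            if (p && --p.refs == 0) free(p);
        }
    }

A non-intrusive wrapper in the shared_ptr style would instead allocate a separate counter block next to a pointer to the wrapped object, which is what makes it applicable to third-party types and to slices like int[].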
On 23 September 2014 17:17, Dmitry Olshansky via Digitalmars-d <digitalmars-d puremagic.com> wrote:23-Sep-2014 10:47, Manu via Digitalmars-d пишет:Problem with this is you can't make a refcounted int[] without mangling the type, and you also can't allocate a ref counted 3rd-party type.On 23 September 2014 16:19, deadalnix via Digitalmars-d <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote: On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote: I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling. I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option. Right, that's pretty much how I imagined it too. Like ranges, where foreach makes implicit calls to contractual methods, there would also be a contract for refcounted objects, and the compiler will emit implicit calls to inc/dec if they exist?In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }It's important to be able to override opInc/opDec for user types, but I think a default scheme for foreign or builtin types is very important.That should eliminate 'RefCounted', you would only need to provide opInc()/opDec() and rc fiddling calls would be generated automatically?Non-intrusive ref-counts are useful too. And not everybody is thrilled by writing inc/dec code again and again.Perhaps I'm not clear what you mean by intrusive/non-intrusive?Then we can preserve the type of things, rather than obscuring them in layers of wrapper templates...This would be intrusive ref-counting which may be more efficient.
Sep 23 2014
23-Sep-2014 16:17, Manu via Digitalmars-d wrote:On 23 September 2014 17:17, Dmitry Olshansky via Digitalmars-d <digitalmars-d puremagic.com> wrote:Is that a problem at all? Why should int[] somehow become ref-counted? I'm constantly at a loss with the strange edge requirements in your questions. Why is "mangling" a type bad?23-Sep-2014 10:47, Manu via Digitalmars-d wrote:Problem with this is you can't make a refcounted int[] without mangling the type,On 23 September 2014 16:19, deadalnix via Digitalmars-d <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote: On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote: I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling. I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option. Right, that's pretty much how I imagined it too. Like ranges, where foreach makes implicit calls to contractual methods, there would also be a contract for refcounted objects, and the compiler will emit implicit calls to inc/dec if they exist?In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }and you also can't allocate a ref counted 3rd-party type.RefCounted!T, or make a wrapper that calls some 3rd-party opInc/opDec.Embed the count vs. adding a count alongside the user type. A use case for intrusive is to stuff a ref-count into the padding of some struct. A non-intrusive example is C++'s shared_ptr<T>. -- Dmitry Olshansky
Sep 24 2014
On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }So that would be a pointer type or a value type? Is there copy on write somewhere? -- Andrei
Sep 23 2014
23-Sep-2014 19:13, Andrei Alexandrescu wrote:On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:It would be an intrusively counted type with a pointer somewhere in the body. To put it simply, MyCountedStuff is a kind of smart pointer. -- Dmitry OlshanskyIn my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }So that would be a pointer type or a value type? Is there copy on write somewhere? -- Andrei
Sep 24 2014
On 9/24/14, 3:31 AM, Dmitry Olshansky wrote:23-Sep-2014 19:13, Andrei Alexandrescu wrote:Then that would be confusing, seeing as structs are value types. What you're saying is that a struct with opInc() and opDec() has pointer semantics whereas one without has value semantics. That design isn't going to fly. For classes such a design makes sense as long as the class is no longer convertible to Object. That's what I'm proposing for RCObject (and Throwable that would inherit it). AndreiOn 9/23/14, 12:17 AM, Dmitry Olshansky wrote:It would be an intrusively counted type with pointer somewhere in the body. To put it simply MyCountedStuff is a kind of smart pointer.In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }So that would be a pointer type or a value type? Is there copy on write somewhere? -- Andrei
Sep 24 2014
On 25 September 2014 00:55, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 9/24/14, 3:31 AM, Dmitry Olshansky wrote:I think the way I imagine refcounting is the opposite of what you're saying here. RC would be a type of pointer. An RC struct pointer can be handled implicitly without problem. That said, it's potentially very useful to support RC value types too, and the way to do that would be for the struct to have opInc/opDec. In that event, the compiler can generate calls to those operators on assignment. Something like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself. Not sure how to express an RC dynamic array... int[^] rcArray? Anyway, syntax is whatever it is, I think this approach is what makes sense to me though.23-Sep-2014 19:13, Andrei Alexandrescu пишет:Then that would be confusing seeing as structs are value types. What you're saying is that a struct with opInc() and opDec() has pointer semantics whereas one with not has value semantics. That design isn't going to fly.On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:It would be an intrusively counted type with pointer somewhere in the body. To put it simply MyCountedStuff is a kind of smart pointer.In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }So that would be a pointer type or a value type? Is there copy on write somewhere? -- Andrei
Sep 24 2014
On 9/24/14, 1:15 PM, Manu via Digitalmars-d wrote:Something like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself.You're getting confused here, postblit and destructor take care of that.Not sure how to express an RC dynamic array... int[^] rcArray? Anyway, syntax is whatever it is, I think this approach is what makes sense to me though.Whatever syntax I like? Awesome! How about: RefCounted!int rcInt; // refcounted pointer to an int RefCounted!MyStruct rcStruct; // refcounted pointer to a struct RefCounted!(int[]) rcArray; // refcounted array The irony is the first two already work with the semantics you need, but apparently I've had difficulty convincing you to try them and report back. My work on RCString has also gone ignored by you, although it's exactly stuff that you're asking for. Andrei
Sep 24 2014
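For reference, the std.typecons.RefCounted usage Andrei points at behaves like this today (scalar payload shown; refCountedPayload is the documented accessor, and alias this also lets the wrapper be used like a plain int):

    import std.typecons : RefCounted;

    void main()
    {
        auto a = RefCounted!int(42);
        auto b = a;                        // postblit bumps the count; both share one payload
        b.refCountedPayload += 1;          // alias this would also allow `b += 1`
        assert(a.refCountedPayload == 43); // a observes the change: pointer semantics
    }   // last copy goes out of scope here; the payload is freed deterministically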
On 25 September 2014 07:17, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 9/24/14, 1:15 PM, Manu via Digitalmars-d wrote:No they don't. It's not a ref counting mechanism, the compiler can't elide those calls. It's what I use now, and it's as good at C++, but we can do much better than that.Something like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself.You're getting confused here, postblit and destructor take care of that.It all feels backwards to me. You've completely alienated me from the discussion. I've given up, but I am watching with interest. I'm not interested in a library solution until I know if, and how the compiler will optimise it... in which case it's not a library solution anymore, so why make the implementation a lib? nogc users will use this stuff EXCLUSIVELY. There is already more than enough attribution in D, I *WILL NOT* wrap everything in my program with RefCounted!(). I will continue to invent my own solutions in every instance, and we will live the same reality as C++; where everyone has their own implementations, and none of them are compatible. Call it a bikeshed, whatever. I'm certain this is the practical reality. I have tried RefCounted extensively in the past. Mangling my types like that caused lots of problems, heaps of is(T == RefCounted!U, U) started appearing throughout my code, and incompatibility with existing libraries. Perhaps the most annoying thing about a library implementation though is the debuginfo baggage. It's extremely frustrating dealing with things that are wrapped up like RefCounted while debugging. You can't evaluate the thing anymore, stepping through your code leads you through countless library stubs and indirections. You lose the ability to just read a symbol name without fuss, you need to make your callstack window half the screen wide to see what you're looking at. I'm also not convinced meaningful refcounting can be implemented before we have scope(T) working properly. I think we should be addressing that first.Not sure how to express an RC dynamic array... int[^] rcArray? Anyway, syntax is whatever it is, I think this approach is what makes sense to me though.Whatever syntax I like? Awesome! How about: RefCounted!int rcInt; // refcounted pointer to an int RefCounted!MyStruct rcStruct; // refcounted pointer to a struct RefCounted!(int[]) rcArray; // refcounted array The irony is the first two already work with the semantics you need, but apparently I've had difficulty convincing you to try them and report back. My work on RCString has also gone ignored by you, although it's exactly stuff that you're asking for.
Sep 24 2014
On 9/24/14, 3:34 PM, Manu via Digitalmars-d wrote:On 25 September 2014 07:17, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:It can.On 9/24/14, 1:15 PM, Manu via Digitalmars-d wrote:No they don't. It's not a ref counting mechanism, the compiler can't elide those calls.Something like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself.You're getting confused here, postblit and destructor take care of that.It's what I use now, and it's as good at C++, but we can do much better than that.D's copy semantics are different from C++'s.So it's not whatever syntax I like? Make up your mind.It all feels backwards to me.Not sure how to express an RC dynamic array... int[^] rcArray? Anyway, syntax is whatever it is, I think this approach is what makes sense to me though.Whatever syntax I like? Awesome! How about: RefCounted!int rcInt; // refcounted pointer to an int RefCounted!MyStruct rcStruct; // refcounted pointer to a struct RefCounted!(int[]) rcArray; // refcounted array The irony is the first two already work with the semantics you need, but apparently I've had difficulty convincing you to try them and report back. My work on RCString has also gone ignored by you, although it's exactly stuff that you're asking for.You've completely alienated me from the discussion. I've given up, but I am watching with interest.How do you mean that?I'm not interested in a library solution until I know if, and how the compiler will optimise it... in which case it's not a library solution anymore, so why make the implementation a lib?This can't be serious. So you're not looking at any prototype because it's not optimized up to wazoo?nogc users will use this stuff EXCLUSIVELY. There is already more than enough attribution in D, I *WILL NOT* wrap everything in my program with RefCounted!(). I will continue to invent my own solutions in every instance, and we will live the same reality as C++; where everyone has their own implementations, and none of them are compatible.But in C++ you'd use shared_ptr<T> all the time. Your position is completely irrational. Frankly it looks you're bullshitting your way through the whole conversation. Whenever anyone puts you in the position of making a positive contribution, you run away for something else to whine about.Call it a bikeshed, whatever. I'm certain this is the practical reality. I have tried RefCounted extensively in the past. Mangling my types like that caused lots of problems, heaps of is(T == RefCounted!U, U) started appearing throughout my code, and incompatibility with existing libraries.Existing libraries won't be helped by adding new functions. What issues did you have with RefCounted? Why do you keep on coming back to "it's mangling my types"? Why do you need heaps of is(...)? What is it that you're trying to solve, that only a solution you're unable to specify can fix?Perhaps the most annoying thing about a library implementation though is the debuginfo baggage. It's extremely frustrating dealing with things that are wrapped up like RefCounted while debugging.How do you debug code using shared_ptr?You can't evaluate the thing anymore, stepping through your code leads you through countless library stubs and indirections. 
You lose the ability to just read a symbol name without fuss, you need to make your callstack window half the screen wide to see what you're looking at. I'm also not convinced meaningful refcounting can be implemented before we have scope(T) working properly. I think we should be addressing that first.That may as well be true. Andrei
Sep 24 2014
On 9/24/14, 3:54 PM, Andrei Alexandrescu wrote:Frankly it looks you're bullshitting your way through the whole conversation.s/bullshitting/fumbling/ Yah, I've done it. Apologies. Andrei
Sep 24 2014
On 25 September 2014 08:54, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 9/24/14, 3:34 PM, Manu via Digitalmars-d wrote:Elaborate? I could be doing anything to the effect of refcounting, and it could be interleaved with anything else performed by those functions...On 25 September 2014 07:17, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:It can.On 9/24/14, 1:15 PM, Manu via Digitalmars-d wrote:No they don't. It's not a ref counting mechanism, the compiler can't elide those calls.Something like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself.You're getting confused here, postblit and destructor take care of that.I don't see how that influences this.It's what I use now, and it's as good at C++, but we can do much better than that.D's copy semantics are different from C++'s.This isn't syntax distinction, it's a whole approach.So it's not whatever syntax I like? Make up your mind.It all feels backwards to me.Not sure how to express an RC dynamic array... int[^] rcArray? Anyway, syntax is whatever it is, I think this approach is what makes sense to me though.Whatever syntax I like? Awesome! How about: RefCounted!int rcInt; // refcounted pointer to an int RefCounted!MyStruct rcStruct; // refcounted pointer to a struct RefCounted!(int[]) rcArray; // refcounted array The irony is the first two already work with the semantics you need, but apparently I've had difficulty convincing you to try them and report back. My work on RCString has also gone ignored by you, although it's exactly stuff that you're asking for.I think it's safe to say I've been the most vocal participant on this topic for more than half a decade now (6 years maybe?). Suddenly there is motion, and it goes precisely the opposite direction that I have been hoping for all this time. That's fine, I'm very happy there is motion. I'm just clearly not the customer. I'll be extremely happy to be proven wrong, but I don't feel like I have an awful lot to add. I hope it gets to a place I'm happy with, but I find it hard to get on this train, I'm just not excited by this direction at all. I specifically don't want shared_ptr<> and std::string, which is where it seems to be going.You've completely alienated me from the discussion. I've given up, but I am watching with interest.How do you mean that?I'm completely serious, but not for the reason you paraphrased. You're suggesting that this is supported by the language somehow? I have no idea how that works, and that's what I'm interested in. We all already have RC libs of our own.I'm not interested in a library solution until I know if, and how the compiler will optimise it... in which case it's not a library solution anymore, so why make the implementation a lib?This can't be serious. So you're not looking at any prototype because it's not optimized up to wazoo?I would never use shared_ptr<> for all the same reasons.nogc users will use this stuff EXCLUSIVELY. There is already more than enough attribution in D, I *WILL NOT* wrap everything in my program with RefCounted!(). 
I will continue to invent my own solutions in every instance, and we will live the same reality as C++; where everyone has their own implementations, and none of them are compatible.But in C++ you'd use shared_ptr<T> all the time.Your position is completely irrational. Frankly it looks you're bullshitting your way through the whole conversation. Whenever anyone puts you in the position of making a positive contribution, you run away for something else to whine about.Whatever. I'll resist the urge to respond similarly ad hominem. I've made my position on shared_ptr<> clear on so many occasions. I'm not interested in a D implementation of shared_ptr<>. I'm not alone, it's not a tool that people use in my industry; most companies I know even have rigid policies against stl and/or boost. What contribution am I supposed to make? I'm not a language developer, I don't want to be. I'm a customer/end user, that's the relationship I want. I have code to write and I'm sick of C++, that's why I'm here. I'm also a strong advocate for D, giving lots of demonstrations, lecture/presentations, and time/energy within my industry and local community. I have in the past had significant involvement in this NG and IRC. I also influenced a major commercial company to adopt D. I don't feel my contribution is worthless, or can be summarised as you so nicely did. I complain about things that hinder my work. That's what I care about. You'll notice that I rarely get involved or excited about fancy futuristic language constructs or library stuff, I am almost entirely task oriented, and if something is causing chronic friction between me and getting my work done, then I raise that as an issue. In terms of 'positive contribution', whatever that means exactly in this context, all I have ever really managed to achieve around here is to be a catalyst for action on a few particularly important issues that affect myself and my industry. Many of these have been addressed, and that's awesome. I find the frequency of language related problems has reduced significantly; most of my current problems are practical (tooling, etc).It's unsightly, interferes with the type system (every time I want T, I need to check if it's wrapped, and then unwrap it), compromises debugging experience, runs particularly poorly in non-optimised builds. For instance: __traits(allMembers, T) == tuple("RefCountedStore", "_refCounted", "refCountedStore", "__ctor", "__postblit", "__dtor", "opAssign", "refCountedPayload") That's not what the user expects. In such an imaginary context, there is probably already a 'static if(isPointer!T)' branch, which recurses with PointerTarget!T. Such is likely to exist in libs, and I expect it would be handled transparent if RC were a type of pointer.Call it a bikeshed, whatever. I'm certain this is the practical reality. I have tried RefCounted extensively in the past. Mangling my types like that caused lots of problems, heaps of is(T == RefCounted!U, U) started appearing throughout my code, and incompatibility with existing libraries.Existing libraries won't be helped by adding new functions. What issues did you have with RefCounted? Why do you keep on coming back to "it's mangling my types"?Why do you need heaps of is(...)?Because that's how you find T from RefCounted!T. It's not uncommon to want to know what type you're dealing with. I expect things like PointerTarget, isPointer, etc, would be enhanced to support RC pointers in the event they were to exist. 
They don't support RefCounted, and I'm not even sure that really makes sense, because as I've said before, RefCounted!T is conceptually backwards; ie, T is a kind of RefCounted, rather than refcounted being a kind of pointer.What is it that you're trying to solve, that only a solution you're unable to specify can fix?I'd like RC pointers parallel to GC pointers, since I think we've established in the past that there is no chance we will ever replace the GC with something like GC backed ARC across the board. I don't recall trying to specify a solution, just suggesting an approach that feels natural to me, and is similarly implemented in other languages.I don't, for the same reasons.Perhaps the most annoying thing about a library implementation though is the debuginfo baggage. It's extremely frustrating dealing with things that are wrapped up like RefCounted while debugging.How do you debug code using shared_ptr?Shall we declare this a decisive area of development and start a conversation? I particularly liked the proposal in the other thread, until it transformed into a storage class.You can't evaluate the thing anymore, stepping through your code leads you through countless library stubs and indirections. You lose the ability to just read a symbol name without fuss, you need to make your callstack window half the screen wide to see what you're looking at. I'm also not convinced meaningful refcounting can be implemented before we have scope(T) working properly. I think we should be addressing that first.That may as well be true.
Sep 24 2014
On 9/24/14, 7:16 PM, Manu via Digitalmars-d wrote: [snip] Thanks for answering. I've figured I have no meaningful answer to this, but allow me to elaborate why out of respect. Apologies for letting go of the horses a little. An explanation (but not an excuse) is that Walter and I are humans and as much as we do our best to keep levelheaded at all times we get a tad frustrated. These are heady days for D. We're more optimistic than ever about future prospects and possibilities, which we hope to confirm publicly soon. We have a very strong vision (C++ and GC, C++ and GC...) and we have designs realizing it that we believe we can make work. This is literally the best time in the history of D for the community to help us. Amid this feverish preparation (we talk some 15-45 minutes on the phone every day), we find ourselves stumped by the occasional idle chatter on inconsequential matters that sometimes descends into a spiral of gratuitous negativity. Or wine party, as graciously dubbed by Walter. I have difficulty communicating with you, to the extent that I very candidly have no idea what you're trying to accomplish, and how you propose language technology to help you. But I think it's possible to improve on that. One simple rule of thumb is to pop one level up and describe the task you're trying to accomplish, instead of describing at low level what you believe would be obviously the language-based solution. Two examples: 1. You say "if ref were part of a type" and not only say it, but also build on the good presumed consequence of it. That can't be done in D, simple as that. We can't flip a switch and do it. The ripples throughout the entire fabric of the language would essentially transform it in a whole different language, with most rules subtly different from today's. You yourself confessed "I'm not a language developer, I don't want to be." Then the best route is to focus on the high-level task as opposed on what you believe would be the language change fixing it. Please take this as kindly as I mean it: most language-space solutions you propose are alien and unworkable. 2. You wrongly believe language solutions are innately better than engineering solutions. Please understand that no amount of notation would save you from the issues you encounter with RefCounted. Also, your notion that optimization technology only kicks in for language-baked artifacts is wrong. Please trust us on this: YES, we can define increment and decrement in libraries in such a way they're elided if they cancel themselves out. I find it very difficult to sympathize with you completely dismissing library engineering solutions for vague reasons, which, from what I can tell, no amount of built-in notation can save you. What I hope to come out of this is a clear idea of what you're trying to accomplish (not "I want ref part of the type", but "I want to write a framework that does ..."), and how you find the current offering in the language + standard library + your own custom libraries wanting. Can we get that kind of dialog going? Andrei
Sep 24 2014
I'm afk (on a phone) for 2 days, but I'll get back to this. On 25 Sep 2014 15:30, "Andrei Alexandrescu via Digitalmars-d" < digitalmars-d puremagic.com> wrote: [snip]
Sep 24 2014
On Thursday, 25 September 2014 at 02:16:32 UTC, Manu via Digitalmars-d wrote: On 25 September 2014 08:54, Andrei Alexandrescu via Digitalmars-d For one, it allows us to turn copy+destruction into a move, e.g. for tail calls. I don't see how that influences this. It's what I use now, and it's as good as C++, but we can do much better than that. D's copy semantics are different from C++'s. Please note that I'm not opposed to turning it back into a type modifier. I just don't understand your problems enough to form an opinion on it, and as long as that is the case, it's not a good idea to switch it back and forth, especially since it brings along its own problems. Please post a concrete example where a storage class fails (in the other thread). Shall we declare this a decisive area of development and start a conversation? I particularly liked the proposal in the other thread, until it transformed into a storage class. I'm also not convinced meaningful refcounting can be implemented before we have scope(T) working properly. I think we should be addressing that first. That may as well be true.
Sep 25 2014
On 9/24/2014 1:15 PM, Manu via Digitalmars-d wrote:Something like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself.I think Microsoft's C++/CLI tried that mixed pointer approach, and it was a disaster. I don't have personal knowledge of this. I suspect I know why. C++ on DOS had mixed pointer types - the near and far thing. It was simply unworkable in C++. Sure, it technically worked, but was so awful trying to deal with "is my ref type near or far" that people just couldn't write comprehensible code with it. Note that a ^ pointer will be a different size than a * pointer. Things go downhill from there.
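As a point of reference for the size concern, here is a minimal library-style sketch (purely illustrative; RcPtr and get are made-up names, not a proposal and not Phobos code): a non-intrusive counted pointer naturally ends up wider than a raw pointer, which is one concrete form the "different size" problem takes.

    import core.stdc.stdlib : malloc, free;
    import std.stdio : writeln;

    // Hypothetical non-intrusive RC pointer: two words wide because the count
    // lives in a separate allocation. An intrusive design can stay one word.
    struct RcPtr(T)
    {
        private T* payload;
        private size_t* count;

        this(T value)
        {
            payload = cast(T*) malloc(T.sizeof);
            *payload = value;
            count = cast(size_t*) malloc(size_t.sizeof);
            *count = 1;
        }
        this(this) { if (count) ++*count; }           // copy: +1
        ~this()
        {
            if (count && --*count == 0) { free(payload); free(count); }
        }
        ref T get() { return *payload; }
    }

    void main()
    {
        auto p = RcPtr!int(42);
        writeln(RcPtr!int.sizeof, " vs ", (int*).sizeof); // e.g. 16 vs 8 on 64-bit
    }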
Sep 24 2014
On 25 Sep 2014 15:50, "Walter Bright via Digitalmars-d" < digitalmars-d puremagic.com> wrote:On 9/24/2014 1:15 PM, Manu via Digitalmars-d wrote:Something like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself.I think Microsoft's C++/CLI tried that mixed pointer approach, and it was a disaster. I don't have personal knowledge of this. C++ doesn't have any notion of borrowing. Scope can (will?) fix a whole lot of existing problems, and also allow RC -> raw pointers work nicely. Also, I have no experience that leads me to believe it was any sort of disaster. It would have been nice if it didn't imply COM specifically, that's probably the real problem that got the bad press; ^ pointers are implicitly COM, which I don't recommend repeating ;) I suspect I know why. C++ on DOS had mixed pointer types - the near and far thing. It was simply unworkable in C++. Sure, it technically worked, but was so awful trying to deal with "is my ref type near or far" that people just couldn't write comprehensible code with it. I don't think this is the same thing. There's a practical and functional difference. People would definitely use it if they have it. I don't think the MS problem was in any way related. Note that a ^ pointer will be a different size than a * pointer. Things go downhill from there. Dynamic arrays are a different size from pointers, that never caused any problems. I can't imagine any issues from this. Anyway, it hasn't been explored at length. If it has any chance then I'll put extensive thought into it.
Sep 25 2014
Your double posting is baaaaack! Seriously, get a better newsreader/poster. You're the only one with this issue!
Sep 25 2014
On 9/25/2014 12:10 AM, Manu via Digitalmars-d wrote:> I think Microsoft's C++/CLI tried that mixed pointer approach, and it was a disaster. I don't have personal knowledge of this. C++ doesn't have any notion of borrowing. Scope can (will?) fix a whole lot of existing problems, and also allow RC -> raw pointers work nicely.Consider that people complain a lot about annotations. See the other thread. Adding the scope annotations everywhere is a LOT of annotations. Do you think people will be happy with that? I don't. I remember reading a paper about someone adding pointer annotations to Java. It was a technical success, and a usability failure. People just couldn't be bothered to add the annotations.
Sep 25 2014
Walter Bright:Consider that people complain a lot about annotations.It's much better to try to quantify how much this "a lot" means, even roughly. Otherwise you can't reason on anecdote. (I think D annotations are a burden, and their return of investment is currently not large, but so far they are acceptable for me. We can discuss possible ways to improve the situation).See the other thread. Adding the scope annotations everywhere is a LOT of annotations. Do you think people will be happy with that? I don't.How much is that "a LOT"? I'd like you to give this idea a chance to show its advantages in practice. Removing problems like this, and at the same time giving a help to garbage collection (reference counting is another kind of garbage collection. And it has some costs) is good: int* foo() safe { int[10] a; int[] b = a[]; return &b[1]; } void main() {} So it's a matter of weighting the costs of a system to track ownership of memory areas compared to the costs in correctness and performance of the current situation (or the current situation plus an additional garbage collection strategy). Lately for me correctness has become very important (more than for the author of the video about video game language design), I am willing to work more when I think about the code and adding some annotations if this avoids me the wasted time to search bugs or avoids me troubles with too much GC activity when the program gets larger (as shown by Maxime Chevalier-Boisvert). Bye, bearophile
Sep 25 2014
On Thu, 25 Sep 2014 01:27:13 -0700, Walter Bright <newshound2 digitalmars.com> wrote:On 9/25/2014 12:10 AM, Manu via Digitalmars-d wrote:> I think Microsoft's C++/CLI tried that mixed pointer approach, and it was a disaster. I don't have personal knowledge of this. C++ doesn't have any notion of borrowing. Scope can (will?) fix a whole lot of existing problems, and also allow RC -> raw pointers work nicely. Consider that people complain a lot about annotations. See the other thread. Adding the scope annotations everywhere is a LOT of annotations. Do you think people will be happy with that? I don't. I remember reading a paper about someone adding pointer annotations to Java. It was a technical success, and a usability failure. People just couldn't be bothered to add the annotations. You brought up these comparisons with near/far pointers earlier. (They stay as vague.) And now you also argue against borrowing. Please reconsider, since pointers do differ in their lifetimes. We have that friction already and D is unable to plug the hole. Off the top of my head there are several bugs about escaping stack pointers, and RC is around the corner. The extent of what borrowing solves is really jaw-dropping when you read the list of use cases, and I will surely add scope to every function argument that it applies to (and already do that), because it makes it verifiably safe to call with any pointer type, be it ARC, GC or stack. I consider that more important than const or pure. And why do you bring up Java? Does Java have any pointer types other than GC? Is it a low level systems programming language? No! So those annotations probably weren't an enabling feature like "scope", right? -- Marco
Sep 28 2014
Marco Leise:The extent of what borrowing solves is really jaw-dropping when you read the list of use cases, and I will surely add scope to every function argument that it applies to (and already do that), because it makes it verifiably safe to call with any pointer type, be it ARC, GC or stack. I consider that more important than const or pure.I consider the tracking of memory ownership more important for D than the whole reference counting discussion (even though the two topics are partially linked), because it's a correctness issue (and will also lead to some performance improvements). Bye, bearophile
Sep 28 2014
On 28.09.2014 12:15, Marco Leise wrote:On Thu, 25 Sep 2014 01:27:13 -0700, Walter Bright <newshound2 digitalmars.com> wrote:[snip]You brought up these comparisons with near/far pointers earlier. (They stay as vague.) And now you also argue against borrowing. [snip] And why do you bring up Java? Does Java have any pointer types other than GC? Is it a low level systems programming language? No! So those annotations probably weren't an enabling feature like "scope", right?Depends on how you look at it. Those types of annotations are used in meta-circular implementations of Java. In those cases Java is a low level systems programming language. Case in point, Jikes, http://jikesrvm.org/ Annotations examples, http://jikesrvm.sourceforge.net/apidocs/latest/org/vmmagic/pragma/package-summary.html Oracle's Hotspot replacement Graal and Substrate VM AOT compiler, http://lafo.ssw.uni-linz.ac.at/papers/2014_CGO_OneVMToRuleThemAll.pdf -- Paulo
Sep 28 2014
On Thursday, 25 September 2014 at 05:46:35 UTC, Walter Bright wrote:On 9/24/2014 1:15 PM, Manu via Digitalmars-d wrote:Why it was a disaster? Microsoft is still using it. It is how C++/CX works, recently introduced with Windows 8 new COM based runtime. The different to C++/CLI being that ^ (handle types) are not GC pointers, but rather compiler managed COM (AddRef/Release) instances. For those that wish to stay away from C++/CX, there is the more complex Windows Runtime C++ Template Library with ComPtr<>(), but then bye compiler support for increment/decrement removals. http://msdn.microsoft.com/en-us/library/hh699870.aspx http://msdn.microsoft.com/en-us/library/windows/apps/jj822931.aspx Example for those lazy to follow the link, using namespace Platform; Person^ p = ref new Person("Clark Kent"); p->AddPhoneNumber("Home", "425-555-4567"); p->AddPhoneNumber("Work", "206-555-9999"); String^ workphone = p->PhoneNumbers->Lookup("Work"); -- PauloSomething like (whatever syntax you like): int^ rcInt; // refcounted pointer to an int MyStruct^ rcStruct; // refcounted pointer to a struct MyStruct s; // normal value-type struct, but if the struct has opInc/opDec, the RC handling code in the compiler can implicitly generate calls to opInc/opDec on assignment, which will allow the struct to manage itself.I think Microsoft's C++/CLI tried that mixed pointer approach, and it was a disaster. I don't have personal knowledge of this. I suspect I know why. C++ on DOS had mixed pointer types - the near and far thing. It was simply unworkable in C++. Sure, it technically worked, but was so awful trying to deal with "is my ref type near or far" that people just couldn't write comprehensible code with it. Note that a ^ pointer will be a different size than a * pointer. Things go downhill from there.
Sep 25 2014
24-Sep-2014 18:55, Andrei Alexandrescu пишет:On 9/24/14, 3:31 AM, Dmitry Olshansky wrote:Read that as struct RefCounted(T){ void opInc(); void opDec(); } The main thing is to let compiler know the stuff is ref-counted in some generic way.23-Sep-2014 19:13, Andrei Alexandrescu пишет:Then that would be confusing seeing as structs are value types. What you're saying is that a struct with opInc() and opDec() has pointer semantics whereas one with not has value semantics. That design isn't going to fly.On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:It would be an intrusively counted type with pointer somewhere in the body. To put it simply MyCountedStuff is a kind of smart pointer.In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }So that would be a pointer type or a value type? Is there copy on write somewhere? -- AndreiFor classes such a design makes sense as long as the class is no longer convertible to Object. That's what I'm proposing for RCObject (and Throwable that would inherit it). Andrei-- Dmitry Olshansky
Sep 26 2014
On 9/26/14, 2:50 PM, Dmitry Olshansky wrote:24-Sep-2014 18:55, Andrei Alexandrescu пишет:Consider: struct MyRefCounted void opInc(); void opDec(); int x; } MyRefCounted a; a.x = 42; MyRefCounted b = a; b.x = 43; What is a.x after this? AndreiOn 9/24/14, 3:31 AM, Dmitry Olshansky wrote:Read that as struct RefCounted(T){ void opInc(); void opDec(); }23-Sep-2014 19:13, Andrei Alexandrescu пишет:Then that would be confusing seeing as structs are value types. What you're saying is that a struct with opInc() and opDec() has pointer semantics whereas one with not has value semantics. That design isn't going to fly.On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:It would be an intrusively counted type with pointer somewhere in the body. To put it simply MyCountedStuff is a kind of smart pointer.In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }So that would be a pointer type or a value type? Is there copy on write somewhere? -- Andrei
Sep 26 2014
Consider: struct MyRefCounted { void opInc(); void opDec(); int x; } MyRefCounted a; a.x = 42; MyRefCounted b = a; b.x = 43; What is a.x after this? Andrei a.x == 42 a.ref_count == 1 (1 for init, +1 for copy, -1 for destruction) b.x == 43 b.ref_count == 1 (only init)
Sep 27 2014
27-Sep-2014 12:11, Foo пишет:There is no implicit ref-count. opInc may just as well create a file on harddrive and count refs there. Guaranteed it would be idiotic idea, but the mechanism itself opens door to some cool alternatives like: - separate tables for ref-counts (many gamedevs seem to favor this, also see Objective-C) - use padding of some stuff for ref-count - may go on and use e.g. 1 byte for ref-count on their own risk, or even a couple of bits here and there I may go on, and on. But also consider: GObject of GLib (GNOME libraries) XPCOM (something I think Mozila did as sort-of COM) MS COM etc. Refcounting is process of add(x), and sub(x), and calling destructor should the subtract call report zero. Everything else is in the hands of the creator. -- Dmitry OlshanskyConsider: struct MyRefCounted void opInc(); void opDec(); int x; } MyRefCounted a; a.x = 42; MyRefCounted b = a; b.x = 43; What is a.x after this? Andreia.x == 42 a.ref_count == 1 (1 for init, +1 for copy, -1 for destruction) b.x == 43 b.ref_count == 1 (only init)
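To make one of the listed alternatives concrete, here is a toy sketch of a non-intrusive side table (illustrative only; refCounts, addRef and release are made-up names, and it leans on a GC-backed built-in AA purely for brevity, which a real GC-free implementation obviously would not):

    // Side table: object address -> reference count, kept outside the object.
    size_t[void*] refCounts;

    void addRef(void* p)
    {
        refCounts[p] = refCounts.get(p, 0) + 1;
    }

    bool release(void* p) // true when the last reference just went away
    {
        if (--refCounts[p] == 0)
        {
            refCounts.remove(p);
            return true; // caller runs the destructor / frees the memory
        }
        return false;
    }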
Sep 27 2014
On 9/27/14, 2:43 AM, Dmitry Olshansky wrote:Refcounting is process of add(x), and sub(x), and calling destructor should the subtract call report zero. Everything else is in the hands of the creator.I literally have no idea what you are discussing here. -- Andrei
Sep 27 2014
27-Sep-2014 23:15, Andrei Alexandrescu пишет:On 9/27/14, 2:43 AM, Dmitry Olshansky wrote:That proposed scheme is completely abstract as to what exactly adding X to a ref-count does. Or what exactly subtracting X from ref-count does. The only invariant is that there is something that will tell us after yet another decrement that ref-count is zero and we should free resources. As I said elsewhere - automagically embedding ref-count is too limiting, there is whole lot of ways to implement count including intrusive and non-intrusive. -- Dmitry OlshanskyRefcounting is process of add(x), and sub(x), and calling destructor should the subtract call report zero. Everything else is in the hands of the creator.I literally have no idea what you are discussing here. -- Andrei
Sep 27 2014
On 9/27/14, 12:53 PM, Dmitry Olshansky wrote:27-Sep-2014 23:15, Andrei Alexandrescu пишет:What is "that proposed scheme?" I feel like I'm watching a movie starting half time.On 9/27/14, 2:43 AM, Dmitry Olshansky wrote:That proposed scheme is completely abstract as to what exactly adding X to a ref-count does.Refcounting is process of add(x), and sub(x), and calling destructor should the subtract call report zero. Everything else is in the hands of the creator.I literally have no idea what you are discussing here. -- AndreiOr what exactly subtracting X from ref-count does. The only invariant is that there is something that will tell us after yet another decrement that ref-count is zero and we should free resources. As I said elsewhere - automagically embedding ref-count is too limiting, there is whole lot of ways to implement count including intrusive and non-intrusive.Sure, agreed. But what's the deal here? Andrei
Sep 27 2014
On 9/27/14, 1:11 AM, Foo wrote:So then when does the counter get ever incremented? -- AndreiConsider: struct MyRefCounted void opInc(); void opDec(); int x; } MyRefCounted a; a.x = 42; MyRefCounted b = a; b.x = 43; What is a.x after this? Andreia.x == 42 a.ref_count == 1 (1 for init, +1 for copy, -1 for destruction) b.x == 43 b.ref_count == 1 (only init)
Sep 27 2014
On Saturday, 27 September 2014 at 19:11:08 UTC, Andrei Alexandrescu wrote:On 9/27/14, 1:11 AM, Foo wrote:increment: by postblit call decrement: by dtor call But if you ask me, we should either use a valid library solution like: http://dpaste.dzfl.pl/b146ac2e599a (it is only a draft of 10 min work) Or we should extend UDA's, so that: rc(int) x; is rewritten to: Rc!int x;So then when does the counter get ever incremented? -- AndreiConsider: struct MyRefCounted void opInc(); void opDec(); int x; } MyRefCounted a; a.x = 42; MyRefCounted b = a; b.x = 43; What is a.x after this? Andreia.x == 42 a.ref_count == 1 (1 for init, +1 for copy, -1 for destruction) b.x == 43 b.ref_count == 1 (only init)
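To spell out the postblit/dtor answer, here is a minimal sketch of such a library wrapper (the linked dpaste draft is not reproduced here; Rc, Payload and get are illustrative names, and a real implementation would emplace into the raw memory and handle types with destructors):

    import core.stdc.stdlib : malloc, free;

    struct Rc(T)
    {
        private struct Payload { T value; size_t count; }
        private Payload* p;

        this(T value)
        {
            p = cast(Payload*) malloc(Payload.sizeof);
            p.value = value;   // sketch only: a real version would emplace here
            p.count = 1;
        }
        this(this)  // postblit: every copy bumps the count
        {
            if (p) ++p.count;
        }
        ~this()     // dtor: the last owner frees the payload
        {
            if (p && --p.count == 0) free(p);
        }
        ref inout(T) get() inout { return p.value; }
        alias get this;
    }

    unittest
    {
        auto a = Rc!int(42);
        auto b = a;               // postblit runs: count goes to 2
        assert(a == 42 && b == 42);
    }   // dtors run: count 2 -> 1 -> 0, payload freed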
Sep 27 2014
27-Sep-2014 02:51, Andrei Alexandrescu пишет:On 9/26/14, 2:50 PM, Dmitry Olshansky wrote:Okay it serves no good for me to make these tiny comments while on the go. As usual, structs are value types, so this feature can be mis-used, no two thoughts abouts it. It may need a bit of improvement in user-friendliness, compiler may help there by auto-detecting common misuse. Theoretically class-es would be better choice, except that question of allocation pops up immediately, then consider for instance COM objects. The good thing w.r.t. to memory about structs - they are themselves already allocated "somewhere", and it's only ref-counted payload that is allocated and destroyed in a user-defined way. And now for the killer reasons to go for struct is the following: Compiler _already_ does all of life-time management and had numerous bug fixes to make sure it does the right thing. In contrast there is nothing for classes that tracks their lifetimes to call proper hooks. Let's REUSE that mechanism we have with structs and go as lightly as possible on untested LOCs budget. Full outline, of generic to the max, dirt-cheap implementation with a bit of lowering: ARC or anything close to it, is implemented as follows: 1. Any struct that have ARC attached, must have the following methods: void opInc(); bool opDec(); // true - time to destroy It also MUST NOT have postblit, and MUST have destructor. 2. Compiler takes user-defined destructor and creates proper destructor, as equivalent of this: if(opDec()){ user__defined_dtor; } 3. postblit is defined as opInc(). 4. any ctor has opInc() appended to its body. Everything else is taken care of by the very nature of the structs. Now this is enough to make ref-counted stuff a bit simpler to write but not much beyond. So here the next "consequences" that we can then implement: 4. Compiler is expected to assume anywhere in fully inlined code, that opInc()/opDec() pairs are no-op. It should do so even in debug mode (though there is less opportunity to do so without inlining). Consider it an NRVO of the new age, required optimization. 5. If we extend opInc/opDec to take an argument, the compiler may go further and batch up multiple opInc-s and opDec-s, as long as it's safe to do so (e.g. there could be exceptions thrown!): Consider: auto a = File("some-file.txt"); //pass to some structs for future use B b = B(a); C c = C(a); a = File("other file"); May be (this is overly simplified!): File a = void, b = void, c = void; a = File.user_ctor("some-file.txt")' a.opInc(2); b = B(a); c = C(a); a = File.user_ctor("other file"); a.opInc(); -- Dmitry Olshansky24-Sep-2014 18:55, Andrei Alexandrescu пишет:Consider: struct MyRefCounted void opInc(); void opDec(); int x; } MyRefCounted a; a.x = 42; MyRefCounted b = a; b.x = 43; What is a.x after this?On 9/24/14, 3:31 AM, Dmitry Olshansky wrote:Read that as struct RefCounted(T){ void opInc(); void opDec(); }23-Sep-2014 19:13, Andrei Alexandrescu пишет:Then that would be confusing seeing as structs are value types. What you're saying is that a struct with opInc() and opDec() has pointer semantics whereas one with not has value semantics. That design isn't going to fly.On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:It would be an intrusively counted type with pointer somewhere in the body. To put it simply MyCountedStuff is a kind of smart pointer.In my imagination it would be along the lines of ARC struct MyCountedStuff{ void opInc(); void opDec(); }So that would be a pointer type or a value type? Is there copy on write somewhere? 
-- Andrei
Sep 27 2014
On Saturday, 27 September 2014 at 09:38:35 UTC, Dmitry Olshansky wrote:As usual, structs are value types, so this feature can be mis-used, no two thoughts abouts it. It may need a bit of improvement in user-friendliness, compiler may help there by auto-detecting common misuse. Theoretically class-es would be better choice, except that question of allocation pops up immediately, then consider for instance COM objects. The good thing w.r.t. to memory about structs - they are themselves already allocated "somewhere", and it's only ref-counted payload that is allocated and destroyed in a user-defined way. And now for the killer reasons to go for struct is the following: Compiler _already_ does all of life-time management and had numerous bug fixes to make sure it does the right thing. In contrast there is nothing for classes that tracks their lifetimes to call proper hooks.This cannot be stressed enough.Let's REUSE that mechanism we have with structs and go as lightly as possible on untested LOCs budget. Full outline, of generic to the max, dirt-cheap implementation with a bit of lowering: ARC or anything close to it, is implemented as follows: 1. Any struct that have ARC attached, must have the following methods: void opInc(); bool opDec(); // true - time to destroy It also MUST NOT have postblit, and MUST have destructor. 2. Compiler takes user-defined destructor and creates proper destructor, as equivalent of this: if(opDec()){ user__defined_dtor; } 3. postblit is defined as opInc(). 4. any ctor has opInc() appended to its body. Everything else is taken care of by the very nature of the structs.AFAICS we don't gain anything from this, because it just automates certain things that can already be done manually in a suitably implemented wrapper struct. I don't think automation is necessary here, because realistically, how many RC wrappers will there be? Ideally just one, in Phobos.Now this is enough to make ref-counted stuff a bit simpler to write but not much beyond. So here the next "consequences" that we can then implement: 4. Compiler is expected to assume anywhere in fully inlined code, that opInc()/opDec() pairs are no-op. It should do so even in debug mode (though there is less opportunity to do so without inlining). Consider it an NRVO of the new age, required optimization. 5. If we extend opInc/opDec to take an argument, the compiler may go further and batch up multiple opInc-s and opDec-s, as long as it's safe to do so (e.g. there could be exceptions thrown!): Consider: auto a = File("some-file.txt"); //pass to some structs for future use B b = B(a); C c = C(a); a = File("other file"); May be (this is overly simplified!): File a = void, b = void, c = void; a = File.user_ctor("some-file.txt")' a.opInc(2); b = B(a); c = C(a); a = File.user_ctor("other file"); a.opInc();I believe we can achieve the same efficiency without ARC with the help of borrowing and multiple alias this. Consider the cases where inc/dec can be elided: RC!int a; // ... foo(a); // ... bar(a); // ... Under the assumption that foo() and bar() don't want to keep a copy of their arguments, this is a classical use case for borrowing. No inc/dec is necessary, and none will happen, if RC!int has an alias-this-ed method returning a scoped reference to its payload. On the other hand, foo() and bar() could want to make copies of the refcounted variable. In this case, we still wouldn't need an inc/dec, but we need a way to express that. 
The solution is another alias-this-ed method that returns a (scoped) BorrowedRC!int, which does not inc/dec on construction/destruction, but does so on copying. (It's probably possible to reuse RC!int for this, a separate type is likely not necessary.) The other opportunity is on moving: void foo() { RC!int a; // .... bar(a); // last statement in foo() } Here, clearly `a` isn't used after the tail call. Instead of copy & destroy, the compiler can resort to a move (bare bitcopy). In contrast to C++, this is allowed in D. This covers most opportunities for elision of the ref counting. It only leaves a few corner cases (e.g. `a` no longer used after non-tail calls, accumulated inc/dec as in your example). I don't think these are worth complicating the compiler with ARC.
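A small illustration of the move point (a sketch with made-up names Counted/make; whether the copy is actually elided is up to the implementation, but dmd typically performs NRVO for this pattern):

    import std.stdio : writeln;

    struct Counted
    {
        static int copies;
        this(this) { ++copies; }   // counts postblit (i.e. copy) calls
    }

    Counted make()
    {
        Counted c;
        return c;                  // eligible for a move / NRVO, not a copy
    }

    void main()
    {
        auto a = make();
        writeln(Counted.copies);   // typically 0: no copy+destroy pair happened
    }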
Sep 27 2014
On Saturday, 27 September 2014 at 10:23:20 UTC, Marc Schütz wrote:On the other hand, foo() and bar() could want to make copies of the refcounted variable. In this case, we still wouldn't need an inc/dec, but we need a way to express that. The solution is another alias-this-ed method that returns a (scoped) BorrowedRC!int, which does not inc/dec on construction/destruction, but does so on copying. (It's probably possible to reuse RC!int for this, a separate type is likely not necessary.)Yepp, it's possible, it turned out to work quite naturally: http://wiki.dlang.org/User:Schuetzm/scope#Reference_counting
Sep 27 2014
27-Sep-2014 14:23, "Marc Schütz" <schuetzm gmx.net>" пишет:On Saturday, 27 September 2014 at 09:38:35 UTC, Dmitry Olshansky wrote:You must be missing something big. Ref-counting ain't singular thing, it's a strategy with a multitude of implementations, see my other post.The good thing w.r.t. to memory about structs - they are themselves already allocated "somewhere", and it's only ref-counted payload that is allocated and destroyed in a user-defined way. And now for the killer reasons to go for struct is the following: Compiler _already_ does all of life-time management and had numerous bug fixes to make sure it does the right thing. In contrast there is nothing for classes that tracks their lifetimes to call proper hooks.This cannot be stressed enough.Let's REUSE that mechanism we have with structs and go as lightly as possible on untested LOCs budget. Full outline, of generic to the max, dirt-cheap implementation with a bit of lowering: ARC or anything close to it, is implemented as follows: 1. Any struct that have ARC attached, must have the following methods: void opInc(); bool opDec(); // true - time to destroy It also MUST NOT have postblit, and MUST have destructor. 2. Compiler takes user-defined destructor and creates proper destructor, as equivalent of this: if(opDec()){ user__defined_dtor; } 3. postblit is defined as opInc(). 4. any ctor has opInc() appended to its body. Everything else is taken care of by the very nature of the structs.AFAICS we don't gain anything from this, because it just automates certain things that can already be done manually in a suitably implemented wrapper struct. I don't think automation is necessary here, because realistically, how many RC wrappers will there be? Ideally just one, in Phobos.Problem is - there is no borrowing yet in the compiler, or maybe you mean something more simple.Now this is enough to make ref-counted stuff a bit simpler to write but not much beyond. So here the next "consequences" that we can then implement: 4. Compiler is expected to assume anywhere in fully inlined code, that opInc()/opDec() pairs are no-op. It should do so even in debug mode (though there is less opportunity to do so without inlining). Consider it an NRVO of the new age, required optimization. 5. If we extend opInc/opDec to take an argument, the compiler may go further and batch up multiple opInc-s and opDec-s, as long as it's safe to do so (e.g. there could be exceptions thrown!): Consider: auto a = File("some-file.txt"); //pass to some structs for future use B b = B(a); C c = C(a); a = File("other file"); May be (this is overly simplified!): File a = void, b = void, c = void; a = File.user_ctor("some-file.txt")' a.opInc(2); b = B(a); c = C(a); a = File.user_ctor("other file"); a.opInc();I believe we can achieve the same efficiency without ARC with the help of borrowing and multiple alias this.Consider the cases where inc/dec can be elided: RC!int a; // ... foo(a); // ... bar(a); // ... Under the assumption that foo() and bar() don't want to keep a copy of their arguments, this is a classical use case for borrowing. No inc/dec is necessary, and none will happen, if RC!int has an alias-this-ed method returning a scoped reference to its payload.Interesting. However scope must work first, also passing an RC!int by borrowing is this: void func(scope(A) a) or what? how does it transform scope? (Sorry I haven't followed your proposals for scope)On the other hand, foo() and bar() could want to make copies of the refcounted variable. 
In this case, we still wouldn't need an inc/dec, but we need a way to express that. The solution is another alias-this-ed method that returns a (scoped) BorrowedRC!int, which does not inc/dec on construction/destruction, but does so on copying. (It's probably possible to reuse RC!int for this, a separate type is likely not necessary.)Who would make sure original RC still exists?The other opportunity is on moving: void foo() { RC!int a; // .... bar(a); // last statement in foo() }We should already have it with structs by their nature.Here, clearly `a` isn't used after the tail call. Instead of copy & destroy, the compiler can resort to a move (bare bitcopy). In contrast to C++, this is allowed in D. This covers most opportunities for elision of the ref counting. It only leaves a few corner cases (e.g. `a` no longer used after non-tail calls, accumulated inc/dec as in your example). I don't think these are worth complicating the compiler with ARC.I don't mind having working scope and borrowing but my proposal doesn't require them. -- Dmitry Olshansky
Sep 27 2014
On Saturday, 27 September 2014 at 11:54:45 UTC, Dmitry Olshansky wrote:27-Sep-2014 14:23, "Marc Schütz" <schuetzm gmx.net>" пишет:Ok, you're right about the different possible implementations. Still, your proposal mostly seems to propose separate functions opInc()/opDec() for operations that can just be placed in the wrappers' constructor, destructor and postblit, with no repetition. (opInitRefcount() is missing in you proposal, strictly speaking; an automatic call to opInc() isn't enough, because initialization could be arbitrarily complex.) The one thing your proposal enables is wrapper-less reference counting. Is that important?AFAICS we don't gain anything from this, because it just automates certain things that can already be done manually in a suitably implemented wrapper struct. I don't think automation is necessary here, because realistically, how many RC wrappers will there be? Ideally just one, in Phobos.You must be missing something big. Ref-counting ain't singular thing, it's a strategy with a multitude of implementations, see my other post.No, I'm talking about my proposal.I believe we can achieve the same efficiency without ARC with the help of borrowing and multiple alias this.Problem is - there is no borrowing yet in the compiler, or maybe you mean something more simple.That's how it will look from the user's point of view. The RC wrapper's implementer will need to add an alias this that returns the payload with scope: struct RC(T) { private T payload_; scope!this(T) borrow() { return payload_; } alias borrow this; } It is then implicitly convertible to scope(T);Consider the cases where inc/dec can be elided: RC!int a; // ... foo(a); // ... bar(a); // ... Under the assumption that foo() and bar() don't want to keep a copy of their arguments, this is a classical use case for borrowing. No inc/dec is necessary, and none will happen, if RC!int has an alias-this-ed method returning a scoped reference to its payload.Interesting. However scope must work first, also passing an RC!int by borrowing is this: void func(scope(A) a) or what? how does it transform scope? (Sorry I haven't followed your proposals for scope)How could it not exist? If it existed immediately before the function call, it will also exist during the call. It can not have gone out of scope before the function returned. This is already true without borrowing. Borrowing additionally ensures that it even works in special cases, e.g. when the called function makes a copy of the wrapper, and that there will be no references left when the function returns (that is, no references to the wrapper, not the payload).On the other hand, foo() and bar() could want to make copies of the refcounted variable. In this case, we still wouldn't need an inc/dec, but we need a way to express that. The solution is another alias-this-ed method that returns a (scoped) BorrowedRC!int, which does not inc/dec on construction/destruction, but does so on copying. (It's probably possible to reuse RC!int for this, a separate type is likely not necessary.)Who would make sure original RC still exists?Exactly. What I want to say is that here, again, the compiler doesn't need to know that `a` does reference counting in order to generate efficient code. No RC specific optimizations are necessary, thus teaching the compiler about RC doesn't have advantages in this respect.The other opportunity is on moving: void foo() { RC!int a; // .... 
bar(a); // last statement in foo() }We should already have it with structs by their nature.Well, implement RC in the compiler, and you still have no borrowing. But implement borrowing, and you get efficient RC for free, in addition to all the other nice things it allows.Here, clearly `a` isn't used after the tail call. Instead of copy & destroy, the compiler can resort to a move (bare bitcopy). In contrast to C++, this is allowed in D. This covers most opportunities for elision of the ref counting. It only leaves a few corner cases (e.g. `a` no longer used after non-tail calls, accumulated inc/dec as in your example). I don't think these are worth complicating the compiler with ARC.I don't mind having working scope and borrowing but my proposal doesn't require them.
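For readers without the wiki page handy, a rough approximation of the borrowed-view idea in today's D (a sketch only: real borrowing needs `scope` checking, which this cannot enforce; Borrowed, borrow and foo are made-up names, and the wrapper is assumed to expose a `ref T get()` like the refcounted sketches earlier in the thread):

    struct Borrowed(T)
    {
        private T* payload;
        @disable this(this);           // a borrow must not be copied around
        ref T get() { return *payload; }
        alias get this;
    }

    // Hand out a view of any refcounted wrapper exposing `ref T get()`,
    // without touching its reference count.
    auto borrow(RC)(ref RC rc)
    {
        return Borrowed!(typeof(rc.get()))(&rc.get());
    }

    void foo(Borrowed!int b)           // callee borrows: no inc/dec on this call
    {
        ++b.get();                     // reads and writes go straight to the payload
    }
    // usage: foo(borrow(someRcInt));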
Sep 28 2014
On 9/27/14, 2:38 AM, Dmitry Olshansky wrote:Okay it serves no good for me to make these tiny comments while on the go. As usual, structs are value types, so this feature can be mis-used, no two thoughts abouts it. It may need a bit of improvement in user-friendliness, compiler may help there by auto-detecting common misuse.I still don't understand what "this feature" is after reading your long post twice. So structs are still value types and you replace postblit/destroy with calls to opInc/opDec? That's it? How does this enable anything more interesting than ctors/dtors? Andrei
Sep 27 2014
27-Sep-2014 23:14, Andrei Alexandrescu пишет:On 9/27/14, 2:38 AM, Dmitry Olshansky wrote:Compiler is aware that opInc and opDec are indeed ref-countinng ops, meaning that opInc + opDec = no op. I claim that this is enough to get "ARC" going. -- Dmitry OlshanskyOkay it serves no good for me to make these tiny comments while on the go. As usual, structs are value types, so this feature can be mis-used, no two thoughts abouts it. It may need a bit of improvement in user-friendliness, compiler may help there by auto-detecting common misuse.I still don't understand what "this feature" is after reading your long post twice. So structs are still value types and you replace postblit/destroy with calls to opInc/opDec? That's it? How does this enable anything more interesting than ctors/dtors?
Sep 27 2014
On 9/27/14, 12:50 PM, Dmitry Olshansky wrote:27-Sep-2014 23:14, Andrei Alexandrescu пишет:You give marginal details but still don't describe the thing. When are they called and what do they have that ctors/dtors/postblit don't? FWIW the language always "understands" when to elide postblit/dtor calls. AndreiOn 9/27/14, 2:38 AM, Dmitry Olshansky wrote:Compiler is aware that opInc and opDec are indeed ref-countinng ops, meaning that opInc + opDec = no op. I claim that this is enough to get "ARC" going.Okay it serves no good for me to make these tiny comments while on the go. As usual, structs are value types, so this feature can be mis-used, no two thoughts abouts it. It may need a bit of improvement in user-friendliness, compiler may help there by auto-detecting common misuse.I still don't understand what "this feature" is after reading your long post twice. So structs are still value types and you replace postblit/destroy with calls to opInc/opDec? That's it? How does this enable anything more interesting than ctors/dtors?
Sep 27 2014
28-Sep-2014 00:16, Andrei Alexandrescu пишет:On 9/27/14, 12:50 PM, Dmitry Olshansky wrote:Must be my fault, I'll try once again.27-Sep-2014 23:14, Andrei Alexandrescu пишет:On 9/27/14, 2:38 AM, Dmitry Olshansky wrote:Okay it serves no good for me to make these tiny comments while on the go. As usual, structs are value types, so this feature can be mis-used, no two thoughts abouts it. It may need a bit of improvement in user-friendliness, compiler may help there by auto-detecting common misuse.I still don't understand what "this feature" is after reading your long post twice.The key point is that the change is small, that's why (maybe) it's hard to grasp. The whole thing is a bit of lowering and a "hint" to the compiler. It reuses the same mechanism that structs already have. Okay a few examples of lowering to get things going: http://dpaste.dzfl.pl/3722d9d70937 (note I think there could be better lowerings and simpler set of primitives to bootstrap ARC-ed type) Now why would we need this trivial lowering? Typical postblit can be anything unless the compiler has full source code, dtor can be anything as well. With opInc/opDec compiler generates postblit/dtor on his own, in doing so it decorates user-defined dtor that actually clears resources. What being in control gives to the compiler: 1. Compiler always has the source of generated parts, so they can be inlined (and should be) 2. Can do typical algebra optimization on opInc/opDec, no matter what's inside opInc and opDec (this is a contract between programmer and compiler). e.g opInc(10) followed by OpDec(1) is opInc(9) 3. Also opInc and opDec do not alter object in any capacity, nor are they affected by any method calls on this object (another contract) 4. 1 + 2 = win, as by inlining postblits/dtors we expose opInc/opDecs to the optimizer pass which the would fold them using basic algebra optimizations. Motivating example: struct RC(T) { ... } //implemented with or without proposed feature void foo(RC!int arg); // no source available, sorry;) void bar(RC!int my) { foo(my); writeln("Bar: ", *my); } which is the same as { { auto __tmp = my; // a postblit foo(__tmp); }//dtor writeln("Bar: ", *my); } And assuming the same names for inc/dec of count and no exceptions, can be simplified back to: { { auto __tmp = my; my.opInc(1); // or __tmp.opInc() it's the same count foo(__tmp); if(my.opDec(1)) my.__dtor(); } writeln("Bar: ", *my); } Ideally with assumptions I stated above will look like this: { { auto __tmp = my; foo(__tmp); if(my.opDec(0)) my.__dtor(); } writeln("Bar: ", *my); } Now here if compiler can optimize ref-count operations completely on his own without assumptions stated above then we are done and no special opInc/opDec required. I bet my hat it can't for the simple reason that all bets are off if any of the following is happening: 1. No source available thus e.g. postblit is opaque call 2. Not "everything" in between opInc/opDec can be inlined impairing optimizer (since 'my' is passed to foo, it may be modified) I might be wrong though - Walter?You give marginal details but still don't describe the thing. When are they called and what do they have that ctors/dtors/postblit don't?So structs are still value types and you replace postblit/destroy with calls to opInc/opDec? That's it? How does this enable anything more interesting than ctors/dtors?Compiler is aware that opInc and opDec are indeed ref-countinng ops, meaning that opInc + opDec = no op. 
I claim that this is enough to get "ARC" going.FWIW the language always "understands" when to elide postblit/dtor calls.Yes, which immediately gives us part of ARC advantages such as NRVO elides ref-count bump. The other part comes from batching up multiple incs/decs to one call and eliminating redundant pairs. I might be wrong but RCObject proposal is harder to construct then this "feather-weight ARC" as it would have to "force" the same lifetime management as we have for structs on some class instances that never had it before! That being said I think it's valuable to have RCObject and exceptions being ref-counted. -- Dmitry Olshansky
Sep 28 2014
On 9/28/14, 3:40 AM, Dmitry Olshansky wrote:28-Sep-2014 00:16, Andrei Alexandrescu пишет: The key point is that the change is small, that's why (maybe) it's hard to grasp.I think I get it now. It has a huge overlap with stuff that we already have. That doesn't bode very well!The whole thing is a bit of lowering and a "hint" to the compiler. It reuses the same mechanism that structs already have. Okay a few examples of lowering to get things going: http://dpaste.dzfl.pl/3722d9d70937 (note I think there could be better lowerings and simpler set of primitives to bootstrap ARC-ed type) Now why would we need this trivial lowering? Typical postblit can be anything unless the compiler has full source code, dtor can be anything as well. With opInc/opDec compiler generates postblit/dtor on his own, in doing so it decorates user-defined dtor that actually clears resources. What being in control gives to the compiler: 1. Compiler always has the source of generated parts, so they can be inlined (and should be)That's a given if the body of ctor/postblit/dtor is available. I don't see this as an important distinction.2. Can do typical algebra optimization on opInc/opDec, no matter what's inside opInc and opDec (this is a contract between programmer and compiler). e.g opInc(10) followed by OpDec(1) is opInc(9)That typical algebra optimization is already doable post inlining without a language feature. Compilers know integer arithmetic.3. Also opInc and opDec do not alter object in any capacity, nor are they affected by any method calls on this object (another contract)....?4. 1 + 2 = win, as by inlining postblits/dtors we expose opInc/opDecs to the optimizer pass which the would fold them using basic algebra optimizations.I don't think this is a feature worth adding. Andrei
Sep 28 2014
28-Sep-2014 16:42, Andrei Alexandrescu пишет:On 9/28/14, 3:40 AM, Dmitry Olshansky wrote:[snip]I don't think this is a feature worth adding.Fair enough. Personally I suspect compilers would have to go a long way yet to optimize away ref-counts decently enough without hints. -- Dmitry Olshansky
Sep 28 2014
On 9/22/14, 11:47 PM, Manu via Digitalmars-d wrote:On 23 September 2014 16:19, deadalnix via Digitalmars-d <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote: On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote: I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling. I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option. Right, that's pretty much how I imagined it too. Like ranges, where foreach makes implicit calls to contractual methods, there would also be a contract for refcounted objects, and the compiler will emit implicit calls to inc/dec if they exist? That should eliminate 'RefCounted', you would only need to provide opInc()/opDec() and rc fiddling calls would be generated automatically? Then we can preserve the type of things, rather than obscuring them in layers of wrapper templates...That won't work. Sorry, it has too many holes to enumerate! -- Andrei
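For reference, the range analogy drawn above: `foreach` over a range lowers to calls on contractual members, roughly as in the sketch below (eachLowered is a made-up name); the idea under discussion was that copies and destruction would similarly lower to opInc()/opDec() where those members exist.

    // Roughly what the compiler emits for `foreach (e; r)` over an input range.
    void eachLowered(R)(R r)
    {
        for (auto __r = r; !__r.empty; __r.popFront())
        {
            auto e = __r.front;
            // ... loop body ...
        }
    }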
Sep 23 2014
On 9/22/2014 11:19 PM, deadalnix wrote:I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option.Intrinsics are unnecessary. The compiler is perfectly capable of recognizing certain code patterns and replacing them with single instructions.
Sep 24 2014
On Tuesday, 23 September 2014 at 06:19:58 UTC, deadalnix wrote:On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote:Yes, inc/dec intrinsic is needed to support TSX. I.e. You dont have to inc/dec to keep the object alive within a transaction, you only need to read something on the same cacheline as the ref count. Essentially zero overhead in many cases afaik.I still think most of those users would accept RC instead of GC. Why not support RC in the language, and make all of this library noise redundant? Library RC can't really optimise well, RC requires language support to elide ref fiddling.I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option.
Sep 25 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:http://dpaste.dzfl.pl/817283c163f5The test on line 267 fails on a 32-bit build: rcstring.d(267): Error: cannot implicitly convert expression (38430716820228232L) of type long to uint Hosting it as a Gist on Github[1] might be an idea, as then the same link will be relevant after the code is updated, and people can post line comments. It doesn't support building and running the code online, but dpaste.dzfl.pl's old FE version (2.065) doesn't support the code anyway. [1] https://gist.github.com/
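For context, the usual shape of a fix for this class of 32-bit breakage (a hedged sketch, not the actual rcstring.d line; mixHash is a made-up helper): keep the arithmetic in size_t and pick any 64-bit-only constants per platform instead of writing them as bare long literals.

    // Illustrative only: a 64-bit constant cannot be stored in a 32-bit size_t,
    // so select it with static if instead of forcing a long literal through.
    size_t mixHash(size_t seed, size_t v)
    {
        static if (size_t.sizeof == 8)
            enum size_t k = 0x9e3779b97f4a7c15;
        else
            enum size_t k = 0x9e3779b9;
        return (seed ^ v) * k;
    }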
Sep 15 2014
Am Mon, 15 Sep 2014 15:34:53 +0000 schrieb "Jakob Ovrum" <jakobovrum gmail.com>:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:https://issues.dlang.org/show_bug.cgi?id=5063 >.< -- Marcohttp://dpaste.dzfl.pl/817283c163f5The test on line 267 fails on a 32-bit build: rcstring.d(267): Error: cannot implicitly convert expression (38430716820228232L) of type long to uint
Sep 15 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:The road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of a almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice: http://dpaste.dzfl.pl/817283c163f5So slicing an RCString doesn't increment its refcount?
Sep 15 2014
On 9/15/14, 11:21 AM, Sean Kelly wrote:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:It does. -- AndreiThe road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of a almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice: http://dpaste.dzfl.pl/817283c163f5So slicing an RCString doesn't increment its refcount?
Sep 15 2014
On Monday, 15 September 2014 at 18:39:09 UTC, Andrei Alexandrescu wrote:On 9/15/14, 11:21 AM, Sean Kelly wrote:Oops, I was looking at the opSlice for Large, not RCString.On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:It does. -- AndreiThe road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of a almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice: http://dpaste.dzfl.pl/817283c163f5So slicing an RCString doesn't increment its refcount?
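For readers following along, a hedged sketch (RCSlice is a made-up name, not the actual RCString/Large internals) of why slicing bumps the count: opSlice hands back another counted view of the same buffer, and constructing that view runs the ordinary copy machinery.

    import core.stdc.stdlib : free;

    struct RCSlice
    {
        private char* buf;       // shared heap buffer
        private size_t* count;   // its reference count
        private size_t lo, hi;   // this view's window into the buffer

        this(this) { if (count) ++*count; }
        ~this()
        {
            if (count && --*count == 0) { free(buf); free(count); }
        }

        RCSlice opSlice(size_t a, size_t b)
        {
            auto s = this;        // postblit fires here: +1 for the new view
            s.lo = lo + a;
            s.hi = lo + b;
            return s;
        }
    }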
Sep 15 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:Walter, Brad, myself, and a couple of others have had a couple of quite exciting ideas regarding code that is configurable to use the GC or alternate resource management strategies. One thing that became obvious to us is we need to have a reference counted string in the standard library. That would be usable with applications that want to benefit from comfortable string manipulation whilst using classic reference counting for memory management. I'll get into more details into the mechanisms that would allow the stdlib to provide functionality for both GC strings and RC strings; for now let's say that we hope and aim for swapping between these with ease. We hope that at one point people would be able to change one line of code, rebuild, and get either GC or RC automatically (for Phobos and their own code).Ironically, strings have been probably least of my GC-related issues with D so far - hard to evaluate applicability of this proposal because of that. What are typical use cases for such solution? (not questioning its importance, just being curious)
Sep 17 2014
On 9/17/14, 9:30 AM, Dicebot wrote:Ironically, strings have been probably least of my GC-related issues with D so far - hard to evaluate applicability of this proposal because of that. What are typical use cases for such solution? (not questioning its importance, just being curious)Simplest is "I want to use D without a GC and suddenly the string support has fallen down to bear claws and silex stones." RCString should be a transparent (or at least near-transparent) replacement for string in GC-less environments. Andrei
Sep 17 2014
On Wednesday, 17 September 2014 at 16:32:41 UTC, Andrei Alexandrescu wrote:On 9/17/14, 9:30 AM, Dicebot wrote:I think the biggest gc=(partially?)off customers are game makers: http://forum.dlang.org/thread/k27bh7$t7f$1 digitalmars.com (check especially the bottom of the 6th page) Random quote: "I created a reference counted array which is as close to the native D array as currently possible (compiler bugs, type system issues, etc). also in core.refcounted. It however does not replace the default string or array type in all cases because it would lead to reference counting in uneccessary places. The focus is to get only reference couting where absolutly neccessary. I'm still using the standard string type as a "only valid for current scope" kind of string." And my fav: "- You most likely won't like the way I implemented reference counting" I hope Benjamin Thaut can share his viewpoint on the topic if he is still around. PiotrekIronically, strings have been probably least of my GC-related issues with D so far - hard to evaluate applicability of this proposal because of that. What are typical use cases for such solution? (not questioning its importance, just being curious)Simplest is "I want to use D without a GC and suddenly the string support has fallen down to bear claws and silex stones." RCString should be a transparent (or at least near-transparent) replacement for string in GC-less environments. Andrei
Sep 17 2014
On Wednesday, 17 September 2014 at 16:32:41 UTC, Andrei Alexandrescu wrote:On 9/17/14, 9:30 AM, Dicebot wrote:Well this is exactly what I don't understand. Strings we have don't have any strong connection to GC (apart from concatenation which can be verified by nogc) being just slices to some external buffer. That buffer can be malloc'ed or stack allocated, that doesn't really affect most string processing algorithms, not unless those try to do some re-allocation of their own. I agree that pipeline approach does not work that well for complex programs in general but strings seem to be best match to it - either you want read-only access or a pipe-line, everything else feels inefficient as amount of write operations gets out of control. Every single attempt to do something clever with shared CoW strings in C++ I have met was a total failure. That is why I wonder - what kind of applications really need the rcstring as opposed to some generic rcarray?Ironically, strings have been probably least of my GC-related issues with D so far - hard to evaluate applicability of this proposal because of that. What are typical use cases for such solution? (not questioning its importance, just being curious)Simplest is "I want to use D without a GC and suddenly the string support has fallen down to bear claws and silex stones." RCString should be a transparent (or at least near-transparent) replacement for string in GC-less environments.
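A small illustration of that point (a sketch): an ordinary slice works fine over malloc'ed memory, and @nogc processing of it needs no GC at all; the question raised in the reply below is only who frees the buffer, and when, once such slices start escaping.

    @nogc void main()
    {
        import core.stdc.stdlib : malloc, free;
        import core.stdc.string : memcpy;

        static immutable src = "hello, world";
        auto buf = (cast(char*) malloc(src.length))[0 .. src.length];
        scope(exit) free(buf.ptr);
        memcpy(buf.ptr, src.ptr, src.length);

        const(char)[] s = buf;      // an ordinary slice over malloc'ed memory
        size_t n;
        foreach (c; s) if (c == 'l') ++n;
        assert(n == 3);             // read-only processing, no GC anywhere
    }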
Sep 19 2014
On 9/19/14, 3:32 AM, Dicebot wrote:On Wednesday, 17 September 2014 at 16:32:41 UTC, Andrei Alexandrescu wrote:It does affect management, i.e. you don't know when to free the buffer if slices are unaccounted for. So the design of slices are affected as much as that of the buffer.On 9/17/14, 9:30 AM, Dicebot wrote:Well this is exactly what I don't understand. Strings we have don't have any strong connection to GC (apart from concatenation which can be verified by nogc) being just slices to some external buffer. That buffer can be malloc'ed or stack allocated, that doesn't really affect most string processing algorithms, not unless those try to do some re-allocation of their own.Ironically, strings have been probably least of my GC-related issues with D so far - hard to evaluate applicability of this proposal because of that. What are typical use cases for such solution? (not questioning its importance, just being curious)Simplest is "I want to use D without a GC and suddenly the string support has fallen down to bear claws and silex stones." RCString should be a transparent (or at least near-transparent) replacement for string in GC-less environments.I agree that pipeline approach does not work that well for complex programs in general but strings seem to be best match to it - either you want read-only access or a pipe-line, everything else feels inefficient as amount of write operations gets out of control. Every single attempt to do something clever with shared CoW strings in C++ I have met was a total failure.What were the issues?That is why I wonder - what kind of applications really need the rcstring as opposed to some generic rcarray?I started with rcstring because (a) it's easier to lift off the ground - no worries about construction/destruction of elements etc. and (b) it's frequent enough to warrant some good testing. Of course there'll be an rcarray!T as well. Andrei
Sep 19 2014
On Friday, 19 September 2014 at 15:09:41 UTC, Andrei Alexandrescu wrote:It does affect management, i.e. you don't know when to free the buffer if slices are unaccounted for. So the design of slices are affected as much as that of the buffer.I see where you are going at. A bit hard to imagine how it fits the big picture when going bottom-up though but I trust you on this :)Usually it went that way: 1) Get basic implementation, become shocked how slow it is because of redundant reference increments/decrements and thread safety 2) Add speed-up hacks to avoid reference count amending when considered unnecessary 3) Get hit by a snowball of synchronization / double-free issues and abandon the idea completely after months of debugging. Of course those weren't teams of rock-star programmers but at the same time more "stupid" approach with making extra copies and putting extra effort into defining strict linear ownership chain seemed to work much better.I agree that pipeline approach does not work that well for complex programs in general but strings seem to be best match to it - either you want read-only access or a pipe-line, everything else feels inefficient as amount of write operations gets out of control. Every single attempt to do something clever with shared CoW strings in C++ I have met was a total failure.What were the issues?Thanks for explanation :) Well, I am curious how will it turn out but a bit skeptical right now.That is why I wonder - what kind of applications really need the rcstring as opposed to some generic rcarray?I started with rcstring because (a) it's easier to lift off the ground - no worries about construction/destruction of elements etc. and (b) it's frequent enough to warrant some good testing. Of course there'll be an rcarray!T as well.
Sep 20 2014
On 9/20/14, 12:42 AM, Dicebot wrote:On Friday, 19 September 2014 at 15:09:41 UTC, Andrei Alexandrescu wrote:I understand. RC strings will work just fine. Compared to interlocked approaches we're looking at a 5x improvement in RC speed for the most part because we can dispense with most interlocking. -- AndreiUsually it went that way: 1) Get basic implementation, become shocked how slow it is because of redundant reference increments/decrements and thread safety 2) Add speed-up hacks to avoid reference count amending when considered unnecessary 3) Get hit by a snowball of synchronization / double-free issues and abandon the idea completely after months of debugging.as amount of write operations gets out of control. Every single attempt to do something clever with shared CoW strings in C++ I have met was a total failure.What were the issues?
Sep 20 2014
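To make the figure above concrete, here is a minimal sketch (invented names, not from the rcstring code) of the difference between a thread-local and an interlocked reference-count bump; the claimed win comes from dropping the atomic read-modify-write on every copy and destroy.

    import core.atomic : atomicOp;

    struct LocalRC   // payload never leaves its thread: plain increments suffice
    {
        size_t count;
        void addRef() { ++count; }                          // one ALU instruction
        bool release() { return --count == 0; }
    }

    struct SharedRC  // payload shared across threads: every bump is interlocked
    {
        shared size_t count;
        void addRef() { atomicOp!"+="(count, 1); }          // lock-prefixed read-modify-write
        bool release() { return atomicOp!"-="(count, 1) == 0; }
    }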
On Saturday, 20 September 2014 at 15:30:55 UTC, Andrei Alexandrescu wrote:I understand. RC strings will work just fine. Compared to interlocked approaches we're looking at a 5x improvement in RC speed for the most part because we can dispense with most interlocking. -- AndreiCan someone explain why? Since fibers can travel between threads, they will also be able to leak objects to different threads.
Sep 20 2014
20-Sep-2014 21:55, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" пишет:On Saturday, 20 September 2014 at 15:30:55 UTC, Andrei Alexandrescu wrote:Not spontaneously :) You'd have to cast to shared and back, and then you are on your own. Fiber is thread-local, shared(Fiber) isn't. -- Dmitry OlshanskyI understand. RC strings will work just fine. Compared to interlocked approaches we're looking at a 5x improvement in RC speed for the most part because we can dispense with most interlocking. -- AndreiCan someone explain why? Since fibers can travel between threads, they will also be able to leak objects to different threads.
Sep 21 2014
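For reference, the type-system point being made: std.concurrency will not accept a bare Fiber as a message, so handing one to another thread takes an explicit cast to shared and back, after which correctness is entirely on the programmer. A rough sketch of just that hoop (whether the runtime is then happy to resume a particular fiber on another thread is a separate, platform-dependent question):

    import core.thread : Fiber;
    import std.concurrency : receiveOnly, send, spawn;

    void worker()
    {
        // Strip shared again before use - from here on, safety is manual.
        auto f = cast(Fiber) receiveOnly!(shared Fiber)();
        f.call();
    }

    void main()
    {
        auto f = new Fiber(() { /* some work */ });
        auto tid = spawn(&worker);
        // send(tid, f) would be rejected: a bare Fiber is thread-local data.
        send(tid, cast(shared) f);
    }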
On Sunday, 21 September 2014 at 08:24:46 UTC, Dmitry Olshansky wrote:Not spontaneously :) You'd have to cast to shared and back, and then you are on your own. Fiber is thread-local, shared(Fiber) isn't.That will have to change if Go is a target. To get full load you need to let fibers move freely between threads I think. Go also check fiber stack size... But maybe Go should not be considered a target.
Sep 21 2014
On Sunday, 21 September 2014 at 09:06:57 UTC, Ola Fosheim Grostad wrote:On Sunday, 21 September 2014 at 08:24:46 UTC, Dmitry Olshansky wrote:It doesn't ring true to me. For several reasons: 1) Go doesn't seem to be a target right now. There have been certain examples that show D is capable of beating Go in its own domain (see Atila MQTT broker articles). It may change later but there have been no experimental confirmations that their approach is better by design. 2) For good CPU load distribution moving of tasks is likely to be needed indeed but it is not necessarily the same thing as moving fibers and definitely not all need to be moved. I like that vibe.d goes forward with this by defining its own `Task` abstraction on top of fibers. Thus this is something that belongs to a specific task scheduler/manager and not to the basic Fiber implementation.Not spontaneously :) You'd have to cast to shared and back, and then you are on your own. Fiber is thread-local, shared(Fiber) isn't.That will have to change if Go is a target. To get full load you need to let fibers move freely between threads I think. Go also check fiber stack size... But maybe Go should not be considered a target.
Sep 21 2014
21-Sep-2014 13:06, Ola Fosheim Grostad writes:On Sunday, 21 September 2014 at 08:24:46 UTC, Dmitry Olshansky wrote:Not spontaneously :) You'd have to cast to shared and back, and then you are on your own. Fiber is thread-local, shared(Fiber) isn't.That will have to change if Go is a target.Go is not a target. The fixed concurrency model they have is not the silver bullet.To get full load you need to let fibers move freely between threads I think.Why? The only thing required is scheduling by passing new work-item (a fiber) to the least loaded thread (or some other strategy). Keeping thread affinity of Fiber is a boon: you get to use non-atomic ref-counting and have far less cache pollution (the set of fibers to switch over is consistent).Go also check fiber stack size... But maybe Go should not be considered a target.??? Just reserve more space. Even Go dropped segmented stack. What Go has to do with this discussion at all BTW? -- Dmitry Olshansky
Sep 21 2014
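As a toy illustration of the dispatch strategy described above (invented names; a real scheduler would use thread-safe queues and real load metrics), each new work item is handed once to the currently least-loaded worker and never migrates afterwards:

    import std.algorithm.searching : minIndex;

    struct Worker
    {
        size_t load;              // e.g. number of fibers currently parked here
        void delegate()[] inbox;  // stand-in for a thread-safe queue
    }

    // Pick the least-loaded worker once, at submission time; the fiber created
    // from `job` then stays on that thread, so its ref-counts never need atomics.
    void dispatch(Worker[] workers, void delegate() job)
    {
        const i = workers.minIndex!((a, b) => a.load < b.load);
        workers[i].inbox ~= job;
        workers[i].load++;
    }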
On Sunday, 21 September 2014 at 17:52:42 UTC, Dmitry Olshansky wrote:to use non-atomic ref-counting and have far less cache pollution (the set of fibers to switch over is consistent).Caches are not a big deal when you wait for io.Because that is what you are competing with in the webspace. Go checks and extends stacks.Go also check fiber stack size... But maybe Go should not be considered a target.??? Just reserve more space. Even Go dropped segmented stack. What Go has to do with this discussion at all BTW?
Sep 21 2014
On 21.09.2014 23:45, Ola Fosheim Grostad wrote:On Sunday, 21 September 2014 at 17:52:42 UTC, Dmitry Olshansky wrote:Since when Go is a competitor in the webspace? Of all the languages used to develop web applications, Go is not at the top of the list for most people. At least in the IT world I am part of. -- Paulo... ??? Just reserve more space. Even Go dropped segmented stack. What Go has to do with this discussion at all BTW?Because that is what you are competing with in the webspace. Go checks and extends stacks.
Sep 21 2014
On Sunday, 21 September 2014 at 22:58:59 UTC, Paulo Pinto wrote:Since when Go is a competitor in the webspace?Since people who create high throughput servers started using it?
Sep 21 2014
On 22.09.2014 01:06, Ola Fosheim Grostad wrote:On Sunday, 21 September 2014 at 22:58:59 UTC, Paulo Pinto wrote:Which people? A few Silicon Valley startups, besides Google? Around me I see such servers being written in Erlang, JVM and .NET languages, with the occasional drop to C++ when nothing else goes. -- PauloSince when Go is a competitor in the webspace?Since people who create high throughput servers started using it?
Sep 21 2014
On Sunday, 21 September 2014 at 23:28:31 UTC, Paulo Pinto wrote:Am 22.09.2014 01:06, schrieb Ola Fosheim Grostad:I am not keeping track, but e.g. https://www.cloudflare.com/railgunOn Sunday, 21 September 2014 at 22:58:59 UTC, Paulo Pinto wrote:Which people? A few Silicon Valley startups, besides Google?Since when Go is a competitor in the webspace?Since people who create high throughput servers started using it?Around me I see such servers being written in Erlang,Erlang would be another example.
Sep 21 2014
On Sunday, 21 September 2014 at 23:28:31 UTC, Paulo Pinto wrote:Am 22.09.2014 01:06, schrieb Ola Fosheim Grostad:Go fizzled inside google but granted has traction outside of google. Paulo stop feeding the troll for Petes sake.On Sunday, 21 September 2014 at 22:58:59 UTC, Paulo Pinto wrote:Which people? A few Silicon Valley startups, besides Google?Since when Go is a competitor in the webspace?Since people who create high throughput servers started using it?
Sep 21 2014
On Monday, 22 September 2014 at 02:34:00 UTC, Googler Lurker wrote:Go fizzled inside google but granted has traction outside of google. Paulo stop feeding the troll for Petes sake.Don't be such a coward, show your face and publish your real name. Your style and choice of words remind me of A.A. Do the man a favour and clear up this source of confusion. Locking fibers to threads will cost you more than using threadsafe features. One 300ms request can then starve waiting fibers even if you have 7 free threads. That's bad for latency, because then all fibers on that thread will get 300+ms in latency. How anyone can disagree with this is beyond me.
Sep 22 2014
22-Sep-2014 13:45, Ola Fosheim Grostad writes:Locking fibers to threads will cost you more than using threadsafe features. One 300ms request can then starve waiting fibers even if you have 7 free threads.This statement doesn't make any sense taken in isolation. It lacks way too much context to be informative. For instance, "locking a thread for 300ms" is easily averted if all I/O and blocking sys-calls are managed in a separate thread pool (that may grow far beyond fiber-scheduled "web" thread pool). And if "locked" means CPU-bound locked, then it's a) hard to fix without help from OS: re-scheduling a fiber without explicit yield ain't possible (it's cooperative, preemption is in the domain of OS). Something like Windows User-Mode Scheduling is required or user-mode threads a-la FreeBSD (haven't checked in a while?). b) If CPU-bound is happening more often than once in a while, then fibers are a poor fit anyway - threads (and pools of 'em) do exactly what's needed in this case by being natively preemptive and well suited for running multiple CPU intensive tasks.That's bad for latency, because then all fibers on that thread will get 300+ms in latency.E-hm locking threads to fibers and arbitrary latency figures have very little to do with each other. The nature of that latency is extremely important.How anyone can disagree with this is beyond me.IMHO poorly formed problem statements are not going to prove your point. Pardon me making a personal statement, but for instance showing how Go avoids your problem and clearly specifying the exact conditions that cause it would go a long way toward demonstrating whatever you wanted to. -- Dmitry Olshansky
Sep 22 2014
On Monday, 22 September 2014 at 19:58:31 UTC, Dmitry Olshansky wrote:22-Sep-2014 13:45, Ola Fosheim Grostad writes:If you process and compress a large dataset in one fiber you don't need rescheduling. You just need the scheduler to pick fibers according to priority regardless of origin thread.Locking fibers to threads will cost you more than using threadsafe features. One 300ms request can then starve waiting fibers even if you have 7 free threads.This statement doesn't make any sense taken in isolation. It lacks way too much context to be informative. For instance, "locking a thread for 300ms" is easily averted if all I/O and blocking sys-calls are managed in a separate thread pool (that may grow far beyond fiber-scheduled "web" thread pool). And if "locked" means CPU-bound locked, then it's a) hard to fix without help from OS: re-scheduling a fiber without explicit yield ain't possible (it's cooperative, preemption is in the domain of OS).b) If CPU-bound is happening more often than once in a while, then fibers are a poor fit anyway - threads (and pools of 'em) do exactly what's needed in this case by being natively preemptive and well suited for running multiple CPU intensive tasks.Not really the issue. Load comes in spikes, if you on average only have a couple of heavy fibers at the same time then you are fine. You can spawn more threads if needed, but that won't help if fibers are stuck on a slow thread.If you are in line behind a CPU-heavy fiber then you get that effect.That's bad for latency, because then all fibers on that thread will get 300+ms in latency.E-hm locking threads to fibers and arbitrary latency figures have very little to do with each other. The nature of that latency is extremely important.Any decent framework that is concerned about latency solves this the same way: light threads, or events, or whatever are not locked to a specific thread. Isolates are fine, but D does not provide it afaik.How anyone can disagree with this is beyond me.IMHO poorly formed problem statements are not going to prove your point. Pardon me making a personal statement, but for instance showing how Go avoids your problem and clearly specifying the exact conditions that cause it would go a long way toward demonstrating whatever you wanted to.
Sep 25 2014
Analysis of Go growth / usage. http://redmonk.com/dberkholz/2014/03/18/go-the-emerging-language-of-cloud-infrastructure/
Sep 25 2014
26-Sep-2014 06:49, Ola Fosheim Grostad writes:Analysis of Go growth / usage. http://redmonk.com/dberkholz/2014/03/18/go-the-emerging-language-of-cloud-infrastructure/Google was popular last time I heard, so is their language. -- Dmitry Olshansky
Sep 27 2014
25-Sep-2014 17:31, Ola Fosheim Grostad writes:On Monday, 22 September 2014 at 19:58:31 UTC, Dmitry Olshansky wrote:So do not. A large dataset is not something a single thread should do anyway, just post it to the "workers" thread pool and wait on that (by yielding). There is no FUNDAMENTAL problem.22-Sep-2014 13:45, Ola Fosheim Grostad writes:If you process and compress a large dataset in one fiber you don't need rescheduling. You just need the scheduler to pick fibers according to priority regardless of origin thread.Locking fibers to threads will cost you more than using threadsafe features. One 300ms request can then starve waiting fibers even if you have 7 free threads.This statement doesn't make any sense taken in isolation. It lacks way too much context to be informative. For instance, "locking a thread for 300ms" is easily averted if all I/O and blocking sys-calls are managed in a separate thread pool (that may grow far beyond fiber-scheduled "web" thread pool). And if "locked" means CPU-bound locked, then it's a) hard to fix without help from OS: re-scheduling a fiber without explicit yield ain't possible (it's cooperative, preemption is in the domain of OS).You are trying to change the issue itself. Load is a multitude of requests, we are speaking of a SINGLE one taking a lot of time. So load makes no difference here, we are talking of a DoS-ish kind of thing, not DDoS. And my postulate is as follows: as long as one request may take a long amount of time, there are going to be arbitrarily many such "long" requests in a row, esp. on public services, which everybody tries hard to abuse.b) If CPU-bound is happening more often than once in a while, then fibers are a poor fit anyway - threads (and pools of 'em) do exactly what's needed in this case by being natively preemptive and well suited for running multiple CPU intensive tasks.Not really the issue. Load comes in spikes, if you on average only have a couple of heavy fibers at the same time then you are fine. You can spawn more threads if needed, but that won't help if fibers are stuck on a slow thread.Well that's convenient I won't deny, but by itself it just patches up the problem and in a non-transparent way - oh, hey 10 requests are taking too much time, let's spawn an 11th thread. But - if some requests may take arbitrarily long to complete, just use a separate pool for heavy work, it's _better_ design and more resilient to "heavy" requests anyway.Aye, I just don't see myself doing hard work on a fiber. They are not meant to do that.If you are in line behind a CPU-heavy fiber then you get that effect.That's bad for latency, because then all fibers on that thread will get 300+ms in latency.E-hm locking threads to fibers and arbitrary latency figures have very little to do with each other. The nature of that latency is extremely important.They do not have thread-local by default. But anyway - ad populum.Any decent framework that is concerned about latency solves this the same way: light threads, or events, or whatever are not locked to a specific thread.How anyone can disagree with this is beyond me.IMHO poorly formed problem statements are not going to prove your point. Pardon me making a personal statement, but for instance showing how Go avoids your problem and clearly specifying the exact conditions that cause it would go a long way toward demonstrating whatever you wanted to.Isolates are fine, but D does not provide it afaik.Would you explain? -- Dmitry Olshansky
Sep 27 2014
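A rough sketch of the "post it to the workers pool and wait by yielding" pattern described above, assuming the request handler itself already runs inside a fiber; the std.parallelism names are real, the rest is invented:

    import core.thread : Fiber;
    import std.parallelism : task, taskPool;

    // Stand-in for the expensive, CPU-bound part of a request.
    ulong crunch(string payload)
    {
        ulong acc;
        foreach (c; payload) acc = acc * 131 + c;
        return acc;
    }

    // Assumed to run inside a fiber: the heavy part goes to the worker pool,
    // and the fiber yields until it is done, so the other fibers on this
    // thread keep getting scheduled.
    ulong handleRequest(string payload)
    {
        auto heavy = task!crunch(payload);
        taskPool.put(heavy);
        while (!heavy.done)
            Fiber.yield();
        return heavy.yieldForce;  // result is ready; this no longer blocks
    }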
On Monday, 22 September 2014 at 09:45:23 UTC, Ola Fosheim Grostad wrote:Locking fibers to threads will cost you more than using threadsafe features. One 300ms request can then starve waiting fibers even if you have 7 free threads. That's bad for latency, because then all fibers on that thread will get 300+ms in latency.I don't understand what you're getting at. Nothing in D locks fibers to threads. In fact, the MultiScheduler I'm going to write if the original Scheduler pull request is ever accepted will not work this way. Granted, that means that use of thread-local storage will be utterly broken instead of mostly broken, but I think it's a fair exchange for not having a single long-running fiber block other fibers.
Sep 23 2014
22-Sep-2014 01:45, Ola Fosheim Grostad пишет:On Sunday, 21 September 2014 at 17:52:42 UTC, Dmitry Olshansky wrote:E-hm Go is hardly the top dog in the web space. Java and JVM crowd like (Scala etc.) are apparently very sexy (and performant) in the web space. They try to sell it as if it was all the rage though. IMO Go is hardly an interesting opponent to compete against. In pretty much any use case I see Go is somewhere down to 4-th+ place to look at.to use non-atomic ref-counting and have far less cache pollution (the set of fibers to switch over is consistent).Caches are not a big deal when you wait for io.Because that is what you are competing with in the webspace.Go also check fiber stack size... But maybe Go should not be considered a target.??? Just reserve more space. Even Go dropped segmented stack. What Go has to do with this discussion at all BTW?Go checks and extends stacks.Since 1.2 or 1.3 i.e. relatively new stuff. -- Dmitry Olshansky
Sep 22 2014
On 9/22/14, 12:34 PM, Dmitry Olshansky wrote:22-Sep-2014 01:45, Ola Fosheim Grostad пишет:I agree. It does have legs however. We should learn a few things from it, such as green threads, dependency management, networking libraries. Also Go shows that good quality tooling makes a lot of a difference. And of course the main lesson is that templates are good to have :o). AndreiOn Sunday, 21 September 2014 at 17:52:42 UTC, Dmitry Olshansky wrote:E-hm Go is hardly the top dog in the web space. Java and JVM crowd like (Scala etc.) are apparently very sexy (and performant) in the web space. They try to sell it as if it was all the rage though. IMO Go is hardly an interesting opponent to compete against. In pretty much any use case I see Go is somewhere down to 4-th+ place to look at.to use non-atomic ref-counting and have far less cache pollution (the set of fibers to switch over is consistent).Caches are not a big deal when you wait for io.Because that is what you are competing with in the webspace.Go also check fiber stack size... But maybe Go should not be considered a target.??? Just reserve more space. Even Go dropped segmented stack. What Go has to do with this discussion at all BTW?
Sep 22 2014
23-Sep-2014 03:11, Andrei Alexandrescu пишет:On 9/22/14, 12:34 PM, Dmitry Olshansky wrote:Well in short term that would mean.. green threads --> better support for fibers (see std.concurrency pull by Sean) dependency management --> package dub with dmd releases, use it to build e.g.g Phobos? ;) networking libraries -> there are plenty of good inspirational libraries out there in different languages. vibe.d is cool, but we ought to explore more and propagate stuff to std.net.*22-Sep-2014 01:45, Ola Fosheim Grostad пишет:I agree. It does have legs however. We should learn a few things from it, such as green threads, dependency management, networking libraries.On Sunday, 21 September 2014 at 17:52:42 UTC, Dmitry Olshansky wrote:E-hm Go is hardly the top dog in the web space. Java and JVM crowd like (Scala etc.) are apparently very sexy (and performant) in the web space. They try to sell it as if it was all the rage though. IMO Go is hardly an interesting opponent to compete against. In pretty much any use case I see Go is somewhere down to 4-th+ place to look at.to use non-atomic ref-counting and have far less cache pollution (the set of fibers to switch over is consistent).Caches are not a big deal when you wait for io.Because that is what you are competing with in the webspace.Go also check fiber stack size... But maybe Go should not be considered a target.??? Just reserve more space. Even Go dropped segmented stack. What Go has to do with this discussion at all BTW?Also Go shows that good quality tooling makes a lot of a difference. And of course the main lesson is that templates are good to have :o).Agreed.Andrei-- Dmitry Olshansky
Sep 22 2014
On Monday, 22 September 2014 at 23:11:42 UTC, Andrei Alexandrescu wrote:I agree. It does have legs however. We should learn a few things from it, such as green threads, dependency management, networking libraries. Also Go shows that good quality tooling makes a lot of a difference. And of course the main lesson is that templates are good to have :o).Go also shows the viability of a fixup tool for minor automated code changes as the language develops. -Wyatt
Sep 23 2014
On 9/23/14, 5:51 AM, Wyatt wrote:On Monday, 22 September 2014 at 23:11:42 UTC, Andrei Alexandrescu wrote:Yah, we definitely should have one of our mythical lieutenants on that. -- AndreiI agree. It does have legs however. We should learn a few things from it, such as green threads, dependency management, networking libraries. Also Go shows that good quality tooling makes a lot of a difference. And of course the main lesson is that templates are good to have :o).Go also shows the viability of a fixup tool for minor automated code changes as the language develops. -Wyatt
Sep 23 2014
On Tuesday, 23 September 2014 at 15:43:41 UTC, Andrei Alexandrescu wrote:Yah, we definitely should have one of our mythical lieutenants on that. -- AndreiI distinctly remember someone offering to write one and being shot down (by Walter?). -Wyatt
Sep 23 2014
On 9/23/14, 9:02 AM, Wyatt wrote:On Tuesday, 23 September 2014 at 15:43:41 UTC, Andrei Alexandrescu wrote:The offer was in the context of a feature that was being rejected. -- AndreiYah, we definitely should have one of our mythical lieutenants on that. -- AndreiI distinctly remember someone offering to write one and being shot down (by Walter?).
Sep 23 2014
On Tuesday, 23 September 2014 at 16:18:33 UTC, Andrei Alexandrescu wrote:The offer was in the context of a feature that was being rejected. -- AndreiWalter *has* said before that he's uncomfortable with tools that directly modify source code, which is understandable. A good suggestion was to not directly modify the source, but produce a patch file that can be used by any merge tool (and maybe an option to directly modify a file if someone really wants to).
Sep 23 2014
On 9/23/14, 11:08 AM, Meta wrote:On Tuesday, 23 September 2014 at 16:18:33 UTC, Andrei Alexandrescu wrote:I've been at a conference where Pike spoke about gofix. He convinced me it's a very valuable tool. -- AndreiThe offer was in the context of a feature that was being rejected. -- AndreiWalter *has* said before that he's uncomfortable with tools that directly modify source code, which is understandable. A good suggestion was to not directly modify the source, but produce a patch file that can be used by any merge tool (and maybe an option to directly modify a file if someone really wants to).
Sep 23 2014
On Tue, 23 Sep 2014 18:08:45 +0000 Meta via Digitalmars-d <digitalmars-d puremagic.com> wrote:Walter *has* said before that he's uncomfortable with tools that directly modify source code, which is understandable.i can't understand this, though. any big project is using SCM nowadays, and it's easy to create a branch and work on it. and even 'hello, world' projects can use modern DVCS with two or three commands. there is nothing wrong with source code modifications, and patches can be made later, by something like 'git diff'. the only poor fellas are those who are forced to use svn by "company standards". ah, ignore 'em, they don't mind having some more pain (that's why they're still using svn, aren't they?).
Sep 23 2014
On Tuesday, 23 September 2014 at 18:08:46 UTC, Meta wrote:On Tuesday, 23 September 2014 at 16:18:33 UTC, Andrei Alexandrescu wrote:We are in 2014, and we have good source control. That will be fine.The offer was in the context of a feature that was being rejected. -- AndreiWalter *has* said before that he's uncomfortable with tools that directly modify source code, which is understandable. A good suggestion was to not directly modify the source, but produce a patch file that can be used by any merge tool (and maybe an option to directly modify a file if someone really wants to).
Sep 23 2014
On Sunday, 21 September 2014 at 09:06:57 UTC, Ola Fosheim Grostad wrote:That will have to change if Go is a target. To get full load you need to let fibers move freely between threads I think. Go also check fiber stack size... But maybe Go should not be considered a target.Only isolated cluster can safely migrate between threads. D has no means to check isolation, you should check it manually, and in addition check if the logic doesn't depend on tls.
Sep 21 2014
On Sunday, 21 September 2014 at 19:28:13 UTC, Kagamin wrote:Only isolated cluster can safely migrate between threads. D has no means to check isolation, you should check it manually, and in addition check if the logic doesn't depend on tls.This can easily be borked if built in RC does not provide threadsafety. If you want low latency, high throughput and low memory overhead, then you gotta use available threads. Otherwise the load balancing will be wonky. Most requests in a web service will wait for network traffic from memcaches. So a request on a fiber will have to be rescheduled at least once on average.
Sep 21 2014
On Sunday, 21 September 2014 at 21:42:03 UTC, Ola Fosheim Grostad wrote:On Sunday, 21 September 2014 at 19:28:13 UTC, Kagamin wrote:Isolated data is single-threaded w.r.t. concurrent access. What thread-safety do you miss? You should only check for environmental dependencies, which are not strictly related to concurrency.Only isolated cluster can safely migrate between threads. D has no means to check isolation, you should check it manually, and in addition check if the logic doesn't depend on tls.This can easily be borked if built in RC does not provide threadsafety.
Sep 22 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:AndreiI'm testing your RCstring right now in my code to see how much memory it will save and speed it will gain. I want to use RCString in place of string as a key in my AAs. Any proposals for a suitable implementation of size_t toHash() @trusted pure nothrow for RCString? I'm guessing there are two cases here; one for the SSO-case and one for the other. The other should be similar to size_t toHash(string) @trusted pure nothrow right?
Sep 20 2014
On Saturday, 20 September 2014 at 15:21:18 UTC, Nordlöw wrote:for RCString? I'm guessing there are two cases here;I'm guessing size_t toHash() const @trusted pure nothrow { import core.internal.hash : hashOf; if (isSmall) { return this.small.hashOf; } else { return this.large[].hashOf; } } Will this.large[].hashOf do unnecessary GC-allocations? -vgc says nothing. I'm compiling as dmd -vcolumns -debug -g -gs -vgc -unittest -wi -main rcstring.d -o rcstring.out
Sep 20 2014
On 9/20/14, 8:54 AM, "Nordlöw" wrote:On Saturday, 20 September 2014 at 15:21:18 UTC, Nordlöw wrote:Why not just "return this.asSlice.hashOf;"?for RCString? I'm guessing there are two cases here;I'm guessing size_t toHash() const trusted pure nothrow { import core.internal.hash : hashOf; if (isSmall) { return this.small.hashOf; } else { return this.large[].hashOf; } }Will this.large[].hashOf do unneccessary GC-allocations? -vgc says nothing.No. Andrei
Sep 20 2014
On Saturday, 20 September 2014 at 17:06:48 UTC, Andrei Alexandrescu wrote:Why not just "return this.asSlice.hashOf;"?Good idea :) I'll use that instead.Ok, great! A couple of followup questions. How big overhead is an RC compared to a non-RC GC-free string variant? Perhaps it would be nice to add a template parameter in RCXString that makes the RC-optional? If I want a *non*-RC GC-free variant of string/wstring/dstring what's the best way to define them? Would Array!char, Array!wchar, Array!dchar, be suitable solutions? Of course these wouldn't utilize SSO. I'm asking because Array is RandomAccess but string/wstring is not byCodePoint.Will this.large[].hashOf do unneccessary GC-allocations? -vgc says nothing.
Sep 20 2014
On 9/20/14, 11:01 AM, "Nordlöw" wrote:How big overhead is an RC compared to a non-RC GC-free string variant?Ballpark would be probably 1.1-2.5x. But there's of course a bunch of variability.Perhaps it would be nice to add a template parameter in RCXString that makes the RC-optional?Manual memory management is not part of its charter.If I want a *non*-RC GC-free variant of string/wstring/dstring what's the best way to define them?I think you're back to malloc and free kind of stuff.Would Array!char, Array!wchar, Array!dchar, be suitable solutions? Of course these wouldn't utilize SSO. I'm asking because Array is RandomAccess but string/wstring is not byCodePoint.Those are refcounted. Andrei
Sep 20 2014
On 9/20/14, 8:54 AM, "Nordlöw" wrote:On Saturday, 20 September 2014 at 15:21:18 UTC, Nordlöw wrote:Oh in fact this.small.hashOf is incorrect anyway because it hashes random characters after the used portion of the string. -- Andreifor RCString? I'm guessing there are two cases here;I'm guessing size_t toHash() const trusted pure nothrow { import core.internal.hash : hashOf; if (isSmall) { return this.small.hashOf;
Sep 20 2014
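A hedged sketch of the fix implied by that remark: hash only the characters actually in use, which is what hashing the slice already does. asSlice, isSmall, small and smallLength are assumed names from the RCXString draft, so adjust to the actual fields:

    // Inside RCXString; asSlice is assumed to cover exactly the characters in use.
    size_t toHash() const @trusted pure nothrow
    {
        import core.internal.hash : hashOf;
        return this.asSlice.hashOf;
        // Spelled out for the small case, the earlier version should have been:
        //   return isSmall ? small[0 .. smallLength].hashOf : large[].hashOf;
    }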
On 9/20/14, 8:21 AM, "Nordlöw" wrote:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:Thanks!AndreiI'm testing your RCstring right now in my code to see how much memory it will save and speed it will gain.I want to use RCString in place of string as a key in my AAs. Any proposals for a suitable implementation of size_t toHash() trusted pure nothrow for RCString? I'm guessing there are two cases here; one for the SSO-case an one for the other. The other should be similar to size_t toHash(string) trusted pure nothrow right?Yah, that's the one. Andrei
Sep 20 2014
On Saturday, 20 September 2014 at 16:57:49 UTC, Andrei Alexandrescu wrote:Thanks!Calling writeln(rcstring) in a module other than rcstring.d gives large, small, msmall}, '\b', [10280751412894535920, 0, 576460752303423488]) I believe you have to make either opSlice public or add a public toString.
Sep 20 2014
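A minimal sketch of the second option mentioned there (a public toString), using the sink overload so that printing does not need to allocate; asSlice is again an assumed accessor from the draft:

    // Inside RCXString: lets writeln/format print the characters, not the raw fields.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink(this.asSlice);
    }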
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:http://dpaste.dzfl.pl/817283c163f5Your implementation seems to hold water at least in my tests and saves memory at https://github.com/nordlow/justd/blob/master/conceptnet5.d Thanks :) I'm however struggling with fast serialization with msgpack. FYI: https://github.com/msgpack/msgpack-d/issues/43
Sep 22 2014
On 9/22/14, 12:18 PM, "Nordlöw" wrote:On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:Awesome, thanks for doing this. How did you measure and what results did you get? -- Andreihttp://dpaste.dzfl.pl/817283c163f5You implementation seems to hold water at least in my tests and save memory at https://github.com/nordlow/justd/blob/master/conceptnet5.d
Sep 22 2014
On Monday, 22 September 2014 at 23:09:28 UTC, Andrei Alexandrescu wrote:On 9/22/14, 12:18 PM, "Nordlöw" wrote:I just checked that I didn't get any segfaults :) Memory usage in my conceptnet5.d graph (around 1.2GB) didn't differ noticeably when using RCString compared to string in a network that allocates around 10 million RCStrings as keys in a hash-table. Average RCString length is about 14. Is that surprising? I didn't test speed. I've found a potential bug in msgpack-d that when fixed enables packing of RCString. See: https://github.com/msgpack/msgpack-d/issues/43On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:Awesome, thanks for doing this. How did you measure and what results did you get? -- Andreihttp://dpaste.dzfl.pl/817283c163f5Your implementation seems to hold water at least in my tests and saves memory at https://github.com/nordlow/justd/blob/master/conceptnet5.d
Sep 24 2014
On 9/24/14, 4:29 AM, "Nordlöw" wrote:On Monday, 22 September 2014 at 23:09:28 UTC, Andrei Alexandrescu wrote:Sounds about reasonable.On 9/22/14, 12:18 PM, "Nordlöw" wrote:I just checked that I didn't get any segfaults :) Memory usage in my conceptnet5.d graph (around 1.2GB) didn't differ noticeable when using RCString compared to string in a network that allocates around 10 million RCStrings as keys in a hash-table. Average RCString length is about 14. Is that surprising?On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:Awesome, thanks for doing this. How did you measure and what results did you get? -- Andreihttp://dpaste.dzfl.pl/817283c163f5You implementation seems to hold water at least in my tests and save memory at https://github.com/nordlow/justd/blob/master/conceptnet5.dI didn't test speed.Thanks for this work! -- Andrei
Sep 24 2014
On Wednesday, 24 September 2014 at 14:58:34 UTC, Andrei Alexandrescu wrote:So the pro must be (de)allocation speed then, I suppose?I didn't test speed.
Sep 24 2014
On 9/24/14, 12:49 PM, "Nordlöw" wrote:On Wednesday, 24 September 2014 at 14:58:34 UTC, Andrei Alexandrescu wrote:The pro is twofold: 1. Code using RC will be more compact about using memory if strings are created and then discarded. Depending on a variety of factors, that may lead to better cache friendliness. 2. If a GC cycle will occur, that will add to the total run time of GC code, meaning RC code will gain an advantage. Code that doesn't create/discard a lot of strings and doesn't get to run the GC is likely to be slower with RCString. AndreiSo the pro must be (de)allocation speed then, I suppose?I didn't test speed.
Sep 24 2014
On Wednesday, 24 September 2014 at 20:22:05 UTC, Andrei Alexandrescu wrote:1. Code using RC will be more compact about using memory if strings are created and then discarded. Depending on a variety of factors, that may lead to better cache friendliness. 2. If a GC cycle will occur, that will add to the total run time of GC code, meaning RC code will gain an advantage. Code that doesn't create/discard a lot of strings and doesn't get to run the GC is likely to be slower with RCString.Ok, thanks.
Sep 24 2014
On Wednesday, 24 September 2014 at 14:58:34 UTC, Andrei Alexandrescu wrote:Thanks for this work! -- AndreiBTW: If I want to construct my network once and destroy it all in one pass, I should probably use a region-based allocator from std.allocator to allocate the strings that are larger than maxSmall. Is the template RCXString parameter realloc sufficient for my needs here?
Sep 24 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:So, please fire away. I'd appreciate it if you used RCString in lieu of string and note the differences. The closer we get to parity in semantics, the better.Further, import std.container: Array; import rcstring; unittest { Array!RCString x; } fails as /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/conv.d(4010,17): Error: expression hasElaborateAssign!(RCXString!(immutable(char), 23LU, realloc)) of type void does not have a boolean value /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/conv.d(3970,31): Error: template instance std.conv.emplaceInitializer!(RCXString!(immutable(char), 23LU, realloc)) error instantiating /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/conv.d(4064,18): instantiated from here: emplaceImpl!(string) /home/per/Work/justd/rcstring.d(428,16): instantiated from here: emplace!(RCXString!(immutable(char), 23LU, realloc), string) /home/per/Work/justd/rcstring.d(13,18): instantiated from here: RCXString!(immutable(char), 23LU, realloc) /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/algorithm.d(1577,16): Error: template instance std.traits.hasElaborateAssign!(RCXString!(immutable(char), 23LU, realloc)) error instantiating /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/container/array.d(85,26): instantiated from here: initializeAll!(RCXString!(immutable(char), 23LU, realloc)[]) t_rcstring_array.d(8,5): instantiated from here: Array!(RCXString!(immutable(char), 23LU, realloc)) /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/container/array.d(276,24): Error: template instance std.algorithm.move!(RCXString!(immutable(char), 23LU, realloc)) error instantiating t_rcstring_array.d(8,5): instantiated from here: Array!(RCXString!(immutable(char), 23LU, realloc)) /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/conv.d(4064,18): Error: template instance std.conv.emplaceImpl!(RCXString!(immutable(char), 23LU, realloc)).emplaceImpl!(RCXString!(immutable(char), 23LU, realloc)) error instantiating /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/container/array.d(186,20): instantiated from here: emplace!(RCXString!(immutable(char), 23LU, realloc), RCXString!(immutable(char), 23LU, realloc)) /home/per/opt/x86_64-unknown-linux-gnu/dmd/linux/bin64/src/phobos/std/container/array.d(356,21): instantiated from here: __ctor!(RCXString!(immutable(char), 23LU, realloc)) t_rcstring_array.d(8,5): instantiated from here: Array!(RCXString!(immutable(char), 23LU, realloc)) Any clue what's missing in RCXString?
Sep 24 2014
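The failing expression above is hasElaborateAssign!(RCXString!(...)), which evaluates to void rather than a bool. One quick way to probe the traits std.container.Array relies on, outside the container code, is a diagnostic sketch like the following (assuming the module imports as rcstring, like in the snippet above):

    import std.traits : hasElaborateAssign, hasElaborateCopyConstructor,
                        hasElaborateDestructor;
    import rcstring;

    // Each pragma either prints true/false or reproduces the trait failure in isolation.
    pragma(msg, hasElaborateAssign!RCString);
    pragma(msg, hasElaborateCopyConstructor!RCString);
    pragma(msg, hasElaborateDestructor!RCString);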