digitalmars.D - Our Sister
- Andrei Alexandrescu (18/18) May 26 2016 I've been working on RCStr (endearingly pronounced "Our Sister"), D's
- Adam D. Ruppe (5/6) May 26 2016 You really should actually mention RCStr in the subject line so
- ixid (6/13) May 26 2016 To be fair using a forum called 'General' for technical
- Joakim (4/11) May 26 2016 Where do you see all this "chatter?" Looking at the topics for
- Jonathan M Davis via Digitalmars-d (4/10) May 26 2016 Yeah. I was about to ignore this thread as being clearly OT until I saw ...
- Pete (20/20) May 27 2016 I post this only as a warning to others.
- Jack Stouffer (3/4) May 27 2016 Please don't derail this conversation. If you have a complaint
- Pete (5/10) May 27 2016 read the subject line slowly Jack
- Jack Stouffer (4/5) May 27 2016 Sorry about that. I use the web interface and everything is
- Andrei Alexandrescu (6/8) May 27 2016 Thanks for that. Not sure what your moniker is there, but I noticed a
- Gary Willoughby (3/7) May 26 2016 Will s.by!Grapheme be supported too?
- Andrei Alexandrescu (2/7) May 26 2016 Yes. -- Andrei
- Jack Stouffer (15/23) May 26 2016 How is that going BTW. Last I heard you were having problems with
- Adam D. Ruppe (13/15) May 26 2016 That would be templated so like byUTF!char and byUTF!wchar right?
- Jack Stouffer (5/14) May 26 2016 This has the added benefit that it would automatically work with
- Jonathan M Davis via Digitalmars-d (6/11) May 26 2016 RCStr definitely should _not_ pass isSomeString. Those traits specifical...
- Nordlöw (2/4) May 27 2016 +1
- Xinok (11/17) May 26 2016 I don't know how practical this would be, but if at all feasible,
- Seb (7/17) May 26 2016 Great news!
- jmh530 (2/8) May 26 2016 I like these ideas (and RCString over RCStr).
- Andrei Alexandrescu (3/5) May 26 2016 With all the criticism leveled against string, I thought more of the
- jmh530 (6/13) May 26 2016 Hmm, I think it would be better to be right than necessarily a
- Seb (4/19) May 26 2016 Oh yes that's what I meant. Sorry for being so confusing.
- Dicebot (5/7) May 29 2016 Don't get overly excited. dfix will never be capable of automatic fixup
- H. S. Teoh via Digitalmars-d (7/13) May 26 2016 I'm not sure what criticism you're referring to. The only one I can
- Vladimir Panteleev (4/5) May 26 2016 Having a "null" state which is distinguishable from an empty
- Bastiaan Veelo (11/31) May 26 2016 Interesting! A few noob questions first:
- Andrei Alexandrescu (8/15) May 26 2016 Yes, COW. Substrings will be managed COW-ish as well (no copy upon
- Nordlöw (7/10) May 27 2016 For inspiration see:
- Marc Schütz (3/4) May 27 2016 It should _safely_ convert to `const(char)[]`.
- Andrei Alexandrescu (2/5) May 27 2016 That is not possible, sorry. -- Andrei
- Era Scarecrow (12/19) May 27 2016 I wonder if it could...
- Andrei Alexandrescu (5/6) May 27 2016 Reasoning is simple - yes we could safely convert to const(char)[] but
- Seb (2/9) May 27 2016 not if [] would be ref-counted too ;-)
- Adam D. Ruppe (14/15) May 27 2016 That would be kinda horrible. Right now, slicing is virtually
- Manu via Digitalmars-d (8/15) May 27 2016 This is only true for the owner. If we had 'scope', or something like
- Adam D. Ruppe (6/11) May 28 2016 Right, I agree - if we keep the slice just the way it is now, it
- Marco Leise (12/31) May 30 2016 I second that thought. But I'd be ok with an unsafe slice and
- Manu via Digitalmars-d (5/34) May 31 2016 D loves templates, but templates aren't a given. Closed-source
- Marco Leise (6/9) May 31 2016 Same effect for GPL code. Funny. (Template instantiations are
- tsbockman (3/10) May 27 2016 But conversions to scope const(char)[] could be made safe, right?
- Adam D. Ruppe (4/6) May 27 2016 Indeed, and I really think we should spend more effort on making
- Andrei Alexandrescu (2/12) May 27 2016 Yah, in principle. -- Andrei
- Nick Treleaven (7/14) May 31 2016 We could have:
- Manu via Digitalmars-d (4/13) May 27 2016 It should safely convert to 'scope const(char)[]', then we only need a
- Marc Schütz (2/11) May 28 2016 I didn't want to mention the s-word ;-)
- Marc Schütz (7/14) May 28 2016 It is when DIP25 [1] is finally fully implemented (by that I mean
- Manu via Digitalmars-d (10/11) May 27 2016 Ah, I totally skipped over this thread...
- ZombineDev (118/140) May 28 2016
- ZombineDev (20/176) May 28 2016 Here's another case where the last change to AffixAllocator is
I've been working on RCStr (endearingly pronounced "Our Sister"), D's up-and-coming reference counted string type. The goals are:

* Reference counted, shouldn't leak if all instances destroyed; even if not, use the GC as a last-resort reclamation mechanism.
* Entirely safe.
* Support UTF 100% by means of RCStr!char, RCStr!wchar etc. but also raw manipulation and custom encodings via RCStr!ubyte, RCStr!ushort etc.
* Support several views of the same string, e.g. given s of type RCStr!char, it can be iterated byte-wise, code point-wise, code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.
* Support const and immutable qualifiers for the character type.
* Work well with const and immutable when they qualify the entire RCStr type.
* Fast: use the small string optimization and various other layout and algorithms to make it a good choice for high performance strings

RFC: what primitives should RCStr have?

Thanks,
Andrei
May 26 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> I've been working on RCStr (endearingly pronounced "Our Sister")

You really should actually mention RCStr in the subject line so people overwhelmed with the staggering amount of off topic chatter on this forum don't disregard this thread too.
May 26 2016
On Thursday, 26 May 2016 at 16:20:37 UTC, Adam D. Ruppe wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> I've been working on RCStr (endearingly pronounced "Our Sister")
> You really should actually mention RCStr in the subject line so people overwhelmed with the staggering amount of off topic chatter on this forum don't disregard this thread too.

To be fair using a forum called 'General' for technical discussion is asking for trouble. We will be able to tell when D actually starts to become popular because this part of the forum will cease to function as it's inundated with newbies who expect it to mean general questions or something similar.
May 26 2016
On Thursday, 26 May 2016 at 16:20:37 UTC, Adam D. Ruppe wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> I've been working on RCStr (endearingly pronounced "Our Sister")
> You really should actually mention RCStr in the subject line so people overwhelmed with the staggering amount of off topic chatter on this forum don't disregard this thread too.

Where do you see all this "chatter?" Looking at the topics for the last 10 days, I only see one not about D generally, and it's labeled OT.
May 26 2016
On Thursday, May 26, 2016 16:20:37 Adam D. Ruppe via Digitalmars-d wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> I've been working on RCStr (endearingly pronounced "Our Sister")
> You really should actually mention RCStr in the subject line so people overwhelmed with the staggering amount of off topic chatter on this forum don't disregard this thread too.

Yeah. I was about to ignore this thread as being clearly OT until I saw that it was started by Andrei.

- Jonathan M Davis
May 26 2016
I post this only as a warning to others.

Imagine being the kind of person who isn't certain he could actually get Hello World past the D compiler -but (and?) sees the subject "Our Sister" and immediately thinks: "oh, Alexandrescu must be referring to his sister who is a doctor and did the art on the book cover".

---Welcome to the world of the PL trainspotter.---

Shoot-me shoot-me shoot-me.

It gets worse: I'm at the supermarket the other day, and the guy at the checkout has a strong Afrikaans accent. I find myself saying to him: "umm if you right now, like hypothetically, heard the sound of hooves -would you think of horses or zebras". No lie. Try working *that* into a brief conversation about whether you have a store loyalty card.

Forget the Star Wars allusion -think Aliens ...when the Ripley character mercifully torches the wretched mutant clones of herself.

I was an actual programmer once.

please, somebody ..kill .. me
May 27 2016
On Friday, 27 May 2016 at 17:08:33 UTC, Pete wrote:
> ...

Please don't derail this conversation. If you have a complaint please make it in a separate thread and tag it OT.
May 27 2016
read the subject line slowly Jack ..but I appreciate your witty use of the word derail. If anyone calls, Jack and I will be over at stack overflow gleefully closing down the derailers there.

On Friday, 27 May 2016 at 17:37:20 UTC, Jack Stouffer wrote:
> On Friday, 27 May 2016 at 17:08:33 UTC, Pete wrote:
>> ...
> Please don't derail this conversation. If you have a complaint please make it in a separate thread and tag it OT.<<
May 27 2016
On Friday, 27 May 2016 at 19:35:58 UTC, Pete wrote:
> read the subject line slowly Jack

Sorry about that. I use the web interface and everything is grouped together even if it doesn't have the same subject line, so I didn't see that you changed it.
May 27 2016
On 05/27/2016 03:35 PM, Pete wrote:
> If anyone calls, Jack and I will be over at stack overflow gleefully closing down the derailers there.

Thanks for that. Not sure what your moniker is there, but I noticed a good number of solid answers to D questions on SO. Regarding the title, it was actually making a subtle point: if it's not marked as [OT] it's on topic! :o)

Andrei
May 27 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> * Support several views of the same string, e.g. given s of type RCStr!char, it can be iterated byte-wise, code point-wise, code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.

Will s.by!Grapheme be supported too?
May 26 2016
On 05/26/2016 12:58 PM, Gary Willoughby wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> * Support several views of the same string, e.g. given s of type RCStr!char, it can be iterated byte-wise, code point-wise, code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.
> Will s.by!Grapheme be supported too?

Yes. -- Andrei
May 26 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> * Support const and immutable qualifiers for the character type.

How is that going BTW. Last I heard you were having problems with inout/const.

> * Support several views of the same string, e.g. given s of type RCStr!char, it can be iterated byte-wise, code point-wise, code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.
> [snip]
> RFC: what primitives should RCStr have?

Well, because we already have the standard library functions representation, byUTF, byCodePoint, byCodeUnit, and byGrapheme, I think RCStr should provide these names as methods which all return ranges. If possible, these would all work regardless of character or integer type of the data. So in effect, RCStr would have completely encapsulated data. Let's not make the same mistake that we made with string et al. by providing a default.

If at all possible, it would be great if it was also an output range.

> RCStr

*bikeshedding*: How about RCString, because the convention for D names is to be explicit most of the time.
May 26 2016
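The Phobos functions named in the post above already give several views of a built-in string, which hints at what the proposed s.by!T views could look like. The following is a minimal sketch using only existing Phobos calls on a plain `string`; `ViewCounts` and `countViews` are hypothetical names made up for this illustration, and nothing here is RCStr API.

```d
import std.range : walkLength;
import std.string : representation;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byDchar;

/// How many elements the same text has under three different views.
struct ViewCounts
{
    size_t codeUnits;   // UTF-8 code units (char)
    size_t codePoints;  // decoded code points (dchar)
    size_t graphemes;   // user-perceived characters
}

/// Walks one string three ways, without modifying or copying it.
ViewCounts countViews(string s)
{
    return ViewCounts(
        s.byCodeUnit.walkLength,
        s.byDchar.walkLength,
        s.byGrapheme.walkLength);
}

unittest
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT
    auto c = countViews("e\u0301");
    assert(c.codeUnits == 3);   // 1 byte for 'e', 2 bytes for U+0301
    assert(c.codePoints == 2);
    assert(c.graphemes == 1);
    // representation exposes the raw bytes as immutable(ubyte)[]
    assert("e\u0301".representation.length == 3);
}
```

None of these calls picks a default iteration mode for the caller, which is the property the post asks RCStr to keep.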
On Thursday, 26 May 2016 at 17:32:33 UTC, Jack Stouffer wrote:
> Well, because we already have the standard library functions representation, byUTF

That would be templated so like byUTF!char and byUTF!wchar right? Then byCodePoint can just be another name for byUTF!dchar. I kinda like that.

Ideally, the string type would also use lazy imports for any conversion table. So if you never call byGrapheme, it never imports the std.uni tables. (Heck, std.uni could be the one to provide that type, of course.)

Would an RCStr pass isSomeString? I kinda think it shouldn't. Actually, isSomeString probably shouldn't often be used - instead checking for string-like range capabilities is likely better for algorithms. Then doing some_algorithm(my_rcstr) fails - you must do some_algorithm(my_rcstr.some_range)
May 26 2016
On Thursday, 26 May 2016 at 17:50:36 UTC, Adam D. Ruppe wrote:
> That would be templated so like byUTF!char and byUTF!wchar right? Then byCodePoint can just be another name for byUTF!dchar. I kinda like that. Ideally, the string type would also use lazy imports for any conversion table. So if you never call byGrapheme, it never imports the std.uni tables. (Heck, std.uni could be the one to provide that type, of course.)

This has the added benefit that it would automatically work with a lot of generic code that uses those functions.

> Would an RCStr pass isSomeString? I kinda think it shouldn't.

I agree, it shouldn't. isSomeString should only test for one of the language provided string types.
May 26 2016
On Thursday, May 26, 2016 17:50:36 Adam D. Ruppe via Digitalmars-d wrote:
> Would an RCStr pass isSomeString? I kinda think it shouldn't. Actually, isSomeString probably shouldn't often be used - instead checking for string-like range capabilities is likely better for algorithms. Then doing some_algorithm(my_rcstr) fails - you must do some_algorithm(my_rcstr.some_range)

RCStr definitely should _not_ pass isSomeString. Those traits specifically work only for the built-in types and not for stuff that acts like them. It's a disaster waiting to happen otherwise. We need to distinguish between testing for something that is a string and something that acts like one.

- Jonathan M Davis
May 26 2016
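The distinction drawn above, between a type that is a string and a type that merely offers string-like ranges, can be sketched with existing Phobos traits. `FakeStr`, `isCharRange` and `countSpaces` are hypothetical names invented for this example; `FakeStr` is only a stand-in and says nothing about how RCStr is actually implemented.

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar, isSomeString;
import std.utf : byCodeUnit;

// Hypothetical stand-in for a library string type; not the real RCStr.
struct FakeStr
{
    private string payload;

    // The type itself is not a range; callers must ask for a view.
    auto byCodeUnits() { return payload.byCodeUnit; }
}

// Accept anything that iterates characters, instead of testing isSomeString.
enum isCharRange(R) = isInputRange!R && isSomeChar!(ElementType!R);

size_t countSpaces(R)(R r) if (isCharRange!R)
{
    size_t n;
    foreach (c; r)
        if (c == ' ')
            ++n;
    return n;
}

static assert(isSomeString!string);
static assert(!isSomeString!FakeStr);   // acts like a string, is not one
static assert(!isCharRange!FakeStr);    // countSpaces(fakeStr) is rejected
static assert(isCharRange!(typeof(FakeStr.init.byCodeUnits())));

unittest
{
    auto s = FakeStr("our sister rocks");
    assert(countSpaces(s.byCodeUnits()) == 2);
    assert(countSpaces("a b".byCodeUnit) == 1);
}
```

With this constraint, `countSpaces(s)` fails to compile while `countSpaces(s.byCodeUnits())` works, which matches the `some_algorithm(my_rcstr.some_range)` usage described in the thread.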
On Thursday, 26 May 2016 at 17:32:33 UTC, Jack Stouffer wrote:
> *bikeshedding*: How about RCString, because the convention for D names is to be explicit most of the time.

+1
May 27 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> I've been working on RCStr (endearingly pronounced "Our Sister"), D's up-and-coming reference counted string type. The goals are: ...

I don't know how practical this would be, but if at all feasible, I think one of the goals should be to have a common interface/primitives with regular strings so we can write generic functions which accept both native strings and RCStr. Otherwise, I second Jack's points.

> * Reference counted, shouldn't leak if all instances destroyed; even if not, use the GC as a last-resort reclamation mechanism.

Could you (or somebody) elaborate a little on how this could work from a technical standpoint? The only way I see this working is if the GC always scans for RCStr-allocated memory, in which case, why even bother with RC?
May 26 2016
On Thursday, 26 May 2016 at 17:45:15 UTC, Xinok wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> I've been working on RCStr (endearingly pronounced "Our Sister"), D's up-and-coming reference counted string type. The goals are: ...
> I don't know how practical this would be, but if at all feasible, I think one of the goals should be to have a common interface/primitives with regular strings so we can write generic functions which accept both native strings and RCStr.

Great news! I think one can't stress this enough: If you want RCStr to be adapted it has to be a drop-in replacement for string. Maybe we can bundle the transition from auto-decoding with the adaption to a RCString. There was the proposal of having String without auto-decoding for this migration.
May 26 2016
On Thursday, 26 May 2016 at 18:44:42 UTC, Seb wrote:
> Great news! I think one can't stress this enough: If you want RCStr to be adapted it has to be a drop-in replacement for string. Maybe we can bundle the transition from auto-decoding with the adaption to a RCString. There was the proposal of having String without auto-decoding for this migration.

I like these ideas (and RCString over RCStr).
May 26 2016
On 05/26/2016 02:44 PM, Seb wrote:
> If you want RCStr to be adapted it has to be a drop-in replacement for string.

With all the criticism leveled against string, I thought more of the opposite. This is an opportunity to get it right. -- Andrei
May 26 2016
On Thursday, 26 May 2016 at 20:24:10 UTC, Andrei Alexandrescu wrote:
> On 05/26/2016 02:44 PM, Seb wrote:
>> If you want RCStr to be adapted it has to be a drop-in replacement for string.
> With all the criticism leveled against string, I thought more of the opposite. This is an opportunity to get it right. -- Andrei

Hmm, I think it would be better to be right than necessarily a drop-in. I think the idea is so that you could change alias string = immutable(char)[]; to something using RCString and there would be minimal breakages.
May 26 2016
On Thursday, 26 May 2016 at 21:42:31 UTC, jmh530 wrote:
> On Thursday, 26 May 2016 at 20:24:10 UTC, Andrei Alexandrescu wrote:
>> On 05/26/2016 02:44 PM, Seb wrote:
>>> If you want RCStr to be adapted it has to be a drop-in replacement for string.
>> With all the criticism leveled against string, I thought more of the opposite. This is an opportunity to get it right. -- Andrei
> Hmm, I think it would be better to be right than necessarily a drop-in. I think the idea is so that you could change alias string = immutable(char)[]; to something using RCString and there would be minimal breakages.

Oh yes that's what I meant. Sorry for being so confusing. __Right__ is way more important than breakages. For that we have `dfix`.
May 26 2016
On 05/27/2016 01:17 AM, Seb wrote:
> Oh yes that's what I meant. Sorry for being so confusing. __Right__ is way more important than breakages. For that we have `dfix`.

Don't get overly excited. dfix will never be capable of automatic fixup with such deep levels of semantic analysis required; this can only be done by the compiler itself (which is currently not designed for fixup kinds of tasks).
May 29 2016
On Thu, May 26, 2016 at 04:24:10PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 05/26/2016 02:44 PM, Seb wrote:
>> If you want RCStr to be adapted it has to be a drop-in replacement for string.
> With all the criticism leveled against string, I thought more of the opposite. This is an opportunity to get it right. -- Andrei

I'm not sure what criticism you're referring to. The only one I can think of is autodecoding, which isn't really an inherent part of string being immutable(char)[], which I think is a fine idea.

T

--
The most powerful one-line C program: #include "/dev/tty" -- IOCCC
May 26 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> RFC: what primitives should RCStr have?

Having a "null" state which is distinguishable from an empty string.
May 26 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> I've been working on RCStr (endearingly pronounced "Our Sister"), D's up-and-coming reference counted string type. The goals are:
> * Reference counted, shouldn't leak if all instances destroyed; even if not, use the GC as a last-resort reclamation mechanism.
> * Entirely safe.
> * Support UTF 100% by means of RCStr!char, RCStr!wchar etc. but also raw manipulation and custom encodings via RCStr!ubyte, RCStr!ushort etc.
> * Support several views of the same string, e.g. given s of type RCStr!char, it can be iterated byte-wise, code point-wise, code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.
> * Support const and immutable qualifiers for the character type.
> * Work well with const and immutable when they qualify the entire RCStr type.
> * Fast: use the small string optimization and various other layout and algorithms to make it a good choice for high performance strings

Interesting! A few noob questions first:

* Would it support implicit sharing (copy-on-write)? What about sub-strings?
* Will concatenations be fast?
* Would this have value for compile time string operations, mixin's, etc.?

> RFC: what primitives should RCStr have?

QString may have a few that are worth supporting: http://doc.qt.io/qt-5/qstring.html

Bastiaan.
May 26 2016
On 05/26/2016 04:32 PM, Bastiaan Veelo wrote:
> * Would it support implicit sharing (copy-on-write)? What about sub-strings?

Yes, COW. Substrings will be managed COW-ish as well (no copy upon substring extraction).

> * Will concatenations be fast?

No, it will copy (i.e. no multiple segments management). It will be of course optimized as much as we can.

> * Would this have value for compile time string operations, mixin's, etc.?

Not planned.

>> RFC: what primitives should RCStr have?
> String may have a few that are worth supporting: http://doc.qt.io/qt-5/qstring.html

Good list. Thanks!

Andrei
May 26 2016
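To make the "COW, no copy upon substring extraction" answer concrete, here is a toy sketch of one way such a scheme can work. Everything in it is an assumption made for illustration: `CowStr`, its `slice` and `refCount` members, and the malloc-based layout are hypothetical and are not taken from the RCStr implementation. It is not thread-safe and skips the small string optimization entirely.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

// Hypothetical toy type for illustration only; this is not RCStr.
struct CowStr
{
    // Heap header; the characters are stored directly after it.
    private static struct Block { size_t refs; }

    private Block* block;     // shared storage, null when empty
    private size_t off, len;  // this view's window into the storage

    this(const(char)[] s)
    {
        if (s.length == 0) return;
        block = allocate(s.ptr, s.length);
        len = s.length;
    }

    this(this) { if (block) ++block.refs; }   // copying shares the block
    ~this() { if (block && --block.refs == 0) free(block); }

    private static Block* allocate(const(char)* src, size_t n)
    {
        auto b = cast(Block*) malloc(Block.sizeof + n);
        assert(b !is null, "out of memory");
        b.refs = 1;
        memcpy(b + 1, src, n);
        return b;
    }

    private char* chars() { return cast(char*)(block + 1) + off; }

    size_t length() const { return len; }
    size_t refCount() const { return block ? block.refs : 0; }

    // Substring extraction copies no characters, only the window.
    CowStr slice(size_t lo, size_t hi)
    {
        assert(lo <= hi && hi <= len);
        CowStr r = this;   // postblit increments the shared count
        r.off += lo;
        r.len = hi - lo;
        return r;
    }

    char opIndex(size_t i)
    {
        assert(i < len);
        return chars()[i];
    }

    // A write detaches from the other owners first (copy-on-write).
    void opIndexAssign(char c, size_t i)
    {
        assert(i < len);
        if (block.refs > 1)
        {
            auto fresh = allocate(chars(), len);
            --block.refs;   // the remaining owners keep the old block alive
            block = fresh;
            off = 0;
        }
        chars()[i] = c;
    }
}

unittest
{
    auto a = CowStr("hello");
    auto b = a.slice(1, 4);   // "ell", shares a's storage
    assert(a.refCount == 2);
    b[0] = 'E';               // b gets a private copy before the write
    assert(a.refCount == 1 && b.refCount == 1);
    assert(a[1] == 'e' && b[0] == 'E');
}
```

Concatenation in this toy would have to allocate a new block and copy both operands, which is consistent with the "it will copy" answer above.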
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> * Fast: use the small string optimization and various other layout and algorithms to make it a good choice for high performance strings

For inspiration see:

- Vladimir recommends `tempCString`
- Nikolay has https://bitbucket.org/sibnick/inplacearray.git

Original thread: https://forum.dlang.org/post/msrlumbobhpuljvhwrlh@forum.dlang.org
May 27 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> RFC: what primitives should RCStr have?

It should _safely_ convert to `const(char)[]`.
May 27 2016
On 5/27/16 7:07 AM, Marc Schütz wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> RFC: what primitives should RCStr have?
> It should _safely_ convert to `const(char)[]`.

That is not possible, sorry. -- Andrei
May 27 2016
On Friday, 27 May 2016 at 13:32:30 UTC, Andrei Alexandrescu wrote:
> On 5/27/16 7:07 AM, Marc Schütz wrote:
>> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>>> RFC: what primitives should RCStr have?
>> It should _safely_ convert to `const(char)[]`.
> That is not possible, sorry. -- Andrei

I wonder if it could...

For a while now I've wondered why there isn't an option to include flags to every type (for debugging)? The flags could relay a lot of information, like if a variable was originally immutable, const, shared, other? If it was originally allocated using the GC, malloc, C/C++/Other or stack. If it used a constructor, init, or not at all (= void)? Along with control options like where/when an assignment tries to happen, copies its state (or its variables with indirection), or printing an output each time it changes, etc.

With the current state of things, I'll just take your word on it.
May 27 2016
On 05/27/2016 05:02 PM, Era Scarecrow wrote:
> With the current state of things, I'll just take your word on it.

Reasoning is simple - yes we could safely convert to const(char)[] but that means effectively all refcounting is lost for that string. So we can convert but in an explicit manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei
May 27 2016
On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:
> On 05/27/2016 05:02 PM, Era Scarecrow wrote:
>> With the current state of things, I'll just take your word on it.
> Reasoning is simple - yes we could safely convert to const(char)[] but that means effectively all refcounting is lost for that string. So we can convert but in an explicit manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei

not if [] would be ref-counted too ;-)
May 27 2016
On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:
> not if [] would be ref-counted too ;-)

That would be kinda horrible. Right now, slicing is virtually free and compatible with all kinds of backing schemes. If it became refcounted, it'd:

1) have to keep a pointer to the refcount structure with the slice, adding memory cost
2) make assignments and slicing work through that refcount pointer, adding cpu cost
3) somehow need to know the appropriate freeing strategy, adding some kind of indirect call when refcount = 0, and would make creating a slice more tedious as you'd need to know this (meaning you also probably need to allocate this structure! no more free ptr[0 .. length] operation on malloc'd blocks.)

So I'd be pretty strongly against that.
May 27 2016
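The memory cost in point 1) above can be made visible with a size comparison. This is only a sketch: `FatSlice` is a hypothetical layout invented here to mirror the three costs listed, and the asserted sizes assume a target where data pointers and function pointers are both `size_t`-sized, as on the usual 32- and 64-bit platforms.

```d
// Hypothetical layout of a reference-counted slice, for size comparison only.
struct FatSlice
{
    char[] data;                   // length + pointer, same as today's slice
    size_t* refs;                  // 1) extra pointer to the shared count
    void function(void*) release;  // 3) how to free once the count hits zero
}

// A built-in slice is two machine words.
static assert((char[]).sizeof == 2 * size_t.sizeof);
// The sketched fat slice doubles that.
static assert(FatSlice.sizeof == 4 * size_t.sizeof);
```

Cost 2) does not show up in the layout: it is the increment and decrement through `refs` that every copy and every sub-slice of such a type would have to perform.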
On 28 May 2016 at 10:16, Adam D. Ruppe via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:
>> not if [] would be ref-counted too ;-)
> That would be kinda horrible. Right now, slicing is virtually free and compatible with all kinds of backing schemes. If it became refcounted, it'd: 1) have to keep a pointer to the refcount structure with the slice, adding memory cost

This is only true for the owner. If we had 'scope', or something like it (ie, borrowing in rust lingo), then the fat slice wouldn't need to be passed around, it's only a burden on the top-level owner. 'scope' is consistently rejected, but it solves so many long-standing problems we have, and this reduction of 'fat'(/rc)-slices to normal slices is a particularly important one.
May 27 2016
On Saturday, 28 May 2016 at 04:15:45 UTC, Manu wrote:
> This is only true for the owner. If we had 'scope', or something like it (ie, borrowing in rust lingo), then the fat slice wouldn't need to be passed around

Right, I agree - if we keep the slice just the way it is now, it all still works if you borrow correctly!

(BTW, I don't think we even need this to be strictly safe, though it would be nice if it was tested, we could say @system getSlice and potentially change it to @safe later.)
May 28 2016
On Sat, 28 May 2016 14:15:45 +1000, Manu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 28 May 2016 at 10:16, Adam D. Ruppe via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:
>>> not if [] would be ref-counted too ;-)
>> That would be kinda horrible. Right now, slicing is virtually free and compatible with all kinds of backing schemes. If it became refcounted, it'd: 1) have to keep a pointer to the refcount structure with the slice, adding memory cost
> This is only true for the owner. If we had 'scope', or something like it (ie, borrowing in rust lingo), then the fat slice wouldn't need to be passed around, it's only a burden on the top-level owner. 'scope' is consistently rejected, but it solves so many long-standing problems we have, and this reduction of 'fat'(/rc)-slices to normal slices is a particularly important one.

I second that thought. But I'd be ok with an unsafe slice and making sure myself, that I don't keep a reference around. A lot of functions only borrow data and can work on a naked pointer/ref/slice, while the owner(s) have the smart pointer. These can of course be converted to templates taking either char[] or RCStr, but I think borrowing is cleaner when the function in question doesn't care a bag of beans if the chars it works on were allocated on the GC heap or reference counted.

-- Marco
May 30 2016
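The borrowing pattern described above, where the owner keeps the smart pointer and callees only see a plain slice for the duration of a call, can be sketched as follows. `Owner`, `borrow` and `countVowels` are hypothetical names for this illustration; `scope` on the parameter is existing D syntax, but how strictly the compiler enforces it depends on the compiler version and switches, so this sketch relies on convention, not on checked safety.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

// Hypothetical owner of malloc'd characters; not the real RCStr.
struct Owner
{
    private char* ptr;
    private size_t len;

    this(const(char)[] s)
    {
        if (s.length == 0) return;
        ptr = cast(char*) malloc(s.length);
        assert(ptr !is null, "out of memory");
        memcpy(ptr, s.ptr, s.length);
        len = s.length;
    }

    @disable this(this);     // single owner keeps the example small
    ~this() { free(ptr); }

    // Lends the characters to `fn` for the duration of the call only.
    // @system by default: nothing here stops `fn` from keeping the slice.
    auto borrow(alias fn)() { return fn(ptr[0 .. len]); }
}

// A non-template function: it neither knows nor cares how the
// characters were allocated, and promises not to keep the slice.
size_t countVowels(scope const(char)[] text) @safe pure nothrow @nogc
{
    size_t n;
    foreach (c; text)
    {
        switch (c)
        {
            case 'a', 'e', 'i', 'o', 'u': ++n; break;
            default: break;
        }
    }
    return n;
}

unittest
{
    auto o = Owner("our sister");
    assert(o.borrow!countVowels() == 4);
    // The same function works unchanged on GC-allocated strings.
    assert(countVowels("RCStr") == 0);
}
```

Because `countVowels` is an ordinary function taking `const(char)[]`, it can live behind a compiled, source-free API, which is the situation the next reply raises.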
On 31 May 2016 at 01:00, Marco Leise via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On Sat, 28 May 2016 14:15:45 +1000, Manu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> On 28 May 2016 at 10:16, Adam D. Ruppe via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>>> On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:
>>>> not if [] would be ref-counted too ;-)
>>> That would be kinda horrible. Right now, slicing is virtually free and compatible with all kinds of backing schemes. If it became refcounted, it'd: 1) have to keep a pointer to the refcount structure with the slice, adding memory cost
>> This is only true for the owner. If we had 'scope', or something like it (ie, borrowing in rust lingo), then the fat slice wouldn't need to be passed around, it's only a burden on the top-level owner. 'scope' is consistently rejected, but it solves so many long-standing problems we have, and this reduction of 'fat'(/rc)-slices to normal slices is a particularly important one.
> I second that thought. But I'd be ok with an unsafe slice and making sure myself, that I don't keep a reference around. A lot of functions only borrow data and can work on a naked pointer/ref/slice, while the owner(s) have the smart pointer. These can of course be converted to templates taking either char[] or RCStr, but I think borrowing is cleaner when the function in question doesn't care a bag of beans if the chars it works on were allocated on the GC heap or reference counted.

D loves templates, but templates aren't a given. Closed-source projects often can't have templates in the public API (ie, source should not be available), and this is my world.
May 31 2016
On Wed, 1 Jun 2016 01:06:36 +1000, Manu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> D loves templates, but templates aren't a given. Closed-source projects often can't have templates in the public API (ie, source should not be available), and this is my world.

Same effect for GPL code. Funny. (Template instantiations are like statically linking in the open source code.)

-- Marco
May 31 2016
On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:
> On 05/27/2016 05:02 PM, Era Scarecrow wrote:
>> With the current state of things, I'll just take your word on it.
> Reasoning is simple - yes we could safely convert to const(char)[] but that means effectively all refcounting is lost for that string. So we can convert but in an explicit manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei

But conversions to scope const(char)[] could be made safe, right? (If scope were ever fully implemented, that is.)
May 27 2016
On Friday, 27 May 2016 at 22:09:48 UTC, tsbockman wrote:
> But conversions to scope const(char)[] could be made safe, right? (If scope were ever fully implemented, that is.)

Indeed, and I really think we should spend more effort on making this work. Not as much as Rust spends on it, but a lil more than our current return ref dip.
May 27 2016
On 05/27/2016 06:09 PM, tsbockman wrote:
> On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:
>> On 05/27/2016 05:02 PM, Era Scarecrow wrote:
>>> With the current state of things, I'll just take your word on it.
>> Reasoning is simple - yes we could safely convert to const(char)[] but that means effectively all refcounting is lost for that string. So we can convert but in an explicit manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei
> But conversions to scope const(char)[] could be made safe, right? (If scope were ever fully implemented, that is.)

Yah, in principle. -- Andrei
May 27 2016
On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:
> On 05/27/2016 05:02 PM, Era Scarecrow wrote:
>> With the current state of things, I'll just take your word on it.
> Reasoning is simple - yes we could safely convert to const(char)[] but that means effectively all refcounting is lost for that string. So we can convert but in an explicit manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei

We could have:

    const(char)[] s = rcstr.stealSlice;

Which is null* if the refcount is > 1. rcstr would then be empty on success. In fact if with the RC DIP we guarantee the memory doesn't escape, stealSlice could return string.

*Or better, return an Option.
May 31 2016
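The stealSlice idea above, including the footnote about returning an Option, might look roughly like this. It is a sketch under loud assumptions: `ToyRC` is a hypothetical, GC-backed stand-in (so that handing out the slice stays valid), `std.typecons.Nullable` plays the role of the Option, and none of this reflects the real RCStr design.

```d
import std.typecons : Nullable;

// Hypothetical, GC-backed toy; it only illustrates the ownership hand-off.
struct ToyRC
{
    private char[] payload;
    private size_t* refs;

    this(const(char)[] s)
    {
        payload = s.dup;
        refs = new size_t;
        *refs = 1;
    }

    this(this) { if (refs) ++*refs; }
    ~this() { if (refs) --*refs; }

    // Succeeds only for the sole owner, which is left empty afterwards.
    Nullable!(const(char)[]) stealSlice()
    {
        Nullable!(const(char)[]) result;   // starts out null
        if (refs is null || *refs != 1)
            return result;                 // shared or already empty
        result = payload;
        payload = null;
        refs = null;
        return result;
    }
}

unittest
{
    auto a = ToyRC("abc");
    {
        auto b = a;                        // two owners now
        assert(a.stealSlice().isNull);
    }
    auto s = a.stealSlice();               // sole owner again
    assert(!s.isNull && s.get == "abc");
    assert(a.stealSlice().isNull);         // a was emptied by the steal
}
```

Using a Nullable rather than a null slice keeps "could not steal" distinguishable from "stole an empty string", which is presumably why the footnote prefers an Option.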
On 27 May 2016 at 23:32, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 5/27/16 7:07 AM, Marc Schütz wrote:
>> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>>> RFC: what primitives should RCStr have?
>> It should _safely_ convert to `const(char)[]`.
> That is not possible, sorry. -- Andrei

It should safely convert to 'scope const(char)[]', then we only need a fat-slice or like at the very top of the callstack...
May 27 2016
On Saturday, 28 May 2016 at 04:28:16 UTC, Manu wrote:
> On 27 May 2016 at 23:32, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> On 5/27/16 7:07 AM, Marc Schütz wrote:
>>> It should _safely_ convert to `const(char)[]`.
>> That is not possible, sorry. -- Andrei
> It should safely convert to 'scope const(char)[]', then we only need a fat-slice or like at the very top of the callstack...

I didn't want to mention the s-word ;-)
May 28 2016
On Friday, 27 May 2016 at 13:32:30 UTC, Andrei Alexandrescu wrote:
> On 5/27/16 7:07 AM, Marc Schütz wrote:
>> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>>> RFC: what primitives should RCStr have?
>> It should _safely_ convert to `const(char)[]`.
> That is not possible, sorry. -- Andrei

It is when DIP25 [1] is finally fully implemented (by that I mean including for slices and pointers etc., Walter told me at Dconf that this is going to happen), and the problem with aliasing references is solved (which needs to happen anyway for any reference counting to be safe).

[1] https://wiki.dlang.org/DIP25
May 28 2016
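For readers unfamiliar with DIP25, here is a small sketch of the part that covers ref returns. The names Buffer, first and passThrough are invented for this example; it deliberately does not show slices or pointers, which the post above says are not yet handled.

```d
struct Buffer
{
    private char[4] data;

    // `return` on the member function: the result may refer to `this`,
    // so it must not outlive the Buffer it came from.
    ref char first() return @safe
    {
        return data[0];
    }
}

// `return ref`: the result may refer to the argument.
ref char passThrough(return ref char c) @safe
{
    return c;
}

unittest
{
    Buffer b;
    passThrough(b.first()) = 'y';   // write through the chained references
    assert(b.first() == 'y');
}
```

The annotations let the compiler track where a returned reference came from, so that a function returning a reference into one of its own local Buffer values can be rejected in safe code.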
On 27 May 2016 at 02:11, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> I've been working on RCStr (endearingly pronounced "Our Sister"),

Ah, I totally skipped over this thread...

Wow... this really doesn't work in any accent I'm close to, but I can hear it if I imagine you saying it ;)

If I said RCStr, it sounds like 'are'-'see'-strrr, but 'our sister' would be 'hour'-sistə... isn't it strange that word recognition seems to work pretty much reliably down a sliding scale until an arbitrary point where it just drops off? There's not a lot of fuzzy area in the middle.
May 27 2016
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> I've been working on RCStr (endearingly pronounced "Our Sister"), D's up-and-coming reference counted string type. The goals are:

<Slightly off-topic>
RCStr may be an easier first step, but I think generic dynamic arrays are more interesting, because they are more generally applicable and user types like move-only resources make them a more challenging problem to solve.

BTW, what happened to scope? Generally speaking, I'm not a fan of Rust, and I know that you think that D needs to differentiate, but I like their borrowing model for several reasons:
a) while not 100% safe and quite verbose, it offers enough improvements over @safe D to make it a worthwhile upgrade, if you don't care about any other language features
b) it's not that hard to grasp / almost natural for people familiar with C++11's copy (shared_ptr) and move (unique_ptr) semantics.
c) it's general enough that it can be applied to areas like iterator invalidation, thread synchronization and other logic bugs, like some third-party Rust packages demonstrate.

I think that improving escape analysis with the scope attribute can go a long way to shortening the gap between Rust and D in that area.

The other elephant(s) in the room are nested contexts like delegates, nested structs and some alias template parameter arguments. These are especially bad because the user has zero control over those GC allocations, which makes some of D's key features unusable in @nogc contexts.
<End off-topic>

> * Reference counted, shouldn't leak if all instances destroyed; even if not, use the GC as a last-resort reclamation mechanism.
> * Entirely @safe.
> * Support UTF 100% by means of RCStr!char, RCStr!wchar etc. but also raw manipulation and custom encodings via RCStr!ubyte, RCStr!ushort etc.
> * Support several views of the same string, e.g. given s of type RCStr!char, it can be iterated byte-wise, code point-wise, code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.
> * Support const and immutable qualifiers for the character type.
> * Work well with const and immutable when they qualify the entire RCStr type.
> * Fast: use the small string optimization and various other layout and algorithms to make it a good choice for high performance strings
>
> RFC: what primitives should RCStr have?
>
> Thanks,
> Andrei

0) (Prerequisite) Composition/interaction with language features/user types - RCStr in nested contexts (alias template parameters, delegates, nested structs/classes), array of RCStr-s, RCStr as a struct/class member, RCStr passed as (const) ref parameter, etc. should correctly increase/decrease ref count. This is also a prerequisite for @safe RefCounted!T.
Action item: related compiler bugs should be prioritized. E.g. the RAII bug from Shachar Shemesh's lightning talk - http://forum.dlang.org/post/n8algm$qra$1@digitalmars.com. See also:
https://issues.dlang.org/buglist.cgi?quicksearch=raii&list_id=208631
https://issues.dlang.org/buglist.cgi?quicksearch=destructor&list_id=208632
(not everything in those lists is related but there are some nasty ones, like bad RVO codegen).

1) Safe slicing

2) shared overloads of member functions (e.g. for stuff like atomic incRef/decRef)

3) Concatenation (RCStr ~= RCStr ~ RCStr ~ char)

4) (Optional) Reserving (pre-allocating capacity) / shrinking. I labeled this feature request as optional, as it's not clear if RCStr is more like a container, or more like a slice/range.

5) Some sort of optimization for zero-terminated strings. Quite often one needs to interact with C APIs, which requires calling toStringz / toUTFz, which causes unnecessary allocations. It would be great if RCStr could efficiently handle this scenario.

6) !!! Not really a primitive, but we need to make sure that applying a chain of range transformations won't break ownership (e.g. leak or free prematurely).

7) Should be able to replace GC usage in transient ranges like e.g. File.byLine

8) Cheap initialization/assignment from string literals - should be roughly the same as either initializing a static character array (if the small string optimization is used) or just making it point to read-only memory in the data segment of the executable. It shouldn't try to write or free such memory. When initialized from a string literal, RCStr should also offer a null-terminating byte, provided that it points to the whole literal.
If one wants to assign a string literal by overwriting parts of the already allocated storage, std.algorithm.mutation.copy should be used instead.

There may be other important primitives which I haven't thought of, but generally we should try to leverage std.algorithm, std.range, std.string and std.uni for them, via UFCS.

----------

On a related note, I know that you want to use AffixAllocator for reference counting, and I think it's a great idea. I have one question, which wasn't answered during that discussion:

    // Use a nightly build to compile
    import core.thread : Thread, thread_joinAll;
    import std.range : iota;
    import std.experimental.allocator : makeArray;
    import std.experimental.allocator.building_blocks.region : InSituRegion;
    import std.experimental.allocator.building_blocks.affix_allocator : AffixAllocator;

    AffixAllocator!(InSituRegion!(4096) , uint) tlsAllocator;
    static assert (tlsAllocator.sizeof >= 4096);

    import std.stdio;

    void main()
    {
        shared(int)[] myArray;

        foreach (i; 0 .. 100)
        {
            new Thread(
            {
                if (i != 0) return;

                myArray = tlsAllocator.makeArray!(shared int)(100.iota);
                static assert(is(typeof(&tlsAllocator.prefix(myArray)) == shared(uint)*));

                writefln("At %x: %s", myArray.ptr, myArray);
            }).start();

            thread_joinAll();
        }

        writeln(myArray); // prints garbage!!!
    }

So my question is: should it be possible to share thread-local data like this?

IMO, the current allocator design opens a serious hole in the type system, because it allows using data allocated from another thread's thread-local storage. After the other thread exits, accessing memory allocated from its TLS should not be possible, but https://github.com/dlang/phobos/pull/3991 clearly allows that.

One should be able to allocate shared memory only from shared allocators. And shared allocators must be backed by shared parent allocators or shared underlying storage. In this case the Region allocator should be shared, and must be backed by shared memory, Mallocator, or something in that vein.
May 28 2016
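Two points from the list above, the multiple views of one string and the cost of C interop, can be illustrated with what Phobos already offers for built-in strings. This is only an analogy: it uses std.utf.byCodeUnit, std.utf.byDchar and std.string.representation on a plain string, and says nothing about what the eventual RCStr API (s.by!ubyte and friends) will look like.

```d
import core.stdc.string : strlen;
import std.range : walkLength;
import std.string : representation, toStringz;
import std.utf : byCodeUnit, byDchar;

unittest
{
    string s = "h\u00E9llo";               // U+00E9 takes two UTF-8 code units

    assert(s.representation.length == 6);  // raw bytes, immutable(ubyte)[]
    assert(s.byCodeUnit.walkLength == 6);  // UTF-8 code units
    assert(s.byDchar.walkLength == 5);     // code points

    // String literals are zero-terminated and convert to const(char)* directly.
    assert(strlen("hello") == 5);

    // A slice carries no terminator, so toStringz may have to allocate a copy.
    string part = s[0 .. 1];
    assert(strlen(part.toStringz) == 1);
}
```

The last two assertions show the gap item 5 asks RCStr to close: a literal can be handed to C as is, while an arbitrary slice cannot.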
On Saturday, 28 May 2016 at 09:43:41 UTC, ZombineDev wrote:
> [...]

Here's another case where the last change to AffixAllocator is really dangerous:

    void main()
    {
        immutable(int)[] myArray;

        foreach (i; 0 .. 100)
        {
            new Thread(
            {
                if (i != 0) return;

                myArray = tlsAllocator.makeArray!(immutable int)(100.iota);

                writeln(myArray); // prints [0, ..., 99]
            }).start();

            thread_joinAll();
        }

        writeln(myArray); // prints garbage
    }

In this case it severely violates the promise of immutable.
May 28 2016
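The concern in the two posts above rests on the fact that module-level variables in D, such as tlsAllocator, are thread-local unless declared otherwise. A minimal sketch of that rule follows; the variable names perThread and acrossAll are invented for the example, and it does not touch the allocator itself.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;

int perThread;          // module-level variables are thread-local by default
shared int acrossAll;   // `shared` data is visible to every thread

unittest
{
    perThread = 1;

    auto t = new Thread({
        perThread = 42;              // touches only the new thread's own copy
        atomicStore(acrossAll, 42);  // touches the single shared instance
    });
    t.start();
    t.join();

    assert(perThread == 1);                // this thread's copy is unchanged
    assert(atomicLoad(acrossAll) == 42);   // the shared write is visible here
}
```

Each thread gets its own instance of a thread-local allocator, so memory carved out of that instance's in-situ storage belongs to the allocating thread, which is why handing it out typed as shared or immutable is questioned above.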