www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 11138] New: "real" appender missing

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11138

           Summary: "real" appender missing
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: verylonglogin.reg gmail.com



14:23:28 MSD ---
Lets call "real" an appender which will create the one and only array, i.e. it
will not randomly create a copy of array on appending. Such copy is bad as:
1. Almost no use.
    In 99% appender is used as "real" appender and the copy is not needed.
2. Performance.
    It will cause calls of postblits and then destructors on collection of copy
array.
3. Needless restriction on element type.
    One may want to create and array of uncopyable type which non-"real"
appender can't do by definition.

Obviously, "real" appender is uncopyable and don't give safe access to its
array until it finished.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 29 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11138


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra gmail.com




 3. Needless restriction on element type.
     One may want to create and array of uncopyable type which non-"real"
 appender can't do by definition.
Appender supports immutable construction (mostly bug free, AFAIK). I don't think a "real" appender would do any better than appender could anyways, so this is a non-argument (IMO).
 2. Performance.
     It will cause calls of postblits and then destructors on collection of copy
 array.
Technically, the destructors don't need to be called, since the GC will leak them anyways :/ See below about "postblits"
 Lets call "real" an appender which will create the one and only array, i.e. it
 will not randomly create a copy of array on appending. Such copy is bad as:
 1. Almost no use.
     In 99% appender is used as "real" appender and the copy is not needed.
I've thought about this before. The *only* case this would really change anything, is on structs that have an elaborate CC. Arguably, these things have no business being in a GC managed array, since it never finalizes its elements (and arguably, if you have a CC, you have a destructor). If we ever get a finalizing GC, then things will be different. One "workaround" I've though about is that when the type has a CC, Appender could carry an extra "ownsArray" boolean. If appender must relocate its array, and "owns" the array, then postblit is not necessary. In the future, if the GC is finalizing, it may also have to re-initialize to T.init, if there is an elaborate destructor.
 Obviously, "real" appender is uncopyable
Appender having ref semantics, there would be absolutely no problem "copying" it (read: pass by value).
 and don't give safe access to its
 array until it finished.
I can see the motivation behind this "real" appender. Another (dis)advantage it has is that it wouldn't be initialize-able from an existing array. This means it would have 100% ownership of its internals, but on the other hand, it also means, it also means it can't extend an exiting array :/ So, "at the end of the day", I'm not really sure a "real" appender would pull its weight compared to the standard appender (IMO). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 29 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11138




19:12:23 MSD ---


 3. Needless restriction on element type.
     ...
Appender supports immutable construction (mostly bug free, AFAIK). I don't think a "real" appender would do any better than appender could anyways, so this is a non-argument (IMO).
Mutability have nothing to do with copyability. Struct with disabled postblits are meant here. And current appender design can't accept them.
 2. Performance.
     It will cause calls of postblits and then destructors on collection of copy
 array.
Technically, the destructors don't need to be called, since the GC will leak them anyways :/ See below about "postblits"
 Lets call "real" an appender which will create the one and only array, i.e. it
 will not randomly create a copy of array on appending. Such copy is bad as:
 1. Almost no use.
     In 99% appender is used as "real" appender and the copy is not needed.
I've thought about this before. The *only* case this would really change anything, is on structs that have an elaborate CC. Arguably, these things have no business being in a GC managed array, since it never finalizes its elements (and arguably, if you have a CC, you have a destructor). If we ever get a finalizing GC, then things will be different.
So, you are talking that GC bug must force us to do bad design? I will accept it only for an iron-WONTFIXed one.
 One "workaround" I've though about is that when the type has a CC, Appender
 could carry an extra "ownsArray" boolean. If appender must relocate its array,
 and "owns" the array, then postblit is not necessary. In the future, if the GC
 is finalizing, it may also have to re-initialize to T.init, if there is an
 elaborate destructor.
Such flag will not work as Appender API allows safe access to the array so the array can never be owned.
 Obviously, "real" appender is uncopyable
Appender having ref semantics, there would be absolutely no problem "copying" it (read: pass by value).
Yes. And such ref semantic is not generally needed and just results in needless GC allocation in stuff aimed for performance-critical tasks.
 I can see the motivation behind this "real" appender. Another (dis)advantage it
 has is that it wouldn't be initialize-able from an existing array. This means
 it would have 100% ownership of its internals, but on the other hand, it also
 means, it also means it can't extend an exiting array :/
It of course can't initialize from existing array if element isn't copyable. Otherwise it will create its own copy of array. Where is the problem? Also we talk here about very rare usage of the appender which I'd rather ignore just like the case when its content is needed in the middle of appending as non-general.
 So, "at the end of the day", I'm not really sure a "real" appender would pull
 its weight compared to the standard appender (IMO).
Exactly what I expect for each my proposal. :) So as always I will just create one for my own use. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 29 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11138







 3. Needless restriction on element type.
     ...
Appender supports immutable construction (mostly bug free, AFAIK). I don't think a "real" appender would do any better than appender could anyways, so this is a non-argument (IMO).
Mutability have nothing to do with copyability. Struct with disabled postblits are meant here. And current appender design can't accept them.
Right. Sorry. Wait. How do you append something that can't be copied? Is this even relevant?
 I've thought about this before. The *only* case this would really change
 anything, is on structs that have an elaborate CC. Arguably, these things have
 no business being in a GC managed array, since it never finalizes its elements
 (and arguably, if you have a CC, you have a destructor). If we ever get a
 finalizing GC, then things will be different.
So, you are talking that GC bug must force us to do bad design? I will accept it only for an iron-WONTFIXed one.
You have a point.
 One "workaround" I've though about is that when the type has a CC, Appender
 could carry an extra "ownsArray" boolean. If appender must relocate its array,
 and "owns" the array, then postblit is not necessary. In the future, if the GC
 is finalizing, it may also have to re-initialize to T.init, if there is an
 elaborate destructor.
Such flag will not work as Appender API allows safe access to the array so the array can never be owned.
Not really, the appender API is rich enough for appender to "know" if anybody else has yet accessed the array, either through arr, or via construction. If no one else has a handle on the array, and appender is about to relocate, then it can pretty much do anything it wants. Maybe the word "owns" is inadequate. The bool would be called "arrayHasExternalView".
 Obviously, "real" appender is uncopyable
Appender having ref semantics, there would be absolutely no problem "copying" it (read: pass by value).
Yes. And such ref semantic is not generally needed and just results in needless GC allocation in stuff aimed for performance-critical tasks.
I think appender was mostly to offset how ridiculously slow ~= is. I'm not sure it wasn't design with *absolute* performance in mind. BTW, you might want to know about: http://d.puremagic.com/issues/show_bug.cgi?id=10864 Off topic: I'd like to try to make a push for some templatization in druntime. I thought you might be interested in the project. I'll keep you posted.
 I can see the motivation behind this "real" appender. Another (dis)advantage it
 has is that it wouldn't be initialize-able from an existing array. This means
 it would have 100% ownership of its internals, but on the other hand, it also
 means, it also means it can't extend an exiting array :/
It of course can't initialize from existing array if element isn't copyable. Otherwise it will create its own copy of array. Where is the problem?
You are creating a copy. Appender can be used to append to an existing array, whith lots of capacity, without relocating it. This can be very useful for say, processing lines in IO, where you are re-using the same buffer for each line.
 Also we
 talk here about very rare usage of the appender which I'd rather ignore just
 like the case when its content is needed in the middle of appending as
 non-general.
Well very rare remains to be proven. Still, I don't think it's that big of a problem if a "real" appender simply duplicated before allocating: To each their own behavior. You can't get "super speed" if you aren't willing to give appended total ownership of the passed data.
 So, "at the end of the day", I'm not really sure a "real" appender would pull
 its weight compared to the standard appender (IMO).
Exactly what I expect for each my proposal. :) So as always I will just create one for my own use.
-- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 29 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11138


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com



03:26:03 PDT ---


 3. Needless restriction on element type.
     One may want to create and array of uncopyable type which non-"real"
 appender can't do by definition.
Appender supports immutable construction (mostly bug free, AFAIK). I don't think a "real" appender would do any better than appender could anyways, so this is a non-argument (IMO).
Construct in place. See also C++11 emplace_back and friends. Even these guys see the benefit.
 2. Performance.
     It will cause calls of postblits and then destructors on collection of copy
 array.
Technically, the destructors don't need to be called, since the GC will leak them anyways :/ See below about "postblits"
Who told you that a long-standing bug allows us to basically do even worse job then we have to ? ;)
 Lets call "real" an appender which will create the one and only array, i.e. it
 will not randomly create a copy of array on appending. Such copy is bad as:
 1. Almost no use.
     In 99% appender is used as "real" appender and the copy is not needed.
I've thought about this before. The *only* case this would really change anything, is on structs that have an elaborate CC. Arguably, these things have no business being in a GC managed array, since it never finalizes its elements (and arguably, if you have a CC, you have a destructor). If we ever get a finalizing GC, then things will be different. One "workaround" I've though about is that when the type has a CC, Appender could carry an extra "ownsArray" boolean. If appender must relocate its array, and "owns" the array, then postblit is not necessary. In the future, if the GC is finalizing, it may also have to re-initialize to T.init, if there is an elaborate destructor.
Sadly Appender leaks refernces to its internal array so it's can own the data at all.
 Obviously, "real" appender is uncopyable
Appender having ref semantics, there would be absolutely no problem "copying" it (read: pass by value).
IMHO it could just be uncopyable type and RefCounted!T should solve the issue of passing it as reference anywhere.
 and don't give safe access to its
 array until it finished.
I can see the motivation behind this "real" appender. Another (dis)advantage it has is that it wouldn't be initialize-able from an existing array. This means it would have 100% ownership of its internals, but on the other hand, it also means, it also means it can't extend an exiting array :/
You miss a big use case. Frankly patching Appender on top of an array is a minor thing that buys us a tiny bit over assumeSafeAppend. Doing a lot of reallocations in the process of building and not owning data is what makes Appender very poor at constructing ... array-like containers.
 So, "at the end of the day", I'm not really sure a "real" appender would pull
 its weight compared to the standard appender (IMO).
Then call it a Builder! Such a builder has a chance at good API design unlike the current abomination. In particular it could accumulate data in chunks and never reallocate until the very end. Secondly while assembling the final array/container it may move data freely both supporting non-copyable stuff _and_ being more efficient. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Oct 20 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11138




03:32:42 PDT ---

 Lets call "real" an appender which will create the one and only array, i.e. it
 will not randomly create a copy of array on appending. Such copy is bad as:
 1. Almost no use.
     In 99% appender is used as "real" appender and the copy is not needed.
 2. Performance.
     It will cause calls of postblits and then destructors on collection of copy
 array.
 3. Needless restriction on element type.
     One may want to create and array of uncopyable type which non-"real"
 appender can't do by definition.
4. Support building stuff other then built-in array on GC heap. 5. Deterministic memory management. There is no point in keeping discarded memory flaoting around after the final array is produced. 6. (sky is the limit) Use the allocators if such an ephemeral thing ever comes to materialize.
 Obviously, "real" appender is uncopyable and don't give safe access to its
 array until it finished.
Honestly the thing current Appender is used for is a Builder. With rebranding it as Builder and few extra points I'm all for it :) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Oct 20 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11138


Denis Shelomovskij <verylonglogin.reg gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|nobody puremagic.com        |verylonglogin.reg gmail.com



15:45:49 MSD ---

 With rebranding it as Builder and few extra points I'm all for it :)
And you will get it. Thanks for the positive response. As always I will create it in my Unstandard library first and make an inclusion proposal. But not sure I will do it fast. Everybody if free to reassign the issue to him, I will provide all my related code and ideas. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Oct 20 2013