digitalmars.D - what to do with postblit on the heap?
- Steven Schveighoffer (29/29) Jun 20 2011 I have submitted a fix for bug 5272,
- bearophile (7/11) Jun 20 2011 I think the current situation is not acceptable. This is a problem quite...
- Steven Schveighoffer (22/40) Jun 20 2011 The compiler is the one passing the parameters to _d_arraycopy, so even ...
- Jonathan M Davis (5/13) Jun 20 2011 Plain Old Datatype. It's a user-defined data type with member variables ...
- bearophile (5/16) Jun 20 2011 Given that D is a system language, and the general usefulness and ubiqui...
- Michel Fortin (17/53) Jun 20 2011 My feeling is that array appending and array assignment should be
- Steven Schveighoffer (21/63) Jun 20 2011 BTW, I now feel that your request to make a distinction between move and...
- Jonathan M Davis (5/70) Jun 20 2011 If an object is moved, neither the postblit nor the destructor should be...
- Steven Schveighoffer (13/23) Jun 20 2011 Well, I think in this case it is being copied. It's put on the stack, a...
- Jonathan M Davis (22/47) Jun 20 2011 Well, going from the stack to the heap probably is a copy. But moves sho...
- Michel Fortin (29/59) Jun 20 2011 Well, if
- Jonathan M Davis (9/37) Jun 20 2011 I would expect that to have move semantics. There's no need to create an...
- Steven Schveighoffer (52/99) Jun 21 2011 Good question. I don't even know how the runtime could avoid calling
- Michel Fortin (23/69) Jun 21 2011 ... and in the special case where the reference is a rvalue, then it
- Steven Schveighoffer (10/16) Jun 21 2011 Another issue with appending a @disabled-postblit struct, what happens
- Michel Fortin (10/27) Jun 21 2011 That's indeed a problem.
- so (16/31) Jun 21 2011 It should be something else because move(tmp) in std.algorithm takes by ...
- Michel Fortin (17/53) Jun 21 2011 Actually, no copy is needed. Move takes the argument by ref so it can
- so (11/16) Jun 21 2011 T move(ref T a) {
- Michel Fortin (17/37) Jun 21 2011 Actually, that depends on how you look at this.
- Andrei Alexandrescu (6/23) Jun 21 2011 The rule that move and TDPL rely on but is not fully implemented is that...
- Sean Kelly (5/7) Jun 21 2011 that returning a nonstatic local value never does a postblit nor a =
- Andrei Alexandrescu (4/8) Jun 21 2011 Illegal. All D structs must be transparently relocatable without
- so (5/10) Jun 21 2011 There was a similar discussion on struct constructors which ended up
I have submitted a fix for bug 5272, http://d.puremagic.com/issues/show_bug.cgi?id=5272 "Postblit not called on copying due to array append"

However, I am starting to realize that one of the major reasons for postblit is to match it with an equivalent dtor. This works well when the struct is on the stack -- the postblit, for instance, increments a reference counter, then the dtor decrements the ref counter.

But when the data is on the heap, the destructor is *not* called. So what happens to any ref-counted data that is on the heap? It's never decremented. Currently though, it might still work, because postblit isn't called when the data is on the heap! So no increment, no decrement. I think this is an artificial "success".

However, if the pull request I initiated is accepted, then postblit *will* be called on heap allocation, for instance if you append data. This will further highlight the fact that the destructor is not being called.

So is it worth adding calls to postblit, knowing that the complementary destructor is not going to be called? I can see some cases where it would be expected, and I can see other cases where it will be difficult to deal with. IMO, the difficult cases are already broken anyway; it just seems like they are not.

The other part of this puzzle that is missing is array assignment: for example, a[] = b[] does not call postblits. I cannot fix this because _d_arraycopy does not give me the typeinfo.

Anyone else have any thoughts? I'm mixed as to whether this patch should be accepted without more comprehensive GC/compiler reform. I feel it's a step in the right direction, but that it will upset the balance in a few places (particularly ref-counting).

-Steve
Jun 20 2011
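The postblit/destructor pairing described in the opening post can be sketched roughly as follows. This is a minimal illustration and not code from the thread: the `Counted` struct and the `stackDemo` function are hypothetical names, and the counter is simply a GC-allocated int.

```d
import std.stdio;

// Hypothetical ref-counted handle, for illustration only.
struct Counted
{
    int* count; // counter shared by all copies of this handle

    this(int start)
    {
        count = new int;
        *count = start;
    }

    // Postblit: runs on the new copy after the bitwise copy,
    // so the copy registers itself with the shared counter.
    this(this)
    {
        if (count) *count += 1;
    }

    // Destructor: the decrement that is meant to match the postblit.
    ~this()
    {
        if (count) *count -= 1;
    }
}

// Records the counter while a second stack copy is alive,
// and again after that copy has gone out of scope.
int[2] stackDemo()
{
    int[2] seen;
    auto a = Counted(1);
    {
        auto b = a;           // copy: postblit runs
        seen[0] = *a.count;
    }                         // b leaves scope: destructor runs
    seen[1] = *a.count;
    return seen;
}

void main()
{
    writeln(stackDemo());
}
```

On the stack, each postblit is matched by a destructor call, so the counter returns to its earlier value once the copy goes out of scope. The concern raised in the thread is that copies living in GC-allocated arrays do not get that matching destructor call.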
Steven Schveighoffer:

> The other part of this puzzle that is missing is array assignment: for example, a[] = b[] does not call postblits. I cannot fix this because _d_arraycopy does not give me the typeinfo.

This seems fixable. Is it possible to rewrite _d_arraycopy?

> Anyone else have any thoughts?

I think the current situation is not acceptable. This is a considerably worse problem than _d_arraycopy because here some information is missing. Isn't this the same problem as with struct destructors?

A solution is to add this information at runtime, a type tag to structs that have a postblit and/or destructor. But then structs aren't PODs any more. There are other places to store this information, like in some kind of associative array.

Another solution is to forbid what the compiler can't guarantee. If a struct is going to be used only where its type is known, then it's allowed to have postblit and destructor. Is it possible to enforce this? I think it is. Here an annotation is useful to better manage this contract between programmer and compiler.

Bye,
bearophile
Jun 20 2011
On Mon, 20 Jun 2011 11:03:27 -0400, bearophile <bearophileHUGS lycos.com> wrote:

> Steven Schveighoffer:
>> The other part of this puzzle that is missing is array assignment: for example, a[] = b[] does not call postblits. I cannot fix this because _d_arraycopy does not give me the typeinfo.
>
> This seems fixable. Is it possible to rewrite _d_arraycopy?

The compiler is the one passing the parameters to _d_arraycopy, so even if I change _d_arraycopy to accept a TypeInfo, the compiler needs to be fixed to send the TypeInfo. I think this is really a no-brainer, because currently what is passed is the element size, which is contained within the TypeInfo. I will be filing a bug on that. But currently, I can't fix it.

>> Anyone else have any thoughts?
>
> I think the current situation is not acceptable. This is a considerably worse problem than _d_arraycopy because here some information is missing. Isn't this the same problem as with struct destructors?

This is an easy fix -- the typeinfo contains information on whether and how to run the postblit. The larger problem is the GC not calling the destructor. But my immediate question is -- is it better to half-fix the problem by committing my changes, or leave the issue alone?

> A solution is to add this information at runtime, a type tag to structs that have a postblit and/or destructor. But then structs aren't PODs any more. There are other places to store this information, like in some kind of associative array.

Any solution that fixes the GC problem will have to store the typeinfo somehow associated with the block. I think we may have more traction for this problem with a precise GC. I don't think the right route is to store type info inside the struct itself. This added overhead is not necessary when the struct is stored on the stack.

> Another solution is to forbid what the compiler can't guarantee. If a struct is going to be used only where its type is known, then it's allowed to have postblit and destructor. Is it possible to enforce this? I think it is. Here an annotation is useful to better manage this contract between programmer and compiler.

This is a possibility, making a struct only usable if it's inside another such struct or inside a class, or on the stack.

-Steve
Jun 20 2011
On 2011-06-20 11:56, Jose Armando Garcia wrote:

> On Mon, Jun 20, 2011 at 12:03 PM, bearophile <bearophileHUGS lycos.com> wrote:
>> Steven Schveighoffer:
>> A solution is to add this information at runtime, a type tag to structs that have a postblit and/or destructor. But then structs aren't PODs any more. There are other places to store this information, like in some kind of associative array.
>
> What are PODs?

Plain Old Datatype. It's a user-defined data type with member variables but no functions. It just holds data.

- Jonathan M Davis
Jun 20 2011
Steven Schveighoffer:

> But my immediate question is -- is it better to half-fix the problem by committing my changes, or leave the issue alone?

I suggest leaving the issue alone.

> Any solution that fixes the GC problem will have to store the typeinfo somehow associated with the block. I think we may have more traction for this problem with a precise GC. I don't think the right route is to store type info inside the struct itself. This added overhead is not necessary when the struct is stored on the stack.

> This is a possibility, making a struct only usable if it's inside another such struct or inside a class, or on the stack.

Given that D is a system language, and the general usefulness and ubiquity of structs, a third possibility is to do both: add an attribute to help enforce what can't be done on PODs, or add more runtime info _on request_ where the programmer wants more flexible structs. This solves the situation, but has the disadvantage of increasing D's complexity a little.

Bye,
bearophile
Jun 20 2011
On 2011-06-20 10:34:14 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:

> [...]
> Anyone else have any thoughts? I'm mixed as to whether this patch should be accepted without more comprehensive GC/compiler reform. I feel it's a step in the right direction, but that it will upset the balance in a few places (particularly ref-counting).

My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.

As for the issue that destructors aren't called for arrays on the heap, it's a serious problem. But it's also a separate problem that concerns purely the runtime, as far as I am aware. Is there someone working on it?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 20 2011
On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin michelf.com> wrote:

> On 2011-06-20 10:34:14 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:
>> [...]
>> Anyone else have any thoughts? I'm mixed as to whether this patch should be accepted without more comprehensive GC/compiler reform. I feel it's a step in the right direction, but that it will upset the balance in a few places (particularly ref-counting).
>
> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.

BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

If the issue of array assignment is fixed, do you think it's worth putting the change in, and then filing a bug against the GC? I still think the current cases that "work" are fundamentally broken anyways. For instance, with a ref-counted struct, if you append it to an array and then remove all the stack-based references, the ref count goes to zero even though the array still has a reference (I think someone filed a bug against std.stdio.File for this).

> As for the issue that destructors aren't called for arrays on the heap, it's a serious problem. But it's also a separate problem that concerns purely the runtime, as far as I am aware. Is there someone working on it?

I think we need precise scanning to get a complete solution. Another option is to increase the information the array runtime stores in the memory block (currently it only stores the "used" length) and then hook the GC to call the dtors. This might be a quick fix that doesn't require precise scanning, but it also fixes the most common case of allocating a single struct or an array of structs on the heap.

-Steve
Jun 20 2011
On 2011-06-20 15:12, Steven Schveighoffer wrote:

> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin michelf.com> wrote:
>> On 2011-06-20 10:34:14 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:
>>> [...]
>>
>> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.
>
> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

If an object is moved, neither the postblit nor the destructor should be called. The object is moved, not copied and destroyed. I believe that TDPL is very specific on that.

- Jonathan M Davis
Jun 20 2011
On Mon, 20 Jun 2011 18:43:30 -0400, Jonathan M Davis <jmdavisProg gmx.com> wrote:

> On 2011-06-20 15:12, Steven Schveighoffer wrote:
>> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.
>
> If an object is moved, neither the postblit nor the destructor should be called. The object is moved, not copied and destroyed. I believe that TDPL is very specific on that.

Well, I think in this case it is being copied. It's put on the stack, and then copied to the heap inside the runtime function. The runtime could be passed a flag indicating the append is really a move, but I'm not sure it's a good choice. To me, not calling the postblit and dtor on a moved struct is an optimization, no? And you can't re-implement these semantics for a normal function. The one case I can think of is when an rvalue is allowed to be passed by reference (which is exactly what's happening here).

Is there anything a postblit is allowed to do that would break a struct if you disabled the postblit in this case? I'm pretty sure internal pointers are not supported, especially if move semantics do not call the postblit.

-Steve
Jun 20 2011
On 2011-06-20 16:07, Steven Schveighoffer wrote:

> On Mon, 20 Jun 2011 18:43:30 -0400, Jonathan M Davis <jmdavisProg gmx.com> wrote:
>> On 2011-06-20 15:12, Steven Schveighoffer wrote:
>>> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.
>>
>> If an object is moved, neither the postblit nor the destructor should be called. The object is moved, not copied and destroyed. I believe that TDPL is very specific on that.
>
> Well, I think in this case it is being copied. It's put on the stack, and then copied to the heap inside the runtime function. The runtime could be passed a flag indicating the append is really a move, but I'm not sure it's a good choice. To me, not calling the postblit and dtor on a moved struct is an optimization, no? And you can't re-implement these semantics for a normal function. The one case I can think of is when an rvalue is allowed to be passed by reference (which is exactly what's happening here).

Well, going from the stack to the heap probably is a copy. But moves shouldn't be calling the postblit or the destructor, and you seemed to be saying that they should. The main place that I can think of where a move would occur is when returning a value from a function, which is very different.

And I don't think that avoiding the postblit is necessarily just an optimization. If the postblit really is skipped, then it's probably possible to return an object which cannot legally be copied (presumably due to some combination of reference or pointer member variables and const or immutable), though that wouldn't exactly be a typical situation, even if it actually is possible. It _is_ primarily an optimization to move rather than copy and destroy, but I'm not sure that it's _just_ an optimization.

> Is there anything a postblit is allowed to do that would break a struct if you disabled the postblit in this case? I'm pretty sure internal pointers are not supported, especially if move semantics do not call the postblit.

If the struct had a pointer to a local member variable which the postblit would have deep-copied, then sure, not calling the postblit would screw with the struct. But that would screw with a struct which was returned from a function as well, and that's the prime place for the move semantics. That sort of struct is just plain badly designed, so I don't think that it's really something to worry about. I can't think of any other cases where it would be a problem though. Structs don't usually care where they live (aside from the issue of structs being designed to live on the stack and then not getting their destructor called because they're on the heap).

- Jonathan M Davis
Jun 20 2011
On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:

> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin michelf.com> wrote:
>> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.
>
> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

Well, if

    a ~= S();

does result in a temporary which gets copied and then destroyed, why have move semantics at all?

Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a @disabled postblit, should it still be appendable?

> If the issue of array assignment is fixed, do you think it's worth putting the change in, and then filing a bug against the GC? I still think the current cases that "work" are fundamentally broken anyways.

That depends. I'm not too sure currently whether the S destructor is called for this code:

    a ~= S();

If the compiler currently calls the destructor on the temporary S struct, then your patch is actually a fix because it balances constructors and destructors correctly for the appending part (the bug is then that the compiler should use move semantics but is using copy instead). If it doesn't call the destructor, then your patch does introduce a bug for this case.

All in all, I don't think it's important enough to justify wasting hours debating in what order we should fix those bugs. Do what you think is right. If it becomes a problem or it introduces a bug here or there, we'll adjust; at worst that means a revert of your commit.

>> As for the issue that destructors aren't called for arrays on the heap, it's a serious problem. But it's also a separate problem that concerns purely the runtime, as far as I am aware. Is there someone working on it?
>
> I think we need precise scanning to get a complete solution. Another option is to increase the information the array runtime stores in the memory block (currently it only stores the "used" length) and then hook the GC to call the dtors. This might be a quick fix that doesn't require precise scanning, but it also fixes the most common case of allocating a single struct or an array of structs on the heap.

The GC calling the destructor doesn't require precise scanning. Although it's true that both problems require adding type information to memory blocks, beyond that requirement they're both independent. It'd be really nice if struct destructors were called correctly.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 20 2011
On 2011-06-20 18:59, Michel Fortin wrote:

> On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:
>> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin michelf.com> wrote:
>>> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.
>>
>> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.
>
> Well, if a ~= S(); does result in a temporary which gets copied and then destroyed, why have move semantics at all?
>
> Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a @disabled postblit, should it still be appendable?

I would expect that to have move semantics. There's no need to create and destroy a temporary. It's completely wasteful. A copy should only be happening when a copy _needs_ to happen. It doesn't need to happen here. Now, depending on what ~= did internally (assuming that it were an overloaded operator), a copy may end up occurring inside of the function, but that shouldn't happen for the built-in ~= operator, and a well-written overloaded ~= should avoid the need to copy as well.

- Jonathan M Davis
Jun 20 2011
On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin <michel.fortin michelf.com> wrote:

> On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:
>> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin michelf.com> wrote:
>>> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.
>>
>> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.
>
> Well, if a ~= S(); does result in a temporary which gets copied and then destroyed, why have move semantics at all?
>
> Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a @disabled postblit, should it still be appendable?

Good question. I don't even know how the runtime could avoid calling postblit; there is no flag saying the postblit is disabled in the typeinfo (that I know of).

But think about it this way, if you have a function foo:

    void foo(S)(ref S s, S[] arr)
    {
        arr[0] = s;
    }

Isn't this copy semantics? This is exactly how the D runtime gets the data. The only difference is, the runtime function is allowed to accept a temporary as a reference (not possible in a normal function).

Now, you could force move semantics, if you know the argument is an rvalue, but I don't know enough about what postblit is used for in order to say it's fine to use move semantics to move the struct into the heap.

The reason I say move semantics are an optimization is because:

    {
        S tmp;
        arr ~= tmp;
    }

is essentially equivalent to:

    arr ~= S();

But the former is copy semantics, the latter can be considered move. It seems like a smart compiler during optimization could rewrite the former as the latter, unless the semantics truly are different. Which is why I'm trying to figure out how postblit can be used ;)

>> If the issue of array assignment is fixed, do you think it's worth putting the change in, and then filing a bug against the GC? I still think the current cases that "work" are fundamentally broken anyways.
>
> That depends. I'm not too sure currently whether the S destructor is called for this code: a ~= S();

It is, I tested it. I ran this code:

    import std.stdio;

    struct Test
    {
        this(this) { writeln("copy done"); }
        void opAssign(Test rhs) { writeln("assignment done"); }
        ~this() { writeln("destructor called"); }
    }

    void main()
    {
        Test[] tests = new Test[1];
        {
            // Option 1: append an lvalue
            // Test test;
            // tests ~= test;

            // Option 2: append a temporary
            tests ~= Test();
        }
        writeln("done");
    }

and saw "destructor called" in the output, no matter which option was commented out.

> All in all, I don't think it's important enough to justify wasting hours debating in what order we should fix those bugs. Do what you think is right. If it becomes a problem or it introduces a bug here or there, we'll adjust; at worst that means a revert of your commit.

OK, then I'll push the change. I already filed a bug against _d_arraycopy.

>>> As for the issue that destructors aren't called for arrays on the heap, it's a serious problem. But it's also a separate problem that concerns purely the runtime, as far as I am aware. Is there someone working on it?
>>
>> I think we need precise scanning to get a complete solution. Another option is to increase the information the array runtime stores in the memory block (currently it only stores the "used" length) and then hook the GC to call the dtors. This might be a quick fix that doesn't require precise scanning, but it also fixes the most common case of allocating a single struct or an array of structs on the heap.
>
> The GC calling the destructor doesn't require precise scanning. Although it's true that both problems require adding type information to memory blocks, beyond that requirement they're both independent. It'd be really nice if struct destructors were called correctly.

Yes, the more I think about it, the more this solution looks attractive. All that is required is to flag the block as having a finalizer, store the TypeInfo pointer somewhere, and the GC should call it. I'll put in a bugzilla enhancement so it's not forgotten.

-Steve
Jun 21 2011
On 2011-06-21 07:34:24 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:

> On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin <michel.fortin michelf.com> wrote:
>> Well, if a ~= S(); does result in a temporary which gets copied and then destroyed, why have move semantics at all?
>>
>> Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a @disabled postblit, should it still be appendable?
>
> Good question. I don't even know how the runtime could avoid calling postblit; there is no flag saying the postblit is disabled in the typeinfo (that I know of).
>
> But think about it this way, if you have a function foo:
>
>     foo(S)(ref S s, S[] arr) { arr[0] = s; }
>
> Isn't this copy semantics? This is exactly how the D runtime gets the data. The only difference is, the runtime function is allowed to accept a temporary as a reference (not possible in a normal function).

... and in the special case where the reference is an rvalue, then it should have move semantics. See below.

> Now, you could force move semantics, if you know the argument is an rvalue, but I don't know enough about what postblit is used for in order to say it's fine to use move semantics to move the struct into the heap.
>
> The reason I say move semantics are an optimization is because:
>
>     { S tmp; arr ~= tmp; }
>
> is essentially equivalent to:
>
>     arr ~= S();
>
> But the former is copy semantics, the latter can be considered move. It seems like a smart compiler during optimization could rewrite the former as the latter, unless the semantics truly are different. Which is why I'm trying to figure out how postblit can be used ;)

Actually, this should be the equivalent:

    import std.algorithm;

    S tmp;
    arr ~= move(tmp);

While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disabled attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that only to the choice of the optimizer.

Things might be clearer if we had a move operator, but instead we have a 'move' function.

There is only one case where I think we can assume to have move semantics: when a temporary (an rvalue) is assigned to somewhere. That's also all that's needed for the 'move' function to work. And that is broken currently when it comes to array appending.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 21 2011
On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin <michel.fortin michelf.com> wrote:

> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disable attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that only to the choice of the optimizer.

Another issue with appending a @disabled-postblit struct: what happens when you have to reallocate a block to get more space? This cannot possibly be a move, because the compiler has no idea at the time of appending whether anything else has a reference to the original data. So should it just be a runtime error?

I'm starting to think that disabled postblit structs *shouldn't* be able to be appended.

-Steve
Jun 21 2011
On 2011-06-21 08:38:05 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:

> On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin <michel.fortin michelf.com> wrote:
>> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disable attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that only to the choice of the optimizer.
>
> Another issue with appending a @disabled-postblit struct: what happens when you have to reallocate a block to get more space? This cannot possibly be a move, because the compiler has no idea at the time of appending whether anything else has a reference to the original data. So should it just be a runtime error?

That's indeed a problem.

> I'm starting to think that disabled postblit structs *shouldn't* be able to be appended.

That would make sense. It should be a compile-time error. It would also turn appending using move into an optimization, because all the types you can append will be guaranteed to be copyable.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
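To make the "compile-time error" idea concrete, here is a minimal sketch of how a disabled postblit already rejects ordinary copies at compile time while an explicit `move` remains possible. `NoCopy` and its `handle` field are hypothetical names chosen for illustration. The sketch deliberately says nothing about whether `arr ~= x` is accepted for such a type, since that depends on the compiler and runtime version.

```d
import std.algorithm.mutation : move;
import std.traits : isCopyable;

// Hypothetical type for illustration: it owns a handle and forbids copies.
struct NoCopy
{
    int handle;
    @disable this(this);   // disabled postblit
}

// An ordinary copy of an lvalue is rejected at compile time.
static assert(!isCopyable!NoCopy);
static assert(!__traits(compiles, { NoCopy a; NoCopy b = a; }));

void main()
{
    auto a = NoCopy(7);
    // An explicit move transfers the bits without running a postblit,
    // so it is still accepted for a non-copyable type.
    auto b = move(a);
    assert(b.handle == 7);
}
```

If appending were restricted to copyable types, a check along the lines of `isCopyable!T` is the kind of compile-time test that could express it.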
Jun 21 2011
On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin <michel.fortin michelf.com> wrote:

> Actually, this should be the equivalent:
>
>     import std.algorithm;
>     S tmp;
>     arr ~= move(tmp);
>
> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disable attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that only to the choice of the optimizer.
>
> Things might be clearer if we had a move operator, but instead we have a 'move' function. There is only one case where I think we can assume to have move semantics: when a temporary (an rvalue) is assigned to somewhere. That's also all that's needed for the 'move' function to work. And that is broken currently when it comes to array appending.

It should be something else, because move(tmp) in std.algorithm takes its argument by reference and returns by value, actually moving it. Because D has value semantics and can differentiate a value from a reference, it doesn't need any other syntax, which is much better. I think it is pretty neat, yet I still have some trouble understanding its effect here.

    S tmp;
    arr ~= move(tmp); // would make an unnecessary copy.

Move should do some kind of magic there and treat its argument like a value, and return it. Something like:

    move(ref T a)
        return cast(T)a;

Maybe it makes no sense at all, but I tried!
Jun 21 2011
On 2011-06-21 09:24:29 -0400, so <so so.so> said:

> On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin <michel.fortin michelf.com> wrote:
>> Actually, this should be the equivalent:
>>
>>     import std.algorithm;
>>     S tmp;
>>     arr ~= move(tmp);
>>
>> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disable attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that only to the choice of the optimizer.
>>
>> Things might be clearer if we had a move operator, but instead we have a 'move' function. There is only one case where I think we can assume to have move semantics: when a temporary (an rvalue) is assigned to somewhere. That's also all that's needed for the 'move' function to work. And that is broken currently when it comes to array appending.
>
> It should be something else, because move(tmp) in std.algorithm takes its argument by reference and returns by value, actually moving it. Because D has value semantics and can differentiate a value from a reference, it doesn't need any other syntax, which is much better. I think it is pretty neat, yet I still have some trouble understanding its effect here.
>
>     S tmp;
>     arr ~= move(tmp); // would make an unnecessary copy.
>
> Move should do some kind of magic there and treat its argument like a value, and return it.

Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.

Note 1: Currently 'move' obliterates the source only if the type has a destructor or a postblit. I think it should always do it, but without inlining that might be a performance bottleneck.

Note 2: Making move efficient in the case of appending might require a total rework of how the compiler interacts with the runtime. And I don't think you can optimize away all blitting unless the move function was treated specially by the compiler (or became a special operator).

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 21 2011
On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin <michel.fortin michelf.com> wrote:

> Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.

    T move(ref T a) {
        T b;
        move(a, b);
        return b;
    }

    T a;
    whatever = move(a);

If T is a struct, I don't see how a copy is not needed, looking at the current state of move.
Jun 21 2011
On 2011-06-21 12:13:32 -0400, so <so so.so> said:

> On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin <michel.fortin michelf.com> wrote:
>> Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.
>
>     T move(ref T a) {
>         T b;
>         move(a, b);
>         return b;
>     }
>
>     T a;
>     whatever = move(a);
>
> If T is a struct, I don't see how a copy is not needed, looking at the current state of move.

Actually, that depends on how you look at this.

The essence of a move operation is that you just copy the bits and then obliterate the old ones. So yes, there's indeed a copy to do, but there's no need to call a copy constructor or a destructor because no new instance has been created; it has just been moved. If you don't call the copy constructor (postblit) then it's a move operation, not a copy operation, even though there's still a bitwise copy inside the move operation.

In the return statement above, 'b' gets copied to 'whatever', then disappears along with the stack frame belonging to the function. So it becomes a move operation. (And it's even more direct than that with the named return value optimization.)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
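The distinction drawn here (a bitwise copy with or without a postblit call) can be observed directly. The following is a minimal sketch, assuming a current Phobos in which `move` resets the source to its `.init` value when the type defines a postblit or destructor. `Counted`, `postblits`, and `payload` are hypothetical names for illustration.

```d
import std.algorithm.mutation : move;

// Hypothetical type for illustration: counts how often its postblit runs.
struct Counted
{
    static int postblits;   // incremented by every postblit call
    int payload;
    this(this) { ++postblits; }
}

void main()
{
    auto a = Counted(42);

    auto b = a;             // copy: bits are blitted, then the postblit runs
    assert(Counted.postblits == 1);

    auto c = move(a);       // move: bits are blitted, no postblit runs
    assert(Counted.postblits == 1);
    assert(c.payload == 42);

    // The source has been "obliterated": its bytes now equal Counted.init.
    assert(a.payload == 0);
}
```

Both operations copy the same bytes. What makes the second one a move is that no postblit is called and the source is left in its initial state.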
Jun 21 2011
(resending)

On 6/21/11 11:13 AM, so wrote:

> On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin <michel.fortin michelf.com> wrote:
>> Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.
>
>     T move(ref T a) {
>         T b;
>         move(a, b);
>         return b;
>     }
>
>     T a;
>     whatever = move(a);
>
> If T is a struct, I don't see how a copy is not needed, looking at the current state of move.

The rule that move and TDPL rely on but is not fully implemented is that returning a nonstatic local value never does a postblit nor a destructor - it just copies the bits.

Andrei
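The rule can be checked with a postblit counter. This is a minimal sketch, assuming a current compiler in which the rule is implemented (the post above notes it was not yet fully implemented at the time). `Tracked` and `make` are hypothetical names for illustration.

```d
// Hypothetical type for illustration: counts postblit calls.
struct Tracked
{
    static int postblits;
    int id;
    this(this) { ++postblits; }
}

// Returns a nonstatic local by value.
Tracked make(int id)
{
    Tracked t;
    t.id = id;
    return t;   // expected to transfer the bits without calling the postblit
}

void main()
{
    auto x = make(3);
    assert(x.id == 3);
    assert(Tracked.postblits == 0);   // no postblit ran for the return
}
```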
Jun 21 2011
On Jun 21, 2011, at 11:26 AM, Andrei Alexandrescu wrote:

> The rule that move and TDPL rely on but is not fully implemented is that returning a nonstatic local value never does a postblit nor a destructor - it just copies the bits.

So it's effectively illegal to have a struct containing a pointer that references itself, correct?
Jun 21 2011
On 6/21/11 4:24 PM, Sean Kelly wrote:

> On Jun 21, 2011, at 11:26 AM, Andrei Alexandrescu wrote:
>> The rule that move and TDPL rely on but is not fully implemented is that returning a nonstatic local value never does a postblit nor a destructor - it just copies the bits.
>
> So it's effectively illegal to have a struct containing a pointer that references itself, correct?

Illegal. All D structs must be transparently relocatable without breaking their invariant.

Andrei
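A short sketch of why a self-referencing struct cannot survive relocation. The relocation is modeled here by a plain copy of a struct that has no postblit, which is a pure bitwise copy. `SelfRef` and its fields are hypothetical names for illustration.

```d
// Hypothetical type for illustration.
struct SelfRef
{
    int value;
    int* self;   // intended invariant: self points at this instance's value
}

void main()
{
    SelfRef a;
    a.value = 1;
    a.self = &a.value;
    assert(a.self is &a.value);    // invariant holds for the original

    // No postblit is defined, so this only copies the bits,
    // which is all a relocation does.
    SelfRef b = a;

    assert(b.self is &a.value);    // the pointer still refers to the old location
    assert(b.self !is &b.value);   // so the invariant is broken in the new one
}
```

Since nothing runs during a blit, the struct has no opportunity to repair the interior pointer, which is why such types fall outside what the language allows.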
Jun 21 2011
On Tue, 21 Jun 2011 04:59:49 +0300, Michel Fortin <michel.fortin michelf.com> wrote:

> Well, if
>
>     a ~= S();
>
> does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics.

There was a similar discussion on struct constructors, which ended up concluding something like this: that it is an optimization. I fully agree with you that it is not; move exists precisely for reasons like this.
Jun 21 2011