digitalmars.D.learn - Range returning an array
- Joseph Rushton Wakeling (41/41) Apr 09 2013 Hello all,
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (6/8) Apr 09 2013 Your problem is the exact situation that stdio.ByLine has faced. There,
- bearophile (7/12) Apr 09 2013 In similar situations I add a boolean template argument that
- Steven Schveighoffer (33/34) Apr 09 2013 1. documentation. Make sure the user of the range knows that this data ...
- Joseph Rushton Wakeling (10/15) Apr 09 2013 In this case it's quite unlikely anyone will ever want to use the code a...
- Steven Schveighoffer (31/40) Apr 09 2013 Well here is another solution:
- Joseph Rushton Wakeling (26/27) Apr 09 2013 I did consider something like that. The actual one I have come to after
- Joseph Rushton Wakeling (6/7) Apr 09 2013 By the way: the reason that I rejected the temporary-variable choice was...
- Steven Schveighoffer (5/15) Apr 09 2013 Calling front after empty is not good range policy, once empty, front is...
- H. S. Teoh (9/24) Apr 09 2013 [...]
- Steven Schveighoffer (12/33) Apr 09 2013 That is not necessary. A range does not just have to be a range, it jus...
- H. S. Teoh (9/42) Apr 09 2013 [...]
- Jesse Phillips (7/21) Apr 10 2013 I'm pretty sure he realizes this, his original code shows how he
- Joseph Rushton Wakeling (4/10) Apr 11 2013 Agree. I do also see the risks in these assumptions with respect to fro...
- Joseph Rushton Wakeling (11/13) Apr 11 2013 It's not necessarily required to actually call front with the range empt...
Hello all, I have a situation in a program I'm writing where it's convenient to define a range whose return value is an array. Something like: struct MySimulation(T) { T[] var; T diff, convergence; auto front() property { return var; } bool empty() property { return (diff < convergence); } void popFront() { // update values in var // and calculate diff } } Now, the problem with a design like this is that it's unsafe, in the sense that, let's say I do something like, T[] output; foreach(opvar; mySim) output = opvar; ... then, at the end of this loop, output will not be the same as it was set to with the last assignment, because popFront() will have been called one last time prior to the "empty" condition being set to true. Now, there are a number of ways I can think of to avoid this. One is to return var.dup rather than var from front(), but this is undesirable because it will hit performance in a big way. Another might be to calculate the updated value of var in a temporary array, and update var itself only if the diff is greater than the convergence criterion. However, I'm curious enough about this problem that I thought I'd throw it open to everyone else to see if anyone has a better idea than one of these. The challenge here is to ensure really good performance while not risking bad or incorrect results leaking outside of the range. Any thoughts? :-) Thanks & best wishes, -- Joe
Apr 09 2013
On 04/09/2013 01:03 PM, Joseph Rushton Wakeling wrote:to return var.dup rather than var from front(), but this is undesirable because it will hit performance in a big way.Your problem is the exact situation that stdio.ByLine has faced. There, the choice has been to use an internal buffer just like your 'var' and leave the .dup call to the user for when it is needed. That choice is faster but sometimes causes bugs. Ali
Apr 09 2013
Ali Çehreli:Your problem is the exact situation that stdio.ByLine has faced. There, the choice has been to use an internal buffer just like your 'var' and leave the .dup call to the user for when it is needed. That choice is faster but sometimes causes bugs.In similar situations I add a boolean template argument that switches on or off the dupping of the given result. On default it's on. So it's safe on default and faster on request, and this goes well with the D zen. Bye, bearophile
Apr 09 2013
On Tue, 09 Apr 2013 16:03:01 -0400, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:Any thoughts? :-)1. documentation. Make sure the user of the range knows that this data is going to be re-used. 2. Make front() return const if possible. It is another signal that you aren't supposed to keep this data. 3. For your specific situation, add lastFront(): struct MySimulation(T) { T[] var; T[] lastVar; T diff, convergence; auto front() property { return var; } bool empty() property { return (diff < convergence); } void popFront() { lastVar[] = var[]; // update values in var // and calculate diff } auto lastFront() property { return lastVar; } } -Steve
Apr 09 2013
On 04/09/2013 11:02 PM, Steven Schveighoffer wrote:1. documentation. Make sure the user of the range knows that this data is going to be re-used.In this case it's quite unlikely anyone will ever want to use the code apart from me, but yes, good point. :-)2. Make front() return const if possible. It is another signal that you aren't supposed to keep this data. 3. For your specific situation, add lastFront():It's an interesting thought. I don't think it's ultimately the right way to go -- yes, my application rests strongly on finding the last value, but the problem is very simply that popFront kills the value _before_ finding out if the range is now empty. The only logical option that I can see is to tweak things so that doesn't happen, which is possible but probably a little bit more finnicky than the current implementation.
Apr 09 2013
On Tue, 09 Apr 2013 18:09:07 -0400, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:On 04/09/2013 11:02 PM, Steven Schveighoffer wrote:Well here is another solution: struct MySimulation(T) { T[] var; T[] tmpvar; T diff, convergence; auto front() property { return var; } bool empty() property { return (diff < convergence); } void popFront() { tmpvar[] = var[]; // update values in var // and calculate diff if(empty) { var[] = tmpvar[]; // revert var = null; // optional, detach from original array } } } You could also use alloca to avoid storing tmpvar as a struct member and also avoid allocation. -Steve3. For your specific situation, add lastFront():It's an interesting thought. I don't think it's ultimately the right way to go -- yes, my application rests strongly on finding the last value, but the problem is very simply that popFront kills the value _before_ finding out if the range is now empty.
Apr 09 2013
On 04/10/2013 12:33 AM, Steven Schveighoffer wrote:Well here is another solutionI did consider something like that. The actual one I have come to after thinking about it is like this: struct MySimulation(T) { T[] var; T diff = T.max; T convergence; _bool empty = false; auto front() property { return var; } bool empty() property { return _empty; } void popFront() { if(diff < convergence) _empty = true; else // update var and calculate new diff } } How does this look to people?
Apr 09 2013
On 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:I did consider something like that.By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front(). Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).
Apr 09 2013
On Tue, 09 Apr 2013 18:53:56 -0400, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:On 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory. -SteveI did consider something like that.By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front(). Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).
Apr 09 2013
On Tue, Apr 09, 2013 at 07:18:25PM -0400, Steven Schveighoffer wrote:On Tue, 09 Apr 2013 18:53:56 -0400, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:[...] I believe it was proposed that .front should assert if .empty returns true. Personally I think that's a bit too extreme, but nevertheleses, I agree that calling .front after .empty returns true is sloppy coding, and can easily lead to bugs or runtime errors. T -- The two rules of success: 1. Don't tell everything you know. -- YHLOn 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory.I did consider something like that.By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front(). Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).
Apr 09 2013
On Tue, 09 Apr 2013 19:31:59 -0400, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:On Tue, Apr 09, 2013 at 07:18:25PM -0400, Steven Schveighoffer wrote:That is not necessary. A range does not just have to be a range, it just has to support a range interface. It can be perfectly valid for front to not assert. It can be also valid for it to assert. It just means that functions that generically operate on ranges cannot count on front working after empty is true (and that WOULD be a bug). For example, if the OP replaced his range with someone else's, he may run into that situation. But as long as he is the author, it is not illegal for an object that is also a range to be empty and have front still work. -SteveOn Tue, 09 Apr 2013 18:53:56 -0400, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:[...] I believe it was proposed that .front should assert if .empty returns true. Personally I think that's a bit too extreme, but nevertheleses, I agree that calling .front after .empty returns true is sloppy coding, and can easily lead to bugs or runtime errors.On 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory.I did consider something like that.By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front(). Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).
Apr 09 2013
On Tue, Apr 09, 2013 at 09:59:42PM -0400, Steven Schveighoffer wrote:On Tue, 09 Apr 2013 19:31:59 -0400, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:[...] Yes, that was my point. I didn't mean to say that front *should* assert, just that it *might* for some ranges. (In fact, I've written my own ranges that do support calling .front even when .empty is true.) In generic code, though, one should not assume one way or another. T -- Study gravitation, it's a field with a lot of potential.On Tue, Apr 09, 2013 at 07:18:25PM -0400, Steven Schveighoffer wrote:That is not necessary. A range does not just have to be a range, it just has to support a range interface. It can be perfectly valid for front to not assert. It can be also valid for it to assert. It just means that functions that generically operate on ranges cannot count on front working after empty is true (and that WOULD be a bug).On Tue, 09 Apr 2013 18:53:56 -0400, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:[...] I believe it was proposed that .front should assert if .empty returns true. Personally I think that's a bit too extreme, but nevertheleses, I agree that calling .front after .empty returns true is sloppy coding, and can easily lead to bugs or runtime errors.On 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory.I did consider something like that.By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front(). Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).
Apr 09 2013
On Tuesday, 9 April 2013 at 23:18:26 UTC, Steven Schveighoffer wrote:On Tue, 09 Apr 2013 18:53:56 -0400, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:I'm pretty sure he realizes this, his original code shows how he is doing this. He expects 'output' to hold the final front value, but instead it holds the empty "value." Joseph, I think you will have to profile to decide which is the fastest for your use case.By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front(). Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory. -Steve
Apr 10 2013
On 04/10/2013 08:22 PM, Jesse Phillips wrote:I'm pretty sure he realizes this, his original code shows how he is doing this. He expects 'output' to hold the final front value, but instead it holds the empty "value." Joseph, I think you will have to profile to decide which is the fastest for your use case.Agree. I do also see the risks in these assumptions with respect to front -- they can be safely worked around within my code but it's not a very friendly construction to put in front of other people.
Apr 11 2013
On 04/10/2013 01:18 AM, Steven Schveighoffer wrote:Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory.It's not necessarily required to actually call front with the range empty: one could do something like, for(; !sim.empty; sim.popFront()) v = sim.front; ... and the way that popFront() operates means that, while the output of front() may be transient in general, the last value will remain. Though I agree that, even in this case, it's is a potentially problematic situation (there's something weird about the idea of a range that is transient except for the last entry). I will have to profile with .dup enabled and see what changes.
Apr 11 2013