digitalmars.D - Overhauling the notion of output range
- Andrei Alexandrescu (28/28) Jul 11 2010 The notion of output range has been a tad vague in the past; up until
- Andrei Alexandrescu (5/5) Jul 11 2010 On 07/11/2010 08:17 PM, Andrei Alexandrescu wrote:
- dsimcha (21/77) Jul 11 2010 Nice. I had quietly wondered why input ranges with assignable front are...
- Andrei Alexandrescu (7/24) Jul 11 2010 There is no standard way at the moment. I think not throwing implies all...
- Philippe Sigaud (7/13) Jul 11 2010 It's still early where I live, but... For the callable case, why just
- Andrei Alexandrescu (5/19) Jul 11 2010 Good point. I haven't thought of it that way - I used arrays as a lingua...
- Steven Schveighoffer (9/33) Jul 12 2010 Wait, isn't a delegate that accepts a type T an output range of type T? ...
- Andrei Alexandrescu (9/44) Jul 12 2010 Good call. Probably I need to handle ranges of arrays differently. (The
- Steven Schveighoffer (17/62) Jul 12 2010 Yes, I'm not saying that a delegate that accepts a range of T cannot be ...
- Andrei Alexandrescu (4/67) Jul 12 2010 No, the library does that. Look here:
- Steven Schveighoffer (17/24) Jul 12 2010 So you're saying it's not ok for an output range to support appending a ...
- torhu (4/8) Jul 12 2010 put(r, e) prefers to call r.put(e) for single element adds. Doesn't
- Steven Schveighoffer (7/15) Jul 12 2010 No, dcollections doesn't define put. But you could take a delegate to
- Andrei Alexandrescu (3/19) Jul 12 2010 Efficiency?
- dsimcha (23/46) Jul 12 2010 But efficiency only matters some of the time, assuming the kind of ineff...
- Andrei Alexandrescu (7/15) Jul 12 2010 I guess that's fine, though I clearly recall that doFormat was, at the
- dsimcha (4/9) Jul 12 2010 I'm not saying don't encourage the use of delegates that take ranges whe...
- Andrei Alexandrescu (35/61) Jul 12 2010 Yes. The point is that with a delegate you must choose between accepting...
- Steven Schveighoffer (72/112) Jul 12 2010 But given a delegate that takes a single element, there's no way to wrap...
- Andrei Alexandrescu (16/51) Jul 12 2010 void delegate(int) perItem;
- Steven Schveighoffer (28/62) Jul 12 2010 Yeah, I realized after posting that it was incorrect :) But encouraging...
- Andrei Alexandrescu (10/12) Jul 12 2010 It's very simple. As far as a user of an output range is concerned, they...
- Steven Schveighoffer (8/19) Jul 12 2010 How does that work for a range whose front() can be assigned a dchar?
- Andrei Alexandrescu (4/25) Jul 12 2010 A char[] should be a valid output range for a dchar. I forgot to encode
- Andrei Alexandrescu (39/60) Jul 12 2010 Actually a char[] is not a valid output range. Overwriting
- Andrei Alexandrescu (5/6) Jul 12 2010 [snip]
- Philippe Sigaud (13/13) Jul 12 2010 On Tue, Jul 13, 2010 at 05:18, Andrei Alexandrescu <
- Steven Schveighoffer (5/8) Jul 13 2010 I think struct with opCall is ok. That's pretty much all a delegate is ...
- Andrei Alexandrescu (4/11) Jul 13 2010 Yes, the delegate is mostly used as an example. Any syntactically valid
- Steven Schveighoffer (15/78) Jul 13 2010 Hm... I think it should be, and here is why. Imagine this situation:
- Andrei Alexandrescu (4/20) Jul 13 2010 Appender doesn't currently work with fixed-size buffers, but it could
- Christian Kamm (35/39) Jul 12 2010 I don't think the current implementation of put allows passing E and E[]...
- Michel Fortin (12/16) Jul 12 2010 If this means what I think, it means put() cannot be memory safe.
- BLS (7/10) Jul 12 2010 What if we have to deal with non orthogonal structures, or .. simple
The notion of output range has been a tad vague in the past; up until now a range that wanted to qualify as an output range had to define a method called put. That definition is awkward though. For example, the std.algorithm.copy() primitive should work for an output range, but also for some other ranges that allow assignment to front. In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! All of the three mechanisms above are desirable for certain types. So imagine this dialog (paraphrased from http://www.youtube.com/watch?v=MrTsuvykUZk): Guy 1: We need a good output range interface, and we have three good candidates. We should make every one an output range. Guy 2: What do you mean, every one? Guy 1: E-V-E-R-Y O-N-E!!!! So, I defined isOutputRange!R to yield true if at least one of these conditions above is met: put() definition, input range with assignable front, delegate. Please refer to this code and this doc: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 http://erdani.com/d/phobos/std_range.html This confers a ton of flexibility to programmers who need to define output ranges. I've also reworked std.format to use put(r, e) so now it works with all output ranges seamlessly. Any thoughts would be appreciated. Thanks! Andrei
Jul 11 2010
On 07/11/2010 08:17 PM, Andrei Alexandrescu wrote: [snip] Forgot to mention one detail: now the free function std.range.put() serves as a convenient dispatcher for any kind of output range. Andrei
Jul 11 2010
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s articleThe notion of output range has been a tad vague in the past; up until now a range that wanted to qualify as an output range had to define a method called put. That definition is awkward though. For example, the std.algorithm.copy() primitive should work for an output range, but also for some other ranges that allow assignment to front. In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! All of the three mechanisms above are desirable for certain types. So imagine this dialog (paraphrased from http://www.youtube.com/watch?v=MrTsuvykUZk): Guy 1: We need a good output range interface, and we have three good candidates. We should make every one an output range. Guy 2: What do you mean, every one? Guy 1: E-V-E-R-Y O-N-E!!!! So, I defined isOutputRange!R to yield true if at least one of these conditions above is met: put() definition, input range with assignable front, delegate. Please refer to this code and this doc: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 http://erdani.com/d/phobos/std_range.html This confers a ton of flexibility to programmers who need to define output ranges. I've also reworked std.format to use put(r, e) so now it works with all output ranges seamlessly. Any thoughts would be appreciated. Thanks! Andrei== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s articleThe notion of output range has been a tad vague in the past; up until now a range that wanted to qualify as an output range had to define a method called put. That definition is awkward though. For example, the std.algorithm.copy() primitive should work for an output range, but also for some other ranges that allow assignment to front. In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! All of the three mechanisms above are desirable for certain types. So imagine this dialog (paraphrased from http://www.youtube.com/watch?v=MrTsuvykUZk): Guy 1: We need a good output range interface, and we have three good candidates. We should make every one an output range. Guy 2: What do you mean, every one? Guy 1: E-V-E-R-Y O-N-E!!!! So, I defined isOutputRange!R to yield true if at least one of these conditions above is met: put() definition, input range with assignable front, delegate. Please refer to this code and this doc: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 http://erdani.com/d/phobos/std_range.html This confers a ton of flexibility to programmers who need to define output ranges. I've also reworked std.format to use put(r, e) so now it works with all output ranges seamlessly. Any thoughts would be appreciated. Thanks! AndreiNice. I had quietly wondered why input ranges with assignable front aren't also output ranges, but never gotten around to asking. Two comments: 1. What happens if you run out of space in the input range variety? I guess it throws? 2. Would there be a "standard" way of signaling how much stuff was written to the input range variety? I guess that since functions that output their results via output ranges would usually return void, they could return an integer instead indicating how much stuff was written. 3. While we're on the subject of improving output ranges, I was just thinking before I read your post that it would be nice to have a Tee type in std.range, since outputting to exactly one output range is kind of inflexible. Such a type would itself be an output range. It would take in N output ranges where N > 1 as instantiation parameters, and pass any input received to all of the underlying ranges. The use case I thought of for this is when I'm generating tons of data through a monte carlo simulation and don't want to store it all in memory. Both Summary in dstats and Histogram in dflplot can be used as output ranges. I'd love to write a function that outputs results to an output range and use Tee to get both summary statistics and a histogram.
Jul 11 2010
On 07/11/2010 08:34 PM, dsimcha wrote:1. What happens if you run out of space in the input range variety? I guess it throws?It's up to the range. Most will throw.2. Would there be a "standard" way of signaling how much stuff was written to the input range variety? I guess that since functions that output their results via output ranges would usually return void, they could return an integer instead indicating how much stuff was written.There is no standard way at the moment. I think not throwing implies all passed data has been written.3. While we're on the subject of improving output ranges, I was just thinking before I read your post that it would be nice to have a Tee type in std.range, since outputting to exactly one output range is kind of inflexible. Such a type would itself be an output range. It would take in N output ranges where N> 1 as instantiation parameters, and pass any input received to all of the underlying ranges.Yah, tee would be great.The use case I thought of for this is when I'm generating tons of data through a monte carlo simulation and don't want to store it all in memory. Both Summary in dstats and Histogram in dflplot can be used as output ranges. I'd love to write a function that outputs results to an output range and use Tee to get both summary statistics and a histogram.... and checkpoint all that to a file. Andrei
Jul 11 2010
On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu < SeeWebsiteForEmail erdani.org> wrote:In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227>http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227It's still early where I live, but... For the callable case, why just accepting E[] instead of any range with E as element? Though, thinking about it, I right now have no idea how to put that into a template constraint, given only R and E. Hmm...
Jul 11 2010
On 07/12/2010 12:45 AM, Philippe Sigaud wrote:On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> wrote: In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227> http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 It's still early where I live, but... For the callable case, why just accepting E[] instead of any range with E as element? Though, thinking about it, I right now have no idea how to put that into a template constraint, given only R and E. Hmm...Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion. Andrei
Jul 11 2010
On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 07/12/2010 12:45 AM, Philippe Sigaud wrote:Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range? For example, a delegate that accepts a string is a range of strings, is it not? Or do you consider it a range of immutable(char)? For example, I'd expect to be able to use a push_back delegate on an array-type of integers as an output range. -SteveOn Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> wrote: In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227> http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 It's still early where I live, but... For the callable case, why just accepting E[] instead of any range with E as element? Though, thinking about it, I right now have no idea how to put that into a template constraint, given only R and E. Hmm...Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.
Jul 12 2010
On 07/12/2010 07:44 AM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Efficiency - see the doFormat disaster.On 07/12/2010 12:45 AM, Philippe Sigaud wrote:Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range?On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> wrote: In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227> http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 It's still early where I live, but... For the callable case, why just accepting E[] instead of any range with E as element? Though, thinking about it, I right now have no idea how to put that into a template constraint, given only R and E. Hmm...Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.For example, a delegate that accepts a string is a range of strings, is it not? Or do you consider it a range of immutable(char)?Good call. Probably I need to handle ranges of arrays differently. (The problem has already come up, but I punted on it.)For example, I'd expect to be able to use a push_back delegate on an array-type of integers as an output range.Would it hurt to define push_back for arrays too? The thing is, if you can output an array you can always output one occasional element. But if you only know how to output an element, having the client side output arrays in a loop can be quite slow. Andrei
Jul 12 2010
On Mon, 12 Jul 2010 10:41:51 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 07/12/2010 07:44 AM, Steven Schveighoffer wrote:Yes, I'm not saying that a delegate that accepts a range of T cannot be an output range of T, I just wondered why it *has* to be a range.On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Efficiency - see the doFormat disaster.On 07/12/2010 12:45 AM, Philippe Sigaud wrote:Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range?On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> wrote: In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227> http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 It's still early where I live, but... For the callable case, why just accepting E[] instead of any range with E as element? Though, thinking about it, I right now have no idea how to put that into a template constraint, given only R and E. Hmm...Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.It seems to me to be similar to appending -- you can append an element or an array of elements. But appending the individual element makes sense in a lot of cases, despite performance. If output ranges were restricted to deal only with stream data (i.e. char) then I'd agree only accepting an array of data is a good idea.For example, a delegate that accepts a string is a range of strings, is it not? Or do you consider it a range of immutable(char)?Good call. Probably I need to handle ranges of arrays differently. (The problem has already come up, but I punted on it.)If I always have to do something like this in order to append a single element: put(r, (&elem)[0..1]); Then the concept of output ranges is much less attractive to me. In some cases, appending a single element is all that works. For example, a linked list could be an output range, and passing it an array is not going to be any more optimal than passing individual elements. -SteveFor example, I'd expect to be able to use a push_back delegate on an array-type of integers as an output range.Would it hurt to define push_back for arrays too? The thing is, if you can output an array you can always output one occasional element. But if you only know how to output an element, having the client side output arrays in a loop can be quite slow.
Jul 12 2010
On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 10:41:51 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306 AndreiOn 07/12/2010 07:44 AM, Steven Schveighoffer wrote:Yes, I'm not saying that a delegate that accepts a range of T cannot be an output range of T, I just wondered why it *has* to be a range.On Mon, 12 Jul 2010 02:21:25 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Efficiency - see the doFormat disaster.On 07/12/2010 12:45 AM, Philippe Sigaud wrote:Wait, isn't a delegate that accepts a type T an output range of type T? Why does the argument have to be an array/range?On Mon, Jul 12, 2010 at 03:17, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> wrote: In related news, there's been this burning desire regarding a lightweight output range defined as simply a delegate that accepts const(char)[]. That range is an output range all right, and the way you output to it is by simply calling the delegate! <http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227> http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L227 It's still early where I live, but... For the callable case, why just accepting E[] instead of any range with E as element? Though, thinking about it, I right now have no idea how to put that into a template constraint, given only R and E. Hmm...Good point. I haven't thought of it that way - I used arrays as a lingua franca buffer. Your suggestion is interesting. I see a risk of infinite regression in writing the constraint, but the idea warrants more discussion.It seems to me to be similar to appending -- you can append an element or an array of elements. But appending the individual element makes sense in a lot of cases, despite performance. If output ranges were restricted to deal only with stream data (i.e. char) then I'd agree only accepting an array of data is a good idea.For example, a delegate that accepts a string is a range of strings, is it not? Or do you consider it a range of immutable(char)?Good call. Probably I need to handle ranges of arrays differently. (The problem has already come up, but I punted on it.)If I always have to do something like this in order to append a single element: put(r, (&elem)[0..1]);For example, I'd expect to be able to use a push_back delegate on an array-type of integers as an output range.Would it hurt to define push_back for arrays too? The thing is, if you can output an array you can always output one occasional element. But if you only know how to output an element, having the client side output arrays in a loop can be quite slow.
Jul 12 2010
On Mon, 12 Jul 2010 11:05:33 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:So you're saying it's not ok for an output range to support appending a single element, but it's ok for put to support appending a single element? Well, all that will end up happening is cases where appending a single element is the only possibility will produce overloaded add functions, one that takes a single element, and one that takes an array. The one that takes an array will look like this: foreach(e; arr) add(e); I can tell you this for sure, because it's exactly what's in many dcollections classes. So what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable. -SteveIf I always have to do something like this in order to append a single element: put(r, (&elem)[0..1]);No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306
Jul 12 2010
On 12.07.2010 18:48, Steven Schveighoffer wrote: [...]So what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable.put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
Jul 12 2010
On Mon, 12 Jul 2010 13:43:18 -0400, torhu <no spam.invalid> wrote:On 12.07.2010 18:48, Steven Schveighoffer wrote: [...]No, dcollections doesn't define put. But you could take a delegate to add. The question is, why the prejudice against the single element version when using a delegate as an output range? The fact that an output range can define a put function that takes a single element, but you can't use a delegate just makes no sense to me. -SteveSo what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable.put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
Jul 12 2010
On 07/12/2010 12:55 PM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 13:43:18 -0400, torhu <no spam.invalid> wrote:Efficiency? AndreiOn 12.07.2010 18:48, Steven Schveighoffer wrote: [...]No, dcollections doesn't define put. But you could take a delegate to add. The question is, why the prejudice against the single element version when using a delegate as an output range? The fact that an output range can define a put function that takes a single element, but you can't use a delegate just makes no sense to me. -SteveSo what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable.put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
Jul 12 2010
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s articleOn 07/12/2010 12:55 PM, Steven Schveighoffer wrote:But efficiency only matters some of the time, assuming the kind of inefficiency in question is small constant term inefficiency (as is the case here) and not O(N!) inefficiency or 1000-fold constant term inefficiency. Calling a delegate once per element is inefficient, but not **that** inefficient, and it's convenient at times, such as when the delegate will be used in contexts other than as an output range. I think inefficiency is a poor excuse for not supporting this, and the users of Phobos should decide on a case-by-case basis whether calling a delegate for every element is efficient enough. The following benchmark takes a whopping 830 milliseconds on my computer, for a **hundred million** iterations: import std.stdio, std.perf; void main() { void foo() {} auto pc = new PerformanceCounter; pc.start; auto del = &foo; foreach(i; 0..100_000_000) { del(); } pc.stop; writeln(pc.milliseconds); }On Mon, 12 Jul 2010 13:43:18 -0400, torhu <no spam.invalid> wrote:Efficiency? AndreiOn 12.07.2010 18:48, Steven Schveighoffer wrote: [...]No, dcollections doesn't define put. But you could take a delegate to add. The question is, why the prejudice against the single element version when using a delegate as an output range? The fact that an output range can define a put function that takes a single element, but you can't use a delegate just makes no sense to me. -SteveSo what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable.put(r, e) prefers to call r.put(e) for single element adds. Doesn't that take care of it?
Jul 12 2010
On 07/12/2010 01:31 PM, dsimcha wrote:But efficiency only matters some of the time, assuming the kind of inefficiency in question is small constant term inefficiency (as is the case here) and not O(N!) inefficiency or 1000-fold constant term inefficiency.Historically the constant was large.Calling a delegate once per element is inefficient, but not **that** inefficient, and it's convenient at times, such as when the delegate will be used in contexts other than as an output range. I think inefficiency is a poor excuse for not supporting this, and the users of Phobos should decide on a case-by-case basis whether calling a delegate for every element is efficient enough.I guess that's fine, though I clearly recall that doFormat was, at the time I decided to rewrite it (2008 or so), virtually unusable for serious workloads. I don't know to what extent improvement in the compiler have eroded that margin. Andrei
Jul 12 2010
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s articleOn 07/12/2010 01:31 PM, dsimcha wrote:I'm not saying don't encourage the use of delegates that take ranges where it makes sense, such as for formatting text. I'm just saying support both and don't force their use even where it doesn't matter.But efficiency only matters some of the time, assuming the kind of inefficiency in question is small constant term inefficiency (as is the case here) and not O(N!) inefficiency or 1000-fold constant term inefficiency.Historically the constant was large.
Jul 12 2010
On 07/12/2010 11:48 AM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 11:05:33 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Yes. The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:So you're saying it's not ok for an output range to support appending a single element, but it's ok for put to support appending a single element?If I always have to do something like this in order to append a single element: put(r, (&elem)[0..1]);No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306Well, all that will end up happening is cases where appending a single element is the only possibility will produce overloaded add functions, one that takes a single element, and one that takes an array. The one that takes an array will look like this: foreach(e; arr) add(e);That's fine. the point is that if you put this loop on the wrong side of the delegate, it's much less efficient.I can tell you this for sure, because it's exactly what's in many dcollections classes. So what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable.Ah, I see. There is a confusion. The array restriction is only for delegates. For straight ranges, you should accept individual Es. Let me copy the current definition of isOutputRange: template isOutputRange(R, E) { enum bool isOutputRange = is(typeof( { R r; E e; r.put(e); // can write element to range }())) || isInputRange!R && is(typeof( { R r; E e; r.front = e; // can assign to the front of range }())) || is(typeof( { R r; E[] es; r(es); }())); } Notice how the E[] requirement is for delegates only. Andrei
Jul 12 2010
On Mon, 12 Jul 2010 13:49:50 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 07/12/2010 11:48 AM, Steven Schveighoffer wrote:But given a delegate that takes a single element, there's no way to wrap it so it can be an output range. Yet such a delegate can easily be something that outputs something. Indeed, a delegate that takes a string takes a single element, but because a string happens to be defined as a range of chars, it passes the test for output ranges. I could loop on an array of strings of one character, and output that to a valid output range no problem. The only thing that solves this problem correctly is buffering. What if I have my own container types that are large chunks of data, but don't happen to define the input range primitives? Why should I be artificially prevented from using those as input to output ranges? Really to me, you are saying, "I want your delegate to be efficient", but you defined something that is related to that in a small set of circumstances (when the arrays being passed in are large).On Mon, 12 Jul 2010 11:05:33 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Yes. The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].On 07/12/2010 09:59 AM, Steven Schveighoffer wrote:So you're saying it's not ok for an output range to support appending a single element, but it's ok for put to support appending a single element?If I always have to do something like this in order to append a single element: put(r, (&elem)[0..1]);No, the library does that. Look here: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L306Here's a proposal for put/isOutputRange which would solve my problem and not have any for loops in it: template isOutputRange(R, E) { enum bool isOutputRange = is(typeof( { R r; E e; r.put(e); // can write element to range }())) || isInputRange!R && is(typeof( { R r; E e; r.front = e; // can assign to the front of range }())) || is(typeof( { R r; E[] es; r(es); }())) || is(typeof( { R r; E es; r(es); }())); } void put(R, E)(ref R r, E e) if (isOutputRange!(R, E)) { static if (!isArray!R && is(typeof(r.put(e)))) { r.put(e); } else static if (isInputRange!R && is(typeof(r.front = e))) { r.front = e; r.popFront(); } else static if (is(typeof(r(e)))) { r(e); } else { static assert(false); } }Well, all that will end up happening is cases where appending a single element is the only possibility will produce overloaded add functions, one that takes a single element, and one that takes an array. The one that takes an array will look like this: foreach(e; arr) add(e);That's fine. the point is that if you put this loop on the wrong side of the delegate, it's much less efficient.Why the discrepency? A naive coder can define inefficient ranges just as well as he can define inefficient delegates. -SteveI can tell you this for sure, because it's exactly what's in many dcollections classes. So what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable.Ah, I see. There is a confusion. The array restriction is only for delegates. For straight ranges, you should accept individual Es.
Jul 12 2010
On 07/12/2010 01:47 PM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 13:49:50 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:void delegate(int) perItem; void ofCourseThereIsAWay(int[] items) { foreach (e; items) perItem(i); }Yes. The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].But given a delegate that takes a single element, there's no way to wrap it so it can be an output range. Yet such a delegate can easily be something that outputs something.Indeed, a delegate that takes a string takes a single element, but because a string happens to be defined as a range of chars, it passes the test for output ranges.Yah, that's a good point.I could loop on an array of strings of one character, and output that to a valid output range no problem. The only thing that solves this problem correctly is buffering.I don't understand the point here.What if I have my own container types that are large chunks of data, but don't happen to define the input range primitives? Why should I be artificially prevented from using those as input to output ranges?I don't understand this either.Really to me, you are saying, "I want your delegate to be efficient", but you defined something that is related to that in a small set of circumstances (when the arrays being passed in are large).What I'm saying is, "If you know how to output one item you may as well output several, as I'm producing them in bulk".Here's a proposal for put/isOutputRange which would solve my problem and not have any for loops in it:[snip] I'm uncomfortable about allowing inefficiency by design if I can help it, but I guess the costs all depend on the costs of trafficking one item.That doesn't mean we shouldn't foster the mechanisms that reduce the chance of bad designs happening. AndreiWhy the discrepency? A naive coder can define inefficient ranges just as well as he can define inefficient delegates.I can tell you this for sure, because it's exactly what's in many dcollections classes. So what happens when you call put(r, e) for one of these output classes? Instead of just calling add(e), it calls (add((&e)[0..1])) which in turn goes through some needless loop, which then ends up calling add(e). I don't see why this is preferable.Ah, I see. There is a confusion. The array restriction is only for delegates. For straight ranges, you should accept individual Es.
Jul 12 2010
On Mon, 12 Jul 2010 15:18:17 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 07/12/2010 01:47 PM, Steven Schveighoffer wrote:Yeah, I realized after posting that it was incorrect :) But encouraging this kind of thing is probably not fostering efficient code (roll your own delegate when the compiler complains). It goes to my example with dcollections and add.On Mon, 12 Jul 2010 13:49:50 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:void delegate(int) perItem; void ofCourseThereIsAWay(int[] items) { foreach (e; items) perItem(i); }Yes. The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].But given a delegate that takes a single element, there's no way to wrap it so it can be an output range. Yet such a delegate can easily be something that outputs something.My point is, if the reason behind requiring arrays instead of single elements is to force efficiency, then the reason is flawed. I can make inefficient code even when having to pass arrays to an output range. To solve this, you could make an output range that buffers elements until it has enough to pass to an underlying sink that takes an array of elements. The point is, you've only gone halfway to ensuring efficiency. But going the full way means you are imposing possibly bad buffering semantics on everything. If you can only go halfway, then I think the design is more annoying than successful.I could loop on an array of strings of one character, and output that to a valid output range no problem. The only thing that solves this problem correctly is buffering.I don't understand the point here.Well, I had started constructing an example of how this would work, but I realize, I have no idea how you intend to use such a delegate as an output range... I don't really know how such output ranges will be generically usable in conjunction with output ranges which support the put interface.What if I have my own container types that are large chunks of data, but don't happen to define the input range primitives? Why should I be artificially prevented from using those as input to output ranges?I don't understand this either.I have no control over being able to output items in bulk or not, all I have is a delegate. What do I do?Really to me, you are saying, "I want your delegate to be efficient", but you defined something that is related to that in a small set of circumstances (when the arrays being passed in are large).What I'm saying is, "If you know how to output one item you may as well output several, as I'm producing them in bulk".I'm unsure how it will work either. I admit now that I didn't think through how this will be used. I imagined that the delegate version would be substituted for an output range which takes an element at a time, but now I'm not sure how you could write generic code that does that, given that the delegate must take an array of elements instead. I guess I'll wait to see how it works before objecting any further. -SteveHere's a proposal for put/isOutputRange which would solve my problem and not have any for loops in it:[snip] I'm uncomfortable about allowing inefficiency by design if I can help it, but I guess the costs all depend on the costs of trafficking one item.
Jul 12 2010
On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:I'm unsure how it will work either. I admit now that I didn't think through how this will be used.It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient. Andrei
Jul 12 2010
On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range. -SteveI'm unsure how it will work either. I admit now that I didn't think through how this will be used.It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.
Jul 12 2010
On 07/12/2010 04:39 PM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:A char[] should be a valid output range for a dchar. I forgot to encode all valid strings situations, working on that now. AndreiOn 07/12/2010 02:41 PM, Steven Schveighoffer wrote:How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range.I'm unsure how it will work either. I admit now that I didn't think through how this will be used.It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.
Jul 12 2010
On 07/12/2010 04:39 PM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Actually a char[] is not a valid output range. Overwriting variable-length codes with other variable-length codes might mess up the string. Here's what I have. Works? void put(R, E)(ref R r, E e) { static if (!isArray!R && is(typeof(r.put(e)))) { r.put(e); } else static if (!isArray!R && is(typeof(r.put((&e)[0..1])))) { r.put((&e)[0..1]); } else static if (is(typeof(r.front = e, r.popFront()))) { r.front = e; r.popFront(); } else static if (isInputRange!E && is(typeof(put(r, e.front)))) { for (; !e.empty; e.popFront()) put(r, e.front); } else static if (is(typeof(r(e)))) { r(e); } else static if (is(typeof(r((&e)[0..1])))) { r((&e)[0..1]); } else { static assert(false, "Cannot put a "~E.stringof~" into a "~R.stringof); } } AndreiOn 07/12/2010 02:41 PM, Steven Schveighoffer wrote:How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range.I'm unsure how it will work either. I admit now that I didn't think through how this will be used.It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.
Jul 12 2010
On 07/12/2010 09:58 PM, Andrei Alexandrescu wrote:Here's what I have. Works?[snip] To clarify: http://erdani.com/d/phobos/std_range.html Andrei
Jul 12 2010
On Tue, Jul 13, 2010 at 05:18, Andrei Alexandrescu < SeeWebsiteForEmail erdani.org> wrote: To clarify: http://erdani.com/d/phobos/std_range.html Seems good to me. Lots of flexibility. I may begin to write output ranges :-) So in the very first case, e may well be a range, but the way it's used internally is up to the output range designer. Maybe it uses a loop, maybe not. OK. You call r(e) and r([e])) 'delegates'. Are you just using it as a generic term, or does that mean using any callable (struct with opCall) is not a good idea in this case, for whatever reason? Philippe
Jul 12 2010
On Tue, 13 Jul 2010 01:52:52 -0400, Philippe Sigaud <philippe.sigaud gmail.com> wrote:You call r(e) and r([e])) 'delegates'. Are you just using it as a generic term, or does that mean using any callable (struct with opCall) is not a good idea in this case, for whatever reason?I think struct with opCall is ok. That's pretty much all a delegate is anyways. -Steve
Jul 13 2010
On 07/13/2010 06:18 AM, Steven Schveighoffer wrote:On Tue, 13 Jul 2010 01:52:52 -0400, Philippe Sigaud <philippe.sigaud gmail.com> wrote:Yes, the delegate is mostly used as an example. Any syntactically valid r(e) should be accepted. AndreiYou call r(e) and r([e])) 'delegates'. Are you just using it as a generic term, or does that mean using any callable (struct with opCall) is not a good idea in this case, for whatever reason?I think struct with opCall is ok. That's pretty much all a delegate is anyways.
Jul 13 2010
On Mon, 12 Jul 2010 22:58:07 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:On 07/12/2010 04:39 PM, Steven Schveighoffer wrote:Hm... I think it should be, and here is why. Imagine this situation: char[1024] buffer = void; // allocate some blank space on the stack put(buffer, someInputRange); But the above won't compile anyways, because a ref char[1024] isn't a range, and even if it was, it would be left in a state where it pointed to the uninitialized data. What we need is a helper struct, and then we are covered. char[1024] buffer = void; CharBuilder builder(buffer[]); // defines put put(builder, someInputRange); So I think we are good. Does Appender work here?On Mon, 12 Jul 2010 17:25:43 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Actually a char[] is not a valid output range. Overwriting variable-length codes with other variable-length codes might mess up the string.On 07/12/2010 02:41 PM, Steven Schveighoffer wrote:How does that work for a range whose front() can be assigned a dchar? Wait, it doesn't, because it won't compile. But wouldn't that be the same for a delegate that takes a dchar? I'm very confused at what you are trying to do. I expected that a char[] would be a valid output range.I'm unsure how it will work either. I admit now that I didn't think through how this will be used.It's very simple. As far as a user of an output range is concerned, they should write stuff like: put(r, '['); char[] someBuf; put(r, someBuf); put(r, ", "); put(r, ']'); in confidence that things are reasonably efficient.Here's what I have. Works? void put(R, E)(ref R r, E e) { static if (!isArray!R && is(typeof(r.put(e)))) { r.put(e); } else static if (!isArray!R && is(typeof(r.put((&e)[0..1])))) { r.put((&e)[0..1]); } else static if (is(typeof(r.front = e, r.popFront()))) { r.front = e; r.popFront(); } else static if (isInputRange!E && is(typeof(put(r, e.front)))) { for (; !e.empty; e.popFront()) put(r, e.front); } else static if (is(typeof(r(e)))) { r(e); } else static if (is(typeof(r((&e)[0..1])))) { r((&e)[0..1]); } else { static assert(false, "Cannot put a "~E.stringof~" into a "~R.stringof); } }That is satisfactory, it encompasses what I was saying, thanks! -Steve
Jul 13 2010
On 07/13/2010 06:15 AM, Steven Schveighoffer wrote:On Mon, 12 Jul 2010 22:58:07 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Appender doesn't currently work with fixed-size buffers, but it could and should be made to work. It's a good idea. AndreiActually a char[] is not a valid output range. Overwriting variable-length codes with other variable-length codes might mess up the string.Hm... I think it should be, and here is why. Imagine this situation: char[1024] buffer = void; // allocate some blank space on the stack put(buffer, someInputRange); But the above won't compile anyways, because a ref char[1024] isn't a range, and even if it was, it would be left in a state where it pointed to the uninitialized data. What we need is a helper struct, and then we are covered. char[1024] buffer = void; CharBuilder builder(buffer[]); // defines put put(builder, someInputRange); So I think we are good. Does Appender work here?
Jul 13 2010
Andrei Alexandrescu wrote:Yes. The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].I don't think the current implementation of put allows passing E and E[] correctly: void put(R, E)(ref R r, E e) if (isOutputRange!(R, E)) { if (isArray!E && is(typeof(r(e)))) { r(e); } else static if (is(typeof(r(new E[])))) { r((&e)[0 .. 1]); } else { static assert(false); } } Example: typeof(fn) == void function(int[][]). put(fn, int[]) -> ok put(fn, int[][]) -> no match, isOutputRange!(fn, int[][]) fails. You'll need a void put(R, E)(ref R r, E[] a) if (isOutputRange!(R, E)) overload get the desired behavior. Implementing that one generically also makes explicit what Steve was saying: front-assignable ranges and classic output ranges are stuck using put with a single element in a loop. Associating different performance tradeoffs with abstractions that should be interchangeable sounds odd. In the end it comes down to the question why R.put(E) and void R(E[]) should both be output ranges for E. They should either take E or E[]. I'd go with void foo(E) being an output range for E and define the overloads void put(ref R r, T t) if (isOutputRange!(R, T[])) // makes 1-element slice void put(ref R r, E e) if (isOutputRange!(R, E)) void put(ref R r, A[] a) if (isOutputRange!(R, A)) // uses foreach Christian
Jul 12 2010
On 2010-07-12 13:49:50 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:The point is that with a delegate you must choose between accepting E and E[]. Given the constraint, it's better for everyone to accept E[] and let put() take care of the occasional E by doing the wraparoo (&elem)[0..1].If this means what I think, it means put() cannot be memory safe. Making an array from a stack variable and passing it around cannot be safe unless you can trust this reference won't escape the scope of the delegate you're calling (and there's no way to enforce that for dynamic arrays). To be safe, all you can do is copy the element on the heap. Am I wrong? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jul 12 2010
On 12/07/2010 03:17, Andrei Alexandrescu wrote:he notion of output range has been a tad vague in the past; up until now a range that wanted to qualify as an output range had to define a method called put.What if we have to deal with non orthogonal structures, or .. simple directed graphs ? well well.. yet we do not even have a simple map!(string , T) in std.container .. So, For me the Range Stuff still lacks proof of product. I remain pretty sceptical. bjoern
Jul 12 2010