digitalmars.D - range behaviour
- Benjamin Thaut (8/8) May 13 2014 I know that there was a recent discussion about how the methods of
- Jonathan M Davis via Digitalmars-d (13/21) May 13 2014 Certainly, ranges are pretty much always used this way, but there was so...
- Steven Schveighoffer (12/28) May 13 2014 I don't agree there was a consensus. I think empty should not have to be...
- H. S. Teoh via Digitalmars-d (38/43) May 13 2014 I only partially agree with this, up to the point where .empty has been
- Steven Schveighoffer (26/38) May 13 2014 You can be generic, but still know that empty is not necessary.
- luka8088 (13/27) May 13 2014 This is a potential issue because if it turns out that empty _must_ be
- Jonathan M Davis via Digitalmars-d (7/38) May 13 2014 On Tue, 13 May 2014 13:29:32 -0400
- H. S. Teoh via Digitalmars-d (35/64) May 13 2014 [...]
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (8/17) May 13 2014 An API is a "user" interface. It should be intuitive.
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (10/25) May 14 2014 I wouldn't say never, because optimizers have become damn good. I
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (28/35) May 14 2014 Well, "never" without a proof is of course a too strong claim.
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (72/83) May 14 2014 As a demonstration of what I mean:
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (26/36) May 14 2014 I only have DMD installed, but I trust your word for it. However
- Dicebot (64/67) May 14 2014 There is no single reason why this should be true. Actually
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (4/13) May 14 2014 This isn't really a loop. It is an expression: n*(n+1)
- Dicebot (8/21) May 15 2014 I think it is your turn to make a counter-example ;) It does not
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (9/10) May 15 2014 I've given 2 examples that no compiler can get at… because they
- Dicebot (14/20) May 15 2014 If the compiler lacks contextual knowledge, that only means that
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (70/79) May 16 2014 Not really. Here is a simple example, a sawtooth generator that
- Jonathan M Davis via Digitalmars-d (19/22) May 13 2014 Oh, but that's part of the fun. If the range's front returns by ref or a...
- Dicebot (2/13) May 13 2014 I don't remember any final decision made from that discussion.
I know that there was a recent discussion about how the methods of ranges should behave. E.g.

- Does empty always have to be called before calling front or popFront?
- Is it allowed to call front multiple times between two calls to popFront?

Was there a result of that discussion? Is it documented somewhere?

Kind Regards
Benjamin Thaut
May 13 2014
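A minimal sketch (not from the thread) of the consumption protocol the two questions refer to, so the debate below has something concrete to point at; the `consume` helper and the use of iota are only illustrative.

import std.range : iota;
import std.stdio : writeln;

void consume(R)(R r)
{
    // The pattern under discussion: empty guards every access to front/popFront.
    while (!r.empty)
    {
        auto e = r.front;   // reading front does not advance the range
        writeln(e);
        r.popFront();       // advance; empty is re-checked before the next front
    }
}

void main()
{
    consume(iota(0, 3));    // prints 0, 1 and 2
}

The thread is about whether the `!r.empty` test is a guarantee callers must always provide, or merely the common pattern.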
On Tue, 13 May 2014 18:38:44 +0200, Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com> wrote:

> I know that there was a recent discussion about how the methods of ranges should behave. E.g. - Does empty always have to be called before calling front or popFront?

Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way).

> - Is it allowed to call front multiple times between two calls to popFront?

Definitely. _Lots_ of range-based code would break otherwise - though there are cases where that can cause problems depending on what you rely on (e.g. map!(a => to!string(a)) will return equal strings, but they aren't the _same_ string).

> Was there a result of that discussion? Is it documented somewhere?

AFAIK, there's just the semi-recent newsgroup discussion on the matter, though maybe someone put something up on the wiki.

- Jonathan M Davis
May 13 2014
On Tue, 13 May 2014 12:58:09 -0400, Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> wrote:

> On Tue, 13 May 2014 18:38:44 +0200, Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>
>> I know that there was a recent discussion about how the methods of ranges should behave. E.g. - Does empty always have to be called before calling front or popFront?
>
> Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way).

I don't agree there was a consensus. I think empty should not have to be called if it's already logically known that the range is not empty. The current documentation states that, and I don't think there was an agreement that we should change it, despite the arguments from Walter.

In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not. Even if it was required, it would be an unenforceable policy, just like range.save.

-Steve
May 13 2014
On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
[...]
> I don't agree there was a consensus. I think empty should not have to be called if it's already logically known that the range is not empty.

I only partially agree with this, up to the point where .empty has been checked previously, such as:

    void f1()
    {
        void f2(R)(R r) if (isInputRange!R)
        {
            doSomething(r.front);
        }

        auto r = makeMeARange();
        if (!r.empty)
            f2(r);
        ...
    }

Even in this case, I'd put an in-contract on f2 that verifies that the range is indeed non-empty:

    ...
    void f2(R)(R r) if (isInputRange!R)
    in { assert(!r.empty); }
    body
    {
        doSomething(r.front);
    }

[...]

> In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not.

[...] In my mind, *all* range-based code is generic. If you need to depend on something about your range beyond what the range API guarantees, then you should be using the concrete type rather than a template argument, which means that your code is no longer range-based code -- it's operating on a concrete type.

(Of course, if the concrete type implements the range API, then there's nothing wrong with using range API methods, but one shouldn't be under the illusion that one is writing range-based code. It is type-dependent code that just happens to have range-based methods. If the code will break if the range is substituted by a different range that doesn't have the same guarantees (outside of what the range API specifies), then it's not range-based code.)

T

-- 
Many open minds should be closed for repairs. -- K5 user
May 13 2014
On Tue, 13 May 2014 13:40:55 -0400, H. S. Teoh via Digitalmars-d <digitalmars-d puremagic.com> wrote:

> On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]
>
>> In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not.
>
> [...] In my mind, *all* range-based code is generic. If you need to depend on something about your range beyond what the range API guarantees, then you should be using the concrete type rather than a template argument, which means that your code is no longer range-based code -- it's operating on a concrete type.

You can be generic, but still know that empty is not necessary. This is a potential use case (somewhat similar to yours):

    void foo(R)(R r)
    {
        if(!r.empty)
        {
            auto r2 = map!(x => x.bar)(r); // or some similar translation range
            // have to check r2.empty here? The "empty must be called" rules say yes.
        }
    }

I will note, there are cases in phobos that don't check empty because it's provably not necessary. The issue arose because of the filter range, which someone was trying to make fully lazy.

The whole thrust of the argument is that we want to force people to call empty not because we want them to write generically safe code but because we want to *instrument* empty to do something other than check for emptiness. In other words, if we are guaranteed that empty will be called before anything else, we can add extra bits to empty for e.g. lazy initialization. Otherwise, we have to add the bits to all the primitives. It also results in an awkward call when it's not strictly necessary: r2.empty;

-Steve
May 13 2014
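A compact sketch (not from the thread) of the trade-off Steve describes: if callers are guaranteed to call empty first, the lazy work can live in empty alone; if not, every primitive has to guard itself. `LazyA`/`LazyB` and the predicate-skipping logic are hypothetical stand-ins for the fully lazy filter range mentioned above.

import std.range : iota;
import std.stdio : writeln;

// Variant A: only valid under an "empty must be called first" rule --
// front assumes the skipping already happened inside empty.
struct LazyA(R, alias pred)
{
    R r;
    bool primed;
    bool empty()
    {
        if (!primed) { while (!r.empty && !pred(r.front)) r.popFront(); primed = true; }
        return r.empty;
    }
    auto front() { return r.front; }              // relies on empty() having run
    void popFront() { r.popFront(); primed = false; }
}

// Variant B: safe even if a caller skips empty, at the cost of a guard
// (an extra branch) in every primitive.
struct LazyB(R, alias pred)
{
    R r;
    bool primed;
    private void prime()
    {
        if (primed) return;
        while (!r.empty && !pred(r.front)) r.popFront();
        primed = true;
    }
    bool empty() { prime(); return r.empty; }
    auto front() { prime(); return r.front; }
    void popFront() { prime(); r.popFront(); primed = false; }
}

void main()
{
    auto evens = LazyB!(typeof(iota(0, 10)), x => x % 2 == 0)(iota(0, 10));
    for (; !evens.empty; evens.popFront())
        writeln(evens.front);                     // 0, 2, 4, 6, 8
}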
On 13.5.2014. 19:40, H. S. Teoh via Digitalmars-d wrote:

> On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]
>
> Even in this case, I'd put an in-contract on f2 that verifies that the range is indeed non-empty:
>
>     ...
>     void f2(R)(R r) if (isInputRange!R)
>     in { assert(!r.empty); }
>     body { doSomething(r.front); }
>
> [...]

This is a potential issue because if it turns out that empty _must_ be called then the author could put the front population logic inside empty. Consider:

    struct R {
        bool empty () {
            front = 1;
            return false;
        }
        int front = 0;
        void popFront () {
            front = 0;
        }
    }

This is valid code if empty _must_ be called, but it will behave differently if passed to f2 in case asserts are compiled out. In case asserts are compiled out, empty is never called and front is never populated. Because of this I think that it is necessary to document range behavior.
May 13 2014
On Tue, 13 May 2014 13:29:32 -0400, Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com> wrote:

> On Tue, 13 May 2014 12:58:09 -0400, Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>
>> On Tue, 13 May 2014 18:38:44 +0200, Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>>
>>> I know that there was a recent discussion about how the methods of ranges should behave. E.g. - Does empty always have to be called before calling front or popFront?
>>
>> Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way).
>
> I don't agree there was a consensus. I think empty should not have to be called if it's already logically known that the range is not empty. The current documentation states that, and I don't think there was an agreement that we should change it, despite the arguments from Walter.
>
> In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not. Even if it was required, it would be an unenforceable policy, just like range.save.

Yeah, and they're both cases where the common case will work just fine if you do it wrong but which will break in less common cases, meaning that the odds are much higher that the bug won't be caught. :(

- Jonathan M Davis
May 13 2014
On Tue, May 13, 2014 at 09:58:09AM -0700, Jonathan M Davis via Digitalmars-d wrote:

> On Tue, 13 May 2014 18:38:44 +0200, Benjamin Thaut via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>
>> I know that there was a recent discussion about how the methods of ranges should behave. E.g. - Does empty always have to be called before calling front or popFront?
>
> Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way).
>
>> - Is it allowed to call front multiple times between two calls to popFront?
>
> Definitely. _Lots_ of range-based code would break otherwise - though there are cases where that can cause problems depending on what you rely on (e.g. map!(a => to!string(a)) will return equal strings, but they aren't the _same_ string).
>
>> Was there a result of that discussion? Is it documented somewhere?
>
> AFAIK, there's just the semi-recent newsgroup discussion on the matter, though maybe someone put something up on the wiki.

[...] I second all of the above. In generic code (all range-based code is generic, since otherwise there's no point in using the range abstraction, you should just use the concrete type directly), you cannot assume anything about what .empty, .front, and .popFront() might do apart from what is specified in the range API. Therefore, since .front and .popFront have unspecified behaviour if .empty is true, the only correct way to write range-based code is to call .empty first to determine whether it's OK to call .front and .popFront.

Furthermore, it doesn't make sense to restrict .front to be called only once, since otherwise we should simplify the API by merging .front and .popFront and have the same function both return the current element and advance to the next element. The fact that they were designed to be separate calls implies that .front can be called multiple times. Of course, for efficiency purposes range-based code (esp. Phobos code) should try their best to only call .front once. But it should be perfectly permissible to call .front multiple times.

Lastly, since the range API is an *abstraction*, it should not dictate any concrete implementation details such as whether .empty can do non-trivial initialization work. Properly-written range-based code should be able to handle all possible implementations of the range API, including those that do non-trivial work in .empty. Of course, that doesn't mean it's a *good* idea for .empty to do non-trivial work. The code is probably cleaner if that work is done somewhere else (like in the ctor or the function that returns the range). But that's an issue for the implementer of the range, not the consumers. The consumers of the range should not depend on one behaviour or the other, so that they will still work correctly when given a range that, for whatever reason, needs to do non-trivial work in .empty.

T

-- 
Prosperity breeds contempt, and poverty breeds consent. -- Suck.com
May 13 2014
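A small sketch (not from the thread) of the "call .front only once" advice: when an element is needed more than once, generic code can read it into a local without assuming anything beyond the input-range API. `sumOfSquares` is an invented example.

import std.range : isInputRange;

// Sums the square of each element, evaluating front once per element.
auto sumOfSquares(R)(R r) if (isInputRange!R)
{
    typeof(r.front) total = 0;
    for (; !r.empty; r.popFront())
    {
        auto e = r.front;   // one call to front, even though e is used twice
        total += e * e;
    }
    return total;
}

unittest
{
    import std.range : iota;
    assert(sumOfSquares(iota(1, 4)) == 14);   // 1 + 4 + 9
}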
On Tuesday, 13 May 2014 at 17:32:21 UTC, H. S. Teoh via Digitalmars-d wrote:

> Lastly, since the range API is an *abstraction*, it should not dictate any concrete implementation details such as whether .empty can do non-trivial initialization work. Properly-written range-based code should be able to handle all possible implementations of the range API, including those that do non-trivial work in .empty.

An API is a "user" interface. It should be intuitive.

Besides, D ranges will never perform as well as an optimized explicit loop, so you might as well aim for usability over speed. I learned this over 15 years ago when I was fed up with STL begin/end and implemented my own inlined iterators that work like D ranges (for a scanline renderer).
May 13 2014
On Wednesday, 14 May 2014 at 06:00:33 UTC, Ola Fosheim Grøstad wrote:

> On Tuesday, 13 May 2014 at 17:32:21 UTC, H. S. Teoh via Digitalmars-d wrote:
>
>> Lastly, since the range API is an *abstraction*, it should not dictate any concrete implementation details such as whether .empty can do non-trivial initialization work. Properly-written range-based code should be able to handle all possible implementations of the range API, including those that do non-trivial work in .empty.
>
> An API is a "user" interface. It should be intuitive. Besides, D ranges will never perform as well as an optimized explicit loop, so you might as well aim for usability over speed.

I wouldn't say never, because optimizers have become damn good. I believe the additional overhead of lazy initialization is mostly optimized away where it matters, i.e. inner loops, because the compiler can see that the initialization has already been performed in the first iteration. All it requires is unrolling the first iteration of the loop and treating it specially.

But of course, this makes your conclusion even more true, because you
May 14 2014
On Wednesday, 14 May 2014 at 09:47:56 UTC, Marc Schütz wrote:

> I wouldn't say never, because optimizers have become damn good.

Well, "never" without a proof is of course too strong a claim. :-) I agree with that, but I don't think optimizers are that good yet, even though some of my troubles have been solved.

> I believe the additional overhead of lazy initialization is mostly optimized away where it matters, i.e. inner loops, because the compiler can see that the initialization has already been performed in the first iteration. All it requires is unrolling the first iteration of the loop and treating it specially.

You can do this for some generators/iterators, but you have different ways to construct loops and generators and consumers that affect what is most efficient and how many registers you use etc. Abstractions hide what is going on, so the consumer might miss a more efficient way of structuring the algorithm.

Example: if you know that the end comes after 0, then you only need to test empty() after you read 0. You don't need to test for 0, you just branch up to the start of the loop if the zero-flag is set. Quasi ASM:

    loop:  write(x)
    start: x = generate_next()
           branch_if_not_zero loop
           test( empty() )
           branch_if_zero loop
           write(0)

It is easy to construct examples where naive ranges will be suboptimal. For more complex iterators you also need to deal with sparse datastructures, skiplists, run length encoded data, SIMD alignment, guard protected stream ends… Meta information you exploit when you need faster algorithms.
May 14 2014
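To make the quasi-ASM concrete in D terms, here is a sketch (not from the post) contrasting a per-element emptiness test with a zero-terminated block where the data itself signals the end; `sumSentinel` and the zero sentinel are assumptions for illustration.

// Per-element protocol: the emptiness test runs once per element.
float sumRange(R)(R r)
{
    float total = 0;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

// Sentinel-terminated block: the value 0 marks the end, so a (possibly
// expensive) emptiness test would only be needed once per block.
float sumSentinel(const(float)[] block)
{
    float total = 0;
    size_t i = 0;
    while (true)
    {
        immutable x = block[i++];
        if (x == 0)          // cheap test on the value already loaded
            break;           // a real generator would consult empty() only here
        total += x;
    }
    return total;
}

unittest
{
    import std.range : iota;
    assert(sumRange(iota(1, 4)) == 6);
    assert(sumSentinel([1.5f, 2.5f, 0.0f]) == 4.0f);
}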
On Wednesday, 14 May 2014 at 11:23:16 UTC, Ola Fosheim Grøstad wrote:

> On Wednesday, 14 May 2014 at 09:47:56 UTC, Marc Schütz wrote:
>
>> I wouldn't say never, because optimizers have become damn good.
>
> Well, "never" without a proof is of course too strong a claim. :-) I agree with that, but I don't think optimizers are that good yet, even though some of my troubles have been solved.
>
>> I believe the additional overhead of lazy initialization is mostly optimized away where it matters, i.e. inner loops, because the compiler can see that the initialization has already been performed in the first iteration. All it requires is unrolling the first iteration of the loop and treating it specially.

As a demonstration of what I mean:

///////////////////////////////////
import std.functional;

struct Filter(alias predicate, Range) {
    Range r;

private:
    bool is_initialized;

public:
    void initialize() {
        alias pred = unaryFun!predicate;
        if(is_initialized) return;
        while(!r.empty && !pred(r.front))
            r.popFront();
        is_initialized = true;
    }

    @property bool empty() {
        initialize();
        return r.empty;
    }

    @property auto front() {
        initialize();
        return r.front;
    }

    void popFront() {
        r.popFront();
        is_initialized = false;
    }

    auto sum() {
        typeof(r.front) result = 0;
        foreach(v; this)
            result += v;
        return result;
    }
}

auto filter(alias predicate, Range)(Range r) {
    return Filter!(predicate, Range)(r);
}

extern bool myPredicate(int);

struct MyGen {
    int front;
    int max;

    @property bool empty() { return front > max; }
    void popFront() { front++; }
}

void main() {
    auto gen = MyGen(1, 6);
    gen.filter!myPredicate.sum();
    gen.filter!"a % 2".sum();
    gen.filter!(a => true).sum();
}
///////////////////////////////////

Compile this with:

    ldc2 -O3 -release -output-s demo.d

and have a look at the generated assembly code for the Filter.sum() functions. All of them have been inlined as much as possible, and in particular the variable `is_initialized` has disappeared, even in the version that uses an external (unknown to the compiler) predicate.

Of course, if you don't access the range in a tight loop, these optimizations usually aren't possible. But arguably, in such cases an extra variable and an extra check for initialization aren't too bad.

This means that we can have lazy implementations even without a guaranteed call to `empty`, and still have comparable performance to eagerly initialized ranges where it matters most.
May 14 2014
On Wednesday, 14 May 2014 at 19:32:57 UTC, Marc Schütz wrote:

> Compile this with: ldc2 -O3 -release -output-s demo.d and have a look at the generated assembly code for the Filter.sum() functions. All of them have been inlined as much as possible, and in particular the variable `is_initialized` has disappeared, even in the version that uses an external (unknown to the compiler) predicate.

I only have DMD installed, but I trust your word for it. However I don't feel confident that compilers will ever catch up, even for simple loops.

Take for instance floating point. Floating point math is inaccurate, that means the compiler will have to guess what kind of accuracy you are happy with… It can't, so it can't optimize real well, even when loop unrolling. Why is that? Well, because even if it can create SIMD instructions it cannot decouple the dependencies between consecutive elements without creating drift between the calculations over time. It has to assume the worst case.

If you have a simple generator like this:

    sample[n] = f( sample[n-1] )

you could in theory do

    sample[n]   = f(f(f(f( sample[n-4] ))))
    sample[n+1] = f(f(f(f( sample[n-3] ))))
    sample[n+2] = f(f(f(f( sample[n-2] ))))
    sample[n+3] = f(f(f(f( sample[n-1] ))))

But because of floating point inaccuracies you would then risk that sample[BIGNUMBER] and sample[BIGNUMBER+1] are completely disconnected, which could be a disaster. So only hand optimization and analysis of the math and stability is sufficient to get the SIMD speed-up.

> This means that we can have lazy implementations even without a guaranteed call to `empty`, and still have comparable performance to eagerly initialized ranges where it matters most.

Maybe you can. :-) I will have to get my hands on ldc and try to create a counter-example… Hm.
May 14 2014
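An illustrative sketch (not from the post) of the rounding-drift concern behind the unrolled recurrence above: in float, subtracting delta four separate times is not the same as subtracting 4*delta once, and that kind of algebraic rewrite is what a vectorizer needs in order to decouple the serial chain. The constants are arbitrary and the exact discrepancy is platform dependent.

import std.stdio : writefln;

void main()
{
    enum N = 10_000_000;
    immutable float delta = 1.0f / 3.0f;   // not exactly representable

    // The serial recurrence: one rounded subtraction per sample.
    float serial = 1.0e6f;
    foreach (i; 0 .. N)
        serial -= delta;

    // The strength-reduced form a vectorizer would prefer: four steps fused
    // into one subtraction of 4*delta, so rounding happens at different points.
    immutable float delta4 = 4.0f * delta;
    float fused = 1.0e6f;
    foreach (i; 0 .. N / 4)
        fused -= delta4;

    // Mathematically equal; numerically they drift apart.
    writefln("serial = %s, fused = %s", serial, fused);
}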
On Wednesday, 14 May 2014 at 06:00:33 UTC, Ola Fosheim Grøstad wrote:

> Besides, D ranges will never perform as well as an optimized explicit loop, so you might as well aim for usability over speed.

There is no single reason why this should be true. Actually, ranges of medium complexity are already very close to explicit loops in D when you use something like LDC. Consider a sample program like this:

void main()
{
    import std.stdio;
    int max;
    readf("%d", &max);
    assert(foo1(max) == foo2(max));
}

int foo1(int max)
{
    int sum = 0;
    for (auto i = 0; i < max; ++i)
    {
        sum += i*2;
    }
    return sum;
}

int foo2(int max)
{
    import std.algorithm : reduce, map;
    import std.range : iota;
    return reduce!((a, b) => a + b)(0, iota(0, max).map!(a => a*2));
}

Disassembly (ldmd2 -O -release -inline):

0000000000402df0 <_D4test4foo1FiZi>:
  402df0: 31 c0                   xor    %eax,%eax
  402df2: 85 ff                   test   %edi,%edi
  402df4: 7e 10                   jle    402e06 <_D4test4foo1FiZi+0x16>
  402df6: 8d 47 ff                lea    -0x1(%rdi),%eax
  402df9: 8d 4f fe                lea    -0x2(%rdi),%ecx
  402dfc: 0f af c8                imul   %eax,%ecx
  402dff: 83 e1 fe                and    $0xfffffffe,%ecx
  402e02: 8d 44 79 fe             lea    -0x2(%rcx,%rdi,2),%eax
  402e06: c3                      retq
  402e07: 66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  402e0e: 00 00

0000000000402e10 <_D4test4foo2FiZi>:
  402e10: 31 c0                   xor    %eax,%eax
  402e12: 85 ff                   test   %edi,%edi
  402e14: 0f 48 f8                cmovs  %eax,%edi
  402e17: 85 ff                   test   %edi,%edi
  402e19: 74 10                   je     402e2b <_D4test4foo2FiZi+0x1b>
  402e1b: 8d 47 ff                lea    -0x1(%rdi),%eax
  402e1e: 8d 4f fe                lea    -0x2(%rdi),%ecx
  402e21: 0f af c8                imul   %eax,%ecx
  402e24: 83 e1 fe                and    $0xfffffffe,%ecx
  402e27: 8d 44 79 fe             lea    -0x2(%rcx,%rdi,2),%eax
  402e2b: c3                      retq
  402e2c: 0f 1f 40 00             nopl   0x0(%rax)

It is almost identical. I fully expect it to be 100% identical at a certain point of compiler maturity.
May 14 2014
On Wednesday, 14 May 2014 at 13:29:35 UTC, Dicebot wrote:

> int foo1(int max)
> {
>     int sum = 0;
>     for (auto i = 0; i < max; ++i)
>     {
>         sum += i*2;
>     }
>     return sum;
> }

This isn't really a loop. It is an expression: n*(n+1)

Try something more realistic such as doing alpha-blending to a buffer or DSP.
May 14 2014
On Wednesday, 14 May 2014 at 14:00:18 UTC, Ola Fosheim Grøstad wrote:

> On Wednesday, 14 May 2014 at 13:29:35 UTC, Dicebot wrote:
>
>> int foo1(int max)
>> {
>>     int sum = 0;
>>     for (auto i = 0; i < max; ++i)
>>     {
>>         sum += i*2;
>>     }
>>     return sum;
>> }
>
> This isn't really a loop. It is an expression: n*(n+1)
>
> Try something more realistic such as doing alpha-blending to a buffer or DSP.

I think it is your turn to make a counter-example ;) It does not matter that much what the loop actually does. It was effectively an expression here, but the compiler was able to reduce the range version to quite the same expression despite all the intermediate code. Of course there are still many ways to fool existing optimizers, but declaring a fundamental inability to do so? No way.
May 15 2014
On Thursday, 15 May 2014 at 12:34:49 UTC, Dicebot wrote:

> I think it is your turn to make a counter-example ;)

I've given 2 examples that no compiler can get at… because they lack contextual knowledge. ;)

Basically anything that involves guards at the end is an issue, most things that involve floating point are an issue, but there are others too I think… I'll get back to counter-examples later when I have time to look closer at LDC. (not this week)
May 15 2014
On Thursday, 15 May 2014 at 13:01:34 UTC, Ola Fosheim Grøstad wrote:

> On Thursday, 15 May 2014 at 12:34:49 UTC, Dicebot wrote:
>
>> I think it is your turn to make a counter-example ;)
>
> I've given 2 examples that no compiler can get at… because they lack contextual knowledge. ;)

If the compiler lacks contextual knowledge, that only means that the range is not actually semantically equivalent to a loop. Which is quite possible if you start considering something like SIMD, this is true. What is wrong is the assumption that such kinds of loops are anything but a tiny minority and that this warrants a generic "ranges can never be as fast as loops" statement.

Btw,

> Floating point math is inaccurate, that means the compiler will have to guess what kind of accuracy you are happy with…

AFAIK it is actually not true. The floating point standard defines basic precision guarantees and compilers are free to make any adjustments outside of those basics. In your example you need to verify the generated assembly anyway, loop or range.
May 15 2014
On Thursday, 15 May 2014 at 13:10:29 UTC, Dicebot wrote:

> If the compiler lacks contextual knowledge, that only means that the range is not actually semantically equivalent to a loop.

Not really. Here is a simple example, a sawtooth generator that goes from 1 to 0 with a cycle length >=2 (nyquist).

import std.algorithm;
import std.math;

struct gen_sawtooth {
    enum bool empty = false;
    float front;
    float delta;

    this(immutable float samps_per_cycle){
        front = 1.0f;
        delta = 1.0f/samps_per_cycle;
    }

    void popFront() {
        front -= delta;
        if(front < 0.0f) front += 1.0f;
    }
}

void rangefill_sawtooth(float[] a, immutable float samps_per_cycle){
    auto gen = gen_sawtooth(samps_per_cycle);
    fill(a, gen);
}

void fill_sawtooth(float[] a, immutable float samps_per_cycle)
{
    if(a.length==0) return;
    immutable float delta = 1.0f/samps_per_cycle;
    immutable max_cycle_length = cast(uint)floor(samps_per_cycle*1.0001+1);
    immutable lastcycle = a.length - max_cycle_length;
    float v = 1.0f;
    uint i = 0;
    do {
        do{ // main inner loop
            a[i] = v;
            ++i;
            v -= delta;
        } while(v>=0.0f);
        v += 1.0f;
    } while( i < lastcycle );
    for(;i<a.length;++i){ // can be optimized further
        a[i] = v;
        v -= delta;
        if (v<0.0f) v += 1.0f;
    }
}

The generated code for the range based version does not create a fast inner loop, it lacks enough context and smarts:

fill_sawtooth main inner loop:

    loop:   incl    %eax
            leal    -2(%rax), %edx
            movss   %xmm1, (%rbx,%rdx,4)
            subss   %xmm0, %xmm1
            ucomiss %xmm3, %xmm1
            jae     loop

rangefill_sawtooth inner loop:

    loop:   movss   %xmm3, (%rsi)
            subss   %xmm2, %xmm3
            decq    %rdi
            ucomiss %xmm3, %xmm0
            jbe     skip
            addss   %xmm1, %xmm3
    skip:   addq    $4, %rsi
            testq   %rdi, %rdi
            jne     loop

> What is wrong is the assumption that such kinds of loops are anything but a tiny minority and that this warrants a generic "ranges can never be as fast as loops" statement.

Not without significant work…

>> Floating point math is inaccurate, that means the compiler will have to guess what kind of accuracy you are happy with…
>
> AFAIK it is actually not true. The floating point standard defines basic precision guarantees

Minimum, yes. That makes drift unpredictable. Sometimes you care more about deltas than absolute values.
May 16 2014
On Tue, 13 May 2014 10:30:47 -0700, "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

> Of course, for efficiency purposes range-based code (esp. Phobos code) should try their best to only call .front once. But it should be perfectly permissible to call .front multiple times.

Oh, but that's part of the fun. If the range's front returns by ref or auto ref - or if it actually made front a public member variable - then it would be _less_ efficient to assign the result of front to a variable in order to avoid calling front multiple times. Much as ranges have a defined API, there's enough flexibility in the API and in how it's implemented for any given range that generalities about what is more or less efficient or which is more or less likely to avoid unintended behaviors aren't a sure thing - which is part of why that particular discussion was instigated in the first place and why discussions about stuff like whether front can be transitive or not keep popping up from time to time.

In general, it's probably better to avoid calling front multiple times though - the issue with map being a prime example of where the problem can be worse than simply an issue of efficiency - and in most cases, ranges don't have a front which returns by ref or auto ref. But unfortunately, because that's just _most_ cases, it does make it so that we can't make a blanket statement about what is more or less efficient.

- Jonathan M Davis
May 13 2014
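A sketch (not from the post) of the two cases Jonathan contrasts: map's by-value front re-runs its work on every call, while a by-ref front is just an lvalue that a local copy would only duplicate. The `Counted` range is invented for illustration.

import std.algorithm : map;
import std.stdio : writeln;

// A toy range whose front returns by ref and counts its own evaluations.
struct Counted
{
    int[] data;
    static int calls;
    bool empty() const { return data.length == 0; }
    ref int front() { ++calls; return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

void main()
{
    // map's front returns by value and re-runs the lambda on every call,
    // so two reads of front do the work twice (equal results, not the same object).
    auto r = [1, 2, 3].map!(a => a * 10);
    writeln(r.front == r.front);            // true

    // With a by-ref front there is nothing worth caching: each read is just
    // an lvalue access, and copying it into a local would add a copy.
    auto r2 = Counted([10, 20, 30]);
    r2.front = 42;                          // writes through the ref
    writeln(r2.data[0], " (front evaluated ", Counted.calls, " times)");
}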
On Tuesday, 13 May 2014 at 16:38:43 UTC, Benjamin Thaut wrote:

> I know that there was a recent discussion about how the methods of ranges should behave. E.g.
>
> - Does empty always have to be called before calling front or popFront?
> - Is it allowed to call front multiple times between two calls to popFront?
>
> Was there a result of that discussion? Is it documented somewhere?
>
> Kind Regards
> Benjamin Thaut

I don't remember any final decision made from that discussion.
May 13 2014