digitalmars.D.learn - Zip vs Enumerate

Jordan Wilson (28/28) Jun 19 2018 Hello,

Cym13 (8/36) Jun 19 2018 This sounds like a very limited case: if you're not zipping

Cym13 (2/11) Jun 19 2018 My bad, I didn't read your example thoroughly enough.

Dukc (5/7) Jun 19 2018 The problem might be that zip checks both arrays for empty during

Jordan Wilson (31/38) Jun 20 2018 takeExactly didn't change the timings at all.

Jordan Wilson <wilsonjord gmail.com> writes:

Hello,

Idiomatically, I make use of zip, however, when looking to speed 
up my program, notice that using enumerate leads to a 20-30% 
improvement:

void main(){
     auto x = iota(1_000).array;
     auto y = iota(1_000).array;

     auto func1() {
         return zip(x,y).map!(a => a[0]+a[1])
                        .array;
     }

     auto func2() {
         return x.enumerate
                 .map!(a => a.value + y[a.index])
                 .array;
     }

     assert(func1.equal(func2));

     import std.datetime.stopwatch;
     auto r = benchmark!(func1, func2)(10_000);
     // r[0] approx 1794 ms
     // r[1] approx 1295 ms
}

Is there anything I can do to improve zip, before I go ahead and 
change to the faster but slightly less readable enumerate? In my 
particular case, all ranges that I zip are of the same length, 
but not sure how I can leverage that information.

Thanks,

Jordan

Jun 19 2018

Cym13 <cpicard openmailbox.org> writes:

On Wednesday, 20 June 2018 at 03:44:58 UTC, Jordan Wilson wrote:
 Hello,

 Idiomatically, I make use of zip, however, when looking to 
 speed up my program, notice that using enumerate leads to a 
 20-30% improvement:

 void main(){
     auto x = iota(1_000).array;
     auto y = iota(1_000).array;

     auto func1() {
         return zip(x,y).map!(a => a[0]+a[1])
                        .array;
     }

     auto func2() {
         return x.enumerate
                 .map!(a => a.value + y[a.index])
                 .array;
     }

     assert(func1.equal(func2));

     import std.datetime.stopwatch;
     auto r = benchmark!(func1, func2)(10_000);
     // r[0] approx 1794 ms
     // r[1] approx 1295 ms
 }

 Is there anything I can do to improve zip, before I go ahead 
 and change to the faster but slightly less readable enumerate? 
 In my particular case, all ranges that I zip are of the same 
 length, but not sure how I can leverage that information.

 Thanks,

 Jordan

This sounds like a very limited case: if you're not zipping 
against a iota(foo) then there's no comparing with enumerate, 
they simply don't do the same thing at all, and if you are then 
it sounds like enumerate expresses the purpose of using an index 
way better than a zipped iota IMHO.

Is there anything about your true use case that would be worth 
mentionning to better understand your situation?

Jun 19 2018

Cym13 <cpicard openmailbox.org> writes:

On Wednesday, 20 June 2018 at 04:40:42 UTC, Cym13 wrote:
 On Wednesday, 20 June 2018 at 03:44:58 UTC, Jordan Wilson wrote:
 [...]

 This sounds like a very limited case: if you're not zipping 
 against a iota(foo) then there's no comparing with enumerate, 
 they simply don't do the same thing at all, and if you are then 
 it sounds like enumerate expresses the purpose of using an 
 index way better than a zipped iota IMHO.

 Is there anything about your true use case that would be worth 
 mentionning to better understand your situation?

My bad, I didn't read your example thoroughly enough.

Jun 19 2018

Dukc <ajieskola gmail.com> writes:

On Wednesday, 20 June 2018 at 03:44:58 UTC, Jordan Wilson wrote:
 Is there anything I can do to improve zip, before I go ahead 
 and change to the faster but slightly less readable enumerate?

The problem might be that zip checks both arrays for empty during 
each step, enumerate only the first one. Try appending 
takeExactly(1000) on the result of zip. It might be even faster 
than enumerate.

Jun 19 2018

Jordan Wilson <wilsonjord gmail.com> writes:

On Wednesday, 20 June 2018 at 05:49:15 UTC, Dukc wrote:
 On Wednesday, 20 June 2018 at 03:44:58 UTC, Jordan Wilson wrote:
 Is there anything I can do to improve zip, before I go ahead 
 and change to the faster but slightly less readable enumerate?

 The problem might be that zip checks both arrays for empty 
 during each step, enumerate only the first one. Try appending 
 takeExactly(1000) on the result of zip. It might be even faster 
 than enumerate.

takeExactly didn't change the timings at all.
I did write a rough zip function that took advantage of random 
access inputs and didn't need to account for different stopping 
policies:

auto myZip(R1,R2) (R1 r1, R2 r2) {
     struct MyZip(R1,R2) {
         size_t index=0;
         R1 rng1;
         R2 rng2;
         size_t length;
         this (R1 r1, R2 r2) {
             rng1=r1;
             rng2=r2;
             index=0;
             length=r1.length;
         }

         auto empty() { return index==length; }
         auto front() { return tuple(rng1[index],rng2[index]); }
         void popFront() { index++; }
     }
     return MyZip!(R1,R2)(r1,r2);
}

This ended up being 40% faster than normal zip, both DMD 32/64.
I briefly tested with LDC and didn't notice any difference, which 
I thought was weird, but didn't have time to investigate further.

That fact that zip is slower is not bad, as it provides far more 
flexibility, different stopping policies, all types of ranges.

What is interesting (TBC) is the fact that it's only noticeably 
slower using DMD.

Jordan

Jun 20 2018

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Zip vs Enumerate