digitalmars.D - Just playing with compiler explorer to see assembly line count.

SrMordred <patric.dexheimer gmail.com> writes:
//D compiled with gdc 5.2 -O3

auto test(int[] arr, int cmp)
{
     int[] r;
     foreach(v ; arr)
         if(v == cmp)r~=v;
     return r;
}
// 51 lines of assembly

import std.algorithm : filter;
import std.array : array;

auto test(int[] arr, int cmp)
{
   return arr.filter!((v)=>v==cmp).array;
}
//1450 lines... what?

OK, let me also look at C++:
//gcc 7.2 -O3

vector<int> test(vector<int>& arr, int cmp) {
     vector<int> r;
     for(auto v : arr)
         if(v == cmp)r.push_back(v);
     return r;
}
//152 lines. More than D :)

vector<int> test(vector<int>& arr, int cmp) {
     vector<int> r;
     std::copy_if (arr.begin(), arr.end(), std::back_inserter(r),
      [cmp](int i){return i==cmp;} );
     return r;
}

//150 lines. That's what I expected earlier with D.
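
For anyone who wants to paste both C++ variants into Compiler Explorer side by side, here is a minimal self-contained sketch. The names test_loop, test_copy_if and check_equivalence are made up for illustration so that both versions fit in one file, and the input is taken by const reference, which is a small deviation from the snippets above. The check only confirms that the two variants return the same result; it says nothing about how many lines of assembly either one produces.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hand-written loop, same logic as the first C++ snippet.
std::vector<int> test_loop(const std::vector<int>& arr, int cmp) {
    std::vector<int> r;
    for (int v : arr)
        if (v == cmp) r.push_back(v);
    return r;
}

// std::copy_if version, same logic as the second C++ snippet.
std::vector<int> test_copy_if(const std::vector<int>& arr, int cmp) {
    std::vector<int> r;
    std::copy_if(arr.begin(), arr.end(), std::back_inserter(r),
                 [cmp](int i) { return i == cmp; });
    return r;
}

// Sanity check: the two variants should agree before their
// generated code is compared.
void check_equivalence() {
    const std::vector<int> input{1, 2, 3, 2, 2, 5};
    assert(test_loop(input, 2) == test_copy_if(input, 2));
    assert(test_loop(input, 2).size() == 3);
    assert(test_loop(input, 7).empty());
}
```

Calling check_equivalence() from main is enough to run the comparison. Note that assert is compiled out when NDEBUG is defined.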

Hmm. Let me be 'fair' and use std.container.array, just out of 
curiosity:

import std.container.array : Array;

auto test(ref Array!int arr, int cmp)
{
     Array!int r;
     foreach(v ; arr)
         if(v == cmp)r.insert(v);
     return r;
}

//5542 lines... what??

Anyone interested in discussing this?

Or in pointing out some grotesque mistake of mine?
Oct 03 2017
rikki cattermole <rikki cattermole.co.nz> writes:
Be warned, x86 CPUs today are not like they were 10 years ago. A good 
portion of a symbol could be full of nops and it could end up being 
faster than the one without them.

Next, compare primarily against ldc, not gdc. It's better maintained 
and, ugh, more in line with dmd (it's a bit of a mess, let's not go 
there). Of course there's nothing wrong with doing both.

std.container.* is basically dead. We need to replace it. We are 
currently waiting on std.experimental.allocator before going much 
further (also a lot of other no-gc stuff).

Compare (on https://d.godbolt.org/ with "ldc -O3" and "gdc -O3"):
---
auto test1(int[] arr, int cmp)
{
     int[] r;
     foreach(v ; arr)
       if(v == cmp)r~=v;
     return r;
}

import std.container.array;
auto test2(ref Array!int arr, int cmp)
{
     Array!int r;
     foreach(v ; arr)
       if(v == cmp)r.insert(v);
     return r;
}
---
Oct 03 2017
SrMordred <patric.dexheimer gmail.com> writes:
With ldc the results are similar: 5k+ lines.
And I know, I'm not doing a performance comparison yet. But you know, 
less code is more cache friendly (and sometimes performs better).
But my big surprise was with .filter.
Oct 03 2017
Iain Buclaw <ibuclaw gdcproject.org> writes:
On Tuesday, 3 October 2017 at 14:07:39 UTC, SrMordred wrote:
 With ldc the results are similar: 5k+ lines. And I know, I'm not 
 doing a performance comparison yet. But you know, less code is 
 more cache friendly (and sometimes performs better).

Well, -O3 does not generate cache-friendly assembly anyway. For 
instance, it inlines functions regardless of cost. I'd take the 
assembly line count with a pinch of salt when you are using 
templated code.

Iain
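
One way to see why a raw line count deserves that pinch of salt is to split a listing into instructions, nop padding, labels and directives. The sketch below is a hypothetical helper, not part of Compiler Explorer or any compiler mentioned in this thread. It assumes GNU-style assembler text as gcc and gdc emit it: labels end in ':', directives start with '.', and comment lines start with '#'. Other syntaxes would need different rules.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: per-category line counts for one listing.
struct AsmLineStats {
    std::size_t instructions = 0; // real instructions, nops excluded
    std::size_t nops = 0;         // nop padding, typically from alignment
    std::size_t labels = 0;       // lines ending in ':'
    std::size_t directives = 0;   // lines starting with '.', e.g. .p2align
};

// Strip leading and trailing spaces, tabs and carriage returns.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Classify every non-blank, non-comment line of the listing.
AsmLineStats classify_asm(const std::string& listing) {
    AsmLineStats stats;
    std::istringstream in(listing);
    std::string raw;
    while (std::getline(in, raw)) {
        const std::string line = trim(raw);
        if (line.empty() || line[0] == '#') continue; // blank or comment
        if (line.back() == ':') { ++stats.labels; continue; }
        if (line[0] == '.') { ++stats.directives; continue; }
        // The mnemonic is the first whitespace-delimited token.
        const std::string mnemonic =
            line.substr(0, line.find_first_of(" \t"));
        if (mnemonic.rfind("nop", 0) == 0) ++stats.nops; // nop, nopw, nopl
        else ++stats.instructions;
    }
    return stats;
}
```

Two listings with the same total line count can differ a lot in the instructions field. Even that number says nothing about which of those instructions actually run in the hot path.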
Oct 03 2017
Daniel Kozak <kozzi11 gmail.com> writes:
is not bad

https://godbolt.org/g/bSfubs

Oct 03 2017
SrMordred <patric.dexheimer gmail.com> writes:
On Tuesday, 3 October 2017 at 17:15:04 UTC, Daniel Kozak wrote:
 is not bad

 https://godbolt.org/g/bSfubs
That's cool, I never used copy xD.
(but you returned the .copy range, not the 'r' array ;p)

//now with ldc 1.4 and -O3 -release -boundscheck=off
foreach       -> 99 lines
.filter.copy  -> 368 lines
.filter.array -> 1229 lines (1002 lines with -O1)
Oct 03 2017