www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 12169] New: sum(int[]) should return a int

reply d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169

           Summary: sum(int[]) should return a int
           Product: D
           Version: D2
          Platform: x86
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



import std.algorithm: sum;
void main() {
    int[] arr1;
    auto arr2 = new int[arr1.sum];
}


With dmd 2.065beta3 gives:

test3.d(4,25): Error: cannot implicitly convert expression (sum(arr1)) of type
long to uint

It seems a sum(int[]) gives a long. 99% of the cases this is not what I need.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 15 2014
next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169


Peter Alexander <peter.alexander.au gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter.alexander.au gmail.co
                   |                            |m
         AssignedTo|nobody puremagic.com        |andrei erdani.com



05:49:52 PST ---
This is by design in the code, presumably to avoid overflow (which you are
usually so concerned about, what changed?)

I actually agree with you though. I'd expect sum!(int[]) to return int. In
general, I'd expect sum!(T[]) to return typeof(T.init + T.init), and respect
the language integral promotions.

A nice compromise might be to adjust the API:

int[] xs;
int sum1 = sum(xs);
long sum2 = sum!long(xs);

That way, you can specify what type to return. This also solves the problem of
potential overflows with longs (just specify ReturnType = BigInt).

I'll assign this to Andrei for his thoughts.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 15 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169




PST ---
If sum returned int for ranges of int, it would be 100% identical to
std.algorithm.reduce. It's worth returning a 64-bit quantity because (a) 64-bit
summing is about as fast as 32-bit summing, (b) the result can be cast to int
if needed.

I propose we close this.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 15 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169





 This is by design in the code, presumably to avoid overflow (which you are
 usually so concerned about, what changed?)
I have tens of usages of sum in my code and no case I need a long result. So the Phobos sum halves my usability. And I need to keep my own sum() around.
 I actually agree with you though. I'd expect sum!(int[]) to return int. In
 general, I'd expect sum!(T[]) to return typeof(T.init + T.init), and respect
 the language integral promotions.
Right.
 A nice compromise might be to adjust the API:
 
 int[] xs;
 int sum1 = sum(xs);
 long sum2 = sum!long(xs);
 
 That way, you can specify what type to return. This also solves the problem of
 potential overflows with longs (just specify ReturnType = BigInt).
This is OK, but an alternative (and in my opinion better) solution is in Issue 12173: long sum2 = xs.sum(0L); -- Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 15 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169






 If sum returned int for ranges of int, it would be 100% identical to
 std.algorithm.reduce.
If it's equal to what reduce does, it gives less surprises to the programmer. I am sure in Haskell you don't see a sum to act differently from reduce. One is defined on the other, for uniformity, simplicity and minimize surprises. Also what do you want sum(long[]) to return on default? A "cent" perhaps?
 It's worth returning a 64-bit quantity because (a) 64-bit
 summing is about as fast as 32-bit summing, (b) the result can be cast to int
 if needed.
I am not so fond of casts. What if later I change the code and I am now summing floats instead? The cast(int) will remove the floating point part :-)
 I propose we close this.
Let's hear what others think first :-) -- Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 15 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169


timon.gehr gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timon.gehr gmx.ch




 This is by design in the code, presumably to avoid overflow (which you are
 usually so concerned about, what changed?)
 
 I actually agree with you though. I'd expect sum!(int[]) to return int. In
 general, I'd expect sum!(T[]) to return typeof(T.init + T.init), and respect
 the language integral promotions.
 ...
+1.
 If sum returned int for ranges of int, it would be 100% identical to
 std.algorithm.reduce.
Not yet, there are more special cases. (but a more general divide and conquer reduction algorithm might be an useful addition.) BTW: D semantics do not seem to support Kahan summation. (Walter has expressed in the past that any operation might arbitrarily be performed at higher precision.)
 It's worth returning a 64-bit quantity because
 (a) 64-bit summing is about as fast as 32-bit summing,
 (b) the result can be cast to int if needed.
 
IMO it's not worth the special casing. (Another thing I find questionable: extending range capabilities can reduce precision.) Allowing specification of the summation type is enough. -- Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 16 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169






 presumably to avoid overflow (which you are
 usually so concerned about, what changed?)
This after very few seconds of run time prints the wrong result 3028092401290448384 instead of 4_294_967_295 * 5_000_000_000 = 21474836475000000000: struct Uints { ulong count; bool empty() { return count == 0; } void popFront() { count--; } uint front() { return uint.max; } } void main() { import std.stdio, std.algorithm; Uints(5_000_000_000).sum.writeln; } I think sum() should be changed to return an int before dmd 2.065final is released. -- Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 18 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra gmail.com




 import std.algorithm: sum;
 void main() {
     int[] arr1;
     auto arr2 = new int[arr1.sum];
 }
 
 
 With dmd 2.065beta3 gives:
 
 test3.d(4,25): Error: cannot implicitly convert expression (sum(arr1)) of type
 long to uint
 
 It seems a sum(int[]) gives a long. 99% of the cases this is not what I need.
Looks like I missed this when pulling. I even suggested this change: https://github.com/D-Programming-Language/phobos/pull/1205/files#r9175954 I didn't realize the wordy code was actually doing something different. I would have disagreed with it had I seen it.
 If sum returned int for ranges of int, it would be 100% identical to
 std.algorithm.reduce. It's worth returning a 64-bit quantity because (a) 64-bit
 summing is about as fast as 32-bit summing, (b) the result can be cast to int
 if needed.
 
 I propose we close this.
Wasn't the *point* for sum to be a specialization of reduce though? To be a more convenient and more accurate/efficient result? I don't think doing minor deviations as such is worth it.
 This is OK, but an alternative (and in my opinion better) solution is in Issue
 12173:
 
 long sum2 = xs.sum(0L);
I think this is a nice and simple solution. It perfectly fits with the idea that sum is reduce but specialized for numeric "+".

 I actually agree with you though. I'd expect sum!(int[]) to return int. In
 general, I'd expect sum!(T[]) to return typeof(T.init + T.init), and respect
 the language integral promotions.
 ...
+1.
 It's worth returning a 64-bit quantity because
 (a) 64-bit summing is about as fast as 32-bit summing,
 (b) the result can be cast to int if needed.
 
IMO it's not worth the special casing. (Another thing I find questionable: extending range capabilities can reduce precision.) Allowing specification of the summation type is enough.
+1 +1 -- Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 19 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169




PST ---
Fine. I think if sum returned int, having it return long would have been
submitted as an enhancement request :o). 

I'm busy for the time being, could someone please make a pull request. Thanks.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 25 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169




Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/e5c10ca4abaed05d7e206353a57453c7f92ec75c
Fix issue 12169 - sum(int[]) should return a int

https://github.com/D-Programming-Language/phobos/commit/8979c4612075612b2f70ddd7f96aa59691fb25ea


Fix issue 12169 - sum(int[]) should return a int

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 25 2014
prev sibling next sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 25 2014
prev sibling parent d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=12169




Commit pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/353c0dfba2fee297d92f97666533fe64eda4a6f9


Fixup sum documentation

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 26 2014