www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Performance optimization for minElement and maxElement

reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
Since the minimum and maximum values of fundamental types are known, can 
minElement and maxElement shortcut and return that value immediately?

I think such an optimization may break existing code because it may not 
consume ranges in old code, which may be surprised by it. (?)

I ask because the documentation says "O(n) Exactly n - 1 comparisons are 
needed."

Ali

P.S. I decided to use my own function instead of minElement; I return 0 
immediately on a range of size_t elements.
Sep 11 2022
next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli via Digitalmars-d wrote:
 Since the minimum and maximum values of fundamental types are known, can
 minElement and maxElement shortcut and return that value immediately?
No, because that value may not appear in the given range. T -- Having a smoking section in a restaurant is like having a peeing section in a swimming pool. -- Edward Burr
Sep 12 2022
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 9/12/22 04:07, H. S. Teoh wrote:
 On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli via 
Digitalmars-d wrote:
 Since the minimum and maximum values of fundamental types are known, can
 minElement and maxElement shortcut and return that value immediately?
No, because that value may not appear in the given range. T
I am not that silly. :) What I meant is "as soon as they see e.g. 0". Say, at the third element of a million-element range... Ali
Sep 12 2022
next sibling parent reply Sergey <kornburn yandex.ru> writes:
On Monday, 12 September 2022 at 11:46:23 UTC, Ali Çehreli wrote:
 On 9/12/22 04:07, H. S. Teoh wrote:
 On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli via
Digitalmars-d wrote:
 Since the minimum and maximum values of fundamental types
are known, can
 minElement and maxElement shortcut and return that value
immediately?
 No, because that value may not appear in the given range.


 T
I am not that silly. :) What I meant is "as soon as they see e.g. 0". Say, at the third element of a million-element range... Ali
In case zero will be the last element - the search should make an extra comparisons “if currentElement = T.minimum” which double operations.. Isn’t it?
Sep 12 2022
next sibling parent Ogi <ogion.art gmail.com> writes:
On Monday, 12 September 2022 at 11:53:34 UTC, Sergey wrote:
 In case zero will be the last element - the search should make 
 an extra comparisons “if currentElement = T.minimum” which 
 double operations..
 Isn’t it?
Branch prediction makes this extra check basically free.
Sep 12 2022
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/12/22 7:53 AM, Sergey wrote:
 On Monday, 12 September 2022 at 11:46:23 UTC, Ali Çehreli wrote:
 I am not that silly. :) What I meant is "as soon as they see e.g. 0". 
 Say, at the third element of a million-element range...
In case zero will be the last element - the search should make an extra comparisons “if currentElement = T.minimum” which double operations.. Isn’t it?
It could only make comparisons when assigning the "current minimum". I don't think it would cost that much. Complexity would be the same anyway. One thing that is worrisome though, is the fact that `min` and `max` aren't always the minimum and maximum bit patterns available. So short-cutting the search might do something you don't want. e.g. an enum which defines a bitfield might have values of 1, 2, 4, 8, but instances of the enum might go all the way up to 15 (all bits set). However, the `.max` value would be 8. And in other cases, the enum truly may just be an enum, where the `.max` value is the maximum value, but you'd have to defensively check for the max representable value, making it perform this check for no reason. A possibility is to provide an overload that accepts the absolute minimum or maximum, and only then does it do the extra comparisons. -Steve
Sep 12 2022
next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Mon, Sep 12, 2022 at 10:15:45AM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 9/12/22 7:53 AM, Sergey wrote:
 On Monday, 12 September 2022 at 11:46:23 UTC, Ali Çehreli wrote:
 
 I am not that silly. :) What I meant is "as soon as they see e.g.
 0". Say, at the third element of a million-element range...
 
In case zero will be the last element - the search should make an extra comparisons “if currentElement = T.minimum” which double operations.. Isn’t it?
It could only make comparisons when assigning the "current minimum". I don't think it would cost that much. Complexity would be the same anyway. One thing that is worrisome though, is the fact that `min` and `max` aren't always the minimum and maximum bit patterns available. So short-cutting the search might do something you don't want. e.g. an enum which defines a bitfield might have values of 1, 2, 4, 8, but instances of the enum might go all the way up to 15 (all bits set). However, the `.max` value would be 8. And in other cases, the enum truly may just be an enum, where the `.max` value is the maximum value, but you'd have to defensively check for the max representable value, making it perform this check for no reason.
I would on principle exclude enums from this optimization. Besides, enums as bitfields IMO are a code smell; there really should be an "official" bitflags type constructor that takes an enum and creates a bitfield type based on that enum. The constructed type would have .min and .max appropriately defined as 0 and the bitwise OR of all flags, respectively.
 A possibility is to provide an overload that accepts the absolute
 minimum or maximum, and only then does it do the extra comparisons.
[...] Please don't. This smells like one of those well-intentioned extensions that 5 years later will have us wondering "how did Phobos become this hairy?". T -- Debian GNU/Linux: Cray on your desktop.
Sep 12 2022
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/12/22 10:59 AM, H. S. Teoh wrote:
 On Mon, Sep 12, 2022 at 10:15:45AM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 A possibility is to provide an overload that accepts the absolute
 minimum or maximum, and only then does it do the extra comparisons.
[...] Please don't. This smells like one of those well-intentioned extensions that 5 years later will have us wondering "how did Phobos become this hairy?".
This isn't that crazy though. Checking against an absolute minimum or maximum has a cost, one which you may not want to incur. For instance, uint.min is 0, this might be a common occurrence in some ranges. But int.min is -2billion. That might *never* occur in some ranges. Why waste time checking for something that will never occur in the name of "performance"? For many types and data sets, this is a net pessimization. And there's not a way to do this shortcut via type wrapping. The only option is to alter the algorithm. Making it opt-in isn't a bad idea, but of course, the API could be badly designed. -Steve
Sep 12 2022
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 12 September 2022 at 15:27:25 UTC, Steven 
Schveighoffer wrote:
 This isn't that crazy though. Checking against an absolute 
 minimum or maximum has a cost, one which you may not want to 
 incur.

 For instance, uint.min is 0, this might be a common occurrence 
 in some ranges. But int.min is -2billion. That might *never* 
 occur in some ranges. Why waste time checking for something 
 that will never occur in the name of "performance"? For many 
 types and data sets, this is a net pessimization.

 And there's not a way to do this shortcut via type wrapping. 
 The only option is to alter the algorithm. Making it opt-in 
 isn't a bad idea, but of course, the API could be badly 
 designed.
It seems like maybe what's desired here is a version of [`fold`][1] that allows for early termination. Like, instead of just returning the next accumulator value, the callback could return a tuple of the next value and a `bool continue`, or something. [1]: https://phobos.dpldocs.info/std.algorithm.iteration.fold.html
Sep 12 2022
parent reply JG <someone somewhere.com> writes:
On Monday, 12 September 2022 at 16:26:01 UTC, Paul Backus wrote:
 On Monday, 12 September 2022 at 15:27:25 UTC, Steven 
 Schveighoffer wrote:
 This isn't that crazy though. Checking against an absolute 
 minimum or maximum has a cost, one which you may not want to 
 incur.

 For instance, uint.min is 0, this might be a common occurrence 
 in some ranges. But int.min is -2billion. That might *never* 
 occur in some ranges. Why waste time checking for something 
 that will never occur in the name of "performance"? For many 
 types and data sets, this is a net pessimization.

 And there's not a way to do this shortcut via type wrapping. 
 The only option is to alter the algorithm. Making it opt-in 
 isn't a bad idea, but of course, the API could be badly 
 designed.
It seems like maybe what's desired here is a version of [`fold`][1] that allows for early termination. Like, instead of just returning the next accumulator value, the callback could return a tuple of the next value and a `bool continue`, or something. [1]: https://phobos.dpldocs.info/std.algorithm.iteration.fold.html
Could build it like this (not sure performance is great). ```d import std; auto stoppingFold(alias stopCond, alias op, R)(R r) { return r.cumulativeFold!op.until!(stopCond)(No.openRight).fold!((a,b)=>b); } void main() { auto x = [1,2,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); x = [1,0,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); } ```
Sep 12 2022
parent JG <someone somewhere.com> writes:
On Monday, 12 September 2022 at 17:05:44 UTC, JG wrote:
 On Monday, 12 September 2022 at 16:26:01 UTC, Paul Backus wrote:
 [...]
Could build it like this (not sure performance is great). ```d import std; auto stoppingFold(alias stopCond, alias op, R)(R r) { return r.cumulativeFold!op.until!(stopCond)(No.openRight).fold!((a,b)=>b); } void main() { auto x = [1,2,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); x = [1,0,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); } ```
Oops last assert is wrong... should be ==0
Sep 12 2022
prev sibling parent =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 9/12/22 08:27, Steven Schveighoffer wrote:

 For instance, uint.min is 0, this might be a common occurrence in some
 ranges. But int.min is -2billion. That might *never* occur in some
 ranges.
Makes sense. My mind caught this "optimization" (if at all) on unsigned types being 0. So I started writing the original post on unsigned types but generalized to all fundamental types as an afterthought. Even maxElement was added by generalization... :) Yeah, I think this is about unsigned and 0. Ali
Sep 12 2022
prev sibling parent Nick Treleaven <nick geany.org> writes:
On Monday, 12 September 2022 at 14:15:45 UTC, Steven 
Schveighoffer wrote:
 On 9/12/22 7:53 AM, Sergey wrote:
 One thing that is worrisome though, is the fact that `min` and 
 `max` aren't always the minimum and maximum bit patterns 
 available. So short-cutting the search might do something you 
 don't want.
Are they only broken for enum?
 e.g. an enum which defines a bitfield might have values of 1, 
 2, 4, 8, but instances of the enum might go all the way up to 
 15 (all bits set). However, the `.max` value would be 8.
Actually dmd allows other operations besides `|` and `&` on enum members: ```d enum E { a, b=5, c=2 } pragma(msg, E.max); // E.b immutable E e = E.b << 1; pragma(msg, e); // cast(E)10 immutable E f = E.b + E.b + E.b; pragma(msg, f); // cast(E)15 ``` All bits set for E would be 7. I don't know why those operations are allowed.
Sep 12 2022
prev sibling parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Mon, Sep 12, 2022 at 04:46:23AM -0700, Ali Çehreli via Digitalmars-d wrote:
 On 9/12/22 04:07, H. S. Teoh wrote:
 On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli via Digitalmars-d
wrote:
 Since the minimum and maximum values of fundamental types are
 known, can minElement and maxElement shortcut and return that value
 immediately?
No, because that value may not appear in the given range.
[...]
 I am not that silly. :) What I meant is "as soon as they see e.g. 0".
 Say, at the third element of a million-element range...
[...] Sounds like a good idea. T -- PNP = Plug 'N' Pray
Sep 12 2022
prev sibling parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Monday, 12 September 2022 at 02:40:35 UTC, Ali Çehreli wrote:
 Since the minimum and maximum values of fundamental types are 
 known, can minElement and maxElement shortcut and return that 
 value immediately?
I would take a peek a look the standard libraries of C++ and Rust see if they apply this optimization trick. The real question is what input the algorithm should be optimized for. And how big of an overhead such a check has in the case where early return never happens.
Sep 12 2022