digitalmars.D - Performance optimization for minElement and maxElement
- =?UTF-8?Q?Ali_=c3=87ehreli?= (9/9) Sep 11 2022 Since the minimum and maximum values of fundamental types are known, can...
- H. S. Teoh (5/7) Sep 12 2022 No, because that value may not appear in the given range.
- =?UTF-8?Q?Ali_=c3=87ehreli?= (5/10) Sep 12 2022 I am not that silly. :) What I meant is "as soon as they see e.g. 0".
- Sergey (5/20) Sep 12 2022 In case zero will be the last element - the search should make an
- Ogi (2/6) Sep 12 2022 Branch prediction makes this extra check basically free.
- Steven Schveighoffer (15/24) Sep 12 2022 It could only make comparisons when assigning the "current minimum". I
- H. S. Teoh (14/41) Sep 12 2022 I would on principle exclude enums from this optimization.
- Steven Schveighoffer (12/20) Sep 12 2022 This isn't that crazy though. Checking against an absolute minimum or
- Paul Backus (8/20) Sep 12 2022 It seems like maybe what's desired here is a version of
- JG (15/38) Sep 12 2022 Could build it like this (not sure performance is great).
- JG (2/18) Sep 12 2022 Oops last assert is wrong... should be ==0
- =?UTF-8?Q?Ali_=c3=87ehreli?= (7/10) Sep 12 2022 Makes sense. My mind caught this "optimization" (if at all) on unsigned
- Nick Treleaven (20/28) Sep 12 2022 Are they only broken for enum?
- H. S. Teoh (7/17) Sep 12 2022 [...]
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (6/9) Sep 12 2022 I would take a peek a look the standard libraries of C++ and Rust
Since the minimum and maximum values of fundamental types are known, can minElement and maxElement shortcut and return that value immediately? I think such an optimization may break existing code because it may not consume ranges in old code, which may be surprised by it. (?) I ask because the documentation says "O(n) Exactly n - 1 comparisons are needed." Ali P.S. I decided to use my own function instead of minElement; I return 0 immediately on a range of size_t elements.
Sep 11 2022
On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli via Digitalmars-d wrote:Since the minimum and maximum values of fundamental types are known, can minElement and maxElement shortcut and return that value immediately?No, because that value may not appear in the given range. T -- Having a smoking section in a restaurant is like having a peeing section in a swimming pool. -- Edward Burr
Sep 12 2022
On 9/12/22 04:07, H. S. Teoh wrote:On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli viaDigitalmars-d wrote:I am not that silly. :) What I meant is "as soon as they see e.g. 0". Say, at the third element of a million-element range... AliSince the minimum and maximum values of fundamental types are known, can minElement and maxElement shortcut and return that value immediately?No, because that value may not appear in the given range. T
Sep 12 2022
On Monday, 12 September 2022 at 11:46:23 UTC, Ali Çehreli wrote:On 9/12/22 04:07, H. S. Teoh wrote:In case zero will be the last element - the search should make an extra comparisons “if currentElement = T.minimum” which double operations.. Isn’t it?On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli viaDigitalmars-d wrote:are known, canSince the minimum and maximum values of fundamental typesimmediately?minElement and maxElement shortcut and return that valueNo, because that value may not appear in the given range. TI am not that silly. :) What I meant is "as soon as they see e.g. 0". Say, at the third element of a million-element range... Ali
Sep 12 2022
On Monday, 12 September 2022 at 11:53:34 UTC, Sergey wrote:In case zero will be the last element - the search should make an extra comparisons “if currentElement = T.minimum” which double operations.. Isn’t it?Branch prediction makes this extra check basically free.
Sep 12 2022
On 9/12/22 7:53 AM, Sergey wrote:On Monday, 12 September 2022 at 11:46:23 UTC, Ali Çehreli wrote:It could only make comparisons when assigning the "current minimum". I don't think it would cost that much. Complexity would be the same anyway. One thing that is worrisome though, is the fact that `min` and `max` aren't always the minimum and maximum bit patterns available. So short-cutting the search might do something you don't want. e.g. an enum which defines a bitfield might have values of 1, 2, 4, 8, but instances of the enum might go all the way up to 15 (all bits set). However, the `.max` value would be 8. And in other cases, the enum truly may just be an enum, where the `.max` value is the maximum value, but you'd have to defensively check for the max representable value, making it perform this check for no reason. A possibility is to provide an overload that accepts the absolute minimum or maximum, and only then does it do the extra comparisons. -SteveI am not that silly. :) What I meant is "as soon as they see e.g. 0". Say, at the third element of a million-element range...In case zero will be the last element - the search should make an extra comparisons “if currentElement = T.minimum” which double operations.. Isn’t it?
Sep 12 2022
On Mon, Sep 12, 2022 at 10:15:45AM -0400, Steven Schveighoffer via Digitalmars-d wrote:On 9/12/22 7:53 AM, Sergey wrote:I would on principle exclude enums from this optimization. Besides, enums as bitfields IMO are a code smell; there really should be an "official" bitflags type constructor that takes an enum and creates a bitfield type based on that enum. The constructed type would have .min and .max appropriately defined as 0 and the bitwise OR of all flags, respectively.On Monday, 12 September 2022 at 11:46:23 UTC, Ali Çehreli wrote:It could only make comparisons when assigning the "current minimum". I don't think it would cost that much. Complexity would be the same anyway. One thing that is worrisome though, is the fact that `min` and `max` aren't always the minimum and maximum bit patterns available. So short-cutting the search might do something you don't want. e.g. an enum which defines a bitfield might have values of 1, 2, 4, 8, but instances of the enum might go all the way up to 15 (all bits set). However, the `.max` value would be 8. And in other cases, the enum truly may just be an enum, where the `.max` value is the maximum value, but you'd have to defensively check for the max representable value, making it perform this check for no reason.I am not that silly. :) What I meant is "as soon as they see e.g. 0". Say, at the third element of a million-element range...In case zero will be the last element - the search should make an extra comparisons “if currentElement = T.minimum” which double operations.. Isn’t it?A possibility is to provide an overload that accepts the absolute minimum or maximum, and only then does it do the extra comparisons.[...] Please don't. This smells like one of those well-intentioned extensions that 5 years later will have us wondering "how did Phobos become this hairy?". T -- Debian GNU/Linux: Cray on your desktop.
Sep 12 2022
On 9/12/22 10:59 AM, H. S. Teoh wrote:On Mon, Sep 12, 2022 at 10:15:45AM -0400, Steven Schveighoffer via Digitalmars-d wrote:This isn't that crazy though. Checking against an absolute minimum or maximum has a cost, one which you may not want to incur. For instance, uint.min is 0, this might be a common occurrence in some ranges. But int.min is -2billion. That might *never* occur in some ranges. Why waste time checking for something that will never occur in the name of "performance"? For many types and data sets, this is a net pessimization. And there's not a way to do this shortcut via type wrapping. The only option is to alter the algorithm. Making it opt-in isn't a bad idea, but of course, the API could be badly designed. -SteveA possibility is to provide an overload that accepts the absolute minimum or maximum, and only then does it do the extra comparisons.[...] Please don't. This smells like one of those well-intentioned extensions that 5 years later will have us wondering "how did Phobos become this hairy?".
Sep 12 2022
On Monday, 12 September 2022 at 15:27:25 UTC, Steven Schveighoffer wrote:This isn't that crazy though. Checking against an absolute minimum or maximum has a cost, one which you may not want to incur. For instance, uint.min is 0, this might be a common occurrence in some ranges. But int.min is -2billion. That might *never* occur in some ranges. Why waste time checking for something that will never occur in the name of "performance"? For many types and data sets, this is a net pessimization. And there's not a way to do this shortcut via type wrapping. The only option is to alter the algorithm. Making it opt-in isn't a bad idea, but of course, the API could be badly designed.It seems like maybe what's desired here is a version of [`fold`][1] that allows for early termination. Like, instead of just returning the next accumulator value, the callback could return a tuple of the next value and a `bool continue`, or something. [1]: https://phobos.dpldocs.info/std.algorithm.iteration.fold.html
Sep 12 2022
On Monday, 12 September 2022 at 16:26:01 UTC, Paul Backus wrote:On Monday, 12 September 2022 at 15:27:25 UTC, Steven Schveighoffer wrote:Could build it like this (not sure performance is great). ```d import std; auto stoppingFold(alias stopCond, alias op, R)(R r) { return r.cumulativeFold!op.until!(stopCond)(No.openRight).fold!((a,b)=>b); } void main() { auto x = [1,2,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); x = [1,0,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); } ```This isn't that crazy though. Checking against an absolute minimum or maximum has a cost, one which you may not want to incur. For instance, uint.min is 0, this might be a common occurrence in some ranges. But int.min is -2billion. That might *never* occur in some ranges. Why waste time checking for something that will never occur in the name of "performance"? For many types and data sets, this is a net pessimization. And there's not a way to do this shortcut via type wrapping. The only option is to alter the algorithm. Making it opt-in isn't a bad idea, but of course, the API could be badly designed.It seems like maybe what's desired here is a version of [`fold`][1] that allows for early termination. Like, instead of just returning the next accumulator value, the callback could return a tuple of the next value and a `bool continue`, or something. [1]: https://phobos.dpldocs.info/std.algorithm.iteration.fold.html
Sep 12 2022
On Monday, 12 September 2022 at 17:05:44 UTC, JG wrote:On Monday, 12 September 2022 at 16:26:01 UTC, Paul Backus wrote:Oops last assert is wrong... should be ==0[...]Could build it like this (not sure performance is great). ```d import std; auto stoppingFold(alias stopCond, alias op, R)(R r) { return r.cumulativeFold!op.until!(stopCond)(No.openRight).fold!((a,b)=>b); } void main() { auto x = [1,2,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); x = [1,0,4,3]; assert(x.stoppingFold!(x=>x==0,(a,b)=>min(a,b))==1); } ```
Sep 12 2022
On 9/12/22 08:27, Steven Schveighoffer wrote:For instance, uint.min is 0, this might be a common occurrence in some ranges. But int.min is -2billion. That might *never* occur in some ranges.Makes sense. My mind caught this "optimization" (if at all) on unsigned types being 0. So I started writing the original post on unsigned types but generalized to all fundamental types as an afterthought. Even maxElement was added by generalization... :) Yeah, I think this is about unsigned and 0. Ali
Sep 12 2022
On Monday, 12 September 2022 at 14:15:45 UTC, Steven Schveighoffer wrote:On 9/12/22 7:53 AM, Sergey wrote: One thing that is worrisome though, is the fact that `min` and `max` aren't always the minimum and maximum bit patterns available. So short-cutting the search might do something you don't want.Are they only broken for enum?e.g. an enum which defines a bitfield might have values of 1, 2, 4, 8, but instances of the enum might go all the way up to 15 (all bits set). However, the `.max` value would be 8.Actually dmd allows other operations besides `|` and `&` on enum members: ```d enum E { a, b=5, c=2 } pragma(msg, E.max); // E.b immutable E e = E.b << 1; pragma(msg, e); // cast(E)10 immutable E f = E.b + E.b + E.b; pragma(msg, f); // cast(E)15 ``` All bits set for E would be 7. I don't know why those operations are allowed.
Sep 12 2022
On Mon, Sep 12, 2022 at 04:46:23AM -0700, Ali Çehreli via Digitalmars-d wrote:On 9/12/22 04:07, H. S. Teoh wrote:[...]On Sun, Sep 11, 2022 at 07:40:35PM -0700, Ali Çehreli via Digitalmars-dwrote:Since the minimum and maximum values of fundamental types are known, can minElement and maxElement shortcut and return that value immediately?No, because that value may not appear in the given range.I am not that silly. :) What I meant is "as soon as they see e.g. 0". Say, at the third element of a million-element range...[...] Sounds like a good idea. T -- PNP = Plug 'N' Pray
Sep 12 2022
On Monday, 12 September 2022 at 02:40:35 UTC, Ali Çehreli wrote:Since the minimum and maximum values of fundamental types are known, can minElement and maxElement shortcut and return that value immediately?I would take a peek a look the standard libraries of C++ and Rust see if they apply this optimization trick. The real question is what input the algorithm should be optimized for. And how big of an overhead such a check has in the case where early return never happens.
Sep 12 2022