www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - std.algorithm imporvments

reply "monarch_dodra" <monarchdodra gmail.com> writes:
I had a discussion recently about the proper use of "save" when 
passing to foreach or algorithms, as well as read the thread 
about algorithms being hard to use. It was concluded this the 
problem mostly came from:
a) Lack of proper testing.
b) Missing or inaccurate input type checking.

I decided to donate some of my time to investigate and improve on 
both these points.

Here is what I found:

*"Save" issues:
*"minPos". Atsolutly no backup is ever made, and this method will 
return an empty range ALL the time.
*"fill(Range1, Range2)". Save occurs only once, but not within 
loop body. An out of range exception occurs should Range1.length 
 2* Range2.length.
*"No specifier" issues: *The following algorithms do not validate the type of their ranges: **"minPos": Should be "isInputRange". **"skipOver": Should be "isInputRange". *"isForwardRange" could be "isInputRange" issues: **"countUntil". Also, the functions: *"fill(Range, Value)" *"uninitializedFill(Range, Value)" *"fill(Range1, Range2)" *"initializeAll(Range)" *"moveAll(Range1, Range2)" *"moveSome(Range1, Range2)" *"swapFront(R1, R2)" For some reason, the output ranges (Range/Range2/R2) for all the above functions are defined as needing "isInputRange", when it should really be "isOutputRange" (or Forward range for uninitializedFill/initializeAll ?) Finally: "fill(Range1, Range2)" requires "Range1" to be a forward range, so that it can be repeated. However, if Range1 hasInfinity, then only isInputRange is needed. ---- I did not go much more in depth, but I think it may be a good starting point? I apologize if I gave any false positives. Would this be something I should fix myself? ---- Something else I noticed it that even the most basic algorithms seem to go out of their way to avoid using "foreach". For example, "count", "equal", "minCount", "minPos". etc. Doing this potentially short-circuits any call to opApply if that range defines it.
Jul 16 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/16/12 6:43 AM, monarch_dodra wrote:
 I had a discussion recently about the proper use of "save" when passing
 to foreach or algorithms, as well as read the thread about algorithms
 being hard to use. It was concluded this the problem mostly came from:
 a) Lack of proper testing.
 b) Missing or inaccurate input type checking.

 I decided to donate some of my time to investigate and improve on both
 these points.

 Here is what I found:

 *"Save" issues:
 *"minPos". Atsolutly no backup is ever made, and this method will return
 an empty range ALL the time.
 *"fill(Range1, Range2)". Save occurs only once, but not within loop
 body. An out of range exception occurs should Range1.length > 2*
 Range2.length.
[snip] Wow, this is awesome. Did you discover that by inspection or by testing? I think a "malicious input range" would be a great tool for assessing which algorithms fail on input ranges. Andrei
Jul 16 2012
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Monday, 16 July 2012 at 22:42:47 UTC, Andrei Alexandrescu
wrote:
 Wow, this is awesome. Did you discover that by inspection or by 
 testing? I think a "malicious input range" would be a great 
 tool for assessing which algorithms fail on input ranges.

 Andrei
The first I discovered testing with a "ConsumableRange", actually. The second, I found by inspection. I'll correct those two issues myself, but I don't feel comfortable with the other issues.
Jul 17 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/17/12 4:41 AM, monarch_dodra wrote:
 On Monday, 16 July 2012 at 22:42:47 UTC, Andrei Alexandrescu
 wrote:
 Wow, this is awesome. Did you discover that by inspection or by
 testing? I think a "malicious input range" would be a great tool for
 assessing which algorithms fail on input ranges.

 Andrei
The first I discovered testing with a "ConsumableRange", actually. The second, I found by inspection. I'll correct those two issues myself, but I don't feel comfortable with the other issues.
You may want to submit them as bug requests. Thanks! Andrei
Jul 17 2012
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, July 17, 2012 10:47:50 Andrei Alexandrescu wrote:
 On 7/17/12 4:41 AM, monarch_dodra wrote:
 On Monday, 16 July 2012 at 22:42:47 UTC, Andrei Alexandrescu
 
 wrote:
 Wow, this is awesome. Did you discover that by inspection or by
 testing? I think a "malicious input range" would be a great tool for
 assessing which algorithms fail on input ranges.
 
 Andrei
The first I discovered testing with a "ConsumableRange", actually. The second, I found by inspection. I'll correct those two issues myself, but I don't feel comfortable with the other issues.
You may want to submit them as bug requests. Thanks!
Yes. Please do. It's on my todo list to improve std.algorithm and std.range's tests (particularly for reference type ranges), and I've gotten started on it, but it could take a while to get it all done, and anything that you find will be valuable in not only figuring out what needs fixing but also in figuring out what needs better testing. bugzilla: http://d.puremagic.com/issues - Jonathan M Davis
Jul 17 2012
parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 17 July 2012 at 17:19:31 UTC, Jonathan M Davis wrote:
 On Tuesday, July 17, 2012 10:47:50 Andrei Alexandrescu wrote:
 On 7/17/12 4:41 AM, monarch_dodra wrote:
 On Monday, 16 July 2012 at 22:42:47 UTC, Andrei Alexandrescu
 
 wrote:
 Wow, this is awesome. Did you discover that by inspection 
 or by
 testing? I think a "malicious input range" would be a great 
 tool for
 assessing which algorithms fail on input ranges.
 
 Andrei
The first I discovered testing with a "ConsumableRange", actually. The second, I found by inspection. I'll correct those two issues myself, but I don't feel comfortable with the other issues.
You may want to submit them as bug requests. Thanks!
Yes. Please do. It's on my todo list to improve std.algorithm and std.range's tests (particularly for reference type ranges), and I've gotten started on it, but it could take a while to get it all done, and anything that you find will be valuable in not only figuring out what needs fixing but also in figuring out what needs better testing. bugzilla: http://d.puremagic.com/issues - Jonathan M Davis
Hi Jonathan, I've made changes to algorithm to the best of my abilities. If it does not meet requirements, please tell me what is wrong, and all work on it as I can. I've put an in-depth explanation of the changes in the pull request description. Slightly on topic, did you read my post about "Definition of "OutputRange" insuficient""? Would it be OK to add "hasLength" to range.d? This would be the first step to making outputRanges more useable, without directly changing the definition of an output range quite yet.
Jul 18 2012