www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 8946] New: « any » function does not what it should do

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946

           Summary: « any » function does not what it should do
           Product: D
           Version: future
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: dimitri.sabadie gmail.com



11:07:46 PDT ---
The any function calls find, which is really weird implemented because it
returns a subrange, doing (re)allocation(s) on the fly. I think it’s way too
much for just applying a predicate on a range. That’s why I have rewritten any
as it would be (a simple loop and apply the predicate), and I’d propose a
patch
here for the std one.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 02 2012
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |dmitry.olsh gmail.com
         Resolution|                            |INVALID



07:18:01 PDT ---
Ever heard of slicing an array in D? 

Check this out: http://dlang.org/d-array-article.html

Dig down in std.algotrithm.find and you'll find out that it's lazy iterates a
forward range. Thus like the most of std.algorithm functions it DOESN'T
allocate. In case of array it just returns a slice of original array.

So this report is invalid.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946




14:09:28 PDT ---

 Ever heard of slicing an array in D? 
 
 Check this out: http://dlang.org/d-array-article.html
 
 Dig down in std.algotrithm.find and you'll find out that it's lazy iterates a
 forward range. Thus like the most of std.algorithm functions it DOESN'T
 allocate. In case of array it just returns a slice of original array.
 
 So this report is invalid.
Ok, my argument wasn’t worth it I agree. But, why any would use such a function where a simple loop and a test will do it? It’s the same argument for not to use countUntil and check if the index is invalid, it consumes more memory that we want. any does things we don’t want to IMHO. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946




14:18:25 PDT ---
 Ok, my argument wasn’t worth it I agree. But, why any would use such a
function
where a simple loop and a test will do it? Because find is a simple loop? To not duplicate code? If you have liner search then it's obvious (for me) to reuse it for exactly the same task. any basically gives you a bit less then find/countUntill (bool vs pointer-index) by doing the same amount of work.
 it consumes more memory that we want. any does things we don’t want to IMHO.
Simply not true. I'm not sure where this 'memory' argument comes from at all. In fact it saves some code space by not duplicating the same code over again. At best this could be an enhancement request (or better yet - a pull request) if there is a soild proof that the resulting code is measurably _faster_. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946




14:35:35 PDT ---

 Ok, my argument wasn’t worth it I agree. But, why any would use such a
function
where a simple loop and a test will do it? Because find is a simple loop? To not duplicate code? If you have liner search then it's obvious (for me) to reuse it for exactly the same task. any basically gives you a bit less then find/countUntill (bool vs pointer-index) by doing the same amount of work.
 it consumes more memory that we want. any does things we don’t want to IMHO.
Simply not true. I'm not sure where this 'memory' argument comes from at all. In fact it saves some code space by not duplicating the same code over again. At best this could be an enhancement request (or better yet - a pull request) if there is a soild proof that the resulting code is measurably _faster_.
There’s a popFront() function, so why would it be the correct way to search? Reusing a « linear search » code is not a point here because we want to _find_ something. countUntil gives us the position of an element, find should give whether the element is or not in a range. It returns a range: why ? It’s clearly implementation-related, and we don’t care about that detail. I’ll propose a pull request because the simple fact there’s « find » and « any » is not good. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946


timon.gehr gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timon.gehr gmx.ch




 ...
 There’s a popFront() function, so why would it be the correct way to search?
 Reusing a « linear search » code is not a point here because we want to
_find_
 something. countUntil gives us the position of an element, find should give
 whether the element is or not in a range.
Well, no.
 It returns a range: why ?
 ...
Because that is its specification. What is the issue? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946




16:45:23 PDT ---


 ...
 There’s a popFront() function, so why would it be the correct way to search?
 Reusing a « linear search » code is not a point here because we want to
_find_
 something. countUntil gives us the position of an element, find should give
 whether the element is or not in a range.
Well, no.
 It returns a range: why ?
 ...
Because that is its specification. What is the issue?
The issue is: a function that is designed to « search » for something is excepted to return a boolean, not top popFront() shit and returns the result range… -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946


Andrej Mitrovic <andrej.mitrovich gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrej.mitrovich gmail.com



16:58:47 PDT ---



 ...
 There’s a popFront() function, so why would it be the correct way to search?
 Reusing a « linear search » code is not a point here because we want to
_find_
 something. countUntil gives us the position of an element, find should give
 whether the element is or not in a range.
Well, no.
 It returns a range: why ?
 ...
Because that is its specification. What is the issue?
The issue is: a function that is designed to « search » for something is excepted to return a boolean, not top popFront() shit and returns the result range…
Then use canFind. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946





 ...
 
 The issue is: a function that is designed to « search » for something is
 excepted to return a boolean, not top popFront() [...] and returns the result
 range…
Maybe the expectation is flawed? eg. see www.google.com. The service called 'search' delivers more information than just a boolean. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com



PDT ---
 Then use canFind.
canFind does exactly the same thing. The _only_ difference between calling find and checking whether the result was empty rather than duplicating find's implementation inside of canFind or any is the fact that a range is returned, and since that range is bound to be a relatively small type (almost always either an array or a struct with just a handful of member variables), the cost of that return is generally minimal, and there's a halfway decent chance that a move was involved rather than a copy, making it that much more efficient. If there's a measurable difference in efficiency between calling find (and checking for empty) and reimplementing find inside of canFind or any, then maybe we'll look at making it so that canFind and any don't call find. But the overhead of that return is so small (especially in comparison to the algorithm itself), that I'd be surprised if that ever happened. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946




04:11:05 PST ---

 If there's a measurable difference in efficiency between calling find (and
 checking for empty) and reimplementing find inside of canFind or any, then
 maybe we'll look at making it so that canFind and any don't call find. But the
 overhead of that return is so small (especially in comparison to the algorithm
 itself), that I'd be surprised if that ever happened.
The difference is that no one expect D developpers — who know the phobos — want to manipulate a range as return of a search function. An index would be okay since it returns a logic value for a search concept, but seriously, why would it return the range with all poped up values to find the object ? It’s clearly weird, and D is the single langage to do such a thing. Also, wanting to refactor all the world is clearly not a good idea. Sometimes, rewritting the wheel to adapt it for a specific concept is faraway better than reusing a weird code that makes the function weird. Moreover, any would be something like 3-4 lines of code… It’s really terrible. So you prefer to write it with just one line and propose a weird interface. Clever. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 04 2012
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8946




PST ---
C++'s find takes a pair of iterators indicating the beginning and ending of a
range and returns an iterator to the element found - i.e. the new beginning of
the range. D's ranges can be viewed as being the same as that pair of
iterators. But unlike iterators, they always have a begin and end. So,
rathering than just return the new begin (which it can't do, because that would
be one iterator instead of the pair), D's find returns the new begin and the
end together as a new range. What D is doing is the logical progression of what
happens when you move from iterators to ranges.

You're free to not like it, but it works _very_ well, and it's very much in
line with how C++ does things, only we're doing it with ranges rather than
iterators, and the result is actually much more powerful than what C++ has,
because ranges are far more powerful.

As for returning an index, that doesn't work for D any more than it works for
C++. For anything which isn't random access, you'd have to iterate over the
entire range again to get to that element. Iterators are _far_ more efficient
than using indices. If you were to ask a linked list for its nth element, you'd
have to iterate over the entire list up to that element to get to it, meaning
that something like

for(size_t i = 0; i < list.size(); ++i)
    auto e = list.getIndex(i);

would actually end up iterating over the entire list multiple times in order to
get to each element. Contrast that with iterators, which iterate only once

for(list::iterator iter; iter < list.end(); ++iter)
    auto e = *iter;

Ranges are just a further evolution of iterators, and the result is
surprisingly flexible and powerful, especially when you start chaining function
calls. One of the major rationales behind ranges is discussed here:

http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 04 2012