digitalmars.D.learn - is std.algorithm.joiner lazy?

Puming <zhaopuming gmail.com> writes:

Hi:

When I use map with joiner, I found that the function passed to map is 
called. The documentation says joiner is lazy, so why is the 
function called?

say:

import std.stdio : writeln;
import std.algorithm : map, joiner;

int[] mkarray(int a) {
    writeln("mkarray called!");
    return [a * 2]; // just for test
}

void main() {
    auto xs = [1, 2];
    auto r = xs.map!(x=>mkarray(x)).joiner;
}

running this will get the output:

mkarray called!
mkarray called!

I suppose joiner does not consume?

When I actually consume the result with writeln, I get more output:

mkarray called!
mkarray called!
[2mkarray called!
mkarray called!
, 4]

I don't understand

Apr 07 2016

Edwin van Leeuwen <edder tkwsping.nl> writes:

On Thursday, 7 April 2016 at 07:07:40 UTC, Puming wrote:
 Hi:

 when I use map with joiner, I found that function in map are 
 called. In the document it says joiner is lazy, so why is the 
 function called?

 say:

 int[] mkarray(int a) {
    writeln("mkarray called!");
    return [a * 2]; // just for test
 }

 void main() {
    auto xs = [1, 2];
    auto r = xs.map!(x=>mkarray(x)).joiner;
 }

 running this will get the output:

 mkarray called!
 mkarray called!

 I suppose joiner does not consume?

 When I actually consume the result with writeln, I get more 
 output:

 mkarray called!
 mkarray called!
 [2mkarray called!
 mkarray called!
 , 4]

 I don't understand

Apparently it works by processing the first two elements at 
creation. All the other elements will be processed lazily.

Even when a range is lazy, the algorithm often still has to 
"consume" one or two starting elements just to set initial 
conditions. It does surprise me that joiner needs to process the 
first two; I would have to look at the implementation to see why.

Apr 07 2016

Puming <zhaopuming gmail.com> writes:

On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van Leeuwen 
wrote:
 On Thursday, 7 April 2016 at 07:07:40 UTC, Puming wrote:
 [...]

 Apparently it works processing the first two elements at 
 creation. All the other elements will be processed lazily.

 Even when a range is lazy the algorithm still often has to 
 "consume" one or two starting elements, just to set initial 
 conditions. It does surprise me that joiner needs to process 
 the first two, would have to look at the implementation why.

OK. Even if it consumes the first two elements, then why does it 
have to consume them AGAIN when actually used? If the function 
mkarray has side effects, it could lead to problems.

Apr 07 2016

Edwin van Leeuwen <edder tkwsping.nl> writes:

On Thursday, 7 April 2016 at 08:17:38 UTC, Puming wrote:
 On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van Leeuwen 
 wrote:

 OK. Even if it consumes the first two elements, then why does 
 it have to consume them AGAIN when actually used? If the 
 function mkarray has side effects, it could lead to problems.

After some testing, it seems to get each element twice: it calls 
front on the MapResult twice for each element. The first two 
mkarray calls are both for the first element, the second two for the 
second. You can solve this by caching the front call with:

xs.map!(x=>mkarray(x)).cache.joiner;
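
To make the effect of cache visible, the calls can be counted instead of printed. This is a minimal sketch; the `calls` counter is a name introduced here for illustration, and the exact count without cache depends on the Phobos version, so it is only printed and not asserted.

```d
import std.algorithm : cache, equal, joiner, map;
import std.stdio : writeln;

int calls; // illustrative counter: how often the mapped function runs

int[] mkarray(int a)
{
    ++calls;
    return [a * 2];
}

void main()
{
    auto xs = [1, 2, 3];

    // Without cache: joiner may call front on the map result more than once
    // per element, and every such call runs mkarray again.
    calls = 0;
    auto plain = xs.map!(x => mkarray(x)).joiner;
    assert(plain.equal([2, 4, 6]));
    writeln("without cache: ", calls, " calls");

    // With cache: front of the map result is evaluated eagerly and stored,
    // so repeated front calls by joiner reuse the stored value.
    calls = 0;
    auto cached = xs.map!(x => mkarray(x)).cache.joiner;
    assert(cached.equal([2, 4, 6]));
    writeln("with cache: ", calls, " calls");
}
```

Both pipelines produce the same elements; only the number of mkarray calls should differ.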

Apr 07 2016

Puming <zhaopuming gmail.com> writes:

On Thursday, 7 April 2016 at 08:27:23 UTC, Edwin van Leeuwen 
wrote:
 On Thursday, 7 April 2016 at 08:17:38 UTC, Puming wrote:
 On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van Leeuwen 
 wrote:

 OK. Even if it consumes the first two elements, then why does 
 it have to consume them AGAIN when actually used? If the 
 function mkarray has side effects, it could lead to problems.

 After some testing it seems to get each element twice, calls 
 front on the MapResult twice, on each element. The first two 
 mkarray are both for first element, the second two for the 
 second. You can solve this by caching the front call with:

 xs.map!(x=>mkarray(x)).cache.joiner;

Thanks! I added more elements to xs and checked that you are 
right.

So EVERY element is accessed twice with joiner. Better add that 
to the docs, and note the use of cache.

Apr 07 2016

Jonathan M Davis via Digitalmars-d-learn writes:

On Thursday, April 07, 2016 08:47:15 Puming via Digitalmars-d-learn wrote:
 On Thursday, 7 April 2016 at 08:27:23 UTC, Edwin van Leeuwen

 wrote:
 On Thursday, 7 April 2016 at 08:17:38 UTC, Puming wrote:
 On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van Leeuwen
 wrote:

 OK. Even if it consumes the first two elements, then why does
 it have to consume them AGAIN when actually used? If the
 function mkarray has side effects, it could lead to problems.

 After some testing it seems to get each element twice, calls
 front on the MapResult twice, on each element. The first two
 mkarray are both for first element, the second two for the
 second. You can solve this by caching the front call with:

 xs.map!(x=>mkarray(x)).cache.joiner;

 Thanks! I added more elements to xs and checked that you are
 right.

 So EVERY element is accessed twice with joiner. Better add that
 to the docs, and note the use of cache.

I would note that in general, it's not uncommon for an algorithm to access
front multiple times. So, this really isn't a joiner-specific issue. If
anything, it's map that should get a note in its docs, not joiner. You
really should just expect front to be called multiple times. So, if that's a
problem, use cache. But joiner is not doing anything abnormal.

And it's not even the case that it necessarily makes sense to make a rule of
thumb that ranges should copy front instead of calling it multiple times,
because if front returns by ref, calling front multiple times is likely to
be cheaper, and while we don't properly support non-copyable types (like
UniquePtr) with ranges right now, we really should, so if anything, it
becomes the case that algorithms should favor calling front multiple times
over copying its value.

So, there are pros and cons involved with copying front vs calling it
multiple times, and I think that both approaches are pretty common at
this point. So, given how frequently it makes sense for map to allocate
(e.g. to!string(a)), map should probably have a note about cache, but
overall, it's just something that you need to be aware of. Regardless, I
don't think that it makes sense to put anything in joiner's docs about it.

- Jonathan M Davis

Apr 07 2016

Puming <zhaopuming gmail.com> writes:

On Thursday, 7 April 2016 at 18:15:07 UTC, Jonathan M Davis wrote:
 On Thursday, April 07, 2016 08:47:15 Puming via 
 Digitalmars-d-learn wrote:
 On Thursday, 7 April 2016 at 08:27:23 UTC, Edwin van Leeuwen

 wrote:
 On Thursday, 7 April 2016 at 08:17:38 UTC, Puming wrote:
 On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van 
 Leeuwen wrote:

 OK. Even if it consumes the first two elements, then why 
 does it have to consume them AGAIN when actually used? If 
 the function mkarray has side effects, it could lead to 
 problems.

 After some testing it seems to get each element twice, calls 
 front on the MapResult twice, on each element. The first two 
 mkarray are both for first element, the second two for the 
 second. You can solve this by caching the front call with:

 xs.map!(x=>mkarray(x)).cache.joiner;

 Thanks! I added more elements to xs and checked that you are 
 right.

 So EVERY element is accessed twice with joiner. Better add 
 that to the docs, and note the use of cache.

 I would note that in general, it's not uncommon for an 
 algorithm to access front multiple times. So, this really isn't 
 a joiner-specific issue. If anything, it's map that should get 
 a note in its docs, not joiner. You really should just expect 
 front to be called multiple times. So, if that's a problem, use 
 cache. But joiner is not doing anything abnormal.

But the joiner docs say that joiner is lazy, and accessing 
front multiple times is not true laziness. I think it would be better 
to note that after the lazy part: "joiner is lazy, but it will access the 
front twice".

If many other lazy functions behave like this, I 
suggest coining a new name for it, like 'semi-lazy', to be more 
accurate.

Maybe it's my fault; I didn't know what cache does before Edwin 
told me.
So the solution exists, it just is not easy for newbies to find 
because there is no direct link between these functions.

 And it's not even the case that it necessarily makes sense to 
 make a rule of thumb that ranges should copy front instead of 
 calling it multiple times, because if front returns by ref, 
 calling front multiple times is likely to be cheaper, and 
 while we don't properly support non-copyable types (like 
 UniquePtr) with ranges right now, we really should, so if 
 anything, it becomes the case that algorithms should favor 
 calling front multiple times over copying its value.

Indeed. I think copy is not good. But multiple access is a thing 
to note. When I want to use lazy things, it usually is that I'm 
reading files, so accessing twice is not acceptable.

 So, there are pros and cons involved with copying front vs 
 calling it multiple times, and I think that both approaches are 
 pretty common at this point. So, given how frequently it 
 makes sense for map to allocate (e.g. to!string(a)), map should 
 probably have a note about cache, but overall, it's just 
 something that you need to be aware of. Regardless, I don't 
 think that it makes sense to put anything in joiner's docs 
 about it.

There is another problem: map, cache, and joiner don't work when 
composed multiple times. I've submitted a bug, 
https://issues.dlang.org/show_bug.cgi?id=15891; can you confirm?

Because of this, I now have to read a file multiple times (using 
only joiner), or eagerly retrieve data into an array (which 
is too big), or fall back to an imperative way of manually 
accessing each file. They are all bad.

Apr 07 2016

Jonathan M Davis via Digitalmars-d-learn writes:

On Friday, April 08, 2016 00:30:05 Puming via Digitalmars-d-learn wrote:
 On Thursday, 7 April 2016 at 18:15:07 UTC, Jonathan M Davis wrote:
 On Thursday, April 07, 2016 08:47:15 Puming via

 Digitalmars-d-learn wrote:
 On Thursday, 7 April 2016 at 08:27:23 UTC, Edwin van Leeuwen

 wrote:
 On Thursday, 7 April 2016 at 08:17:38 UTC, Puming wrote:
 On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van
 Leeuwen wrote:

 OK. Even if it consumes the first two elements, then why
 does it have to consume them AGAIN when actually used? If
 the function mkarray has side effects, it could lead to
 problems.

 After some testing it seems to get each element twice, calls
 front on the MapResult twice, on each element. The first two
 mkarray are both for first element, the second two for the
 second. You can solve this by caching the front call with:

 xs.map!(x=>mkarray(x)).cache.joiner;

 Thanks! I added more elements to xs and checked that you are
 right.

 So EVERY element is accessed twice with joiner. Better add
 that to the docs, and note the use of cache.

 I would note that in general, it's not uncommon for an
 algorithm to access front multiple times. So, this really isn't
 a joiner-specific issue. If anything, it's map that should get
 a note in its docs, not joiner. You really should just expect
 front to be called multiple times. So, if that's a problem, use
 cache. But joiner is not doing anything abnormal.

 But in the joiner docs, it says joiner is lazy. But accessing
 front multiple times is not true laziness. I think it better note
 that after the lazy part: "joiner is lazy, but it will access the
 front twice".

 If there are many other lazy functions behave like this, I
 suggest to make a new name for it, like 'semi-lazy', to be more
 accurate.

 Maybe it's my fault, I didn't know what cache does before Edwin
 told me.
 So there is the solution, it just is not easy for newbies to find
 out because there is no direct link between these functions.

Lazy means that it's not going to consume the entire range when you call the
function. Rather, it's going to return a range that you can iterate over. It
may or may not process the first element before returning, depending on how
it works, and there's definitely nothing that says whether it's going to
access front multiple times or not before calling popFront. And accessing
front multiple times without calling popFront is _normal_ whether you're
dealing with a lazy range or an eager one. All that lazy means is that
you're getting a range from the function rather than it consuming the range
before returning.

So, whatever you do with a range, in general, you have to assume that an
algorithm might access front multiple times, and the implementation is free
to change so that it accesses it more times or fewer times, because the
range API says nothing about whether front is accessed multiple times or
not. front needs to return equal values every time that it's called before
popFront is called, but that doesn't mean that they have to be the same
objects, and it doesn't mean that there's any restriction on how many times
front is accessed before a call to popFront.
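
A small sketch of what this means for map, using a hypothetical helper `twice` and a `calls` counter that are not part of the thread: constructing the range runs nothing, and each access to front runs the mapped function again while still returning an equal value.

```d
import std.algorithm : map;

int calls; // illustrative counter: how many times the mapped function has run

int twice(int a)
{
    ++calls;
    return a * 2;
}

void main()
{
    auto r = [1, 2, 3].map!(a => twice(a));
    assert(calls == 0);   // building the lazy range evaluates nothing

    assert(r.front == 2);
    assert(r.front == 2); // same element, equal value
    assert(calls == 2);   // but the function ran once per front call

    r.popFront();
    assert(r.front == 4); // only popFront moves to the next element
}
```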

So, I see no reason for joiner to say anything in its docs about how many
times it accesses front. It's pretty much irrelevant to how ranges are
expected to work, and it could change. If it actually matters for what
you're doing, then you need to figure out how to rework your code so that it
doesn't matter whether front is accessed multiple times per call to popFront
or not. That's just part of working with ranges, though I can certainly
understand if you didn't realize that previously.

 There is another problem, map, cache, and joiner don't work when
 composed multiple times. I've submitted a bug,
 https://issues.dlang.org/show_bug.cgi?id=15891, can you confirm?

Well, given your example, I would strongly argue that you should write a
range that calls read in its constructor and in popFront (so that
calling front multiple times doesn't matter) rather than using map. While
map can theoretically be used the way that you're trying to use it, it's
really intended for converting an element rather than doing stuff like
I/O in it. Also, if the range that you give map is random access (like an
array would be), then opIndex could be used to access random elements, which
_really_ wouldn't work with reading from a file. So, I think that map is
just plain a bad choice for what you're trying to do.

It's not obvious to me why your example is failing to compile - the problem
appears to be with cache specifically and has nothing to do with joiner -
and I am inclined to agree that there's a bug there (be it in cache or in
the compiler), but I really think that using map is a bad move for what
you're trying to do anyway - especially when you consider what will happen
if opIndex is used. I'd strongly encourage you to just write a range that
does what you need instead.

- Jonathan M Davis

Apr 07 2016

Puming <zhaopuming gmail.com> writes:

On Friday, 8 April 2016 at 01:14:11 UTC, Jonathan M Davis wrote:
 [...]

 Lazy means that it's not going to consume the entire range when 
 you call the function. Rather, it's going to return a range 
 that you can iterate over. It may or may not process the first 
 element before returning, depending on how it works, and 
 there's definitely nothing that says whether it's going to 
 access front multiple times or not before calling popFront. And 
 accessing front multiple times without calling popFront is 
 _normal_ whether you're dealing with a lazy range or an eager 
 one. All that lazy means is that you're getting a range from 
 the function rather than it consuming the range before 
 returning.

 So, whatever you do with a range, in general, you have to 
 assume that an algorithm might access front multiple times, and 
 the implementation is free to change so that it accesses it 
 more times or fewer times, because the range API says nothing 
 about whether front is accessed multiple times or not. front 
 needs to return equal values every time that it's called before 
 popFront is called, but that doesn't mean that they have to be 
 the same objects, and it doesn't mean that there's any 
 restriction on how many times front is accessed before a call 
 to popFront.

 So, I see no reason for joiner to say anything in its docs 
 about how many times it accesses front. It's pretty much 
 irrelevant to how ranges are expected to work, and it could 
 change. If it actually matters for what you're doing, then you 
 need to figure out how to rework your code so that it doesn't 
 matter whether front is accessed multiple times per call to 
 popFront or not. That's just part of working with ranges, 
 though I can certainly understand if you didn't realize that 
 previously.

That makes sense. Thanks for the clarification.

 There is another problem, map, cache, and joiner don't work 
 when composed multiple times. I've submitted a bug, 
 https://issues.dlang.org/show_bug.cgi?id=15891, can you 
 confirm?

 Well, given your example, I would strongly argue that you 
 should write a range that calls read in its constructor and in 
 popFront (so that calling front multiple times doesn't 
 matter) rather than using map. While map can theoretically be 
 used the way that you're trying to use it, it's really intended 
 for converting an element rather than doing stuff like 
 I/O in it. Also, if the range that you give map is random 
 access (like an array would be), then opIndex could be used to 
 access random elements, which _really_ wouldn't work with 
 reading from a file. So, I think that map is just plain a bad 
 choice for what you're trying to do.

So what you mean is to read the front in the constructor, and read 
further parts in popFront()? That way multiple accesses to the 
front won't hurt anything. I think it might work; I'll change my 
code.

So the guideline is: when accessing front is costly, don't use 
map; use a customized range struct instead. Right?

 It's not obvious to me why your example is failing to compile - 
 the problem appears to be with cache specifically and has 
 nothing to do with joiner - and I am inclined to agree that 
 there's a bug there (be it in cache or in the compiler), but I 
 really think that using map is a bad move for what you're 
 trying to do anyway - especially when you consider what will 
 happen if opIndex is used. I'd strongly encourage you to just 
 write a range that does what you need instead.

OK, I hope it'll get fixed. I'll try to look for it once I'm able 
to understand the code in Phobos.


Apr 07 2016

Jonathan M Davis via Digitalmars-d-learn writes:

On Friday, April 08, 2016 02:01:07 Puming via Digitalmars-d-learn wrote:
 So what you mean is to read the front in constructor, and read
 further parts in the popFront()? that way multiple access to the
 front won't hurt anything. I think it might work, I'll change my
 code.

 So the guideline is: when accessing front is costly, don't use
 map, use a customized range struct instead. right?

In general, when you're dealing with a non-random access range, it's best
for popFront to do the work of setting up front and then have front return
the same object every time. If front is doing the work, then if it gets
called multiple times, that work is being repeated every time it gets
called. map is a funny case, because it can be a random-access range (if the
underlying range it's wrapping is a random-access range). So, fundamentally,
it doesn't work in map to do the work in popFront. It pretty much has to be
done in front. So, doing stuff like range.map!(a => to!string(a))() is
problematic in that a new allocation is going to occur every time that front
is called - or when any element is accessed via opIndex. It works so long as
the element is equal every time, and calling front multiple times does not
affect the rest of the range, but it can be costly. In theory, cache should
solve that case (and it would result in a range that wasn't random access,
so opIndex wouldn't be called on it), but obviously, you're running into
problems with it.
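
The random-access point can be checked with the range traits in std.range.primitives. This sketch uses a trivial, side-effect-free lambda purely for illustration.

```d
import std.algorithm : cache, map;
import std.range.primitives : isRandomAccessRange;

void main()
{
    // map over an array stays random access, so any element can be
    // requested through opIndex, in any order.
    auto mapped = [10, 20, 30].map!(a => a + 1);
    static assert(isRandomAccessRange!(typeof(mapped)));
    assert(mapped[2] == 31);

    // cache gives up random access, so consumers can only use
    // front and popFront on it.
    auto cached = mapped.cache;
    static assert(!isRandomAccessRange!(typeof(cached)));
    assert(cached.front == 11);
}
```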

In any case, in general, when doing something like reading from a file with
a range, it works best to do the work in popFront to avoid issues with
multiple calls to front, and the constructor needs to do that work as well
(be it by calling popFront or not), because front needs to be valid as soon
as the range has been created, and it's not empty. So, you end up with
something like

struct MyRange
{
public:
    @property T front() { return _value; }
    @property bool empty() { ... }
    void popFront()
    {
        _value = readNextValueFromFile();
    }

private:

    this(Something s)
    {
        ...
        popFront();
    }

    T _value;
}

It also encapsulates things better than having a function whose only purpose
is to be used in map, though there are obviously cases where writing a
function just to use in map would make sense.
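
A self-contained variant of the outline above, as a sketch only. `LineReader`, its `reads` counter, and the in-memory string array standing in for a file are all assumptions made for illustration; a real version would read from a File in popFront.

```d
import std.range.primitives : isInputRange;

struct LineReader
{
    private string[] _pending; // hypothetical stand-in for an open file
    private string _value;     // the current element, set by popFront
    private bool _empty;
    int reads;                 // how many times the "expensive" read ran

    this(string[] lines)
    {
        _pending = lines;
        popFront(); // prime front so it is valid right after construction
    }

    @property bool empty() { return _empty; }

    // front only returns the stored value; it does no work.
    @property string front() { return _value; }

    void popFront()
    {
        if (_pending.length == 0)
        {
            _empty = true;
            return;
        }
        ++reads; // the costly step happens here, once per element
        _value = _pending[0];
        _pending = _pending[1 .. $];
    }
}

static assert(isInputRange!LineReader);

void main()
{
    auto r = LineReader(["a", "b"]);
    assert(r.front == "a");
    assert(r.front == "a"); // repeated front calls are free
    assert(r.reads == 1);

    r.popFront();
    assert(r.front == "b" && r.reads == 2);

    r.popFront();
    assert(r.empty);
}
```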

In general, I would only use map for cases where I'm converting something to
something else and not for functions that do arbitrary work. A function for
map that cannot be pure is a danger sign IMHO. Certainly, if you're going to
follow how ranges are expected to work, whatever function you give map needs
to return equal values every time front is called between calls to popFront,
and multiple calls to front cannot affect the rest of the range. And what
you did with map doesn't follow those guidelines, though it probably would
if cache worked, and you always fed it into cache. Still, for something
like this, I'd just create my own range and be done with it. You often need
to anyway in order to manage extra state. And it tends to be more idiomatic,
though I suppose that that's somewhat subjective.

- Jonathan M Davis

Apr 07 2016

Puming <zhaopuming gmail.com> writes:

On Friday, 8 April 2016 at 02:49:01 UTC, Jonathan M Davis wrote:
 [...]

Thanks. I'll adopt this idiom. Hopefully it gets used often 
enough to warrant a Phobos function :-)

Apr 07 2016

Mike Parker <aldacron gmail.com> writes:

On Friday, 8 April 2016 at 03:20:53 UTC, Puming wrote:
 On Friday, 8 April 2016 at 02:49:01 UTC, Jonathan M Davis wrote:
 [...]

 Thanks. I'll adopt this idiom. Hopefully it gets used often 
 enough to warrant a Phobos function :-)

What would such a function look like? I don't think such a thing 
could exist. This is more than just an idiom, IMO. It's a basic 
principle of ranges that, if not followed, is likely to produce a 
broken range and/or one whose front is more expensive than it 
needs to be. The trouble is that it isn't necessarily obvious and 
is easy to overlook when first implementing a custom range.

In Learning D, I used a custom FilteredRange to introduce the 
concept of ranges. It has a member function called skipNext which 
does the work of the filtering. It's called once in the 
constructor to 'prime' the range with the first value that 
matches the filter, then inside every call to popFront to find 
the next match. I closed that section with this paragraph:

"It might be tempting to take the filtering logic out of the skipNext 
method and add it to front, which is another way to guarantee that it's 
performed on every element. Then no work would need to be done in the 
constructor and popFront would simply become a wrapper for 
_source.popFront. The problem with that approach is that front can 
potentially be called multiple times without calling popFront in 
between, meaning the predicate will be tested on each call. That's 
unnecessary work. As a general rule, any work that needs to be done 
inside a range to prepare a front element should happen as a result of 
calling popFront, leaving front to simply focus on returning the 
current element."

A lazy range should be advanced in the constructor when it needs 
to be (usually when there is some criterion for an element to be 
returned from front) and always in popFront, but never in front.
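
The following is a sketch of that design, not the code from the book: only the names FilteredRange, skipNext and _source come from the description above, while the template signature and the `filtered` helper are assumptions.

```d
import std.algorithm : equal;
import std.range.primitives : empty, front, popFront, isInputRange;

struct FilteredRange(alias pred, R) if (isInputRange!R)
{
    private R _source;

    this(R source)
    {
        _source = source;
        skipNext(); // prime the range with the first matching element
    }

    // Advance _source until its front satisfies pred or it runs out.
    private void skipNext()
    {
        while (!_source.empty && !pred(_source.front))
            _source.popFront();
    }

    @property bool empty() { return _source.empty; }

    // No filtering here: front just returns the current element.
    @property auto front() { return _source.front; }

    void popFront()
    {
        _source.popFront();
        skipNext(); // find the next match as part of advancing
    }
}

// Hypothetical convenience function so the range type is inferred.
auto filtered(alias pred, R)(R source)
{
    return FilteredRange!(pred, R)(source);
}

void main()
{
    auto evens = filtered!(x => x % 2 == 0)([1, 2, 3, 4, 5, 6]);
    assert(evens.equal([2, 4, 6]));
}
```

The predicate runs only inside skipNext, so calling front repeatedly never re-tests an element.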

Apr 08 2016

Puming <zhaopuming gmail.com> writes:

On Friday, 8 April 2016 at 08:44:36 UTC, Mike Parker wrote:
 On Friday, 8 April 2016 at 03:20:53 UTC, Puming wrote:
 On Friday, 8 April 2016 at 02:49:01 UTC, Jonathan M Davis 
 wrote:
 [...]

 Thanks. I'll adopt this idiom. Hopefully it gets used often 
 enough to warrant a Phobos function :-)

 What would such a function look like? I don't think such a 
 thing could exist. This is more than just an idiom, IMO. It's a 
 basic principle of ranges that, if not followed, is likely to 
 produce a broken range and/or one whose front is more expensive 
 than it needs to be. The trouble is that it isn't necessarily 
 obvious and is easy to overlook when first implementing a 
 custom range.

I thought it was just like map!readNext.cache


 [...]

Apr 08 2016

Puming <zhaopuming gmail.com> writes:

On Friday, 8 April 2016 at 01:14:11 UTC, Jonathan M Davis wrote:
 [...]

 Well, given your example, I would strongly argue that you 
 should write a range that calls read in its constructor and in 
 popFront (so that calling front multiple times doesn't 
 matter) rather than using map. While map can theoretically be 
 used the way that you're trying to use it, it's really intended 
 for converting an element using rather than doing stuff like 
 I/O in it. Also, if the range that you give map is random 
 access (like an array would be), then opIndex could be used to 
 access random elements, which _really_ wouldn't work with 
 reading from a file. So, I think that map is just plain a bad 
 choice for what you're trying to do.

Well, I used map because, when viewing the scenario as a data 
flow, map seems an intuitive choice:

What I have: a bunch of large files, each file containing 
sections of data, each section composed of many lines of 
records. For each file, I have a list of indices.

What I want: given a list of files and indices for each file, I 
want to construct a lazy stream of records for other programs to 
use.

Here is the data flow:

query constraints
-> [(filePath, [index])]
-> [(File, [index])] // map, needs cache
-> [[section]] // map, needs cache
-> [[[record]]]  // joiner.joiner
-> Range of record

And after reading cache's docs, I gather that cache is perfect for 
converting a range with side effects in front into a range with 
side effects in popFront.

So if cache and map work harmoniously, they should do the same 
trick as manually writing two Ranges here.
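
The final flattening step of that data flow (`joiner.joiner`) can be sketched on its own with in-memory data. The nested array below is a hypothetical stand-in for files, sections and records; no I/O, map or cache is involved.

```d
import std.algorithm : equal, joiner;

void main()
{
    // Hypothetical stand-in data: files -> sections -> records.
    string[][][] files = [
        [["r1", "r2"], ["r3"]], // file 1: two sections
        [["r4"]],               // file 2: one section
    ];

    // The first joiner flattens files into a range of sections,
    // the second flattens sections into a range of records.
    auto records = files.joiner.joiner;
    assert(records.equal(["r1", "r2", "r3", "r4"]));
}
```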


Apr 07 2016

Puming <zhaopuming gmail.com> writes:

On Thursday, 7 April 2016 at 08:27:23 UTC, Edwin van Leeuwen 
wrote:
 On Thursday, 7 April 2016 at 08:17:38 UTC, Puming wrote:
 On Thursday, 7 April 2016 at 08:07:12 UTC, Edwin van Leeuwen 
 wrote:

 OK. Even if it consumes the first two elements, then why does 
 it have to consume them AGAIN when actually used? If the 
 function mkarray has side effects, it could lead to problems.

 After some testing it seems to get each element twice, calls 
 front on the MapResult twice, on each element. The first two 
 mkarray are both for first element, the second two for the 
 second. You can solve this by caching the front call with:

 xs.map!(x=>mkarray(x)).cache.joiner;

There is another problem with cache: if I want another 
level of this map & joiner (which is my real scenario, where I'm 
reading a bunch of files, and for each one I need to read multiple 
locations with seek and return a bunch of lines for each seek), 
adding cache results in a compiler error.

Simplified demo:

import std.stdio : writeln;
import std.algorithm : map, cache, joiner;

auto read(int a) {
    writeln("read called!", a);
    return [0, a]; // second level
}

auto mkarray(int a) {
   writeln("mkarray called!", a);
   return [-a, a].map!(x=>read(x)).cache.joiner; // to avoid calling read twice
}

void main() {
   auto xs = [1,2 ,3, 4];
   auto r = xs.map!(x=>mkarray(x)).cache.joiner; // to avoid calling mkarray twice

   writeln(r);
}

When compiled, I get the error:

Error: open path skips field __caches_field_0
source/app.d(19, 36): Error: template instance 
std.algorithm.iteration.cache!(MapResult!(__lambda1, int[])) 
error instantiating

Apr 07 2016

Edwin van Leeuwen <edder tkwsping.nl> writes:

On Thursday, 7 April 2016 at 09:55:56 UTC, Puming wrote:
 When compiled, I get the error:

 Error: open path skips field __caches_field_0
 source/app.d(19, 36): Error: template instance 
 std.algorithm.iteration.cache!(MapResult!(__lambda1, int[])) 
 error instantiating

That seems like a bug to me and you might want to submit it to 
the bug tracker. Even converting it to an array first does not 
seem to work:

import std.stdio : writeln;
import std.algorithm : map, cache, joiner;
import std.array : array;

auto read(int a) {
    return [0, a]; // second level
}

auto mkarray(int a) {
   return [-a, a].map!(x=>read(x)).cache.joiner; // to avoid calling read twice
}

void main() {
   auto xs = [1,2 ,3, 4];
   auto r = xs.map!(x=>mkarray(x)).array;
	
   // Both lines below should be equal, but second does not compile
   [[0, -1, 0, 1], [0, -2, 0, 2], [0, -3, 0, 3], [0, -4, 0, 
4]].cache.joiner.writeln;
   r.cache.joiner.writeln;
}

The above results in the following error:
/opt/compilers/dmd2/include/std/algorithm/iteration.d(326): 
Error: one path skips field __caches_field_0
/d617/f62.d(19): Error: template instance 
std.algorithm.iteration.cache!(Result[]) error instantiating

Apr 07 2016

Puming <zhaopuming gmail.com> writes:

On Thursday, 7 April 2016 at 10:57:25 UTC, Edwin van Leeuwen 
wrote:
 On Thursday, 7 April 2016 at 09:55:56 UTC, Puming wrote:
 [...]

 That seems like a bug to me and you might want to submit it to 
 the bug tracker. Even converting it to an array first does not 
 seem to work:

 [...]

Thanks. I just looked at the joiner code, but didn't find the 
source of error. I'll submit a bug report.

Apr 07 2016
