digitalmars.D.learn - Behavior of joining mapresults
- Christian Köstlin (27/27) Dec 20 2017 When working with json data files, that were a little bigger than
- Stefan Koch (3/16) Dec 20 2017 you need to memoize I guess, map is lazy.
- Christian Köstlin (2/15) Dec 20 2017 That's an idea, thanks a lot, will give it a try ...
- Christian Köstlin (24/25) Dec 20 2017 #!/usr/bin/env rdmd -unittest
- Jonathan M Davis (6/31) Dec 20 2017 I would think that it would make a lot more sense to simply put the whol...
- Christian Köstlin (62/66) Dec 21 2017 That's also possible, but I wanted to make use of the laziness ... e.g.
When working with JSON data files that were a little bigger than convenient, I stumbled upon a strange behavior when joining map results (I understand that this is more or less flatmap). I mapped input files to JSONValues, from which I took out some arrays whose content I wanted to join. Although the joiner is at the end of the functional pipe, it led to the parsing code being called twice. I tried to reduce the problem:

    unittest
    {
        import std.stdio;
        import std.range;
        import std.algorithm;
        import std.string;

        // stands in for the expensive JSON parsing; logs every call
        auto parse(int i)
        {
            writeln("parsing %s".format(i));
            return [1, 2, 3];
        }

        writeln(iota(1, 5).map!(parse));
        writeln("-------------------------------");
        writeln((iota(1, 5).map!(parse)).joiner);
    }

    void main() {}

As you can see if you run this code, parse is called two times for each of the inputs 1 to 4. What am I doing wrong here?

Thanks in advance,
Christian
Dec 20 2017
On Wednesday, 20 December 2017 at 15:28:00 UTC, Christian Köstlin wrote:
> When working with json data files, that we're a little bigger than convenient I stumbled upon a strange behavior with joining of mapresults (I understand that this is more or less flatmap). I mapped inputfiles, to JSONValues, from which I took out some arrays, whose content I wanted to join. Although the joiner is at the end of the functional pipe, it led to calling of the parsing code twice. I tried to reduce the problem: [...]

You need to memoize, I guess; map is lazy.
Dec 20 2017
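To illustrate the "map is lazy" remark, here is a minimal sketch, not taken from the thread. It assumes a module-level counter `calls` and a helper `callsForTwoFronts` (both hypothetical names) and shows that every access to `front` of a `map` result runs the mapped function again, because `map` does not store its results.

```d
import std.algorithm : map;
import std.range : iota;

int calls; // hypothetical counter: how often parse has run

// stand-in for the expensive parsing step
int[] parse(int i)
{
    ++calls;
    return [1, 2, 3];
}

// Reads front twice from a lazy map and reports how often parse ran.
int callsForTwoFronts()
{
    calls = 0;
    auto r = iota(1, 5).map!parse; // nothing is parsed yet
    assert(calls == 0);

    auto first = r.front;  // runs parse(1)
    auto second = r.front; // runs parse(1) again, map does not cache
    assert(first == second);
    return calls;
}

void main()
{
    assert(callsForTwoFronts() == 2);
}
```

Any consumer that reads `front` more than once per element, as joiner apparently does here, therefore repeats the work unless something in between stores the result.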
On 20.12.17 17:19, Stefan Koch wrote:
> you need to memoize I guess, map is lazy.

That's an idea, thanks a lot, will give it a try ...
Dec 20 2017
On 20.12.17 17:30, Christian Köstlin wrote:
> That's an idea, thanks a lot, will give it a try ...

    unittest
    {
        import std.stdio;
        import std.range;
        import std.algorithm;
        import std.string;
        import std.functional;

        auto parse(int i)
        {
            writeln("parsing %s".format(i));
            return [1, 2, 3];
        }

        // memoize!parse stores each result, so repeated front calls do not parse again
        writeln(iota(1, 5).map!(memoize!parse));
        writeln("-------------------------------");
        writeln((iota(1, 5).map!(memoize!parse)).joiner);
    }

    void main() {}

This works, but I fear for the data that is stored in the memoization. At the moment it's not a big issue, as all the data fits comfortably into RAM, but for bigger data another approach is needed (probably even my current JSON parsing must be replaced).

I still wonder whether joiner calls front more often than necessary. It is certainly valid to call front as many times as one sees fit, but with a lazy map in between, it might not be the best solution.
Dec 20 2017
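One possible middle ground between memoizing everything and re-parsing is Phobos' `std.algorithm.iteration.cache`, which nobody in the thread suggests; it is offered here only as a sketch. `cache` stores just the current `front` of the range it wraps, so repeated `front` calls by joiner should not re-run the mapped function. Note that `cache` evaluates `front` eagerly, on construction and on each `popFront`. The names `calls` and `flattenWithCache` are hypothetical.

```d
import std.algorithm : cache, joiner, map;
import std.array : array;
import std.range : iota;

int calls; // hypothetical counter: how often parse has run

// stand-in for the expensive parsing step
int[] parse(int i)
{
    ++calls;
    return [i, i * 10];
}

// Flattens the parsed arrays; cache keeps only the current element of the map.
int[] flattenWithCache()
{
    calls = 0;
    return iota(1, 5).map!parse.cache.joiner.array;
}

void main()
{
    assert(flattenWithCache() == [1, 10, 2, 20, 3, 30, 4, 40]);
    assert(calls == 4); // one parse per input, none repeated
}
```

Unlike `memoize`, this holds on to only one parsed result at a time, which may matter once the data no longer fits comfortably into RAM.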
On Thursday, December 21, 2017 07:46:03 Christian Köstlin via Digitalmars-d-learn wrote:
> [...] works, but i fear for the data that is stored in the memoization. [...]

I would think that it would make a lot more sense to simply put the whole thing in an array than to use memoize, e.g.

    auto arr = iota(1, 5).map!parse().array();

- Jonathan M Davis
Dec 20 2017
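The array suggestion can be sketched as follows. This is a minimal illustration under the assumption of a hypothetical counter `calls` and helper `flattenViaArray`: all inputs are parsed eagerly, once each, when the array is built, and joining the stored arrays afterwards triggers no further parsing.

```d
import std.algorithm : joiner, map;
import std.array : array;
import std.range : iota;

int calls; // hypothetical counter: how often parse has run

// stand-in for the expensive parsing step
int[] parse(int i)
{
    ++calls;
    return [i, i];
}

// Parses everything up front, then flattens the stored results.
int[] flattenViaArray()
{
    calls = 0;
    auto arr = iota(1, 5).map!parse.array; // all four inputs are parsed here
    assert(calls == 4);
    return arr.joiner.array;               // works on the stored int[][]
}

void main()
{
    assert(flattenViaArray() == [1, 1, 2, 2, 3, 3, 4, 4]);
    assert(calls == 4); // joining did not parse anything again
}
```

The trade-off is that every input is parsed and held in memory, even if a later step only needs the first few elements.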
On 21.12.17 08:41, Jonathan M Davis wrote:
> I would think that it would make a lot more sense to simply put the whole thing in an array than to use memoize. e.g.
>
>     auto arr = iota(1, 5).map!parse().array();

That's also possible, but I wanted to make use of the laziness, e.g. if I then search over the flattened data, I do not have to parse the 10th file. I replaced joiner with a primitive flatten function like this:

    unittest
    {
        import std.stdio;
        import std.range;
        import std.algorithm;
        import std.string;
        import std.functional;

        auto parse(int i)
        {
            writeln("parsing %s".format(i));
            return [1, 2, 3];
        }

        writeln(iota(1, 5).map!(parse));
        writeln("-------------------------------");
        writeln((iota(1, 5).map!(parse)).joiner);
        writeln("-------------------------------");
        writeln((iota(1, 5).map!(memoize!parse)).joiner);
        writeln("-------------------------------");
        writeln((iota(1, 5).map!(parse)).flatten);
    }

    auto flatten(T)(T input)
    {
        import std.range;

        struct Res
        {
            T input;
            ElementType!T current; // inner range currently being iterated

            this(T input)
            {
                this.input = input;
                // assumes a non-empty input; front is evaluated once here
                this.current = this.input.front;
                advance();
            }

            // skip exhausted inner ranges, reading input.front once per element
            private void advance()
            {
                while (current.empty)
                {
                    if (input.empty)
                    {
                        return;
                    }
                    input.popFront;
                    if (input.empty)
                    {
                        return;
                    }
                    current = input.front;
                }
            }

            bool empty()
            {
                return current.empty;
            }

            auto front()
            {
                return current.front;
            }

            void popFront()
            {
                current.popFront;
                advance();
            }
        }

        return Res(input);
    }

    void main() {}

With this implementation my program behaves as expected (parsing the input data only once).
Dec 21 2017
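The laziness argument (a search over the flattened data should not have to parse the 10th file) can be illustrated with a short sketch. To keep it self-contained it does not repeat the hand-written flatten from the post; as an assumption it uses Phobos' `cache` in front of `joiner` so that each element's `front` is evaluated only once. The names `calls` and `foundInFlattened` are hypothetical.

```d
import std.algorithm : cache, canFind, joiner, map;
import std.range : iota;

int calls; // hypothetical counter: how often parse has run

// stand-in for parsing "file" i; yields two values per file
int[] parse(int i)
{
    ++calls;
    return [i * 10, i * 10 + 1];
}

// Searches the lazily flattened data of ten "files" for needle.
bool foundInFlattened(int needle)
{
    calls = 0;
    return iota(1, 11).map!parse.cache.joiner.canFind(needle);
}

void main()
{
    // 20 is the first value of the second "file", so the search can stop early
    assert(foundInFlattened(20));
    assert(calls < 10); // not all ten inputs had to be parsed
}
```

With an eagerly built array, all ten inputs would be parsed before the search even starts.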