digitalmars.D.learn - std.range.chunk without length

reply "Stephan Schiffels" <stephan_schiffels mac.com> writes:
Hi,

I'd like a version of std.range.chunks that does not require the 
range to have the "length" property.

As an example, consider a file that you would like to parse by lines, 
always lumping together four lines, i.e.

import std.stdio;
void main() {
   auto range = File("test.txt", "r").byLine();
   foreach(c; range.chunks(4)) { //doesn't compile
     writefln("%s %s", c[0], c[1]);
   }
}

Thanks,
Stephan
Oct 29 2013
parent reply "qznc" <qznc web.de> writes:
Your wish was granted. Monarchdodra was sent back in time [0], so it is already fixed in HEAD. You could try the dmd beta [1].

[0] https://github.com/D-Programming-Language/phobos/pull/992
[1] http://forum.dlang.org/thread/526DD8C5.2040402 digitalmars.com
Oct 30 2013
parent reply "Stephan Schiffels" <stephan_schiffels mac.com> writes:
Ah, awesome! Should have updated my GitHub clone then.

Thanks,
Stephan
Oct 31 2013
parent reply "Stephan Schiffels" <stephan_schiffels mac.com> writes:
Sorry for the late follow-up, but it turns out that std.range.chunks needs a ForwardRange and hence does not work on File.byLine(). The comments on the referenced pull request claim that it does, but the current implementation needs a "save" function, which does not exist for the byLine range.

It would actually be easy to implement chunks without "save" by using an internal buffer, which would however make the algorithm's memory use linear in the chunk size. Would that be acceptable? If so, I'd be happy to make that change and push it.

Stephan
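To make the buffer idea concrete, here is a minimal sketch of what such a chunker might look like. It is not the Phobos implementation and not the change proposed in the pull requests; the names BufferedChunks and bufferedChunks are hypothetical. It copies up to chunkSize elements into an array, so it only needs empty/front/popFront and never calls "save", at the cost of memory linear in the chunk size.

```d
import std.range; // isInputRange, ElementType, and the array range primitives

/// Hypothetical helper (not part of Phobos): chunks any input range by
/// copying each chunk into an internal buffer.
struct BufferedChunks(R) if (isInputRange!R)
{
    private R source;
    private size_t chunkSize;
    private ElementType!R[] buffer;

    this(R source, size_t chunkSize)
    {
        assert(chunkSize > 0, "chunkSize must be positive");
        this.source = source;
        this.chunkSize = chunkSize;
        fill();
    }

    /// Reads up to chunkSize elements from the source into a fresh array.
    private void fill()
    {
        buffer = null; // new array per chunk, so earlier chunks stay intact
        while (buffer.length < chunkSize && !source.empty)
        {
            buffer ~= source.front;
            source.popFront();
        }
    }

    @property bool empty() const { return buffer.length == 0; }
    @property ElementType!R[] front() { return buffer; }
    void popFront() { fill(); }
}

/// Convenience constructor (hypothetical name).
auto bufferedChunks(R)(R source, size_t chunkSize) if (isInputRange!R)
{
    return BufferedChunks!R(source, chunkSize);
}

void main()
{
    import std.algorithm : equal;

    // The last chunk may be shorter than chunkSize.
    auto groups = bufferedChunks([1, 2, 3, 4, 5], 2);
    assert(equal(groups, [[1, 2], [3, 4], [5]]));
}
```

For the file example, note that byLine reuses its line buffer between iterations, so each line would have to be copied before it is stored, for instance with File("test.txt").byLine.map!(l => l.idup).bufferedChunks(4). The final chunk can contain fewer than four lines, so indexing such as c[1] should be guarded.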
Feb 13 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Stephan Schiffels:

 It would be actually easy to implement chunks without the 
 "save" function, by using an internal buffer, which would 
 however make this algorithm's memory burden linear in the chunk 
 size. Would that be acceptable?
I think it's acceptable. But perhaps you need to add one more optional argument for the buffer :-)

Bye,
bearophile
Feb 13 2014
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
Users andralex:
https://github.com/D-Programming-Language/phobos/pull/1186
and quickfur:
https://github.com/D-Programming-Language/phobos/pull/1453
have submitted different algorithms for a similar problem, basically by being "2-dimensional lazy" (each subrange is itself a lazy range).

However, both come with their own pitfalls. Andrei's still requires forward ranges. quickfur's doesn't and, arguably, has a simpler design. However, if I remember correctly, it is also less efficient (it does double work).

Implementing quickfur's solution in Chunks for input ranges only could be a good idea. It *is* extra work: more code, and more code to cover (that is difficult to cover). I'm not sure we have the manpower to support such complexity: I was able to make chunks work with forward ranges, but I still haven't even fixed Splitter yet! I think that should take precedence.
Feb 13 2014
parent "Stephan Schiffels" <stephan_schiffels mac.com> writes:
Yeah, never mind, I won't do it. I realised that you had good reasons to require a ForwardRange. Chunking really needs some sort of "save". And what I had in mind to make it work on File.byLine with a buffer is actually a hack that effectively adds "save" functionality to the InputRange, so I agree it's not logically reasonable to do it here.

Thanks anyway.
Stephan
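For reference, the InputRange/ForwardRange distinction this thread turns on can be checked at compile time with the traits from std.range. The following is only an illustrative sketch; the alias name Lines is made up for the example.

```d
import std.range : isInputRange, isForwardRange;
import std.stdio : File;

// Type of the range returned by File.byLine(). typeof does not evaluate
// the expression, so no file is opened.
alias Lines = typeof(File.init.byLine());

static assert( isInputRange!Lines);    // has empty, front and popFront
static assert(!isForwardRange!Lines);  // has no save, so it cannot be re-read

// A plain array, by contrast, is a forward range.
static assert( isForwardRange!(int[]));

void main() {}
```

If the program compiles, all three checks hold for the Phobos version in use.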
Feb 18 2014