digitalmars.D - Adding a read primitive to ranges

reply "Freddy" <Hexagonalstar64 gmail.com> writes:
Would it be a bad idea to add a read primitive to ranges for 
streaming?
----
struct ReadRange(T){
     size_t read(T[] buffer);
     //and | or
     T[] read(size_t request);

     /+ empty,front,popFront,etc +/
}
----
May 03 2015
next sibling parent ketmar <ketmar ketmar.no-ip.org> writes:
On Mon, 04 May 2015 00:07:25 +0000, Freddy wrote:

 Would it be a bad idea to add a read primitive to ranges for streaming?
 ----
 struct ReadRange(T){
      size_t read(T[] buffer);
      //and | or
      T[] read(size_t request);

      /+ empty,front,popFront,etc +/
 }
 ----
if you want to add such things, i'd say you should model that on `std.stdio.File` (`rawRead`, `rawWrite` and other file functions). i've been using my `streams` module, which uses such interfaces, for a long time.

can't see why it should be a range, though. i introduced a "Stream" entity, which, like a range, can be checked with various traits: isReadableStream, isWriteableStream, isSeekableStream and so on. note that a stream can be a range too; those are completely different interfaces.

what is good about taking `std.stdio.File` as a base: all my stream operations are immediately usable on standard file objects from Phobos.
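A minimal sketch of what such a trait could look like, assuming a "stream" is anything with a `rawRead` shaped like `std.stdio.File.rawRead`. The trait name is borrowed from the post above; its definition here, along with `MemoryStream` and `readSome`, is purely illustrative. ketmar's actual `streams` module is not shown in this thread and may work differently.

```d
import std.stdio : File;

/// Hypothetical trait: true if S has a rawRead that accepts a T[] buffer
/// and returns the filled slice, the way std.stdio.File.rawRead does.
enum isReadableStream(S, T = ubyte) = is(typeof((ref S s, T[] buf) {
    T[] got = s.rawRead(buf);
}));

/// Toy in-memory stream (an assumption for this example) with the same
/// rawRead shape as File.
struct MemoryStream
{
    const(ubyte)[] data;

    ubyte[] rawRead(ubyte[] buffer)
    {
        import std.algorithm.comparison : min;

        immutable n = min(buffer.length, data.length);
        buffer[0 .. n] = data[0 .. n]; // copy what is available
        data = data[n .. $];           // advance the stream
        return buffer[0 .. n];
    }
}

/// Generic code is written against the trait, not against a concrete type.
size_t readSome(S)(ref S stream, ubyte[] buf)
if (isReadableStream!S)
{
    return stream.rawRead(buf).length;
}

static assert(isReadableStream!File);
static assert(isReadableStream!MemoryStream);
static assert(!isReadableStream!int);
```

Because the check is structural, `File` satisfies it without any wrapper, which is the benefit described above.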
May 03 2015
prev sibling next sibling parent reply "Alex Parrill" <initrd.gz gmail.com> writes:
On Monday, 4 May 2015 at 00:07:27 UTC, Freddy wrote:
 Would it be a bad idea to add a read primitive to ranges for 
 streaming?
 ----
 struct ReadRange(T){
     size_t read(T[] buffer);
     //and | or
     T[] read(size_t request);

     /+ empty,front,popFront,etc +/
 }
 ----
It seems redundant to me. It's semantically no different than iterating through the range normally with front/popFront. For objects where reading large amounts of data is more efficient than reading one-at-a-time, you can implement a byChunk function like std.stdio.File's.
May 04 2015
parent reply "Freddy" <Hexagonalstar64 gmail.com> writes:
On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:
 It seems redundant to me. It's semantically no different than
 iterating through the range normally with front/popFront. For
 objects where reading large amounts of data is more efficient
 than reading one-at-a-time, you can implement a byChunk
 function like std.stdio.File's.
The problem is that all the functions in std.range, std.algorithm and many other wrappers would ignore byChunk and produce much slower code.
May 04 2015
parent reply "Alex Parrill" <initrd.gz gmail.com> writes:
On Monday, 4 May 2015 at 19:23:08 UTC, Freddy wrote:
 On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:

 The problem is that all the functions in
 std.range, std.algorithm and many other wrappers would ignore
 byChunk and produce much slower code.
How so? `file.byChunk(4096).joiner` is a range that acts as if you read each byte out of the file one at a time, but actually reads them in 4096-byte buffers. It's still compatible with all of the range and algorithm functions.
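As a rough illustration of that point, the sketch below feeds a chunked file into an ordinary algorithm. `countNewlines` and its `path` parameter are names made up for this example; `File.byChunk`, `joiner` and `count` are the Phobos functions.

```d
import std.algorithm.iteration : joiner;
import std.algorithm.searching : count;
import std.stdio : File;

/// Counts newline bytes in a file. The algorithm sees a flat range of
/// ubyte, while the file itself is read in 4096-byte chunks.
size_t countNewlines(string path)
{
    auto bytes = File(path, "rb").byChunk(4096).joiner;
    return bytes.count(cast(ubyte) '\n');
}
```

Note that `byChunk` reuses its internal buffer between iterations, so a chunk must be consumed (as `joiner` does here) or copied before the next one is fetched.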
May 04 2015
parent reply "Freddy" <Hexagonalstar64 gmail.com> writes:
On Monday, 4 May 2015 at 23:20:57 UTC, Alex Parrill wrote:
 On Monday, 4 May 2015 at 19:23:08 UTC, Freddy wrote:
 On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:

 The problem is that all the functions in
 std.range, std.algorithm and many other wrappers would ignore
 byChunk and produce much slower code.
 How so? `file.byChunk(4096).joiner` is a range that acts as if you read each byte out of the file one at a time, but actually reads them in 4096-byte buffers. It's still compatible with all of the range and algorithm functions.
Reading an arbitrary amount of data after the range has been wrapped. For example:
----
void func(R)(R range){//expects range of strings
    string[] elms=range.read(5);
    string[] elms2=range.read(9);
    /++..++/
}

void caller(){
    auto file=...;//unbuffered file
    file.map!(a=>a.to!string).func();
}
----
Using byChunk would cause much more reallocation.
May 04 2015
parent reply "Freddy" <Hexagonalstar64 gmail.com> writes:
On Tuesday, 5 May 2015 at 00:50:44 UTC, Freddy wrote:
 ----
 void func(R)(R range){//expects range of strings
     string[] elms=range.read(5);
     string[] elms2=range.read(9);
     /++..++/
 }


 void caller(){
     auto file=...;//unbuffered file
     file.map!(a=>a.to!string).func();
 }
 ----
Wait, bad example:
----
void func(R)(R range){//expects range of ubyte
    ubyte[] data=range.read(VERY_BIG_NUMBER);
    ubyte[] other_data=range.read(OTHER_VERY_BIG_NUMBER);
}
----
which would be more optimal for a file but still works for other ranges, compared to looping through the range and appending to data.
May 04 2015
parent reply "Alex Parrill" <initrd.gz gmail.com> writes:
On Tuesday, 5 May 2015 at 01:28:03 UTC, Freddy wrote:
 Wait, Bad example,
 ----
 void func(R)(R range){//expects range of ubyte
     ubyte[] data=range.read(VERY_BIG_NUMBER);
     ubyte[] other_data=range.read(OTHER_VERY_BIG_NUMBER);
 }
 ----
 which would be more optimal for a file but still works for
 other ranges, compared to looping through the range and
 appending to data.
How would it be more optimal? As I said, if you pass in `file.byChunk(some_amount).joiner`, this will still read the file in large chunks. It's less optimal now because `read` has to allocate an array on every call (easily avoidable by passing in a reusable buffer, but still).

Equivalent code with ranges:

    auto range = file.byChunk(4096).joiner;
    ubyte[] data = range.take(VERY_BIG_NUMBER).array;
    ubyte[] other_data = range.take(OTHER_VERY_BIG_NUMBER).array;
May 05 2015
parent "Freddy" <Hexagonalstar64 gmail.com> writes:
 How would it be more optimal? As I said, if you pass in
 `file.byChunk(some_amount).joiner`, this will still read the
 file in large chunks. It's less optimal now because `read` has
 to allocate an array on every call (easily avoidable by passing
 in a reusable buffer, but still).

 Equivalent code with ranges:

     auto range = file.byChunk(4096).joiner;
     ubyte[] data = range.take(VERY_BIG_NUMBER).array;
     ubyte[] other_data = range.take(OTHER_VERY_BIG_NUMBER).array;
The range solution copies from a buffer to a newly allocated array many times, doing many system calls. The read(stream) solution allocates a new array and does one system call. Sorry for the miscommunication.
May 05 2015
prev sibling parent "Freddy" <Hexagonalstar64 gmail.com> writes:
On Monday, 4 May 2015 at 00:07:27 UTC, Freddy wrote:
 Would it be a bad idea to add a read primitive to ranges for 
 streaming?
 ----
 struct ReadRange(T){
     size_t read(T[] buffer);
     //and | or
     T[] read(size_t request);

     /+ empty,front,popFront,etc +/
 }
 ----
Also, if so, what about adding a default read for input ranges? Something like:
----
typeof(range.front)[] read(R)(ref R range,size_t amount){
    auto data=new typeof(range.front)[amount];
    /+... read into data ...+/
    return data[0..actual_amount];
}
----
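One way the elided part could be filled in is sketched below, under the assumption that the fallback simply pulls elements with front/popFront. The free functions named `read` are hypothetical and not part of Phobos. The buffer-filling overload corresponds to the `size_t read(T[] buffer)` form from the first post, and the allocating overload to `T[] read(size_t request)`.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

/// Hypothetical fallback: copies elements from `range` into `buffer` until
/// the buffer is full or the range is empty. Returns the number copied.
size_t read(R, T)(ref R range, T[] buffer)
if (isInputRange!R && is(ElementType!R : T))
{
    size_t n;
    for (; n < buffer.length && !range.empty; ++n)
    {
        buffer[n] = range.front;
        range.popFront();
    }
    return n;
}

/// Allocating overload: returns at most `amount` elements in a new array.
ElementType!R[] read(R)(ref R range, size_t amount)
if (isInputRange!R)
{
    alias E = ElementType!R;
    auto data = new E[amount];
    return data[0 .. range.read(data)];
}
```

Since member functions take precedence over UFCS calls, a range that defines its own, faster `read` (a file, for instance) would be used in preference to this fallback when called as `range.read(...)`.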
May 04 2015