digitalmars.D.learn - Files and Buffers

reply Jerry Ferris <biomechanical iomailu.io> writes:
Hello,

I'm developing a program that will either receive data from stdin 
or a file, and pass it along to a function for processing. I want 
to place this data into a buffer so there only has to be one 
version of the function. However, since I'm new to D, I'm unsure 
how to go about this in the most idiomatic way.

Thus my questions are:
* Is there a simpler way to copy a file's contents into a buffer?
* Is the above superfluous due to some mechanism?
* Basically, am I doing this in a convoluted and inefficient 
manner?

Here is the current code:

---
/**
  * Copy file contents to buffer
  */
auto file = File(filename, "rb");
auto compBuf = new OutBuffer();

{
	import std.conv : to;

	immutable fileSize = filename.getSize.to!uint;
	compBuf.reserve(fileSize);
	
	foreach (ubyte[] buf; file.byChunk(fileSize))
	{
		compBuf.write(buf);
	}
}

// Decompress takes an OutBuffer w/ compressed data and returns
// an OutBuffer w/ the uncompressed version
auto decompBuf = compBuf.decompress;
---

Thank you in advance, and I apologize if this is a very stupid 
question.

Regards,
Jerry Ferris
Feb 01 2018
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/1/18 4:36 PM, Jerry Ferris wrote:
 Hello,
 
 I'm developing a program that will either receive data from stdin or a 
 file, and pass it along to a function for processing. I want to place 
 this data into a buffer so there only has to be one version of the 
 function. However, since I'm new to D, I'm unsure how to go about this 
 in the most idiomatic way.
https://dlang.org/phobos/std_file.html#read Then cast to ubyte[].
 Thank you in advance, and I apologize if this is a very stupid question.
Not a stupid question! -Steve
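
A minimal sketch of the suggestion above, assuming the whole file fits in memory. The file name `example.bin` and its contents are made up for the illustration; only `std.file.read`, `std.file.write`, and `std.file.remove` are real Phobos calls.

```d
import std.file : read, remove, write;

void main()
{
    // Hypothetical file name and contents, used only for this illustration.
    enum name = "example.bin";
    ubyte[] data = [1, 2, 3];
    write(name, data);
    scope (exit) remove(name);

    // std.file.read returns void[]; the cast reinterprets it as raw bytes.
    auto bytes = cast(ubyte[]) read(name);
    assert(bytes == data);
}
```

This replaces the manual `getSize`/`reserve`/`byChunk` loop from the original post with a single call.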
Feb 01 2018
next sibling parent Jerry Ferris <biomechanical iomailu.io> writes:
On Thursday, 1 February 2018 at 21:42:33 UTC, Steven 
Schveighoffer wrote:
 On 2/1/18 4:36 PM, Jerry Ferris wrote:
 https://dlang.org/phobos/std_file.html#read

 Then cast to ubyte[].

 Thank you in advance, and I apologize if this is a very stupid 
 question.
 Not a stupid question!

 -Steve

On Thursday, February 01, 2018 21:36:52 Jerry Ferris via Digitalmars-d-learn wrote:
 If you want to copy an entire file into an array, then use 
 std.file.read or std.file.readText. std.stdio.File only really 
 makes sense if you're trying not to read the entire file into 
 memory.

 - Jonathan M Davis

In retrospect, I should've looked more closely at that module.

Regardless, thank you very much for your (prompt!) answers.

Regards,
Jerry Ferris
Feb 01 2018
prev sibling parent reply Seb <seb wilzba.ch> writes:
On Thursday, 1 February 2018 at 21:42:33 UTC, Steven 
Schveighoffer wrote:
 On 2/1/18 4:36 PM, Jerry Ferris wrote:
 Hello,
 
 I'm developing a program that will either receive data from 
 stdin or a file, and pass it along to a function for 
 processing. I want to place this data into a buffer so there 
 only has to be one version of the function. However, since I'm 
 new to D, I'm unsure how to go about this in the most 
 idiomatic way.
 https://dlang.org/phobos/std_file.html#read Then cast to ubyte[].
There's also always std.string.representation which imho looks nicer than the cast: https://dlang.org/library/std/string/representation.html
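
A small sketch of how that could look, under the assumption that the data is read as text first (`representation` works on strings, not on the `void[]` that `std.file.read` returns). The file name and contents are hypothetical.

```d
import std.file : readText, remove, write;
import std.string : representation;

void main()
{
    // Hypothetical file name and contents, used only for this illustration.
    enum name = "example.txt";
    write(name, "hello");
    scope (exit) remove(name);

    // readText returns a string; representation views it as bytes, no cast needed.
    string text = readText(name);
    immutable(ubyte)[] bytes = text.representation;
    assert(bytes.length == 5);
    assert(bytes[0] == 'h');
}
```

Note that the result is `immutable(ubyte)[]` rather than a mutable `ubyte[]`, and that `readText` validates the input as UTF.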
Feb 01 2018
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/1/18 8:27 PM, Seb wrote:
 On Thursday, 1 February 2018 at 21:42:33 UTC, Steven Schveighoffer wrote:
 On 2/1/18 4:36 PM, Jerry Ferris wrote:
 Hello,

 I'm developing a program that will either receive data from stdin or 
 a file, and pass it along to a function for processing. I want to 
 place this data into a buffer so there only has to be one version of 
 the function. However, since I'm new to D, I'm unsure how to go about 
 this in the most idiomatic way.
 https://dlang.org/phobos/std_file.html#read Then cast to ubyte[].
 There's also always std.string.representation which imho looks nicer than the cast: https://dlang.org/library/std/string/representation.html
std.file.read returns a void[].

I didn't see one that returns a ubyte[], and I think using the readText version is going to validate the text (which may not be desired). It really depends on the use case of the OP, but his original code was working with ubyte[] without validation, so I suggested the void[] return with cast.

It's a shame, actually, that ubyte[] isn't returned from read. I remember discussions at some point to change it, but that was shot down for some reason.

-Steve
Feb 02 2018
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, February 02, 2018 09:40:52 Steven Schveighoffer via Digitalmars-
d-learn wrote:
 std.file.read returns a void[].

 I didn't see one that returns a ubyte[], and using the readText version
 is going to validate the text I think (which may not be desired). It
 really depends on the use case of the OP, but his original code was
 working with ubyte[] without validation, so I suggested the void[]
 return with cast.
readText calls std.utf.validate, which makes some sense if char, wchar, and dchar are supposed to be UTF-8, UTF-16, and UTF-32 respectively, but then we turn around and end up validating Unicode all over the place thanks to how the range API works with "narrow" strings, making the validation kind of pointless. Regardless, if what you want is ubyte[], then there's no reason to be calling readText rather than read.
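
To illustrate the difference, here is a rough sketch that writes two bytes which are not valid UTF-8 and reads them back both ways. The file name is made up; the byte pair `0xC3 0x28` is just one commonly cited invalid sequence.

```d
import std.exception : assertThrown;
import std.file : read, readText, remove, write;
import std.utf : UTFException;

void main()
{
    // Hypothetical file holding bytes that are not valid UTF-8.
    enum name = "invalid_utf8.bin";
    ubyte[] data = [0xC3, 0x28];
    write(name, data);
    scope (exit) remove(name);

    // read does no validation, so the raw bytes come back untouched.
    auto raw = cast(ubyte[]) read(name);
    assert(raw == data);

    // readText validates the contents and rejects them.
    assertThrown!UTFException(readText(name));
}
```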
 It's a shame, actually, that ubyte[] isn't returned from read. I
 remember discussions at some point to change it, but that was shot down
 for some reason.
I can never remember what the reasons are for using void[] instead of ubyte[]. I think that maybe it was argued at one point that having a parameter be void[] made more sense, because it then can accept any array type, but that argument doesn't hold for something returning void[]. It's just a bunch of bytes, so I would have thought that ubyte[] would make more sense.

But I can't remember the arguments now, and without thinking through it a bit, I'm not sure whether changing read to return ubyte[] would break code or not. I don't _think_ so, since you have to cast from void[] to do anything, and ubyte[] will implicitly convert to void[], but there may be something that I'm missing that would make such a change a breaking change.

But the fact that it's void[] instead of ubyte[] means that it can't be used in @safe code without using @trusted even if all you want is ubyte[].

- Jonathan M Davis
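
One possible way to work with this today is sketched below. The helper names `readBytes` and `countBytes` and the file name are hypothetical, not part of Phobos; the idea is only to confine the cast to a single @trusted function so that @safe callers can get a ubyte[].

```d
import std.file : read, remove, write;

// Hypothetical helper: confines the void[] -> ubyte[] cast to one @trusted spot.
ubyte[] readBytes(string name) @trusted
{
    return cast(ubyte[]) read(name);
}

// @safe code can call the helper, although it could not perform the cast itself.
size_t countBytes(string name) @safe
{
    return readBytes(name).length;
}

void main()
{
    // Hypothetical file name and contents, used only for this illustration.
    enum name = "example.bin";
    ubyte[] data = [1, 2, 3];
    write(name, data);
    scope (exit) remove(name);

    // ubyte[] converts to void[] implicitly; the reverse direction needs a cast.
    void[] erased = data;
    assert(erased.length == 3);
    assert(countBytes(name) == 3);
}
```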
Feb 02 2018
prev sibling next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, February 01, 2018 21:36:52 Jerry Ferris via Digitalmars-d-learn 
wrote:
 Hello,

 I'm developing a program that will either receive data from stdin
 or a file, and pass it along to a function for processing. I want
 to place this data into a buffer so there only has to be one
 version of the function. However, since I'm new to D, I'm unsure
 how to go about this in the most idiomatic way.

 Thus my questions are:
 * Is there a simpler way to copy a file's contents into a buffer?
If you want to copy an entire file into an array, then use std.file.read or std.file.readText. std.stdio.File only really makes sense if you're trying not to read the entire file into memory.

- Jonathan M Davis
Feb 01 2018
prev sibling parent Jerry Ferris <biomechanical iomailu.io> writes:
Good day,

I reexamined my objective and needs, and I've determined that the 
entirety of the input does not need to be read in all cases. Thus 
I've returned to using File instead of std.file.read.
I should've put forth more thought; because I did not, I created 
a fairly useless OP. I do apologize, but from what I've seen, it 
at least initiated some discussion; I learned some useful 
functions too.

---
void main(string[] args)
{
	// file points to either stdin or a file specified by a user via args
	// fileSize is 0 for stdin
	auto uncompData = file.decompress(fileSize);
}

OutBuffer decompress(File file, ulong fileSize = 0)
{
	// Process file's contents
}
---
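
For readers who want something that compiles, the outline above could be filled in roughly as follows. This is only a sketch under assumptions: `slurp` is a hypothetical stand-in for `decompress` that copies the input through unchanged, the 4096-byte chunk size is arbitrary, and the input path is taken from the first command-line argument.

```d
import std.outbuffer : OutBuffer;
import std.stdio : File, stdin;

// Hypothetical stand-in for decompress: copies the input through chunk by chunk.
OutBuffer slurp(File file, ulong fileSize = 0)
{
    auto buf = new OutBuffer();
    // Pre-allocate only when the size is known (it is 0 for stdin).
    if (fileSize > 0)
        buf.reserve(cast(size_t) fileSize);
    foreach (ubyte[] chunk; file.byChunk(4096))
        buf.write(chunk);
    return buf;
}

void main(string[] args)
{
    // Read from the file named by the first argument, otherwise from stdin.
    immutable fromFile = args.length > 1;
    auto file = fromFile ? File(args[1], "rb") : stdin;
    immutable ulong fileSize = fromFile ? file.size : 0;
    auto data = slurp(file, fileSize);
}
```

Because both sources are a `File`, the processing function stays the same whether the data comes from disk or from a pipe.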

Best regards,
Jerry Ferris
Feb 02 2018