digitalmars.D.learn - block file reads and lazy utf-8 decoding
I want to combine block reads with lazy conversion of utf-8 characters to dchars. The solution I came up with is in the program below. This works fine, has good performance, etc. Question I have is if there is a better way to do this. For example, a different way to construct the lazy 'decodeUTF8Range' rather than writing it out in this fashion. There is quite a bit of power in the library and I'm still learning it. I'm wondering if I overlooked a useful alternative.

--Jon

Program:
-----------
import std.algorithm: each, joiner, map;
import std.conv;
import std.range;
import std.stdio;
import std.traits;
import std.utf: decodeFront;

auto decodeUTF8Range(Range)(Range charSource)
    if (isInputRange!Range && is(Unqual!(ElementType!Range) == char))
{
    static struct Result
    {
        private Range source;
        private dchar next;

        bool empty = false;

        @property dchar front() { return next; }

        // decodeFront both reads and consumes, so the decoded value is cached in 'next'.
        void popFront()
        {
            if (source.empty)
            {
                empty = true;
                next = dchar.init;
            }
            else
            {
                next = source.decodeFront;
            }
        }
    }

    auto r = Result(charSource);
    r.popFront;   // Prime the first element.
    return r;
}

void main(string[] args)
{
    if (args.length != 2)
    {
        writeln("Provide one file name.");
        return;
    }

    ubyte[1024*1024] rawbuf;
    auto inputStream = args[1].File();
    inputStream
        .byChunk(rawbuf)           // Read in blocks
        .joiner                    // Join the blocks into a single input char range
        .map!(a => to!char(a))     // Cast ubyte to char for decodeFront. Any better ways?
        .decodeUTF8Range           // utf8 to dchar conversion.
        .each;                     // Real work goes here.
    writeln("done");
}
Dec 09 2015
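One possible alternative to the hand-written range, as a sketch rather than a definitive answer: std.utf provides byDchar, which lazily decodes any input range of char code units, so the joined byte stream can be fed to it after a plain cast. The helper names decodeChunks and countCodePoints below are made up for illustration, and the sketch assumes a reasonably recent Phobos. One behavioral difference to keep in mind: byDchar substitutes U+FFFD for invalid sequences by default, whereas decodeFront as used above throws a UTFException.

```d
import std.algorithm : equal, joiner, map;
import std.range : chunks, walkLength;
import std.stdio : File;
import std.string : representation;
import std.utf : byDchar;

// Hypothetical helper: lazily decode a range of ubyte blocks as UTF-8.
auto decodeChunks(Chunks)(Chunks blocks)
{
    return blocks
        .joiner                      // flatten the blocks into one ubyte range
        .map!(b => cast(char) b)     // reinterpret each byte as a UTF-8 code unit
        .byDchar;                    // lazy decoding to dchar
}

// Hypothetical example on a file: count the code points it contains.
size_t countCodePoints(string path)
{
    auto buf = new ubyte[](1024 * 1024); // heap buffer, reused for each block
    return File(path).byChunk(buf).decodeChunks.walkLength;
}

unittest
{
    // Two-byte blocks split the multi-byte characters across block boundaries.
    auto bytes = "héllo, wörld".representation;
    assert(decodeChunks(bytes.chunks(2)).equal("héllo, wörld"d));
}
```

Because joiner flattens the blocks before decoding, a multi-byte sequence that straddles two blocks is still decoded as one character. The plain cast(char) avoids going through std.conv.to for every byte; whether that matters for performance would need measuring.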
On Thursday, 10 December 2015 at 00:36:27 UTC, Jon D wrote:
Question I have is if there is a better way to do this. For example, a different way to construct the lazy 'decodeUTF8Range' rather than writing it out in this fashion.

A further thought - The decodeUTF8Range function is basically constructing a lazy wrapper range around decodeFront, which is effectively combining a 'front' and 'popFront' operation. So perhaps there is a generic way to compose a wrapper for such functions.

auto decodeUTF8Range(Range)(Range charSource)
    if (isInputRange!Range && is(Unqual!(ElementType!Range) == char))
{
    static struct Result
    {
        private Range source;
        private dchar next;

        bool empty = false;

        @property dchar front() { return next; }

        void popFront()
        {
            if (source.empty)
            {
                empty = true;
                next = dchar.init;
            }
            else
            {
                next = source.decodeFront;
            }
        }
    }

    auto r = Result(charSource);
    r.popFront;
    return r;
}
Dec 09 2015
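The generic wrapper idea from the post above could look roughly like the following. This is only a sketch: ConsumingRange, consumeWith and decodeOne are hypothetical names, not Phobos functions. It assumes the wrapped function takes the source range by ref, consumes one or more elements from it, and returns the next output element, which is the shape decodeFront has.

```d
import std.algorithm : equal, map;
import std.range.primitives : ElementType, empty, isInputRange;
import std.string : representation;
import std.traits : Unqual;
import std.utf : decodeFront;

// Hypothetical generic wrapper around a "read and consume" function.
struct ConsumingRange(alias fun, Range, E)
{
    private Range source;
    private E next;

    bool empty = false;

    @property E front() { return next; }

    void popFront()
    {
        if (source.empty)
        {
            empty = true;
            next = E.init;
        }
        else
        {
            next = fun(source);   // fun advances source and yields one element
        }
    }
}

// Hypothetical constructor function: builds the wrapper and primes it.
auto consumeWith(alias fun, Range)(Range source)
    if (isInputRange!Range)
{
    alias E = typeof(fun(source));
    auto r = ConsumingRange!(fun, Range, E)(source);
    r.popFront();   // load the first element, as the original code does
    return r;
}

// Named adapter so decodeFront can be passed as an alias parameter.
private dchar decodeOne(Range)(ref Range r) { return r.decodeFront; }

// decodeUTF8Range expressed through the generic helper.
auto decodeUTF8Range(Range)(Range charSource)
    if (isInputRange!Range && is(Unqual!(ElementType!Range) == char))
{
    return charSource.consumeWith!decodeOne;
}

unittest
{
    auto units = "héllo".representation.map!(b => cast(char) b);
    assert(decodeUTF8Range(units).equal("héllo"d));
}
```

The struct is declared at module level and takes the function as an alias parameter, following the pattern Phobos uses for map; the element type is deduced from the function's return type, so the same helper would fit other functions that combine 'front' and 'popFront' in one call.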