
digitalmars.D - eliminate cast

Dee Girl <deegirl noreply.com> writes:
There is a file data.txt with numbers:

345
5467
45
238
...

And I want to load into an array of uint like this.

auto sizes = map!
    (to!(uint, string))
    (compose!(split, q{cast(string) std.file.read(a)})("data.txt"));

It works, but cast is always bad ^_^. How can I eliminate the cast? Is there
a function to read an entire text file into a string? Thank you, Dee Girl
May 14 2008
BCS <ao pathlink.com> writes:
Reply to Dee,

 There is a file data.txt with numbers:
 
 345
 5467
 45
 238
 ...
 And I want to load into an array of uint like this.
 
 auto sizes = map!
 (to!(uint, string))
 (compose!(split, q{cast(string) std.file.read(a)})("data.txt"));

 It works but cast is always bad ^_^. How can I eliminate cast? Is
 there function to read entire text file in string? Thank you, Dee Girl
 
IMHO in this case the cast is /not/ bad. It is an explicit documentation of your intent. The file.read function knows nothing about what it is reading and therefore returns ubyte[] to say "this is data", and the cast is the programmer saying "I know this is text".
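
For readers on a current D2 compiler, here is a minimal sketch of both routes. It assumes a recent Phobos, where `std.file.read` is declared as returning `void[]` and `std.file.readText` returns a `string` after validating it as UTF. The helper name `loadSizes` and the scratch file `data_example.txt` are made up for illustration.

```d
import std.algorithm : map;
import std.array : array, split;
import std.conv : to;
import std.file : read, readText, remove, write;

// Hypothetical helper: read a whitespace-separated list of numbers.
uint[] loadSizes(string path)
{
    // readText hands back a validated string, so no cast is needed.
    return readText(path).split().map!(to!uint).array;
}

void main()
{
    enum path = "data_example.txt"; // hypothetical scratch file
    write(path, "345\n5467\n45\n238\n");
    scope (exit) remove(path);

    // The approach from the thread: reinterpret the raw bytes as text.
    auto raw = cast(string) read(path);
    assert(raw.split().map!(to!uint).array == [345u, 5467, 45, 238]);

    // The cast-free route gives the same result.
    assert(loadSizes(path) == [345u, 5467, 45, 238]);
}
```

The cast version states "I know this is text" without checking it; `readText` checks and throws on invalid UTF.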
May 14 2008
Dee Girl <deegirl noreply.com> writes:
BCS Wrote:

 Reply to Dee,
 
 There is a file data.txt with numbers:
 
 345
 5467
 45
 238
 ...
 And I want to load into an array of uint like this.
 
 auto sizes = map!
 (to!(uint, string))
 (compose!(split, q{cast(string) std.file.read(a)})("data.txt"));

 It works but cast is always bad ^_^. How can I eliminate cast? Is
 there function to read entire text file in string? Thank you, Dee Girl
 
 IMHO in this case the cast is /not/ bad. It is an explicit documentation of your intent. The file.read function knows nothing about what it is reading and therefore returns ubyte[] to say "this is data", and the cast is the programmer saying "I know this is text".
O.K., but let's say we want to check for an invalid file (binary and not ASCII). How can I do it? Thank you, Dee Girl
May 14 2008
BCS <ao pathlink.com> writes:
Reply to Dee,


 O.K. but let say we want to check for invalid file (binary and not
 ascii). How can I do it? Thank you, Dee Girl
 
http://www.digitalmars.com/d/1.0/phobos/std_utf.html validate might be a start
May 14 2008
"Janice Caron" <caron800 googlemail.com> writes:
On 14/05/2008, Dee Girl <deegirl noreply.com> wrote:
 O.K. but let say we want to check for invalid file (binary and not ascii). How can I do it? Thank you, Dee Girl
First off, char and string mean UTF-8, not ASCII. So by casting to string, you are saying "I know this is UTF-8". There is a way to say "This is ASCII", which is to cast to AsciiString instead of to string. (AsciiString is defined in std.encoding.)

There are currently two ways to validate. There's the old way: std.utf.validate(), which throws an exception if the validation fails. That will work for UTF-8, but it won't work for ASCII. Then there's the new way: std.encoding.isValid(), which returns a bool (false if validation fails) instead of throwing. That one works for ASCII too, provided your data has type AsciiString instead of string.

Yes, there's currently duplicate functionality in Phobos, but std.utf will eventually be deprecated in favor of std.encoding.
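
The two validation routes described above might be wrapped like this. It is a sketch assuming a current Phobos; `isUtf8` and `isAscii` are hypothetical helper names, and the code casts to `const(AsciiChar)[]` (`AsciiString` is the `immutable` version of that type) to avoid casting away constness.

```d
import std.encoding : AsciiChar, isValid;
import std.utf : UTFException, validate;

// Hypothetical helper: true if the bytes are well-formed UTF-8.
bool isUtf8(const(void)[] data)
{
    try
    {
        // std.utf.validate signals failure by throwing.
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

// Hypothetical helper: true if every byte is 7-bit ASCII.
bool isAscii(const(void)[] data)
{
    // std.encoding.isValid reports failure through its return value.
    return isValid(cast(const(AsciiChar)[]) data);
}

void main()
{
    assert(isUtf8("345\n5467\n"));
    assert(isAscii("345\n5467\n"));

    assert(isUtf8("h\u00E9llo"));   // valid UTF-8 ...
    assert(!isAscii("h\u00E9llo")); // ... but not ASCII

    ubyte[] binary = [0xFF, 0xFE, 0x00];
    assert(!isUtf8(binary));
    assert(!isAscii(binary));
}
```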
May 15 2008
downs <default_357-line yahoo.de> writes:
Dee Girl wrote:
 There is a file data.txt with numbers:
 
 345
 5467
 45
 238
 ...
 
 And I want to load into an array of uint like this.
 
 auto sizes = map!
     (to!(uint, string))
     (compose!(split, q{cast(string) std.file.read(a)})("data.txt"));
 
 It works but cast is always bad ^_^.
Where did you get that idea?
In this case, basically, we know the file is really text. So we put this knowledge into code by assuring the compiler that yes, the arbitrary data you've just read _is_ text after all.
In any case, it's just a reinterpreting cast. It doesn't change any actual data. So no time lost.

Also, just fyi, here's how that code would look like in tools.functional:

gentoo-pc ~ $ cat test2.d; echo -----; rebuild test2.d -oftest2 && ./test2
module test2;
import std.stdio, std.file, tools.functional, std.string: split, atoi;
void main() {
  auto sizes = (cast(string) "test.txt".read()).split() /map/ &atoi;
  writefln(sizes);
}
-----
[345,5467,45,238]

I still prefer infix ^^

 --downs
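
The "reinterpreting cast" point can be checked directly: after the cast, the string refers to the same memory as the byte array. A small sketch, assuming a current D2 compiler; `sameMemory` is a hypothetical helper written for this illustration.

```d
// Hypothetical helper: do two slices refer to exactly the same bytes?
bool sameMemory(const(void)[] a, const(void)[] b)
{
    return a.ptr is b.ptr && a.length == b.length;
}

void main()
{
    ubyte[] raw = [0x33, 0x34, 0x35]; // the bytes of "345"

    // Reinterpret the bytes as text; nothing is copied or converted.
    // Caution: the result is typed immutable, so raw must not be
    // modified afterwards.
    auto text = cast(string) raw;

    assert(text == "345");
    assert(sameMemory(raw, text));
}
```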
May 14 2008
Dee Girl <deegirl noreply.com> writes:
downs Wrote:

 Dee Girl wrote:
 There is a file data.txt with numbers:
 
 345
 5467
 45
 238
 ...
 
 And I want to load into an array of uint like this.
 
 auto sizes = map!
     (to!(uint, string))
     (compose!(split, q{cast(string) std.file.read(a)})("data.txt"));
 
 It works but cast is always bad ^_^.
 Where did you get that idea?
 In this case, basically, we know the file is really text. So we put this knowledge into code by assuring the compiler that yes, the arbitrary data you've just read _is_ text after all.
 In any case, it's just a reinterpreting cast. It doesn't change any actual data. So no time lost.
 
 Also, just fyi, here's how that code would look like in tools.functional:
 
 gentoo-pc ~ $ cat test2.d; echo -----; rebuild test2.d -oftest2 && ./test2
 module test2;
 import std.stdio, std.file, tools.functional, std.string: split, atoi;
 void main() {
   auto sizes = (cast(string) "test.txt".read()).split() /map/ &atoi;
   writefln(sizes);
 }
 -----
 [345,5467,45,238]
 
 I still prefer infix ^^
 
  --downs
Infix is nice but... I am sorry, only a few days and I fight with everybody! ^_^ Sorry!

Infix is nice (less ((()))) but it forces arity to 2. If you have two arrays and want to map over them you do:

arr1 ~ arr2 /map/ &atoi;

But with map you do:

map!(atoi)(arr1, arr2);

There is no more data copy, I think. Also atoi is called directly (remember the previous thread ^_^), not by pointer. It looks to me that map in std is superior in two ways.

But I think std.map can be better. Maybe you want to map many functions over one or many arrays. You should write:

auto t = map!(sin, cos)(array1, array2);

Then t is an array of tuples with the results. I think it is possible. std.reduce does it. So Andrei knows how to do it but maybe did not have time. D is so powerful!

Also maybe compose can take many functions. So in my dream I write:

auto sizes = compose!
    (map!(to!(uint, string)), split, fileToText)
    ("data.txt");

It would be even better than all functional languages and scripting languages! Thank you, Dee Girl
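
Much of this wish list can be written against a current Phobos. The sketch below is an assumption about today's library, not the 2008 one: `std.range.chain` stands in for passing two arrays to map, `std.file.readText` stands in for the hypothetical `fileToText`, and the aliases `parseSizes` and `loadSizes` and the scratch file name are invented for the example.

```d
import std.algorithm : equal, map;
import std.array : split;
import std.conv : to;
import std.file : readText, remove, write;
import std.functional : compose;
import std.range : chain;

// compose applies right to left: split first, then convert each token.
alias parseSizes = compose!(map!(to!uint), split);

// Three stages: read the file, split it, convert each token.
alias loadSizes = compose!(map!(to!uint), split, readText);

void main()
{
    auto arr1 = ["1", "2"];
    auto arr2 = ["3"];

    // chain walks both arrays lazily; nothing is concatenated or copied.
    assert(chain(arr1, arr2).map!(to!int).equal([1, 2, 3]));

    // Several functions in one map call yield one tuple per element.
    auto both = [1, 2].map!("a + 1", "a * 10");
    assert(both.front[0] == 2 && both.front[1] == 10);

    assert(parseSizes("345 5467 45 238").equal([345u, 5467, 45, 238]));

    enum path = "compose_example.txt"; // hypothetical scratch file
    write(path, "345\n5467\n45\n238\n");
    scope (exit) remove(path);
    assert(loadSizes(path).equal([345u, 5467, 45, 238]));
}
```

Because the function is a template argument rather than a function pointer, the compiler can call it directly, which is the second advantage mentioned above.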
May 14 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
Dee Girl <deegirl noreply.com> wrote:

 If you have two arrays and want to map over them you do:

 arr1 ~ arr2 /map/ &atoi;

 But with map you do:

 map!(atoi)(arr1, arr2);
That's one of the reasons we want better tuples.

(arr1, arr2) /map/ &atoi;

-- Simen
May 14 2008
Dee Girl <deegirl noreply.com> writes:
Simen Kjaeraas Wrote:

 Dee Girl <deegirl noreply.com> wrote:
 
 If you have two arrays and want to map over them you do:

 arr1 ~ arr2 /map/ &atoi;

 But with map you do:

 map!(atoi)(arr1, arr2);
 That's one of the reasons we want better tuples.
 
 (arr1, arr2) /map/ &atoi;
 
 -- Simen
I searched for tuple in std.algorithm. Good to have source! I found it in the mismatch function. So I think this should work:

tuple(arr1, arr2) /map/ &atoi;

Can this work? If it works then maybe it is better because the syntax is not ambiguous. For a newcomer, it seems D people care about syntax very much! This is good. ^_^ Thank you, Dee Girl
May 14 2008
bearophile <bearophileHUGS lycos.com> writes:
My libs are designed to solve exactly such problems, this is the lib:
http://www.fantascienza.net/leonardo/so/libs_d.zip

Usage examples to do the things shown in this thread:

import std.string: stripr;
import std.conv: toInt;
import d.func: map, xrange, range, toStruct, cinterval, zip;
import d.string: xsplit, putr;
import d.xio: xfile;

void main() {
  auto values = map((string l){return toInt(stripr(l));}, xfile("test.txt"));
  putr(values);

  // mapping on two joined iterable things:
  auto arr = map((int x){return -x;}, xrange(6) ~ [1, 2, 3]);
  putr(arr);

  // zipping into a struct array:
  auto array1 = range(3, 30, 2);
  auto array2 = cinterval('b', 'h');
  auto pairs = map((int i, char c){return toStruct(i, c);}, array1, array2);

  // Or just:
  putr(pairs);
  putr(zip(array1, array2));
}


The output is verbatim:
[345, 5467, 45, 238]
[0, -1, -2, -3, -4, -5, -1, -2, -3]
[<3, 'b'>, <5, 'c'>, <7, 'd'>, <9, 'e'>, <11, 'f'>, <13, 'g'>, <15, 'h'>]
[<3, 'b'>, <5, 'c'>, <7, 'd'>, <9, 'e'>, <11, 'f'>, <13, 'g'>, <15, 'h'>]

Note that xfile() is lazy and it returns strings, so if the file is large it
doesn't build a huge array of lines.

If you want there is an xmap() too that is lazy, and allows you the lazy
printing too.

The putr() is able to print any struct too, putting it inside <...>.

cinterval gives a (closed on the right too) interval of chars.

std.conv.toInt() is stupid, it's not even able to strip spaces by itself... so
I've used the (quite slow) std.string.stripr().

The ~ after the xrange to chain those sequences is possible because all lazy
iterators (like xrange) support the xchain protocol, but you can use the
xchain() thing by itself too, for example if you want to chain two arrays, or
you can add the Chainable mixin to your iterable classes.

There are full ddocs with examples and full coverage unittests for everything
that's present in the lib.

Bye,
bearophile
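
For comparison, the joined-sequence and zipping examples can be approximated with a current Phobos alone. This is a sketch under that assumption, not a port of the library above: `iota`, `chain` and `zip` from `std.range` stand in for `xrange`/`range`, `~` and `zip`, and a string literal stands in for `cinterval('b', 'h')`.

```d
import std.algorithm : equal, map;
import std.range : chain, iota, walkLength, zip;

void main()
{
    // Map over two joined sequences: 0 .. 5 followed by [1, 2, 3].
    auto negated = chain(iota(6), [1, 2, 3]).map!(x => -x);
    assert(negated.equal([0, -1, -2, -3, -4, -5, -1, -2, -3]));

    // Pair 3, 5, 7, ... with 'b' .. 'h'; zip stops at the shorter input.
    auto pairs = zip(iota(3, 30, 2), "bcdefgh");
    assert(pairs.walkLength == 7);
    assert(pairs.front[0] == 3 && pairs.front[1] == 'b');
}
```

Both `chain` and `zip` are lazy, so no intermediate array is built.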
May 14 2008
"Nick Sabalausky" <a a.a> writes:
"downs" <default_357-line yahoo.de> wrote in message 
news:g0f48g$19jh$1 digitalmars.com...
 Dee Girl wrote:
 There is a file data.txt with numbers:

 345
 5467
 45
 238
 ...

 And I want to load into an array of uint like this.

 auto sizes = map!
     (to!(uint, string))
     (compose!(split, q{cast(string) std.file.read(a)})("data.txt"));

 It works but cast is always bad ^_^.
 Where did you get that idea?
 In this case, basically, we know the file is really text. So we put this knowledge into code by assuring the compiler that yes, the arbitrary data you've just read _is_ text after all.
 In any case, it's just a reinterpreting cast. It doesn't change any actual data. So no time lost.
 
 Also, just fyi, here's how that code would look like in tools.functional:
 
 gentoo-pc ~ $ cat test2.d; echo -----; rebuild test2.d -oftest2 && ./test2
 module test2;
 import std.stdio, std.file, tools.functional, std.string: split, atoi;
 void main() {
   auto sizes = (cast(string) "test.txt".read()).split() /map/ &atoi;
   writefln(sizes);
 }
 -----
 [345,5467,45,238]
 
 I still prefer infix ^^
Where is "tools.functional" from? I don't see it in phobos, tango, cashew, dsource, wiki4d, google, or the D newsgroups.
May 14 2008
BCS <ao pathlink.com> writes:
Reply to Nick,
 Where is "tools.functional" from? I don't see it in phobos, tango,
 cashew, dsource, wiki4d, google, or the D newsgroups.
it's in scrapple: http://dsource.org/projects/scrapple/browser/trunk/tools/tools http://svn.dsource.org/projects/scrapple/trunk/tools/tools
May 14 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
downs wrote:

 
 Where did you get that idea?
 In this case, basically, we know the file is really text. So we put this knowledge into code by assuring the compiler that yes, the arbitrary data you've just read _is_ text after all.
 In any case, it's just a reinterpreting cast. It doesn't change any actual data. So no time lost.
 
 Also, just fyi, here's how that code would look like in tools.functional:
 
 gentoo-pc ~ $ cat test2.d; echo -----; rebuild test2.d -oftest2 && ./test2
 module test2;
 import std.stdio, std.file, tools.functional, std.string: split, atoi;
 void main() {
   auto sizes = (cast(string) "test.txt".read()).split() /map/ &atoi;
   writefln(sizes);
 }
 -----
 [345,5467,45,238]
 
 
 I still prefer infix ^^
 
  --downs
Once again I see that wicked operator overloading. Am I the only one to find this code... ingenious but distasteful?
Isn't this nicer?:

  auto sizes = (cast(string) "test.txt".read()).split().map(&atoi);

-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jun 10 2008