digitalmars.D.learn - Associative Array - where's my error?
- Adrian Mercieca (71/71) Apr 27 2011 Hi,
- Jesse Phillips (6/32) Apr 27 2011 Quite simple your error is in using cast(string). Solution use to!string...
- bearophile (4/5) Apr 28 2011 You have written a nice explanation of the situation, and indeed if you ...
- Jesse Phillips (5/9) Apr 28 2011 Thank you.
- bearophile (4/5) Apr 28 2011 I don't agree. If you perform a cast of the dup-ed char array to a strin...
- Jesse Phillips (2/10) Apr 28 2011 _This_ code wouldn't not have problems, that doesn't mean code won't be ...
- Andrej Mitrovic (8/8) Apr 28 2011 There's more issues with string assignments. For example see this:
- Jesse Phillips (2/11) Apr 28 2011
- Andrej Mitrovic (1/1) Apr 28 2011 Thanks. I'm glad it wasn't a language implementation issue.
- Adrian Mercieca (7/7) Apr 28 2011 Hi everyone,
Hi, I am currently learning D, but have ran into a problem. I wrote this simple program: import std.stdio; import std.string; import std.conv; void main() { int[string] dataMap; foreach(line; stdin.byLine() ) { auto s = line.split("\t"); auto anum = to!int(s[0]); auto aname = cast(string)s[1]; dataMap[aname] = anum; } foreach(s; dataMap.keys.sort) { writeln(s); } } The code is in a file named 'rl.d'. GNU/Linux) The compiler flags used are: dmd -w -wi -O -inline -noboundscheck -release rl.d. I invoke the program from the command line with this command: cat data.txt | ./rl Basically, this is piping the contents of the data.txt file (below) into the program. data.txt contents: 1 ABC 2 DEF 3 GHI 4 JKL 5 MNO 6 PQR 7 STU 8 VWX 9 YZ There are two 'fields' per line, a number and a string separated by a tab. Now, when I run the program, I get this output: YZ YZ YZ YZ YZ YZ YZ YZ YZ Which is nowhere close to what I was expecting. The program is supposed to read the input line by line, split each line on tab character and populate an associative array with the key being the 2nd string value and set the corresponding value to the value of the 1st field. Then it goes into a loop printing out the sorted keys of the associative array. I was expecting output like: ABC DEF GHI JKL MNO PQR STU VWX YZ but, as shown, I get something completely different and unexpected. Can anyone shed any light on this for me please? What am I doing wrong? Thanks. - Adrian.
Apr 27 2011
Adrian Mercieca Wrote:Hi, I am currently learning D, but have ran into a problem. I wrote this simple program: import std.stdio; import std.string; import std.conv; void main() { int[string] dataMap; foreach(line; stdin.byLine() ) { auto s = line.split("\t"); auto anum = to!int(s[0]); auto aname = cast(string)s[1]; dataMap[aname] = anum; } foreach(s; dataMap.keys.sort) { writeln(s); } }Quite simple your error is in using cast(string). Solution use to!string(s[1]) The explanation: byLine reuses the buffer as it iterates, this is why you see YZ, though I would expect remains of some other words in there. Associative Array keys must be immutable (so that their value is not changed by another reference). By casting to string you have told the type system "This value cannot be changed by me or any other reference to this data." But in reality you do continue to modify it because byLine still has the reference. Casting types is making statements about the type, it is not converting the type (in general). You must have inserted the cast because the compiler said s[1] wasn't immutable and figured you could "force it" into being so. In such cases the compiler is telling something is wrong and you are saying "I don't care. I know I'm right." Don't use cast, to!() is much safer. Also you could use s[1].idup instead of cast or to!().
Apr 27 2011
Jesse Phillips:Casting types is making statements about the type, it is not converting the type (in general). You must have inserted the cast because the compiler said s[1] wasn't immutable and figured you could "force it" into being so. In such cases the compiler is telling something is wrong and you are saying "I don't care. I know I'm right." Don't use cast, to!() is much safer.You have written a nice explanation of the situation, and indeed if you take care of using casts correctly, this situation is not bad. But in practice I think people will not use casts so carefully. This is why I have suggested a different design, essentially: the by line dups on default and doesn't do it on request (this difference is expressed with two different function names or a template argument boolean, etc). So when you don't know what you are doing the code works correctly. Bye, bearophile
Apr 28 2011
bearophile Wrote:You have written a nice explanation of the situation, and indeed if you take care of using casts correctly, this situation is not bad. But in practice I think people will not use casts so carefully. This is why I have suggested a different design, essentially: the by line dups on default and doesn't do it on request (this difference is expressed with two different function names or a template argument boolean, etc). So when you don't know what you are doing the code works correctly. Bye, bearophileThank you. I think this just hides the issue. To fix the issue you would need byLine to return a string, since user code my still save and modify the returned value and store it in an AA. Even then this is "fixing" one small area in Phobos. To really make this better you would need to change the semantics of cast to be what new users would gone with changing the semantics and adding new features to provide some of the because he could check for null instead of catching cast exceptions. D on the other hand left cast as a "dangerous" language feature and provided a "safe" library function. So yes new users will get it wrong. I believe the correct solution is to educate at emphasize to!() over cast in examples.
Apr 28 2011
Jesse Phillips:I think this just hides the issue. To fix the issue you would need byLine to return a string, since user code my still save and modify the returned value and store it in an AA.I don't agree. If you perform a cast of the dup-ed char array to a string you will not have problems. The main problem is the missed duplication (and the bad usage of casts in general, for people coming from C/C++). Bye, bearophile
Apr 28 2011
bearophile Wrote:Jesse Phillips:_This_ code wouldn't not have problems, that doesn't mean code won't be written that does have problems.I think this just hides the issue. To fix the issue you would need byLine to return a string, since user code my still save and modify the returned value and store it in an AA.I don't agree. If you perform a cast of the dup-ed char array to a string you will not have problems. The main problem is the missed duplication (and the bad usage of casts in general, for people coming from C/C++). Bye, bearophile
Apr 28 2011
There's more issues with string assignments. For example see this: http://d.puremagic.com/issues/show_bug.cgi?id=5059 In that example, key.name() returns a string. But "key" itself will be reused on every loop, and the string that key holds will be overwritten on each pass. I haven't gotten any comments on that bug report yet. I'm not sure if it really is a bug, but it certainly is unexpected that an immutable string gets overwritten.
Apr 28 2011
Andrej Mitrovic Wrote:There's more issues with string assignments. For example see this: http://d.puremagic.com/issues/show_bug.cgi?id=5059 In that example, key.name() returns a string. But "key" itself will be reused on every loop, and the string that key holds will be overwritten on each pass.Commented on your bug. It is because of a cast(string) inside the KeySequence's opApply. My point still stands, to!() is safer.I haven't gotten any comments on that bug report yet. I'm not sure if it really is a bug, but it certainly is unexpected that an immutable string gets overwritten.
Apr 28 2011
Thanks. I'm glad it wasn't a language implementation issue.
Apr 28 2011
Hi everyone, Ok - I think I got the gist of what you guys are saying. Thanks for the help. Coming from C/C++ and Java, there's quite a bit for me to learn. But I'll get there; so far, I am very impressed with D. Regards. - Adrian.
Apr 28 2011