digitalmars.D.learn - Associative array issue
- Igor Kolesnik (53/53) Jan 23 2013 Hi;
- H. S. Teoh (24/44) Jan 23 2013 [...]
- Igor Kolesnik (2/10) Jan 23 2013 This makes sense. Thanks a lot!
Hi;
I'm trying to run an example from the tutorial on
http://www.informit.com/articles/article.aspx?p=1381876&seqNum=4
Here is the code
import std.stdio, std.string;
void main() {
    uint[string] dic;
    foreach (line; stdin.byLine) {
        string[] words = cast(string[])split(strip(line));
        foreach (word; words) {
            if (word in dic)
                continue;
            uint id = dic.length;
            dic[word] = id;
            writeln(id, '\t', word);
        }
    }
    //foreach (k,v; dic)
    //    writeln(k, '|', v);
}
When run, it behaves strangely. Here is an example of the
input/output I get:
the type of array
0 the
1 type
2 of
3 array
in d the type of array
4 in
5 d
6 the
7 type
8 of
9 array
It seems like 'word in dic' doesn't find the item in the
array.
If I print the contents of the 'dic' array on exit, I get the
following:
d|5
in |0
e of |3
in|4
the|6
array|9
the|1
type|7
ty|2
of|8
Can someone help me understand what is going wrong? Am I missing
something here?
PS: dmd32 v2.061 on Win7 x64
Sincerely,
Igor
Jan 23 2013
On Wed, Jan 23, 2013 at 09:07:24PM +0100, Igor Kolesnik wrote:
[...]
This is a known issue with stdin.byLine: it is a transient range (that
means it reuses the same buffer for each line read from the input). The
problem with this is that split returns slices of the line, which
ultimately refer back to the data in the buffer. But by the time the
next line is read, that data has been overwritten. That's why the keys
in the associative array are messed up.
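To see the effect without stdin, here is a minimal sketch that fakes the
buffer reuse by hand. The buffer "buf" and the helper foundAfterOverwrite
are made up for illustration; byLine's real internals are not shown here.
Note that writing through memory that has been cast to immutable is
technically undefined behavior, which is exactly why the cast is dangerous.

```d
import std.stdio;

// Hypothetical helper: reports whether "the" is still found after the
// buffer backing its key has been overwritten.
bool foundAfterOverwrite() {
    char[] buf = "the type".dup;   // mutable buffer, standing in for byLine's
    uint[string] dic;

    // Same unsafe cast as in the original program: the key is a slice of buf.
    string key = cast(string) buf[0 .. 3];
    dic[key] = 0;

    buf[0 .. 3] = "in ";           // simulate the next line overwriting the buffer

    // The stored key now reads "in ", so looking up "the" fails.
    return ("the" in dic) !is null;
}

void main() {
    writeln(foundAfterOverwrite());
}
```

The entry is still in the array, but its key has changed underneath it,
which matches the odd keys like "in " in the dump above.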
There's a slight hint of this problem in your code, in the line that
starts with "string[] words = cast(string[])..." -- in normal D code, you
should not need to perform this kind of casting. In this case, it is an
unsafe operation, because string is immutable(char)[], but the reused
buffer returned by byLine is *not* immutable, so by casting it to
immutable, you've inadvertently introduced yourself to the buffer reuse
issue in byLine. :)
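The type relationship can be checked at compile time. This is only a small
sketch; the variable names are made up for illustration.

```d
void main() {
    char[] line = "the type".dup;   // mutable, like the elements byLine yields

    // A mutable char[] does not implicitly convert to string,
    // which is why the original code needed a cast to compile.
    static assert(!is(char[] : string));

    // idup returns a fresh immutable copy, so no cast is needed.
    static assert(is(typeof(line.idup) == string));

    string copy = line.idup;
    line[0] = 'T';                  // changing the buffer does not affect the copy
    assert(copy == "the type");
}
```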
The correct way to write that line is:
string[] words = split(strip(line.idup));
which will copy the buffer, thereby ensuring it's safe to keep slices of
it in your associative array, and also return the correct type so that
no cast is necessary.
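Put together, the whole program might look like the sketch below. The
helper addWords is a made-up name, used only so the loop body can be tried
without stdin. The cast(uint) on dic.length is an extra assumption for
64-bit builds, where length is wider than uint; it was not needed on dmd32.

```d
import std.array, std.stdio, std.string;

// Hypothetical helper: gives each new word the next free id.
void addWords(ref uint[string] dic, const(char)[] line) {
    // idup copies the line, so the slices stored as keys stay valid.
    string[] words = split(strip(line.idup));
    foreach (word; words) {
        if (word in dic)
            continue;
        uint id = cast(uint) dic.length;   // length is size_t; narrow explicitly
        dic[word] = id;
        writeln(id, '\t', word);
    }
}

void main() {
    uint[string] dic;
    foreach (line; stdin.byLine)
        addWords(dic, line);
}
```

With this version, repeated words on later lines should be skipped rather
than given new ids.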
T
--
Notwithstanding the eloquent discontent that you have just respectfully
expressed at length against my verbal capabilities, I am afraid that I must
unfortunately bring it to your attention that I am, in fact, NOT verbose.
Jan 23 2013
This makes sense. Thanks a lot!
Igor
Jan 23 2013