digitalmars.D.learn - array trouble

"Jan Hanselaer" <jan.hanselaer gmail.com> writes:
Hi all,

When looking into the documentation on arrays, I found something that 
appeared rather "buggy" to me.
The documentation gives this example to show how changing the size of one 
array can affect the contents of another, namely when one is a slice of the 
other (here b is a slice of a):

char[] a = new char[20];
char[] b = a[0..10];
char[] c = a[10..20];

b.length = 15; // always resized in place because it is sliced
                    // from a[] which has enough memory for 15 chars

Now I have 2 questions:
1) Why do I get an error when I try to print all the chars from b?
(Error: 4invalid UTF-8 sequence)
I would expect the new elements 10..15 of b to get the default 
char, and then there would be no error.

2) According to the documentation, array a should be affected as well, but 
that doesn't seem to be true because I can still print the 20 original 
chars.

To make it a little stranger, when I try all this with ints instead 
of chars, it works as the documentation says. The new elements of 
b get the default int 0, and when I print the 3 arrays, I see they are all 
affected by the change of length in b.

Can anyone explain all this? Am I just overlooking something?
My test code for char and int is in the attachment.

Thanks in advance
Jan 


[Attachments: array_chars.d, array_ints.d]
May 01 2007
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Jan Hanselaer wrote:
 Hi all,
 
 When looking into the documentation on arrays, I found something that 
 appeared rather "buggy" to me.
 The documentation gives this example to show how changing the size of one 
 array can affect the contents of another, namely when one is a slice of the 
 other (here b is a slice of a):
 
 char[] a = new char[20];
 char[] b = a[0..10];
 char[] c = a[10..20];
 
 b.length = 15; // always resized in place because it is sliced
                     // from a[] which has enough memory for 15 chars
Correct.
 Now I have 2 questions:
 1) Why do I get an error when I try to print all the chars from b?
 (Error: 4invalid UTF-8 sequence)
 I would expect the new elements 10..15 of b to get the default 
 char, and then there would be no error.
They *do* get the default character. The default character, however, has the
value 0xFF, which is an invalid byte in UTF-8 (the encoding that char[]s
store). This is by design, to force you to initialize your data properly.
Similar values are used for wchar and dchar.
The same reasoning is applied to floating-point variables, by the way (float,
double, real and their imaginary and complex variants). They are initialized
to a special value called NaN (Not a Number), which propagates: any
calculation whose outcome depends on it also yields NaN.
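The default values described above can be checked directly. The following is a minimal sketch, assuming a current D2 compiler; the helper name allDefault is made up for this illustration and is not part of any library.

```d
import std.stdio : writefln;

/// Hypothetical helper: true if every element still holds char.init (0xFF).
bool allDefault(const(char)[] s)
{
    foreach (ch; s)
        if (ch != char.init)
            return false;
    return true;
}

void main()
{
    char c;     // no initializer, so each variable gets its type's .init value
    wchar w;
    dchar d;
    float f;

    assert(c == 0xFF);          // not a valid UTF-8 byte
    assert(w == 0xFFFF);
    assert(d == 0x0000FFFF);
    assert(f != f);             // NaN is the only value not equal to itself
    assert((f + 1) != (f + 1)); // NaN propagates through arithmetic

    // Elements created by 'new' and by growing .length are default-initialized too.
    char[] b = new char[10];
    b.length = 15;
    assert(allDefault(b));

    writefln("char.init = 0x%02X", cast(ubyte) char.init);
}
```

Printing such an array as text is what triggers the invalid UTF-8 error, because the output routine tries to interpret the 0xFF bytes as UTF-8.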
 2) According to the documentation, array a should be affected as well, but 
 that doesn't seem to be true because I can still print the 20 original 
 chars.
Your attached code is flawed:
---
a = "abcdefghijklmnopqrstuv";
---
doesn't allocate the string on the heap; it makes 'a' refer to a
statically-allocated string. Arrays not allocated on the heap can't be
resized in place (to a larger length, anyway). Change it to
---
a = "abcdefghijklmnopqrstuv".dup;
---
to explicitly allocate on the heap.
You can change your "%s" format strings to "%x" to see the hexadecimal values
of the characters in the string.
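As a rough sketch of the .dup suggestion, assuming a current D2 compiler: the function sliceAndGrow is a made-up name for this illustration. Note two differences from the 2007 compiler the thread is about. In D2, string literals are immutable, so assigning one directly to a char[] does not compile and .dup is required anyway. Also, a D2 runtime may move the grown slice to a new block instead of resizing it in place, so the sketch only prints that outcome rather than asserting it.

```d
import std.stdio : writef, writefln, writeln;

/// Hypothetical helper: takes src[0 .. n] and grows that slice to newLen.
char[] sliceAndGrow(char[] src, size_t n, size_t newLen)
{
    char[] b = src[0 .. n];
    b.length = newLen;   // added elements are set to char.init (0xFF)
    return b;
}

void main()
{
    string lit = "abcdefghijklmnopqrstuv";
    char[] a = lit.dup;          // mutable copy allocated on the heap
    a[0] = 'A';
    assert(lit[0] == 'a');       // the literal itself is untouched

    char[] b = sliceAndGrow(a, 10, 15);
    assert(b.length == 15);
    assert(b[0 .. 10] == a[0 .. 10]);
    foreach (ch; b[10 .. 15])
        assert(ch == char.init);

    // Whether the resize happened in place depends on the compiler and runtime.
    writefln("resized in place: %s", b.ptr is a.ptr);

    // Print bytes as hex instead of text to avoid the invalid UTF-8 error.
    foreach (ch; b)
        writef("%02x ", cast(ubyte) ch);
    writeln();
}
```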
 To make it a little stranger, when I try all this with ints instead 
 of chars, it works as the documentation says. The new elements of 
 b get the default int 0, and when I print the 3 arrays, I see they are all 
 affected by the change of length in b.
Your int code uses an array literal instead of a string literal (obviously, since those can't be used for int arrays), so it allocates on the heap.
 Can anyone explain all this? Am I just overlooking something?
 My test code for char and int is in the attachment.
See above.
May 01 2007