digitalmars.D.learn - Same process to different results?

"Taylor Hillegeist" <taylorh140 gmail.com> writes:
When I run the code (compiled on DMD 2.067.1):


------------------------------------------------------
import std.algorithm;
import std.stdio;
import std.range;

string A="AaA";
string B="BbBb";
string C="CcCcC";

void main(){
   int L=25;

   int seg1len=(L-B.length)/2;
   int seg2len=B.length;
   int seg3len=L-seg1len-seg2len;

   (A.cycle.take(seg1len).array
   ~B.cycle.take(seg2len).array
   ~C.cycle.take(seg3len).array).writeln;

   string q = cast(string)
   (A.cycle.take(seg1len).array
   ~B.cycle.take(seg2len).array
   ~C.cycle.take(seg3len).array);

   q.writeln;

}
-----------------------------------------------

I get a weird result of
AaAAaAAaAABbBbCcCcCCcCcCC
A   a   A   A   a   A   A   a   A   A   B   b   B   b   C   c   C   c   C   C   c   C   c   C   C

Any ideas why?
Jul 01 2015
"Adam D. Ruppe" <destructionator gmail.com> writes:
I betcha it is because A, B, and C are modified by the first 
pass. A lot of the range functions consume their input.
Jul 01 2015
"Taylor Hillegeist" <taylorh140 gmail.com> writes:
On Wednesday, 1 July 2015 at 17:06:01 UTC, Adam D. Ruppe wrote:
> I betcha it is because A, B, and C are modified by the first
> pass. A lot of the range functions consume their input.

Running them one at a time produces the same result. For some reason:

   (A.cycle.take(seg1len).array
   ~B.cycle.take(seg2len).array
   ~C.cycle.take(seg3len).array).writeln;

is different from:

   string q = cast(string)
   (A.cycle.take(seg1len).array
   ~B.cycle.take(seg2len).array
   ~C.cycle.take(seg3len).array);
   q.writeln;

I was wondering if it might be the cast?
Jul 01 2015
"anonymous" <anonymous example.com> writes:
On Wednesday, 1 July 2015 at 17:13:03 UTC, Taylor Hillegeist wrote:
>   string q = cast(string)
>   (A.cycle.take(seg1len).array
>   ~B.cycle.take(seg2len).array
>   ~C.cycle.take(seg3len).array);
>   q.writeln;
>
> I was wondering if it might be the cast?

Yes, the cast is wrong. You're reinterpreting (not converting) an array
of `dchar`s (UTF-32 code units) as an array of `char`s (UTF-8 code
units).

If you print the numeric values of the string, e.g. via
std.string.representation, you can see that every actual character has
three null bytes following it:

----
import std.string: representation;
writeln(q.representation);
----
[65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 66, 0, 0, 0, 98, 0, 0, 0, 66, 0, 0, 0, 98, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0]
----

Use std.conv.to for less surprising conversions. And don't use casts
unless you know exactly what you're doing.
Jul 01 2015
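
A minimal sketch of the difference between reinterpreting and converting, as described in the reply above. The helper names `reinterpret` and `convert` are illustrative only and do not come from the thread; the element count of 5 is arbitrary.

```d
import std.array : array;
import std.conv : to;
import std.range : cycle, take;
import std.stdio : writeln;

// Reinterpret: the same bytes, viewed as UTF-8 code units.
// This is what cast(string) does to an array of dchar.
string reinterpret(const(dchar)[] wide)
{
    return cast(string) wide;
}

// Convert: transcode the UTF-32 code units to UTF-8.
string convert(const(dchar)[] wide)
{
    return wide.to!string;
}

void main()
{
    // Autodecoding makes the elements dchar, so array() yields dchars.
    auto wide = "AaA".cycle.take(5).array;

    writeln(reinterpret(wide).length); // prints 20: four code units per dchar
    writeln(convert(wide));            // prints AaAAa
}
```

With the cast, the length quadruples and the extra bytes are the zeros shown in the `representation` dump; with `to!string` the text survives intact.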
"Taylor Hillegeist" <taylorh140 gmail.com> writes:
On Wednesday, 1 July 2015 at 17:00:51 UTC, Taylor Hillegeist wrote:
> [...]
> Any ideas why?

Some way or another the type was converted to a dchar[] during this
process:

   A.cycle.take(seg1len).array
   ~B.cycle.take(seg2len).array
   ~C.cycle.take(seg3len).array

Why would it change the type so sneakily? Except maybe it's the default
behavior with string, because 32 bits => (typically) one grapheme? I bet
cycle did this.
Jul 01 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/1/15 1:00 PM, Taylor Hillegeist wrote:
> [...]
> Any ideas why?

Schizophrenia of Phobos.

Phobos thinks a string is a range of dchar instead of a range of char.
So what cycle, take, and array all output are dchar ranges and arrays.

When you cast the dchar[] result to a string, (which is a char[]), it
then treats all the 0's in each dchar element as '\0', printing a blank
apparently.

-Steve
Jul 01 2015
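
The point about Phobos treating a string as a range of dchar can be checked at compile time. This is only a sketch, assuming a Phobos with autodecoding in place (as in the release used in this thread); `decodedLength` is a made-up helper name, and only the `cycle`/`take`/`array` chain is taken from the original code.

```d
import std.array : array;
import std.range : ElementType, cycle, take;

// The storage type of a string is an array of UTF-8 code units...
static assert(is(string == immutable(char)[]));
static assert(char.sizeof == 1 && dchar.sizeof == 4);

// ...but the range primitives decode it, so the element type is dchar.
static assert(is(ElementType!string == dchar));

// Counts the elements array() collects from the decoded range.
// s must be non-empty, since cycle cannot repeat an empty range.
size_t decodedLength(string s, size_t n)
{
    auto result = s.cycle.take(n).array;
    // The collected array holds UTF-32 code units, not chars.
    static assert(is(typeof(result) : const(dchar)[]));
    return result.length;
}

void main()
{
    assert(decodedLength("AaA", 5) == 5);
}
```

Because each element is four bytes wide, a cast to string exposes four code units per character instead of one.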
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/1/15 1:44 PM, Steven Schveighoffer wrote:

> Schizophrenia of Phobos.
>
> Phobos thinks a string is a range of dchar instead of a range of char.
> So what cycle, take, and array all output are dchar ranges and arrays.
>
> When you cast the dchar[] result to a string, (which is a char[]), it
> then treats all the 0's in each dchar element as '\0', printing a blank
> apparently.

This has to be one of the most obvious cases I've ever seen that Phobos
treating string as a range of dchar was the wrong decision. That one
can't use ranges to make a new string is ridiculous. Just the thought of
"fixing" this by re-encoding...

-Steve
Jul 01 2015
"H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Wed, Jul 01, 2015 at 02:14:49PM -0400, Steven Schveighoffer via
Digitalmars-d-learn wrote:
> On 7/1/15 1:44 PM, Steven Schveighoffer wrote:
>
> > Schizophrenia of Phobos.
> >
> > Phobos thinks a string is a range of dchar instead of a range of
> > char.  So what cycle, take, and array all output are dchar ranges and
> > arrays.
> >
> > When you cast the dchar[] result to a string, (which is a char[]), it
> > then treats all the 0's in each dchar element as '\0', printing a
> > blank apparently.
>
> This has to be one of the most obvious cases I've ever seen that phobos
> treating string as a range of dchar was the wrong decision. That one
> can't use ranges to make a new string is ridiculous. Just the thought
> of "fixing" this by re-encoding...
[...]

Yeah, although Andrei has vetoed all suggestions of getting rid of
autodecoding, this is one of the glaring cases where it's obviously a
bad idea. It almost makes me want to create my own custom string type
that serves up char instead of dchar.


T

-- 
There are four kinds of lies: lies, damn lies, and statistics.
Jul 01 2015
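
Short of a custom string type, one way to get a range that serves up char is `std.utf.byCodeUnit`, assuming a Phobos recent enough to ship it. This is a hedged sketch, not something proposed in the thread: `repeatTo` is a hypothetical helper, and it counts code units, so with non-ASCII input it could cut a multi-byte sequence in half.

```d
import std.array : appender;
import std.range : ElementType, cycle, take;
import std.traits : Unqual;
import std.utf : byCodeUnit;

// Hypothetical helper: repeat s until n code units have been produced.
// s must be non-empty, since cycle cannot repeat an empty range.
string repeatTo(string s, size_t n)
{
    auto units = s.byCodeUnit.cycle.take(n);
    // No autodecoding here: the elements are chars, not dchars.
    static assert(is(Unqual!(ElementType!(typeof(units))) == char));

    // Collect the code units into a string without any cast.
    auto buf = appender!string();
    foreach (c; units)
        buf.put(c);
    return buf.data;
}

void main()
{
    import std.stdio : writeln;

    // Same segment lengths as the original post: 10, 4 and 11.
    writeln(repeatTo("AaA", 10) ~ repeatTo("BbBb", 4) ~ repeatTo("CcCcC", 11));
}
```

Since every piece is already a string, plain `~` concatenation works and no cast or re-encoding step is needed.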