digitalmars.D.learn - char[][] join ==> string

bearophile <bearophileHUGS lycos.com> writes:
Given an array of strings, std.string.join() returns a single string:

import std.string;
void main() {
    string[] a1 = ["hello", "red"];
    string j1 = join(a1, " "); // OK
}


But in a program I need an array of mutable arrays of chars. If I join the
arrays I get a mutable array of chars. But I need a string:

import std.string;
void main() {
    char[][] a2 = ["hello".dup, "red".dup];
    string j2 = join(a2, " "); // error
}

Error: cannot implicitly convert expression (join(a," ")) of type char[] to
string

.idup avoids the error:

string j3 = join(a2, " ").idup; // OK

Given the low efficiency of the D GC it's better to reduce memory allocations
as much as possible.
Here join() creates a brand new array, so idup performs a useless copy. To
avoid this extra copy do I have to write another joinString() function?

Bye,
bearophile
Apr 06 2011
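
One way the joinString() function asked about above could look is sketched here. This is only an illustration under assumptions: the name joinString comes from the question, but the signature and body are hypothetical and not part of Phobos. It measures the total length first, allocates a single buffer, fills it, and then hands the buffer out as a string through std.exception.assumeUnique, which is reasonable here only because no other reference to the buffer leaves the function.

```d
import std.exception : assumeUnique;

// Hypothetical helper (not in Phobos): joins the parts with a separator
// using a single allocation and returns the result as a string.
string joinString(const(char[])[] parts, const(char)[] sep)
{
    if (parts.length == 0)
        return "";

    // Compute the final length up front so only one buffer is allocated.
    size_t total = sep.length * (parts.length - 1);
    foreach (p; parts)
        total += p.length;

    auto buf = new char[total];
    size_t pos = 0;
    foreach (i, p; parts)
    {
        if (i > 0)
        {
            buf[pos .. pos + sep.length] = sep[];
            pos += sep.length;
        }
        buf[pos .. pos + p.length] = p[];
        pos += p.length;
    }

    // buf is local and never escapes in mutable form,
    // so treating it as immutable from here on is sound.
    return assumeUnique(buf);
}

void main()
{
    char[][] a2 = ["hello".dup, "red".dup];
    string j = joinString(a2, " ");
    assert(j == "hello red");
}
```

The trade-off is that the helper duplicates logic join() already has; its only gain is skipping the second copy that .idup would make.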
Ali Çehreli <acehreli yahoo.com> writes:
On 04/06/2011 05:13 PM, bearophile wrote:

> But in a program I need an array of mutable arrays of chars. If I
> join the arrays I get a mutable array of chars. But I need a string:

Tangentially off-topic: This does not apply to your case, but I think we should think twice before deciding that we need a string. For example, function parameters should be const(char[]) (or const(char)[]) instead of string, as that type accepts both mutable and immutable strings.

> import std.string;
> void main() {
>     char[][] a2 = ["hello".dup, "red".dup];
>     string j2 = join(a2, " "); // error

If possible, this might work:

const(char[]) j2 = join(a2, " ");

There is also std.exception.assumeUnique, but it's too eager to be safe and tries to null its parameter, and this fails:

string j2 = assumeUnique(join(a2, " ")); // error

Finally, casting ourselves works:

string j2 = cast(string)join(a2, " ");

Ali
Apr 06 2011
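
Ali's remark about parameter types can be illustrated with a small sketch. The function countSpaces below is a hypothetical example, not something from the thread; it only shows that a const(char)[] parameter accepts a string and a char[] alike, with no .dup or .idup at the call site.

```d
// Hypothetical helper for illustration only.
// const(char)[] promises not to modify the characters,
// so both mutable and immutable arrays can be passed.
size_t countSpaces(const(char)[] text)
{
    size_t n = 0;
    foreach (char c; text)
        if (c == ' ')
            ++n;
    return n;
}

void main()
{
    string immutableText = "hello red";
    char[] mutableText = "hello red".dup;

    assert(countSpaces(immutableText) == 1); // string converts implicitly
    assert(countSpaces(mutableText) == 1);   // char[] converts implicitly
}
```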
spir <denis.spir gmail.com> writes:
On 04/07/2011 03:07 AM, Ali Çehreli wrote:

> Finally, casting ourselves works:
>
> string j2 = cast(string)join(a2, " ");

Oh, that's very good news! Thanks Ali, I never thought of that solution. I'm often i/dup-ing from/to string to manipulate text, since there is no automatic conversion. cast() works in place, doesn't it? So this is supposed to avoid the copy.

PS: Checked: indeed, it works in place. But watch the gotcha:

import std.stdio : writeln;

unittest {
    string s = "abc";
    char[] chars = cast(char[])s;
    chars ~= "de";
    s = cast(string) chars;
    writeln(s, ' ', chars); // abcde abcde

    chars[1] = 'z';
    writeln(s, ' ', chars); // azcde azcde
}

s's chars are mutable ;-) So, I guess there is /really/ no reason for implicit casts between char[] and string not to exist. (I assumed the reason was precisely to avoid such traps.)

Denis
--
_________________
vita es estrany
spir.wikidot.com
Apr 07 2011
spir <denis.spir gmail.com> writes:
On 04/07/2011 09:52 AM, spir wrote:

> unittest {
>     string s = "abc";
>     char[] chars = cast(char[])s;
>     chars ~= "de";
>     s = cast(string) chars;
>     writeln(s, ' ', chars); // abcde abcde

Sorry, I forgot this line:

assert(s.ptr == chars.ptr); // pass

>     chars[1] = 'z';
>     writeln(s, ' ', chars); // azcde azcde
> }

Denis
--
_________________
vita es estrany
spir.wikidot.com
Apr 07 2011
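
The pointer check spir mentions can be turned into a small comparison of cast versus .idup. This is a sketch for illustration; the helper name sharesMemory is hypothetical. It deliberately stops before writing through the mutable reference, since modifying data that is also visible as immutable is undefined in D.

```d
// Hypothetical helper: true when both slices start at the same address.
bool sharesMemory(const(char)[] a, const(char)[] b)
{
    return a.ptr == b.ptr;
}

void main()
{
    char[] chars = "abc".dup;            // freshly allocated, mutable

    string viaCast = cast(string) chars; // reinterprets the same memory
    string viaIdup = chars.idup;         // allocates an independent copy

    assert(sharesMemory(viaCast, chars));  // the cast made no copy
    assert(!sharesMemory(viaIdup, chars)); // idup did

    // From here on, writing through `chars` would change what `viaCast`
    // claims is immutable. That is the trap described in the post above.
}
```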
spir <denis.spir gmail.com> writes:
On 04/07/2011 09:52 AM, spir wrote:

> s's chars are mutable ;-) So, I guess there is /really/ no reason for implicit
> casts between char[] and string not to exist. (I assumed the reason was
> precisely to avoid such traps.)

After some more thought, I guess it's better to leave things as they are. We have a way to cast without copying, which solves one issue perfectly. The other issue, typing, is small enough to live with, since it also serves as a warning to the programmer about the above trap.

What should definitely be done is teaching this idiom in all relevant places of the reference, manuals, and tutorials: while this issue is often raised on the D lists, I had never read about it (nor thought of it myself).

Questions: did you know this idiom? If yes, did you find it yourself or read about it? If the latter, where?

Denis
--
_________________
vita es estrany
spir.wikidot.com
Apr 07 2011
Ali Çehreli <acehreli yahoo.com> writes:
On 04/07/2011 01:04 AM, spir wrote:

> Questions: did you know this idiom? if yes, have you found it yourself
> or read about it? if the latter, where?

I had heard about assumeUnique() a couple of times in these forums. I remember looking at its implementation.

Ali
Apr 07 2011
Jesse Phillips <jessekphillips+D gmail.com> writes:
spir Wrote:

> unittest {
>     string s = "abc";
>     char[] chars = cast(char[])s;
>     chars ~= "de";
>     s = cast(string) chars;
>     writeln(s, ' ', chars); // abcde abcde
>
>     chars[1] = 'z';
>     writeln(s, ' ', chars); // azcde azcde
> }
>
> s's chars are mutable ;-) [...]
>
> Questions: did you know this idiom? if yes, have you found it yourself
> or read about it? if the latter, where?

Casting to and from string/char[] is very dangerous, even through assumeUnique. assumeUnique is intended for returning mutable data as immutable from a function. Casting is often a no-op for the CPU and, as you discovered, removes any safety provided by the type system.

While modifying immutable data is undefined behavior, if you can guarantee that the data truly is mutable and that the compiler won't be optimizing on the assumption of immutability, it is perfectly safe. It is just in the hands of the programmer now.
Apr 07 2011
bearophile <bearophileHUGS lycos.com> writes:
Jesse Phillips:

> Casting to and from string/char[] is very dangerous, even through
> assumeUnique. AssumeUnique is intended to be used for returning a mutable as
> immutable from a function. Casting is often a no-op for the CPU and as you
> discovered removes any safety provided by the type system.

A more expressive type system (uniqueness annotations, lending, linear types) allows writing that code in a safe way, but introduces some new complexities too.

The GHC Haskell compiler has some experimental extensions (mostly to its type system) that are disabled by default (you need to add an annotation to your programs to switch each extension on), to experiment with and debug/engineer ideas like that. I presume Walter is too busy to do this in D, but I'd like something similar. On the other hand, we'll probably have a Phobos package for "experimental" standard modules.

Bye,
bearophile
Apr 07 2011
"Simen kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 07 Apr 2011 02:13:16 +0200, bearophile <bearophileHUGS lycos.com> wrote:

> Here join() creates a brand new array, so idup performs a useless copy.
> To avoid this extra copy do I have to write another joinString()
> function?

Isn't this a prime case for std.exception.assumeUnique?

--
Simen
Apr 07 2011
Ali Çehreli <acehreli yahoo.com> writes:
On 04/07/2011 09:01 AM, Simen kjaeraas wrote:

> Isn't this a prime case for std.exception.assumeUnique?

Almost. assumeUnique is too eager and tries to null its reference parameter. Copying from std/exception.d:

immutable(T)[] assumeUnique(T)(ref T[] array) pure nothrow
{
    auto result = cast(immutable(T)[]) array;
    array = null;
    return result;
}

And that fails, as join's return value is not an lvalue.

We need a simplyAssumeUnique() that doesn't null the reference parameter :); and it would have no value over casting other than communicating the intent.

Ali
Apr 07 2011
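
Two workarounds follow from Ali's last post and are sketched below under assumptions. The first stores join's result in a local variable so that assumeUnique receives an lvalue, which works with the ref signature quoted above. The second is a guess at what the simplyAssumeUnique() he describes might look like; the name is his, but the body here is hypothetical and not part of Phobos. The sketch imports join from std.array, where it lives in current Phobos.

```d
import std.array : join;
import std.exception : assumeUnique;

// Hypothetical helper matching Ali's description: the same cast as
// assumeUnique, but taking its argument by value and nulling nothing.
// It adds nothing over a plain cast except stating the intent.
immutable(T)[] simplyAssumeUnique(T)(T[] array) pure nothrow
{
    return cast(immutable(T)[]) array;
}

void main()
{
    char[][] a2 = ["hello".dup, "red".dup];

    // Workaround 1: give assumeUnique an lvalue to bind to.
    char[] tmp = join(a2, " ");
    string j1 = assumeUnique(tmp);
    assert(j1 == "hello red");

    // Workaround 2: pass join's temporary result straight through.
    string j2 = simplyAssumeUnique(join(a2, " "));
    assert(j2 == "hello red");
}
```

Both variants rely on the caller knowing that join returned a freshly allocated array that nothing else references; the type system does not check that.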