
digitalmars.D.learn - Concatenation of ubyte[] to char[] works, but assignation doesn't

reply skilion <andrea.9940 gmail.com> writes:
Is this allowed by the language or is it a compiler bug?

void main() {
    char[] a = "abc".dup;
    ubyte[] b = [1, 2, 3];
    a = b;   // cannot implicitly convert expression (b) of type ubyte[] to char[]
    a ~= b;  // works
}
Oct 04 2015
parent reply Jonathan M Davis via Digitalmars-d-learn writes:
On Sunday, October 04, 2015 16:13:47 skilion via Digitalmars-d-learn wrote:
 Is this allowed by the language or is it a compiler bug?

 void main() {
     char[] a = "abc".dup;
     ubyte[] b = [1, 2, 3];
     a = b;   // cannot implicitly convert expression (b) of type ubyte[] to char[]
     a ~= b;  // works
 }
When appending b to a, the elements in b are copied onto the end of a, and it presumably works in this case because a ubyte is implicitly convertible to char. But all it's doing is converting the individual elements. It's not converting the array. On the other hand, assigning b to a would require converting the array itself, and array types don't implicitly convert to one another, even if their elements do.

Honestly, I think that the fact that the character types implicitly convert to and from the integral types of the corresponding size is problematic at best and error-prone at worst, since it almost never makes sense to do something like append a ubyte to a string. However, if it didn't work, then you'd have to do a lot more casting when you do math on characters, which would cause its own set of potential bugs. So, we're kind of screwed either way.

- Jonathan M Davis
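To spell out the distinction, here is a small sketch: `~=` converts each ubyte element to char individually, while assigning the whole array needs an explicit cast, which reinterprets the slice in place rather than copying:

```d
void main() {
    char[] a = "abc".dup;
    ubyte[] b = [1, 2, 3];

    // Appending converts element by element (ubyte -> char):
    a ~= b;
    assert(a.length == 6);
    assert(a[3] == cast(char) 1);

    // Assigning requires an explicit array cast, which reinterprets
    // the slice in place; both slices now share the same memory:
    a = cast(char[]) b;
    assert(a.length == 3);
    assert(a.ptr is cast(char*) b.ptr);
}
```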
Oct 04 2015
next sibling parent skilion <andrea.9940 gmail.com> writes:
On Sunday, 4 October 2015 at 21:57:44 UTC, Jonathan M Davis wrote:
 When appending b to a, the elements in b are copied onto 
 the end of a, and it presumably works in this case because a 
 ubyte is implicitly convertible to char. But all it's doing is 
 converting the individual elements. It's not converting the 
 array.

 ...
It makes sense now, thanks.
Oct 04 2015
prev sibling parent reply Marc Schütz <schuetzm gmx.net> writes:
On Sunday, 4 October 2015 at 21:57:44 UTC, Jonathan M Davis wrote:
 On Sunday, October 04, 2015 16:13:47 skilion via 
 Digitalmars-d-learn wrote:
 Is this allowed by the language or is it a compiler bug?

 void main() {
     char[] a = "abc".dup;
     ubyte[] b = [1, 2, 3];
     a = b;   // cannot implicitly convert expression (b) of type ubyte[] to char[]
     a ~= b;  // works
 }
 ...

 Honestly, I think that the fact that the character types 
 implicitly convert to and from the integral types of the 
 corresponding size is problematic at best and error-prone at 
 worst, since it almost never makes sense to do something like 
 append a ubyte to a string. However, if it didn't work, then 
 you'd have to do a lot more casting when you do math on 
 characters, which would cause its own set of potential bugs. So, 
 we're kind of screwed either way.
I don't think math would be a problem. There are some obvious rules that would likely just work with most existing code:

    char + int  = char
    char - int  = char
    char - char = int
    char + char = ERROR
Oct 05 2015
parent reply Jonathan M Davis via Digitalmars-d-learn writes:
parent reply Marc Schütz <schuetzm gmx.net> writes:
On Monday, 5 October 2015 at 10:30:02 UTC, Jonathan M Davis wrote:
 On Monday, October 05, 2015 09:07:34 Marc Schütz via 
 Digitalmars-d-learn wrote:
 I don't think math would be a problem. There are some obvious 
 rules that would likely just work with most existing code:

 char + int = char
 char - int = char
 char - char = int
 char + char = ERROR
 That depends on whether VRP can figure out that the result will 
 fit. Otherwise, you'd be stuck with int. As it stands, all of 
 these would just end up with int, I believe, though if they're 
 assigned to a char and the int is a constant, then VRP may kick 
 in and make a cast unnecessary.
I think Walter's argument for allowing the int <-> char conversions was that they are necessary to allow arithmetic. My rules show that it works without these implicit conversions.

VRP is a different problem, though. AFAICS, in the following code, VRP either is smart enough, or it isn't, no matter whether char implicitly converts to int:

    int diff = 'a' - 'A';
    char c = 'A';
    char d = c + diff;
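For what it's worth, current compilers behave as Jonathan describes: VRP accepts constant expressions that provably fit, but once a runtime int is involved, an explicit cast is required. A minimal sketch (the commented-out line is the one that fails to compile; the exact error text varies by compiler version):

```d
void main() {
    // VRP accepts a constant expression whose value provably fits:
    char c = 'A' + 1;          // ok: 66 fits in char
    assert(c == 'B');

    int diff = 'a' - 'A';      // 32, but a runtime int as far as VRP knows
    // char d = c + diff;      // error: cannot implicitly convert int to char
    char d = cast(char)(c + diff);
    assert(d == 'b');
}
```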
Oct 05 2015
parent reply Jonathan M Davis via Digitalmars-d-learn writes:
parent reply Marc Schütz <schuetzm gmx.net> writes:
On Tuesday, 6 October 2015 at 05:38:36 UTC, Jonathan M Davis 
wrote:
 Your suggestion only works by assuming that the result will fit 
 in a char, which doesn't fit at all with how conversions are 
 currently done in D. It would allow for narrowing conversions 
 which lose data. And there's no way that Walter would go for 
 that (and I don't think that he should). VRP solves the problem 
 insofar as it can guarantee that the result will fit in the 
 target type and thus reduces the need for casting, but simply 
 assuming that char + int will fit in a char just doesn't work 
 unless we're going to allow narrowing conversions to lose data, 
 which we aren't.

 If we were to allow the specific conversions that you're 
 suggesting but only when VRP was used, then that could work, 
 though it does make the implicit rules even screwier, because 
 it becomes very dependent on how the int that you're trying to 
 assign to a char was generated in the first place (straight 
 assignment wouldn't work, but '0' - 40 would, whereas 'a' + 500 
 wouldn't, etc.). VRP already makes it a bit funky as it is, 
 though mostly in a straightforward manner.
I see, this is a new problem introduced by `char + int = char`. But at least the following could be disallowed without introducing problems:

    int a = 'a';
    char b = 32;

But strictly speaking, we already accept overflow (i.e. loss of precision) for ints, so it's a bit inconsistent to disallow it for chars.
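Both lines in question do compile under the current rules; a quick sketch of what happens today (char widens implicitly to int, and the constant 32 passes VRP):

```d
void main() {
    int a = 'a';     // char implicitly widens to int
    char b = 32;     // constant fits in char, so VRP allows it
    assert(a == 97); // 'a' is code point 97
    assert(b == ' '); // 32 is the space character
}
```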
Oct 06 2015
next sibling parent Ola Fosheim Grøstad writes:
On Tuesday, 6 October 2015 at 09:28:29 UTC, Marc Schütz wrote:
 I see, this is a new problem introduced by `char + int = char`. 
 But at least the following could be disallowed without 
 introducing problems:

     int a = 'a';
     char b = 32;

 But strictly speaking, we already accept overflow (i.e. loss of 
 precision) for ints, so it's a bit inconsistent to disallow it 
 for chars.
Yes, D does not have overflow, it has modular arithmetic. So the same argument would hold for an enumeration (like character ranges): do you want them to be modular (a circle) or monotonic (a line)? Neither is a good fit for Unicode.

It probably would make the most sense to split the Unicode universe into multiple typed ranges, some enumerations, some non-enumerations, and avoid char altogether.
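Marc's rules can already be expressed today in library code via a wrapper type; a hypothetical sketch (the `Letter` name and operator set are invented for illustration) that allows char + int and char - char but rejects char + char at compile time:

```d
struct Letter {
    char value;

    // char + int = char: offsetting within the range
    Letter opBinary(string op : "+")(int n) const {
        return Letter(cast(char)(value + n));
    }

    // char - char = int: the distance between two characters
    int opBinary(string op : "-")(Letter rhs) const {
        return value - rhs.value;
    }
    // No Letter + Letter overload, so that is a compile error.
}

void main() {
    auto a = Letter('A');
    auto b = a + 1;
    assert(b.value == 'B');
    assert(b - a == 1);
    // auto c = a + b;  // error: no overload for Letter + Letter
}
```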
Oct 06 2015
prev sibling parent Jonathan M Davis via Digitalmars-d-learn writes: