
digitalmars.D - Strange implicit conversion of integers on concatenation

reply uranuz <neuranuz gmail.com> writes:
Hello to everyone! By mistake I typed some code like the 
following without using [std.conv: to] and got a strange result. I 
believe the following code shouldn't even compile, but it does, 
and it gives a non-printable symbol appended at the end of the string.
The same problem occurs even without [enum]; just using a 
plain integer value gives the same result. Is this a bug, or could 
someone really rely on this behaviour?

import std.stdio;

enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
}

void main()
{
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
}

Output:
Number value: 
Nov 05 2018
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
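[For illustration, a minimal sketch of the conversion described above -- the integer literal is appended as a single char (code unit), not formatted as text; output as observed with a 2018-era DMD:]

```
import std.stdio;

void main()
{
    string s = "value: " ~ 65; // 65 fits in char, so it is appended as 'A'
    writeln(s);                // prints: value: A
    assert(s[$ - 1] == 'A');
}
```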
Nov 05 2018
next sibling parent uranuz <neuranuz gmail.com> writes:
On Monday, 5 November 2018 at 15:58:40 UTC, Adam D. Ruppe wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
Ok. It's because string is an array of char, and an int can be implicitly converted to char if its value fits the range.
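[A small sketch of that range rule, for illustration only: the conversion kicks in when the value is a compile-time constant that fits in char; the commented-out lines are the cases the compiler rejects.]

```
void main()
{
    string a = "x" ~ 65;     // OK: 65 is a constant that fits in char ('A')
    // string b = "x" ~ 300; // error: 300 does not fit in a char
    int n = 65;
    // string c = "x" ~ n;   // error: a runtime int does not implicitly convert
}
```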
Nov 05 2018
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it does
 and gives non-printable symbol appended at the end of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF-8 code unit.

Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) are for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me.

This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed the representation of an object from a char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code would continue to compile while the refactoring was still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.

T

--
Some days you win; most days you lose.
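[A hypothetical sketch of the struct-wrapping workaround mentioned at the end of the post; the names ObjId and register are made up for illustration. Wrapping the byte ID in a struct stops the char/byte implicit conversions from steering overload resolution.]

```
struct ObjId        // hypothetical wrapper type
{
    byte value;
}

void register(ObjId id) { /* new byte-ID representation */ }
void register(char c)   { /* legacy char representation */ }

void main()
{
    register(ObjId(42)); // unambiguously the new overload
    register('a');       // unambiguously the legacy overload
}
```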
Nov 05 2018
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 9:31:56 AM MST H. S. Teoh via Digitalmars-d 
wrote:
 On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via Digitalmars-d 
wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it does
 and gives non-printable symbol appended at the end of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF8 code unit. Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) is for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me. This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed a representation of an object from char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code will continue to compile while the refactoring is still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.
+1

Unfortunately, I don't know how reasonable it is to fix it at this point, much as I would love to see it fixed. Historically, I don't think that Walter could have been convinced, but based on some of the stuff he's said in recent years, I think that he'd be much more open to the idea now. However, even if he could now be convinced that ideally the conversion wouldn't exist, I don't know how easy it would be to get a DIP through when you consider the potential code breakage.

But maybe it's possible to do it in a smooth enough manner that it could work - especially when many of the kinds of cases where you might actually _want_ such a conversion already require casting anyway thanks to the rules about integer promotions and narrowing conversions (e.g. when adding or subtracting from chars). Regardless, it would have to be a well-written DIP with a clean transition scheme.

Having that DIP on removing the implicit conversion of integer and character literals to bool be accepted would be a start in the right direction, though. If that gets rejected (which I sure hope that it isn't), then there's probably no hope for a DIP fixing the char situation.

- Jonathan M Davis
Nov 05 2018
next sibling parent reply bachmeier <no spam.net> writes:
On Monday, 5 November 2018 at 21:11:27 UTC, Jonathan M Davis 
wrote:

 I don't know how reasonable it is to fix it at this point, much 
 as I would love to see it fixed.
It's hard for me to see how it would be reasonable not to fix it. This is one of those ugly parts of the language that need to be evolved out. If there's a reason to support this behaviour, it should be done with a compiler switch. I'm pretty sure this was one of the weird things that hit me when I started with the language; it was frustrating, and it didn't make a good impression.
Nov 05 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 2:49:32 PM MST bachmeier via Digitalmars-d 
wrote:
 On Monday, 5 November 2018 at 21:11:27 UTC, Jonathan M Davis

 wrote:
 I don't know how reasonable it is to fix it at this point, much
 as I would love to see it fixed.
It's hard for me to see how it would be reasonable to not fix it. This is one of those ugly parts of the language that need to be evolved out of the language. If there's a reason to support this, it should be done with a compiler switch. I'm pretty sure that this was one of the weird things that hit me when I started with the language, it was frustrating, and it didn't make a good impression.
It really comes down to what code would break due to the change, how that code breakage could be mitigated, what the transition process would look like, and how Walter views the issue at this point. Historically, he has not seen this as a problem like many of us have, but his views have evolved somewhat over time. However, he's also become far more wary of breaking code with language changes.

If this change were proposed in a DIP, a clean transition would be required, and if a compiler flag were required, I don't know that it would ever happen. I don't recall a single transition that has required a compiler switch in D that has ever been completed. Some that have had a compiler switch to get the old behavior back have worked, but stuff like -property or -dip25 have never reached the finish line. -dip25 may yet get there given how it's tied into -dip1000, and I expect that Walter will find a way to push -dip1000 through given its importance for @safe, but it's still an open question how on earth we're going to transition to DIP 1000 being the normal behavior given how big a switch that is.

So, if someone can figure out how to cleanly transition the behavior to get rid of the implicit conversion between character types and integer types without needing a -dipxxx switch to enable the new behavior, and they can argue it well enough to convince Walter, then we may very well get there, but otherwise, I expect that we're stuck.

Either way, I think that we have to see how https://github.com/dlang/DIPs/blob/master/DIPs/DIP1015.md does first. If _that_ can't get through, I don't think that a DIP to fix implicit conversions and char stands a chance. I'm currently expecting that DIP to be accepted, but you never know.

- Jonathan M Davis
Nov 05 2018
parent Neia Neutuladh <neia ikeran.org> writes:
On Mon, 05 Nov 2018 15:14:08 -0700, Jonathan M Davis wrote:
 It really comes down to what code would break due to the change, how
 that code breakage could be mitigated, what the transition process would
 look like, and how Walter views the issue at this point.
Get a patch and I can make dubautotester run on it to see what breaks.

I originally intended to use it to determine which patches we could safely backport in order to construct more stable DMDFE versions without vastly increasing the amount of human work. Like, we get a patch against master; try to apply it against 2.080.1 and see if it (a) still works, (b) compiles everything that 2.080.1 did, (c) doesn't compile anything that 2.080.1 didn't. Automatically apply it to the next patch version if all that passes. Automatically apply the patch to the next minor version if only (c) fails. Make a report for human intervention. Or something like that.

The main issue is that it takes a *lot* of time to run these tests, so more than one patch per day would require me to set up parallelism and upgrade the build box. Evaluating the effects of a proposal like this is pretty similar.
Nov 05 2018
prev sibling next sibling parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 5 November 2018 at 21:11:27 UTC, Jonathan M Davis 
wrote:
 n Monday, November 5, 2018 9:31:56 AM MST H. S. Teoh via 
 Digitalmars-d wrote:
 On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via 
 Digitalmars-d
wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but 
 it does and gives non-printable symbol appended at the end 
 of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF8 code unit. Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) is for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me. This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed a representation of an object from char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code will continue to compile while the refactoring is still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.
+1 Unfortunately, I don't know how reasonable it is to fix it at this point, much as I would love to see it fixed. Historically, I don't think that Walter could have been convinced, but based on some of the stuff he's said in recent years, I think that he'd be much more open to the idea now. However, even if he could now be convinced that ideally the conversion wouldn't exist, I don't know how easy it would be to get a DIP through when you consider the potential code breakage. But maybe it's possible to do it in a smooth enough manner that it could work - especially when many of the kind of cases where you might actually _want_ such a conversion already require casting anyway thanks to the rules about integer promotions and narrowing conversions (e.g. when adding or subtracting from chars). Regardless, it would have to be well-written DIP with a clean transition scheme. Having that DIP on removing the implicit conversion of integer and character literals to bool be accepted would be a start in the right direction though. If that gets rejected (which I sure hope that it isn't), then there's probably no hope for a DIP fixing the char situation. - Jonathan M Davis
We need to avoid the situation where we have to create a DIP for every unwanted implicit conversion that causes the wrong overload to be called; we need a better way of doing this. No one wants to wait a year for DIP approval for something as minor as deprecating an implicit conversion for native data types.

I think a better course of action is to introduce the keywords explicit and implicit. Not as attributes though! I don't want to see functions with @nogc nothrow @safe pure @explicit, as that is too much verbiage and hard to read! Which also brings up the question of which parameter exactly is explicit?

It is much easier to read: void example(int bar, explicit int bob)

The explicit keyword will become very important if we are to introduce the implicit keyword, as both of them are instrumental in creating types with structs.

I don't mind writing a DIP regarding this, as I think it is much more likely to be accepted than the other one that I currently have.

-Alexander
Nov 05 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 3:08:24 PM MST 12345swordy via Digitalmars-d 
wrote:
 We need to avoid the situation where we have to create a DIP for
 every unwanted implicit conversion with regards to calling the
 wrong overload function, we need
 better way of doing this. No one wants to wait a year for a DIP
 approval for something that is very minor such as deprecating a
 implicit conversion for native data types.

 I think a better course of action is to introduce the keywords
 explicit and implicit. Not as attributes though! I don't want to
 see functions with  nogc  nothrow safe pure  explicit as that is
 too much verbiage and hard to read! Which brings up the question
 of which parameter exactly is explicit?

 It much easier to read: void example(int bar, explicit int bob)

 The explicit keyword will become very important if we are to
 introduce the implicit keyword, as both of them are instrumental
 in creating types with structs.

 I don't mind writing a DIP regarding this, as I think this is
 much easier for the DIP to be accepted then the other one that I
 currently have.
This really shouldn't be decided on a per-function basis. It's an issue on the type level and should be fixed with the types themselves. The OP's problem didn't even happen with a function. It happened with a built-in operator.

Regardless, if you attempt to add keywords to the language at this point, you will almost certainly lose. I would be _very_ surprised to see Walter or Andrei go for it. Whether you think attributes are easy to read or not, they don't eat up an identifier, and Walter and Andrei consider that to be very important. AFAIK, they also don't consider attributes to be a readability problem. So, even if trying to add some sort of implicit or explicit marker to parameters made sense (and I really don't think that it does), I think that Walter and Andrei have made it pretty clear that that sort of thing would have to be an attribute and not a keyword.

And honestly, I think that any DIP trying to add general control over implicit and explicit conversions in the language has a _way_ lower chance of being accepted than one that gets rid of implicit conversions between character types and integer types. However, in the end, one does not depend on the other or even really have much to do with the other. A DIP to fix the implicit conversions between character types and integer types would be a DIP to fix precisely that, whereas a DIP to mark parameters with implicit or explicit would be about trying to control implicit or explicit conversions in general and not about character or integer types specifically. So while they might be tangentially related, they're really separate issues.

Given the recent DIP on copy constructors and the discussion there, it would not surprise me to see a future DIP about adding implicit to constructors to allow for implicit construction, though I don't know how likely it is for such a DIP to be accepted, given that D's approach (outside of built-in types anyway) has generally been to avoid implicit conversions to reduce the risk of bugs, and when combined with alias this, things really start to get interesting. But I would think that the chances of that getting accepted are far greater than adding attributes to parameters (be they keywords or actual attributes). Regardless, that's an issue of conversions in general, and not just implicit conversions between character types and integer types, which is really what the discussion is about fixing here, and that can be fixed regardless of what happens with providing additional control over implicit conversions in general.

- Jonathan M Davis
Nov 05 2018
parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 5 November 2018 at 23:00:23 UTC, Jonathan M Davis 
wrote:

 Regardless, if you attempt to add keywords to the language at 
 this point, you will almost certainly lose. I would be _very_ 
 surprised to see Walter or Andrei go for it.
What exactly makes you say that? I skimmed the old DIPs that were rejected on the wiki and on GitHub, and they seem to have been rejected for other reasons. Is there a previous discussion that you (or others) can link to?
 Whether you think attributes are easy to read or not, they 
 don't eat up an identifier, and Walter and Andrei consider that 
 to be very important.
The feature that I have in mind requires it to be a keyword, as it is in C#:
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/explicit
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/implicit

Even C++ has type conversions: http://www.cplusplus.com/doc/tutorial/typecasting/

Bear in mind though that I am still in the brainstorming and research stage regarding this. Don't expect the details from me, as I haven't figured everything out yet.
 AFAIK, they also don't consider attributes to be a readability 
 problem. So, even if trying to add some sort of implicit or 
 explicit marker to parameters made sense (and I really don't 
 think that it does), I think that Walter and Andrei have made 
 it pretty clear that that sort of thing would have to be an 
 attribute and not a keyword.
I don't want the implicit and explicit keywords to be attributes in the DIP I am going to write, unless I really have to in order to get the DIP approved by Walter and Andrei.
 And honestly, I think that any DIP trying to add general 
 control over implicit and explict conversions in the language 
 has a _way_ lower chance of being accepted than one that gets 
 rid of implicit conversions between character types and integer 
 types. However, in the end, one does not depend on the other or 
 even really have much to do with the other. A DIP to fix the 
 implicit conversions between character types and integer types 
 would be a DIP to fix precisely that, whereas a DIP to mark 
 parameters with implicit or explicit would be about trying to 
 control implicit or explicit conversions in general and not 
 about character or integer types specifically, so while they 
 might be tangentially related, they're really separate issues.
That is very good point. Well consider that when writing the dip.
 Given the recent DIP on copy constructors and the discussion 
 there, it would not surprise me to see a future DIP about 
 adding  implicit to constructors to allow for implicit 
 construction, though I don't know how likely it is for such a 
 DIP to be accepted given that D's approach (outside of built-in 
 types anyway) has generally been to avoid implicit conversions 
 to reduce the risk of bugs, and when combined with alias this, 
 things really start to get interesting.
Gah, don't remind me of alias this. That can of worms has yet to be opened with multi alias this. Which, btw, is STILL not implemented yet! Hell, I will sign up for the next round of community fund projects just to implement the damn thing, because I am that impatient.
 But I would think that the chances of that getting accepted are 
 far greater than adding attributes to parameters (be they 
 keywords or actual attributes). Regardless, that's an issue of 
 conversions in general, and not just implicit conversions 
 between character types and integer types, which is really what 
 the discussion is about fixing here, and that can be fixed 
 regardless of what happens with providing additional control 
 over implicit conversions in general.

 - Jonathan M Davis
Sure thing, though the DIP process for deprecation of small features shouldn't be that slow! -Alex
Nov 05 2018
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 7:47:42 PM MST 12345swordy via Digitalmars-d 
wrote:
 On Monday, 5 November 2018 at 23:00:23 UTC, Jonathan M Davis

 wrote:
 Regardless, if you attempt to add keywords to the language at
 this point, you will almost certainly lose. I would be _very_
 surprised to see Walter or Andrei go for it.
What exactly makes you say that? I had scan read the old dips that were rejected on the wiki and on the github, and it seems to be rejected for other reasons. Is there previous discussion that you(or others) can linked to?
I would have to go digging through the newsgroup history. It's come up on a number of occasions in various threads, and I couldn't say which at this point. But the very reason that we started putting @ on things in the first place was to avoid creating keywords. We did it before user-defined attributes were even a thing. And for years now, any time that it's been considered to add any kind of attribute, it always starts with @. It has been years since anything involving adding a new keyword has gotten anywhere. Similarly, Walter has shot down the idea of using contextual keywords (which you can probably find discussions on pretty easily by searching the newsgroup history).

So, if you want to create a DIP that proposes adding implicit and explicit as keywords, feel free to do so, but from what I know of Walter and Andrei's position on the topic of keywords - from what they've said in the newsgroup or in any discussions that I've had with them in person - they're not going to be interested in adding new keywords when adding an attribute that starts with @ will work, because adding keywords means restricting the list of available identifiers, whereas adding new attributes does not. I honestly do not expect that D2 will _ever_ get any additional keywords and would be very surprised if it ever did.
 Sure thing, though the DIP process for deprecation of small
 features shouldn't be that slow!
I won't claim that the DIP process couldn't or shouldn't be improved, but at least we now have a DIP process that actually works, even if it can be slow. With the old process, DIPs basically almost never went anywhere. A few did, but most weren't ever even seriously reviewed by Walter and Andrei. While it may not be perfect, the current process is an _enormous_ improvement.

- Jonathan M Davis
Nov 05 2018
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/5/18 4:11 PM, Jonathan M Davis wrote:
 n Monday, November 5, 2018 9:31:56 AM MST H. S. Teoh via Digitalmars-d
 wrote:
 On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via Digitalmars-d
wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it does
 and gives non-printable symbol appended at the end of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF8 code unit. Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) is for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me. This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed a representation of an object from char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code will continue to compile while the refactoring is still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.
+1 Unfortunately, I don't know how reasonable it is to fix it at this point, much as I would love to see it fixed. Historically, I don't think that Walter could have been convinced, but based on some of the stuff he's said in recent years, I think that he'd be much more open to the idea now. However, even if he could now be convinced that ideally the conversion wouldn't exist, I don't know how easy it would be to get a DIP through when you consider the potential code breakage. But maybe it's possible to do it in a smooth enough manner that it could work - especially when many of the kind of cases where you might actually _want_ such a conversion already require casting anyway thanks to the rules about integer promotions and narrowing conversions (e.g. when adding or subtracting from chars). Regardless, it would have to be well-written DIP with a clean transition scheme. Having that DIP on removing the implicit conversion of integer and character literals to bool be accepted would be a start in the right direction though. If that gets rejected (which I sure hope that it isn't), then there's probably no hope for a DIP fixing the char situation.
It's not just ints to chars, but chars to wchars or dchars, and wchars to dchars.

Basically, a character type should not convert from any other type. Period. Because it's not "just a number" in a different format.

Do we need a DIP? Probably. But we have changed these kinds of things in the past from what I remember (I seem to recall we at one point had implicit truncation when adding 2 smaller numbers together). It is possible to still fix this.

-Steve
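[A quick sketch of the conversions listed above, assuming current compiler behavior: each of these compiles today without complaint.]

```
void main()
{
    char  c = 'a';
    wchar w = c;  // char implicitly converts to wchar
    dchar d = w;  // wchar implicitly converts to dchar
    dchar e = c;  // char implicitly converts to dchar
    char  f = 65; // and an in-range integer constant converts to char
}
```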
Nov 05 2018
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Nov 05, 2018 at 05:43:19PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 It's not just ints to chars, but chars to wchars or dchars, and wchars
 to dchars.
 
 Basically a character type should not convert from any other type.
 period.  Because it's not "just a number" in a different format.
+1. I recall having this conversation before. Was this ever filed as a bug? I couldn't find it this morning when I tried to look.
 Do we need a DIP? Probably. but we have changed these types of things
 in the past from what I remember (I seem to recall we had at one point
 implicit truncation for adding 2 smaller numbers together). It is
 possible to still fix.
[...]

If it's possible to fix, I'd like to see it fixed. So far, I don't recall hearing anyone strongly oppose such a change; all objections appear to come only from the fear of breaking existing code.

Some things to consider:

- What this implies for the "if C code is compilable as D, it must have the same semantics" philosophy that Walter appears to be strongly insistent on. Basically, anything that depends on C's conflation of char and (u)byte must either give an error or give the correct semantics.

- The possibility of automatically fixing code broken by the change (possibly partially, leaving corner cases as errors to be handled by the user -- the idea being to eliminate the rote stuff and only require user intervention in the tricky cases). This may be a good and simple use case for building a tool that could do something like that. This isn't the first time potential code breakage has threatened an otherwise beneficial language change, where having an automatic source upgrade tool could alleviate many of the concerns.

- Once we start making a clear distinction between char types and non-char types, will char types still obey C-like int promotion rules, or should we consider discarding old baggage that's no longer so applicable to modern D? For example, I envision that this DIP would make int + char or char + int illegal, but what should the result of char + char or char + wchar be? I'm tempted to propose outright banning char arithmetic without casting, but for some applications this might be too onerous. If we continue to follow C rules, char + char would implicitly promote to dchar, which arguably could be annoying.

T

--
"Computer Science is no more about computers than astronomy is about telescopes." -- E.W. Dijkstra
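[For reference, a quick check of what the current rules actually do, assuming present-day DMD behavior: char operands go through the usual integer promotions, so char + char yields int, and narrowing back to char already requires a cast.]

```
void main()
{
    char a = 'A';
    char b = 'B';
    static assert(is(typeof(a + b) == int)); // char + char promotes to int
    static assert(is(typeof(a + 1) == int)); // char + int promotes to int
    // char c = a + 1;           // error: narrowing back to char needs a cast
    char c = cast(char)(a + 1);  // explicit cast required today
}
```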
Nov 05 2018
prev sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 4:14:18 PM MST H. S. Teoh via Digitalmars-d 
wrote:
 On Mon, Nov 05, 2018 at 05:43:19PM -0500, Steven Schveighoffer via
 Digitalmars-d wrote: [...]

 It's not just ints to chars, but chars to wchars or dchars, and wchars
 to dchars.

 Basically a character type should not convert from any other type.
 period.  Because it's not "just a number" in a different format.
+1. I recall having this conversation before. Was this ever filed as a bug? I couldn't find it this morning when I tried to look.
 Do we need a DIP? Probably. but we have changed these types of things
 in the past from what I remember (I seem to recall we had at one point
 implicit truncation for adding 2 smaller numbers together). It is
 possible to still fix.
[...] If it's possible to fix, I'd like to see it fixed. So far, I don't recall hearing anyone strongly oppose such a change; all objections appear to be only coming from the fear of breaking existing code. Some things to consider: - What this implies for the "if C code is compilable as D, it must have the same semantics" philosophy that Walter appears to be strongly insistent on. Basically, anything that depends on C's conflation of char and (u)byte must either give an error, or give the correct semantics.
I'm pretty sure that the change would just result in more errors, so I don't think that it would cause problems on this front.
 - The possibility of automatically fixing code broken by the change
   (possibly partial, leaving corner cases as errors to be handled by the
   user -- the idea being to eliminate the rote stuff and only require
   user intervention in the tricky cases).  This may be a good and simple
   use-case for building a tool that could do something like that.  This
   isn't the first time potential code breakage threatens an otherwise
   beneficial language change, where having an automatic source upgrade
   tool could alleviate many of the concerns.
An automatic tool would be nice, but I don't know that focusing on that would be helpful, since it would be making it seem like the amount of breakage was large, which would make the change seem less acceptable. Regardless, the breakage couldn't be immediate. It would have to be some sort of deprecation warning first - possibly similar to whatever was done with the integer promotion changes a few releases back, though I never understood what happened there.
 - Once we start making a clear distinction between char types and
   non-char types, will char types still obey C-like int promotion rules,
   or should we consider discarding old baggage that's no longer so
   applicable to modern D?  For example, I envision that this DIP would
   make int + char or char + int illegal, but what should the result of
   char + char or char + wchar be?  I'm tempted to propose outright
   banning char arithmetic without casting, but for some applications
   this might be too onerous.  If we continue follow C rules, char + char
   would implicitly promote to dchar, which arguably could be annoying.
Well, as I understand it, the fact that char + char -> int is related to how the CPU works, and having it become char + char -> char would be a problem from that perspective. Having char + char -> dchar would also go against the whole idea that char is an encoding, because adding two chars together isn't necessarily going to get you a valid dchar.

In reality though, I would expect reasonable code to be adding ints to a char, because you're going to get stuff like x - 48 to convert ASCII digits to integers. And honestly, adding two chars together doesn't even make sense. What does that even mean? 'A' + 'Q' does what? It's nonsense. Ultimately, I think that it would be too large a change to disallow it (and _maybe_ someone out there has some weird use case where it sort of makes sense), but I don't see how it makes any sense to actually do it.

So, making it so that adding two chars together continues to result in an int makes the most sense to me, as does adding an int and a char (which is the operation that code is actually going to be doing). Code can then cast back to char (which is what it already has to do now anyway). It allows code to continue to function as it has (thus reducing how disruptive the changes are), but if we eliminate the implicit conversions, we eliminate the common bugs.

I think that you'll get _far_ stronger opposition to trying to change the arithmetic operations than to changing the implicit conversions, and I also think that the gains are far less obvious. So basically, I wouldn't advise mucking around with the arithmetic operations. I'd suggest simply making it so that implicitly converting between character types and any other type (unless explicitly defined by something like alias this) is disallowed. Given that you already have to cast with the arithmetic stuff (at least to get it back into char), I'm pretty sure the result would be that almost all of the code that would have to be changed would be code that was either broken or a code smell, which would probably make it a lot easier to convince Walter to make the change.

- Jonathan M Davis
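[A small sketch of the kind of arithmetic described above -- converting between a digit character and its numeric value, with the cast back to char that the narrowing rules already force:]

```
void main()
{
    char c = '7';
    int digit = c - '0';  // char operands promote to int; digit == 7
    assert(digit == 7);

    int value = 3;
    char back = cast(char)(value + '0'); // cast needed to narrow int back to char
    assert(back == '3');
}
```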
Nov 05 2018
prev sibling next sibling parent lngns <contact lngnslnvsk.net> writes:
It adds the equivalent char value, which is interpreted as ASCII 
when printing.
You can append an object of its underlying (or a compatible) type 
to an array.

```
void main()
{
     import std.stdio : writeln;

     writeln("hello " ~ 42); //hello *
     writeln([42, 56] ~ 7); //[42, 56, 7]
}
```
Nov 05 2018
prev sibling next sibling parent Paul Backus <snarwin gmail.com> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange result. 
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
 The same problem is encountered even without [enum]. Just using 
 plain integer value gives the same. Is it a bug or someone 
 realy could rely on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
It seems like the integer 2 is being implicitly converted to a char -- specifically, the character U+0002 Start of Text. Normally, a ulong wouldn't be implicitly convertible to a char, but compile-time constants appear to get special treatment as long as their values are in the correct range.

If you try it with a number too big to fit in a char, you get an error:

void main()
{
    import std.stdio;
    writeln("test " ~ 256);
}

Error: incompatible types for ("test ") ~ (256): string and int
Nov 05 2018
prev sibling next sibling parent lithium iodate <whatdoiknow doesntexist.net> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange result. 
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
 The same problem is encountered even without [enum]. Just using 
 plain integer value gives the same. Is it a bug or someone 
 realy could rely on this behaviour?
As long as the integral value is statically known to be a valid code point and fits into the numerical range of the char type (which is plain 'char' in this case), automatic conversion is done. If you replace the values inside your enum with something bigger (> 255) or negative, you will see that the compiler doesn't universally allow all such automatic integral->char conversions.

You can also see this effect when you declare a mutable vs. an immutable integer and try to append it to a regular string: the mutable one will fail. (Anything that can be larger than 1114111 will always fail, as far as I can tell.)

Some consider this useful behavior, but it's not uncontroversial.
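[A quick sketch of the mutable vs. immutable point above, as an illustration only (exact behavior may vary by compiler version): the conversion depends on the value being known at compile time.]

```
void main()
{
    immutable int a = 65;
    int b = 65;
    string s = "x" ~ a;    // compiles: the value is known and fits in char
    // string t = "x" ~ b; // error: a mutable int's value is not known statically
}
```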
Nov 05 2018
prev sibling parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange result. 
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
 The same problem is encountered even without [enum]. Just using 
 plain integer value gives the same. Is it a bug or someone 
 realy could rely on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp, with the recent rejection of DIP 1015, I don't see this being deprecated any time soon. -Alex
Nov 12 2018
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the following 
 without using [std.conv: to] and get strange result. I believe that 
 following code shouldn't even compile, but it does and gives 
 non-printable symbol appended at the end of string.
 The same problem is encountered even without [enum]. Just using plain 
 integer value gives the same. Is it a bug or someone realy could rely 
 on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon. -Alex
If we deprecate that, we also need to deprecate:

    string res = `Number value: ` ~ 65;

Not saying we shouldn't, just that there are many implications.


Andrei
Nov 12 2018
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 12 November 2018 at 20:23:42 UTC, Andrei Alexandrescu 
wrote:
 If we deprecate that we also need to deprecate:

     string res = `Number value: ` ~ 65;

 Not saying we shouldn't, just that there are many implications.
I'd call that a good thing: many people are surprised when ~ x doesn't do ~ to!string(x), and besides, a number isn't really a character. I'd be happy if you had to write: `Number value: ` ~ char(65).
Nov 12 2018
prev sibling next sibling parent 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 12 November 2018 at 20:23:42 UTC, Andrei Alexandrescu 
wrote:
 On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange 
 result. I believe that following code shouldn't even compile, 
 but it does and gives non-printable symbol appended at the 
 end of string.
 The same problem is encountered even without [enum]. Just 
 using plain integer value gives the same. Is it a bug or 
 someone realy could rely on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon. -Alex
If we deprecate that we also need to deprecate: string res = `Number value: ` ~ 65; Not saying we shouldn't, just that there are many implications. Andrei
We could replace

    string res = `Number value: ` ~ 65;

with:

    string res = `Number value: ` ~ 65.ToString();

which produces the string `Number value: 65`, via extension methods with compile-time reflection. (Which I am very excited to see with your upcoming DIP that overhauls compile-time reflection!) This displays the intent of converting 65 to its literal string equivalent.

Alex
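[For comparison, a minimal sketch of the explicit conversion that already exists today via std.conv, which is roughly what the hypothetical ToString above would correspond to:]

```
import std.conv : to;
import std.stdio : writeln;

void main()
{
    string res = `Number value: ` ~ to!string(65); // or, UFCS style: 65.to!string
    writeln(res); // Number value: 65
}
```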
Nov 12 2018
prev sibling next sibling parent bachmeier <no spam.net> writes:
On Monday, 12 November 2018 at 20:23:42 UTC, Andrei Alexandrescu 
wrote:

 If we deprecate that we also need to deprecate:

     string res = `Number value: ` ~ 65;

 Not saying we shouldn't, just that there are many implications.


 Andrei
I sure hope that happens. As I wrote above, this bit me when I started using the language, and it didn't leave a good impression.
Nov 12 2018
prev sibling next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 12, 2018 1:23:42 PM MST Andrei Alexandrescu via 
Digitalmars-d wrote:
 On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the following
 without using [std.conv: to] and get strange result. I believe that
 following code shouldn't even compile, but it does and gives
 non-printable symbol appended at the end of string.
 The same problem is encountered even without [enum]. Just using plain
 integer value gives the same. Is it a bug or someone realy could rely
 on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon. -Alex
If we deprecate that we also need to deprecate: string res = `Number value: ` ~ 65; Not saying we shouldn't, just that there are many implications.
And honestly, that's _exactly_ the sort of expression that we'd be looking to have deprecated, because it usually causes bugs. And it just gets worse in more complex expressions (ones involving the ternary operator seem to be particularly popular from what I recall).

D specifically made character types separate from integer types, because character types have a distinct meaning separate from integer types, and then it shot itself in the foot by allowing them to more or less freely mix via implicit conversions. The main saving grace is the fact that most code uses char, and the arithmetic expressions result in int, so to assign back, you need to cast because it's a narrowing conversion. So, a lot of the casts that we'd want to be required when converting between integer types and character types are fortunately required anyway, but stuff like ~ gets around it in many cases, and if you're using dchar, you're not protected by narrowing conversions. So, the problem still exists, and it still causes bugs.

Many of us see the fact that code like

    string res = `Number value: ` ~ 65;

compiles as wholly negative, though based on what Walter has said in the past, unless something has changed, I'm sure that he does not agree on that count, and the fact that this DIP on bool was rejected on the grounds that you guys think that bool should be treated as an integer type does make it sound like it's going to be difficult to convince you that the character types shouldn't be treated as integer types.

- Jonathan M Davis
Nov 12 2018
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/12/18 3:23 PM, Andrei Alexandrescu wrote:
 On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the following 
 without using [std.conv: to] and get strange result. I believe that 
 following code shouldn't even compile, but it does and gives 
 non-printable symbol appended at the end of string.
 The same problem is encountered even without [enum]. Just using plain 
 integer value gives the same. Is it a bug or someone realy could rely 
 on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon.
If we deprecate that we also need to deprecate:     string res = `Number value: ` ~ 65; Not saying we shouldn't, just that there are many implications.
I'm wondering if you realized what you are saying there. Like "if we deprecate one crappy behavior, we have to deprecate all the crappy behavior" Yes, please. -Steve
Nov 13 2018