
digitalmars.D - Strange implicit conversion of integers on concatenation

reply uranuz <neuranuz gmail.com> writes:
Hello to everyone! By mistake I typed some code like the 
following without using [std.conv: to] and got a strange result. I 
believe the following code shouldn't even compile, but it does, 
and it gives a non-printable symbol appended at the end of the string.
The same problem occurs even without [enum]; just using a 
plain integer value gives the same result. Is this a bug, or could 
someone really rely on this behaviour?

import std.stdio;

enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
}

void main()
{
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
}

Output:
Number value: 
Nov 05 2018
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
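[For illustration, a minimal sketch of the conversion described above -- the integer literal is appended as a single char (code unit), not formatted as text; output as observed with a 2018-era DMD:]

```
import std.stdio;

void main()
{
    string s = "value: " ~ 65; // 65 fits in char, so it is appended as 'A'
    writeln(s);                // prints: value: A
    assert(s[$ - 1] == 'A');
}
```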
Nov 05 2018
next sibling parent uranuz <neuranuz gmail.com> writes:
On Monday, 5 November 2018 at 15:58:40 UTC, Adam D. Ruppe wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
Ok. It's because string is an array of char, and an int can be implicitly converted to char if its value fits the range.
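[A small sketch of that range rule, for illustration only: the conversion kicks in when the value is a compile-time constant that fits in char; the commented-out lines are the cases the compiler rejects.]

```
void main()
{
    string a = "x" ~ 65;     // OK: 65 is a constant that fits in char ('A')
    // string b = "x" ~ 300; // error: 300 does not fit in a char
    int n = 65;
    // string c = "x" ~ n;   // error: a runtime int does not implicitly convert
}
```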
Nov 05 2018
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it does
 and gives non-printable symbol appended at the end of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF-8 code unit.

Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) are for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me.

This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed the representation of an object from a char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code would continue to compile while the refactoring was still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.

T

--
Some days you win; most days you lose.
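[A hypothetical sketch of the struct-wrapping workaround mentioned at the end of the post; the names ObjId and register are made up for illustration. Wrapping the byte ID in a struct stops the char/byte implicit conversions from steering overload resolution.]

```
struct ObjId        // hypothetical wrapper type
{
    byte value;
}

void register(ObjId id) { /* new byte-ID representation */ }
void register(char c)   { /* legacy char representation */ }

void main()
{
    register(ObjId(42)); // unambiguously the new overload
    register('a');       // unambiguously the legacy overload
}
```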
Nov 05 2018
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 9:31:56 AM MST H. S. Teoh via Digitalmars-d 
wrote:
 On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via Digitalmars-d 
wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it does
 and gives non-printable symbol appended at the end of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF8 code unit. Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) is for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me. This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed a representation of an object from char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code will continue to compile while the refactoring is still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.
+1

Unfortunately, I don't know how reasonable it is to fix it at this point, much as I would love to see it fixed. Historically, I don't think that Walter could have been convinced, but based on some of the stuff he's said in recent years, I think that he'd be much more open to the idea now. However, even if he could now be convinced that ideally the conversion wouldn't exist, I don't know how easy it would be to get a DIP through when you consider the potential code breakage.

But maybe it's possible to do it in a smooth enough manner that it could work - especially when many of the kinds of cases where you might actually _want_ such a conversion already require casting anyway thanks to the rules about integer promotions and narrowing conversions (e.g. when adding or subtracting from chars). Regardless, it would have to be a well-written DIP with a clean transition scheme.

Having that DIP on removing the implicit conversion of integer and character literals to bool be accepted would be a start in the right direction, though. If that gets rejected (which I sure hope that it isn't), then there's probably no hope for a DIP fixing the char situation.

- Jonathan M Davis
Nov 05 2018
next sibling parent reply bachmeier <no spam.net> writes:
On Monday, 5 November 2018 at 21:11:27 UTC, Jonathan M Davis 
wrote:

 I don't know how reasonable it is to fix it at this point, much 
 as I would love to see it fixed.
It's hard for me to see how it would be reasonable not to fix it. This is one of those ugly parts of the language that need to be evolved out. If there's a reason to support this behaviour, it should be done with a compiler switch. I'm pretty sure this was one of the weird things that hit me when I started with the language; it was frustrating, and it didn't make a good impression.
Nov 05 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 2:49:32 PM MST bachmeier via Digitalmars-d 
wrote:
 On Monday, 5 November 2018 at 21:11:27 UTC, Jonathan M Davis

 wrote:
 I don't know how reasonable it is to fix it at this point, much
 as I would love to see it fixed.
It's hard for me to see how it would be reasonable to not fix it. This is one of those ugly parts of the language that need to be evolved out of the language. If there's a reason to support this, it should be done with a compiler switch. I'm pretty sure that this was one of the weird things that hit me when I started with the language, it was frustrating, and it didn't make a good impression.
It really comes down to what code would break due to the change, how that code breakage could be mitigated, what the transition process would look like, and how Walter views the issue at this point. Historically, he has not seen this as a problem like many of us have, but his views have evolved somewhat over time. However, he's also become far more wary of breaking code with language changes.

If this change were proposed in a DIP, a clean transition would be required, and if a compiler flag were required, I don't know that it would ever happen. I don't recall a single transition that has required a compiler switch in D that has ever been completed. Some that have had a compiler switch to get the old behavior back have worked, but stuff like -property or -dip25 have never reached the finish line. -dip25 may yet get there given how it's tied into -dip1000, and I expect that Walter will find a way to push -dip1000 through given its importance for @safe, but it's still an open question how on earth we're going to transition to DIP 1000 being the normal behavior given how big a switch that is.

So, if someone can figure out how to cleanly transition the behavior to get rid of the implicit conversion between character types and integer types without needing a -dipxxx switch to enable the new behavior, and they can argue it well enough to convince Walter, then we may very well get there, but otherwise, I expect that we're stuck.

Either way, I think that we have to see how https://github.com/dlang/DIPs/blob/master/DIPs/DIP1015.md does first. If _that_ can't get through, I don't think that a DIP to fix implicit conversions and char stands a chance. I'm currently expecting that DIP to be accepted, but you never know.

- Jonathan M Davis
Nov 05 2018
parent Neia Neutuladh <neia ikeran.org> writes:
On Mon, 05 Nov 2018 15:14:08 -0700, Jonathan M Davis wrote:
 It really comes down to what code would break due to the change, how
 that code breakage could be mitigated, what the transition process would
 look like, and how Walter views the issue at this point.
Get a patch and I can make dubautotester run on it to see what breaks.

I originally intended to use it to determine which patches we could safely backport in order to construct more stable DMDFE versions without vastly increasing the amount of human work. Like, we get a patch against master; try to apply it against 2.080.1 and see if it (a) still works, (b) compiles everything that 2.080.1 did, (c) doesn't compile anything that 2.080.1 didn't. Automatically apply it to the next patch version if all that passes. Automatically apply the patch to the next minor version if only (c) fails. Make a report for human intervention. Or something like that.

The main issue is that it takes a *lot* of time to run these tests, so more than one patch per day would require me to set up parallelism and upgrade the build box. Evaluating the effects of a proposal like this is pretty similar.
Nov 05 2018
prev sibling next sibling parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 5 November 2018 at 21:11:27 UTC, Jonathan M Davis 
wrote:
 n Monday, November 5, 2018 9:31:56 AM MST H. S. Teoh via 
 Digitalmars-d wrote:
 On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via 
 Digitalmars-d
wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but 
 it does and gives non-printable symbol appended at the end 
 of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF8 code unit. Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) is for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me. This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed a representation of an object from char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code will continue to compile while the refactoring is still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.
+1 Unfortunately, I don't know how reasonable it is to fix it at this point, much as I would love to see it fixed. Historically, I don't think that Walter could have been convinced, but based on some of the stuff he's said in recent years, I think that he'd be much more open to the idea now. However, even if he could now be convinced that ideally the conversion wouldn't exist, I don't know how easy it would be to get a DIP through when you consider the potential code breakage. But maybe it's possible to do it in a smooth enough manner that it could work - especially when many of the kind of cases where you might actually _want_ such a conversion already require casting anyway thanks to the rules about integer promotions and narrowing conversions (e.g. when adding or subtracting from chars). Regardless, it would have to be well-written DIP with a clean transition scheme. Having that DIP on removing the implicit conversion of integer and character literals to bool be accepted would be a start in the right direction though. If that gets rejected (which I sure hope that it isn't), then there's probably no hope for a DIP fixing the char situation. - Jonathan M Davis
We need to avoid the situation where we have to create a DIP for every unwanted implicit conversion that causes the wrong overload to be called; we need a better way of doing this. No one wants to wait a year for DIP approval for something as minor as deprecating an implicit conversion for native data types.

I think a better course of action is to introduce the keywords explicit and implicit. Not as attributes though! I don't want to see functions with @nogc nothrow @safe pure @explicit, as that is too much verbiage and hard to read! Which also brings up the question of which parameter exactly is explicit?

It is much easier to read: void example(int bar, explicit int bob)

The explicit keyword will become very important if we are to introduce the implicit keyword, as both of them are instrumental in creating types with structs.

I don't mind writing a DIP regarding this, as I think it is much more likely to be accepted than the other one that I currently have.

-Alexander
Nov 05 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 3:08:24 PM MST 12345swordy via Digitalmars-d 
wrote:
 We need to avoid the situation where we have to create a DIP for
 every unwanted implicit conversion with regards to calling the
 wrong overload function, we need
 better way of doing this. No one wants to wait a year for a DIP
 approval for something that is very minor such as deprecating a
 implicit conversion for native data types.

 I think a better course of action is to introduce the keywords
 explicit and implicit. Not as attributes though! I don't want to
 see functions with  nogc  nothrow safe pure  explicit as that is
 too much verbiage and hard to read! Which brings up the question
 of which parameter exactly is explicit?

 It much easier to read: void example(int bar, explicit int bob)

 The explicit keyword will become very important if we are to
 introduce the implicit keyword, as both of them are instrumental
 in creating types with structs.

 I don't mind writing a DIP regarding this, as I think this is
 much easier for the DIP to be accepted then the other one that I
 currently have.
This really shouldn't be decided on a per-function basis. It's an issue on the type level and should be fixed with the types themselves. The OP's problem didn't even happen with a function. It happened with a built-in operator.

Regardless, if you attempt to add keywords to the language at this point, you will almost certainly lose. I would be _very_ surprised to see Walter or Andrei go for it. Whether you think attributes are easy to read or not, they don't eat up an identifier, and Walter and Andrei consider that to be very important. AFAIK, they also don't consider attributes to be a readability problem. So, even if trying to add some sort of implicit or explicit marker to parameters made sense (and I really don't think that it does), I think that Walter and Andrei have made it pretty clear that that sort of thing would have to be an attribute and not a keyword.

And honestly, I think that any DIP trying to add general control over implicit and explicit conversions in the language has a _way_ lower chance of being accepted than one that gets rid of implicit conversions between character types and integer types. However, in the end, one does not depend on the other or even really have much to do with the other. A DIP to fix the implicit conversions between character types and integer types would be a DIP to fix precisely that, whereas a DIP to mark parameters with implicit or explicit would be about trying to control implicit or explicit conversions in general and not about character or integer types specifically. So while they might be tangentially related, they're really separate issues.

Given the recent DIP on copy constructors and the discussion there, it would not surprise me to see a future DIP about adding implicit to constructors to allow for implicit construction, though I don't know how likely it is for such a DIP to be accepted, given that D's approach (outside of built-in types anyway) has generally been to avoid implicit conversions to reduce the risk of bugs, and when combined with alias this, things really start to get interesting. But I would think that the chances of that getting accepted are far greater than adding attributes to parameters (be they keywords or actual attributes). Regardless, that's an issue of conversions in general, and not just implicit conversions between character types and integer types, which is really what the discussion is about fixing here, and that can be fixed regardless of what happens with providing additional control over implicit conversions in general.

- Jonathan M Davis
Nov 05 2018
parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 5 November 2018 at 23:00:23 UTC, Jonathan M Davis 
wrote:

 Regardless, if you attempt to add keywords to the language at 
 this point, you will almost certainly lose. I would be _very_ 
 surprised to see Walter or Andrei go for it.
What exactly makes you say that? I skimmed the old DIPs that were rejected on the wiki and on GitHub, and they seem to have been rejected for other reasons. Is there a previous discussion that you (or others) can link to?
 Whether you think attributes are easy to read or not, they 
 don't eat up an identifier, and Walter and Andrei consider that 
 to be very important.
The feature that I have in mind requires it to be a keyword, as it is in C#:
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/explicit
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/implicit

Even C++ has type conversions: http://www.cplusplus.com/doc/tutorial/typecasting/

Bear in mind though that I am still in the brainstorming and research stage regarding this. Don't expect the details from me, as I haven't figured everything out yet.
 AFAIK, they also don't consider attributes to be a readability 
 problem. So, even if trying to add some sort of implicit or 
 explicit marker to parameters made sense (and I really don't 
 think that it does), I think that Walter and Andrei have made 
 it pretty clear that that sort of thing would have to be an 
 attribute and not a keyword.
I don't want the implicit and explicit keywords to be attributes in the DIP I am going to write, unless I really have to in order to get the DIP approved by Walter and Andrei.
 And honestly, I think that any DIP trying to add general 
 control over implicit and explict conversions in the language 
 has a _way_ lower chance of being accepted than one that gets 
 rid of implicit conversions between character types and integer 
 types. However, in the end, one does not depend on the other or 
 even really have much to do with the other. A DIP to fix the 
 implicit conversions between character types and integer types 
 would be a DIP to fix precisely that, whereas a DIP to mark 
 parameters with implicit or explicit would be about trying to 
 control implicit or explicit conversions in general and not 
 about character or integer types specifically, so while they 
 might be tangentially related, they're really separate issues.
That is very good point. Well consider that when writing the dip.
 Given the recent DIP on copy constructors and the discussion 
 there, it would not surprise me to see a future DIP about 
 adding  implicit to constructors to allow for implicit 
 construction, though I don't know how likely it is for such a 
 DIP to be accepted given that D's approach (outside of built-in 
 types anyway) has generally been to avoid implicit conversions 
 to reduce the risk of bugs, and when combined with alias this, 
 things really start to get interesting.
Gah, don't remind me of alias this. That can of worms has yet to be opened with multi alias this. Which, btw, is STILL not implemented yet! Hell, I will sign up for the next round of community fund projects just to implement the damn thing, because I am that impatient.
 But I would think that the chances of that getting accepted are 
 far greater than adding attributes to parameters (be they 
 keywords or actual attributes). Regardless, that's an issue of 
 conversions in general, and not just implicit conversions 
 between character types and integer types, which is really what 
 the discussion is about fixing here, and that can be fixed 
 regardless of what happens with providing additional control 
 over implicit conversions in general.

 - Jonathan M Davis
Sure thing, though the DIP process for deprecation of small features shouldn't be that slow! -Alex
Nov 05 2018
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 7:47:42 PM MST 12345swordy via Digitalmars-d 
wrote:
 On Monday, 5 November 2018 at 23:00:23 UTC, Jonathan M Davis

 wrote:
 Regardless, if you attempt to add keywords to the language at
 this point, you will almost certainly lose. I would be _very_
 surprised to see Walter or Andrei go for it.
What exactly makes you say that? I had scan read the old dips that were rejected on the wiki and on the github, and it seems to be rejected for other reasons. Is there previous discussion that you(or others) can linked to?
I would have to go digging through the newsgroup history. It's come up on a number of occasions in various threads, and I couldn't say which at this point. But the very reason that we started putting @ on things in the first place was to avoid creating keywords. We did it before user-defined attributes were even a thing. And for years now, any time that it's been considered to add any kind of attribute, it always starts with @. It has been years since anything involving adding a new keyword has gotten anywhere. Similarly, Walter has shot down the idea of using contextual keywords (which you can probably find discussions on pretty easily by searching the newsgroup history).

So, if you want to create a DIP that proposes adding implicit and explicit as keywords, feel free to do so, but from what I know of Walter and Andrei's position on the topic of keywords - from what they've said in the newsgroup or in any discussions that I've had with them in person - they're not going to be interested in adding new keywords when adding an attribute that starts with @ will work, because adding keywords means restricting the list of available identifiers, whereas adding new attributes does not. I honestly do not expect that D2 will _ever_ get any additional keywords and would be very surprised if it ever did.
 Sure thing, though the DIP process for deprecation of small
 features shouldn't be that slow!
I won't claim that the DIP process couldn't or shouldn't be improved, but at least we now have a DIP process that actually works, even if it can be slow. With the old process, DIPs basically almost never went anywhere. A few did, but most weren't ever even seriously reviewed by Walter and Andrei. While it may not be perfect, the current process is an _enormous_ improvement.

- Jonathan M Davis
Nov 05 2018
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/5/18 4:11 PM, Jonathan M Davis wrote:
 n Monday, November 5, 2018 9:31:56 AM MST H. S. Teoh via Digitalmars-d
 wrote:
 On Mon, Nov 05, 2018 at 03:58:40PM +0000, Adam D. Ruppe via Digitalmars-d
wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 I believe that following code shouldn't even compile, but it does
 and gives non-printable symbol appended at the end of string.
Me too, this is a design flaw in the language. Following C's example, int and char can convert to/from each other. So string ~ int will convert int to char (as in reinterpret cast) and append that. It is just the way it is, alas.
I have said before, and will continue to say, that I think implicit conversion between char and non-char types in D does not make sense. In C, converting between char and int is very common because of the conflation of char with byte, but in D we have explicit types for byte and ubyte, which should take care of any of those kinds of use cases, and char is explicitly defined to be a UTF8 code unit. Now sure, there are cases where you want to get at the numerical value of a char -- that's what cast(int) and cast(char) is for. But *implicitly* converting between char and int, especially when we went through the trouble of defining a separate type for char that stands apart from byte/ubyte, does not make any sense to me. This problem is especially annoying with function overloads that take char vs. byte: because of implicit conversion, often the wrong overload ends up getting called WITHOUT ANY WARNING. Once, while refactoring some code, I changed a representation of an object from char to a byte ID, but in order to do the refactoring piecemeal, I needed to overload between byte and char so that older code will continue to compile while the refactoring is still in progress. Bad idea. All sorts of random problems and runtime crashes happened because C's stupid int conversion rules were liberally applied to D types, causing a gigantic mess where you never know which overload will get called. (Well OK, it's predictable if you sit down and work it out, but it's just plain annoying when a lousy char literal calls the byte overload whereas a char variable calls the char overload.) I ended up having to wrap the type in a struct just to stop the implicit conversion from tripping me up.
+1 Unfortunately, I don't know how reasonable it is to fix it at this point, much as I would love to see it fixed. Historically, I don't think that Walter could have been convinced, but based on some of the stuff he's said in recent years, I think that he'd be much more open to the idea now. However, even if he could now be convinced that ideally the conversion wouldn't exist, I don't know how easy it would be to get a DIP through when you consider the potential code breakage. But maybe it's possible to do it in a smooth enough manner that it could work - especially when many of the kind of cases where you might actually _want_ such a conversion already require casting anyway thanks to the rules about integer promotions and narrowing conversions (e.g. when adding or subtracting from chars). Regardless, it would have to be well-written DIP with a clean transition scheme. Having that DIP on removing the implicit conversion of integer and character literals to bool be accepted would be a start in the right direction though. If that gets rejected (which I sure hope that it isn't), then there's probably no hope for a DIP fixing the char situation.
It's not just ints to chars, but chars to wchars or dchars, and wchars to dchars.

Basically, a character type should not convert from any other type. Period. Because it's not "just a number" in a different format.

Do we need a DIP? Probably. But we have changed these kinds of things in the past from what I remember (I seem to recall we at one point had implicit truncation when adding 2 smaller numbers together). It is possible to still fix this.

-Steve
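[A quick sketch of the conversions listed above, assuming current compiler behavior: each of these compiles today without complaint.]

```
void main()
{
    char  c = 'a';
    wchar w = c;  // char implicitly converts to wchar
    dchar d = w;  // wchar implicitly converts to dchar
    dchar e = c;  // char implicitly converts to dchar
    char  f = 65; // and an in-range integer constant converts to char
}
```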
Nov 05 2018
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Nov 05, 2018 at 05:43:19PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 It's not just ints to chars, but chars to wchars or dchars, and wchars
 to dchars.
 
 Basically a character type should not convert from any other type.
 period.  Because it's not "just a number" in a different format.
+1. I recall having this conversation before. Was this ever filed as a bug? I couldn't find it this morning when I tried to look.
 Do we need a DIP? Probably. but we have changed these types of things
 in the past from what I remember (I seem to recall we had at one point
 implicit truncation for adding 2 smaller numbers together). It is
 possible to still fix.
[...]

If it's possible to fix, I'd like to see it fixed. So far, I don't recall hearing anyone strongly oppose such a change; all objections appear to come only from the fear of breaking existing code.

Some things to consider:

- What this implies for the "if C code is compilable as D, it must have the same semantics" philosophy that Walter appears to be strongly insistent on. Basically, anything that depends on C's conflation of char and (u)byte must either give an error or give the correct semantics.

- The possibility of automatically fixing code broken by the change (possibly partially, leaving corner cases as errors to be handled by the user -- the idea being to eliminate the rote stuff and only require user intervention in the tricky cases). This may be a good and simple use case for building a tool that could do something like that. This isn't the first time potential code breakage has threatened an otherwise beneficial language change, where having an automatic source upgrade tool could alleviate many of the concerns.

- Once we start making a clear distinction between char types and non-char types, will char types still obey C-like int promotion rules, or should we consider discarding old baggage that's no longer so applicable to modern D? For example, I envision that this DIP would make int + char or char + int illegal, but what should the result of char + char or char + wchar be? I'm tempted to propose outright banning char arithmetic without casting, but for some applications this might be too onerous. If we continue to follow C rules, char + char would implicitly promote to dchar, which arguably could be annoying.

T

--
"Computer Science is no more about computers than astronomy is about telescopes." -- E.W. Dijkstra
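[For reference, a quick check of what the current rules actually do, assuming present-day DMD behavior: char operands go through the usual integer promotions, so char + char yields int, and narrowing back to char already requires a cast.]

```
void main()
{
    char a = 'A';
    char b = 'B';
    static assert(is(typeof(a + b) == int)); // char + char promotes to int
    static assert(is(typeof(a + 1) == int)); // char + int promotes to int
    // char c = a + 1;           // error: narrowing back to char needs a cast
    char c = cast(char)(a + 1);  // explicit cast required today
}
```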
Nov 05 2018
prev sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 5, 2018 4:14:18 PM MST H. S. Teoh via Digitalmars-d 
wrote:
 On Mon, Nov 05, 2018 at 05:43:19PM -0500, Steven Schveighoffer via
 Digitalmars-d wrote: [...]

 It's not just ints to chars, but chars to wchars or dchars, and wchars
 to dchars.

 Basically a character type should not convert from any other type.
 period.  Because it's not "just a number" in a different format.
+1. I recall having this conversation before. Was this ever filed as a bug? I couldn't find it this morning when I tried to look.
 Do we need a DIP? Probably. but we have changed these types of things
 in the past from what I remember (I seem to recall we had at one point
 implicit truncation for adding 2 smaller numbers together). It is
 possible to still fix.
[...] If it's possible to fix, I'd like to see it fixed. So far, I don't recall hearing anyone strongly oppose such a change; all objections appear to be only coming from the fear of breaking existing code. Some things to consider: - What this implies for the "if C code is compilable as D, it must have the same semantics" philosophy that Walter appears to be strongly insistent on. Basically, anything that depends on C's conflation of char and (u)byte must either give an error, or give the correct semantics.
I'm pretty sure that the change would just result in more errors, so I don't think that it would cause problems on this front.
 - The possibility of automatically fixing code broken by the change
   (possibly partial, leaving corner cases as errors to be handled by the
   user -- the idea being to eliminate the rote stuff and only require
   user intervention in the tricky cases).  This may be a good and simple
   use-case for building a tool that could do something like that.  This
   isn't the first time potential code breakage threatens an otherwise
   beneficial language change, where having an automatic source upgrade
   tool could alleviate many of the concerns.
An automatic tool would be nice, but I don't know that focusing on that would be helpful, since it would be making it seem like the amount of breakage was large, which would make the change seem less acceptable. Regardless, the breakage couldn't be immediate. It would have to be some sort of deprecation warning first - possibly similar to whatever was done with the integer promotion changes a few releases back, though I never understood what happened there.
 - Once we start making a clear distinction between char types and
   non-char types, will char types still obey C-like int promotion rules,
   or should we consider discarding old baggage that's no longer so
   applicable to modern D?  For example, I envision that this DIP would
   make int + char or char + int illegal, but what should the result of
   char + char or char + wchar be?  I'm tempted to propose outright
   banning char arithmetic without casting, but for some applications
   this might be too onerous.  If we continue follow C rules, char + char
   would implicitly promote to dchar, which arguably could be annoying.
Well, as I understand it, the fact that char + char -> int is related to how the CPU works, and having it become char + char -> char would be a problem from that perspective. Having char + char -> dchar would also go against the whole idea that char is an encoding, because adding two chars together isn't necessarily going to get you a valid dchar.

In reality though, I would expect reasonable code to be adding ints to a char, because you're going to get stuff like x - 48 to convert ASCII digits to integers. And honestly, adding two chars together doesn't even make sense. What does that even mean? 'A' + 'Q' does what? It's nonsense. Ultimately, I think that it would be too large a change to disallow it (and _maybe_ someone out there has some weird use case where it sort of makes sense), but I don't see how it makes any sense to actually do it.

So, making it so that adding two chars together continues to result in an int makes the most sense to me, as does adding an int and a char (which is the operation that code is actually going to be doing). Code can then cast back to char (which is what it already has to do now anyway). It allows code to continue to function as it has (thus reducing how disruptive the changes are), but if we eliminate the implicit conversions, we eliminate the common bugs.

I think that you'll get _far_ stronger opposition to trying to change the arithmetic operations than to changing the implicit conversions, and I also think that the gains are far less obvious. So basically, I wouldn't advise mucking around with the arithmetic operations. I'd suggest simply making it so that implicitly converting between character types and any other type (unless explicitly defined by something like alias this) is disallowed. Given that you already have to cast with the arithmetic stuff (at least to get it back into char), I'm pretty sure the result would be that almost all of the code that would have to be changed would be code that was either broken or a code smell, which would probably make it a lot easier to convince Walter to make the change.

- Jonathan M Davis
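[A small sketch of the kind of arithmetic described above -- converting between a digit character and its numeric value, with the cast back to char that the narrowing rules already force:]

```
void main()
{
    char c = '7';
    int digit = c - '0';  // char operands promote to int; digit == 7
    assert(digit == 7);

    int value = 3;
    char back = cast(char)(value + '0'); // cast needed to narrow int back to char
    assert(back == '3');
}
```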
Nov 05 2018
prev sibling next sibling parent lngns <contact lngnslnvsk.net> writes:
It adds the equivalent char value, which is interpreted as ASCII 
when printing.
You can append an object of its underlying (or a compatible) type 
to an array.

```
void main()
{
     import std.stdio : writeln;

     writeln("hello " ~ 42); //hello *
     writeln([42, 56] ~ 7); //[42, 56, 7]
}
```
Nov 05 2018
prev sibling next sibling parent Paul Backus <snarwin gmail.com> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange result. 
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
 The same problem is encountered even without [enum]. Just using 
 plain integer value gives the same. Is it a bug or someone 
 realy could rely on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
It seems like the integer 2 is being implicitly converted to a char -- specifically, the character U+0002 Start of Text. Normally, a ulong wouldn't be implicitly convertible to a char, but compile-time constants appear to get special treatment as long as their values are in the correct range.

If you try it with a number too big to fit in a char, you get an error:

void main()
{
    import std.stdio;
    writeln("test " ~ 256);
}

Error: incompatible types for ("test ") ~ (256): string and int
Nov 05 2018
prev sibling next sibling parent lithium iodate <whatdoiknow doesntexist.net> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange result. 
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
 The same problem is encountered even without [enum]. Just using 
 plain integer value gives the same. Is it a bug or someone 
 realy could rely on this behaviour?
As long as the integral value is statically known to be a valid code point and fits into the numerical range of the char type (which is plain 'char' in this case), automatic conversion is done. If you replace the values inside your enum with something bigger (> 255) or negative, you will see that the compiler doesn't universally allow all such automatic integral->char conversions.

You can also see this effect when you declare a mutable vs. an immutable integer and try to append it to a regular string: the mutable one will fail. (Anything that can be larger than 1114111 will always fail, as far as I can tell.)

Some consider this useful behavior, but it's not uncontroversial.
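[A quick sketch of the mutable vs. immutable point above, as an illustration only (exact behavior may vary by compiler version): the conversion depends on the value being known at compile time.]

```
void main()
{
    immutable int a = 65;
    int b = 65;
    string s = "x" ~ a;    // compiles: the value is known and fits in char
    // string t = "x" ~ b; // error: a mutable int's value is not known statically
}
```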
Nov 05 2018
prev sibling parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange result. 
 I believe that following code shouldn't even compile, but it 
 does and gives non-printable symbol appended at the end of 
 string.
 The same problem is encountered even without [enum]. Just using 
 plain integer value gives the same. Is it a bug or someone 
 realy could rely on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp, with the recent rejection of DIP 1015, I don't see this being deprecated any time soon. -Alex
Nov 12 2018
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the following 
 without using [std.conv: to] and get strange result. I believe that 
 following code shouldn't even compile, but it does and gives 
 non-printable symbol appended at the end of string.
 The same problem is encountered even without [enum]. Just using plain 
 integer value gives the same. Is it a bug or someone realy could rely 
 on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon. -Alex
If we deprecate that, we also need to deprecate:

    string res = `Number value: ` ~ 65;

Not saying we shouldn't, just that there are many implications.


Andrei
Nov 12 2018
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 12 November 2018 at 20:23:42 UTC, Andrei Alexandrescu 
wrote:
 If we deprecate that we also need to deprecate:

     string res = `Number value: ` ~ 65;

 Not saying we shouldn't, just that there are many implications.
I'd call that a good thing: many people are surprised when ~ x doesn't do ~ to!string(x), and besides, a number isn't really a character. I'd be happy if you had to write: `Number value: ` ~ char(65).
Nov 12 2018
prev sibling next sibling parent 12345swordy <alexanderheistermann gmail.com> writes:
On Monday, 12 November 2018 at 20:23:42 UTC, Andrei Alexandrescu 
wrote:
 On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the 
 following without using [std.conv: to] and get strange 
 result. I believe that following code shouldn't even compile, 
 but it does and gives non-printable symbol appended at the 
 end of string.
 The same problem is encountered even without [enum]. Just 
 using plain integer value gives the same. Is it a bug or 
 someone realy could rely on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon. -Alex
If we deprecate that we also need to deprecate: string res = `Number value: ` ~ 65; Not saying we shouldn't, just that there are many implications. Andrei
We could replace

    string res = `Number value: ` ~ 65;

with:

    string res = `Number value: ` ~ 65.ToString();

which produces the string `Number value: 65`, via extension methods with compile-time reflection. (Which I am very excited to see with your upcoming DIP that overhauls compile-time reflection!) This displays the intent of converting 65 to its literal string equivalent.

Alex
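[For comparison, a minimal sketch of the explicit conversion that already exists today via std.conv, which is roughly what the hypothetical ToString above would correspond to:]

```
import std.conv : to;
import std.stdio : writeln;

void main()
{
    string res = `Number value: ` ~ to!string(65); // or, UFCS style: 65.to!string
    writeln(res); // Number value: 65
}
```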
Nov 12 2018
prev sibling next sibling parent bachmeier <no spam.net> writes:
On Monday, 12 November 2018 at 20:23:42 UTC, Andrei Alexandrescu 
wrote:

 If we deprecate that we also need to deprecate:

     string res = `Number value: ` ~ 65;

 Not saying we shouldn't, just that there are many implications.


 Andrei
I sure hope that happens. As I wrote above, this bit me when I started using the language, and it didn't leave a good impression.
Nov 12 2018
prev sibling next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 12, 2018 1:23:42 PM MST Andrei Alexandrescu via 
Digitalmars-d wrote:
 On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the following
 without using [std.conv: to] and get strange result. I believe that
 following code shouldn't even compile, but it does and gives
 non-printable symbol appended at the end of string.
 The same problem is encountered even without [enum]. Just using plain
 integer value gives the same. Is it a bug or someone realy could rely
 on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon. -Alex
If we deprecate that we also need to deprecate: string res = `Number value: ` ~ 65; Not saying we shouldn't, just that there are many implications.
And honestly, that's _exactly_ the sort of expression that we'd be looking to have deprecated, because it usually causes bugs. And it just gets worse in more complex expressions (ones involving the ternary operator seem to be particularly popular from what I recall).

D specifically made character types separate from integer types, because character types have a distinct meaning separate from integer types, and then it shot itself in the foot by allowing them to more or less freely mix via implicit conversions. The main saving grace is the fact that most code uses char, and the arithmetic expressions result in int, so to assign back, you need to cast because it's a narrowing conversion. So, a lot of the casts that we'd want to be required when converting between integer types and character types are fortunately required anyway, but stuff like ~ gets around it in many cases, and if you're using dchar, you're not protected by narrowing conversions. So, the problem still exists, and it still causes bugs.

Many of us see the fact that code like

    string res = `Number value: ` ~ 65;

compiles as wholly negative, though based on what Walter has said in the past, unless something has changed, I'm sure that he does not agree on that count, and the fact that this DIP on bool was rejected on the grounds that you guys think that bool should be treated as an integer type does make it sound like it's going to be difficult to convince you that the character types shouldn't be treated as integer types.

- Jonathan M Davis
Nov 12 2018
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/12/18 3:23 PM, Andrei Alexandrescu wrote:
 On 11/12/18 3:01 PM, 12345swordy wrote:
 On Monday, 5 November 2018 at 15:36:31 UTC, uranuz wrote:
 Hello to everyone! By mistake I typed some code like the following 
 without using [std.conv: to] and get strange result. I believe that 
 following code shouldn't even compile, but it does and gives 
 non-printable symbol appended at the end of string.
 The same problem is encountered even without [enum]. Just using plain 
 integer value gives the same. Is it a bug or someone realy could rely 
 on this behaviour?

 import std.stdio;

 enum TestEnum: ulong {
    Item1 = 2,
    Item3 = 5
 }

 void main()
 {
     string res = `Number value: ` ~ TestEnum.Item1;
     writeln(res);
 }

 Output:
 Number value: 
Welp with the recent rejection of the DIP 1005, I don't see this being deprecated any time soon.
If we deprecate that we also need to deprecate:     string res = `Number value: ` ~ 65; Not saying we shouldn't, just that there are many implications.
I'm wondering if you realized what you are saying there. Like "if we deprecate one crappy behavior, we have to deprecate all the crappy behavior" Yes, please. -Steve
Nov 13 2018