
digitalmars.D - Probably trivial, but VERY frustrating compiler bug

reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
https://issues.dlang.org/show_bug.cgi?id=18269

In a nutshell: the exact .stringof of certain function symbols changes
depending on which overload was processed first.  From what I can tell,
it's caused by certain distinct function overloads having the same deco
in the symbol table, so apparently some cache somewhere in the compiler
collides, and the overload that gets processed first will take
precedence.

It's most likely a trivial fix, but it has been at least 3 years since
the problem cropped up, and it has been blocking at least 2 Phobos PRs:

https://github.com/dlang/phobos/pull/5797
https://github.com/dlang/phobos/pull/7556

The lack of progress has been very frustrating, to say the least, so I'm
raising a stink here to see if somebody will do something about it. (I
would, but I've already spent enough of my very limited free time trying
to push through a trivial Phobos change, only to end up having to track
down an obscure problem that has nothing to do with the fix in the first
place.)


T

-- 
MSDOS = MicroSoft's Denial Of Service
Aug 13 2020
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 13 August 2020 at 22:14:15 UTC, H. S. Teoh wrote:
 In a nutshell: the exact .stringof of certain function symbols 
 changes depending on which overload was processed first.
wait why the heck is it using .stringof at all? This inside dmd itself? I know it uses mangle toChars in places but the stringof thing should be for user diagnostics only, no internal use.

that might be the real bug: perhaps it just called toChars when it should have called toMangle or whatever the function is called.
Aug 13 2020
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 10:20:19PM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Thursday, 13 August 2020 at 22:14:15 UTC, H. S. Teoh wrote:
 In a nutshell: the exact .stringof of certain function symbols
 changes depending on which overload was processed first.
wait why the heck is it using .stringof at all? This inside dmd itself? I know it uses mangle toChars in places but the stringof thing should be for user diagnostics only, no internal use.
[...] Well, Phobos has a unittest in std.format that relies on the exact output of .stringof. Come to think of it, maybe we should just kill that unittest with fire.
:-(
T

-- 
He who does not appreciate the beauty of language is not worthy to bemoan its flaws.
Aug 13 2020
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 13 August 2020 at 22:33:04 UTC, H. S. Teoh wrote:
 Well, Phobos has a unittest in std.format that relies on the 
 exact output of .stringof.
yeah stringof should NEVER be used for anything other than diagnostics intended for a human reader along with other info... its exact format is undefined!
 Come to think of it, maybe we should just kill that unittest 
 with fire.
tbh any unittest that tests specific strings is probably suspect.
Aug 13 2020
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Aug 13, 2020 at 10:43:03PM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Thursday, 13 August 2020 at 22:33:04 UTC, H. S. Teoh wrote:
 Well, Phobos has a unittest in std.format that relies on the exact
 output of .stringof.
yeah stringof should NEVER be used for anything other than diagnostics intended for a human reader along with other info... its exact format is undefined!
 Come to think of it, maybe we should just kill that unittest with
 fire.
tbh any unittest that tests specific strings is probably suspect.
I found two unittests that test for exact .stringof format:

std/format.d:2152	assert(to!string(&bar) == "int delegate(short) @nogc delegate() pure nothrow @system");
std/format.d:4802	version (linux) formatTest( &func, "void delegate() @system" );

The second is the one giving me trouble, and somebody has already put in a `version(linux)` hack; presumably it fails on one of the non-Linux machines in the auto-tester.

I'm *very* tempted to change that line to "version(none)" right now...


T

-- 
Heads I win, tails you lose.
Aug 13 2020
parent FeepingCreature <feepingcreature gmail.com> writes:
On Thursday, 13 August 2020 at 23:14:41 UTC, H. S. Teoh wrote:
 std/format.d:4802	version (linux) formatTest( &func, "void 
 delegate() @system" );

 The second is the one giving me trouble, and already somebody 
 has put in a `version(linux)` hack, presumably it fails in one 
 of the non-linux machines in the auto-tester.

 I'm *very* tempted to change that line to "version(none)" right 
 now...


 T
Just delete it, imo. Let git history serve if people want to look at it. version(none) tests nothing and documents nothing other than "A test was once here."
Aug 14 2020
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 8/13/20 6:20 PM, Adam D. Ruppe wrote:
 On Thursday, 13 August 2020 at 22:14:15 UTC, H. S. Teoh wrote:
 In a nutshell: the exact .stringof of certain function symbols changes 
 depending on which overload was processed first.
wait why the heck is it using .stringof at all? This inside dmd itself? I know it uses mangle toChars in places but the stringof thing should be for user diagnostics only, no internal use.

that might be the real bug: perhaps it just called toChars when it should have called toMangle or whatever the function is called.

What is the correct way to get a string representation of a type if not T.stringof? This is inside Phobos, not DMD.

-Steve
Aug 13 2020
parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Friday, 14 August 2020 at 03:56:38 UTC, Steven Schveighoffer 
wrote:
 On 8/13/20 6:20 PM, Adam D. Ruppe wrote:
 On Thursday, 13 August 2020 at 22:14:15 UTC, H. S. Teoh wrote:
 In a nutshell: the exact .stringof of certain function 
 symbols changes depending on which overload was processed 
 first.
wait why the heck is it using .stringof at all? This inside dmd itself? I know it uses mangle toChars in places but the stringof thing should be for user diagnostics only, no internal use.
What is the correct way to get a string representation of a type if not T.stringof? This is inside Phobos, not DMD. -Steve
"T". The obvious question is "why do you want a string representation of a type"? For debugging? stringof. For later accessing? Find a way to make an alias of the type visible at that spot, then use the alias's name.
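
For the "later accessing" case, here is a minimal sketch of what using the alias looks like. The `Wrapper` template is hypothetical and made up for illustration; it is not code from Phobos or dmd. The mixin text names the local alias `T` and never touches `T.stringof`, so it does not depend on how the compiler happens to spell the type.

```d
// Hypothetical illustration; not code from Phobos or dmd.
struct Wrapper(T)
{
    // The string refers to the alias `T`, which is in scope here.
    // Writing mixin(T.stringof ~ " value;") instead would depend on the
    // exact, unspecified text of .stringof and on that name being
    // visible in this scope.
    mixin("T value;");
}

unittest
{
    Wrapper!(int[]) w;
    w.value = [1, 2, 3];
    static assert(is(typeof(w.value) == int[]));
    assert(w.value.length == 3);
}
```

Compiling with `-unittest -main` should run the check above.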
Aug 13 2020
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Aug 14, 2020 at 05:26:06AM +0000, FeepingCreature via Digitalmars-d
wrote:
 On Friday, 14 August 2020 at 03:56:38 UTC, Steven Schveighoffer wrote:
 On 8/13/20 6:20 PM, Adam D. Ruppe wrote:
 On Thursday, 13 August 2020 at 22:14:15 UTC, H. S. Teoh wrote:
 In a nutshell: the exact .stringof of certain function symbols
 changes depending on which overload was processed first.
wait why the heck is it using .stringof at all? This inside dmd itself? I know it uses mangle toChars in places but the stringof thing should be for user diagnostics only, no internal use.
What is the correct way to get a string representation of a type if not T.stringof? This is inside Phobos, not DMD. -Steve
"T". The obvious question is "why do you want a string representation of a type"? For debugging? stringof. For later accessing? Find a way to make an alias of the type visible at that spot, then use the alias's name.
This is std.format, it's trying to print a user-readable string that represents the type name. Using .stringof is precisely what .stringof was made for, isn't it?

The problem comes when overzealous unittesting wants to write a unittest to test that the code is doing its job in calling .stringof. Just like unittests that assert(1+1 == 2) just in case the compiler isn't doing its job, or perhaps the CPU is, heaven forbid, *malfunctioning*, unittests of this sort inevitably cause more frustration than they help.

IOW, the unittest is testing something outside the control of the code it's supposed to be testing, and thereby introduces an extraneous dependency on outside behaviour.


T

-- 
My program has no bugs! Only undocumented features...
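
One way to avoid that dependency, sketched under assumptions: check the structure of the delegate type through std.traits instead of comparing against a hard-coded string. The helper name `isSystemVoidDelegate` and the nested `func` are hypothetical stand-ins, not the actual Phobos test code, and this checks the type only, not what std.format prints.

```d
import std.traits : FunctionAttribute, functionAttributes, Parameters, ReturnType;

/// Hypothetical helper: true if DG is a delegate type taking no
/// parameters, returning void, and carrying the @system attribute.
/// Intended for delegate types only.
enum bool isSystemVoidDelegate(DG) =
    is(DG == delegate)
    && is(ReturnType!DG == void)
    && Parameters!DG.length == 0
    && (functionAttributes!DG & FunctionAttribute.system) != 0;

unittest
{
    int counter;
    void func() @system { ++counter; } // hypothetical stand-in for the tested delegate

    // No string comparison: the result does not depend on .stringof.
    static assert(isSystemVoidDelegate!(typeof(&func)));
    static assert(!isSystemVoidDelegate!(int delegate(short)));
}
```

The design choice here is to assert on properties the language defines (return type, parameters, attributes) and leave the exact text to the compiler.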
Aug 13 2020
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 8/14/20 1:26 AM, FeepingCreature wrote:
 On Friday, 14 August 2020 at 03:56:38 UTC, Steven Schveighoffer wrote:
 On 8/13/20 6:20 PM, Adam D. Ruppe wrote:
 On Thursday, 13 August 2020 at 22:14:15 UTC, H. S. Teoh wrote:
 In a nutshell: the exact .stringof of certain function symbols 
 changes depending on which overload was processed first.
wait why the heck is it using .stringof at all? This inside dmd itself? I know it uses mangle toChars in places but the stringof thing should be for user diagnostics only, no internal use.
What is the correct way to get a string representation of a type if not T.stringof? This is inside Phobos, not DMD.
"T". The obvious question is "why do you want a string representation of a type"? For debugging? stringof. For later accessing? Find a way to make an alias of the type visible at that spot, then use the alias's name.
Serialization is the only thing I can think of that both requires you to generate a string for each type and requires that string to be consistent across versions. I mean, you can invent a naming scheme, but if the compiler can provide it, it's much easier to deal with.

I can use the rat's nest that is fullyQualifiedName. But if the compiler had a strict definition of T.stringof (and bonus, if it did what fullyQualifiedName did), then one could actually use it in cases where one needs a string representation of the type.

As of now, there are three ways the compiler can give you names of a type:

1. T.stringof
2. typeid(T).name
3. __traits(identifier, T)

It would be awesome if all of these were consistent and well-defined.

I just looked at one place where I did some serialization that required storing the type name, and I actually had to follow a specific protocol, so I didn't need this.

My question really was that -- a question, not an argument.

-Steve
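
To compare these side by side, a small sketch can collect them for one type. The `Point` struct and the `typeNames` helper are hypothetical names chosen for illustration, and the sketch is restricted to structs and classes, where all of these expressions compile. It deliberately makes no claim about the exact text of `.stringof` or `typeid(T).name`.

```d
import std.traits : fullyQualifiedName;

struct Point { int x, y; } // hypothetical example type

/// Hypothetical helper: gather the compiler-provided names of T,
/// keyed by the expression that produced them.
string[string] typeNames(T)()
if (is(T == struct) || is(T == class))
{
    return [
        "stringof":           T.stringof,
        "typeid.name":        typeid(T).name,
        "identifier":         __traits(identifier, T),
        "fullyQualifiedName": fullyQualifiedName!T,
        "mangleof":           T.mangleof
    ];
}

unittest
{
    auto names = typeNames!Point();
    // Only the identifier is asserted; the other strings are printed
    // or inspected by hand, since their format is what is in question.
    assert(names["identifier"] == "Point");
    assert(names.length == 5);
}
```

Printing the result, for example with `writeln(typeNames!Point())`, shows how much the representations differ for the same type.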
Aug 14 2020
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 14 August 2020 at 12:47:01 UTC, Steven Schveighoffer 
wrote:
 As of now, there are three ways the compiler can give you names 
 of a type:
I actually personally use .mangleof when it doesn't have to be user visible... and if it does, I'll just demangle parts of it. The first part, for the name, is very simple: it is decimal-length-prefixed strings, so you join them back with dots and you're done. It is the type info at the end of the string that gets complicated to demangle.

It is pretty well defined and has certain guarantees for linking.
 It would be awesome if all of these were consistent and 
 well-defined.
That said, I'd probably be ok with improving the definitions! I'd still tell people to never use .stringof in mixins but it'd at least be ok for cases like this then.
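
A minimal sketch of the partial demangling described above, assuming the simple case of a mangled name that starts with `_D` followed by plain length-prefixed identifiers. The helper name `dottedNameFromMangle` is hypothetical. It stops at the first character that is not a digit, so template instances, back references, and the trailing type info are not handled.

```d
import std.ascii : isDigit;

/// Hypothetical helper: recover the dotted name from the leading
/// length-prefixed identifiers of a D mangled name.
/// Returns an empty string if the input does not start with "_D".
string dottedNameFromMangle(string mangled)
{
    if (mangled.length < 2 || mangled[0 .. 2] != "_D")
        return null;

    string rest = mangled[2 .. $];
    string result;
    while (rest.length && isDigit(rest[0]))
    {
        // Read the decimal length prefix.
        size_t len = 0;
        while (rest.length && isDigit(rest[0]))
        {
            len = len * 10 + (rest[0] - '0');
            rest = rest[1 .. $];
        }
        if (len > rest.length)
            break; // malformed input: give up and keep what we have

        // Take `len` characters as one identifier and join with a dot.
        if (result.length)
            result ~= ".";
        result ~= rest[0 .. len];
        rest = rest[len .. $];
    }
    return result;
}

unittest
{
    // _D3foo3barFZv: `void bar()` in module `foo`.
    assert(dottedNameFromMangle("_D3foo3barFZv") == "foo.bar");
    assert(dottedNameFromMangle("not a mangle").length == 0);
}
```

For anything beyond this simple case, `core.demangle.demangle` in druntime is the safer route.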
Aug 14 2020
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 8/14/20 9:16 AM, Adam D. Ruppe wrote:
 On Friday, 14 August 2020 at 12:47:01 UTC, Steven Schveighoffer wrote:
 As of now, there are three ways the compiler can give you names of a 
 type:
I actually personally use .mangleof when it doesn't have to be user visible... and if it does, I'll just demangle parts of it (the first part for the name is very simple, it is decimal-length-prefixed strings, then you join them back with dot and you're done, it is the type info at the end of the string that gets complicated to demangle).
I forgot about mangleof! A 4th string representation. But this one is well defined, and unrelated (mostly) to the other 3.
 
 It is pretty well defined and has certain guarantees for linking.
Yes, it depends on the use case for sure.
 That said, I'd probably be ok with improving the definitions! I'd still 
 tell people to never use it in mixins but it'd at least be ok for cases 
 like this then.
Using it in mixins is not a good idea anyway. If you have the type alias locally, use that.

But a consistent string representation would at least allow one to do *something* with that information. In particular, if .stringof and typeid(classInstance).name were consistent, it would make things a lot easier.

That brings up another use case -- RTTI. Like if you wanted to implement Object.factory. A language-sanctioned way to say "when I use this string, I mean this type" unambiguously.

-Steve
Aug 14 2020
prev sibling parent Boris Carvajal <boris2.9 gmail.com> writes:
On Thursday, 13 August 2020 at 22:14:15 UTC, H. S. Teoh wrote:
 https://issues.dlang.org/show_bug.cgi?id=18269

 In a nutshell: the exact .stringof of certain function symbols 
 changes depending on which overload was processed first.  From 
 what I can tell, it's caused by certain distinct function 
 overloads having the same deco in the symbol table, so 
 apparently some cache somewhere in the compiler collides, and 
 the overload that gets processed first will take precedence.
I replied in Bugzilla with what I found some time ago (sorry, I forgot to reply at the time). It's related to the deco thing, like you already said.
Aug 13 2020