www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.internals - Separate compilation identifier mess and the bug trail that ensued

reply Atila Neves <atila.neves gmail.com> writes:
Once upon a time, one could not use __traits(getUnitTests) and 
separate compilation, and it begat 
https://issues.dlang.org/show_bug.cgi?id=16995. The problem was 
that there was a counter being used to name unittests, and it 
varied between running the compiler on all files at once (or in 
groups) versus compiling files separately.

Due to a suggestion in the PR, I emulated what was done for 
naming lambdas and delayed naming the unittest until semantic 
analysis. This begat 
https://issues.dlang.org/show_bug.cgi?id=18097 since now one 
could not longer take the address of a unittest function. Well, 
one could but it wouldn't link because it'd be the wrong name by 
the time semantic was done with it.

I went back and undid the semantic setting of the identifier and 
used the file name in the unittest name instead. That begat 
https://issues.dlang.org/show_bug.cgi?id=18111. The name of the 
file isn't the absolute path, and so depending on how the 
compiler is invoked, one still gets linker errors.

I realised the stupidity of my ways - D already mangles according 
to module, and those names are always the same no matter how one 
compiles. I went back to a simpler time, but without the counter, 
relying on line/column numbers instead. That begat 
https://issues.dlang.org/show_bug.cgi?id=18880. Using `static 
foreach` or mixing in unittests with strings meant that the 
different unittests shared line and column numbers. Oops.

In trying to fix https://issues.dlang.org/show_bug.cgi?id=18868 
Johan got a comment about the bug mentioned above and added a fix 
for that as well. Which breaks __traits(getUnitTests) and 
separate compilation, since it uses a counter to disambiguate 
between identical identifiers. But counters can't and won't work, 
because identifiers don't have fully qualified names until 
semantic, and when they're created they might have the exact same 
identifier as a symbol in a different module. Those don't need to 
be disambiguated of course, but there's no way for the compiler 
to know ahead of time. Chaos ensues.

I've been racking my brain thinking about to solve this. I can't 
use file names or rely on the module name (after the symbol has a 
module attached to it), since that has caused bugs in the past 
and would again. The file name doesn't work unless I somehow 
always get the absolute path. Using the module information means 
that at different points of compilation symbols change names.

Does anyone have a good idea of a way out?
Jul 04 2018
next sibling parent Dominikus Dittes Scherkl <dominikus.scherkl continental-corporation.com> writes:
On Wednesday, 4 July 2018 at 22:15:36 UTC, Atila Neves wrote:
 Once upon a time, one could not use __traits(getUnitTests) and 
 separate compilation, and it begat

 Does anyone have a good idea of a way out?
using a hash? (has to also include loop variable for mixin but otherwise should be reliable)
Jul 23 2018
prev sibling parent reply kinke <noone nowhere.com> writes:
On Wednesday, 4 July 2018 at 22:15:36 UTC, Atila Neves wrote:
 Using `static foreach` or mixing in unittests with strings 
 meant that the different unittests shared line and column 
 numbers. Oops.
To make those corner cases unique and independent from compiler invokations, maybe a stack of module-line-column infos could be used instead of a single one. During semantic, each mixin would push its Loc, and each static foreach iteration would push an artificial one incl. iteration index.
Jul 25 2018
parent reply Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 25 July 2018 at 22:19:41 UTC, kinke wrote:
 On Wednesday, 4 July 2018 at 22:15:36 UTC, Atila Neves wrote:
 Using `static foreach` or mixing in unittests with strings 
 meant that the different unittests shared line and column 
 numbers. Oops.
To make those corner cases unique and independent from compiler invokations, maybe a stack of module-line-column infos could be used instead of a single one. During semantic, each mixin would push its Loc, and each static foreach iteration would push an artificial one incl. iteration index.
If I do it during semantic analysis it would be easy: keep a counter per module. But I can't, due to the bugs reported above. Basically there are situations in which the symbol is used (like taking the address of a unittest) before semantic analysis, where the symbol name would change. Atila
Jul 26 2018
parent reply kinke <noone nowhere.com> writes:
On Thursday, 26 July 2018 at 09:09:07 UTC, Atila Neves wrote:
 If I do it during semantic analysis it would be easy: keep a 
 counter per module.
Ah, I thought you wanted to go further (identical names, independent of defined versions for compiler invokation etc.). I looked at a PR of yours, where you changed back from finalizing the identifier in semantic, assuming that it already was like that originally, but I now see that you moved it to semantic shortly before. ;) So if it has to be done before semantic and we have no clue about the D module at that time, can't we use a counter per Loc (filename, line, column)? Only creating one if there's a unittest/invariant/... declaration to be uniquely named of course, to keep the number of counters reasonably small.
Jul 26 2018
parent kinke <noone nowhere.com> writes:
On Thursday, 26 July 2018 at 18:43:24 UTC, kinke wrote:
 On Thursday, 26 July 2018 at 09:09:07 UTC, Atila Neves wrote:
 If I do it during semantic analysis it would be easy: keep a 
 counter per module.
Ah, I thought you wanted to go further (identical names, independent of defined versions for compiler invokation etc.). I looked at a PR of yours, where you changed back from finalizing the identifier in semantic, assuming that it already was like that originally, but I now see that you moved it to semantic shortly before. ;) So if it has to be done before semantic and we have no clue about the D module at that time, can't we use a counter per Loc (filename, line, column)? Only creating one if there's a unittest/invariant/... declaration to be uniquely named of course, to keep the number of counters reasonably small.
E.g., via a static `int[Loc]` map in `generateIdWithLoc()`, and replacing the current counter per <base identifier, line, column> by a counter per <filename, line, column>, thereby isolating each module via unique path (const char* pointer comparison is most likely enough => bitwise Loc comparison). [The identifier could optionally be used as further key component.]
Jul 26 2018