digitalmars.D - Symbol lookup rules and imports
- H. S. Teoh via Digitalmars-d (226/226) Dec 02 2014 Recently, in a bid to reduce the messy tangle that is the web of
- Meta (16/41) Dec 02 2014 This whole thing is a huge hole in D that needs to be fixed (it
- H. S. Teoh via Digitalmars-d (90/139) Dec 02 2014 Yeah, you can imagine I was *not* happy when, some time ago, I upgraded
- Martin Nowak (5/10) Dec 03 2014 It's equivalent to
- ketmar via Digitalmars-d (3/40) Dec 02 2014 i think that this is a great idea! i already love it.
- H. S. Teoh via Digitalmars-d (6/48) Dec 04 2014 OTOH, Kenji has a different solution already in PR form:
- ketmar via Digitalmars-d (5/52) Dec 04 2014 "Static, renamed, and selective imports are always public"? nononono!
- Mike Parker (3/12) Dec 04 2014 I think you misunderstand. They are public now, which makes them
- ketmar via Digitalmars-d (4/51) Dec 04 2014 p.s. i want wildcard support in `import`! no, i want regex support in
- Martin Nowak (1/1) Dec 02 2014 http://wiki.dlang.org/DIP22
- Paolo Invernizzi (5/13) Dec 03 2014 A big +1
- "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (7/46) Dec 03 2014 Surely this can't be intended behaviour? It seems its imported
- Jacob Carlborg (4/12) Dec 03 2014 I fully agree.
- Dicebot (11/29) Dec 04 2014 +1
- Dicebot (6/16) Dec 04 2014 This is probably single thing I hate most about D module system.
- Rainer Schuetze (45/59) Dec 04 2014 This compiles as expected. I guess you rather mean to import impl from
- H. S. Teoh via Digitalmars-d (35/65) Dec 04 2014 Sorry, poor example. Here's a tested example that produces an error:
- Meta (1/22) Dec 04 2014 My word, I think this example made me physically ill.
- Bruno Medeiros (15/239) Dec 05 2014 I agree too.
- Andrei Alexandrescu (5/13) Dec 07 2014 I agree we need to take a closer look at name lookup now that local
- Walter Bright (5/21) Dec 08 2014 No, it does not. Symbols in the current scope override imports from the ...
- Walter Bright (10/16) Dec 08 2014 No, it does not. Each scope has a symbol table of symbols declared in th...
- H. S. Teoh via Digitalmars-d (27/51) Dec 08 2014 I realize that my understanding on the workings of dmd may not be quite
- Walter Bright (5/10) Dec 08 2014 I understand, my purpose on that post was to correct misunderstanding of...
- ketmar via Digitalmars-d (4/6) Dec 08 2014 e, is=20
- Walter Bright (2/7) Dec 08 2014 I disagree. None of the issues raised have to do with that.
- ketmar via Digitalmars-d (6/16) Dec 08 2014 cope, is
Recently, in a bid to reduce the messy tangle that is the web of interdependencies among Phobos modules, many module-scope imports have been replaced with scoped imports. In addition to reducing gratuitous dependencies (the scoped import is not processed in a template body until the template is actually needed, so if user code doesn't need that template, then it won't pull in a whole bunch of unnecessary Phobos modules), scoped imports also make the code easier to refactor -- you can move things around more easily without breaking it. However, there are major issues with scoped imports currently, that make this otherwise ideal solution less-than-ideal, which stem from the way 'import' is implemented in D. When the compiler encounters an 'import' statement, what it essentially does is to parse the symbols in the target module, and add them to the symbol table for the current scope. While this is a simple, straightforward implementation method, it leads to the following problems: 1) Consider this code: // Quiz for the reader: what does this code do? void myFunc(string text, string number) { import std.conv; import std.stdio; auto x = to!int(number); writeln(text, x); } void main() { myFunc("The count is: ", "123"); } Here's the output: 123 The reason is that `import std.conv;` pulls in *all* the symbols from std.conv into the local scope, including the function called "text". This causes "text" (i.e., std.conv.text) to shadow the parameter "text", so `writeln(text, x);` is interpreted as: std.stdio.writeln(std.conv.text(), x); so the parameter "text" is never used. This is just a simple example; the problem becomes much harder to trace when the shadowed symbol is not this close to the import scope: struct S { string text; ... void method() { import std.conv; ... writeln(text); // uses std.conv.text, not this.text } } Worse yet, suppose we have this code, that today works correctly: struct S { string buf; void format() { buf = ... /* format S's data into string form */ } void method() { import std.file; ... // do some processing based on files format(); // calls this.format() } } Now, suppose at some point in the future, we introduce a function also called format() in std.file, that formats your hard drive. The user code is unchanged. What happens when you recompile the program and run it? It will compile successfully... and then format your hard drive without any warning. This is the crux of the issue behind: https://issues.dlang.org/show_bug.cgi?id=10378 2) OK, so the conclusion is that unqualified scoped imports are dangerous, right? Alright, let's see what happens when we use qualified imports: // mod.d module mod; struct S { // Use a fully-qualified import. // We place it in the body of S because S's methods // repeatedly needs it -- after all, DRY is good, right? import std.format : format; void method1(string fmt) { writeln(format(fmt, ... )); } void method2() { auto s = format("abc %s def", ...); ... } } // main.d module main; import mod; // we need the definition of S void format(S s) { ... /* do something with s */ } void main() { S s; s.format(); // UFCS -- should call main.format(s), right? } Wrong. That last line in main() actually calls std.format.format. Why? Because in mod.d, the `import std.format : format` declaration actually pulls in std.format.format into the symbol table of S, therefore, S.format now refers to std.format.format. This, therefore, hijacks the intended call to main.format. This is: https://issues.dlang.org/show_bug.cgi?id=13808 3) OK, so putting scoped, qualified imports in struct bodies is a bad idea, they should always go into the method bodies, then we'll be OK, right? Sure... until you need to do this: // A struct S { // B void method(R)(R range) if (isInputRange!R) { // C } } This will not compile unless you import std.range : isInputRange somewhere. Putting it in the line marked //A introduces a module-global dependency on std.range, which is what we're trying to eliminate. So that's a no-go. Putting in //C is also a no-go, because we need isInputRange in the sig constraint, which is outside the function body. So the only option is //B. However, if we do that, then the imported symbols will get pulled into S's symbol table, thus you have this situation: // mymod.d module mymod; struct S { // Now everyone is happy, right...? import std.range : isInputRange; void method(R)(R range) if (isInputRange!R) { } } // main.d module main; import mymod; // Well ... void main() { S s; static assert(isInputRange!(int[])); // fails -- as it should static assert(s.isInputRange!(int[])); // works -- WAT? } So even though isInputRange is really only required by the *implementation* of S, yet the consequence of the compiler's implementation of import is that it pollutes S's namespace with the imported symbols in such a way that external code can access said symbols using S as a proxy -- even if S was never intended to act as such a proxy!! And if you used an unqualified import instead, the problem is compounded: // mymod.d module mymod; struct S { import std.algorithm; // oh yes ... void method() { // hey, it does make code easier to write, we // don't have to keep importing std.algorithm // all over the place: auto x = map!(...).filter!(...).sort(); ... } ... } // main.d module main; void main() { // hey, we only need the definition of S, right? import mymod : S; S s; // But we can access all sorts of stuff through S... // (Not the mention UFCS hijacking happening everywhere!) auto r = s.map!(...); auto t = s.filter!(...); auto u = s.canFind(...); // when they shouldn't even work at all!! auto v = canFind(s, ...); // <--- compile error } 4) This isn't the end of the story. There's also this lovely bug: https://issues.dlang.org/show_bug.cgi?id=1238 which, as its number should tell you, has been around for a LONG time. Executive summary: // mymod.d module mymod; private int impl; // main.d module main; import mymod; void impl() { ... } void main() { impl(); // Error: main.impl conflicts with mymod.impl (WAT?) } Basically, the underlying cause is the same. Import pulls the symbols of mymod.d into the current scope (including private ones -- which the compiler currently has to, since otherwise it couldn't compile mymod's references to private symbols). So when you try to reference 'impl', the overload set contains both main.impl and mymod.impl, which is an ambiguity error. This doesn't sound too bad, until you consider that mymod.d could be an external library that, originally, did *not* define a symbol called "impl". So your code happily compiles and works, and then you upgrade to the next release of the library, and suddenly you get a compile error, because the library authors introduced a PRIVATE symbol into the library. In all of the above cases, the compiler's current "simple" implementation of lookup rules and imports is creating a ripple effect of loopholes and corner cases that all add up to poor encapsulation and maintainability problems: local imports in structs makes imported symbols visible to unrelated outside code; private symbols in imported modules conflict with local symbols; even scoped, qualified imports (ostensibly the "cleanest" way to import something) can shadow parameters and enclosing struct/class members. In order to avoid these issues, one has to adopt increasingly convoluted ways of writing code based on a series of avoidances -- avoid module-global imports (introduces needless dependencies), avoid unqualified imports (shadows parameters), avoid putting imports in struct/class bodies (leaks imported symbols to outside world), etc.. This is beginning to sound like C++ all over again (don't use malloc, use new; don't use new for arrays, use new[]; don't use raw pointers, use std::auto_ptr; etc....). :-( I'm finding it harder and harder to accept Walter's stance that symbol lookups should be kept simple and free from complications and convoluted corner cases, etc.. Except that it is *already* full of convoluted pitfalls and corner cases you must avoid, as illustrated above. I wish we would just acknowledge that the current symbol lookup / import rules (or at least the implementation thereof) are inadequate, and find a clean solution to this long-standing issue, instead of hoping that denying the problem will somehow, magically, make it all go away. T -- "Holy war is an oxymoron." -- Lazarus Long
Dec 02 2014
This whole thing is a huge hole in D that needs to be fixed (it may even be necessary to consider it higher priority than the current C++ and GC). As it works currently, I'd go as far as to say that almost every addition to Phobos must be considered a breaking change for these reasons. Given the recent discussion about trying as much as possible to not break code, fixing the issues with import is extremely important. When a library writer can break user code by introducing a *private* symbol (scoped or otherwise), something has gone wrong. Furthermore:// mod.d module mod; struct S { // Use a fully-qualified import. // We place it in the body of S because S's methods // repeatedly needs it -- after all, DRY is good, right? import std.format : format; void method1(string fmt) { writeln(format(fmt, ... )); } void method2() { auto s = format("abc %s def", ...); ... } } // main.d module main; import mod; // we need the definition of S void format(S s) { ... /* do something with s */ } void main() { S s; s.format(); // UFCS -- should call main.format(s), right? }Am I correct that this bug is due to the fact that selective imports from a module implicitly imports all symbols in that module, rather than just the selected symbol? I don't know how anyone could think that D's module system is simple at this point when things behave so differently from how they intuitively should behave (for my own personal definition of intuitive).
Dec 02 2014
On Tue, Dec 02, 2014 at 11:02:18PM +0000, Meta via Digitalmars-d wrote:This whole thing is a huge hole in D that needs to be fixed (it may even be necessary to consider it higher priority than the current C++ and GC).Well, that's up to Walter & Andrei to decide. :-PAs it works currently, I'd go as far as to say that almost every addition to Phobos must be considered a breaking change for these reasons. Given the recent discussion about trying as much as possible to not break code, fixing the issues with import is extremely important. When a library writer can break user code by introducing a *private* symbol (scoped or otherwise), something has gone wrong.Yeah, you can imagine I was *not* happy when, some time ago, I upgraded Phobos and my code broke because std.regex introduced a *private* symbol that just happened to have the same name as a struct I defined myself.Furthermore:Not in this case, I don't think, the crux of the problem is that this: import some.module : symbol; is essentially the same as this: static import some.module; alias symbol = some.module.symbol; Therefore this: struct S { import std.format : format; ... } is essentially equivalent to this: struct S { static import std.format; alias format = std.format.format; // ^^^ the above line is what makes s.format() break. }// mod.d module mod; struct S { // Use a fully-qualified import. // We place it in the body of S because S's methods // repeatedly needs it -- after all, DRY is good, right? import std.format : format; void method1(string fmt) { writeln(format(fmt, ... )); } void method2() { auto s = format("abc %s def", ...); ... } } // main.d module main; import mod; // we need the definition of S void format(S s) { ... /* do something with s */ } void main() { S s; s.format(); // UFCS -- should call main.format(s), right? }Am I correct that this bug is due to the fact that selective imports from a module implicitly imports all symbols in that module, rather than just the selected symbol?I don't know how anyone could think that D's module system is simple at this point when things behave so differently from how they intuitively should behave (for my own personal definition of intuitive).Oh, it's certainly "simple". Simple to the compiler writers, that is, in the sense of the implementation being quite straightforward, but that's not necessarily "simple" from a user's POV! Or perhaps a fairer way to state this, is that "import" as currently implemented may not quite coincide with what users think "import" should do. Currently, "import" means, quite literally, "import module X's symbols into the current scope" -- all the symbols in X get inserted into the symbol table of the current scope. However, what most people understand when they hear of "import" is "add module X to the list of scopes to search when we need to look up a symbol". Perhaps a better name for this "more intuitive" import is "use": that is, use the symbols from module X if they are referenced but don't pull them all into the current scope's symbol table. In this sense, the current behaviour is "correct" because the observed effects are exactly what happens when module X's symbols get added to the scope's namespace. However, this also means it's a poor tool for encapsulation, because its semantics cause encapsulation-breaking effects as described. And it also means that users, who generally tend to expect "use" semantics rather than "import" semantics, will tend to get into trouble when they start using it. Furthermore, there isn't really any practical way right now to get "use" semantics in D, since "import" is the only tool we currently have. To get "use" semantics you'd have to resort to really ugly hackery like define a separate private sub-scope to pull symbols into, for example: struct S { // Private empty struct to provide a water-proof scope // into which we can import symbols without fear of // leakage: private static struct Imports { // Now this won't leak to S's scope: import std.format : format; } void method() { // But it *will* look really ugly: Imports.format(...); } } void format(S s) { ... } S s; s.format(); // now this won't get hijacked by the import Hmm... actually, this gives me an idea. What if we implement a little syntactic sugar for this in the compiler? Say: scope import std.conv ... ; scope import std.format ... ; gets lowered to: private static struct __imports { import std.conv ... ; import std.format ... ; } where __imports is an implicit nested struct that gets introduced to each scope that uses "scope import". Then we introduce a new lookup rule, that if a symbol X cannot be found with the current lookup rules, then the compiler should try searching for __imports.X in the current scope instead. That is to say, if: format("%s", ...); cannot be resolved, then pretend that the user has written: __imports.format("%s", ...); instead. Sortof like the import analogue of UFCS (if a member function can't be found in a call obj.method(), then look for method(obj) in the global scope instead). This way, existing code won't have to change, no breakage will be introduced, and only a small addition (not change) needs to be made to the existing lookup rules. Then whenever we need "use" semantics as opposed to raw "import" semantics, we just write "scope import" instead, and it should all work. (So we hope.) Heh, sounds like DIP material... T -- If blunt statements had a point, they wouldn't be blunt...
Dec 02 2014
On 12/03/2014 12:55 AM, H. S. Teoh via Digitalmars-d wrote:struct S { static import std.format; alias format = std.format.format; // ^^^ the above line is what makes s.format() break. }It's equivalent to private alias format = std.format.format; The problem here is that private isn't checked for aliases and that it currently doesn't affect visibility.
Dec 03 2014
On Tue, 2 Dec 2014 15:55:34 -0800 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:Hmm... actually, this gives me an idea. What if we implement a little syntactic sugar for this in the compiler? Say: =20 scope import std.conv ... ; scope import std.format ... ; =20 gets lowered to: =20 private static struct __imports { import std.conv ... ; import std.format ... ; } =20 where __imports is an implicit nested struct that gets introduced to each scope that uses "scope import". =20 Then we introduce a new lookup rule, that if a symbol X cannot be found with the current lookup rules, then the compiler should try searching for __imports.X in the current scope instead. That is to say, if: =20 format("%s", ...); =20 cannot be resolved, then pretend that the user has written: =20 __imports.format("%s", ...); =20 instead. Sortof like the import analogue of UFCS (if a member function can't be found in a call obj.method(), then look for method(obj) in the global scope instead). =20 This way, existing code won't have to change, no breakage will be introduced, and only a small addition (not change) needs to be made to the existing lookup rules. Then whenever we need "use" semantics as opposed to raw "import" semantics, we just write "scope import" instead, and it should all work. (So we hope.) =20 Heh, sounds like DIP material...i think that this is a great idea! i already love it.
Dec 02 2014
On Wed, Dec 03, 2014 at 02:10:52AM +0200, ketmar via Digitalmars-d wrote:On Tue, 2 Dec 2014 15:55:34 -0800 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:OTOH, Kenji has a different solution already in PR form: https://github.com/D-Programming-Language/dmd/pull/3407 T -- Not all rumours are as misleading as this one.Hmm... actually, this gives me an idea. What if we implement a little syntactic sugar for this in the compiler? Say: scope import std.conv ... ; scope import std.format ... ; gets lowered to: private static struct __imports { import std.conv ... ; import std.format ... ; } where __imports is an implicit nested struct that gets introduced to each scope that uses "scope import". Then we introduce a new lookup rule, that if a symbol X cannot be found with the current lookup rules, then the compiler should try searching for __imports.X in the current scope instead. That is to say, if: format("%s", ...); cannot be resolved, then pretend that the user has written: __imports.format("%s", ...); instead. Sortof like the import analogue of UFCS (if a member function can't be found in a call obj.method(), then look for method(obj) in the global scope instead). This way, existing code won't have to change, no breakage will be introduced, and only a small addition (not change) needs to be made to the existing lookup rules. Then whenever we need "use" semantics as opposed to raw "import" semantics, we just write "scope import" instead, and it should all work. (So we hope.) Heh, sounds like DIP material...i think that this is a great idea! i already love it.
Dec 04 2014
On Thu, 4 Dec 2014 14:15:22 -0800 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:On Wed, Dec 03, 2014 at 02:10:52AM +0200, ketmar via Digitalmars-d wrote:"Static, renamed, and selective imports are always public"? nononono! as for static imports -- ok, let it be (but i don't want it too). as for renamed and selective... in no way!On Tue, 2 Dec 2014 15:55:34 -0800 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote: =20=20 OTOH, Kenji has a different solution already in PR form: =20 https://github.com/D-Programming-Language/dmd/pull/3407Hmm... actually, this gives me an idea. What if we implement a little syntactic sugar for this in the compiler? Say: =20 scope import std.conv ... ; scope import std.format ... ; =20 gets lowered to: =20 private static struct __imports { import std.conv ... ; import std.format ... ; } =20 where __imports is an implicit nested struct that gets introduced to each scope that uses "scope import". =20 Then we introduce a new lookup rule, that if a symbol X cannot be found with the current lookup rules, then the compiler should try searching for __imports.X in the current scope instead. That is to say, if: =20 format("%s", ...); =20 cannot be resolved, then pretend that the user has written: =20 __imports.format("%s", ...); =20 instead. Sortof like the import analogue of UFCS (if a member function can't be found in a call obj.method(), then look for method(obj) in the global scope instead). =20 This way, existing code won't have to change, no breakage will be introduced, and only a small addition (not change) needs to be made to the existing lookup rules. Then whenever we need "use" semantics as opposed to raw "import" semantics, we just write "scope import" instead, and it should all work. (So we hope.) =20 Heh, sounds like DIP material...i think that this is a great idea! i already love it.
Dec 04 2014
On 12/5/2014 10:04 AM, ketmar via Digitalmars-d wrote:On Thu, 4 Dec 2014 14:15:22 -0800 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:I think you misunderstand. They are public now, which makes them horribly broken. The PR aims to fix that.OTOH, Kenji has a different solution already in PR form: https://github.com/D-Programming-Language/dmd/pull/3407"Static, renamed, and selective imports are always public"? nononono! as for static imports -- ok, let it be (but i don't want it too). as for renamed and selective... in no way!
Dec 04 2014
On Thu, 4 Dec 2014 14:15:22 -0800 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:On Wed, Dec 03, 2014 at 02:10:52AM +0200, ketmar via Digitalmars-d wrote:p.s. i want wildcard support in `import`! no, i want regex support in `import`! so i can write `import iv.writer : (err)?write(f|ln)?;`.On Tue, 2 Dec 2014 15:55:34 -0800 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote: =20=20 OTOH, Kenji has a different solution already in PR form: =20 https://github.com/D-Programming-Language/dmd/pull/3407Hmm... actually, this gives me an idea. What if we implement a little syntactic sugar for this in the compiler? Say: =20 scope import std.conv ... ; scope import std.format ... ; =20 gets lowered to: =20 private static struct __imports { import std.conv ... ; import std.format ... ; } =20 where __imports is an implicit nested struct that gets introduced to each scope that uses "scope import". =20 Then we introduce a new lookup rule, that if a symbol X cannot be found with the current lookup rules, then the compiler should try searching for __imports.X in the current scope instead. That is to say, if: =20 format("%s", ...); =20 cannot be resolved, then pretend that the user has written: =20 __imports.format("%s", ...); =20 instead. Sortof like the import analogue of UFCS (if a member function can't be found in a call obj.method(), then look for method(obj) in the global scope instead). =20 This way, existing code won't have to change, no breakage will be introduced, and only a small addition (not change) needs to be made to the existing lookup rules. Then whenever we need "use" semantics as opposed to raw "import" semantics, we just write "scope import" instead, and it should all work. (So we hope.) =20 Heh, sounds like DIP material...i think that this is a great idea! i already love it.
Dec 04 2014
On Tuesday, 2 December 2014 at 22:02:23 UTC, H. S. Teoh via Digitalmars-d wrote:I wish we would just acknowledge that the current symbol lookup / import rules (or at least the implementation thereof) are inadequate, and find a clean solution to this long-standing issue, instead of hoping that denying the problem will somehow, magically, make it all go away.A big +1 --- Paolo
Dec 03 2014
On Tuesday, 2 December 2014 at 22:02:23 UTC, H. S. Teoh via Digitalmars-d wrote:2) OK, so the conclusion is that unqualified scoped imports are dangerous, right? Alright, let's see what happens when we use qualified imports: // mod.d module mod; struct S { // Use a fully-qualified import. // We place it in the body of S because S's methods // repeatedly needs it -- after all, DRY is good, right? import std.format : format; void method1(string fmt) { writeln(format(fmt, ... )); } void method2() { auto s = format("abc %s def", ...); ... } } // main.d module main; import mod; // we need the definition of S void format(S s) { ... /* do something with s */ } void main() { S s; s.format(); // UFCS -- should call main.format(s), right? } Wrong. That last line in main() actually calls std.format.format. Why? Because in mod.d, the `import std.format : format` declaration actually pulls in std.format.format into the symbol table of S, therefore, S.format now refers to std.format.format. This, therefore, hijacks the intended call to main.format. This is: https://issues.dlang.org/show_bug.cgi?id=13808Surely this can't be intended behaviour? It seems its imported publically instead of privately, so that it gets "reexported" to outside users of the struct. If so, I don't think we should add workarounds to std.datetime or any other places where this could occur.
Dec 03 2014
On 2014-12-02 23:00, H. S. Teoh via Digitalmars-d wrote:I'm finding it harder and harder to accept Walter's stance that symbol lookups should be kept simple and free from complications and convoluted corner cases, etc.. Except that it is *already* full of convoluted pitfalls and corner cases you must avoid, as illustrated above. I wish we would just acknowledge that the current symbol lookup / import rules (or at least the implementation thereof) are inadequate, and find a clean solution to this long-standing issue, instead of hoping that denying the problem will somehow, magically, make it all go away.I fully agree. -- /Jacob Carlborg
Dec 03 2014
On Wednesday, 3 December 2014 at 13:34:42 UTC, Jacob Carlborg wrote:On 2014-12-02 23:00, H. S. Teoh via Digitalmars-d wrote:+1 So far it seems there is huge amount of anecdotal evidence that D module system does not work as designed and people keep inventing hacks and workaround to get over it. And yet Walter still considers it success. Well, compared to C includes - no doubt. Compared to actually usable modle system.. meh. Right now it pretty much requires relying on coding discipline and custom rules for import formatting.I'm finding it harder and harder to accept Walter's stance that symbol lookups should be kept simple and free from complications and convoluted corner cases, etc.. Except that it is *already* full of convoluted pitfalls and corner cases you must avoid, as illustrated above. I wish we would just acknowledge that the current symbol lookup / import rules (or at least the implementation thereof) are inadequate, and find a clean solution to this long-standing issue, instead of hoping that denying the problem will somehow, magically, make it all go away.I fully agree.
Dec 04 2014
On Tuesday, 2 December 2014 at 22:02:23 UTC, H. S. Teoh via Digitalmars-d wrote:However, there are major issues with scoped imports currently, that make this otherwise ideal solution less-than-ideal, which stem from the way 'import' is implemented in D. When the compiler encounters an 'import' statement, what it essentially does is to parse the symbols in the target module, and add them to the symbol table for the current scope.This is probably single thing I hate most about D module system. Especially the fact that it is default behaviour with plain "import". I wish all symbols required to be force-qualified with a module by default :(
Dec 04 2014
On 02.12.2014 23:00, H. S. Teoh via Digitalmars-d wrote:4) This isn't the end of the story. There's also this lovely bug: https://issues.dlang.org/show_bug.cgi?id=1238 which, as its number should tell you, has been around for a LONG time. Executive summary: // mymod.d module mymod; private int impl; // main.d module main; import mymod; void impl() { ... } void main() { impl(); // Error: main.impl conflicts with mymod.impl (WAT?) }This compiles as expected. I guess you rather mean to import impl from two modules, one with a public impl, the other with a private impl. I don't think this belongs to the same set of issues as the other examples (though I agree private symbols from imports should not be considered in overload sets). Here is another bad example of local import symbol hijacking: module base; class Base { import std.conv; } module derived; import base; import std.stdio; string text = "123"; class Derived : Base { static void foo() { writeln(text); } } void main() { Derived.foo(); } This prints an empty line! The documentation isn't very specific about symbol lookup rules, I found the import rules on dlang.org/modules.html: "How basic imports work is that first a name is searched for in the current namespace. If it is not found, then it is looked for in the imports. If it is found uniquely among the imports, then that is used. If it is in more than one import, an error occurs." I don't think "namespace" here is chosen to mean something like "C++ namespace". If you replace it with "module", most of the problems described go away. The contradicting specification follows for scoped imports offending part: "The imports are looked up to satisfy any unresolved symbols at that scope. Imported symbols may hide symbols from outer scopes." This defeats one of the major goals of the module system: avoiding symbol hijacking (http://dlang.org/hijack.html). Removing that last sentence would be in line with the paragraph above. I.e. local imports just add to the imports. This does not make the rules any more difficult. (The implementation would be a tad more involved, though.)
Dec 04 2014
On Fri, Dec 05, 2014 at 12:25:49AM +0100, Rainer Schuetze via Digitalmars-d wrote:On 02.12.2014 23:00, H. S. Teoh via Digitalmars-d wrote:Sorry, poor example. Here's a tested example that produces an error: // mod.d module mod; private struct S { } // main.d module main; struct S { float x; int y; } void main() { import mod; S s; } Compiler output: /tmp/test.d(10): Error: ScopeDsymbol main.__anonymous.__anonymous struct mod.S is private4) This isn't the end of the story. There's also this lovely bug: https://issues.dlang.org/show_bug.cgi?id=1238 which, as its number should tell you, has been around for a LONG time. Executive summary: // mymod.d module mymod; private int impl; // main.d module main; import mymod; void impl() { ... } void main() { impl(); // Error: main.impl conflicts with mymod.impl (WAT?) }This compiles as expected. I guess you rather mean to import impl from two modules, one with a public impl, the other with a private impl. I don't think this belongs to the same set of issues as the other examples (though I agree private symbols from imports should not be considered in overload sets).Here is another bad example of local import symbol hijacking:[...] My favorite variation on your example: import std.stdio; void say(string text) { import std.conv; writeln(text); } void main() { say("Hello world"); } prints an empty line. :-P Commenting out the `import std.conv;` line fixes the problem. T -- "A man's wife has more power over him than the state has." -- Ralph Emerson
Dec 04 2014
module base; class Base { import std.conv; } module derived; import base; import std.stdio; string text = "123"; class Derived : Base { static void foo() { writeln(text); } } void main() { Derived.foo(); } This prints an empty line!My word, I think this example made me physically ill.
Dec 04 2014
On 02/12/2014 22:00, H. S. Teoh via Digitalmars-d wrote:Recently, in a bid to reduce the messy tangle that is the web of interdependencies among Phobos modules, many module-scope imports have been replaced with scoped imports. In addition to reducing gratuitous dependencies (the scoped import is not processed in a template body until the template is actually needed, so if user code doesn't need that template, then it won't pull in a whole bunch of unnecessary Phobos modules), scoped imports also make the code easier to refactor -- you can move things around more easily without breaking it. However, there are major issues with scoped imports currently, that make this otherwise ideal solution less-than-ideal, which stem from the way 'import' is implemented in D. When the compiler encounters an 'import' statement, what it essentially does is to parse the symbols in the target module, and add them to the symbol table for the current scope. While this is a simple, straightforward implementation method, it leads to the following problems: 1) Consider this code: // Quiz for the reader: what does this code do? void myFunc(string text, string number) { import std.conv; import std.stdio; auto x = to!int(number); writeln(text, x); } void main() { myFunc("The count is: ", "123"); } Here's the output: 123 The reason is that `import std.conv;` pulls in *all* the symbols from std.conv into the local scope, including the function called "text". This causes "text" (i.e., std.conv.text) to shadow the parameter "text", so `writeln(text, x);` is interpreted as: std.stdio.writeln(std.conv.text(), x); so the parameter "text" is never used. This is just a simple example; the problem becomes much harder to trace when the shadowed symbol is not this close to the import scope: struct S { string text; ... void method() { import std.conv; ... writeln(text); // uses std.conv.text, not this.text } } Worse yet, suppose we have this code, that today works correctly: struct S { string buf; void format() { buf = ... /* format S's data into string form */ } void method() { import std.file; ... // do some processing based on files format(); // calls this.format() } } Now, suppose at some point in the future, we introduce a function also called format() in std.file, that formats your hard drive. The user code is unchanged. What happens when you recompile the program and run it? It will compile successfully... and then format your hard drive without any warning. This is the crux of the issue behind: https://issues.dlang.org/show_bug.cgi?id=10378 2) OK, so the conclusion is that unqualified scoped imports are dangerous, right? Alright, let's see what happens when we use qualified imports: // mod.d module mod; struct S { // Use a fully-qualified import. // We place it in the body of S because S's methods // repeatedly needs it -- after all, DRY is good, right? import std.format : format; void method1(string fmt) { writeln(format(fmt, ... )); } void method2() { auto s = format("abc %s def", ...); ... } } // main.d module main; import mod; // we need the definition of S void format(S s) { ... /* do something with s */ } void main() { S s; s.format(); // UFCS -- should call main.format(s), right? } Wrong. That last line in main() actually calls std.format.format. Why? Because in mod.d, the `import std.format : format` declaration actually pulls in std.format.format into the symbol table of S, therefore, S.format now refers to std.format.format. This, therefore, hijacks the intended call to main.format. This is: https://issues.dlang.org/show_bug.cgi?id=13808 3) OK, so putting scoped, qualified imports in struct bodies is a bad idea, they should always go into the method bodies, then we'll be OK, right? Sure... until you need to do this: // A struct S { // B void method(R)(R range) if (isInputRange!R) { // C } } This will not compile unless you import std.range : isInputRange somewhere. Putting it in the line marked //A introduces a module-global dependency on std.range, which is what we're trying to eliminate. So that's a no-go. Putting in //C is also a no-go, because we need isInputRange in the sig constraint, which is outside the function body. So the only option is //B. However, if we do that, then the imported symbols will get pulled into S's symbol table, thus you have this situation: // mymod.d module mymod; struct S { // Now everyone is happy, right...? import std.range : isInputRange; void method(R)(R range) if (isInputRange!R) { } } // main.d module main; import mymod; // Well ... void main() { S s; static assert(isInputRange!(int[])); // fails -- as it should static assert(s.isInputRange!(int[])); // works -- WAT? } So even though isInputRange is really only required by the *implementation* of S, yet the consequence of the compiler's implementation of import is that it pollutes S's namespace with the imported symbols in such a way that external code can access said symbols using S as a proxy -- even if S was never intended to act as such a proxy!! And if you used an unqualified import instead, the problem is compounded: // mymod.d module mymod; struct S { import std.algorithm; // oh yes ... void method() { // hey, it does make code easier to write, we // don't have to keep importing std.algorithm // all over the place: auto x = map!(...).filter!(...).sort(); ... } ... } // main.d module main; void main() { // hey, we only need the definition of S, right? import mymod : S; S s; // But we can access all sorts of stuff through S... // (Not the mention UFCS hijacking happening everywhere!) auto r = s.map!(...); auto t = s.filter!(...); auto u = s.canFind(...); // when they shouldn't even work at all!! auto v = canFind(s, ...); // <--- compile error } 4) This isn't the end of the story. There's also this lovely bug: https://issues.dlang.org/show_bug.cgi?id=1238 which, as its number should tell you, has been around for a LONG time. Executive summary: // mymod.d module mymod; private int impl; // main.d module main; import mymod; void impl() { ... } void main() { impl(); // Error: main.impl conflicts with mymod.impl (WAT?) } Basically, the underlying cause is the same. Import pulls the symbols of mymod.d into the current scope (including private ones -- which the compiler currently has to, since otherwise it couldn't compile mymod's references to private symbols). So when you try to reference 'impl', the overload set contains both main.impl and mymod.impl, which is an ambiguity error. This doesn't sound too bad, until you consider that mymod.d could be an external library that, originally, did *not* define a symbol called "impl". So your code happily compiles and works, and then you upgrade to the next release of the library, and suddenly you get a compile error, because the library authors introduced a PRIVATE symbol into the library. In all of the above cases, the compiler's current "simple" implementation of lookup rules and imports is creating a ripple effect of loopholes and corner cases that all add up to poor encapsulation and maintainability problems: local imports in structs makes imported symbols visible to unrelated outside code; private symbols in imported modules conflict with local symbols; even scoped, qualified imports (ostensibly the "cleanest" way to import something) can shadow parameters and enclosing struct/class members. In order to avoid these issues, one has to adopt increasingly convoluted ways of writing code based on a series of avoidances -- avoid module-global imports (introduces needless dependencies), avoid unqualified imports (shadows parameters), avoid putting imports in struct/class bodies (leaks imported symbols to outside world), etc.. This is beginning to sound like C++ all over again (don't use malloc, use new; don't use new for arrays, use new[]; don't use raw pointers, use std::auto_ptr; etc....). :-( I'm finding it harder and harder to accept Walter's stance that symbol lookups should be kept simple and free from complications and convoluted corner cases, etc.. Except that it is *already* full of convoluted pitfalls and corner cases you must avoid, as illustrated above. I wish we would just acknowledge that the current symbol lookup / import rules (or at least the implementation thereof) are inadequate, and find a clean solution to this long-standing issue, instead of hoping that denying the problem will somehow, magically, make it all go away. TI agree too. I noticed this too when I was implementing the symbol lookup rules in the semantic engine for DDT. It was very weird to me that the imports would make symbols accessible not only during lexical lookup, but during qualified lookups as well :S . You said that Walter wanted the symbol lookup rules to be as simple as possible. Well, a lot of that bad design could be fixed by simply stating that import symbols are only visibile in lexical lookup. It wouldn't solve all issues, (namely example 1 there) but then we could just use qualified imports instead (and the issues with those would be fixed). -- Bruno Medeiros https://twitter.com/brunodomedeiros
Dec 05 2014
On 12/3/14 8:00 AM, H. S. Teoh via Digitalmars-d wrote:I'm finding it harder and harder to accept Walter's stance that symbol lookups should be kept simple and free from complications and convoluted corner cases, etc.. Except that it is*already* full of convoluted pitfalls and corner cases you must avoid, as illustrated above. I wish we would just acknowledge that the current symbol lookup / import rules (or at least the implementation thereof) are inadequate, and find a clean solution to this long-standing issue, instead of hoping that denying the problem will somehow, magically, make it all go away.I agree we need to take a closer look at name lookup now that local imports are becoming more popular. This may be one of those cases in which simpler language rules transfer complexity to the user. Andrei
Dec 07 2014
On 12/2/2014 2:00 PM, H. S. Teoh via Digitalmars-d wrote:4) This isn't the end of the story. There's also this lovely bug: https://issues.dlang.org/show_bug.cgi?id=1238 which, as its number should tell you, has been around for a LONG time. Executive summary: // mymod.d module mymod; private int impl; // main.d module main; import mymod; void impl() { ... } void main() { impl(); // Error: main.impl conflicts with mymod.impl (WAT?) }That error doesn't happen. That isn't what 1238 is.Basically, the underlying cause is the same. Import pulls the symbols of mymod.d into the current scopeNo, it does not. Symbols in the current scope override imports from the same scope. 1238 is about private symbols from one import set against public symbols from another import, an entirely different thing.
Dec 08 2014
On 12/2/2014 2:00 PM, H. S. Teoh via Digitalmars-d wrote:However, there are major issues with scoped imports currently, that make this otherwise ideal solution less-than-ideal, which stem from the way 'import' is implemented in D. When the compiler encounters an 'import' statement, what it essentially does is to parse the symbols in the target module, and add them to the symbol table for the current scope.No, it does not. Each scope has a symbol table of symbols declared in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L298 and it has an array of imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L301 Only if the symbol is not found in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L902 it then looks in the imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L910
Dec 08 2014
On Mon, Dec 08, 2014 at 12:34:07AM -0800, Walter Bright via Digitalmars-d wrote:On 12/2/2014 2:00 PM, H. S. Teoh via Digitalmars-d wrote:I realize that my understanding on the workings of dmd may not be quite accurate, and I apologize for any inadvertent misrepresentations, but the fact remains that the current behaviour of import is very counterintuitive, and users will continue having issues with it if the current state of things continue to hold. Having imported symbols shadow symbols in outer lexical scopes, while it may make sense in implementation, is a very counterintuitive behaviour. It means unrelated modules can hijack symbols from outer lexical scopes with no indication whatsoever -- if you `import external_lib` today, everything may work fine, but upon upgrading external_lib to a new version, suddenly one of your outer lexical symbols is hijacked without any warning. Having private symbols conflict with public ones is also very counterintuitive and frustrating behaviour -- even if the underlying cause is different, it's perceived as a related issue. Having scoped imports at module scope leak symbols to importers of the module is also very counterintuitive. Having symbols imported at struct scope visible via S.symbol is extremely counterintuitive, though I believe Kenji has already submitted a PR for this. (Thanks, Kenji!!) And so on. These may be genuine bugs, or design flaws, or an oversight, but whatever the cause, the result is the same fly in what could otherwise have been a tasty soup of a nice module system. T -- One reason that few people are aware there are programs running the internet is that they never crash in any significant way: the free software underlying the internet is reliable to the point of invisibility. -- Glyn Moody, from the article "Giving it all away"However, there are major issues with scoped imports currently, that make this otherwise ideal solution less-than-ideal, which stem from the way 'import' is implemented in D. When the compiler encounters an 'import' statement, what it essentially does is to parse the symbols in the target module, and add them to the symbol table for the current scope.No, it does not. Each scope has a symbol table of symbols declared in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L298 and it has an array of imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L301 Only if the symbol is not found in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L902 it then looks in the imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L910
Dec 08 2014
On 12/8/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:I realize that my understanding on the workings of dmd may not be quite accurate, and I apologize for any inadvertent misrepresentations, but the fact remains that the current behaviour of import is very counterintuitive, and users will continue having issues with it if the current state of things continue to hold.I understand, my purpose on that post was to correct misunderstanding of how it currently works. The two-step lookup method, of current scope then imports in current scope, is misunderstood by nearly everyone.
Dec 08 2014
On Mon, 08 Dec 2014 14:30:08 -0800 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:The two-step lookup method, of current scope then imports in current scop=e, is=20misunderstood by nearly everyone.this is the clear sign of the existing problem.
Dec 08 2014
On 12/8/2014 7:34 PM, ketmar via Digitalmars-d wrote:On Mon, 08 Dec 2014 14:30:08 -0800 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:I disagree. None of the issues raised have to do with that.The two-step lookup method, of current scope then imports in current scope, is misunderstood by nearly everyone.this is the clear sign of the existing problem.
Dec 08 2014
On Mon, 08 Dec 2014 19:40:38 -0800 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 12/8/2014 7:34 PM, ketmar via Digitalmars-d wrote:cope, isOn Mon, 08 Dec 2014 14:30:08 -0800 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:The two-step lookup method, of current scope then imports in current s=if the thing is "misunderstood by nearly everyone" that means that the thing is counterintu... ah, sorry, i forgot that "being counterintuitive" is an unofficial D motto.=20 I disagree. None of the issues raised have to do with that.misunderstood by nearly everyone.this is the clear sign of the existing problem.
Dec 08 2014