www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Symbol lookup rules and imports

reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
Recently, in a bid to reduce the messy tangle that is the web of
interdependencies among Phobos modules, many module-scope imports have
been replaced with scoped imports. In addition to reducing gratuitous
dependencies (the scoped import is not processed in a template body
until the template is actually needed, so if user code doesn't need that
template, then it won't pull in a whole bunch of unnecessary Phobos
modules), scoped imports also make the code easier to refactor -- you
can move things around more easily without breaking it.

However, there are major issues with scoped imports currently,
that make this otherwise ideal solution less-than-ideal, which stem from
the way 'import' is implemented in D. When the compiler encounters an
'import' statement, what it essentially does is to parse the symbols in
the target module, and add them to the symbol table for the current
scope. While this is a simple, straightforward implementation method, it
leads to the following problems:

1) Consider this code:

	// Quiz for the reader: what does this code do?
	void myFunc(string text, string number) {
		import std.conv;
		import std.stdio;

		auto x = to!int(number);
		writeln(text, x);
	}
	void main() {
		myFunc("The count is: ", "123");
	}

Here's the output:

	123

The reason is that `import std.conv;` pulls in *all* the symbols from
std.conv into the local scope, including the function called "text".
This causes "text" (i.e., std.conv.text) to shadow the parameter "text",
so `writeln(text, x);` is interpreted as:

	std.stdio.writeln(std.conv.text(), x);

so the parameter "text" is never used. This is just a simple example;
the problem becomes much harder to trace when the shadowed symbol is not
this close to the import scope:

	struct S {
		string text;
		...
		void method() {
			import std.conv;
			...
			writeln(text); // uses std.conv.text, not this.text
		}
	}

Worse yet, suppose we have this code, that today works correctly:

	struct S {
		string buf;
		void format() {
			buf = ... /* format S's data into string form */
		}
		void method() {
			import std.file;
			...		// do some processing based on files
			format();	// calls this.format()
		}
	}

Now, suppose at some point in the future, we introduce a function also
called format() in std.file, that formats your hard drive. The user code
is unchanged. What happens when you recompile the program and run it? It
will compile successfully... and then format your hard drive without any
warning.

This is the crux of the issue behind:

	https://issues.dlang.org/show_bug.cgi?id=10378


2) OK, so the conclusion is that unqualified scoped imports are
dangerous, right? Alright, let's see what happens when we use qualified
imports:

	// mod.d
	module mod;
	struct S {
		// Use a fully-qualified import.
		// We place it in the body of S because S's methods
		// repeatedly needs it -- after all, DRY is good, right?
		import std.format : format;

		void method1(string fmt) {
			writeln(format(fmt, ... ));
		}

		void method2() {
			auto s = format("abc %s def", ...);
			...
		}
	}

	// main.d
	module main;
	import mod; // we need the definition of S

	void format(S s) {
		... /* do something with s */
	}

	void main() {
		S s;
		s.format(); // UFCS -- should call main.format(s), right?
	}

Wrong. That last line in main() actually calls std.format.format. Why?
Because in mod.d, the `import std.format : format` declaration actually
pulls in std.format.format into the symbol table of S, therefore,
S.format now refers to std.format.format. This, therefore, hijacks the
intended call to main.format.

This is: https://issues.dlang.org/show_bug.cgi?id=13808


3) OK, so putting scoped, qualified imports in struct bodies is a bad
idea, they should always go into the method bodies, then we'll be OK,
right?  Sure... until you need to do this:

	// A
	struct S {
		// B
		void method(R)(R range)
			if (isInputRange!R)
		{
			// C
		}
	}

This will not compile unless you import std.range : isInputRange
somewhere.  Putting it in the line marked //A introduces a module-global
dependency on std.range, which is what we're trying to eliminate. So
that's a no-go.  Putting in //C is also a no-go, because we need
isInputRange in the sig constraint, which is outside the function body.
So the only option is //B. However, if we do that, then the imported
symbols will get pulled into S's symbol table, thus you have this
situation:

	// mymod.d
	module mymod;
	struct S {
		// Now everyone is happy, right...?
		import std.range : isInputRange;

		void method(R)(R range)
			if (isInputRange!R)
		{
		}
	}

	// main.d
	module main;
	import mymod;

	// Well ...
	void main() {
		S s;

		static assert(isInputRange!(int[])); // fails -- as it should

		static assert(s.isInputRange!(int[])); // works -- WAT?
	}

So even though isInputRange is really only required by the
*implementation* of S, yet the consequence of the compiler's
implementation of import is that it pollutes S's namespace with the
imported symbols in such a way that external code can access said
symbols using S as a proxy -- even if S was never intended to act as
such a proxy!!

And if you used an unqualified import instead, the problem is
compounded:

	// mymod.d
	module mymod;

	struct S {
		import std.algorithm; // oh yes
		...
		void method() {
			// hey, it does make code easier to write, we
			// don't have to keep importing std.algorithm
			// all over the place:
			auto x = map!(...).filter!(...).sort();
			...
		}
		...
	}

	// main.d
	module main;

	void main() {
		// hey, we only need the definition of S, right?
		import mymod : S;

		S s;

		// But we can access all sorts of stuff through S...
		// (Not the mention UFCS hijacking happening everywhere!)
		auto r = s.map!(...);
		auto t = s.filter!(...);
		auto u = s.canFind(...);

		// when they shouldn't even work at all!!
		auto v = canFind(s, ...); // <--- compile error
	}


4) This isn't the end of the story. There's also this lovely bug:

	https://issues.dlang.org/show_bug.cgi?id=1238

which, as its number should tell you, has been around for a LONG time.
Executive summary:

	// mymod.d
	module mymod;
	private int impl;

	// main.d
	module main;
	import mymod;

	void impl() { ... }

	void main() {
		impl(); // Error: main.impl conflicts with mymod.impl (WAT?)
	}

Basically, the underlying cause is the same. Import pulls the symbols of
mymod.d into the current scope (including private ones -- which the
compiler currently has to, since otherwise it couldn't compile mymod's
references to private symbols). So when you try to reference 'impl', the
overload set contains both main.impl and mymod.impl, which is an
ambiguity error.

This doesn't sound too bad, until you consider that mymod.d could be an
external library that, originally, did *not* define a symbol called
"impl". So your code happily compiles and works, and then you upgrade to
the next release of the library, and suddenly you get a compile error,
because the library authors introduced a PRIVATE symbol into the
library.


In all of the above cases, the compiler's current "simple"
implementation of lookup rules and imports is creating a ripple effect
of loopholes and corner cases that all add up to poor encapsulation and
maintainability problems: local imports in structs makes imported
symbols visible to unrelated outside code; private symbols in imported
modules conflict with local symbols; even scoped, qualified imports
(ostensibly the "cleanest" way to import something) can shadow
parameters and enclosing struct/class members.

In order to avoid these issues, one has to adopt increasingly convoluted
ways of writing code based on a series of avoidances -- avoid
module-global imports (introduces needless dependencies), avoid
unqualified imports (shadows parameters), avoid putting imports in
struct/class bodies (leaks imported symbols to outside world), etc..
This is beginning to sound like C++ all over again (don't use malloc,
use new; don't use new for arrays, use new[]; don't use raw pointers,
use std::auto_ptr; etc....). :-(

I'm finding it harder and harder to accept Walter's stance that symbol
lookups should be kept simple and free from complications and convoluted
corner cases, etc.. Except that it is *already* full of convoluted
pitfalls and corner cases you must avoid, as illustrated above. I wish
we would just acknowledge that the current symbol lookup / import rules
(or at least the implementation thereof) are inadequate, and find a
clean solution to this long-standing issue, instead of hoping that
denying the problem will somehow, magically, make it all go away.


T

-- 
"Holy war is an oxymoron." -- Lazarus Long
Dec 02 2014
next sibling parent reply "Meta" <jared771 gmail.com> writes:
This whole thing is a huge hole in D that needs to be fixed (it 
may even be necessary to consider it higher priority than the 
current C++ and GC). As it works currently, I'd go as far as to 
say that almost every addition to Phobos must be considered a 
breaking change for these reasons. Given the recent discussion 
about trying as much as possible to not break code, fixing the 
issues with import is extremely important. When a library writer 
can break user code by introducing a *private* symbol (scoped or 
otherwise), something has gone wrong.

Furthermore:

	// mod.d
	module mod;
	struct S {
		// Use a fully-qualified import.
		// We place it in the body of S because S's methods
		// repeatedly needs it -- after all, DRY is good, right?
		import std.format : format;

		void method1(string fmt) {
			writeln(format(fmt, ... ));
		}

		void method2() {
			auto s = format("abc %s def", ...);
			...
		}
	}

	// main.d
	module main;
	import mod; // we need the definition of S

	void format(S s) {
		... /* do something with s */
	}

	void main() {
		S s;
		s.format(); // UFCS -- should call main.format(s), right?
	}
Am I correct that this bug is due to the fact that selective imports from a module implicitly imports all symbols in that module, rather than just the selected symbol? I don't know how anyone could think that D's module system is simple at this point when things behave so differently from how they intuitively should behave (for my own personal definition of intuitive).
Dec 02 2014
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Dec 02, 2014 at 11:02:18PM +0000, Meta via Digitalmars-d wrote:
 This whole thing is a huge hole in D that needs to be fixed (it may
 even be necessary to consider it higher priority than the current C++
 and GC).
Well, that's up to Walter & Andrei to decide. :-P
 As it works currently, I'd go as far as to say that almost every
 addition to Phobos must be considered a breaking change for these
 reasons. Given the recent discussion about trying as much as possible
 to not break code, fixing the issues with import is extremely
 important. When a library writer can break user code by introducing a
 *private* symbol (scoped or otherwise), something has gone wrong.
Yeah, you can imagine I was *not* happy when, some time ago, I upgraded Phobos and my code broke because std.regex introduced a *private* symbol that just happened to have the same name as a struct I defined myself.
 Furthermore:
 
	// mod.d
	module mod;
	struct S {
		// Use a fully-qualified import.
		// We place it in the body of S because S's methods
		// repeatedly needs it -- after all, DRY is good, right?
		import std.format : format;

		void method1(string fmt) {
			writeln(format(fmt, ... ));
		}

		void method2() {
			auto s = format("abc %s def", ...);
			...
		}
	}

	// main.d
	module main;
	import mod; // we need the definition of S

	void format(S s) {
		... /* do something with s */
	}

	void main() {
		S s;
		s.format(); // UFCS -- should call main.format(s), right?
	}
Am I correct that this bug is due to the fact that selective imports from a module implicitly imports all symbols in that module, rather than just the selected symbol?
Not in this case, I don't think, the crux of the problem is that this: import some.module : symbol; is essentially the same as this: static import some.module; alias symbol = some.module.symbol; Therefore this: struct S { import std.format : format; ... } is essentially equivalent to this: struct S { static import std.format; alias format = std.format.format; // ^^^ the above line is what makes s.format() break. }
 I don't know how anyone could think that D's module system is simple
 at this point when things behave so differently from how they
 intuitively should behave (for my own personal definition of
 intuitive).
Oh, it's certainly "simple". Simple to the compiler writers, that is, in the sense of the implementation being quite straightforward, but that's not necessarily "simple" from a user's POV! Or perhaps a fairer way to state this, is that "import" as currently implemented may not quite coincide with what users think "import" should do. Currently, "import" means, quite literally, "import module X's symbols into the current scope" -- all the symbols in X get inserted into the symbol table of the current scope. However, what most people understand when they hear of "import" is "add module X to the list of scopes to search when we need to look up a symbol". Perhaps a better name for this "more intuitive" import is "use": that is, use the symbols from module X if they are referenced but don't pull them all into the current scope's symbol table. In this sense, the current behaviour is "correct" because the observed effects are exactly what happens when module X's symbols get added to the scope's namespace. However, this also means it's a poor tool for encapsulation, because its semantics cause encapsulation-breaking effects as described. And it also means that users, who generally tend to expect "use" semantics rather than "import" semantics, will tend to get into trouble when they start using it. Furthermore, there isn't really any practical way right now to get "use" semantics in D, since "import" is the only tool we currently have. To get "use" semantics you'd have to resort to really ugly hackery like define a separate private sub-scope to pull symbols into, for example: struct S { // Private empty struct to provide a water-proof scope // into which we can import symbols without fear of // leakage: private static struct Imports { // Now this won't leak to S's scope: import std.format : format; } void method() { // But it *will* look really ugly: Imports.format(...); } } void format(S s) { ... } S s; s.format(); // now this won't get hijacked by the import Hmm... actually, this gives me an idea. What if we implement a little syntactic sugar for this in the compiler? Say: scope import std.conv ... ; scope import std.format ... ; gets lowered to: private static struct __imports { import std.conv ... ; import std.format ... ; } where __imports is an implicit nested struct that gets introduced to each scope that uses "scope import". Then we introduce a new lookup rule, that if a symbol X cannot be found with the current lookup rules, then the compiler should try searching for __imports.X in the current scope instead. That is to say, if: format("%s", ...); cannot be resolved, then pretend that the user has written: __imports.format("%s", ...); instead. Sortof like the import analogue of UFCS (if a member function can't be found in a call obj.method(), then look for method(obj) in the global scope instead). This way, existing code won't have to change, no breakage will be introduced, and only a small addition (not change) needs to be made to the existing lookup rules. Then whenever we need "use" semantics as opposed to raw "import" semantics, we just write "scope import" instead, and it should all work. (So we hope.) Heh, sounds like DIP material... T -- If blunt statements had a point, they wouldn't be blunt...
Dec 02 2014
parent Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 12/03/2014 12:55 AM, H. S. Teoh via Digitalmars-d wrote:
 	struct S {
 		static import std.format;
 		alias format = std.format.format;
 		// ^^^ the above line is what makes s.format() break.
 	}
It's equivalent to private alias format = std.format.format; The problem here is that private isn't checked for aliases and that it currently doesn't affect visibility.
Dec 03 2014
prev sibling next sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 2 Dec 2014 15:55:34 -0800
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

 Hmm... actually, this gives me an idea. What if we implement a little
 syntactic sugar for this in the compiler? Say:
=20
 	scope import std.conv ... ;
 	scope import std.format ... ;
=20
 gets lowered to:
=20
 	private static struct __imports {
 		import std.conv ... ;
 		import std.format ... ;
 	}
=20
 where __imports is an implicit nested struct that gets introduced to
 each scope that uses "scope import".
=20
 Then we introduce a new lookup rule, that if a symbol X cannot be found
 with the current lookup rules, then the compiler should try searching
 for __imports.X in the current scope instead. That is to say, if:
=20
 	format("%s", ...);
=20
 cannot be resolved, then pretend that the user has written:
=20
 	__imports.format("%s", ...);
=20
 instead. Sortof like the import analogue of UFCS (if a member function
 can't be found in a call obj.method(), then look for method(obj) in the
 global scope instead).
=20
 This way, existing code won't have to change, no breakage will be
 introduced, and only a small addition (not change) needs to be made to
 the existing lookup rules. Then whenever we need "use" semantics as
 opposed to raw "import" semantics, we just write "scope import" instead,
 and it should all work. (So we hope.)
=20
 Heh, sounds like DIP material...
i think that this is a great idea! i already love it.
Dec 02 2014
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Dec 03, 2014 at 02:10:52AM +0200, ketmar via Digitalmars-d wrote:
 On Tue, 2 Dec 2014 15:55:34 -0800
 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:
 
 Hmm... actually, this gives me an idea. What if we implement a
 little syntactic sugar for this in the compiler? Say:
 
 	scope import std.conv ... ;
 	scope import std.format ... ;
 
 gets lowered to:
 
 	private static struct __imports {
 		import std.conv ... ;
 		import std.format ... ;
 	}
 
 where __imports is an implicit nested struct that gets introduced to
 each scope that uses "scope import".
 
 Then we introduce a new lookup rule, that if a symbol X cannot be
 found with the current lookup rules, then the compiler should try
 searching for __imports.X in the current scope instead. That is to
 say, if:
 
 	format("%s", ...);
 
 cannot be resolved, then pretend that the user has written:
 
 	__imports.format("%s", ...);
 
 instead. Sortof like the import analogue of UFCS (if a member
 function can't be found in a call obj.method(), then look for
 method(obj) in the global scope instead).
 
 This way, existing code won't have to change, no breakage will be
 introduced, and only a small addition (not change) needs to be made
 to the existing lookup rules. Then whenever we need "use" semantics
 as opposed to raw "import" semantics, we just write "scope import"
 instead, and it should all work. (So we hope.)
 
 Heh, sounds like DIP material...
i think that this is a great idea! i already love it.
OTOH, Kenji has a different solution already in PR form: https://github.com/D-Programming-Language/dmd/pull/3407 T -- Not all rumours are as misleading as this one.
Dec 04 2014
prev sibling next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 4 Dec 2014 14:15:22 -0800
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

 On Wed, Dec 03, 2014 at 02:10:52AM +0200, ketmar via Digitalmars-d wrote:
 On Tue, 2 Dec 2014 15:55:34 -0800
 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:
=20
 Hmm... actually, this gives me an idea. What if we implement a
 little syntactic sugar for this in the compiler? Say:
=20
 	scope import std.conv ... ;
 	scope import std.format ... ;
=20
 gets lowered to:
=20
 	private static struct __imports {
 		import std.conv ... ;
 		import std.format ... ;
 	}
=20
 where __imports is an implicit nested struct that gets introduced to
 each scope that uses "scope import".
=20
 Then we introduce a new lookup rule, that if a symbol X cannot be
 found with the current lookup rules, then the compiler should try
 searching for __imports.X in the current scope instead. That is to
 say, if:
=20
 	format("%s", ...);
=20
 cannot be resolved, then pretend that the user has written:
=20
 	__imports.format("%s", ...);
=20
 instead. Sortof like the import analogue of UFCS (if a member
 function can't be found in a call obj.method(), then look for
 method(obj) in the global scope instead).
=20
 This way, existing code won't have to change, no breakage will be
 introduced, and only a small addition (not change) needs to be made
 to the existing lookup rules. Then whenever we need "use" semantics
 as opposed to raw "import" semantics, we just write "scope import"
 instead, and it should all work. (So we hope.)
=20
 Heh, sounds like DIP material...
i think that this is a great idea! i already love it.
=20 OTOH, Kenji has a different solution already in PR form: =20 https://github.com/D-Programming-Language/dmd/pull/3407
"Static, renamed, and selective imports are always public"? nononono! as for static imports -- ok, let it be (but i don't want it too). as for renamed and selective... in no way!
Dec 04 2014
parent Mike Parker <aldacron gmail.com> writes:
On 12/5/2014 10:04 AM, ketmar via Digitalmars-d wrote:
 On Thu, 4 Dec 2014 14:15:22 -0800
 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:
 OTOH, Kenji has a different solution already in PR form:

 	https://github.com/D-Programming-Language/dmd/pull/3407
"Static, renamed, and selective imports are always public"? nononono! as for static imports -- ok, let it be (but i don't want it too). as for renamed and selective... in no way!
I think you misunderstand. They are public now, which makes them horribly broken. The PR aims to fix that.
Dec 04 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 4 Dec 2014 14:15:22 -0800
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:

 On Wed, Dec 03, 2014 at 02:10:52AM +0200, ketmar via Digitalmars-d wrote:
 On Tue, 2 Dec 2014 15:55:34 -0800
 "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> wrote:
=20
 Hmm... actually, this gives me an idea. What if we implement a
 little syntactic sugar for this in the compiler? Say:
=20
 	scope import std.conv ... ;
 	scope import std.format ... ;
=20
 gets lowered to:
=20
 	private static struct __imports {
 		import std.conv ... ;
 		import std.format ... ;
 	}
=20
 where __imports is an implicit nested struct that gets introduced to
 each scope that uses "scope import".
=20
 Then we introduce a new lookup rule, that if a symbol X cannot be
 found with the current lookup rules, then the compiler should try
 searching for __imports.X in the current scope instead. That is to
 say, if:
=20
 	format("%s", ...);
=20
 cannot be resolved, then pretend that the user has written:
=20
 	__imports.format("%s", ...);
=20
 instead. Sortof like the import analogue of UFCS (if a member
 function can't be found in a call obj.method(), then look for
 method(obj) in the global scope instead).
=20
 This way, existing code won't have to change, no breakage will be
 introduced, and only a small addition (not change) needs to be made
 to the existing lookup rules. Then whenever we need "use" semantics
 as opposed to raw "import" semantics, we just write "scope import"
 instead, and it should all work. (So we hope.)
=20
 Heh, sounds like DIP material...
i think that this is a great idea! i already love it.
=20 OTOH, Kenji has a different solution already in PR form: =20 https://github.com/D-Programming-Language/dmd/pull/3407
p.s. i want wildcard support in `import`! no, i want regex support in `import`! so i can write `import iv.writer : (err)?write(f|ln)?;`.
Dec 04 2014
prev sibling next sibling parent "Martin Nowak" <code dawg.eu> writes:
http://wiki.dlang.org/DIP22
Dec 02 2014
prev sibling next sibling parent "Paolo Invernizzi" <paolo.invernizzi no.address> writes:
On Tuesday, 2 December 2014 at 22:02:23 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 I wish we would just acknowledge that the current symbol lookup 
 /  import rules
 (or at least the implementation thereof) are inadequate, and  
 find a
 clean solution to this long-standing issue, instead of hoping  
 that
 denying the problem will somehow, magically, make it all go  
 away.
A big +1 --- Paolo
Dec 03 2014
prev sibling next sibling parent "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Tuesday, 2 December 2014 at 22:02:23 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 2) OK, so the conclusion is that unqualified scoped imports are
 dangerous, right? Alright, let's see what happens when we use 
 qualified
 imports:

 	// mod.d
 	module mod;
 	struct S {
 		// Use a fully-qualified import.
 		// We place it in the body of S because S's methods
 		// repeatedly needs it -- after all, DRY is good, right?
 		import std.format : format;

 		void method1(string fmt) {
 			writeln(format(fmt, ... ));
 		}

 		void method2() {
 			auto s = format("abc %s def", ...);
 			...
 		}
 	}

 	// main.d
 	module main;
 	import mod; // we need the definition of S

 	void format(S s) {
 		... /* do something with s */
 	}

 	void main() {
 		S s;
 		s.format(); // UFCS -- should call main.format(s), right?
 	}

 Wrong. That last line in main() actually calls 
 std.format.format. Why?
 Because in mod.d, the `import std.format : format` declaration 
 actually
 pulls in std.format.format into the symbol table of S, 
 therefore,
 S.format now refers to std.format.format. This, therefore, 
 hijacks the
 intended call to main.format.

 This is: https://issues.dlang.org/show_bug.cgi?id=13808
Surely this can't be intended behaviour? It seems its imported publically instead of privately, so that it gets "reexported" to outside users of the struct. If so, I don't think we should add workarounds to std.datetime or any other places where this could occur.
Dec 03 2014
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2014-12-02 23:00, H. S. Teoh via Digitalmars-d wrote:

 I'm finding it harder and harder to accept Walter's stance that symbol
 lookups should be kept simple and free from complications and convoluted
 corner cases, etc.. Except that it is *already* full of convoluted
 pitfalls and corner cases you must avoid, as illustrated above. I wish
 we would just acknowledge that the current symbol lookup / import rules
 (or at least the implementation thereof) are inadequate, and find a
 clean solution to this long-standing issue, instead of hoping that
 denying the problem will somehow, magically, make it all go away.
I fully agree. -- /Jacob Carlborg
Dec 03 2014
parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 3 December 2014 at 13:34:42 UTC, Jacob Carlborg 
wrote:
 On 2014-12-02 23:00, H. S. Teoh via Digitalmars-d wrote:

 I'm finding it harder and harder to accept Walter's stance 
 that symbol
 lookups should be kept simple and free from complications and 
 convoluted
 corner cases, etc.. Except that it is *already* full of 
 convoluted
 pitfalls and corner cases you must avoid, as illustrated 
 above. I wish
 we would just acknowledge that the current symbol lookup / 
 import rules
 (or at least the implementation thereof) are inadequate, and 
 find a
 clean solution to this long-standing issue, instead of hoping 
 that
 denying the problem will somehow, magically, make it all go 
 away.
I fully agree.
+1 So far it seems there is huge amount of anecdotal evidence that D module system does not work as designed and people keep inventing hacks and workaround to get over it. And yet Walter still considers it success. Well, compared to C includes - no doubt. Compared to actually usable modle system.. meh. Right now it pretty much requires relying on coding discipline and custom rules for import formatting.
Dec 04 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Tuesday, 2 December 2014 at 22:02:23 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 However, there are major issues with scoped imports currently,
 that make this otherwise ideal solution less-than-ideal, which 
 stem from
 the way 'import' is implemented in D. When the compiler 
 encounters an
 'import' statement, what it essentially does is to parse the 
 symbols in
 the target module, and add them to the symbol table for the 
 current
 scope.
This is probably single thing I hate most about D module system. Especially the fact that it is default behaviour with plain "import". I wish all symbols required to be force-qualified with a module by default :(
Dec 04 2014
prev sibling next sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 02.12.2014 23:00, H. S. Teoh via Digitalmars-d wrote:
 4) This isn't the end of the story. There's also this lovely bug:

 	https://issues.dlang.org/show_bug.cgi?id=1238

 which, as its number should tell you, has been around for a LONG time.
 Executive summary:

 	// mymod.d
 	module mymod;
 	private int impl;

 	// main.d
 	module main;
 	import mymod;

 	void impl() { ... }

 	void main() {
 		impl(); // Error: main.impl conflicts with mymod.impl (WAT?)
 	}
This compiles as expected. I guess you rather mean to import impl from two modules, one with a public impl, the other with a private impl. I don't think this belongs to the same set of issues as the other examples (though I agree private symbols from imports should not be considered in overload sets). Here is another bad example of local import symbol hijacking: module base; class Base { import std.conv; } module derived; import base; import std.stdio; string text = "123"; class Derived : Base { static void foo() { writeln(text); } } void main() { Derived.foo(); } This prints an empty line! The documentation isn't very specific about symbol lookup rules, I found the import rules on dlang.org/modules.html: "How basic imports work is that first a name is searched for in the current namespace. If it is not found, then it is looked for in the imports. If it is found uniquely among the imports, then that is used. If it is in more than one import, an error occurs." I don't think "namespace" here is chosen to mean something like "C++ namespace". If you replace it with "module", most of the problems described go away. The contradicting specification follows for scoped imports offending part: "The imports are looked up to satisfy any unresolved symbols at that scope. Imported symbols may hide symbols from outer scopes." This defeats one of the major goals of the module system: avoiding symbol hijacking (http://dlang.org/hijack.html). Removing that last sentence would be in line with the paragraph above. I.e. local imports just add to the imports. This does not make the rules any more difficult. (The implementation would be a tad more involved, though.)
Dec 04 2014
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Dec 05, 2014 at 12:25:49AM +0100, Rainer Schuetze via Digitalmars-d
wrote:
 
 
 On 02.12.2014 23:00, H. S. Teoh via Digitalmars-d wrote:
4) This isn't the end of the story. There's also this lovely bug:

	https://issues.dlang.org/show_bug.cgi?id=1238

which, as its number should tell you, has been around for a LONG
time.  Executive summary:

	// mymod.d
	module mymod;
	private int impl;

	// main.d
	module main;
	import mymod;

	void impl() { ... }

	void main() {
		impl(); // Error: main.impl conflicts with mymod.impl (WAT?)
	}
This compiles as expected. I guess you rather mean to import impl from two modules, one with a public impl, the other with a private impl. I don't think this belongs to the same set of issues as the other examples (though I agree private symbols from imports should not be considered in overload sets).
Sorry, poor example. Here's a tested example that produces an error: // mod.d module mod; private struct S { } // main.d module main; struct S { float x; int y; } void main() { import mod; S s; } Compiler output: /tmp/test.d(10): Error: ScopeDsymbol main.__anonymous.__anonymous struct mod.S is private
 Here is another bad example of local import symbol hijacking:
[...] My favorite variation on your example: import std.stdio; void say(string text) { import std.conv; writeln(text); } void main() { say("Hello world"); } prints an empty line. :-P Commenting out the `import std.conv;` line fixes the problem. T -- "A man's wife has more power over him than the state has." -- Ralph Emerson
Dec 04 2014
prev sibling parent "Meta" <jared771 gmail.com> writes:
 module base;

 class Base
 {
 	import std.conv;
 }

 module derived;
 import base;
 import std.stdio;

 string text = "123";

 class Derived : Base
 {
 	static void foo()
 	{
 		writeln(text);
 	}
 }

 void main()
 {
 	Derived.foo();
 }

 This prints an empty line!
My word, I think this example made me physically ill.
Dec 04 2014
prev sibling next sibling parent Bruno Medeiros <bruno.do.medeiros+dng gmail.com> writes:
On 02/12/2014 22:00, H. S. Teoh via Digitalmars-d wrote:
 Recently, in a bid to reduce the messy tangle that is the web of
 interdependencies among Phobos modules, many module-scope imports have
 been replaced with scoped imports. In addition to reducing gratuitous
 dependencies (the scoped import is not processed in a template body
 until the template is actually needed, so if user code doesn't need that
 template, then it won't pull in a whole bunch of unnecessary Phobos
 modules), scoped imports also make the code easier to refactor -- you
 can move things around more easily without breaking it.

 However, there are major issues with scoped imports currently,
 that make this otherwise ideal solution less-than-ideal, which stem from
 the way 'import' is implemented in D. When the compiler encounters an
 'import' statement, what it essentially does is to parse the symbols in
 the target module, and add them to the symbol table for the current
 scope. While this is a simple, straightforward implementation method, it
 leads to the following problems:

 1) Consider this code:

 	// Quiz for the reader: what does this code do?
 	void myFunc(string text, string number) {
 		import std.conv;
 		import std.stdio;

 		auto x = to!int(number);
 		writeln(text, x);
 	}
 	void main() {
 		myFunc("The count is: ", "123");
 	}

 Here's the output:

 	123

 The reason is that `import std.conv;` pulls in *all* the symbols from
 std.conv into the local scope, including the function called "text".
 This causes "text" (i.e., std.conv.text) to shadow the parameter "text",
 so `writeln(text, x);` is interpreted as:

 	std.stdio.writeln(std.conv.text(), x);

 so the parameter "text" is never used. This is just a simple example;
 the problem becomes much harder to trace when the shadowed symbol is not
 this close to the import scope:

 	struct S {
 		string text;
 		...
 		void method() {
 			import std.conv;
 			...
 			writeln(text); // uses std.conv.text, not this.text
 		}
 	}

 Worse yet, suppose we have this code, that today works correctly:

 	struct S {
 		string buf;
 		void format() {
 			buf = ... /* format S's data into string form */
 		}
 		void method() {
 			import std.file;
 			...		// do some processing based on files
 			format();	// calls this.format()
 		}
 	}

 Now, suppose at some point in the future, we introduce a function also
 called format() in std.file, that formats your hard drive. The user code
 is unchanged. What happens when you recompile the program and run it? It
 will compile successfully... and then format your hard drive without any
 warning.

 This is the crux of the issue behind:

 	https://issues.dlang.org/show_bug.cgi?id=10378


 2) OK, so the conclusion is that unqualified scoped imports are
 dangerous, right? Alright, let's see what happens when we use qualified
 imports:

 	// mod.d
 	module mod;
 	struct S {
 		// Use a fully-qualified import.
 		// We place it in the body of S because S's methods
 		// repeatedly needs it -- after all, DRY is good, right?
 		import std.format : format;

 		void method1(string fmt) {
 			writeln(format(fmt, ... ));
 		}

 		void method2() {
 			auto s = format("abc %s def", ...);
 			...
 		}
 	}

 	// main.d
 	module main;
 	import mod; // we need the definition of S

 	void format(S s) {
 		... /* do something with s */
 	}

 	void main() {
 		S s;
 		s.format(); // UFCS -- should call main.format(s), right?
 	}

 Wrong. That last line in main() actually calls std.format.format. Why?
 Because in mod.d, the `import std.format : format` declaration actually
 pulls in std.format.format into the symbol table of S, therefore,
 S.format now refers to std.format.format. This, therefore, hijacks the
 intended call to main.format.

 This is: https://issues.dlang.org/show_bug.cgi?id=13808


 3) OK, so putting scoped, qualified imports in struct bodies is a bad
 idea, they should always go into the method bodies, then we'll be OK,
 right?  Sure... until you need to do this:

 	// A
 	struct S {
 		// B
 		void method(R)(R range)
 			if (isInputRange!R)
 		{
 			// C
 		}
 	}

 This will not compile unless you import std.range : isInputRange
 somewhere.  Putting it in the line marked //A introduces a module-global
 dependency on std.range, which is what we're trying to eliminate. So
 that's a no-go.  Putting in //C is also a no-go, because we need
 isInputRange in the sig constraint, which is outside the function body.
 So the only option is //B. However, if we do that, then the imported
 symbols will get pulled into S's symbol table, thus you have this
 situation:

 	// mymod.d
 	module mymod;
 	struct S {
 		// Now everyone is happy, right...?
 		import std.range : isInputRange;

 		void method(R)(R range)
 			if (isInputRange!R)
 		{
 		}
 	}

 	// main.d
 	module main;
 	import mymod;

 	// Well ...
 	void main() {
 		S s;

 		static assert(isInputRange!(int[])); // fails -- as it should

 		static assert(s.isInputRange!(int[])); // works -- WAT?
 	}

 So even though isInputRange is really only required by the
 *implementation* of S, yet the consequence of the compiler's
 implementation of import is that it pollutes S's namespace with the
 imported symbols in such a way that external code can access said
 symbols using S as a proxy -- even if S was never intended to act as
 such a proxy!!

 And if you used an unqualified import instead, the problem is
 compounded:

 	// mymod.d
 	module mymod;

 	struct S {
 		import std.algorithm; // oh yes
 		...
 		void method() {
 			// hey, it does make code easier to write, we
 			// don't have to keep importing std.algorithm
 			// all over the place:
 			auto x = map!(...).filter!(...).sort();
 			...
 		}
 		...
 	}

 	// main.d
 	module main;

 	void main() {
 		// hey, we only need the definition of S, right?
 		import mymod : S;

 		S s;

 		// But we can access all sorts of stuff through S...
 		// (Not the mention UFCS hijacking happening everywhere!)
 		auto r = s.map!(...);
 		auto t = s.filter!(...);
 		auto u = s.canFind(...);

 		// when they shouldn't even work at all!!
 		auto v = canFind(s, ...); // <--- compile error
 	}


 4) This isn't the end of the story. There's also this lovely bug:

 	https://issues.dlang.org/show_bug.cgi?id=1238

 which, as its number should tell you, has been around for a LONG time.
 Executive summary:

 	// mymod.d
 	module mymod;
 	private int impl;

 	// main.d
 	module main;
 	import mymod;

 	void impl() { ... }

 	void main() {
 		impl(); // Error: main.impl conflicts with mymod.impl (WAT?)
 	}

 Basically, the underlying cause is the same. Import pulls the symbols of
 mymod.d into the current scope (including private ones -- which the
 compiler currently has to, since otherwise it couldn't compile mymod's
 references to private symbols). So when you try to reference 'impl', the
 overload set contains both main.impl and mymod.impl, which is an
 ambiguity error.

 This doesn't sound too bad, until you consider that mymod.d could be an
 external library that, originally, did *not* define a symbol called
 "impl". So your code happily compiles and works, and then you upgrade to
 the next release of the library, and suddenly you get a compile error,
 because the library authors introduced a PRIVATE symbol into the
 library.


 In all of the above cases, the compiler's current "simple"
 implementation of lookup rules and imports is creating a ripple effect
 of loopholes and corner cases that all add up to poor encapsulation and
 maintainability problems: local imports in structs makes imported
 symbols visible to unrelated outside code; private symbols in imported
 modules conflict with local symbols; even scoped, qualified imports
 (ostensibly the "cleanest" way to import something) can shadow
 parameters and enclosing struct/class members.

 In order to avoid these issues, one has to adopt increasingly convoluted
 ways of writing code based on a series of avoidances -- avoid
 module-global imports (introduces needless dependencies), avoid
 unqualified imports (shadows parameters), avoid putting imports in
 struct/class bodies (leaks imported symbols to outside world), etc..
 This is beginning to sound like C++ all over again (don't use malloc,
 use new; don't use new for arrays, use new[]; don't use raw pointers,
 use std::auto_ptr; etc....). :-(

 I'm finding it harder and harder to accept Walter's stance that symbol
 lookups should be kept simple and free from complications and convoluted
 corner cases, etc.. Except that it is *already* full of convoluted
 pitfalls and corner cases you must avoid, as illustrated above. I wish
 we would just acknowledge that the current symbol lookup / import rules
 (or at least the implementation thereof) are inadequate, and find a
 clean solution to this long-standing issue, instead of hoping that
 denying the problem will somehow, magically, make it all go away.


 T
I agree too. I noticed this too when I was implementing the symbol lookup rules in the semantic engine for DDT. It was very weird to me that the imports would make symbols accessible not only during lexical lookup, but during qualified lookups as well :S . You said that Walter wanted the symbol lookup rules to be as simple as possible. Well, a lot of that bad design could be fixed by simply stating that import symbols are only visibile in lexical lookup. It wouldn't solve all issues, (namely example 1 there) but then we could just use qualified imports instead (and the issues with those would be fixed). -- Bruno Medeiros https://twitter.com/brunodomedeiros
Dec 05 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/3/14 8:00 AM, H. S. Teoh via Digitalmars-d wrote:
 I'm finding it harder and harder to accept Walter's stance that symbol
 lookups should be kept simple and free from complications and convoluted
 corner cases, etc.. Except that it is*already*  full of convoluted
 pitfalls and corner cases you must avoid, as illustrated above. I wish
 we would just acknowledge that the current symbol lookup / import rules
 (or at least the implementation thereof) are inadequate, and find a
 clean solution to this long-standing issue, instead of hoping that
 denying the problem will somehow, magically, make it all go away.
I agree we need to take a closer look at name lookup now that local imports are becoming more popular. This may be one of those cases in which simpler language rules transfer complexity to the user. Andrei
Dec 07 2014
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/2/2014 2:00 PM, H. S. Teoh via Digitalmars-d wrote:
 4) This isn't the end of the story. There's also this lovely bug:

 	https://issues.dlang.org/show_bug.cgi?id=1238

 which, as its number should tell you, has been around for a LONG time.
 Executive summary:

 	// mymod.d
 	module mymod;
 	private int impl;

 	// main.d
 	module main;
 	import mymod;

 	void impl() { ... }

 	void main() {
 		impl(); // Error: main.impl conflicts with mymod.impl (WAT?)
 	}
That error doesn't happen. That isn't what 1238 is.
 Basically, the underlying cause is the same. Import pulls the symbols of
 mymod.d into the current scope
No, it does not. Symbols in the current scope override imports from the same scope. 1238 is about private symbols from one import set against public symbols from another import, an entirely different thing.
Dec 08 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/2/2014 2:00 PM, H. S. Teoh via Digitalmars-d wrote:
 However, there are major issues with scoped imports currently,
 that make this otherwise ideal solution less-than-ideal, which stem from
 the way 'import' is implemented in D. When the compiler encounters an
 'import' statement, what it essentially does is to parse the symbols in
 the target module, and add them to the symbol table for the current
 scope.
No, it does not. Each scope has a symbol table of symbols declared in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L298 and it has an array of imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L301 Only if the symbol is not found in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L902 it then looks in the imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L910
Dec 08 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Mon, Dec 08, 2014 at 12:34:07AM -0800, Walter Bright via Digitalmars-d wrote:
 On 12/2/2014 2:00 PM, H. S. Teoh via Digitalmars-d wrote:
However, there are major issues with scoped imports currently,
that make this otherwise ideal solution less-than-ideal, which stem from
the way 'import' is implemented in D. When the compiler encounters an
'import' statement, what it essentially does is to parse the symbols in
the target module, and add them to the symbol table for the current
scope.
No, it does not. Each scope has a symbol table of symbols declared in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L298 and it has an array of imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.h#L301 Only if the symbol is not found in the current scope, https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L902 it then looks in the imported symbol tables: https://github.com/D-Programming-Language/dmd/blob/master/src/dsymbol.c#L910
I realize that my understanding on the workings of dmd may not be quite accurate, and I apologize for any inadvertent misrepresentations, but the fact remains that the current behaviour of import is very counterintuitive, and users will continue having issues with it if the current state of things continue to hold. Having imported symbols shadow symbols in outer lexical scopes, while it may make sense in implementation, is a very counterintuitive behaviour. It means unrelated modules can hijack symbols from outer lexical scopes with no indication whatsoever -- if you `import external_lib` today, everything may work fine, but upon upgrading external_lib to a new version, suddenly one of your outer lexical symbols is hijacked without any warning. Having private symbols conflict with public ones is also very counterintuitive and frustrating behaviour -- even if the underlying cause is different, it's perceived as a related issue. Having scoped imports at module scope leak symbols to importers of the module is also very counterintuitive. Having symbols imported at struct scope visible via S.symbol is extremely counterintuitive, though I believe Kenji has already submitted a PR for this. (Thanks, Kenji!!) And so on. These may be genuine bugs, or design flaws, or an oversight, but whatever the cause, the result is the same fly in what could otherwise have been a tasty soup of a nice module system. T -- One reason that few people are aware there are programs running the internet is that they never crash in any significant way: the free software underlying the internet is reliable to the point of invisibility. -- Glyn Moody, from the article "Giving it all away"
Dec 08 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/8/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 I realize that my understanding on the workings of dmd may not be quite
 accurate, and I apologize for any inadvertent misrepresentations, but
 the fact remains that the current behaviour of import is very
 counterintuitive, and users will continue having issues with it if the
 current state of things continue to hold.
I understand, my purpose on that post was to correct misunderstanding of how it currently works. The two-step lookup method, of current scope then imports in current scope, is misunderstood by nearly everyone.
Dec 08 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 08 Dec 2014 14:30:08 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 The two-step lookup method, of current scope then imports in current scop=
e, is=20
 misunderstood by nearly everyone.
this is the clear sign of the existing problem.
Dec 08 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/8/2014 7:34 PM, ketmar via Digitalmars-d wrote:
 On Mon, 08 Dec 2014 14:30:08 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 The two-step lookup method, of current scope then imports in current scope, is
 misunderstood by nearly everyone.
this is the clear sign of the existing problem.
I disagree. None of the issues raised have to do with that.
Dec 08 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 08 Dec 2014 19:40:38 -0800
Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On 12/8/2014 7:34 PM, ketmar via Digitalmars-d wrote:
 On Mon, 08 Dec 2014 14:30:08 -0800
 Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 The two-step lookup method, of current scope then imports in current s=
cope, is
 misunderstood by nearly everyone.
this is the clear sign of the existing problem.
=20 I disagree. None of the issues raised have to do with that.
if the thing is "misunderstood by nearly everyone" that means that the thing is counterintu... ah, sorry, i forgot that "being counterintuitive" is an unofficial D motto.
Dec 08 2014