digitalmars.D - proposal: private module-level import for faster compilation
- Timothee Cour via Digitalmars-d (35/35) Jul 20 2016 currently, top-level imports in a module A are visible by other modules ...
- Dicebot (4/4) Jul 20 2016 I think this is a wrong approach patching a problem instead of fixing
- Kagamin (13/17) Jul 21 2016 As I see dependency resolution has function granularity, but
- Kagamin (3/6) Jul 21 2016 So a solution for make would be -deps reporting full dependency
- Jacob Carlborg (14/25) Jul 21 2016 A guess:
- Kagamin (3/5) Jul 22 2016 It works for your example, but doesn't work for idiomatic D code,
- Dicebot (6/12) Jul 22 2016 .. which naturally leads to watching about Benjamin DConf talk about
- Jacob Carlborg (5/10) Jul 23 2016 How does this relate to templates? Or is the suggestion to now use
- Dicebot (4/6) Jul 24 2016 Benjamin proposed to stop considering `export` a protection
- Jacob Carlborg (4/7) Jul 24 2016 Ah, forgot that detail.
- ketmar (5/5) Jul 20 2016 i can't see what problem this thing is trying to solve.
- Jack Stouffer (8/9) Jul 20 2016 This, and function local imports, are hacks around the actual
- ketmar (3/6) Jul 20 2016 it does cache that (see template merging), it even causing some
- Timothee Cour (20/20) Jul 20 2016 this simple example shows this feature would provide a 16X
- ketmar (8/10) Jul 20 2016 100 ms speedup in exchange of creating another special case in
- ketmar (3/3) Jul 20 2016 p.s. the sole improvement in symbol lookup mechanics can speed up
- Timothee Cour (44/48) Jul 20 2016 If this example weren't enough, here's the other even more
- deadalnix (7/12) Jul 20 2016 That is purely an implementation problem. SDC doesn't have this
- ketmar (2/4) Jul 20 2016 this, i believe, closes the topic altogether.
- Jack Stouffer (3/5) Jul 20 2016 I concur. If the root problem is slow compilation, then there are
- Johnjo Willoughby (2/8) Jul 21 2016 Three people agree, this could be a first on the internet!
- Sebastien Alaiwan (69/72) Jul 24 2016 I don't think compilation time is a problem, actually. It more
- Chris Wright (17/20) Jul 24 2016 In order to get an equivalent speedup by refactoring, I need small
- Sebastien Alaiwan (34/44) Jul 24 2016 I agree with you, but I think you got me wrong.
currently, top-level imports in a module A are visible by other modules B importing A, and are visited (recursively) during compilation of A, slowing down compilation and increasing dependencies (eg with separate compilation model, a single file change will trigger a lot of recompilations).

I propose a private import [1] to mean an import that's only used inside function definitions, not on the outside scope. It behaves exactly as if the import occurred inside each scope (function and template definitions).

This is applicable for the common use case where an import is only used for symbols inside functions, not for types in function signatures.

----
module A;
private import util;
void fun1(){
  // as if we had 'import util;'
}
void fun2(){
  // as if we had 'import util;'
}
// ERROR: we need 'import util' to use baz in function declaration
void fun3(baz a){}
----
module util;
void bar(){}
struct baz{}
----
module B;
import A;
----

The following should not list 'util' as a dependency of B, since it's a 'private import':
dmd -c -o- -deps A.d

The benefits: faster compilation and recompilation (fewer dependencies).

NOTE [1] on syntax: currently private import just means import, we could use a different name if needed, but the particular syntax to use is a separate discussion.
Jul 20 2016
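For comparison, the effect described in the proposal can be approximated today with function-local (scoped) imports, at the cost of repeating the import in every function. The following is only a minimal sketch: the function names `hasZero` and `total` are made up for illustration, and how much compile time this saves depends on the compiler.

```d
// No top-level import here; each function brings in what it needs.

/// Uses std.algorithm.searching only inside the body.
bool hasZero(const(int)[] values)
{
    import std.algorithm.searching : canFind; // visible in this function only
    return values.canFind(0);
}

/// A second function has to repeat its own import; this repetition is
/// what the proposed module-level 'private import' would remove.
int total(const(int)[] values)
{
    import std.algorithm.iteration : sum;
    return values.sum;
}

unittest
{
    assert(hasZero([3, 0, 5]));
    assert(!hasZero([1, 2]));
    assert(total([1, 2, 3]) == 6);
}
```

It can be tried with `dmd -unittest -main -run` followed by the file name.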
I think this is a wrong approach patching a problem instead of fixing it. Real solution would be to improve and mature .di header generation and usage by compilers so that it can become the default way to import packages/libraries.
Jul 20 2016
On Wednesday, 20 July 2016 at 09:35:03 UTC, Dicebot wrote:
> I think this is a wrong approach patching a problem instead of fixing it. Real solution would be to improve and mature .di header generation and usage by compilers so that it can become the default way to import packages/libraries.

As I see dependency resolution has function granularity, but headers have only file granularity. How do you expect headers to work on finer granularity level? If a module depends on another module, the header must assume it depends on all members of that module and if one member indirectly changes due to its private dependencies, it must be assumed that all depending modules must be recompiled, because they depend on the changed module even if they don't depend on the changed member and its private dependencies.

Not sure if tup can solve this problem. It can if it builds full dependency graph for each file instead of having one graph for the whole project.
Jul 21 2016
On Thursday, 21 July 2016 at 08:52:42 UTC, Kagamin wrote:
> Not sure if tup can solve this problem. It can if it builds full dependency graph for each file instead of having one graph for the whole project.

So a solution for make would be -deps reporting full dependency graph per file. Would it work for make?
Jul 21 2016
On 2016-07-21 10:52, Kagamin wrote:
> As I see dependency resolution has function granularity, but headers have only file granularity. How do you expect headers to work on finer granularity level? If a module depends on another module, the header must assume it depends on all members of that module and if one member indirectly changes due to its private dependencies, it must be assumed that all depending modules must be recompiled, because they depend on the changed module even if they don't depend on the changed member and its private dependencies.
>
> Not sure if tup can solve this problem. It can if it builds full dependency graph for each file instead of having one graph for the whole project.

A guess:

module a;
import b;
void foo() { Bar bar; }

module b;
struct Bar {}

The .di/header for module "a" doesn't need to include "import b" because "Bar" is not part of the interface of module "a".

--
/Jacob Carlborg
Jul 21 2016
On Friday, 22 July 2016 at 06:38:25 UTC, Jacob Carlborg wrote:
> The .di/header for module "a" don't need to include "import b" because "Bar" is not part of the interface of module "a".

It works for your example, but doesn't work for idiomatic D code, which is always heavily templated.
Jul 22 2016
On 07/22/2016 10:23 AM, Kagamin wrote:
> On Friday, 22 July 2016 at 06:38:25 UTC, Jacob Carlborg wrote:
>> The .di/header for module "a" don't need to include "import b" because "Bar" is not part of the interface of module "a".
>
> It works for your example, but doesn't work for idiomatic D code, which is always heavily templated.

.. which naturally leads to watching Benjamin's DConf talk about fixing "export", and that is where everything clicks together. Organizing large projects as a bunch of small static libraries per package and defining the public API of those via `export` (and not just public) would achieve this topic's goal and much more, all without changing the language.
Jul 22 2016
On 2016-07-22 10:28, Dicebot wrote:
> .. which naturally leads to watching about Benjamin DConf talk about fixing "export" and that is where everything clicks together. Organizing large projects as bunch of small static libraries per package and defining public API of those via `export` (and not just public) would achieve this topic goal and much more, all without changing the language.

How does this relate to templates? Or is the suggestion to now use templates on API boundaries?

--
/Jacob Carlborg
Jul 23 2016
On Saturday, 23 July 2016 at 19:22:09 UTC, Jacob Carlborg wrote:
> How does this relate to templates? Or is the suggestion to now use templates on API boundaries?

Benjamin proposed to stop considering `export` a protection attribute and mark all functions called from templates as `export` (allowing `export private` if necessary).
Jul 24 2016
On 2016-07-25 01:12, Dicebot wrote:
> Benjamin proposed to stop considering `export` a protection attribute and mark all functions called from templates as `export` (allowing `export private` if necessary).

Ah, forgot that detail.

--
/Jacob Carlborg
Jul 24 2016
i can't see what problem this thing is trying to solve. did you ever measure the time taken by building the AST of an imported module? use separate compilation and/or templates to avoid codegen. problem solved.
Jul 20 2016
On Wednesday, 20 July 2016 at 07:45:12 UTC, Timothee Cour wrote:
> ...

This, and function local imports, are hacks around the actual problem: the compiler spending time on codegen for things your program will never use. IIRC the compiler also spends extra work on templates because it doesn't cache the result, so things like isInputRange!(string) could be evaluated hundreds of times for your program. Fixing those two things is the actual solution.
Jul 20 2016
On Wednesday, 20 July 2016 at 17:05:11 UTC, Jack Stouffer wrote:
> IIRC compiler also spends extra work on templates because it doesn't cache the result, so things like isInputRange!(string) could be evaluated hundreds of times for your program.

it does cache that (see template merging), it even causing some bugs. yet it is using linear search to find something in cache.
Jul 20 2016
this simple example shows this feature would provide a 16X speedup.

time dmd -c -o- -version=A -I$code main.d
0.16s
time dmd -c -o- -version=B -I$code main.d
0.01s

---main.d:
module tests.private_import.main;
import tests.private_import.fun;
void test(){}
---

---fun.d:
module tests.private_import.fun;
version(A) import std.datetime;
//version(C) private import std.datetime;
void foo(){
  // same as version(C) if this feature were implemented
  version(B) import std.datetime;
}
---
Jul 20 2016
On Wednesday, 20 July 2016 at 18:09:06 UTC, Timothee Cour wrote:
> this simple example shows this feature would provide a 16X speedup.

100 ms speedup in exchange of creating another special case in language? no, thank you, won't buy. that was exactly what i meant: if we'll look at *real* numbers instead of scales, we'll find that all amazing "speedups" are measured in terms of milliseconds for most projects, and in terms of seconds for 100mb+ projects. breaking language orthogonality for this is something i can't see as improvement. sorry.
Jul 20 2016
p.s. the sole improvement in symbol lookup mechanics can speed up the things by several factors, and without breaking the language. current dmdfe symbol/template lookup mechanics is not really fast.
Jul 20 2016
On Wednesday, 20 July 2016 at 18:21:46 UTC, ketmar wrote:
> p.s. the sole improvement in symbol lookup mechanics can speed up the things by several factors, and without breaking the language. current dmdfe symbol/template lookup mechanics is not really fast.

If this example weren't enough, here's the other even more compelling argument: speeding up recompilation for makefile-like tools that trigger recompilation when a dependency is modified:

---
module fun1;
import fun2;
void test1(){}
---
module fun2;
version(A) import std.datetime;
//version(proposed_feature) private import std.datetime;
void test2(){}
---

dmd -c -o- -version=proposed_feature -deps fun1.d
would show following dependencies:
fun2

dmd -c -o- -version=A -deps fun1.d
shows following 68 dependencies (68!)
That means that a change in any single dependency would trigger recompilations in many files.

fun2
core.attribute core.bitop core.exception core.internal.string core.internal.traits core.memory
core.stdc.config core.stdc.errno core.stdc.inttypes core.stdc.signal core.stdc.stdarg core.stdc.stddef core.stdc.stdint core.stdc.stdio core.stdc.stdlib core.stdc.string core.stdc.time core.stdc.wchar_
core.sys.osx.mach.kern_return
core.sys.posix.config core.sys.posix.dirent core.sys.posix.fcntl core.sys.posix.inttypes core.sys.posix.signal core.sys.posix.stdlib core.sys.posix.sys.select core.sys.posix.sys.stat core.sys.posix.sys.time core.sys.posix.sys.types core.sys.posix.sys.wait core.sys.posix.time core.sys.posix.unistd core.sys.posix.utime
core.time core.vararg
object
std.algorithm std.algorithm.comparison std.algorithm.iteration std.algorithm.mutation std.algorithm.searching std.algorithm.setops std.algorithm.sorting
std.array std.ascii std.bitmanip std.conv std.datetime std.exception std.file std.format std.functional
std.internal.cstring std.internal.unicode_tables
std.meta std.path std.range std.range.interfaces std.range.primitives
std.stdio std.stdiobase std.string std.system std.traits std.typecons std.typetuple std.uni
Jul 20 2016
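The recompilation argument comes down to the transitive closure of the import graph: a file must be rebuilt when anything reachable from it changes. A minimal sketch of that computation follows. The function `transitiveDeps` and the tiny import graph in the unittest are hypothetical stand-ins, not dmd's actual `-deps` output or the real imports of std.datetime.

```d
import std.algorithm.sorting : sort;

/// Returns every module reachable from `root` through `imports`, sorted by name.
/// `imports` maps a module name to the modules it imports directly.
string[] transitiveDeps(string[][string] imports, string root)
{
    bool[string] seen;          // modules already discovered
    string[] pending = [root];  // modules whose imports still need visiting

    while (pending.length > 0)
    {
        string current = pending[$ - 1];
        pending = pending[0 .. $ - 1];

        if (auto direct = current in imports)
        {
            foreach (dep; *direct)
            {
                if (dep !in seen)
                {
                    seen[dep] = true;
                    pending ~= dep;
                }
            }
        }
    }

    string[] result = seen.keys;
    sort(result);
    return result;
}

unittest
{
    // Hypothetical graph: one top-level import in fun2 drags in everything below it.
    string[][string] imports = [
        "fun1": ["fun2"],
        "fun2": ["std.datetime"],
        "std.datetime": ["core.time", "std.conv"],
    ];

    assert(transitiveDeps(imports, "fun1")
        == ["core.time", "fun2", "std.conv", "std.datetime"]);
    assert(transitiveDeps(imports, "std.conv").length == 0);
}
```

With the proposed feature, the edge from fun2 to std.datetime would not be followed when computing the dependencies of fun1, which is how the list would shrink to fun2 alone.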
Same answer: http://forum.dlang.org/post/nmngk8$inm$1@digitalmars.com
Jul 20 2016
On Wednesday, 20 July 2016 at 18:51:49 UTC, Timothee Cour wrote:
> That means that a change in any single dependency would trigger recompilations in many files.

so what? can you even imagine how many things you'll have to recompile if you'll change something in /usr/include? it's just your tools usually ignoring files there, but here you clearly included all system dependencies, so not ignoring libc system include files is a valid point from my side.
Jul 20 2016
On Wednesday, 20 July 2016 at 07:45:12 UTC, Timothee Cour wrote:
> currently, top-level imports in a module A are visible by other modules B importing A, and are visited (recursively) during compilation of A, slowing down compilation and increasing dependencies (eg with separate compilation model, a single file change will trigger a lot of recompilations).

That is purely an implementation problem. SDC doesn't have this problem, for instance, as it only parses/analyzes imports on demand. Modules imported by A are already not visible to B, but are still sometimes required when compiling B in order to resolve A's signatures.

Implementation problems should not be "fixed" by changing the language.
Jul 20 2016
On Wednesday, 20 July 2016 at 19:11:56 UTC, deadalnix wrote:
> Implementation problem should not be "fixed" by changing the language.

this, i believe, closes the topic altogether.
Jul 20 2016
On Wednesday, 20 July 2016 at 19:11:56 UTC, deadalnix wrote:
> Implementation problem should not be "fixed" by changing the language.

I concur. If the root problem is slow compilation, then there are much simpler, non-breaking changes that can be made to fix that.
Jul 20 2016
On Wednesday, 20 July 2016 at 19:59:42 UTC, Jack Stouffer wrote:
> On Wednesday, 20 July 2016 at 19:11:56 UTC, deadalnix wrote:
>> Implementation problem should not be "fixed" by changing the language.
>
> I concur. If the root problem is slow compilation, then there are much simpler, non-breaking changes that can be made to fix that.

Three people agree, this could be a first on the internet!
Jul 21 2016
On Wednesday, 20 July 2016 at 19:59:42 UTC, Jack Stouffer wrote:
> I concur. If the root problem is slow compilation, then there are much simpler, non-breaking changes that can be made to fix that.

I don't think compilation time is a problem, actually. It has more to do with dependency management and encapsulation.

Speeding up compilation should never be considered as an acceptable solution here, as it's not scalable: it just pushes the problem away, until your project size increases enough.

Here's my understanding of the problem:

// client.d
import server;
void f()
{
  Data x; // Data.sizeof depends on something in server_private.
  x.something = 3; // offset to 'something' depends on privateStuff.sizeof.
}

// server.d
private import server_private;
struct Data
{
  Opaque someOtherThing;
  int something;
}

// server_private.d
struct Opaque
{
  byte[43] privateStuff;
}

If you're doing separate compilation, your dependency graph has to express that "client.o" depends on "client.d", "server.d", but also "server_private.d".

GDC "-fdeps" option properly lists all transitive imported files (disclaimer: this was my pull request). It's irrelevant here that imports might be private or public, the dependency is still here. In other words, changes to "server_private.d" must always trigger recompilation of "client.d".

I believe the solution proposed by the OP doesn't work, because of voldemort types. It's always possible to return a struct whose size depends on something deeply private.

// client.d
import server;
void f()
{
  auto x = getData(); // Data.sizeof depends on something in server_private.
  x.something = 3; // offset to 'something' depends on privateStuff.sizeof.
}

// server.d
auto getData()
{
  private import server_private;
  struct Data
  {
    Opaque someOtherThing;
    int something;
  }
  Data x;
  return x;
}

// server_private.d
struct Opaque
{
  byte[43] privateStuff;
}

My conclusion is that maybe there's no problem in the language, nor in the dependency generation, nor in the compiler implementation. Maybe it's just a design issue.
Jul 24 2016
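The voldemort-type argument can be checked in a single file. The sketch below folds the three modules into one and marks the inner struct `static` (an assumption of this sketch, made so that no hidden context pointer can affect the layout). The alias name `Returned` is made up. The static asserts show that the caller-visible layout depends on `Opaque` even though the caller cannot name the returned type.

```d
// Stands in for server_private.d.
struct Opaque
{
    byte[43] privateStuff;
}

// Stands in for server.d: returns a type that callers cannot name.
auto getData()
{
    // `static` is an assumption of this sketch: it guarantees the struct
    // carries no hidden context pointer, so only the fields define its layout.
    static struct Data
    {
        Opaque someOtherThing;
        int something;
    }
    return Data.init;
}

// Stands in for client.d: the layout is observable through typeof.
alias Returned = typeof(getData());

// Changing privateStuff's length would change both of these values.
static assert(Returned.sizeof > Opaque.sizeof);
static assert(Returned.something.offsetof >= Opaque.sizeof);

unittest
{
    auto x = getData();
    x.something = 3; // this store is compiled against the offset checked above
    assert(x.something == 3);
}
```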
On Sun, 24 Jul 2016 12:53:50 +0000, Sebastien Alaiwan wrote:
> Speeding up compilation should never be considered as an acceptable solution here, as it's not scalable: it just pushes the problem away, until your project size increases enough.

In order to get an equivalent speedup by refactoring, I need small modules. Probably no more than a handful of functions or one type. I also need to ensure that my types are as simple as possible -- free functions instead of methods, just so I can put them into different modules. That's busywork for me and an inconvenience for anyone who needs to import anything I wrote.

Look at std.algorithm. Tons of methods, and I imported it just for `canFind` and `sort`.

Look at std.datetime. It imports eight or ten different modules, and it needs every one of those for something or other. Should we split it into a different module for each type, one for formatting, one for parsing, one for fetching the current time, etc? Because that's what we would have to do to work around the problem in user code.

That would be terribly inconvenient and would just waste everyone's time. There is no reason to do it when the compiler could do it for us automatically.
Jul 24 2016
On Sunday, 24 July 2016 at 15:33:04 UTC, Chris Wright wrote:
> Look at std.algorithm. Tons of methods, and I imported it just for `canFind` and `sort`. Look at std.datetime. It imports eight or ten different modules, and it needs every one of those for something or other. Should we split it into a different module for each type, one for formatting, one for parsing, one for fetching the current time, etc? Because that's what we would have to do to work around the problem in user code. That would be terribly inconvenient and would just waste everyone's time.

I agree with you, but I think you got me wrong.

Modules like std.algorithm (and nearly every other, in any standard library) have very low cohesion. As you said, most of the time, the whole module gets imported, although only 1% of it is going to be used. (selective imports such as "import std.algorithm : canFind;" help you reduce namespace pollution, but not dependencies, because a change in the imported module could, for example, change symbol names.)

I guess low cohesion is OK for standard libraries, because splitting this into lots of files would result in long import lists on the user side, e.g:

import std.algorithm.canFind;
import std.algorithm.sort;
import std.algorithm.splitter;

(though, this seems a lot like most of us already do with selective imports).

But my point wasn't about the extra compilation time resulting from the unwanted import of 99% of std.algorithm. My point is about the recompilation frequency of *your* modules, due to changes in one module.

Although std.algorithm has low cohesion, it never changes (upgrading one's compiler doesn't count, as everything needs to be recompiled anyway). However, if your project has a "utils.d" composed of mostly unrelated functions, that is imported by almost every module in your project, and that is frequently changed, then I believe you have a design issue.

Any compiler is going to have a very hard time trying to avoid recompiling modules which only imported something in the 99% of utils.d which wasn't modified (and, by the way, it's not compatible with the separate compilation model).

Do you think I'm missing something here?
Jul 24 2016
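The remark about selective imports can be illustrated with a short sketch. Only the listed names become visible, but the importing file still depends on std.algorithm as a whole for recompilation purposes. The `static assert` below checks the name-visibility half of that claim; the dependency half is not something the code itself can show.

```d
import std.algorithm : canFind, sort; // selective: only these two names are bound

unittest
{
    int[] values = [3, 1, 2];
    sort(values);
    assert(values == [1, 2, 3]);
    assert(values.canFind(2));

    // Other std.algorithm names, such as `splitter`, are not in scope here,
    // even though the module that defines them was still read by the compiler.
    static assert(!__traits(compiles, splitter("a b", ' ')));
}
```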