digitalmars.D - cycle dependencies
- DigitalDesigns (10/10) May 30 2018 Seriously stupid bug here!
- Andrea Fontana (4/15) May 30 2018 I think an example would be useful. Anyway, if you think there's
- Steven Schveighoffer (17/30) May 30 2018 It is ignorant. It only checks for the presence of static constructors,
- Stefan (5/7) May 30 2018 For serious dependency checking, you could try
- Steven Schveighoffer (23/31) May 30 2018 OK, so a couple things:
- DigitalDesigns (34/68) May 30 2018 Why is this a runtime issue? It is not as if the execution of
- Simen =?UTF-8?B?S2rDpnLDpXM=?= (18/24) May 30 2018 Because of separate compilation, the compiler can't do it.
- Steven Schveighoffer (15/39) May 31 2018 Yes, this is really the crux of it.
- David Bennett (16/40) May 31 2018 Thinking about this problem for a while I can up with something
- Steven Schveighoffer (22/63) May 31 2018 It's really an interesting problem, probably one of the most puzzling
- Neia Neutuladh (30/37) May 31 2018 This is how languages get custom object formats and custom
- Steven Schveighoffer (23/43) Jun 01 2018 Yep. I know that this is something that we need to be wary of. The
- Neia Neutuladh (11/18) Jun 01 2018 Thanks to .di files, when I compile my executable, I don't always
- Steven Schveighoffer (19/37) Jun 01 2018 The .di file is just an interface, it doesn't know what's actually
- Neia Neutuladh (12/17) Jun 02 2018 Yes, this is my point. (Communication is much harder than I
- Simen =?UTF-8?B?S2rDpnLDpXM=?= (10/24) Jun 02 2018 I believe Steve's point was that with the suggested json file
Seriously stupid bug here! I had an enum an static this() { } (empty, forgot why I added it) one module and an struct in another module that converted values from the enum. I imported only that enum and guess what?!?! Cycle dependency! removed the static this() { } and worked! The cycle dependency checking code is far too ignorant. It just blindly checks without any rationale. Someone said this was because phobos had a bug in it so the cycles were checked, maybe it's time to address that issue rather than breaking code arbitrarily?
May 30 2018
On Wednesday, 30 May 2018 at 12:09:21 UTC, DigitalDesigns wrote:Seriously stupid bug here! I had an enum an static this() { } (empty, forgot why I added it) one module and an struct in another module that converted values from the enum. I imported only that enum and guess what?!?! Cycle dependency! removed the static this() { } and worked! The cycle dependency checking code is far too ignorant. It just blindly checks without any rationale. Someone said this was because phobos had a bug in it so the cycles were checked, maybe it's time to address that issue rather than breaking code arbitrarily?I think an example would be useful. Anyway, if you think there's a bug there, write a report: https://issues.dlang.org Andrea
May 30 2018
On 5/30/18 8:09 AM, DigitalDesigns wrote:Seriously stupid bug here! I had an enum an static this() { } (empty, forgot why I added it) one module and an struct in another module that converted values from the enum. I imported only that enum and guess what?!?! Cycle dependency! removed the static this() { } and worked! The cycle dependency checking code is far too ignorant.It is ignorant. It only checks for the presence of static constructors, not what they do. This is as designed -- there are too many ways for dependencies to leak between modules that it's really really hard to make it check real dependencies.It just blindly checks without any rationale. Someone said this was because phobos had a bug in it so the cycles were checked, maybe it's time to address that issue rather than breaking code arbitrarily?I don't know what was said, but the cycle code was very broken originally. A few years ago, I fixed it, and then it was re-broken by accident, and I re-fixed it. It should work properly now. But that has nothing to do with adding an empty static ctor, it had to do with properly detecting cycles. The cycle detection functions as designed. You can disable it via DRT flags, but this means that there is no guarantee that they run in the proper order. The compiler would have to build a much more sophisticated dependency tree in order to do anything more clever, but I doubt it would ever be able to be perfect in detecting which things truly depend on one another. -Steve
May 30 2018
On Wednesday, 30 May 2018 at 13:26:53 UTC, Steven Schveighoffer wrote:On 5/30/18 8:09 AM, DigitalDesigns wrote: ... it's really really hard to make it check real dependencies.For serious dependency checking, you could try https://github.com/funkwerk/depend. It's used in quite large commercial projects.
May 30 2018
On 5/30/18 11:50 AM, Stefan wrote:On Wednesday, 30 May 2018 at 13:26:53 UTC, Steven Schveighoffer wrote:OK, so a couple things: 1. The cycle checking has to be really really fast, and run in a small amount of memory. It runs on startup of EVERY D program. I would LOVE to get rid of this problem, but unless we either invent our own object/linker system, or create some tools to work on the linked binary, we have to do this. 2. The cycle algorithm is fast and works correctly, given the dependencies that the compiler has provided. It's really not the cycle algorithm that is the problem, but the determination of dependencies. In other words, the problem is the compiler not tracing all the actual dependencies and outlining that in some structure to be parsed later during program startup. As you increase the number of dependency points, and therefore the graph size, the cycle algorithm has to take longer to figure out if there are any cycles. So even if we can fix the problem outlined here, the cost may not make it worth the effort! There are some ideas Martin and I have fleshed out a bit, which would help reduced the cyclic dependencies, but so far, nobody has had the time and/or skills to implement. For instance: https://issues.dlang.org/show_bug.cgi?id=16265 https://github.com/dlang/druntime/pull/1602#issuecomment-231527759 -SteveOn 5/30/18 8:09 AM, DigitalDesigns wrote: ... it's really really hard to make it check real dependencies.For serious dependency checking, you could try https://github.com/funkwerk/depend. It's used in quite large commercial projects.
May 30 2018
On Wednesday, 30 May 2018 at 18:49:40 UTC, Steven Schveighoffer wrote:On 5/30/18 11:50 AM, Stefan wrote:Why is this a runtime issue? It is not as if the execution of static this are non-deterministic. The compiler and linker must order the calls in some way. Maybe this is what you mean by own object/linker? But even then, they would only have to be checked once so why check every time the exe is ran when once it is ran it must remain statically oriented for all future. two exe's could be produced, one with only the static this code which is ran after linking and verified, if doesn't pass then a linker error is thrown. These methods could be used for better alternatives but the real issue is dependencies of static constructors. How bout instead require the user to deal with it? The user can specify the order to run and the compiler just blindly obeys. pragma(cycle, 1) static this() { } ... pragma(cycle, 2) static this() { } It could require cycle to be unique or just share and let the compiler do what it wants(assumes no cycles). This gives the programmer a fix but doesn't assume there are cycles, which most times there are not. Later on, more intelligent cycle detecting code can be created and added and then can give a warning in which the user can go back and add the pragmas. Alternatively, only require the pragmas on modules that cyclically import each other. Even if their is no cycle, the programmer just adds a pragma to satisfy the compiler/linker/rt.On Wednesday, 30 May 2018 at 13:26:53 UTC, Steven Schveighoffer wrote:OK, so a couple things: 1. The cycle checking has to be really really fast, and run in a small amount of memory. It runs on startup of EVERY D program. I would LOVE to get rid of this problem, but unless we either invent our own object/linker system, or create some tools to work on the linked binary, we have to do this. 2. The cycle algorithm is fast and works correctly, given the dependencies that the compiler has provided. It's really not the cycle algorithm that is the problem, but the determination of dependencies. In other words, the problem is the compiler not tracing all the actual dependencies and outlining that in some structure to be parsed later during program startup. As you increase the number of dependency points, and therefore the graph size, the cycle algorithm has to take longer to figure out if there are any cycles. So even if we can fix the problem outlined here, the cost may not make it worth the effort! There are some ideas Martin and I have fleshed out a bit, which would help reduced the cyclic dependencies, but so far, nobody has had the time and/or skills to implement. For instance: https://issues.dlang.org/show_bug.cgi?id=16265 https://github.com/dlang/druntime/pull/1602#issuecomment-231527759 -SteveOn 5/30/18 8:09 AM, DigitalDesigns wrote: ... it's really really hard to make it check real dependencies.For serious dependency checking, you could try https://github.com/funkwerk/depend. It's used in quite large commercial projects.
May 30 2018
On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns wrote:Why is this a runtime issue? It is not as if the execution of static this are non-deterministic. The compiler and linker must order the calls in some way. Maybe this is what you mean by own object/linker? But even then, they would only have to be checked once so why check every time the exe is ran when once it is ran it must remain statically oriented for all future.Because of separate compilation, the compiler can't do it. Because the generic linker doesn't do that sort of thing, the linker can't do it. The first part is essentially intractable - e.g. module A's static this uses a global variable in module B that's set by module C. Module A may be compiled separately from module C, so the compiler can't see the dependency. If the linker is to do it, the compiler needs to encode the information in the object file, and the linker must be made specially to support this. Maybe this could be put in some optional section in the object file, and linkers that don't support it would just ignore the information. If a non-compliant linker is used, the runtime needs to have a fallback, so even this won't get us entirely out of the woods. Only supporting special linkers comes with its own set of problems. -- Simen
May 30 2018
On 5/31/18 2:14 AM, Simen Kjærås wrote:On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns wrote:Yes, this is really the crux of it. I want to stress that the cycle detection is not really for the sake of detecting cycles, it's because you need to sort the modules in the order their static ctors should be called. When you can't sort them in an order, you have to call them in an arbitrary order (probably just in the order they were linked). It is really a shame we have to do this at runtime, but that's due to the compilation model we are stuck with, and the linker tools we deal with. However, once the binary is formed, it's possible we could *edit* the binary to hard-code the order -- at that point, everything is known. I have thrown that out there in terms of a nice project someone with the correct skills could do: https://forum.dlang.org/post/nl3du7$1as5$1 digitalmars.com OneWhy is this a runtime issue? It is not as if the execution of static this are non-deterministic. The compiler and linker must order the calls in some way. Maybe this is what you mean by own object/linker? But even then, they would only have to be checked once so why check every time the exe is ran when once it is ran it must remain statically oriented for all future.Because of separate compilation, the compiler can't do it. Because the generic linker doesn't do that sort of thing, the linker can't do it. The first part is essentially intractable - e.g. module A's static this uses a global variable in module B that's set by module C. Module A may be compiled separately from module C, so the compiler can't see the dependency. If the linker is to do it, the compiler needs to encode the information in the object file, and the linker must be made specially to support this. Maybe this could be put in some optional section in the object file, and linkers that don't support it would just ignore the information. If a non-compliant linker is used, the runtime needs to have a fallback, so even this won't get us entirely out of the woods. Only supporting special linkers comes with its own set of problems.
May 31 2018
On Thursday, 31 May 2018 at 15:15:44 UTC, Steven Schveighoffer wrote:On 5/31/18 2:14 AM, Simen Kjærås wrote:Thinking about this problem for a while I can up with something that could both reduce the work the runtime had to do and allow us to remove the error in the OP. But now I see most of it was suggested in that pull request. [1] Would the best way to implement this be to extend ModuleInfo and include a getter that loads the dependencies like importedModules(), or should the ctor/dtor stuff be moved to a new tables? Also we might be able to get the compiler to insert a cluster_hash that's unique for each compiler run and a pre-ordered cluster_index. Then use this to speed up sorting in the runtime. [1] https://github.com/dlang/druntime/pull/1602#issuecomment-231527759On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns wrote:Yes, this is really the crux of it. I want to stress that the cycle detection is not really for the sake of detecting cycles, it's because you need to sort the modules in the order their static ctors should be called. When you can't sort them in an order, you have to call them in an arbitrary order (probably just in the order they were linked). It is really a shame we have to do this at runtime, but that's due to the compilation model we are stuck with, and the linker tools we deal with.Why is this a runtime issue? It is not as if the execution of static this are non-deterministic. The compiler and linker must order the calls in some way.Because of separate compilation, the compiler can't do it. Because the generic linker doesn't do that sort of thing, the linker can't do it. The first part is essentially intractable - e.g. module A's static this uses a global variable in module B that's set by module C. Module A may be compiled separately from module C, so the compiler can't see the dependency.
May 31 2018
On 5/31/18 5:12 PM, David Bennett wrote:On Thursday, 31 May 2018 at 15:15:44 UTC, Steven Schveighoffer wrote:It's really an interesting problem, probably one of the most puzzling algorithmic problems I've had to solve. However, a lot of this is complicated because we don't do it at compile-time, but at runtime instead. We have way more time and memory and whatever we want to throw at it if we can move the sorting and cycle check to build time instead of runtime. If we want to make the check more fine grained, we are talking metadata for every function call, and which static ctor-initialized items it may access, and which lines actually initialize them.On 5/31/18 2:14 AM, Simen Kjærås wrote:Thinking about this problem for a while I can up with something that could both reduce the work the runtime had to do and allow us to remove the error in the OP. But now I see most of it was suggested in that pull request. [1] Would the best way to implement this be to extend ModuleInfo and include a getter that loads the dependencies like importedModules(), or should the ctor/dtor stuff be moved to a new tables?On Wednesday, 30 May 2018 at 20:57:32 UTC, DigitalDesigns wrote:Yes, this is really the crux of it. I want to stress that the cycle detection is not really for the sake of detecting cycles, it's because you need to sort the modules in the order their static ctors should be called. When you can't sort them in an order, you have to call them in an arbitrary order (probably just in the order they were linked). It is really a shame we have to do this at runtime, but that's due to the compilation model we are stuck with, and the linker tools we deal with.Why is this a runtime issue? It is not as if the execution of static this are non-deterministic. The compiler and linker must order the calls in some way.Because of separate compilation, the compiler can't do it. Because the generic linker doesn't do that sort of thing, the linker can't do it. The first part is essentially intractable - e.g. module A's static this uses a global variable in module B that's set by module C. Module A may be compiled separately from module C, so the compiler can't see the dependency.Also we might be able to get the compiler to insert a cluster_hash that's unique for each compiler run and a pre-ordered cluster_index. Then use this to speed up sorting in the runtime.Yeah, I don't think the runtime speed is a huge problem. Initializing the hash doesn't take much time at all. Hm... I just had a crazy idea: what if D had a switch that allowed it to store a dependency graph of constructors into a json file, and then when you link, there is a wrapper which consumes all these files, runs the cycle detection and sort, and then compiles a perfectly sorted moduleinfo section to be included in the binary (obviously, written in D and compiled by the compiler). This doesn't affect the linker, and uses all the existing tools we have to create object files and binaries. I'll have to bring this out into its own thread so it's not buried. -Steve
May 31 2018
On Thursday, 31 May 2018 at 23:17:20 UTC, Steven Schveighoffer wrote:Hm... I just had a crazy idea: what if D had a switch that allowed it to store a dependency graph of constructors into a json file, and then when you link, there is a wrapper which consumes all these files, runs the cycle detection and sort, and then compiles a perfectly sorted moduleinfo section to be included in the binary (obviously, written in D and compiled by the compiler).This is how languages get custom object formats and custom linkers. A fly in the ointment is .di files. This works today and does the right thing: // bar.d import modulewithstaticctor; shared static this() {} string something = modulewithstaticctor.someValue; // bar.di // no static ctor, no imports string something; // foo.d import bar, std.stdiod; void main() { writefln(something); } So mixing that dependency graph with .di files is "here be dragons" territory. If the dependency data were inserted into the object files instead (which it should be? that's what ModuleInfo is), then the compiler could potentially read this out before linking, and we could be sure that the resulting order is correct. Like, insert a weak symbol into each normal object file saying there's no dependency graph, and then the compiler can run on a set of object files to produce a new object file with the complete dependency graph as a strong symbol. But that's more work.
May 31 2018
On 5/31/18 8:49 PM, Neia Neutuladh wrote:On Thursday, 31 May 2018 at 23:17:20 UTC, Steven Schveighoffer wrote:Yep. I know that this is something that we need to be wary of. The proposed file format, however, would not be something like an object or even a linker, just a text file that hints in how to link these together without having to do the sort.Hm... I just had a crazy idea: what if D had a switch that allowed it to store a dependency graph of constructors into a json file, and then when you link, there is a wrapper which consumes all these files, runs the cycle detection and sort, and then compiles a perfectly sorted moduleinfo section to be included in the binary (obviously, written in D and compiled by the compiler).This is how languages get custom object formats and custom linkers.A fly in the ointment is .di files. This works today and does the right thing:This actually is not a problem, because the dependency tree doesn't depend on whether the imported module has static ctors. It's how this works today -- each module only knows whether the modules has static ctors, and what imports it has. It doesn't go further than that. The dependency file would keep this same mechanism, we wouldn't be doing more interpretation.If the dependency data were inserted into the object files instead (which it should be? that's what ModuleInfo is), then the compiler could potentially read this out before linking, and we could be sure that the resulting order is correct. Like, insert a weak symbol into each normal object file saying there's no dependency graph, and then the compiler can run on a set of object files to produce a new object file with the complete dependency graph as a strong symbol.It's an interesting idea, the only benefit really being to avoid having separate files for the data. I don't know enough about object files to be able to speak coherently about it. I have a much better feeling, however, about having a more readable/understandable file format than object files. It's definitely a lot easier to deal with something like a JSON file than an object file.But that's more work.There's nothing in this proposal that is less than a lot of work :) So putting a bit more on top is OK, if it makes things more useable. This isn't something to flesh out buried in a sub-thread anyway. I'm going to put together a more complete description and proposal. Not a DIP yet, because I want to gauge interest first. -Steve
Jun 01 2018
On Friday, 1 June 2018 at 12:50:31 UTC, Steven Schveighoffer wrote:Thanks to .di files, when I compile my executable, I don't always know which modules import which other modules. That was half the point of my example. Are you talking about giving the initialization order for a subset of modules? Because that wouldn't give much of a speed increase. Or do you need all of your dependencies to have their own dependency order, and the compiler can concatenate them? As an alternative, you could produce the initialization order by running the program and seeing what order it uses.A fly in the ointment is .di files. This works today and does the right thing:This actually is not a problem, because the dependency tree doesn't depend on whether the imported module has static ctors. It's how this works today -- each module only knows whether the modules has static ctors, and what imports it has. It doesn't go further than that.
Jun 01 2018
On 6/1/18 11:42 AM, Neia Neutuladh wrote:On Friday, 1 June 2018 at 12:50:31 UTC, Steven Schveighoffer wrote:The .di file is just an interface, it doesn't know what's actually compiled in the binary. To put it another way, the compiler only generates a ModuleInfo (or dependency modules) for .d files. .di files are simply a public API for the .d files.Thanks to .di files, when I compile my executable, I don't always know which modules import which other modules. That was half the point of my example.A fly in the ointment is .di files. This works today and does the right thing:This actually is not a problem, because the dependency tree doesn't depend on whether the imported module has static ctors. It's how this works today -- each module only knows whether the modules has static ctors, and what imports it has. It doesn't go further than that.Are you talking about giving the initialization order for a subset of modules? Because that wouldn't give much of a speed increase. Or do you need all of your dependencies to have their own dependency order, and the compiler can concatenate them?Speed is one aspect, but the more important aspect is to do more fine-grained dependency tracking. Doing it on a function level vs. simply assuming all things in the module depend on all things in the dependent modules, allowing for instance module cycles where there is no real cycle between the data initialized. That's not going to work well with today's mechanisms because it would be much more expensive both in runtime and memory usage, and even further highlight the drawback that we are solving the same static problem on every run of a program.As an alternative, you could produce the initialization order by running the program and seeing what order it uses.Yes, but this still involves pushing all the dependency data into the binary. All we should care about at runtime is the order to execute ctors. We could even eliminate the needless storage of all the dependency modules that is there now. -Steve
Jun 01 2018
On Friday, 1 June 2018 at 17:59:21 UTC, Steven Schveighoffer wrote:The .di file is just an interface, it doesn't know what's actually compiled in the binary. To put it another way, the compiler only generates a ModuleInfo (or dependency modules) for .d files. .di files are simply a public API for the .d files.Yes, this is my point. (Communication is much harder than I thought.) When you encounter a .di file, you can't rely on an automated tool to tell you what modules need to be initialized or figure out an order for them. You have to do it manually, and if you mess it up, you get undefined behavior. This is why I called it "here be dragons" -- it's fraught. Unless your goal was to omit the depth-first search in the common case while preserving the rest of the current logic. I'm curious how much time that would save.
Jun 02 2018
On Saturday, 2 June 2018 at 17:17:02 UTC, Neia Neutuladh wrote:On Friday, 1 June 2018 at 17:59:21 UTC, Steven Schveighoffer wrote:I believe Steve's point was that with the suggested json file describing dependencies, that would be a part of the public interface, and so would be distributed alongside .di files. Of course, someone could forget to distribute those, and in such a case the runtime cycle test could be extended to do the conservative test if the compiler hasn't registered all modules as having json info. -- SimenThe .di file is just an interface, it doesn't know what's actually compiled in the binary. To put it another way, the compiler only generates a ModuleInfo (or dependency modules) for .d files. .di files are simply a public API for the .d files.Yes, this is my point. (Communication is much harder than I thought.) When you encounter a .di file, you can't rely on an automated tool to tell you what modules need to be initialized or figure out an order for them. You have to do it manually, and if you mess it up, you get undefined behavior.
Jun 02 2018