digitalmars.D - PROPOSAL: dsimpl
- Russell Lewis (129/129) Feb 22 2008 I have been thinking recently that it would make a lot of sense for us
- Anders Johnsen (7/7) Feb 23 2008 That is a really cool idea!
- Anders Johnsen (3/15) Feb 23 2008 Just found http://code.google.com/p/dil/
- Anders Johnsen (2/20) Feb 23 2008 I was maybe a bit to fast about this - it looks like it's GPL v.3
I have been thinking recently that it would make a lot of sense for us to develop an "D simplifier." This is a tool which would take complex D code, with all the cutting edge features, perform syntax checking, and then simplify it, and produce D code which is based on a vastly reduced feature set. (I argue that it should produce D code, instead of C, because D has many structural features, like dynamic arrays, garbage collection, modules, newfangled-const, etc. which would be impractical to express in C.) Originally, Walter described D as a language that was easy to parse and easy to write a compiler for. While he has kept the language syntax pretty easy to parse, I think we can all agree that it has developed, over time, into a monster of complexity. There's a reason why GDC can't keep up with DMD: it's hard. To make things worse, D has evolved over time to include very complex and (sometimes) poorly-specified features. IFTI (implicit function template instantiation) is a classic example: you write a template, close your eyes, and compile. It's a crap shoot whether the compiler will deduce what seems obvious to you or if it will fail. Let me say right here, though, I don't mean this as a criticism of Walter: IFTI, while a compelling and (IMHO) necessary feature, is mind-bogglingly hard to get right. To write a compiler which could deduce everything that a programmer might deduce is a Herculean task. So I ask anybody: do you think you could write a compiler which deduced everything that DMD deduced? Even if you could, how would you verify that you did? If *anyone* tried to write a compiler of their own, the inevitable consequence would be to fork the language into templates which DMD can deduce, and templates which the other compiler can deduce. Then consider CTFE (compile time function evaluation). This seems better specified...but would you say it would be "easy" to write a compiler that included this feature? Because of this, we have, effectively, only a single compiler in the ecosystem. This is how it stays until we change something. The key insight I had was that the vast majority of the features that make a D compiler complex (like IFTI, CTFE, mixins, etc.) are not things which *structurally* change the program; they are deductions and calculations carried out by the front-end. So, my proposal is that we start an open-source project which is a "D simplifier." This tool takes really complex D code, and performs transformations on it...but only performs transformations that are clear, always-valid, semantics-retaining transformations. That is, it performs only the sorts of transformations that *any* D compiler will need to make. The output is vastly simplified D code. Then, once it is working, we hound Walter to make future language changes (whenever possible) in this "dsimpl" tool, rather than making private changes in DMD. Of course, DMD (and all other compilers) will need to add new features over time. Some features (like const) really need to be known by the back-end D compiler. But most D features can be rewritten in a simpler form, and dsimpl is the standard, shared tool which should accomplish this rewriting. My argument is that it would be easy - or at least much easier - to write a D compiler which compiles that restricted subset. This would make it practical to have an ecosystem with many different D compilers, all of which reflect the same D standard. This tool also makes it practical for people to contribute to language development (when IFTI doesn't deduce your template, you can write the code to make it work, and contribute it to the project) or to experiment with new features. THE COUNTER-ARGUMENT, AND WHY IT REALLY IS THE PRO-ARGUMENT The obvious counter-argument to this is that any "dsimpl" tool, as I envision it, would be very complex. But that is, of course, exactly the reason why we *MUST* have this tool. If the translation tool is complex, then building a compiler that does the translation *and* then compiles the result is *VERY* complex. IMHO, the only reason to *not* have a common, open-source, standard translation tool is if translation was easy! SPECIFIC DESIGN IMHO, if you are going to build an open-source parser for the D language (which, of course, is the first step in writing a compiler/simplifier), you should package the parser as a library. So I would split the project into three parts: * A parser library. The objects include toString() functions to produce the D code that they represent. But the library could also be used by other tools. * A transformation program, which transformed the parsed code into a new form. 3rd party compilers might pick up the code at this point and compile it immediately. * Code which calls toString() on the transformed code to output the simplified code as D THINGS I THINK A SIMPLIFIER SHOULD DO My operating theory is that the simplifier should automatically do things which reduce the complexity of the language. But it should limit itself to things the implementation of which is clear and likely to be acceptable to all compiler writers. If we do a questionable transformation, and as a consequence complier X says "I can't use the simplifier, it does the wrong thing," then we haven't really accomplished what we need to. Here's my proposed (and incomplete) list of things it could do: * IFTI (make all templates explicit) * CTFE (replace all compile time function calls with literals) * Handle mixins * Replace all "auto" variables (type deduction) with explicit types * Make all calls to "typesafe variadic functions" (functions taking variadic arrays) explicit - that is, manually build the array argument which you will pass * Resolve version() and debug() blocks * Expand all aliases to the underlying types * Convert foreach() statements over arrays into for() statements * Convert foreach() statements over other types into delegate literals and call them in the correct way * Convert all operator overloading into member function calls * Rewrite all integer & string literals in a standard form, so the backend parser only has to parse one format for each type. * ...many more things... In addition, I think we could add a set of optional transformations, to be used only by people who wanted them. Such as: * Collapse all code (except library code) into a single source file (necessary for some of the optimizations below). This would involve name mangling of the symbols. * Expand all templates into non-template entities (again, requiring name mangling) and then remove the templates * Declare structs for all delegate literals, with the literal converted to a member function. Use the struct explicitly. (This removes any need to support delegate literals in the simplified language.) * When closures are required, allocate the struct on the heap instead of on the stack. * Inlining * Dead code elimination * Expression simplification & optimization * Advanced whole-program analysis & optimization WHO STARTS THIS? Unfortunately, I need approval from my management chain before I start any open source project, and it's very hard to get. So I can't start this project. But, it's much easier for me to get approval to join a project that is already ongoing. So this email is my proposal; if other people think that this is a good idea, they'll have to get it well started. But then I will join later. :) Thoughts? Russ
Feb 22 2008
That is a really cool idea! But i cannot stop thinking on clang when reading your post. It does pretty much what expect a simplifier to do, plus being a compiler. http://clang.llvm.org/ I would suggest learning a bit from clang when making this. And I would definitely join the project if it was started. Anders Johnsen
Feb 23 2008
Anders Johnsen wrote:That is a really cool idea! But i cannot stop thinking on clang when reading your post. It does pretty much what expect a simplifier to do, plus being a compiler. http://clang.llvm.org/ I would suggest learning a bit from clang when making this. And I would definitely join the project if it was started. Anders JohnsenJust found http://code.google.com/p/dil/ Looks like it's exactly what i was thinking about :)
Feb 23 2008
Anders Johnsen wrote:Anders Johnsen wrote:I was maybe a bit to fast about this - it looks like it's GPL v.3That is a really cool idea! But i cannot stop thinking on clang when reading your post. It does pretty much what expect a simplifier to do, plus being a compiler. http://clang.llvm.org/ I would suggest learning a bit from clang when making this. And I would definitely join the project if it was started. Anders JohnsenJust found http://code.google.com/p/dil/ Looks like it's exactly what i was thinking about :)
Feb 23 2008