digitalmars.dip.ideas - Mixin C
- Paul Backus (43/43) Mar 07 What if instead of importing C files like D modules, we could
- Richard (Rikki) Andrew Cattermole (8/8) Mar 07 ImportC is fundamentally a compiler specific extension.
- Paul Backus (10/16) Mar 07 Yeah that's reasonable. Honestly it would barely even look
- Richard (Rikki) Andrew Cattermole (7/28) Mar 07 I was thinking about a new kind of string, one that produces a struct
- zjh (10/13) Mar 07 Why not create a separate file `name extension`, such as
- Iain Buclaw (40/48) Mar 08 The first thing that comes to mind is that it'll allow doing
- Lance Bachmeier (8/10) Mar 08 Translation of C macros to D can sometimes [give incorrect
- Lance Bachmeier (3/13) Mar 08 And I realize you have addressed the macro thing too, but I think
- Walter Bright (8/8) Mar 29 You can add unit tests for C code already:
- bachmeier (13/21) Mar 29 The problem raised in that discussion has to do with things like
- Tim (47/50) Mar 08 Are multiple mixin(C) blocks evaluated in the same context?
- Paul Backus (42/88) Mar 08 Each block is preprocessed separately, and the result of
- Paul Backus (14/23) Mar 08 This also makes it trivial to apply function attributes to C
- Daniel N (2/12) Mar 09 wow, that is cool!
- Walter Bright (4/5) Mar 29 ```
- Steven Schveighoffer (31/52) Mar 22 So the entirety of `stdio.h` is included in the body of the D
- Paul Backus (35/62) Mar 23 In this specific example, it's overkill.
- Steven Schveighoffer (20/63) Mar 27 My objection is to the *requirement* that you include it in the
- Paul Backus (58/75) Mar 27 I think most C programmers would regard this as a horrifying
- max haughton (2/6) Mar 28 https://github.com/dlang/dmd/pull/14114
- Walter Bright (10/13) Mar 29 I don't see how it would improve preprocessor support. C macros used in ...
- Paul Backus (3/7) Mar 29 Message received. I won't spend any more time on this going
- Walter Bright (2/9) Mar 29 Your time is valuable and I don't want to waste it.
What if instead of importing C files like D modules, we could write bits of C code directly in the middle of our D code, like we do with inline ASM?

It might look something like this:

```d
void main()
{
    mixin(C) {
        #include <stdio.h>
        printf("Hello from C!\n");
    }
}
```

Here's how it could work:

1. The compiler takes the content of the `mixin(C)` block and passes it through the external C preprocessor.
2. The result of (1) is parsed as a C AST fragment using the ImportC parser.
3. The result of (2) is spliced into the AST in place of the `mixin(C)` block, and undergoes semantic analysis using ImportC semantics.

Mixin C would solve two big issues with the current ImportC approach: the poor preprocessor support, and the conflicts between `.c` and `.d` files in the compiler's import paths.

Because Mixin C runs the preprocessor at the point of *usage* rather than the point of *definition*, it allows you to make full use of C APIs that rely on the preprocessor, without having to translate macros to D (either automatically or by hand).

Because Mixin C blocks appear as code fragments inside `.d` files, rather than as separate `.c` files, you'll never have to worry about accidentally importing a C file when you meant to import a D module.

By the way, if you did want to treat a `.c` or `.h` file like its own module, you'd still be able to do so with Mixin C. Just write a simple `.d` wrapper, like this:

```d
module libwhatever;

mixin(C) {
    #include <libwhatever.h>
}
```

I haven't spent much time fleshing out the details of this idea, but it seems pretty promising. What do you guys think?
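To make the "point of usage" argument more concrete, here is a small plain-C sketch of the kind of macro-dependent API that is awkward to translate to D by hand. The macro names `ARRAY_LEN`, `STR`, and `DEFINE_GETTER` are made up for illustration and do not come from the post; this is ordinary C, not Mixin C, and it only shows work the preprocessor does before any parser sees the code.

```c
#include <stddef.h>

/* Hypothetical macros, for illustration only. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))   /* works for any array type */
#define STR(x) #x                                   /* stringification */
#define DEFINE_GETTER(name, value) \
    static int get_##name(void) { return (value); } /* token pasting */

/* Expands to: static int get_answer(void) { return (42); } */
DEFINE_GETTER(answer, 42)

static const int primes[] = {2, 3, 5, 7, 11};

size_t prime_count(void) { return ARRAY_LEN(primes); }
int answer(void) { return get_answer(); }
const char *answer_name(void) { return STR(get_answer); }
```

Stringification and token pasting operate on tokens rather than values, which is presumably why a macro-to-D translator has a hard time with them, while a preprocessor run at the point of usage handles them as usual.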
Mar 07
ImportC is fundamentally a compiler specific extension. Having raw C blocks in a D file gives me concern for IDE's wrt. syntax highlighting. We do have a comparable feature with inline assembly support in ldc/gdc where it uses strings. I would suggest that this is the direction to go in rather than a raw code block. It's a good idea that I do think is the right approach to the problem.
Mar 07
On Friday, 8 March 2024 at 03:37:02 UTC, Richard (Rikki) Andrew Cattermole wrote:
> Having raw C blocks in a D file gives me concern for IDE's wrt. syntax highlighting. We do have a comparable feature with inline assembly support in ldc/gdc where it uses strings. I would suggest that this is the direction to go in rather than a raw code block.

Yeah that's reasonable. Honestly it would barely even look different if you use a q{...} string:

    mixin(C) q{
        #include <stdio.h>
        printf("Hello from C!\n");
    };

I guess the lexer currently barfs on #include, but surely we can bend the rules on that if we need to.
Mar 07
On 08/03/2024 5:02 PM, Paul Backus wrote:
> On Friday, 8 March 2024 at 03:37:02 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> Having raw C blocks in a D file gives me concern for IDE's wrt. syntax highlighting. We do have a comparable feature with inline assembly support in ldc/gdc where it uses strings. I would suggest that this is the direction to go in rather than a raw code block.
>
> Yeah that's reasonable. Honestly it would barely even look different if you use a q{...} string:
>
>     mixin(C) q{
>         #include <stdio.h>
>         printf("Hello from C!\n");
>     };
>
> I guess the lexer currently barfs on #include, but surely we can bend the rules on that if we need to.

I was thinking about a new kind of string, one that produces a struct that has both a language string that the user provided as well as the contents. That way editors can syntax highlight if they understand it, or ignore it if they don't.

Might be overkill, but it does have some interesting possibilities.
Mar 07
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
> What if instead of importing C files like D modules, we could write bits of C code directly in the middle of our D code, like we do with inline ASM?

Why not create a separate file extension, such as `.dc`, and then use it like this:

```d
//aa.dc
#include <stdio.h>
printf("Hello from C!\n");

//b.d
import aa;
```
Mar 07
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
> ```d
> module libwhatever;
>
> mixin(C) {
>     #include <libwhatever.h>
> }
> ```
> I haven't spent much time fleshing out the details of this idea, but it seems pretty promising. What do you guys think?

The first thing that comes to mind is that it'll allow doing things inline that would otherwise be impossible in D.

```d
import std.stdio;

extern(C) int func(int x);

mixin(C) {
    // gdc supports these asm-declarations via gcc (ldc should too via clang)
    asm("
        .globl func
        .type func, function
        func:
        .cfi_startproc
        movl %edi, %eax
        addl $1, %eax
        ret
        .cfi_endproc
    ");
}

int main()
{
    int n = func(72);
    // mixin(C) not necessary as gdc+ldc support this natively with asm{""::;}
    mixin(C) {
        asm ("leal (%0,%0,4),%0" : "=r" (n) : "0" (n));
    }
    writeln("73*5 = ", n); // 73*5 = 365
    // ditto, asm{""}, but this is purely for presentation.
    mixin(C) {
        asm ("movq $60, %rax\n"
             "movq $2, %rdi\n"
             "syscall");
    }
    assert(0);
}
```
Mar 08
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
> I haven't spent much time fleshing out the details of this idea, but it seems pretty promising. What do you guys think?

Translation of C macros to D can sometimes [give incorrect behavior](https://github.com/dlang/dmd/pull/16199). With the current implementation, it does that silently, and that obviously raises questions about the reliability of the final product. My proposal is to allow the user to compile C code inside unit tests so there's at least a chance of catching bugs. What you are proposing would make that possible.
Mar 08
On Friday, 8 March 2024 at 14:39:46 UTC, Lance Bachmeier wrote:
> On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
>> I haven't spent much time fleshing out the details of this idea, but it seems pretty promising. What do you guys think?
>
> Translation of C macros to D can sometimes [give incorrect behavior](https://github.com/dlang/dmd/pull/16199). With the current implementation, it does that silently, and that obviously raises questions about the reliability of the final product. My proposal is to allow the user to compile C code inside unit tests so there's at least a chance of catching bugs. What you are proposing would make that possible.

And I realize you have addressed the macro thing too, but I think this is separately a valid use case.
Mar 08
You can add unit tests for C code already:

```
import myccode; // #define cmacro(a) 2

unittest
{
    assert(cmacro(1) == 2);
}
```
Mar 29
On Friday, 29 March 2024 at 07:21:43 UTC, Walter Bright wrote:
> You can add unit tests for C code already:
> ```
> import myccode; // #define cmacro(a) 2
>
> unittest
> {
>     assert(cmacro(1) == 2);
> }
> ```

The problem raised in that discussion has to do with things like

```
#define DOUBLE(x) (x) + (x)
DOUBLE(i++);
```

The output of DOUBLE isn't the problem, it's the part below it, which you referred to as metaprogramming. You can only test that by running the preprocessor on both lines. Currently, you'd have to create a new C file and add it to your project in order to put it in a unittest.

Now that some time has passed, where I've used this feature with at least 100,000 lines of C code, I'm not as concerned about it.
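As a rough illustration of why the use site matters more than the macro definition, here is a plain-C sketch built around the same `DOUBLE` macro. The helper `next` and the wrapper functions are made up for this example. It avoids `DOUBLE(i++)` itself, since `(i++) + (i++)` modifies `i` twice without sequencing and is undefined behavior in C; a function call with a side effect shows the double evaluation safely.

```c
/* The macro from the post above; note the missing outer parentheses. */
#define DOUBLE(x) (x) + (x)

static int calls = 0;

/* Hypothetical helper with a visible side effect. */
static int next(void) { return ++calls; }

/* Reads like 2 * (3 + 3), but expands to 2 * (3) + (3). */
int precedence_surprise(void) { return 2 * DOUBLE(3); }

/* Expands to (next()) + (next()): the argument is evaluated twice. */
int double_evaluation(void)
{
    calls = 0;
    return DOUBLE(next());
}

int calls_made(void) { return calls; }
```

Neither surprise is visible from the `#define` line alone; both only appear once the preprocessor has expanded a concrete use, which is the situation a unit test would need to cover.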
Mar 29
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
> What if instead of importing C files like D modules, we could write bits of C code directly in the middle of our D code, like we do with inline ASM?

Are multiple mixin(C) blocks evaluated in the same context? Symbols and macros from one mixin(C) block could then be used in another mixin(C) block, like in this example:

```D
mixin(C) {
    #include <stdio.h>
};

void main()
{
    mixin(C) {
        printf("test\n");
    }
}
```

The preprocessor call for the second block would need to know all macros from the first call.

Can code in mixin(C) statements access local variables from D? How would name conflicts be resolved when an identifier exists both in the current module and a C header file? In the following example `BUFSIZ` is both a local variable and a macro from a C header:

```D
void main()
{
    int BUFSIZ = 5;
    mixin(C) {
        #include <stdio.h>
        printf("BUFSIZ = %d\n", BUFSIZ);
    }
}
```

Are variables declared in mixin(C) statements interpreted as global or local variables?

```D
void main()
{
    mixin(C) {
        #include <stdio.h>
        fprintf(stderr, "test\n");
    }
}
```

The header declares variable `stderr`. If this is now a local variable, because the header is included inside a function, it could cause problems. Maybe this could be solved by treating C variables marked with `extern` as globals.
Mar 08
On Friday, 8 March 2024 at 17:06:21 UTC, Tim wrote:
> Are multiple mixin(C) blocks evaluated in the same context? Symbols and macros from one mixin(C) block could then be used in another mixin(C) block, like in this example:
> ```D
> mixin(C) {
>     #include <stdio.h>
> };
>
> void main()
> {
>     mixin(C) {
>         printf("test\n");
>     }
> }
> ```
> The preprocessor call for the second block would need to know all macros from the first call.

Each block is preprocessed separately, and the result of preprocessing is then evaluated (type-checked and compiled) in the context of the enclosing D scope. So, symbols are visible across different blocks, but preprocessor macros are not.

Your example would work, because it would expand to this:

```d
extern(C) int printf(const(char)* format, ...);
// Other definitions from stdio.h ...

void main()
{
    printf("test\n");
}
```

But this example would not work:

```d
mixin(C) {
    #include <stdio.h>
    #define info(s) printf("[info] %s\n", s);
}

void main()
{
    mixin(C) {
        info("test"); // error - undefined identifier 'info'
    }
}
```

> Can code in mixin(C) statements access local variables from D? How would name conflicts be resolved when an identifier exists both in the current module and a C header file? In the following example `BUFSIZ` is both a local variable and a macro from a C header:
> ```D
> void main()
> {
>     int BUFSIZ = 5;
>     mixin(C) {
>         #include <stdio.h>
>         printf("BUFSIZ = %d\n", BUFSIZ);
>     }
> }
> ```

Name lookup in mixin(C) blocks would follow the normal D scoping rules.

In this example, since BUFSIZ is a macro, it would be expanded by the preprocessor before the D compiler even parses the C code, and the value from `stdio.h` would be printed.

If BUFSIZ were a variable instead of a macro, then you would get a compile-time error for defining two variables with the same name in the same scope.

> Are variables declared in mixin(C) statements interpreted as global or local variables?
> ```D
> void main()
> {
>     mixin(C) {
>         #include <stdio.h>
>         fprintf(stderr, "test\n");
>     }
> }
> ```
> The header declares variable `stderr`. If this is now a local variable, because the header is included inside a function, it could cause problems. Maybe this could be solved by treating C variables marked with `extern` as globals.

I believe the C standard actually requires such variables to be treated as globals. The relevant sections are [6.2.2 Linkages of identifiers][1] and [6.2.4 Storage durations of objects][2].

So, assuming the D compiler implements the C standard correctly, this should Just Work.
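The linkage rule mentioned above can be seen in plain C, independent of Mixin C. In this minimal sketch (the names `shared_counter`, `bump`, `local_only`, and `abs_via_block_decl` are invented for illustration), a declaration with `extern` inside a function body does not create a local variable, and a function declared inside a block still names the global function, which is the situation a header such as `stdio.h` would be in if it were included inside `main`.

```c
/* File-scope object with external linkage. */
int shared_counter = 10;

int bump(void)
{
    /* Block-scope extern declaration: no new object is created here;
       the name refers to the file-scope shared_counter. */
    extern int shared_counter;
    return ++shared_counter;
}

int local_only(void)
{
    int counter = 0;   /* automatic storage: fresh on every call */
    return ++counter;
}

int abs_via_block_decl(int v)
{
    /* A function declared inside a block still has external linkage,
       so this names the C library's abs. */
    int abs(int);
    return abs(v);
}
```

Whether ImportC already applies these rules to declarations spliced into a D function body is a separate question that this sketch does not answer.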
Mar 08
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
> By the way, if you did want to treat a `.c` or `.h` file like its own module, you'd still be able to do so with Mixin C. Just write a simple `.d` wrapper, like this:
> ```d
> module libwhatever;
>
> mixin(C) {
>     #include <libwhatever.h>
> }
> ```

This also makes it trivial to apply function attributes to C declarations. For example, if you want everything from `libwhatever.h` to be `nothrow` and `@nogc`, you just write this:

```d
nothrow @nogc mixin(C) {
    #include <libwhatever.h>
}
```

There's currently no way to do this with ImportC, and even if there were, it would likely require modifying the header file (for example, with the `#pragma` suggested in [issue 23812][1]).

[1]: https://issues.dlang.org/show_bug.cgi?id=23812
Mar 08
On Friday, 8 March 2024 at 18:03:47 UTC, Paul Backus wrote:
> ```d
> nothrow @nogc mixin(C) {
>     #include <libwhatever.h>
> }
> ```
> There's currently no way to do this with ImportC, and even if there were, it would likely require modifying the header file (for example, with the `#pragma` suggested in [issue 23812][1]).
>
> [1]: https://issues.dlang.org/show_bug.cgi?id=23812

wow, that is cool!
Mar 09
On 3/8/2024 10:03 AM, Paul Backus wrote:
> There's currently no way to do this with ImportC

```
__declspec(nothrow) int foo();
```
Mar 29
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
> What if instead of importing C files like D modules, we could write bits of C code directly in the middle of our D code, like we do with inline ASM?
>
> It might look something like this:
> ```d
> void main()
> {
>     mixin(C) {
>         #include <stdio.h>
>         printf("Hello from C!\n");
>     }
> }
> ```
> Here's how it could work:
>
> 1. The compiler takes the content of the `mixin(C)` block and passes it through the external C preprocessor.
> 2. The result of (1) is parsed as a C AST fragment using the ImportC parser.
> 3. The result of (2) is spliced into the AST in place of the `mixin(C)` block, and undergoes semantic analysis using ImportC semantics.

So the entirety of `stdio.h` is included in the body of the D main function? Is that wise?

The other problem is if you want to use C expressions in D. For example, let's say you have the C definition:

```c
#define PI 3.14159
```

How can I use this in D land? I could assign it to a variable maybe?

```d
mixin(C) {
    #include "pidef.h"
    double PI_ = PI;
}
```

Note, I have to use a new name. And it has to be a variable, because that's all you can do in C. What if I wanted it to be an enum? Too bad, C doesn't support that.

What I'd like to see is:

a) the C preprocessor is run on *all the mixin(C) islands of the file* regardless of where they appear, whether they are in templates, etc. Basically, take all the mixin(C) things and concatenate them, run the result through the preprocessor, and put the results back where they were. THEN run the importC compiler on them. This allows a more cohesive C-like experience, without having to import/define things over and over.

b) Allow mixin(C) expressions, such as `enum PI = mixin(C) { PI }`.

Maybe this was already the intention? But I didn't get that vibe from the proposal.

-Steve
Mar 22
On Saturday, 23 March 2024 at 02:51:31 UTC, Steven Schveighoffer wrote:
> So the entirety of `stdio.h` is included in the body of the D main function? Is that wise?

In this specific example, it's overkill. In general...is there a better alternative? The C preprocessor is a blunt instrument, and if we want to have full support for it, we are going to have to live with the consequences of that bluntness.

With Mixin C, the D programmer at least gets to choose whether they would rather pay the cost of `#include`-ing C headers multiple times, or the cost of translating preprocessor macros by hand.

> The other problem is if you want to use C expressions in D. For example, let's say you have the C definition:
> ```c
> #define PI 3.14159
> ```
> How can I use this in D land? I could assign it to a variable maybe?
> ```d
> mixin(C) {
>     #include "pidef.h"
>     double PI_ = PI;
> }
> ```
> Note, I have to use a new name. And it has to be a variable, because that's all you can do in C. What if I wanted it to be an enum? Too bad, C doesn't support that.

You can use a lambda:

```d
enum PI = () {
    mixin(C) {
        #include "pidef.h"
        return PI;
    }
}();
```

> What I'd like to see is:
>
> a) the C preprocessor is run on *all the mixin(C) islands of the file* regardless of where they appear, whether they are in templates, etc. Basically, take all the mixin(C) things and concatenate them, run the result through the preprocessor, and put the results back where they were. THEN run the importC compiler on them. This allows a more cohesive C-like experience, without having to import/define things over and over.

Some downsides to this approach:

1. Concatenating all of the `mixin(C)` blocks in a module for preprocessing violates D's scoping rules and creates a lot of opportunities for "spooky action at a distance."
2. This would allow sharing macro definitions across `mixin(C)` blocks, but would *not* allow sharing declarations. You'd still have to `#include <stdio.h>` twice if you wanted to call `printf` in two different blocks, for example.
3. In order to "put the results back where they were" the D compiler would have to parse the preprocessor's output for [line markers][1]. Since the format of these is not specified by the C standard, this means the D compiler would have to have separate parsers for each C preprocessor implementation (or, at least, one for gcc/clang and one for MSVC).

[1]: https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html
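To give a feel for downside 3, here is a minimal C sketch of what recognizing line markers might involve. It assumes the two shapes I am aware of: the GCC/Clang form `# 42 "file.h" 1 3` described in the GCC documentation linked above, and the `#line 42 "file.h"` form that MSVC emits as far as I know. The function name `parse_line_marker` and the fixed 256-byte buffer are choices made for this example only, and real preprocessor output has more cases (escaped characters in paths, for instance) than this handles.

```c
#include <stdio.h>

/* Try to parse one line of preprocessor output as a line marker.
   Recognizes the GCC/Clang form   # 42 "file.h" 1 3
   and the MSVC-style form         #line 42 "file.h"
   On success, stores the line number and file name and returns 1;
   otherwise returns 0. file must have room for 256 bytes. */
int parse_line_marker(const char *text, int *line, char file[256])
{
    /* GCC/Clang: '#', a line number, then a quoted file name. */
    if (sscanf(text, "# %d \"%255[^\"]\"", line, file) == 2)
        return 1;
    /* MSVC style: '#line', a line number, then a quoted file name. */
    if (sscanf(text, "#line %d \"%255[^\"]\"", line, file) == 2)
        return 1;
    return 0;
}
```

Even this toy version needs one pattern per preprocessor family, which is the maintenance cost the post is pointing at.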
Mar 23
On Saturday, 23 March 2024 at 19:02:52 UTC, Paul Backus wrote:
> On Saturday, 23 March 2024 at 02:51:31 UTC, Steven Schveighoffer wrote:
>> So the entirety of `stdio.h` is included in the body of the D main function? Is that wise?
>
> In this specific example, it's overkill. In general...is there a better alternative? The C preprocessor is a blunt instrument, and if we want to have full support for it, we are going to have to live with the consequences of that bluntness.

My objection is to the *requirement* that you include it in the main function. It's very different from D nested imports, as it redefines everything inside the function. Having to re-import *everything* everywhere you need to use a macro is really bad.

> You can use a lambda:
> ```d
> enum PI = () {
>     mixin(C) {
>         #include "pidef.h"
>         return PI;
>     }
> }();
> ```

Ugh, still having to include the entirety of a C header inside a function context.

>> What I'd like to see is:
>>
>> a) the C preprocessor is run on *all the mixin(C) islands of the file* regardless of where they appear, whether they are in templates, etc. Basically, take all the mixin(C) things and concatenate them, run the result through the preprocessor, and put the results back where they were. THEN run the importC compiler on them. This allows a more cohesive C-like experience, without having to import/define things over and over.
>
> Some downsides to this approach:
>
> 1. Concatenating all of the `mixin(C)` blocks in a module for preprocessing violates D's scoping rules and creates a lot of opportunities for "spooky action at a distance."

But that's what you get with C. For instance, you can #define a macro inside a function, and use it inside another function, as long as it comes later in the file. It's not spooky to C programmers. You can even #undef things or re #define them.

> 2. This would allow sharing macro definitions across `mixin(C)` blocks, but would *not* allow sharing declarations. You'd still have to `#include <stdio.h>` twice if you wanted to call `printf` in two different blocks, for example.

but you wouldn't have to include them inside the functions. You get the function definitions and macros in the right place (at module level).

> 3. In order to "put the results back where they were" the D compiler would have to parse the preprocessor's output for line markers. Since the format of these is not specified by the C standard, this means the D compiler would have to have separate parsers for each C preprocessor implementation (or, at least, one for gcc/clang and one for MSVC).

I came up with an approach for this, I detailed it in my dconf talk last year. All preprocessors have a flag which preserves comments. But yes, you are right this is a big hacky problem to solve.

-Steve
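The macro-scoping behavior described in this post is standard C and can be shown in a few lines. In this sketch the macro `SCALE` and the three functions are invented for illustration: a `#define` written inside one function body stays in effect for everything later in the file, until it is `#undef`-ed or redefined.

```c
int first(void)
{
    /* A #define inside a function body is not scoped to that function. */
#define SCALE 3
    return 2 * SCALE;
}

int second(void)
{
    /* Still visible here, simply because this comes later in the file. */
    return 5 * SCALE;
}

/* Macros can be removed and redefined at any point. */
#undef SCALE
#define SCALE 10

int third(void) { return 5 * SCALE; }
```

Concatenating every `mixin(C)` block in a module before preprocessing would carry this file-order behavior over into D, which is the property the two sides of this exchange weigh differently.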
Mar 27
On Thursday, 28 March 2024 at 02:24:34 UTC, Steven Schveighoffer wrote:
> On Saturday, 23 March 2024 at 19:02:52 UTC, Paul Backus wrote:
>> Some downsides to this approach:
>>
>> 1. Concatenating all of the `mixin(C)` blocks in a module for preprocessing violates D's scoping rules and creates a lot of opportunities for "spooky action at a distance."
>
> But that's what you get with C. For instance, you can #define a macro inside a function, and use it inside another function, as long as it comes later in the file. It's not spooky to C programmers. You can even #undef things or re #define them.

I think most C programmers would regard this as a horrifying abuse of the preprocessor--the kind of thing they switch to D (and other languages) to get away from.

>> 2. This would allow sharing macro definitions across `mixin(C)` blocks, but would *not* allow sharing declarations. You'd still have to `#include <stdio.h>` twice if you wanted to call `printf` in two different blocks, for example.
>
> but you wouldn't have to include them inside the functions. You get the function definitions and macros in the right place (at module level).

It sounds like the usage pattern you're envisioning is something like this:

```d
/// Bindings for libfoo
module foo;

mixin(C) {
    #include "libfoo.h"
}

/// Wraps foo_do_stuff
void doStuff(int x)
{
    mixin(C) {
        // reuses top-level #include
        foo_do_stuff(x, FOO_SOME_MACRO);
    }
}

/// Wraps foo_do_other_stuff
void doOtherStuff(const(char)* s)
{
    mixin(C) {
        // reuses top-level #include
        foo_do_other_stuff(s, FOO_SOME_OTHER_MACRO);
    }
}
```

I agree that supporting this usage pattern would be desirable, but I'm not sure concatenating every `mixin(C)` block in a module is the best way to do so.

Perhaps instead we can have a dedicated "`mixin(C)` header" block that can appear at module scope, whose content is prepended to each `mixin(C)` block? E.g.,

```d
/// Bindings for libfoo
module foo;

mixinC_header {
    #include "foo.h"
}

void doStuff(int x)
{
    mixin(C) {
        // #include "foo.h" inserted here
        foo_do_stuff(x, FOO_SOME_MACRO);
    }
}

// etc.
```

Granted, this only solves the UX problems of the original proposal, not the performance problems--under the hood you are still doing a bunch of separate preprocessor calls, and including the same file over and over. But, again, you are not *forced* to use Mixin C like this, and programmers who want to optimize their build times will still have alternatives to turn to that do not require opening the door to total macro madness.
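One way to picture the "header block" idea is as plain text assembly before each preprocessor run. The C sketch below is purely hypothetical: `build_block_input` is a name made up here, and nothing like it is claimed to exist in any D compiler. It only shows that each block would get its own copy of the header text, so macros defined in the header reach every block while macros defined inside one block never reach another.

```c
#include <stdio.h>

/* Hypothetical helper: build the text handed to the preprocessor for one
   mixin(C) block by prepending the module-level header text.
   Returns the number of characters needed (as snprintf does),
   or a negative value on error. */
int build_block_input(char *out, size_t cap,
                      const char *header, const char *block)
{
    return snprintf(out, cap, "%s\n%s\n", header, block);
}
```

For the example in the post, `header` would be `#include "foo.h"` and `block` would be the body of one `mixin(C)` statement; the separate preprocessor run per block is exactly the repeated cost acknowledged above.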
Mar 27
On Friday, 8 March 2024 at 03:23:12 UTC, Paul Backus wrote:
> What if instead of importing C files like D modules, we could write bits of C code directly in the middle of our D code, like we do with inline ASM? [...]

https://github.com/dlang/dmd/pull/14114
Mar 28
On 3/7/2024 7:23 PM, Paul Backus wrote:
> Mixin C would solve two big issues with the current ImportC approach: the poor preprocessor support,

I don't see how it would improve preprocessor support. C macros used in D code won't fare any better than they do now.

> and the conflicts between `.c` and `.d` files in the compiler's import paths.

Frankly, I never understood why that was an issue. Change the name of one of them, or put them in different paths.

I appreciate the effort you've put into this. But I have to be blunt. I've seen this before. Here's what it looks like in practice:

https://elsmar.com/elsmarqualityforum/media/redneck-car-air-conditioning.1560/

D is a beautiful language to program in. Let's keep it that way!

https://www.joemacari.com/stock/ferrari-daytona-spyder/10004904
Mar 29
On Friday, 29 March 2024 at 07:49:23 UTC, Walter Bright wrote:
> I appreciate the effort you've put into this. But I have to be blunt. I've seen this before. Here's what it looks like in practice:
>
> https://elsmar.com/elsmarqualityforum/media/redneck-car-air-conditioning.1560/

Message received. I won't spend any more time on this going forward.
Mar 29
On 3/29/2024 2:22 PM, Paul Backus wrote:
> On Friday, 29 March 2024 at 07:49:23 UTC, Walter Bright wrote:
>> I appreciate the effort you've put into this. But I have to be blunt. I've seen this before. Here's what it looks like in practice:
>>
>> https://elsmar.com/elsmarqualityforum/media/redneck-car-air-conditioning.1560/
>
> Message received. I won't spend any more time on this going forward.

Your time is valuable and I don't want to waste it.
Mar 29