digitalmars.D - [SAoC] 'DPP with Linux kernel headers' Project Thread
- Cristian Becerescu (50/50) Sep 16 2019 Hi all,
- Jacob Carlborg (13/15) Sep 18 2019 DStep [1], which is very similar to dpp and uses libclang, will forward
- Atila Neves (4/11) Sep 22 2019 You can pass command-line options directly to dpp with
- Cristian Becerescu (46/46) Oct 04 2019 Sorry for not updating this thread in a while.
- Cristian Becerescu (95/95) Oct 10 2019 It's time for a new update, so here we go. Sorry for those long
- Atila Neves (3/6) Oct 11 2019 Could you please file issues on github so I can fix them? Thanks.
- RazvanN (3/11) Oct 13 2019 It looks like Cristian has already solved these issues. He will
- Cristian Becerescu (10/10) Oct 20 2019 Update for week 1 of Milestone 2.
- Cristian Becerescu (25/25) Oct 28 2019 Update for week 2 of Milestone 2
- Newbie2019 (7/10) Oct 28 2019 Maybe you can try:
- Cristian Becerescu (6/18) Oct 28 2019 Initially I thought about checking if the argument to typeof() is
- Jacob Carlborg (4/8) Oct 29 2019 What exactly is the problem?
- Cristian Becerescu (12/13) Nov 01 2019 Sorry for taking so long to reply.
- Jacob Carlborg (22/28) Nov 02 2019 Wow, that's complicated. DStep [1] generates this:
- Cristian Becerescu (7/15) Nov 03 2019 When encountering anonymous structs or unions, dpp gives them a
- Jacob Carlborg (6/11) Nov 04 2019 But the question is why is DPP generating named unions/structs
- Atila Neves (4/13) Nov 04 2019 Probably because I either didn't know that at the time but knew
- rikki cattermole (4/19) Nov 04 2019 It sounds like you were too liberal in doing this.
- Jacob Carlborg (12/14) Nov 04 2019 You mean like this?
- rikki cattermole (4/20) Nov 04 2019 Yes it does.
- Atila Neves (3/23) Nov 05 2019 Sounds about right.
- bachmeier (3/19) Nov 04 2019 I don't know if this is related, but dpp ignored anonymous enums,
- Cristian Becerescu (12/12) Nov 03 2019 Update for week 3 of Milestone 2
- Cristian Becerescu (82/82) Nov 16 2019 Update for week 4 of Milestone 2 (ended on November 10th, sorry
- Patrick Schluter (6/13) Nov 16 2019 Shouldn't that be done automatically by the D compiler? It's
- Cristian Becerescu (12/27) Nov 16 2019 I tested some cases.
- Patrick Schluter (9/38) Nov 17 2019 I wanted to interject that it is in violation of the C standard
- Cristian Becerescu (15/35) Nov 18 2019 PR: https://github.com/atilaneves/dpp/pull/213
- Cristian Becerescu (32/48) Nov 25 2019 Update for week 1 of Milestone 3
- Cristian Becerescu (41/44) Dec 02 2019 Update for week 2 of Milestone 3
- Cristian Becerescu (42/42) Dec 09 2019 Update for week 3 of Milestone 3
- Cristian Becerescu (31/31) Dec 16 2019 Update for week 4 of Milestone 3
- Cristian Becerescu (12/12) Dec 23 2019 Update for week 1 of Milestone 4
Hi all,

My name is Cristi and in the next few months I will be working on the "DPP with Linux kernel headers" project for this year's edition of SAoC. This thread will be used to post further updates (though anyone is encouraged to make suggestions, ask questions and pinpoint mistakes).

DPP is a tool used to directly include C/C++ headers in D files; it is basically a D compiler wrapper which permits D files with #include directives to be compiled. Since D doesn't have a preprocessor, running DPP on a .dpp file will create a valid .d file that can be compiled (e.g. with dmd). A .dpp file can include one or more C/C++ headers, each possibly containing C/C++ macros, and function and structure declarations. The included headers are parsed using libclang.

At the moment, DPP doesn't work when the .dpp file includes Linux header files. As a consequence, writing a Linux driver in D (e.g. the case of Alex Militaru's driver) implies manually translating all used C headers to .di interfaces (redeclaring all structures and functions in a D-compilable format). The project aims to solve this issue and enable DPP to work with all Linux headers.

Milestones for this project:

Milestone 1: Investigate and narrow down the issues that dpp has with the compilation of Linux kernel header files.
Weeks 1-2: Get accustomed to the infrastructure: build the kernel with clang, clone dpp, understand the internal structure of both dpp and the kernel.
Weeks 3-4: Compile a .c file with clang that includes a random Linux kernel header, try to reproduce with dpp. There is a series of issues that need to be addressed:
-> how to pass clang command line options to dpp?
-> what command line options are needed?
-> how to specify what version of libclang should be used?
-> are the object files compatible?

Milestone 2: Fix all issues encountered at milestone 1 so that Alex Militaru's driver can be integrated with the Linux kernel using dpp. Unfortunately, we cannot go into much detail on this step, as it really depends on what we will discover during milestone 1. We can provide more information about this at the end of milestone 1. If things go smoothly and we integrate Alex Militaru's driver very fast, then the next step will be to create a testing infrastructure that makes sure that dpp works well with **all** the headers in the kernel.

Milestone 3 + 4: Work on integrating dpp with C++.

From now on, I will be posting updates weekly (or every two weeks) regarding the progress made on this project.

Quick note: I'm currently in the last two weeks of an internship, but I will do my best to finish all that I've set out to do.

Many thanks,
Cristi
Sep 16 2019
On 2019-09-16 22:51, Cristian Becerescu wrote:
> -> how to pass clang command line options to dpp?

DStep [1], which is very similar to dpp and uses libclang, will forward any unrecognized arguments to libclang. If there are any conflicts, it's always possible to pass "--" to separate Clang arguments from DStep arguments. It's common practice and is supported by std.getopt, IIRC.

> -> how to specify what version of libclang should be used?

I would assume that dpp doesn't use `dlopen` and just uses whatever version of libclang is installed. DStep supports static linking of libclang and the pre-compiled release binaries are statically linked with libclang to minimize the risk that an unsupported version of libclang is used.

[1] https://github.com/jacob-carlborg/dstep

--
/Jacob Carlborg
Sep 18 2019
On Monday, 16 September 2019 at 20:51:08 UTC, Cristian Becerescu wrote:
> Hi all, My name is Cristi and in the next few months I will be working on the "DPP with Linux kernel headers" project for this year's edition of SAoC. This thread will be used to post further updates (though anyone is encouraged to make suggestions, ask questions and pinpoint mistakes). [...]

You can pass command-line options directly to dpp with --clang-option.
Sep 22 2019
Sorry for not updating this thread in a while. I've managed to make some progress on this project, and from now on I'll have more time, since I have finished my internship.

Here are some updates and things I found out during those weeks:

1. Yes, as Atila said, I can pass flags to clang through dpp by using --clang-option. The only inconvenience is that every clang flag needs to be preceded by its own --clang-option flag (this becomes unmanageable when dealing with tens of compiler flags, e.g. for compiling kernel headers). This does not affect the capabilities of d++, but it impacts the ease of use and increases the chances of mistyping, forgetting to add the flag or adding it in the wrong place. Two possible improvements:
- the function used for parsing the command line could accept multiple flags at once for clang-options: --clang-options=-D__KERNEL__,-Werror,-Wextra
- or it could accept a single quoted string of options: --clang-options "-D__KERNEL__ -Werror -Wextra", which would be more convenient when copy-pasting flags from the kernel build files.

2. The LLVM compiler infrastructure doesn't yet support the 'asm goto' gcc extension. I encountered this problem while trying to compile a simple C program with an empty main which #included linux/namei.h. If the kernel is built with CONFIG_JUMP_LABEL and CC_HAVE_ASM_GOTO set, using clang will not work (for details, see asm/compiler.h, asm/compiler_types.h, asm/compiler-gcc.h, asm/compiler-clang.h). There is a macro (asm_volatile_goto(x)) used in asm/jump_label.h which is defined in asm/compiler-gcc.h, but not in asm/compiler-clang.h (asm/jump_label.h includes one of the two, but always uses the macro, even though one of the included files does not define it). Undefining CC_HAVE_ASM_GOTO (or just not defining it with -D...) when compiling the .c file worked.

The motivation for debugging this was to first make sure that I can compile a .c file (including a kernel header) using clang, so that later, when I compile a .dpp file using d++ (and clang internally) and get errors, I know that the problems are within dpp and not clang.

3. After making sure clang worked with the above .c program, I tried compiling the .dpp one with the same clang options, each preceded by --clang-option. This currently does not work: clang reports some "undeclared identifier" and other syntax errors, which I'll have to dig into to understand whether the problem is the way the flags are passed to clang or something else.
Oct 04 2019
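The quoted-string form of --clang-options suggested in the post above amounts to splitting one argument on whitespace. A minimal sketch in C, assuming a hypothetical helper named split_clang_options (it is not part of dpp, whose actual parsing is done in D) and assuming that no individual flag contains a space:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of dpp: splits a space-separated flag
 * string in place and stores a pointer to each flag in `out`.
 * Returns the number of flags stored (at most `max_out`). */
size_t split_clang_options(char *flags, char **out, size_t max_out)
{
    assert(flags != NULL && out != NULL);
    size_t count = 0;
    /* strtok treats runs of spaces as a single separator. */
    for (char *tok = strtok(flags, " "); tok != NULL && count < max_out;
         tok = strtok(NULL, " ")) {
        out[count++] = tok;
    }
    return count;
}
```

Flags whose values contain spaces (for example a quoted include path) would need a real shell-style tokenizer, which this sketch deliberately leaves out.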
It's time for a new update, so here we go. Sorry for those long posts :)

This past week I've dived deeper into the 3rd problem mentioned in the last update. There were multiple problems when trying to generate and compile a D program from this DPP file:

// foo.dpp
#include <linux/namei.h>
void main() { }

Even though compiling a C program which included the same kernel header with clang worked, compiling it through dpp didn't. I've managed to find out that this happens because dpp appends some include directories by default, the problem being related to '/usr/include/' in particular (which is set through the function call systemPaths() from the D libclang package [1]). Clang complains about undeclared identifiers and expected closing parentheses, but I still have to investigate why adding that directory to the include directories messes things up.

-------------------------------------------------

Consider this code from kernel.h:

#define u64_to_user_ptr(x) (            \
{                                       \
    typecheck(u64, x);                  \
    (void __user *)(uintptr_t)x;        \
}                                       \
)

In dpp, when translating this into D code, we also check for pairs of opening and closing parentheses [2]. When finding a '(', we increment the index into the tokens array until we find a matching ')'. If the C code is valid (and in the above example it is), this should work well, but it doesn't, resulting in a fatal error: range violation. The reason is that, as seen in [2], we only check for tokens with the ')' spelling, when, in reality, the last parenthesis of the macro is not spelled ')', but '\\\n)' ('\' character, followed by newline, followed by the actual parenthesis).

-------------------------------------------------

Another problem is the renaming of identifiers that are D keywords. I'll give you an example:

// test.h
struct module;
void f(struct module *);
struct module { int a; int b; };

The D file generated from a DPP file which includes the above header will look like this:

// test.d, generated from test.dpp through the last version of d++ from github
// ...
extern(C)
{
    void f(module*) @nogc nothrow;
    struct module__
    {
        int a;
        int b;
    }
}
struct module;
void main() {}
// ...

Clearly there are multiple wrong things here:
- the module struct should be named module_ and not module__ (this is what dpp should do internally)
- even though the struct is renamed, the parameter types are not
- we are declaring the same structure again (with the original C spelling) outside of the extern(C) block because dpp thinks module was an undeclared structure
- compilation of this D program clearly doesn't work

The reasons for those bugs are a bit subtler, so I'm not going over them as it would make this post quite big.

-------------------------------------------------

I have implemented working solutions for all the above problems. They pass all the unit tests and I can also successfully generate an executable file from a .dpp which includes the linux/namei.h kernel header. I just have to clean some things up, start making pull requests and maybe get some feedback.

For the first problem, I added an option to skip the default system include paths (just a workaround for the moment), but as I investigate this further, I will try to see what the underlying problem really is (probably some collisions with other files).

Going from here, I will investigate whether my changes impact other C cases that are not unit-tested. Also, I will try running dpp with other kernel and non-kernel headers as well, making sure there are no other bugs or untreated edge cases.

Cristi

[1] https://code.dlang.org/packages/libclang
[2] https://github.com/atilaneves/dpp/blob/master/source/dpp/translation/macro_.d#L326
Oct 10 2019
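The parenthesis-matching bug described in the post above can be illustrated with a small sketch in C. This is not dpp's code (dpp is written in D, and the fix in its pull request may differ); the function names are invented for illustration. The idea is to strip a leading backslash-newline from a token's spelling before comparing it, and to report a missing match instead of indexing past the end of the token array:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skips any leading backslash-newline (line continuation) so that a token
 * spelled "\\\n)" compares equal to ")". */
static const char *strip_continuation(const char *spelling)
{
    while (spelling[0] == '\\' && spelling[1] == '\n')
        spelling += 2;
    return spelling;
}

/* Returns the index of the token that closes the parenthesis opened at
 * tokens[open], or -1 if there is none (rather than running off the end). */
long find_matching_paren(const char *const *tokens, size_t count, size_t open)
{
    assert(tokens != NULL && open < count);
    long depth = 0;
    for (size_t i = open; i < count; ++i) {
        const char *s = strip_continuation(tokens[i]);
        if (strcmp(s, "(") == 0)
            ++depth;
        else if (strcmp(s, ")") == 0 && --depth == 0)
            return (long)i;
    }
    return -1;
}
```

With the tokens of the u64_to_user_ptr macro body, the outermost '(' is matched by the final token even though that token is spelled with a line continuation in front of it.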
On Thursday, 10 October 2019 at 20:28:20 UTC, Cristian Becerescu wrote:
> It's time for a new update, so here we go. Sorry for those long posts :) [...]

Could you please file issues on github so I can fix them? Thanks.
Oct 11 2019
On Friday, 11 October 2019 at 13:11:48 UTC, Atila Neves wrote:
> On Thursday, 10 October 2019 at 20:28:20 UTC, Cristian Becerescu wrote:
>> It's time for a new update, so here we go. Sorry for those long posts :) [...]
> Could you please file issues on github so I can fix them? Thanks.

It looks like Cristian has already solved these issues. He will probably make some PRs next week.
Oct 13 2019
Update for week 1 of Milestone 2.

I've implemented solutions for the issues discovered in the previous 2 weeks and made pull requests as follows:

- Allow multiple clang options to be specified more easily & add option to avoid including the system paths by default
  https://github.com/atilaneves/dpp/pull/185
- Fix a bug related to multi-line macro definitions
  https://github.com/atilaneves/dpp/pull/186
- Fix the renaming of D keywords to be consistent and correct
  https://github.com/atilaneves/dpp/pull/187
Oct 20 2019
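The keyword-renaming fix listed above comes down to one rule applied everywhere a C identifier is emitted (type names, parameter types, fields): a D keyword gets exactly one trailing underscore. A rough sketch in C, where rename_for_d and the short keyword table are assumptions made for illustration and not dpp's implementation:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A few D keywords that are ordinary identifiers in C (illustrative subset). */
static const char *const d_keywords[] = {
    "module", "version", "import", "alias", "template"
};

/* Writes the D-safe spelling of `name` into `out`: keywords get exactly one
 * trailing underscore, everything else is copied unchanged. */
void rename_for_d(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL && out_size > 0);
    for (size_t i = 0; i < sizeof d_keywords / sizeof d_keywords[0]; ++i) {
        if (strcmp(name, d_keywords[i]) == 0) {
            snprintf(out, out_size, "%s_", name);
            return;
        }
    }
    snprintf(out, out_size, "%s", name);
}
```

Routing every emitted identifier through a single function like this is what keeps "struct module_" and "void f(module_*)" in agreement, and avoids a doubled suffix such as module__.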
Update for week 2 of Milestone 2

- PRs from the previous week were successfully merged
- Tested running dpp to translate linux/virtio.h

In this phase, I've begun testing dpp with every Linux header used by Alex Militaru's driver, starting with linux/virtio.h.

There are multiple cases where a function is declared through a macro and the function return type is specified using __typeof__(X) (e.g. __typeof__(X) functionName()). The D translation would be something like:

typeof(X) functionName();

In C, X could be a variable of a certain type, the type itself (e.g. int), or an alias of a type. In D, typeof(X) is only valid when X is an expression; it is not valid code if X is either a type or an alias of a type. E.g.:

alias u32 = uint;
typeof(u32) f1(); // err
typeof(int) f2(); // err

Another issue is related to translating anonymous unions and structs nested on multiple levels (specifically in linux/slab.h, line 596), but I still have to wrap my head around this (because the generated code is a bit complicated).

An inconvenience is that on my machine, running dpp with virtio.h takes about 15 minutes, which slows down the debugging process with this specific header.
Oct 28 2019
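For reference, the C side of the typeof issue above can be shown in a few lines. The sketch assumes a compiler that supports the GNU __typeof__ extension (gcc and clang both do); the function names are made up for the example. All three operand kinds are accepted in C, which is why a literal translation to D's typeof breaks for the first two:

```c
#include <assert.h>

typedef unsigned int u32;

static u32 counter = 7;

/* The operand is a typedef name (an alias of a type). */
__typeof__(u32) from_alias(void) { return 1u; }

/* The operand is a built-in type. */
__typeof__(int) from_type(void) { return -2; }

/* The operand is an expression (here, a variable). */
__typeof__(counter) from_expression(void) { return counter; }

/* Sanity check: each function has the return type its operand names. */
void check_return_types(void)
{
    assert(sizeof(from_alias()) == sizeof(u32));
    assert(sizeof(from_type()) == sizeof(int));
    assert(sizeof(from_expression()) == sizeof(counter));
}
```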
On Monday, 28 October 2019 at 11:14:24 UTC, Cristian Becerescu wrote:
> Update for week 2 of Milestone 2 - PRs from the previous week were successfully merged [...]

Maybe you can try:

u32 i=0;
typeof(u32.init) f1(); // err
typeof(int.init) f2(); // err
typeof(i.init) f2(); // err
Oct 28 2019
On Monday, 28 October 2019 at 12:14:50 UTC, Newbie2019 wrote:
> On Monday, 28 October 2019 at 11:14:24 UTC, Cristian Becerescu wrote:
>> Update for week 2 of Milestone 2 - PRs from the previous week were successfully merged [...]
> Maybe you can try:
> u32 i=0;
> typeof(u32.init) f1(); // err
> typeof(int.init) f2(); // err
> typeof(i.init) f2(); // err

Initially I thought about checking if the argument to typeof() is a type or an alias to a type (and just using the argument as is in those scenarios). But your suggestion is more generic, and it works in all those cases. Thank you! I will try it out.
Oct 28 2019
On 2019-10-28 12:14, Cristian Becerescu wrote:
> Another issue is related to translating nested (on multiple levels) anonymous unions and structs (specifically in linux/slab.h, line 596), but I still have to wrap my head around this (because the generated code is a bit complicated).

What exactly is the problem?

--
/Jacob Carlborg
Oct 29 2019
On Tuesday, 29 October 2019 at 13:34:50 UTC, Jacob Carlborg wrote:
> What exactly is the problem?

Sorry for taking so long to reply.

I've written a simple example where dpp fails to generate valid D code for the problem I mentioned earlier. The example is in this gist (for syntax highlighting): https://gist.github.com/cbecerescu/9a114bb92b23bd8e275a8eae08c6cc2f

The problem is that getters and setters in A_struct are not generated for the 'c' and 'd' fields from 'union _Anonymous_2'. In fact, not only are they not generated, but, as you can see in the last few lines of the gist, the last 2 generated functions don't make any sense (they don't access any field from _anonymous_5, and so compiling this will fail).
Nov 01 2019
On 2019-11-01 21:39, Cristian Becerescu wrote:
> Sorry for taking so long to reply. I've written a simple example where dpp fails to generate valid D code for the problem I mention earlier. The example is in this gist (for syntax highlighting): https://gist.github.com/cbecerescu/9a114bb92b23bd8e275a8eae08c6cc2f

Wow, that's complicated. DStep [1] generates this:

extern (C):

struct A_struct
{
    union
    {
        int a;

        struct
        {
            int b;

            union
            {
                int c;
                char d;
            }
        }
    }
}

[1] http://github.com/jacob-carlborg/dstep

--
/Jacob Carlborg
Nov 02 2019
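For comparison with the DStep output above, here is a C11 sketch of a struct with the same nesting. It is reconstructed from that D output, not copied from the gist, so treat the exact declaration as an assumption. Every field of every anonymous level is reachable directly through the outer struct, which is the behaviour a translation has to preserve:

```c
#include <assert.h>
#include <stddef.h>

/* C11 anonymous members: a, b, c and d are all accessed directly
 * through a struct A_struct value, with no intermediate member names. */
struct A_struct {
    union {
        int a;
        struct {
            int b;
            union {
                int c;
                char d;
            };
        };
    };
};

/* Writes through the innermost anonymous union and reads the value back. */
int roundtrip_c(int value)
{
    struct A_struct s;
    s.a = 0;
    s.c = value;
    /* c and d share storage; a and b share storage. */
    assert(offsetof(struct A_struct, c) == offsetof(struct A_struct, d));
    assert(offsetof(struct A_struct, a) == offsetof(struct A_struct, b));
    return s.c;
}
```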
On Saturday, 2 November 2019 at 17:02:45 UTC, Jacob Carlborg wrote:
> On 2019-11-01 21:39, Cristian Becerescu wrote:
>> Sorry for taking so long to reply. I've written a simple example where dpp fails to generate valid D code for the problem I mention earlier. The example is in this gist (for syntax highlighting): https://gist.github.com/cbecerescu/9a114bb92b23bd8e275a8eae08c6cc2f
> Wow, that's complicated.

When encountering anonymous structs or unions, dpp gives them a name. And this forces dpp to declare a member of that 'now-named-anon' record type and also provide accessors for the 'now-named-anon' record (because unnamed records also implicitly declare a member of their type).
Nov 03 2019
On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu wrote:
> When encountering anonymous structs or unions, dpp gives them a name. And this forces dpp to declare a member of that 'now-named-anon' record type and also provide accessors for the 'now-named-anon' record (because unnamed records also implicitly declare a member of their type).

But the question is why is DPP generating named unions/structs when D supports anonymous ones?

--
/Jacob Carlborg
Nov 04 2019
On Monday, 4 November 2019 at 09:17:57 UTC, Jacob Carlborg wrote:
> On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu wrote:
>> When encountering anonymous structs or unions, dpp gives them a name. And this forces dpp to declare a member of that 'now-named-anon' record type and also provide accessors for the 'now-named-anon' record (because unnamed records also implicitly declare a member of their type).
> But the question is why is DPP generating named unions/structs when D supports anonymous ones?

Probably because I either didn't know that at the time or knew and forgot. I don't remember what I was trying to do or why just doing the obvious failed.
Nov 04 2019
On 05/11/2019 5:48 AM, Atila Neves wrote:
> On Monday, 4 November 2019 at 09:17:57 UTC, Jacob Carlborg wrote:
>> On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu wrote:
>>> When encountering anonymous structs or unions, dpp gives them a name. And this forces dpp to declare a member of that 'now-named-anon' record type and also provide accessors for the 'now-named-anon' record (because unnamed records also implicitly declare a member of their type).
>> But the question is why is DPP generating named unions/structs when D supports anonymous ones?
> Probably because I either didn't know that at the time or knew and forgot. I don't remember what I was trying to do or why just doing the obvious failed.

It sounds like you were too liberal in doing this. There is a variant of struct/union in C which has a named instance but no type name.
Nov 04 2019
On 2019-11-04 17:54, rikki cattermole wrote:
> There is a variant of struct/union in C which has a named instance but no type name.

You mean like this?

struct Foo
{
    struct
    {
        int a;
    } b;
}

That still needs to be translated to a named struct in D.

--
/Jacob Carlborg
Nov 04 2019
On 05/11/2019 8:33 AM, Jacob Carlborg wrote:
> On 2019-11-04 17:54, rikki cattermole wrote:
>> There is a variant of struct/union in C which has a named instance but no type name.
> You mean like this? struct Foo { struct { int a; } b; } That still needs to be translated to a named struct in D.

Yes it does. This is probably why I'm guessing that Atila generated a name. Just did so too liberally ;)
Nov 04 2019
On Tuesday, 5 November 2019 at 01:25:40 UTC, rikki cattermole wrote:
> On 05/11/2019 8:33 AM, Jacob Carlborg wrote:
>> On 2019-11-04 17:54, rikki cattermole wrote:
>>> There is a variant of struct/union in C which has a named instance but no type name.
>> You mean like this? struct Foo { struct { int a; } b; } That still needs to be translated to a named struct in D.
> Yes it does. This is probably why I'm guessing that Atila generated a name. Just did so too liberally ;)

Sounds about right.
Nov 05 2019
On Monday, 4 November 2019 at 16:48:21 UTC, Atila Neves wrote:
> On Monday, 4 November 2019 at 09:17:57 UTC, Jacob Carlborg wrote:
>> On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu wrote:
>>> When encountering anonymous structs or unions, dpp gives them a name. And this forces dpp to declare a member of that 'now-named-anon' record type and also provide accessors for the 'now-named-anon' record (because unnamed records also implicitly declare a member of their type).
>> But the question is why is DPP generating named unions/structs when D supports anonymous ones?
> Probably because I either didn't know that at the time or knew and forgot. I don't remember what I was trying to do or why just doing the obvious failed.

I don't know if this is related, but dpp ignored anonymous enums, so it may have been the same issue.
Nov 04 2019
Update for week 3 of Milestone 2

- I've proposed a solution for the 'typeof with actual types instead of expressions' issue from last week (https://github.com/atilaneves/dpp/pull/201)
- I've identified the problem with the nested C11 structs and unions mentioned last week. An explanation for this was given in response to Jacob's questions above (https://forum.dlang.org/post/ckxmzciqkijaarwgmsdh@forum.dlang.org and https://forum.dlang.org/post/xbbafbbscjlgaoqznxlw@forum.dlang.org). I also proposed a solution for this issue here: https://github.com/atilaneves/dpp/pull/204
- On several occasions (when translating virtio.h), some function parameters' types contain the struct keyword, which produces compile errors. I'll try to solve this particular issue and other bugs in the next few days
Nov 03 2019
Update for week 4 of Milestone 2 (ended on November 10th, sorry for the delay)

Continued testing dpp with virtio.h and found multiple bugs, especially regarding renaming:

- Accessors for members of anonymous records are not renamed when the members are keywords

  So in this case

  struct A {
      union {
          unsigned int version;
          char module;
      };
  };

  the members themselves would be renamed (to version_ and module_), but the accessor function names would not (they would be auto version(...) and auto module(...) etc.).

  (Solved during week 5 of Milestone 2)

- The fixFields() method doesn't work when multiple structs have a field (of type pointer to struct) which needs renaming

  So in this case

  struct A;
  struct B { struct A *A; };
  struct C { struct A* A; };

  dpp would only rename one of the fields (either the one in B or in C), because the _fieldDeclarations associative array overwrites the already existing (if any) line number with a new one. So it would rename only the field which is contained in the last processed struct.

  Also, this affects C11 anon records if, for example, we add the struct below to the ones above

  struct D {
      union {
          struct A* A;
          int d;
      };
  };

  (Solutions for both of those problems were proposed during week 5 of Milestone 2)

- In some cases, dpp writes a clang warning into the generated D file. For example (for virtio.h):

  foo.d.tmp:81822:141: warning: __VA_ARGS__ can only appear in the expansion of a C99 variadic macro #define __PVOP_VCALL ( op , pre , post, ... ) ____PVOP_VCALL ( op , CLBR_ANY , __VA_ARGS__ ).

  This is written exactly as is in the D file. I still have to figure out why this happens.

- Some code is translated to

  enum BLA = 6LLU;

  LLU is not valid in D (LL needs to be changed to L).

- An enum is initialized with a value of 68719476704, which produces this:

  Error: cannot implicitly convert expression 68719476704L of type long to int

  I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".

- Again regarding C11 anon records: in its output, dpp writes "const(_Anonymous_55) version_;", and this conflicts with the accessor functions' name for a struct field. I will try to figure out the reason why dpp adds that declaration in the first place and fix this (most probably another renaming issue).

- Error: undefined identifier xattr_handler

  The above happens in a struct declaration (a field of that type is declared) and also in a function declaration (a parameter of that type). The struct declaration for xattr_handler is not in the generated D file. Will look more into this.

- /usr/bin/ld: Warning: size of symbol `_D3foo10local_apic12__reserved_1MUNaNbNdNiNfZk' changed from 15 in foo.o to 18 in foo.o

  After manually trying to solve the bugs above in the generated foo.d D file, the only problem remaining is the linker one above.
Nov 16 2019
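The "6LLU" problem from the update above is a pure spelling conversion: D has no LL suffix and does not accept a lowercase l, so any run of l/L in a C integer suffix collapses to a single L. A sketch in C of that rewrite, with the helper name c_suffix_to_d invented for the example (this is not the code from the dpp pull request):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Rewrites the suffix of a C integer literal in place so that D accepts it:
 * any l/L/ll/LL becomes a single 'L' and any u/U becomes 'U'
 * (e.g. "6LLU" -> "6LU"). Assumes `literal` is a well-formed C integer
 * literal; the result is never longer than the input. */
void c_suffix_to_d(char *literal)
{
    assert(literal != NULL);
    size_t start = strlen(literal);
    int has_long = 0, has_unsigned = 0;

    /* Walk backwards over the suffix characters. */
    while (start > 0 && strchr("uUlL", literal[start - 1]) != NULL) {
        --start;
        if (literal[start] == 'l' || literal[start] == 'L')
            has_long = 1;
        else
            has_unsigned = 1;
    }

    /* Cut the old suffix off and append the normalized one. */
    literal[start] = '\0';
    if (has_long)
        strcat(literal, "L");
    if (has_unsigned)
        strcat(literal, "U");
}
```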
On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian Becerescu wrote:
> An enum is initialized with a value of 68719476704, which produces this: Error: cannot implicitly convert expression 68719476704L of type long to int I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".

Shouldn't that be done automatically by the D compiler? It's the normal promotion rule; it should be reflected in the base type of the enum. Afaict it is done that way in C (at least as an extension to gcc).
Nov 16 2019
On Saturday, 16 November 2019 at 18:39:51 UTC, Patrick Schluter wrote:
> On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian Becerescu wrote:
>> An enum is initialized with a value of 68719476704, which produces this: Error: cannot implicitly convert expression 68719476704L of type long to int I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".
> Shouldn't that be done automatically by the D compiler? It's normal promotion rule, it should reflect in the base type of the enum. Afaict it is done that way in C (at least as an extension to gcc).

I tested some cases.

enum e { A = 68719476704 }; // Compiles fine
enum e { A = 68719476704, B = 1 }; // Still compiles fine
enum e { B = 1, A = 68719476704 }; // Error: cannot implicitly convert expression 68719476704L of type long to int

I looked through the docs [1] and found this: "If the EnumBaseType is not explicitly set, and the first EnumMember has an AssignExpression, it is set to the type of that AssignExpression. Otherwise, it defaults to type int."

[1] https://dlang.org/spec/enum.html
Nov 16 2019
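Since the D spec quoted above fixes the base type from the first member only, a translator has to look at all enumerator values before it emits the declaration. A minimal sketch in C of that check; enum_needs_long_base is a hypothetical name, and the approach actually taken in the dpp pull request may differ:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Returns 1 if any enumerator value does not fit in an int, in which case
 * a translator could emit "enum : long" instead of a plain "enum".
 * Returns 0 otherwise. */
int enum_needs_long_base(const long long *values, size_t count)
{
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (values[i] > INT_MAX || values[i] < INT_MIN)
            return 1;
    }
    return 0;
}
```

The result does not depend on member order, so { B = 1, A = 68719476704 } and { A = 68719476704, B = 1 } are treated alike. Values above the range of a signed 64-bit integer would need an unsigned base type, which this sketch does not cover.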
On Saturday, 16 November 2019 at 23:55:45 UTC, Cristian Becerescu wrote:
> On Saturday, 16 November 2019 at 18:39:51 UTC, Patrick Schluter wrote:
>> On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian Becerescu wrote:
>>> An enum is initialized with a value of 68719476704, which produces this: Error: cannot implicitly convert expression 68719476704L of type long to int I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".
>> Shouldn't that be done automatically by the D compiler? It's normal promotion rule, it should reflect in the base type of the enum. Afaict it is done that way in C (at least as an extension to gcc).
> I tested some cases.
> enum e { A = 68719476704 }; // Compiles fine
> enum e { A = 68719476704, B = 1 }; // Still compiles fine
> enum e { B = 1, A = 68719476704 }; // Error: cannot implicitly convert expression 68719476704L of type long to int
> I looked through the docs [1] and found this: "If the EnumBaseType is not explicitly set, and the first EnumMember has an AssignExpression, it is set to the type of that AssignExpression. Otherwise, it defaults to type int."
> [1] https://dlang.org/spec/enum.html

I wanted to interject that this is in violation of the C standard (as D has as one of its goals that constructs that are syntactically identical to C behave the same as in C), but after checking the C99 standard, it is in fact implementation-defined, with the added remark "An implementation may delay the choice of which integer type until all enumeration constants have been seen". So, even in C the compiler may implement it as it wishes.
Nov 17 2019
Update for week 5 of Milestone 2

I've implemented and submitted solutions for the following bugs:

> Accessors for members of anonymous records are not renamed when the members are keywords
PR: https://github.com/atilaneves/dpp/pull/213
Status: Merged.

> The fixFields() method doesn't work when multiple structs have a field (of type pointer to struct) which needs renaming
PR: https://github.com/atilaneves/dpp/pull/214
Status: Merged.

> Some code is translated to enum BLA = 6LLU; LLU is not valid in D (LL needs to be changed to L).
PR: https://github.com/atilaneves/dpp/pull/215
Status: Need to add unit test.

> An enum is initialized with a value of 68719476704, which produces this: Error: cannot implicitly convert expression 68719476704L of type long to int
PR: https://github.com/atilaneves/dpp/pull/217
Status: AppVeyor build fails (on Windows 32-bit), will look into it.

> Again regarding C11 anon records: in its output, dpp writes "const(_Anonymous_55) version_;", and this conflicts with the accessor functions' name for a struct field. I will try to figure out the reason why dpp adds that declaration in the first place and fix this (most probably another renaming issue).
PR: https://github.com/atilaneves/dpp/pull/216
Status: Just need to add a comment and after that probably done.

As for the rest of the issues, I'm still debugging and trying to figure out how to reproduce some of them on smaller examples.
Nov 18 2019
Update for week 1 of Milestone 3

(1) I've identified the actual cause of a previous issue and also implemented a solution for it:

> Error: undefined identifier xattr_handler
> The above happens in a struct declaration (a field of that type is declared) and also in a function declaration (a parameter of that type). The struct declaration for xattr_handler is not in the generated D file. Will look more into this.

The problem there was that we would have something like this:

    void f(struct A**); // C code; A is undeclared

and, because A is undeclared, DPP should generate a plain 'struct A;'.

It turns out that previously, DPP was only looking at the cursor type of the first pointee of 'struct A**', which is 'struct A*' (not a Record type). So DPP would not generate the corresponding plain struct. The solution is simply to keep moving to the pointee type until the cursor type is no longer a Pointer type (so eventually we get to the Record type in the case above).

PR: https://github.com/atilaneves/dpp/pull/218

(2) Found another bug, this time about bitfields. A succinct explanation and a small example can be found here: https://gist.github.com/cbecerescu/29188e4c0f0bb83e0e85e4e0dccc8c30

(3) I've tried using DPP with other kernel headers as well (specifically netdevice.h), and usually I get an error telling me the resources have been exhausted.

My guess is that the issue is the 'lines' array, which contains all of the lines to be written to the generated D file. As far as I know, we don't flush at any point during this process. I assume this array gets very large at some point (I've seen generated D files of up to ~100K lines so far, although other factors could also have an impact, maybe the AST).

I thought of two approaches here: either write to the file when 'lines' gets too big, or translate each C header into a separate D module. I'm still open to suggestions.

(4) I still have not got to the cause of this (I assume it's some sort of redirection of stderr, but I can't reproduce it with any warnings I tried):

> In some cases, dpp writes a clang warning into the generated D file. For example (for virtio.h):
> foo.d.tmp:81822:141: warning: __VA_ARGS__ can only appear in the expansion of a C99 variadic macro #define __PVOP_VCALL ( op , pre , post, ... ) ____PVOP_VCALL ( op , CLBR_ANY , __VA_ARGS__ ).
> This is written exactly as is in the D file. I still have to figure why this happens.
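The fix described in (1) can be sketched in plain C. This is only a toy model, not DPP's code and not the libclang API: the Type struct, the TypeKind enum and all function names are made up for illustration (libclang's real counterpart for the pointee step is clang_getPointeeType, which the sketch does not call).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for libclang's type kinds. */
enum TypeKind { KIND_POINTER, KIND_RECORD, KIND_OTHER };

/* Hypothetical type node: a pointer type links to its pointee. */
struct Type {
    enum TypeKind kind;
    const struct Type *pointee; /* non-NULL only for KIND_POINTER */
    const char *spelling;       /* e.g. "struct A **" */
};

/* Follow pointee links until a non-pointer type is reached. */
const struct Type *strip_pointers(const struct Type *t) {
    assert(t != NULL);
    while (t->kind == KIND_POINTER && t->pointee != NULL)
        t = t->pointee;
    return t;
}

/* Behaviour after the fix: any depth of indirection ends at the record. */
int needs_opaque_struct(const struct Type *param) {
    return strip_pointers(param)->kind == KIND_RECORD;
}

/* Behaviour before the fix, as described above: only the first
 * pointee is inspected, so 'struct A **' is missed. */
int needs_opaque_struct_one_level(const struct Type *param) {
    return param->kind == KIND_POINTER && param->pointee != NULL &&
           param->pointee->kind == KIND_RECORD;
}
```

With a chain modelling 'struct A **', needs_opaque_struct_one_level returns 0 (the first pointee is still a pointer), while needs_opaque_struct returns 1.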
Nov 25 2019
Update for week 2 of Milestone 3

- Fixed the bitfields issue (https://github.com/atilaneves/dpp/pull/219)
- Tested DPP with all the kernel headers included by Alex's driver + 70 other random kernel headers
- Discovered (what I hope to be the last) 5 issues

1. In some obscure parts of the kernel, there exists something like 'extern __visible const void __nosave_begin;', which is a variable of type void (but declared extern, so the compiler doesn't complain). Still thinking about how DPP should translate this to valid D code.

2. Unknown enums in function return/param types won't generate opaque definitions for that enum (as is the case for structs), e.g.:

    enum A f(enum B); // A and B would be unknown identifiers
    struct C *f(struct D *); // C and D would be declared by default by DPP

3. Unreachable struct, e.g.:

    // C
    struct A {
        struct B {
            int a;
            int b;
        };
    };
    void f(struct B *); // valid in C

    // D
    void f(B *); // not valid in D, this is how DPP translates the function declaration
    void f(A.B *); // OK, this should probably be the actual output

4. 'struct sockaddr size not known', and some functions return this struct type by value (not a pointer to it). The definition of this struct should probably be there. Will check to see if I missed any flags in the Makefile.

> (3) I've tried using DPP with other kernel headers as well (specifically netdevice.h), and usually I get an error telling me the resources have been exhausted.

For this issue I previously proposed 2 approaches, which have their own limitations (thanks Atila for pointing them out). The exhaustion of memory seems to happen while cpp/clang is still running, so the issue is not about the lines array (which would be on the order of megabytes at most, whereas the RAM usage is more than 6GB). I will, therefore, do some profiling to narrow down the causes of this and decide if there's something we could do about it.
Dec 02 2019
Update for week 3 of Milestone 3

- Fixed an issue which deals with the way nested aggregates are used outside of their parent (enclosing) aggregate, e.g.:

    struct A {
        struct B {
            int a;
            int c;
        } b_obj;
        int other;
    };
    void f(struct B*); // works in C, in D it should be A.B (but it was still translated to just B)
    // Also applicable to field types

  A better explanation and an actual solution are presented in the PR below.
  PR: https://github.com/atilaneves/dpp/pull/226

- Did some profiling on DPP with lots of kernel header files included in the same .dpp file.

  Memory usage after libclang finishes parsing: 300MB.
  Memory usage after DPP finishes processing the AST provided by libclang: 6-7GB (this is where some fishy things happen).

  DPP then tries to use 'cpp' on the .d.tmp file for C preprocessing, but the process spawning fails, as there is not enough memory left to fork and execute (I was testing on an 8GB machine).

  It turns out (and huge thanks to Edi and Razvan for your help with this) that the Appender object '_lines' (which stores the strings to be written to the translated D file) leaks memory. This was apparently an already known issue: https://issues.dlang.org/show_bug.cgi?id=19511

  I've changed the type from Appender to a built-in array and the maximum memory usage at any point is somewhere between 2.0-2.4GB, way better than 7GB :)

  Edi also wrote a simple test and it seems that for appending ~10^6 strings of length 20 each, Appender is just 20ms faster. I also tested this with headers generating a 250K line (850K words total) D file and performance doesn't differ (I also think that as long as Appender is leaking it shouldn't be used no matter the speed boost, considering the large C codebases that DPP could be used with).

  PR: https://github.com/atilaneves/dpp/pull/227
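As a small illustration of the nested aggregate point above, the following C sketch compiles because, in C, the tag of a struct declared inside another struct is visible in the enclosing scope, so 'struct B' can be named without going through A. The helper names sum_b and sum_via_a are made up for the example; the struct layout is the one from the post.

```c
#include <assert.h>
#include <stddef.h>

struct A {
    struct B {
        int a;
        int c;
    } b_obj;
    int other;
};

/* 'struct B' is usable here without any reference to A. A D
 * translation of this signature has to spell the type as A.B. */
int sum_b(const struct B *b) {
    assert(b != NULL);
    return b->a + b->c;
}

/* Reaches the nested struct through an instance of the outer one. */
int sum_via_a(const struct A *outer) {
    assert(outer != NULL);
    return sum_b(&outer->b_obj);
}
```

Note that this holds when the file is compiled as C; C++ scopes nested struct names differently.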
Dec 09 2019
Update for week 4 of Milestone 3

- For the memory usage issue: I previously proposed changing from Appender to array (as I've seen the memory dropping from ~6GB to ~2GB). In the last week I've been continuously testing those two versions to be sure. However, I don't get consistent results: sometimes both the Appender and the array use more than 5GB, and other times both use 2GB (once I even got a maximum of 600MB). The tests were done with the same input file and after a system restart (just to make sure it's not a cache thing of some sort). If you have any guess about why this might happen, please feel free to reply and let us know.

  I also tested the performance of Appender vs array on Python.h (which is way smaller than the kernel headers I tested above), as Atila requested in the PR.

  Appender version (~230MB):
      User time (seconds): 18.78
      System time (seconds): 2.62
      Percent of CPU this job got: 99%
      Elapsed (wall clock) time (h:mm:ss or m:ss): 0:21.41

  Array version (~240MB):
      User time (seconds): 17.39
      System time (seconds): 0.22
      Percent of CPU this job got: 100%
      Elapsed (wall clock) time (h:mm:ss or m:ss): 0:17.61

- I've identified the issue which inserted C warnings in the generated D code (it's the C preprocessor, whose output is redirected into the file). I have a local fix for this (redirecting the error messages), but I haven't made a PR yet.

- Also locally fixed an issue with void type extern variables. As discussed with my mentors, the solution is to change from void to char*.
Dec 16 2019
Update for week 1 of Milestone 4

- Solved and made a PR for the clang warnings being written to the .d file: https://github.com/atilaneves/dpp/pull/232
- Solved and made a PR for the issue of variables of type void in C: https://github.com/atilaneves/dpp/pull/233

I had some issues running the unittests locally and I've been trying to debug that while working on the above PRs. I decided to push the changes anyway, but I think that in the meantime Atila managed to fix those issues.

In the next 2 weeks I won't be able to work on the project as consistently as before, so I will probably be posting the updates for weeks 2 and 3 of Milestone 4 as part of the update for week 3.
Dec 23 2019