digitalmars.D - [SAoC] 'DPP with Linux kernel headers' Project Thread

Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Hi all,

My name is Cristi and in the next few months I will be working on 
the "DPP with Linux kernel headers" project for this year's 
edition of SAoC. This thread will be used to post further updates 
(though anyone is encouraged to make suggestions, ask questions 
and point out mistakes).

DPP is a tool used to directly include C/C++ headers in D files; 
it is basically a D compiler wrapper which permits D files with 
#include directives to be compiled.
Since D doesn't have a preprocessor, running DPP on a .dpp file 
will create a valid .d file that can be compiled (e.g. with dmd). 
A .dpp file can include one or more C/C++ headers, each possibly 
containing C/C++ macros, and function and structure declarations. 
The included headers are parsed using libclang.

At the moment, DPP doesn't work when the .dpp file includes Linux 
header files. As a consequence, writing a Linux driver in D (e.g. 
Alex Militaru's driver) implies manually translating 
all used C headers to .di interfaces (redeclaring all structures 
and functions in a D-compilable format). The project aims to 
solve this issue and enable DPP to work with all Linux headers.

Milestones for this project:

Milestone 1: Investigate and narrow down the issues that dpp has 
with the compilation of linux kernel header files.
Weeks 1-2: Get accustomed to the infrastructure: build the 
kernel with clang, clone dpp, understand the internal structure 
of both dpp and the kernel.
Weeks 3-4: Compile a .c file with clang that includes a random 
linux kernel header, try to reproduce with dpp. There is a 
series of issues that need to be addressed:
        -> how to pass clang command line options to dpp?
        -> what command line options are needed?
        -> how to specify what version of libclang should be used?
        -> are the object files compatible?

Milestone 2: Fix all issues encountered at milestone 1 so that 
Alex Militaru's driver can be integrated with the linux kernel 
using dpp.
Unfortunately, we cannot go into much detail on this step as it 
really depends on what we will discover during milestone 1. We 
can provide more information about this at the end of milestone 
1. If things go smoothly and we integrate Alex Militaru's driver 
very fast, then the next step will be to create a testing 
infrastructure that makes sure that dpp works well with **all** 
the headers in the kernel.

Milestone 3 + 4:  Work on integrating dpp with C++.

From now on, I will be posting updates weekly (or every two 
weeks) regarding the progress made on this project.

Quick note: I'm currently in the last two weeks of an internship, 
but I will do my best to finish all that I've set out to do.

Many thanks,
Cristi
Sep 16 2019
Jacob Carlborg <doob me.com> writes:
On 2019-09-16 22:51, Cristian Becerescu wrote:

         -> how to pass clang command line options to dpp?
DStep [1], which is very similar to dpp and uses libclang, will forward any unrecognized arguments to libclang. If there are any conflicts, it's always possible to pass "--" to separate Clang arguments from DStep arguments. It's common practice and is supported by std.getopt, IIRC.
         -> how to specify what version of libclang should be used?
I would assume that dpp doesn't use `dlopen` and just uses whatever version of libclang is installed. DStep supports static linking of libclang and the pre-compiled release binaries are statically linked with libclang to minimize the risk that an unsupported version of libclang is used.

[1] https://github.com/jacob-carlborg/dstep

-- 
/Jacob Carlborg
Sep 18 2019
Atila Neves <atila.neves gmail.com> writes:
On Monday, 16 September 2019 at 20:51:08 UTC, Cristian Becerescu 
wrote:
 Hi all,

 My name is Cristi and in the next few months I will be working 
 on the "DPP with Linux kernel headers" project for this year's 
 edition of SAoC. This thread will be used to post further 
 updates (though anyone is encouraged to make suggestions, ask 
 questions and pinpoint mistakes).

 [...]
You can pass command-line options directly to dpp with --clang-option.
Sep 22 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Sorry for not updating this thread in a while.
I've managed to make some progress on this project, and from now on 
I'll have more time, since I've finished my internship.

Here are some updates and things I found out during those weeks:

1. Yes, as Atila said, I can pass flags to clang through dpp by 
using --clang-option. The only inconvenience with this is that 
every clang flag needs to be preceded by a --clang-option flag 
(this becomes unmanageable when dealing with tens of compiler 
flags, e.g. for compiling kernel headers).

This is not a problem which affects the capabilities of d++, but 
it impacts the ease of use and increases the chances of 
mistyping, forgetting to add the flag or adding it in a wrong 
place.


Two possible improvements:

- the function used for parsing the command line could accept 
multiple flags at once for clang-options: 
--clang-options=-D__KERNEL__,-Werror,-Wextra.

- alternatively, all flags could be passed as a single quoted string of 
options: --clang-options "-D__KERNEL__ -Werror -Wextra", which 
would be more convenient when copy-pasting flags from the kernel 
build files.
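
To make the comma-separated form (--clang-options=-D__KERNEL__,-Werror,-Wextra) more concrete, here is a minimal sketch in C of how such a value could be split into individual clang flags. This is only an illustration: split_clang_options and MAX_OPT_LEN are made-up names, and this is not how dpp parses its command line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPT_LEN 128 /* hypothetical upper bound for one flag */

/* Splits a comma-separated option string (the value of a hypothetical
 * --clang-options flag) into individual clang flags.
 * Returns the number of flags stored in `out`; empty items are skipped. */
size_t split_clang_options(const char *csv, char out[][MAX_OPT_LEN], size_t max_out)
{
    assert(csv != NULL && out != NULL);
    size_t count = 0;
    const char *start = csv;

    for (const char *p = csv; ; ++p) {
        if (*p == ',' || *p == '\0') {
            size_t len = (size_t)(p - start);
            /* Keep only non-empty items that fit into one slot. */
            if (len > 0 && len < MAX_OPT_LEN && count < max_out) {
                memcpy(out[count], start, len);
                out[count][len] = '\0';
                ++count;
            }
            if (*p == '\0')
                break;
            start = p + 1;
        }
    }
    return count;
}
```

One caveat of this form: a flag that itself contains a comma (for example -Wl,...) would be split in the wrong place, which is an argument for the quoted, space-separated variant.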

2. The LLVM compiler infrastructure doesn't yet support the 'asm 
goto' gcc extension.

I encountered this problem while trying to compile a simple, 
empty main C program which #included linux/namei.h.
If the kernel is built with CONFIG_JUMP_LABEL and 
CC_HAVE_ASM_GOTO set, using clang will not work (for details, see 
asm/compiler.h, asm/compiler_types.h, asm/compiler-gcc.h, 
asm/compiler-clang.h). There is a macro (asm_volatile_goto(x)) 
used in asm/jump_label.h which is declared in asm/compiler-gcc.h, 
but not in asm/compiler-clang.h (asm/jump_label.h includes one of 
the two, but always uses the macro, even though in one of the 
included files the macro is not defined). Undefining 
CC_HAVE_ASM_GOTO (or just not defining it with -D...) when 
compiling the .c file worked.

The motivation for debugging this was to first make sure that I 
can compile a .c file (containing a kernel header) using clang, 
so that then, when I compile a .dpp file using d++ (and clang 
internally) and get errors, I know that there are problems 
within dpp and not clang.

3. After making sure clang worked with the above .c program, I 
tried compiling the .dpp one with the same clang options preceded 
by --clang-option. This currently does not work, as clang reports 
some "undeclared identifier" and other syntax errors, which I'll 
have to dig deeper into to understand whether it's a problem with 
the way the flags are passed to clang or something else.
Oct 04 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
It's time for a new update, so here we go. Sorry for these long 
posts :)

This past week I've dived deeper into the 3rd problem mentioned 
in the last update.
There were multiple problems when trying to generate and compile 
a D program from this DPP:

// foo.dpp
#include <linux/namei.h>

void main()
{
}


Even though compiling a C program which included the same kernel 
header with clang worked, compiling it through dpp didn't. I've 
managed to find out that this happens because dpp appends some 
include directories by default, the problem being related to 
'/usr/include/' in particular (which is set through the function 
call systemPaths() from the D libclang [1]). Clang complains 
about undeclared identifiers and expected closing parentheses, but 
I still have to investigate why adding that directory to the 
include directories messes things up.

-------------------------------------------------


The second problem is related to multi-line macro definitions. 
Consider this code from kernel.h:

#define u64_to_user_ptr(x) (		\
{					\
	typecheck(u64, x);		\
	(void __user *)(uintptr_t)x;	\
}					\
)

In dpp, when translating this into D code, we also check for 
pairs of opening and closing parentheses [2]. When finding a '(', we 
increment the index of the tokens array until we find a matching 
')'. If the C code is valid (and in the above example it is), 
this should work well, but it doesn't, resulting in a fatal 
error: range violation. The reason is, as seen in [2], we only 
check for tokens with the ')' spelling, when, in reality, the 
last parenthesis of the macro is not spelled ')', but '\\\n)' 
('\' character, followed by newline, followed by the actual 
parenthesis).
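
A minimal sketch in C of the idea behind a fix: normalize each token spelling by skipping a leading backslash-newline before comparing it with "(" or ")", and stop at the end of the token array instead of running past it. The function names and the plain array of strings are assumptions made for this illustration; dpp itself is written in D and works on libclang tokens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skips a leading line continuation (backslash, optional '\r', newline)
 * so that a spelling such as "\\\n)" is seen as ")". */
const char *normalized_spelling(const char *spelling)
{
    assert(spelling != NULL);
    while (spelling[0] == '\\' &&
           (spelling[1] == '\n' || (spelling[1] == '\r' && spelling[2] == '\n')))
        spelling += (spelling[1] == '\n') ? 2 : 3;
    return spelling;
}

/* Returns the index of the ")" matching the "(" at tokens[open],
 * or -1 if the token array ends first (no out-of-bounds access). */
long matching_paren(const char *const *tokens, size_t count, size_t open)
{
    assert(open < count && strcmp(normalized_spelling(tokens[open]), "(") == 0);
    int depth = 0;
    for (size_t i = open; i < count; ++i) {
        const char *s = normalized_spelling(tokens[i]);
        if (strcmp(s, "(") == 0)
            ++depth;
        else if (strcmp(s, ")") == 0 && --depth == 0)
            return (long)i;
    }
    return -1;
}
```

With the tokens of the macro body above, the last token "\\\n)" now counts as the closing parenthesis, and an unbalanced input yields -1 rather than a range violation.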

-------------------------------------------------


The third problem concerns C identifiers that are D keywords. 
I'll give you an example:

// test.h
struct module;

void f(struct module *);

struct module {
	int a;
	int b;
};

Generating a D file from a DPP one which includes the above 
header will look like this:

// test.d, generated from test.dpp through the latest version of
// d++ from github
// ...
extern(C)
{
	void f(module*) @nogc nothrow;
	struct module__
	{
		int a;
		int b;
	}
}

struct module;

void main() {}
// ...

Clearly there are multiple wrong things here:
- the module struct should be named module_ and not module__ 
(this is what dpp should do internally)
- even though the struct is renamed, the parameter types are not
- we are declaring the same structure again (with the original C 
spelling) outside of the extern(C) block because dpp thinks 
module was an undeclared structure
- compilation of this D program clearly doesn't work

The reasons for those bugs are a bit subtler, so I'm not going 
over them as it would make this post quite big.
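
Without going into dpp's internals, the property the fix needs can be sketched in C: one single function decides the D spelling of a C identifier, and it is used both where the struct is declared and where it is referenced, so 'module' always becomes 'module_'. The keyword list below is a small, incomplete sample and all names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A small, deliberately incomplete sample of D keywords that are
 * ordinary identifiers in C. */
static const char *const d_keywords[] = {
    "module", "version", "import", "alias", "in", "out"
};

int is_d_keyword(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof d_keywords / sizeof d_keywords[0]; ++i)
        if (strcmp(name, d_keywords[i]) == 0)
            return 1;
    return 0;
}

/* Writes the D spelling of a C identifier into `out`: keywords get exactly
 * one trailing underscore, everything else is copied unchanged. Calling it
 * for the declaration and for every use keeps the two consistent. */
void d_spelling(const char *c_name, char *out, size_t out_size)
{
    assert(c_name != NULL && out != NULL && out_size > 0);
    if (is_d_keyword(c_name))
        snprintf(out, out_size, "%s_", c_name);
    else
        snprintf(out, out_size, "%s", c_name);
}
```

Because the function always starts from the original C name, calling it twice for the same identifier cannot produce 'module__'.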

-------------------------------------------------

I have implemented working solutions for all the above problems. 
They pass all the unit tests and I can also successfully generate 
an executable file from a .dpp which contains the linux/namei.h 
kernel header. I just have to clean some things and start making 
pull requests and maybe get some feedback.

For the first problem, I added an option to not include the system 
paths (just a workaround for the moment), but as I 
investigate this further, I will try to see what the underlying 
problem really is (probably some collisions with other files).

Going from here, I will investigate if my changes impact other 
non-unit-tested C cases. Also, I will try running dpp with other 
kernel and non-kernel headers as well, making sure there are no 
other bugs or untreated edge cases.

Cristi


[1] https://code.dlang.org/packages/libclang
[2] https://github.com/atilaneves/dpp/blob/master/source/dpp/translation/macro_.d#L326
Oct 10 2019
Atila Neves <atila.neves gmail.com> writes:
On Thursday, 10 October 2019 at 20:28:20 UTC, Cristian Becerescu 
wrote:
 It's time for a new update, so here we go. Sorry for those long 
 posts :)

 [...]
Could you please file issues on github so I can fix them? Thanks.
Oct 11 2019
RazvanN <razvan.nitu1305 gmail.com> writes:
On Friday, 11 October 2019 at 13:11:48 UTC, Atila Neves wrote:
 On Thursday, 10 October 2019 at 20:28:20 UTC, Cristian 
 Becerescu wrote:
 It's time for a new update, so here we go. Sorry for those 
 long posts :)

 [...]
Could you please file issues on github so I can fix them? Thanks.
It looks like Cristian has already solved these issues. He will probably make some PRs next week.
Oct 13 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 1 of Milestone 2.

I've implemented solutions for the issues discovered in the 
previous 2 weeks and made pull requests as follows:

- Allow multiple clang options to be specified more easily & add 
option to avoid including the system paths by default
https://github.com/atilaneves/dpp/pull/185

- Fix a bug related to multi-line macro definitions
https://github.com/atilaneves/dpp/pull/186

- Fix the renaming of D keywords to be consistent and correct
https://github.com/atilaneves/dpp/pull/187
Oct 20 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 2 of Milestone 2

- PRs from the previous week were successfully merged

- Tested running dpp to translate linux/virtio.h

In this phase, I've begun testing dpp with every linux header 
used by Alex Militaru's driver, starting with linux/virtio.h.

There are multiple cases where a function is declared through a 
macro, where the function return type is specified using 
__typeof__(X) (e.g. __typeof__(X) functionName()). The D 
translation would be something like:

typeof(X) functionName();

In C, X could be a variable of a certain type, the type itself 
(e.g. int), or an alias of a type.
In D, typeof only accepts an expression, so the code is not valid 
if X is either a type or an alias of a type.

E.g.:
alias u32 = uint;

typeof(u32) f1(); // err
typeof(int) f2(); // err
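
For comparison, the C side can be sketched as follows. GNU C's __typeof__ (a gcc extension that clang also accepts) takes an expression, a plain type or a typedef name, which is why the kernel macros compile in C. The names counter, f0, f1 and f2 are made up for this illustration.

```c
#include <assert.h>

typedef unsigned int u32;

u32 counter; /* a variable, only used as a __typeof__ operand */

/* In GNU C, all three operand kinds are accepted by __typeof__. */
__typeof__(counter) f0(void) { return 1u; } /* expression */
__typeof__(u32)     f1(void) { return 2u; } /* typedef name (alias of a type) */
__typeof__(int)     f2(void) { return -3; } /* the type itself */
```

A translator therefore cannot copy the operand into D's typeof unchanged: only the first form is an expression.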

Another issue is related to translating nested (on multiple 
levels) anonymous unions and structs (specifically in 
linux/slab.h, line 596), but I still have to wrap my head around 
this (because the generated code is a bit complicated).

An inconvenience is that on my machine, running dpp with virtio.h 
takes about 15 minutes, which slows down the debugging process 
with this specific header.
Oct 28 2019
Newbie2019 <newbie2019 gmail.com> writes:
On Monday, 28 October 2019 at 11:14:24 UTC, Cristian Becerescu 
wrote:
 Update for week 2 of Milestone 2

 - PRs from the previous week were successfully merged

 [...]
Maybe you can try:

u32 i=0;
typeof(u32.init) f1(); // err
typeof(int.init) f2(); // err
typeof(i.init) f2(); // err
Oct 28 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
On Monday, 28 October 2019 at 12:14:50 UTC, Newbie2019 wrote:
 On Monday, 28 October 2019 at 11:14:24 UTC, Cristian Becerescu 
 wrote:
 Update for week 2 of Milestone 2

 - PRs from the previous week were successfully merged

 [...]
Maybe you can try:

u32 i=0;
typeof(u32.init) f1(); // err
typeof(int.init) f2(); // err
typeof(i.init) f2(); // err

Initially I thought about checking if the argument to typeof() is a type or an alias to a type (and just use the argument as is in those scenarios). But your suggestion is more generic, and it works on all those cases. Thank you! I will try it out.
Oct 28 2019
Jacob Carlborg <doob me.com> writes:
On 2019-10-28 12:14, Cristian Becerescu wrote:

 Another issue is related to translating nested (on multiple levels) 
 anonymous unions and structs (specifically in linux/slab.h, line 596), 
 but I still have to wrap my head around this (because the generated code 
 is a bit complicated).
What exactly is the problem?

-- 
/Jacob Carlborg
Oct 29 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
On Tuesday, 29 October 2019 at 13:34:50 UTC, Jacob Carlborg wrote:
 What exactly is the problem?
Sorry for taking so long to reply.

I've written a simple example where dpp fails to generate valid D code for the problem I mentioned earlier. The example is in this gist (for syntax highlighting): https://gist.github.com/cbecerescu/9a114bb92b23bd8e275a8eae08c6cc2f

The problem is that getters and setters in A_struct are not generated for the 'c' and 'd' fields from 'union _Anonymous_2'. In fact, not only are they not generated, but, as you can see in the last few lines of the gist, the last 2 generated functions don't make any sense (they don't access any field from _anonymous_5, and so compiling this will fail).
Nov 01 2019
Jacob Carlborg <doob me.com> writes:
On 2019-11-01 21:39, Cristian Becerescu wrote:

 Sorry for taking so long to reply.
 
 I've written a simple example where dpp fails to generate valid D code 
 for the problem I mention earlier. The example is in this gist (for 
 syntax highlighting): 
 https://gist.github.com/cbecerescu/9a114bb92b23bd8e275a8eae08c6cc2f
Wow, that's complicated. DStep [1] generates this:

extern (C):

struct A_struct
{
    union
    {
        int a;

        struct
        {
            int b;

            union
            {
                int c;
                char d;
            }
        }
    }
}

[1] http://github.com/jacob-carlborg/dstep

-- 
/Jacob Carlborg
Nov 02 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
On Saturday, 2 November 2019 at 17:02:45 UTC, Jacob Carlborg 
wrote:
 On 2019-11-01 21:39, Cristian Becerescu wrote:

 Sorry for taking so long to reply.
 
 I've written a simple example where dpp fails to generate 
 valid D code for the problem I mention earlier. The example is 
 in this gist (for syntax highlighting): 
 https://gist.github.com/cbecerescu/9a114bb92b23bd8e275a8eae08c6cc2f
Wow, that's complicated.
When encountering anonymous structs or unions, dpp gives them a name. And this forces dpp to declare a member of that 'now-named-anon' record type and also provide accessors for the 'now-named-anon' record (because unnamed records also implicitly declare a member of their type).
Nov 03 2019
Jacob Carlborg <doob me.com> writes:
On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu 
wrote:

 When encountering anonymous structs or unions, dpp gives them a 
 name. And this forces dpp to declare a member of that 
 'now-named-anon' record type and also provide accessors for the 
 'now-named-anon' record (because unnamed records also 
 implicitly declare a member of their type).
But the question is why is DPP generating named unions/structs when D supports anonymous ones?

-- 
/Jacob Carlborg
Nov 04 2019
Atila Neves <atila.neves gmail.com> writes:
On Monday, 4 November 2019 at 09:17:57 UTC, Jacob Carlborg wrote:
 On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu 
 wrote:

 When encountering anonymous structs or unions, dpp gives them 
 a name. And this forces dpp to declare a member of that 
 'now-named-anon' record type and also provide accessors for 
 the 'now-named-anon' record (because unnamed records also 
 implicitly declare a member of their type).
But the question is why is DPP generating named unions/structs when D supports anonymous ones?
Probably because I either didn't know that at the time or knew and forgot. I don't remember what I was trying to do or why just doing the obvious failed.
Nov 04 2019
rikki cattermole <rikki cattermole.co.nz> writes:
On 05/11/2019 5:48 AM, Atila Neves wrote:
 On Monday, 4 November 2019 at 09:17:57 UTC, Jacob Carlborg wrote:
 On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu wrote:

 When encountering anonymous structs or unions, dpp gives them a name. 
 And this forces dpp to declare a member of that 'now-named-anon' 
 record type and also provide accessors for the 'now-named-anon' 
 record (because unnamed records also implicitly declare a member of 
 their type).
But the question is why is DPP generating named unions/structs when D supports anonymous ones?
Probably because I either didn't know that at the time or knew and forgot. I don't remember what I was trying to do or why just doing the obvious failed.
It sounds like you were too liberal in doing this. There is a variant of struct/union in C which has a named instance but no type name.
Nov 04 2019
Jacob Carlborg <doob me.com> writes:
On 2019-11-04 17:54, rikki cattermole wrote:

 There is a variant of struct/union in C which has a named instance but 
 no type name.
You mean like this?

struct Foo
{
    struct
    {
        int a;
    } b;
}

That still needs to be translated to a named struct in D.

-- 
/Jacob Carlborg
Nov 04 2019
rikki cattermole <rikki cattermole.co.nz> writes:
On 05/11/2019 8:33 AM, Jacob Carlborg wrote:
 On 2019-11-04 17:54, rikki cattermole wrote:
 
 There is a variant of struct/union in C which has a named instance but 
 no type name.
You mean like this?

struct Foo
{
    struct
    {
        int a;
    } b;
}

That still needs to be translated to a named struct in D.

Yes it does. This is probably why I'm guessing that Atila generated a name. Just did so too liberally ;)
Nov 04 2019
Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 5 November 2019 at 01:25:40 UTC, rikki cattermole 
wrote:
 On 05/11/2019 8:33 AM, Jacob Carlborg wrote:
 On 2019-11-04 17:54, rikki cattermole wrote:
 
 There is a variant of struct/union in C which has a named 
 instance but no type name.
You mean like this?

struct Foo
{
    struct
    {
        int a;
    } b;
}

That still needs to be translated to a named struct in D.

Yes it does. This is probably why I'm guessing that Atila generated a name. Just did so too liberally ;)
Sounds about right.
Nov 05 2019
bachmeier <no spam.net> writes:
On Monday, 4 November 2019 at 16:48:21 UTC, Atila Neves wrote:
 On Monday, 4 November 2019 at 09:17:57 UTC, Jacob Carlborg 
 wrote:
 On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu 
 wrote:

 When encountering anonymous structs or unions, dpp gives them 
 a name. And this forces dpp to declare a member of that 
 'now-named-anon' record type and also provide accessors for 
 the 'now-named-anon' record (because unnamed records also 
 implicitly declare a member of their type).
But the question is why is DPP generating named unions/structs when D supports anonymous ones?
Probably because I either didn't know that at the time or knew and forgot. I don't remember what I was trying to do or why just doing the obvious failed.
I don't know if this is related, but dpp ignored anonymous enums, so it may have been the same issue.
Nov 04 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 3 of Milestone 2

- I've proposed a solution for the 'typeof with actual types 
instead of expressions' issue from last week 
(https://github.com/atilaneves/dpp/pull/201)

- I've identified the problem with the nested C11 structs and 
unions mentioned last week. An explanation for this was given in 
response to Jacob's questions above 
(https://forum.dlang.org/post/ckxmzciqkijaarwgmsdh@forum.dlang.org and
https://forum.dlang.org/post/xbbafbbscjlgaoqznxlw@forum.dlang.org). I also
proposed a solution for this issue here:
https://github.com/atilaneves/dpp/pull/204

- On several occasions (when translating virtio.h), some function 
parameters' types contain the struct keyword, which produces 
compilation errors. I'll try to solve this particular issue and 
other bugs in the next few days
Nov 03 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 4 of Milestone 2 (ended on November 10th, sorry 
for the delay)

Continued testing dpp with virtio.h and found multiple bugs, 
especially regarding renaming:


Accessors for members of anonymous records are not renamed when 
the members are keywords

So in this case

struct A {
     union {
         unsigned int version;
         char module;
     };
};

the members themselves would be renamed (to version_ and 
module_), but the accessor function names would not (they would 
be auto version(...) and auto module(...) etc.).

(Solved during week 5 of Milestone 2)


The fixFields() method doesn't work when multiple structs have a 
field (of type pointer to struct) which needs renaming

So in this case

struct A;
struct B {
     struct A *A;
};

struct C {
     struct A* A;
};

dpp would only rename one of the fields (either the one in B or 
in C), because the _fieldDeclarations associative array 
overwrites the already existing (if any) line number with a new 
one. So it would rename only the field which is contained in the 
last processed struct.

Also, this affects C11 anon records, if, for example we would add 
the struct below to the ones above

struct D {
     union {
         struct A* A;
         int d;
     };
};

(Both of those problems have a proposed solution, during week 5 
of Milestone 2)


In some cases, dpp writes a clang warning into the generated D 
file. For example (for virtio.h):

foo.d.tmp:81822:141: warning: __VA_ARGS__ can only appear in the 
expansion of a C99 variadic macro #define __PVOP_VCALL ( op , pre 
, post, ... ) ____PVOP_VCALL ( op , CLBR_ANY , 

__VA_ARGS__ ).

This is written exactly as is in the D file. I still have to 
figure out why this happens.


Some code is translated to enum BLA = 6LLU;
LLU is not valid in D (LL needs to be changed to L).
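
A minimal sketch in C of such a suffix rewrite, assuming the input is an integer literal (not a floating-point one). The function name is hypothetical. It emits the unsigned marker first, so "6LLU" becomes "6UL", which D accepts just like "6LU".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies a C integer literal into `out`, rewriting its suffix for D:
 * any run of l/L becomes a single 'L' and u/U becomes 'U', emitted as
 * "U", "L" or "UL" ("6LLU" -> "6UL", "7l" -> "7L", "42" -> "42"). */
void c_int_literal_to_d(const char *lit, char *out, size_t out_size)
{
    assert(lit != NULL && out != NULL);
    size_t len = strlen(lit);
    size_t digits = len;

    /* Walk back over the suffix characters to find where the digits end. */
    while (digits > 0 && strchr("uUlL", lit[digits - 1]) != NULL)
        --digits;

    int has_u = 0, has_l = 0;
    for (size_t i = digits; i < len; ++i) {
        if (lit[i] == 'u' || lit[i] == 'U')
            has_u = 1;
        else
            has_l = 1;
    }

    /* digits + optional 'U' + optional 'L' + terminating NUL must fit. */
    assert(digits + (size_t)has_u + (size_t)has_l + 1 <= out_size);
    memcpy(out, lit, digits);
    size_t pos = digits;
    if (has_u)
        out[pos++] = 'U';
    if (has_l)
        out[pos++] = 'L';
    out[pos] = '\0';
}
```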


An enum is initialized with a value of 68719476704, which 
produces this:
Error: cannot implicitly convert expression 68719476704L of type 
long to int

I will probably need to check if some values of the enum are > 
int.max, and, if so, declare "enum : long" instead of "enum".
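
The check could look roughly like this (a sketch in C with a hypothetical function name; it assumes the member values are already available as 64-bit integers and ignores values that only fit in an unsigned 64-bit type):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if at least one member value does not fit into D's 32-bit int,
 * in which case the enum would need an explicit base type ("enum : long");
 * returns 0 if a plain "enum" is enough. */
int enum_needs_long_base(const long long *values, size_t count)
{
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        if (values[i] > INT32_MAX || values[i] < INT32_MIN)
            return 1;
    return 0;
}
```

Looking at all members, not just the first one, matters here because the large value may appear anywhere in the enum.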


Again regarding C11 anon records: in its output, dpp writes 
"const(_Anonymous_55) version_;", and this conflicts with the 
accessor functions' name for a struct field. I will try to figure 
out the reason why dpp adds that declaration in the first place 
and fix this (most probably another renaming issue).


Error: undefined identifier xattr_handler

The above happens in a struct declaration (a field of that type 
is declared) and also in a function declaration (a parameter of 
that type). The struct declaration for xattr_handler is not in 
the generated D file. Will look more into this.


/usr/bin/ld: Warning: size of symbol 
`_D3foo10local_apic12__reserved_1MUNaNbNdNiNfZk' changed from 15 
in foo.o to 18 in foo.o

After manually trying to solve the bugs above in the generated 
foo.d D file, the only problem remaining is the linker one above.
Nov 16 2019
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian Becerescu 
wrote:

 An enum is initialized with a value of 68719476704, which 
 produces this:
 Error: cannot implicitly convert expression 68719476704L of 
 type long to int

 I will probably need to check if some values of the enum are > 
 int.max, and, if so, declare "enum : long" instead of "enum".
Shouldn't that be done automatically by the D compiler? It's normal promotion rule, it should reflect in the base type of the enum. Afaict it is done that way in C (at least as an extension to gcc).
Nov 16 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
On Saturday, 16 November 2019 at 18:39:51 UTC, Patrick Schluter 
wrote:
 On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian 
 Becerescu wrote:

 An enum is initialized with a value of 68719476704, which 
 produces this:
 Error: cannot implicitly convert expression 68719476704L of 
 type long to int

 I will probably need to check if some values of the enum are > 
 int.max, and, if so, declare "enum : long" instead of "enum".
Shouldn't that be done automatically by the D compiler? It's normal promotion rule, it should reflect in the base type of the enum. Afaict it is done that way in C (at least as an extension to gcc).
I tested some cases.

enum e { A = 68719476704 }; // Compiles fine
enum e { A = 68719476704, B = 1 }; // Still compiles fine
enum e { B = 1, A = 68719476704 }; // Error: cannot implicitly convert expression 68719476704L of type long to int

I looked through the docs [1] and found this:
"If the EnumBaseType is not explicitly set, and the first EnumMember has an AssignExpression, it is set to the type of that AssignExpression. Otherwise, it defaults to type int."

[1] https://dlang.org/spec/enum.html
Nov 16 2019
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Saturday, 16 November 2019 at 23:55:45 UTC, Cristian Becerescu 
wrote:
 On Saturday, 16 November 2019 at 18:39:51 UTC, Patrick Schluter 
 wrote:
 On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian 
 Becerescu wrote:

 An enum is initialized with a value of 68719476704, which 
 produces this:
 Error: cannot implicitly convert expression 68719476704L of 
 type long to int

 I will probably need to check if some values of the enum are > 
 int.max, and, if so, declare "enum : long" instead of "enum".
Shouldn't that be done automatically by the D compiler? It's normal promotion rule, it should reflect in the base type of the enum. Afaict it is done that way in C (at least as an extension to gcc).
I tested some cases.

enum e { A = 68719476704 }; // Compiles fine
enum e { A = 68719476704, B = 1 }; // Still compiles fine
enum e { B = 1, A = 68719476704 }; // Error: cannot implicitly convert expression 68719476704L of type long to int

I looked through the docs [1] and found this:
"If the EnumBaseType is not explicitly set, and the first EnumMember has an AssignExpression, it is set to the type of that AssignExpression. Otherwise, it defaults to type int."

[1] https://dlang.org/spec/enum.html

I wanted to interject that it is in violation of the C standard (as D has as one of its goals that constructs that are syntactically identical to C behave the same as in C), but after checking the C99 standard, it is in fact implementation-defined, with the added remark "An implementation may delay the choice of which integer type until all enumeration constants have been seen". So, even in C, the compiler may implement it as it wishes.
Nov 17 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 5 of Milestone 2

I've implemented and submitted solutions for the following bugs:


 Accessors for members of anonymous records are not renamed when 
 the members are keywords
PR: https://github.com/atilaneves/dpp/pull/213 Status: Merged.

 The fixFields() method doesn't work when multiple structs have 
 a field (of type pointer to struct) which needs renaming
PR: https://github.com/atilaneves/dpp/pull/214 Status: Merged.

 Some code is translated to enum BLA = 6LLU;
 LLU is not valid in D (LL needs to be changed to L).
PR: https://github.com/atilaneves/dpp/pull/215 Status: Need to add unit test.

 An enum is initialized with a value of 68719476704, which 
 produces this:
 Error: cannot implicitly convert expression 68719476704L of 
 type long to int
PR: https://github.com/atilaneves/dpp/pull/217 Status: AppVeyor build fails (on Windows 32-bit), will look into it.

 Again regarding C11 anon records: in its output, dpp writes 
 "const(_Anonymous_55) version_;", and this conflicts with the 
 accessor functions' name for a struct field. I will try to 
 figure out the reason why dpp adds that declaration in the 
 first place and fix this (most probably another renaming issue).
PR: https://github.com/atilaneves/dpp/pull/216 Status: Just need to add a comment and after that probably done.

As for the rest of the issues, I'm still debugging and trying to figure out how to reproduce some of them on smaller examples.
Nov 18 2019
Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 1 of Milestone 3

(1) I've identified the actual cause for a previous issue and 
also implemented a solution for it:


 Error: undefined identifier xattr_handler

 The above happens in a struct declaration (a field of that type 
 is declared) and also in a function declaration (a parameter of 
 that type). The struct declaration for xattr_handler is not in 
 the generated D file. Will look more into this.
The problem there was that we would have something like this:

void f(struct A**); // C code; A is undeclared

and, because A is undeclared, DPP should generate a plain 'struct A;'. It turns out that previously, DPP was only looking at the cursor type for the first pointee of 'struct A**', which was 'struct A*' (which is not a Record type). So DPP would not generate the corresponding plain struct. The solution is simply moving on to the pointee type until the cursor type is not a Pointer type (so eventually we would get to the Record type in the case above).

PR: https://github.com/atilaneves/dpp/pull/218

(2) Found another bug, this time about bitfields. A succinct explanation and a small example can be found here: https://gist.github.com/cbecerescu/29188e4c0f0bb83e0e85e4e0dccc8c30

(3) I've tried using DPP with other kernel headers as well (specifically netdevice.h), and usually I get an error telling me the resources have been exhausted. My guess is that the issue is the 'lines' array, which contains all of the lines to be written in the generated D file. As far as I know, we don't flush at any point during this process. I assume this array gets very large at some point (I've seen generated D files of up to ~100K lines so far, although other factors could also have an impact, maybe the AST). I thought of two approaches here: either try to write to the file when 'lines' gets too big, or translate each C header into a different D module. I'm still open to suggestions.

(4) I still have not got to the cause of this (I assume it's a sort of redirection of stderr, but I can't reproduce it with any warnings I tried):

 In some cases, dpp writes a clang warning into the generated D 
 file. For example (for virtio.h):

foo.d.tmp:81822:141: warning: __VA_ARGS__ can only appear in the 
expansion of a C99 variadic macro #define __PVOP_VCALL ( op , 
pre , post, ... ) ____PVOP_VCALL ( op , CLBR_ANY , 

__VA_ARGS__ ).

 This is written exactly as is in the D file. I still have to 
 figure why this happens.
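
To make the fix from (1) a bit more concrete, here is a minimal C sketch of the idea. It assumes a simplified, made-up type model: 'enum type_kind', 'struct type', 'first_pointee', 'innermost_pointee' and 'needs_opaque_struct' are hypothetical names for illustration only, not DPP or libclang APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parsed C type; not DPP's real data model. */
enum type_kind { KIND_POINTER, KIND_RECORD, KIND_OTHER };

struct type {
    enum type_kind kind;
    const struct type *pointee; /* only meaningful for KIND_POINTER */
};

/* Old behaviour: inspect the first pointee only. */
static const struct type *first_pointee(const struct type *t)
{
    return (t->kind == KIND_POINTER && t->pointee != NULL) ? t->pointee : t;
}

/* New behaviour: keep following pointees until the type is not a pointer. */
static const struct type *innermost_pointee(const struct type *t)
{
    while (t->kind == KIND_POINTER && t->pointee != NULL)
        t = t->pointee;
    return t;
}

/* Returns 1 if an opaque 'struct A;' should be emitted for this parameter. */
static int needs_opaque_struct(const struct type *param)
{
    return innermost_pointee(param)->kind == KIND_RECORD;
}
```

For a parameter of type 'struct A**', 'first_pointee' stops at 'struct A*' (still a pointer), while 'innermost_pointee' reaches the record itself, which is the case the fix is meant to catch.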
Nov 25 2019
parent reply Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 2 of Milestone 3

- Fixed the bitfields issue 
(https://github.com/atilaneves/dpp/pull/219)
- Tested DPP with all the kernel headers included by Alex's 
driver + other 70 random kernel headers
- Discovered (what I hope to be the last) 5 issues

1. In some obscure parts of the kernel, there is something like 
'extern __visible const void __nosave_begin;', which is a 
variable of type void (but declared extern, so the compiler 
doesn't complain). Still thinking about how DPP should translate 
this to valid D code.

2. Unknown enums in function return/param types don't get opaque 
definitions generated for them (as is the case for structs).

e.g.: enum A f(enum B); // A and B would be unknown identifiers
       struct C *f(struct D *); // C and D would be declared by 
default by DPP

3. Unreachable struct

e.g.:
// C
struct A {
     struct B {
         int a;
         int b;
     };
};

void f(struct B *); // valid in C

// D
void f(B *); // not valid in D, this is how DPP translates the 
function declaration
void f(A.B *); // OK, this should probably be the actual output

4. 'struct sockaddr size not known', and some functions return 
this struct type by value (not pointer to it). Definition of this 
struct should probably be there. Will check to see if I missed 
any flags in the Makefile.

 (3) I've tried using DPP with other kernel headers as well 
 (specifically netdevice.h), and usually I get an error telling 
 me the resources have been exhausted.
For this issue I previously proposed 2 approaches, which have 
their own limitations (thanks Atila for pointing them out). The 
exhaustion of memory seems to happen while cpp/clang is still 
running, so the issue is not the lines array (which in any case 
would only be a matter of MB, while the RAM usage is more than 
6GB). I will, therefore, do some profiling to narrow down the 
causes and decide if there's something we could do about it.
Dec 02 2019
parent reply Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 3 of Milestone 3

- Fixed an issue which deals with the way nested aggregates are 
used outside of their parent (enclosing) aggregate

e.g.:

struct A {
     struct B {
         int a;
         int c;
     } b_obj;
     int other;
};

void f(struct B*); // works in C, in D it should be A.B (but it 
was still translated to just B)
// Also applicable to field types

A better explanation and an actual solution are presented in the 
PR below.
PR: https://github.com/atilaneves/dpp/pull/226
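
For anyone wondering why this compiles in C at all: a struct declared inside another struct is still visible at file scope in C, whereas in D it has to be referred to as A.B. A small sketch below shows the C side only; 'sum_b' and 'sum_via_parent' are made-up helper names for illustration.

```c
#include <assert.h>

struct A {
    struct B {
        int a;
        int c;
    } b_obj;
    int other;
};

/* In C, 'struct B' is usable here even though it was declared inside A.
   The D translation of this parameter type would need to be 'A.B*'. */
static int sum_b(const struct B *b)
{
    return b->a + b->c;
}

/* The nested type is also the type of the field 'b_obj'. */
static int sum_via_parent(const struct A *parent)
{
    return sum_b(&parent->b_obj) + parent->other;
}
```

Note that this relies on C scoping rules; the same source compiled as C++ would treat B as A::B, which is closer to what D does.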

- Did some profiling on DPP with lots of kernel header files 
included in the same .dpp file.
Memory usage after libclang finishes parsing: 300MB.
Memory usage after DPP finishes processing the AST provided by 
libclang: 6-7GB (this is where some fishy things happen).
DPP then tries to use 'cpp' on the .d.tmp file for C 
preprocessing, but the process spawning fails, as there is not 
enough memory left to fork and execute (I was testing on an 8GB 
machine).

It turns out (and huge thanks to Edi and Razvan for your help 
with this) that the Appender object '_lines' (which stores the 
strings to be written in the translated D file) leaks memory. 
This was apparently an already known issue: 
https://issues.dlang.org/show_bug.cgi?id=19511

I've changed the type from Appender to built-in array and the 
maximum memory usage at any point is somewhere between 2.0-2.4GB, 
way better than 7GB :)

Edi also wrote a simple test and it seems that for appending ~ 
10^6 strings of length 20 each, Appender is just 20ms faster. I 
also tested this with headers generating a 250K line (850K words 
total) D file and performance doesn't differ (I also think that 
as long as Appender is leaking it shouldn't be used no matter the 
speed boost, considering the large C codebases that DPP could be 
used with).
PR: https://github.com/atilaneves/dpp/pull/227
Dec 09 2019
parent reply Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 4 of Milestone 3

- For the memory usage issue: I previously proposed changing from 
Appender to array (as I've seen the memory dropping from ~6GB to 
~2GB). In the last week I've been continuously testing those two 
versions to be sure. However, I don't get consistent results: 
sometimes both the appender and the array use more than 5GB, and 
other times both use 2GB (once I even got a maximum of 600MB). 
The tests were done with the same input file and after a system 
restart (just to make sure it's not some sort of caching effect). 
If you have any guess about why this might happen, please feel 
free to reply and let us know.

I also tested the performance for Appender vs array on Python.h 
(which is way smaller than the kernel headers I tested above), as 
Atila requested in the PR.

Appender version (~230MB):

User time (seconds): 18.78
System time (seconds): 2.62
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:21.41

Array version (~240MB):

User time (seconds): 17.39
System time (seconds): 0.22
Percent of CPU this job got: 100%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:17.61

- I've identified the issue which inserted C warnings in the 
generated D code (it's the C preprocessor whose output is 
redirected into the file). I have a local fix for this 
(redirecting the error messages), but I didn't make a PR yet.

- Also locally fixed an issue with void type extern variables. As 
discussed with my mentors, the solution is to change from void to 
char*.
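
To illustrate the warnings issue above: if a child process's stderr is merged into its stdout, diagnostics end up in whatever captures the output. This is only a sketch of that general effect, not DPP's actual code. It assumes a POSIX system with 'sh'; 'FAKE_CPP' is a made-up stand-in for the preprocessor and 'capture_stdout' is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for 'cpp': prints code on stdout, a warning on stderr. */
#define FAKE_CPP "{ echo 'int x;'; echo 'warning: demo' >&2; }"

/* Runs 'cmd' through the shell and copies its stdout into 'out'
   (NUL-terminated, truncated to cap - 1 bytes). Returns 0 on success. */
static int capture_stdout(const char *cmd, char *out, size_t cap)
{
    FILE *p;
    size_t n;

    if (cap == 0)
        return -1;
    p = popen(cmd, "r");
    if (p == NULL)
        return -1;
    n = fread(out, 1, cap - 1, p);
    out[n] = '\0';
    return pclose(p) == -1 ? -1 : 0;
}
```

Calling capture_stdout(FAKE_CPP " 2>&1", ...) puts the warning into the captured text along with the code, while capture_stdout(FAKE_CPP " 2>/dev/null", ...) captures only the code. Where the error messages should go instead (terminal, log file) is a separate choice.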
Dec 16 2019
parent Cristian Becerescu <cristian.becerescu yahoo.com> writes:
Update for week 1 of Milestone 4

- Solved and made a PR for the clang warnings being written to 
the .d file: https://github.com/atilaneves/dpp/pull/232

- Solved and made a PR for the issue of variables of type void in 
C: https://github.com/atilaneves/dpp/pull/233

I had some issues running the unittests locally and I've been 
trying to debug that while working on the above PRs. I decided to 
push the changes anyway, but I think that in the meantime Atila 
managed to fix those issues.

In the next 2 weeks I won't be able to consistently work on the 
project as before, so I will probably be posting the updates for 
weeks 2 and 3 of Milestone 4 as part of the update for week 3.
Dec 23 2019