digitalmars.D.announce - htod - convert C .h files to D import files

reply Walter Bright <newshound digitalmars.com> writes:
I'm not sure how useful this will be.

Here it is: http://www.digitalmars.com/d/htod.html
May 21 2006
next sibling parent Dave <Dave_member pathlink.com> writes:
Walter Bright wrote:
 I'm not sure how useful this will be.
 
 Here it is: http://www.digitalmars.com/d/htod.html
Thanks!
May 21 2006
prev sibling next sibling parent Tom <ihate spam.com> writes:
Walter Bright wrote:
 I'm not sure how useful this will be.
 
 Here it is: http://www.digitalmars.com/d/htod.html
Wow, that was quick. Seems like great work!
May 21 2006
prev sibling next sibling parent Derek Parnell <derek psych.ward> writes:
On Sun, 21 May 2006 16:17:38 -0700, Walter Bright wrote:

 I'm not sure how useful this will be.
Very useful. Thanks for this Walter.
 Here it is: http://www.digitalmars.com/d/htod.html
I just used it on windows.h. It found a 'bug' in winscard.h ;-) After I added

  #include <BaseTsd.h>
  #include <Guiddef.h>

to winscard.h and ran it again, it generated a very useful windows.d file. The command I used was ...

  htod windows.h -Iy:\dm\include -hs -DUNICODE

and I got 177823 lines in windows.d (mostly comments, of course). With the -hc switch this went down to 85572 lines of useful code.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
22/05/2006 12:05:43 PM
May 21 2006
prev sibling next sibling parent reply Anders F Björklund <afb algonet.se> writes:
Walter Bright wrote:

 I'm not sure how useful this will be.
 
 Here it is: http://www.digitalmars.com/d/htod.html
Q: Will there be source code later, for porting to GDC?

"Bugs: 2. No Linux version" (no Mac OS X version either)

--anders
May 22 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Anders F Björklund wrote:
 Walter Bright wrote:
 
 I'm not sure how useful this will be.

 Here it is: http://www.digitalmars.com/d/htod.html
Q: Will there be source code later, for porting to GDC ?
Sorry, but it's totally based on the C compiler. But that also means it could be done to gcc.
 "Bugs: 2. No Linux version" (no Mac OS X version either)
True.
May 22 2006
next sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Walter Bright wrote:
 Anders F Björklund wrote:
 Walter Bright wrote:

 I'm not sure how useful this will be.

 Here it is: http://www.digitalmars.com/d/htod.html
Q: Will there be source code later, for porting to GDC ?
Sorry, but it's totally based on the C compiler. But that also means it could be done to gcc.
I've been playing around with libcpp (the c preprocessor) from gcc and was wondering if that was indeed enough to make a tool like htod. I mean, do you need to parse the C code further, or is a preprocessor (passing you the tokens) enough to do what htod does? (You'd have to detect the 3 tokens "unsigned long long" yourself and replace it with "ulong", but that's find&replace, hardly parsing) L.
May 22 2006
parent Walter Bright <newshound digitalmars.com> writes:
Lionello Lunesu wrote:
 I've been playing around with libcpp (the c preprocessor) from gcc and 
 was wondering if that was indeed enough to make a tool like htod.
 
 I mean, do you need to parse the C code further, or is a preprocessor 
 (passing you the tokens) enough to do what htod does?
 
 (You'd have to detect the 3 tokens "unsigned long long" yourself and 
 replace it with "ulong", but that's find&replace, hardly parsing)
htod does a real parse of the C code, and generates the D output from the internal symbol table. Parsing it all the way means that the corner cases work. Also, it means it'll work with C++ header files that have extern "C" declarations in them.

"unsigned long long" can be:

  unsigned long long
  long unsigned long
  long long unsigned
  int unsigned long long
  unsigned long const int long

etc.
May 22 2006
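To illustrate the point about specifier order, here is a minimal C sketch, assuming a C11 compiler (it relies on _Generic). The typedef, macro, and function names are made up for the illustration and have nothing to do with htod's implementation. It checks that the five spellings listed above all name the same type, which is why matching the literal token sequence "unsigned long long" would miss valid declarations.

```c
/* Assumes a C11 compiler (for _Generic). All names here are illustrative. */

/* The five spellings listed in the post above. */
typedef unsigned long long           ull_1;
typedef long unsigned long           ull_2;
typedef long long unsigned           ull_3;
typedef int unsigned long long       ull_4;
typedef unsigned long const int long ull_5;  /* same type, also const-qualified */

/* 1 if the operand is an unsigned long long, else 0.
   Unary plus yields an unqualified value, so the const on ull_5 is ignored. */
#define IS_ULL(x) _Generic(+(x), unsigned long long: 1, default: 0)

/* Counts how many of the variables below have type unsigned long long. */
static int count_ull_spellings(void)
{
    ull_1 a = 0;
    ull_2 b = 0;
    ull_3 c = 0;
    ull_4 d = 0;
    ull_5 e = 0;
    unsigned long other = 0;  /* control: a distinct type, must not be counted */

    return IS_ULL(a) + IS_ULL(b) + IS_ULL(c) + IS_ULL(d) + IS_ULL(e)
         + IS_ULL(other);
}
```

The control variable is there so that the count only reaches five if the macro really distinguishes unsigned long long from its neighbours.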
prev sibling parent reply Anders F Björklund <afb algonet.se> writes:
Walter Bright wrote:

 Q: Will there be source code later, for porting to GDC ?
Sorry, but it's totally based on the C compiler. But that also means it could be done to gcc.
It doesn't have to be complete... As long as it's something like "insert C/C++ front end here", it could then be adapted to GCC ? Anyway, we can probably do a re-implementation from the documentation but the more info you are able to provide the more similar they'll get. If we can do a larger implementation test, like Dstress for the compiler, then we can run that on the two "htod" to compare them... Will do a spike / hack in a scripting language. "ghtod", I suppose. ? Time will tell if that'll do, or if we need a "real" C or D program. --anders
May 22 2006
parent Walter Bright <newshound digitalmars.com> writes:
Anders F Björklund wrote:
 Walter Bright wrote:
 
 Q: Will there be source code later, for porting to GDC ?
Sorry, but it's totally based on the C compiler. But that also means it could be done to gcc.
It doesn't have to be complete... As long as it's something like "insert C/C++ front end here", it could then be adapted to GCC ? Anyway, we can probably do a re-implementation from the documentation but the more info you are able to provide the more similar they'll get. If we can do a larger implementation test, like Dstress for the compiler, then we can run that on the two "htod" to compare them... Will do a spike / hack in a scripting language. "ghtod", I suppose. ? Time will tell if that'll do, or if we need a "real" C or D program. --anders
There really isn't much to it, it just walks the symbol table of the C compiler.
May 22 2006
prev sibling next sibling parent Lionello Lunesu <lio lunesu.remove.com> writes:
Walter Bright wrote:
 I'm not sure how useful this will be.
 
 Here it is: http://www.digitalmars.com/d/htod.html
Cool tool! Seems to work fine with some small headers I've tested. But shouldn't the tool create a .di file, like dmd -H does? L.
May 22 2006
prev sibling next sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
//C typedef signed char GLbyte;
alias sbyte GLbyte;

dmd gl.d
gl.d(42): identifier 'sbyte' is not defined

?? something new or something old?

L.
May 22 2006
parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
I have found some other issues with htod, but am reluctant to post them. 
  Don't want Walter's focus to shift to other tools. (Who am I to worry 
about Walter's focus? :S)

Anyway, here are some things to keep in mind when using htod:

* signed char => sbyte
* wchar_t not known (it's built-in in vc?)
* void func( void (*callback)(void) ) => void func( void function(...)callback );
* comments are 1 line off

L.
May 23 2006
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Lionello Lunesu wrote:
 I have found some other issues with htod, but am reluctant to post them. 
  Don't want Walter's focus to shift to other tools. (Who am I to worry 
 about Walter's focus? :S)
 
 Anyway, here are some things to keep in mind when using htod:
 
 * signed char => sbyte
What's the issue?
 * wchar_t not known (it's built-in in vc?)
Hmm. Example?
 * void func( void (*callback)(void) ) => void func( void 
 function(...)callback );
That's correct.
 * comments are 1 line off
This can happen with macros, but it shouldn't happen with declarations.
May 23 2006
parent reply "Lionello Lunesu" <lionello lunesu.remove.com> writes:
"Walter Bright" <newshound digitalmars.com> wrote in message 
news:e4vhd6$25ti$1 digitaldaemon.com...
 Lionello Lunesu wrote:
 I have found some other issues with htod, but am reluctant to post them. 
 Don't want Walter's focus to shift to other tools. (Who am I to worry 
 about Walter's focus? :S)

 Anyway, here are some things to keep in mind when using htod:

 * signed char => sbyte
What's the issue?
dmd claims (correctly) it doesn't know about sbyte: gl.d(42): identifier 'sbyte' is not defined
 * wchar_t not known (it's built-in in vc?)
Hmm. Example?
  const wchar_t* APIENTRY gluErrorUnicodeStringEXT ( GLenum errCode);

wchar_t wasn't defined anywhere (although I can find it in the headers, but it must have been ifdef'ed or something). And I think it's indeed handled as a built-in type by VC, which might be the reason why it's not typedef'ed explicitly. This header is from the PlatformSDK, by the way, so I was expecting some ms-only stuff.
 * void func( void (*callback)(void) ) => void func( void 
 function(...)callback );
That's correct.
Thought so. Still, useful info for a knowledge base : )
 * comments are 1 line off
This can happen with macros, but it shouldn't happen with declarations.
I'll have to check the exact occurrence at the office tomorrow, but it happened during the conversion of gl.h (as did the others):

  #define SOME_CONSTANT 123 /* multi-line
  comment */

=>

  /* multi-line
  const SOME_CONSTANT = 123;
  comment */

The code ended up being commented and I had to shift it up (or down; don't recall). Just another thing to be noted, that's all. All-in-all, it still saved me a lot of time!

L.
May 23 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Lionello Lunesu wrote:
 "Walter Bright" <newshound digitalmars.com> wrote in message 
 news:e4vhd6$25ti$1 digitaldaemon.com...
 Lionello Lunesu wrote:
 I have found some other issues with htod, but am reluctant to post them. 
 Don't want Walter's focus to shift to other tools. (Who am I to worry 
 about Walter's focus? :S)

 Anyway, here are some things to keep in mind when using htod:

 * signed char => sbyte
What's the issue?
dmd claims (correctly) it doesn't know about sbyte: gl.d(42): identifier 'sbyte' is not defined
<slaps forehead> Arrgh!
May 23 2006
parent reply "Lionello Lunesu" <lionello lunesu.remove.com> writes:
 * signed char => sbyte
What's the issue?
dmd claims (correctly) it doesn't know about sbyte: gl.d(42): identifier 'sbyte' is not defined
<slaps forehead> Arrgh!
Hey! Don't get me wrong! I've posted before that "byte" should be unsigned! So, create an sbyte for the signed byte and let's have a laugh about it. L.
May 23 2006
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Lionello Lunesu wrote:
 * signed char => sbyte
What's the issue?
dmd claims (correctly) it doesn't know about sbyte: gl.d(42): identifier 'sbyte' is not defined
<slaps forehead> Arrgh!
Hey! Don't get me wrong! I've posted before that "byte" should be unsigned! So, create an sbyte for the signed byte and let's have a laugh about it.
The integer types are designed to be named consistently with each other. byte, short, int and long are signed. ubyte, ushort, uint and ulong are unsigned. Stewart.
May 24 2006
parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Stewart Gordon wrote:
 byte, short, int and long are signed.
 ubyte, ushort, uint and ulong are unsigned.
 
 Stewart.
Yes, obviously. Interesting though that Walter himself wrote "sbyte" and didn't notice what was wrong with it :) There's something to be said for intuitiveness, even when it's not consistent.

L.
May 24 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Lionello Lunesu wrote:
 Stewart Gordon wrote:
 byte, short, int and long are signed.
 ubyte, ushort, uint and ulong are unsigned.

 Stewart.
Yes, obviously. Interesting though that Walter himself wrote "sbyte" and didn't notice what was wrong with it :) There's something to be said for intuitiveness, even when it's not consistent. L.
Funny, I always thought having 'char' in C being signed unlike every other integer type was hugely unintuitive. Usually, the first thing I do with a C or C++ program is something like this:

  #define byte signed char
  #define ubyte unsigned char

Otherwise I end up getting confused...

-- Daniel

-- 
v1sw5+8Yhw5ln4+5pr6OFma8u6+7Lw4Tm6+7l6+7D a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP
http://hackerkey.com/
May 24 2006
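As a small aside on the byte naming discussed above: C itself leaves the signedness of plain char up to the implementation, which is part of why the explicit spellings are the safer thing to map onto D's byte and ubyte. The following is a minimal C sketch of that; the typedef and function names are invented for the example and do not come from htod or any header in this thread.

```c
#include <limits.h>

/* Explicitly signed and unsigned 8-bit types (illustrative names).
   These are the C spellings that line up with D's byte and ubyte. */
typedef signed char   sbyte_c;
typedef unsigned char ubyte_c;

/* 1 if plain char is signed on this compiler, 0 if it is unsigned.
   The C standard allows either, so code should not rely on one answer. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The same 8-bit pattern (all bits set) read back through each type. */
static int all_bits_as_signed(void)
{
    sbyte_c v = (sbyte_c)-1;
    return v;   /* widened to int with its sign kept */
}

static int all_bits_as_unsigned(void)
{
    ubyte_c v = (ubyte_c)0xFF;
    return v;   /* widened to int as a non-negative value */
}
```

Nothing here depends on which way plain char goes, and that is the reason for spelling the sign out.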
prev sibling parent Sean Kelly <sean f4.ca> writes:
Lionello Lunesu wrote:
 * wchar_t not known (it's built-in in vc?)
vc actually offers an option of whether to define wchar_t as a separate type or merely to make it a typedef of unsigned short. I prefer the former option, but it doesn't work with all third-party code. Sean
May 23 2006
prev sibling next sibling parent reply Don Clugston <dac nospam.com.au> writes:
"I'm not sure how useful this will be."

That's such a typical Walter quote <g>. About as useful as a Ferrari 
that only works on roads, I reckon. Time to throw away my regexp billy-cart.

A perfect H to D converter is probably impossible, because #defines are 
so ambiguous. It takes a human who knows how the header file is intended 
to be used. However, it ought to be possible to construct patch files to 
apply to C headers prior to running them into htod. It also seems 
feasible in the long term to persuade some open-source projects to make 
their headers D-friendly.

I'm left wondering what this means for the Windows API project. I 
certainly would have done things differently, if I'd known this was 
coming. Where to now?
May 23 2006
parent Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 A perfect H to D converter is probably impossible, because #defines are 
 so ambiguous.
Not just that, there are things like using "char" as "byte" in C code. There's no way to reliably distinguish the usages.
 It takes a human who knows how the header file is intended 
 to be used.
That's right. That's why the original declarations are included as comments to make it easy to go through the results line by line and check them.
 However, it ought to be possible to construct patch files to 
 apply to C headers prior to running them into htod. It also seems 
 feasible in the long term to persuade some open-source projects to make 
 their headers D-friendly.
That's why the __HTOD__ macro is predefined when running htod.
 I'm left wondering what this means for the Windows API project. I 
 certainly would have done things differently, if I'd known this was 
 coming. Where to now?
Windows would still need a lot of hand work because of its heavy dependence on the preprocessor. There are still nearly 100,000 lines of code to check. What I hope is that the D outputting version of SWIG will no longer be necessary, that one can just run the C output of SWIG through htod. That way we can easily leverage all the work put into SWIG by others.
May 23 2006
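The __HTOD__ macro mentioned above suggests one way a header could be made friendlier to the converter. Below is a minimal sketch, assuming only what the post states (that __HTOD__ is predefined when htod runs). The header name, the MYLIB_ identifiers, and the function are hypothetical, and how htod would translate the enum is an assumption rather than documented behaviour.

```c
/* mylib.h: a hypothetical header. Every name except __HTOD__ is made up. */
#ifndef MYLIB_H
#define MYLIB_H

#ifdef __HTOD__
/* Seen only when htod processes the file: a plain declaration
   in place of the macro layering below. */
enum { MYLIB_VERSION = 3 };
#else
/* Seen by ordinary C compilers. */
#define MYLIB_MAKE_VERSION(major) (major)
#define MYLIB_VERSION MYLIB_MAKE_VERSION(3)
#endif

/* Declarations that both audiences share. */
int mylib_open(const char *path);

#endif /* MYLIB_H */
```

The C side is unchanged by the guard, so a project could adopt something like this without affecting its existing users.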
prev sibling next sibling parent reply BCS <BCS pathlink.com> writes:
Walter Bright wrote:
 I'm not sure how useful this will be.
 
 Here it is: http://www.digitalmars.com/d/htod.html
Cool!! Just the ticket. Out of curiosity, how much new coding did this take? feature request: print to stdout (would be handy for piping to sed or the like)
May 23 2006
parent Walter Bright <newshound digitalmars.com> writes:
BCS wrote:
 Out of curiosity, how much new coding did this take?
Maybe 1500 lines.
May 23 2006
prev sibling parent reply BCS <BCS pathlink.com> writes:
Walter Bright wrote:
 I'm not sure how useful this will be.
 
 Here it is: http://www.digitalmars.com/d/htod.html
One case that it doesn't handle but would seem to be simple is:

#define FOO "bar"
   |
   v
const char[] FOO = "bar\0"; //explicitly null terminate
May 24 2006
next sibling parent reply Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
BCS wrote:
 Walter Bright wrote:
 I'm not sure how useful this will be.

 Here it is: http://www.digitalmars.com/d/htod.html
One case that it doesn't handle but would seem to be simple is:

#define FOO "bar"
   |
   v
const char[] FOO = "bar\0"; //explicitly null terminate
Why would you need it to be explicitly null terminated ? D string literals have a zero anyway just after their normal data (it's not reflected by the 'length' property, but it's there). -- -----BEGIN GEEK CODE BLOCK----- Version: 3.1 GCS/M d-pu s+: a-->----- C+++$>++++ UL P+ L+ E--- W++ N++ o? K? w++ !O !M V? PS- PE- Y PGP t 5 X? R tv-- b DI- D+ G e>+++ h>++ !r !y ------END GEEK CODE BLOCK------ Tomasz Stachowiak /+ a.k.a. h3r3tic +/
May 24 2006
parent reply BCS <BCS_member pathlink.com> writes:
In article <e52ivj$3am$1 digitaldaemon.com>, Tom S says...
BCS wrote:
 Walter Bright wrote:
 I'm not sure how useful this will be.

 Here it is: http://www.digitalmars.com/d/htod.html
One case that it doesn't handle but would seem to be simple is:

#define FOO "bar"
   |
   v
const char[] FOO = "bar\0"; //explicitly null terminate
Why would you need it to be explicitly null terminated ? D string literals have a zero anyway just after their normal data (it's not reflected by the 'length' property, but it's there).
<code file=a.h>
#define FOO "start"
#define FOO "stop"
int func(char* string);
<\code>

<code file=b.d>
import a;
..
char[] command = FOO;
..
char[] use = command.dup; // don't know why, but someone is going to do it
..
func(use.ptr); // now what??
..
<\code>
May 24 2006
parent Lionello Lunesu <lio lunesu.remove.com> writes:
BCS wrote:
 In article <e52ivj$3am$1 digitaldaemon.com>, Tom S says...
 BCS wrote:
 Walter Bright wrote:
 I'm not sure how useful this will be.

 Here it is: http://www.digitalmars.com/d/htod.html
One case that it doesn't handle but would seem to be simple is:

#define FOO "bar"
   |
   v
const char[] FOO = "bar\0"; //explicitly null terminate
Why would you need it to be explicitly null terminated ? D string literals have a zero anyway just after their normal data (it's not reflected by the 'length' property, but it's there).
<code file=a.h>
#define FOO "start"
#define FOO "stop"
int func(char* string);
<\code>

<code file=b.d>
import a;
..
char[] command = FOO;
..
char[] use = command.dup; // don't know why, but someone is going to do it
..
func(use.ptr); // now what??
..
<\code>
But that's simply a bug in the code, not in the "header". What if "use" included the zero terminator, what would "use ~= "t";" result in? A "t" after the zero terminator? So, either the append code would have to strip zero terminators before appending (sure hope it won't), or the length property should not include the zero, which is exactly what's being done now.

L.
May 24 2006
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Wed, 24 May 2006 14:15:43 -0700, BCS wrote:

 Walter Bright wrote:
 I'm not sure how useful this will be.
 
 Here it is: http://www.digitalmars.com/d/htod.html
One case that it doesn't handle but would seem to be simple is:

#define FOO "bar"
   |
   v
const char[] FOO = "bar\0"; //explicitly null terminate

But if that happened, things like concatenation would cause problems.

#define FOO "foo"
#define BAR "bar"
   |
   |
   V
const char[] FOO = "foo\0"; //explicitly null terminate
const char[] BAR = "bar\0"; //explicitly null terminate

. . .

  char[] x = FOO ~ BAR; // Now it has an embedded \0 !

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
25/05/2006 5:03:13 PM
May 25 2006
parent reply BCS <BCS_member pathlink.com> writes:
In article <1xu5l9je8ge3f.txbccq4m8i7m.dlg 40tude.net>, Derek Parnell says...
On Wed, 24 May 2006 14:15:43 -0700, BCS wrote:

[...]

But if that happened, things like concatenation would cause problems.

#define FOO "foo"
#define BAR "bar"
   |
   |
   V
const char[] FOO = "foo\0"; //explicitly null terminate
const char[] BAR = "bar\0"; //explicitly null terminate

. . .

  char[] x = FOO ~ BAR; // Now it has an embedded \0 !

-- 
I'll grant that. I think this will be another case of there being no 1-to-1 correspondence from C to D. I expect that char constants are more likely to get copied than concatenated. The workaround for either is easy as long as you remember it.

char[] foo = "bar", copy;
copy = foo ~ "\0"; // cat copies

char[] foo = "bar\0", bing = "baz\0", copy;
copy = foo[0..$-1] ~ bing; // cat copies
May 25 2006
parent reply Rémy Mouëza <Rémy_member pathlink.com> writes:
But if that happened, things like concatenation would cause problems.

#define FOO "foo"
#define BAR "bar"
   |
   |
   V
const char[] FOO = "foo\0"; //explicitly null terminate
const char[] BAR = "bar\0"; //explicitly null terminate

. . .

  char[] x = FOO ~ BAR; // Now it has an embedded \0 !
I thought that D managed the "\0" characters (that was why we didn't have to put any in our char[] variables). Therefore I made the following test:

// constchar.d
import std.stdio ;

const char [] foo  = "foo\0" ;
typeof ( foo ) bar = "bar\0" ;

void main ()
{
    char [] test = foo ~ bar ;

    writefln ( test );
}

ray Moonraker:~/dee/tmp$ ./constchar
foobar

It seems that there is no problem. I may not have understood something.
May 27 2006
next sibling parent Lars Ivar Igesund <larsivar igesund.net> writes:
Rémy Mouëza wrote:

 import std.stdio ;
 
 const char [] foo  = "foo\0" ;
 typeof ( foo ) bar = "bar\0" ;
 
 void main ()
 {
 char [] test = foo ~ bar ;
 
 writefln ( test );
 }
No, there are no problems, but your string test has length 8 now (\0 is a non-printable character). -- Lars Ivar Igesund blog at http://larsivi.net DSource & #D: larsivi
May 27 2006
prev sibling parent reply Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Rémy Mouëza wrote:
But if that happened, things like concatenation would cause problems.

#define FOO "foo"
#define BAR "bar"
  |
  |
  V
const char[] FOO = "foo\0"; //explicitly null terminate
const char[] BAR = "bar\0"; //explicitly null terminate

. . .

 char[] x = FOO ~ BAR; // Now it has an embedded \0 !
I thought that D managed the "\0" characters (that was why we didn't have to put any in our char[] variables). Therefore I made the following test:

// constchar.d
import std.stdio ;

const char [] foo  = "foo\0" ;
typeof ( foo ) bar = "bar\0" ;

void main ()
{
    char [] test = foo ~ bar ;

    writefln ( test );
}

ray Moonraker:~/dee/tmp$ ./constchar
foobar

It seems that there is no problem. I may not have understood something.
With DMD 0.158 on Windows, I got "foo bar" from this. Note the space between. Also note that writef is written in D, for D, and therefore doesn't use null characters as end-of-string (it relies on the .length property instead). The big problem had to do with sending these strings to C code, which /will/ see the null's as an end-of-array marker. Try replacing your writefln with a printf, and you will only get "foo" printed to the screen. -- Chris Nicholson-Sauls
May 27 2006
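The difference between what D's .length reports and what C code sees can be sketched from the C side. This is a minimal illustration, assuming the concatenated D array holds exactly the eight bytes shown; the buffer and function names are made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* The bytes that "foo\0" ~ "bar\0" would hold: eight chars, two of them NUL. */
static const char joined[8] = { 'f', 'o', 'o', '\0', 'b', 'a', 'r', '\0' };

/* Length as C string functions see it: counting stops at the first NUL. */
static size_t c_visible_length(void)
{
    return strlen(joined);
}

/* Length an array with an explicit size reports: every byte counts. */
static size_t array_length(void)
{
    return sizeof joined;
}

/* Formats the buffer the way printf("%s", ...) would and returns the result. */
static const char *c_visible_text(void)
{
    static char out[16];
    snprintf(out, sizeof out, "%s", joined);
    return out;
}
```

Here the C functions stop after "foo" while the array itself is eight bytes long, which matches the length of 8 reported earlier in the thread.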
parent Rémy Mouëza <Rémy_member pathlink.com> writes:
In article <e5a6dp$ols$1 digitaldaemon.com>, Chris Nicholson-Sauls says...
With DMD 0.158 on Windows, I got "foo bar" from this. Note the space between. Also note that writef is written in D, for D, and therefore doesn't use null characters as end-of-string (it relies on the .length property instead). The big problem had to do with sending these strings to C code, which /will/ see the null's as an end-of-array marker. Try replacing your writefln with a printf, and you will only get "foo" printed to the screen.
Using printf with dmd 0.158 on Linux, I've got 5 spaces and then "foo". I guess that even a restricted use of the C preprocessor for our D interfaces won't solve the problem since it seems to be type specific, will it ?
May 28 2006