digitalmars.D.learn - extern (C) linkage problem

Charles Hixson <charleshixsn earthlink.net> writes:
I'm trying to link a C routine to a D program, passing string 
parameters, but I keep getting segmentation errors.
As you can see, these are simple test routines, so the names don't 
reflect current status, but merely where I intend to arrive...but I've 
hit severe roadblocks.
(FWIW, I've tried including -fpic in the gcc command, and it didn't 
appear to make any difference.)

Makefile:
biblio: biblio.d sqlitebase.o
	dmd	biblio.d sqlitebase.o -ofbiblio

sqlitebase.o: sqlitebase.c sqlitebase.h
	gcc -c sqlitebase.c

biblio.d:
import   std.stdio;

//extern (C)   void   dbdefine (char[] str);
extern (C)   void   dbdefine (char[] inStr, ref char[255] outStr);

void   main()
{  char[255]   retVal;
    char[]   msg   =   cast(char[])"Hello from C\0";
    dbdefine   (msg, retVal);
    writeln ("Hello, World");
}

sqlitebase.h:

//void   dbdefine (char str[]);
void   dbdefine (char inStr[], char outStr[255]);

sqlitebase.c:

#include   "sqlitebase.h"

//void   dbdefine (char str[])
void   dbdefine (char inStr[], char outStr[255])
{   //int   i   =   0;
    //while (str[i] != 0)   i++;
    //printStr   (i, str);
    //^^--segmentation fault--^^
    //   printf ("%s\n", str);
    //^^--warning: incompatible implicit declaration of built-in function ‘printf’--^^
    //int   i   =   str[0];
    //putchar(i);
    //^^--segmentation fault--^^
    int   i   =   -1;
    while (++i < 255)
    {   if (inStr[i] == 0)   break;
       outStr[i]   =   inStr[i];
    }

}
Jul 18 2010
bearophile <bearophileHUGS lycos.com> writes:
Charles Hixson:
 extern (C)   void   dbdefine (char[] inStr, ref char[255] outStr);
I think C and D char[] don't go well together. D arrays are 2-word long structs. Try again with something simpler, like a pointer and length.

Bye,
bearophile
Jul 18 2010
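A minimal C sketch of the mismatch described above, assuming the dynamic-array layout given in the D ABI documentation (a length word followed by a pointer). The names `d_slice` and `slice_copy` are hypothetical and appear nowhere in the thread; how a particular D compiler passes a slice to an `extern (C)` function is platform-dependent, so treat this as an illustration rather than a supported interface. The point is that C's `char inStr[]` parameter is just a `char *`, so it cannot stand in for a two-word D `char[]`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C mirror of a D char[]: two machine words, length first.
   This layout is an assumption taken from the D ABI description. */
struct d_slice {
    size_t      length;
    const char *ptr;
};

/* Copy a slice into a C buffer, truncating if needed and always
   writing a terminating NUL. Returns the number of bytes copied. */
size_t slice_copy(struct d_slice in, char *out, size_t out_cap)
{
    size_t n;

    assert(out != NULL && out_cap > 0);
    n = in.length < out_cap - 1 ? in.length : out_cap - 1;
    if (n > 0)
        memcpy(out, in.ptr, n);
    out[n] = '\0';
    return n;
}
```

Note that the slice carries its own length, so the copy never has to search for a terminator; a plain `char *` gives the C side no such information.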
Charles Hixson <charleshixsn earthlink.net> writes:
On 07/18/2010 01:56 PM, bearophile wrote:
 Charles Hixson:
 extern (C)   void   dbdefine (char[] inStr, ref char[255] outStr);
 I think C and D char[] don't go well together. D arrays are 2-word long
 structs. Try again with something simpler, like a pointer and length.
 Bye, bearophile
Thanks, changing everything to char pointers and passing, e.g., inStr.ptr worked.
Jul 18 2010
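For reference, a sketch of what the C side might look like after the switch to char pointers. This is not the poster's final code, which the thread does not show; the `OUT_CAP` constant and the added terminator are assumptions. The copy loop in the original post never wrote a terminating NUL, so this version reserves one byte for it. On the D side the matching declaration would presumably be `extern (C) void dbdefine(const(char)* inStr, char* outStr);`, called with `msg.ptr` and `retVal.ptr`.

```c
#include <assert.h>
#include <stddef.h>

#define OUT_CAP 255  /* size of the caller's buffer, as in the original post */

/* Copy inStr into outStr, stopping at the terminator or when the
   buffer is full, and always terminate the result. */
void dbdefine(const char *inStr, char *outStr)
{
    size_t i = 0;

    assert(inStr != NULL && outStr != NULL);
    while (i < OUT_CAP - 1 && inStr[i] != '\0') {
        outStr[i] = inStr[i];
        i++;
    }
    outStr[i] = '\0';
}
```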
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 18 Jul 2010 13:08:57 -0700, Charles Hixson wrote:

 I'm trying to link a C routine to a D program, passing string
 parameters, but I keep getting segmentation errors.
 [...]
Since bearophile already answered with a solution to your problem, I'll just
chime in with a few small tips (of which you may already be aware):

1. D string *literals* are already zero-terminated, so you don't need to add
the \0 character explicitly. Also, they cast implicitly to const(char)*, so if
your function doesn't change inStr, it's perfectly fine to do

   extern(C) void dbdefine (const char* inStr);
   dbdefine("Hello from C");

2. For D strings in general the \0 must be added, but this is very easy to
forget. Therefore, when passing strings to C functions I always use the
std.string.toStringz() function. It takes a D string, adds a \0 if necessary,
and returns a pointer to the first character.

   string s = getAStringFromSomewhere();
   dbdefine(toStringz(s));

-Lars
Jul 19 2010
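A rough C analogue of the guarantee that tip 2 relies on, assuming nothing about how Phobos implements `toStringz`. The helper name `to_stringz` is hypothetical. Lars notes that the real function adds the terminator only if necessary; this sketch always makes a copy, which is simpler but allocates every time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated, NUL-terminated copy
   of the first len bytes of data, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *to_stringz(const char *data, size_t len)
{
    char *z;

    assert(data != NULL || len == 0);
    z = (char *)malloc(len + 1);
    if (z == NULL)
        return NULL;
    if (len > 0)
        memcpy(z, data, len);
    z[len] = '\0';
    return z;
}
```

Without the extra byte, a C function such as strlen would keep reading past the end of the data until it happened to find a zero.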
bearophile <bearophileHUGS lycos.com> writes:
Lars T. Kyllingstad:
 2. For D strings in general the \0 must be added, but this is very easy 
 to forget.  Therefore, when passing strings to C functions I always use 
 the std.string.toStringz() function.  It takes a D string, adds a \0 if 
 necessary, and returns a pointer to the first character.
 
   string s = getAStringFromSomewhere();
   dbdefine(toStringz(s));
The C code has to use those string pointers with a lot of care.

The D type system can help you remember to use toStringz. This is just an idea:

import std.string: toStringz;

typedef char* Cstring;

extern(C) int strcmp(Cstring s1, Cstring s2); // strcmp returns an int

Cstring toCString(T)(T[] s) {
    return cast(Cstring)toStringz(s);
}

void main() {
    auto s1 = "abba";
    auto s2 = "red";
    // auto r = strcmp(toCString(s1), s2); // compile error
    auto r = strcmp(toCString(s1), toCString(s2)); // OK
}

Unfortunately Andrei has killed the useful typedef, so you have to use a struct with alias this, which often doesn't work.

Bye,
bearophile
Jul 19 2010
bearophile <bearophileHUGS lycos.com> writes:
 typedef char* Cstring;
 extern(C) int strcmp(Cstring s1, Cstring s2);
 ...
You can use just a struct too:

import std.string: toStringz;

struct Cstring {
    const(char)* ptr;
}

extern(C) int strcmp(Cstring s1, Cstring s2); // strcmp returns an int

Cstring toCString(T)(T[] s) {
    return Cstring(toStringz(s));
}

void main() {
    auto s1 = "abba";
    auto s2 = "red";
    // auto r = strcmp(toCString(s1), s2); // compile error
    auto r = strcmp(toCString(s1), toCString(s2)); // OK
}

Bye,
bearophile
Jul 19 2010
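The same wrapper idea can be sketched on the C side, since C also treats a struct as a type distinct from the pointer it holds. The names `cstring`, `cstring_from_terminated` and `cstring_cmp` are hypothetical and not part of any library; the sketch only shows that a bare `const char *` is rejected where the wrapper type is required.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper type: a distinct struct, so a bare const char *
   is not accepted where a cstring is required. */
typedef struct {
    const char *ptr;
} cstring;

/* The one place where a raw pointer becomes a cstring. The caller
   promises that s is NUL-terminated. */
cstring cstring_from_terminated(const char *s)
{
    cstring c;

    assert(s != NULL);
    c.ptr = s;
    return c;
}

/* Thin wrapper over strcmp; like strcmp, it returns an int. */
int cstring_cmp(cstring a, cstring b)
{
    return strcmp(a.ptr, b.ptr);
}
```

A call such as `cstring_cmp(a, "red")` does not compile, which is the reminder the wrapper is meant to provide.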
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 19 Jul 2010 18:01:25 -0400, bearophile wrote:

 You can use just a struct too: [...]
Good point. Actually, I think there should be a CString type in Phobos, but I think it should wrap a ubyte*, not a char*. The reason for this is that D's char is supposed to be a UTF-8 code unit, whereas C's char can be anything. -Lars
Jul 20 2010
bearophile <bearophileHUGS lycos.com> writes:
In that code, for further safety, I'd like to reject (without a cast) calls like
this one, where toStringz never gets called:
strcmp(Cstring(s1.ptr), Cstring(s2.ptr));

So I think this code is a bit better:

import std.string: toStringz;

struct Cstring {
    const(char)* ptr; // const(ubyte)* ?
    static Cstring opCall(string s) {
        Cstring cs;
        cs.ptr = toStringz(s);
        return cs;
    }
}

extern(C) int strcmp(Cstring s1, Cstring s2); // strcmp returns an int

void main() {
    auto s1 = "abba";
    auto s2 = "red";
    auto r2 = strcmp(Cstring(s1), Cstring(s2));
}

Lars T. Kyllingstad:

 but I think it should wrap a ubyte*, not a char*.  The reason for this is 
 that D's char is supposed to be a UTF-8 code unit, whereas C's char can 
 be anything.
Right. But toStringz() returns a const(char)*, so do you want to change toStringz() first? Bye, bearophile
Jul 20 2010
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Tue, 20 Jul 2010 05:10:47 -0400, bearophile wrote:

 [...]
 Right. But toStringz() returns a const(char)*, so do you want to change
 toStringz() first?
Yes. I think we should stop using char* when interfacing with C code
altogether. The "right" thing to do, if you can call it that, would be to use
char* only if you KNOW the C function expects text input encoded as UTF-8 (or
just plain ASCII), and ubyte* for other encodings and non-textual data.

But this rule requires knowledge of what each function does with its input and
must hence be applied on a case-by-case basis, which makes automated
translation of C headers to D difficult. So I say make it simple, don't assume
that your C functions handle UTF-8, and use ubyte* everywhere.

(Actually, it's not that simple, either. I just remembered that C's char is
sometimes signed, sometimes unsigned...)

Maybe this should be discussed on the main NG. It's been bothering me for a
while. I think I'll start a topic on it later.

-Lars
Jul 20 2010
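The signedness caveat in the last post can be made concrete with a short C sketch. The function names are hypothetical. It assumes only what the C standard provides: `CHAR_MIN` from `<limits.h>` is negative exactly when plain `char` is signed, and reading bytes through `unsigned char` gives the same values on every platform.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* 1 if plain char is signed on this platform, 0 otherwise. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Count bytes with the high bit set, i.e. bytes outside 7-bit ASCII.
   Reading through unsigned char gives the same answer whether plain
   char is signed or not. */
size_t count_non_ascii(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    size_t i, n = 0;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (p[i] >= 0x80)
            n++;
    return n;
}
```

Comparing a plain `char` against 0x80 directly would never succeed where `char` is signed, which is one reason byte-oriented C interfaces are often written in terms of `unsigned char`.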