digitalmars.D.learn - extern __gshared const(char)* symbol fails

reply James Blachly <james.blachly gmail.com> writes:
Hi all,

I am linking to a C library which defines a symbol,

const char seq_nt16_str[] = "=ACMGRSVTWYHKDBN";

In the C sources, this is an array of 16 bytes (17 I guess, 
because it is written as a string).

In the C headers, it is listed as extern const char 
seq_nt16_str[];

When linking to this library from another C program, I am able to 
treat seq_nt16_str as any other array, and being defined as [] 
fundamentally it is a pointer.

When linking to this library from D, I have declared it as:

extern __gshared const(char)* seq_nt16_str;

***But this segfaults when I treat it like an array (e.g. by 
accessing members by index).***

Because I know the length, I can instead declare:

extern __gshared const(char)[16] seq_nt16_str;

My question is: Why can I not treat it opaquely and use it 
declared as char* ? Does this have anything to do with it being a 
global stored in the static data segment?
Aug 30 2018
next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Friday, 31 August 2018 at 06:20:09 UTC, James Blachly wrote:
 Hi all,

 I am linking to a C library which defines a symbol,

 const char seq_nt16_str[] = "=ACMGRSVTWYHKDBN";

 In the C sources, this is an array of 16 bytes (17 I guess, 
 because it is written as a string).

 In the C headers, it is listed as extern const char 
 seq_nt16_str[];

 When linking to this library from another C program, I am able 
 to treat seq_nt16_str as any other array, and being defined as 
 [] fundamentally it is a pointer.

 When linking to this library from D, I have declared it as:

 extern __gshared const(char)* seq_nt16_str;

 ***But this segfaults when I treat it like an array (e.g. by 
 accessing members by index).***
I believe this should be extern extern(C)? I'm surprised that this segfaults rather than having a link error.

A bare `extern` means "this symbol is defined somewhere else".

`extern(C)` means "this symbol should have C linkage".

When I try it with just `extern`, I see a link error:

scratch.o: In function `_Dmain':
scratch.d:(.text._Dmain[_Dmain]+0x7): undefined reference to `_D7scratch5cdataPa'
collect2: error: ld returned 1 exit status
Error: linker exited with status 1
Aug 31 2018
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 8/31/18 1:18 PM, Neia Neutuladh wrote:
 On Friday, 31 August 2018 at 06:20:09 UTC, James Blachly wrote:
 Hi all,

 I am linking to a C library which defines a symbol,

 const char seq_nt16_str[] = "=ACMGRSVTWYHKDBN";

 In the C sources, this is an array of 16 bytes (17 I guess, because it 
 is written as a string).

 In the C headers, it is listed as extern const char seq_nt16_str[];

 When linking to this library from another C program, I am able to 
 treat seq_nt16_str as any other array, and being defined as [] 
 fundamentally it is a pointer.

 When linking to this library from D, I have declared it as:

 extern __gshared const(char)* seq_nt16_str;

 ***But this segfaults when I treat it like an array (e.g. by accessing 
 members by index).***
I believe this should be extern extern(C)? I'm surprised that this segfaults rather than having a link error.
Yeah, I had to add extern(C) in my tests to get it to link. I think he must have extern(C): somewhere above.

-Steve
Aug 31 2018
prev sibling parent reply James Blachly <james.blachly gmail.com> writes:
On Friday, 31 August 2018 at 17:18:58 UTC, Neia Neutuladh wrote:
 On Friday, 31 August 2018 at 06:20:09 UTC, James Blachly wrote:
 Hi all,

 ...

 When linking to this library from D, I have declared it as:

 extern __gshared const(char)* seq_nt16_str;

 ***But this segfaults when I treat it like an array (e.g. by 
 accessing members by index).***
I believe this should be extern extern(C)? I'm surprised that this segfaults rather than having a link error. A bare `extern` means "this symbol is defined somewhere else". `extern(C)` means "this symbol should have C linkage".
I am so sorry -- I should have been more clear that this is in the context of a large header-to-D translation .d file, so the whole thing is wrapped in extern(C) via an extern(C): at the top of the file.
Aug 31 2018
parent Laeeth Isharc <Laeeth laeeth.com> writes:
On Friday, 31 August 2018 at 18:49:26 UTC, James Blachly wrote:
 On Friday, 31 August 2018 at 17:18:58 UTC, Neia Neutuladh wrote:
 On Friday, 31 August 2018 at 06:20:09 UTC, James Blachly wrote:
 Hi all,

 ...

 When linking to this library from D, I have declared it as:

 extern __gshared const(char)* seq_nt16_str;

 ***But this segfaults when I treat it like an array (e.g. by 
 accessing members by index).***
I believe this should be extern extern(C)? I'm surprised that this segfaults rather than having a link error. A bare `extern` means "this symbol is defined somewhere else". `extern(C)` means "this symbol should have C linkage".
I am so sorry -- I should have been more clear that this is in the context of a large header-to-D translation .d file, so the whole thing is wrapped in extern(C) via an extern(C): at the top of the file.
In case you weren't aware of it, take a look at atilaneves' DPP on GitHub or code.dlang.org. It automatically translates C headers at build time, and mostly it just works. If it doesn't, file an issue and in time it will be fixed.
Sep 02 2018
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 8/31/18 2:20 AM, James Blachly wrote:
 Hi all,
 
 I am linking to a C library which defines a symbol,
 
 const char seq_nt16_str[] = "=ACMGRSVTWYHKDBN";
 
 In the C sources, this is an array of 16 bytes (17 I guess, because it 
 is written as a string).
 
 In the C headers, it is listed as extern const char seq_nt16_str[];
 
 When linking to this library from another C program, I am able to treat 
 seq_nt16_str as any other array, and being defined as [] fundamentally 
 it is a pointer.
 
 When linking to this library from D, I have declared it as:
 
 extern __gshared const(char)* seq_nt16_str;
 
 ***But this segfaults when I treat it like an array (e.g. by accessing 
 members by index).***
 
 Because I know the length, I can instead declare:
 
 extern __gshared const(char)[16] seq_nt16_str;
 
 My question is: Why can I not treat it opaquely and use it declared as 
 char* ? Does this have anything to do with it being a global stored in 
 the static data segment?
 
What the C compiler is doing is storing it as data, and then storing the symbol to point at the first element in the data.

When you use const char* in D, it's expecting a *pointer* to be stored at that address, not the data itself. So using it means segfault. The static array is the correct translation, even though it leaks implementation details.

In C, it's working because C has the notion of a symbol being where an array starts. D has no concept of a C array like that, every array must have a length. So there is no equivalent you can use in D -- you have to supply the length.

Alternatively, you can treat it as a const char:

extern(C) extern const(char) seq_nt16_str;

void main()
{
    import core.stdc.stdio;
    printf("%s\n", &seq_nt16_str); // should print the string
}

You could wrap it like this:

pragma(mangle, "seq_nt16_str");
private extern(C) extern const(char) _seq_nt16_str_STORAGE;

@property const(char)* seq_nt16_str()
{
    return &_seq_nt16_str_STORAGE;
}

To make the code look similar.

-Steve
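The mismatch described above can be observed without crashing anything. The following C sketch is not from the thread: `bytes_misread_as_pointer` and `real_address` are hypothetical helper names, and the array is redefined locally as a stand-in for the library's symbol. It shows that a pointer-typed declaration would load the array's first pointer-sized bytes (plain ASCII text) and treat them as an address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the library's definition quoted in the thread. */
const char seq_nt16_str[] = "=ACMGRSVTWYHKDBN";

/* Hypothetical helper: the value a pointer-typed declaration of the symbol
   would load, i.e. the array's first pointer-sized bytes reinterpreted as
   an address. The result is only inspected, never dereferenced. */
uintptr_t bytes_misread_as_pointer(void)
{
    uintptr_t bits = 0;
    /* The array (17 bytes including the NUL) must be at least pointer-sized. */
    assert(sizeof bits <= sizeof seq_nt16_str);
    memcpy(&bits, seq_nt16_str, sizeof bits);
    return bits;
}

/* Hypothetical helper: what an array declaration yields after decay,
   namely the address of the first element. */
uintptr_t real_address(void)
{
    return (uintptr_t)seq_nt16_str;
}
```

Printing both values side by side makes the point: the first is the text of the table masquerading as an address, the second is where the table actually lives. Indexing through the first is what faults.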
Aug 31 2018
next sibling parent James Blachly <james.blachly gmail.com> writes:
On Friday, 31 August 2018 at 17:50:17 UTC, Steven Schveighoffer 
wrote:
 What the C compiler is doing is storing it as data, and then 
 storing the symbol to point at the first element in the data.

 When you use const char* in D, it's expecting a *pointer* to be 
 stored at that address, not the data itself. So using it means 
 segfault. The static array is the correct translation, even 
 though it leaks implementation details.

 In C, it's working because C has the notion of a symbol being 
 where an array starts. D has no concept of a C array like that, 
 every array must have a length. So there is no equivalent you 
 can use in D -- you have to supply the length.
nkm1 also wrote:
 You need to declare your extern array as array in D and also in 
 C, so that the compiler would know what that is (an array, not 
 a pointer). In many situations C compiler would silently 
 convert an array into a pointer (when it already knows its 
 dealing with array), but it won't convert a pointer into an 
 array.
Thank you Steve and nkm1 for your very clear and concise answers. This makes perfect sense. I would rather not declare it as a static array in D, because I cannot guarantee that a future version of the library I am linking to will keep the same data length. Steve's trick, below, looks like the ticket.
 Alternatively, you can treat it as a const char:

 extern(C) extern const(char) seq_nt16_str;

 void main()
 {
    import core.stdc.stdio;
    printf("%s\n", &seq_nt16_str); // should print the string
 }

 You could wrap it like this:

 pragma(mangle, "seq_nt16_str");
 private extern(C) extern const(char) _seq_nt16_str_STORAGE;

 @property const(char)* seq_nt16_str()
 {
    return &_seq_nt16_str_STORAGE;
 }

 To make the code look similar.

 -Steve
That is a great trick, and I will use it.
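
Because the C definition is written as a string literal with `[]`, the data ends with a NUL byte, so code that holds only a pointer to the first character can recover the length at run time instead of hard-coding 16. A C sketch of that idea follows. It assumes the library keeps defining the table as a NUL-terminated string; `seq_table_length` and `seq_table_lookup` are hypothetical names, and the array is a local stand-in for the library's symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the library's definition quoted in the thread. */
const char seq_nt16_str[] = "=ACMGRSVTWYHKDBN";

/* Hypothetical helper: length of the table, found via the terminating NUL.
   Only valid while the library defines the table as a string. */
size_t seq_table_length(const char *first)
{
    assert(first != NULL);
    return strlen(first);
}

/* Hypothetical helper: bounds-checked lookup; returns '\0' when the code
   is past the end of the table. */
char seq_table_lookup(const char *first, size_t code)
{
    return code < seq_table_length(first) ? first[code] : '\0';
}
```

With the table as defined here, `seq_table_length(seq_nt16_str)` is 16 and an out-of-range code yields `'\0'` rather than reading past the data.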
Aug 31 2018
prev sibling parent Edgar Huckert <edgar.huckert huckert.com> writes:
On Friday, 31 August 2018 at 17:50:17 UTC, Steven Schveighoffer 
wrote:
...
 When you use const char* in D, it's expecting a *pointer* to be 
 stored at that address, not the data itself. So using it means 
 segfault. The static array is the correct translation, even 
 though it leaks implementation details.

 In C, it's working because C has the notion of a symbol being 
 where an array starts. D has no concept of a C array like that, 
 every array must have a length. So there is no equivalent you 
 can use in D -- you have to supply the length.
I think this is only correct for dynamic arrays. For static arrays I have the impression that it works exactly as in C, i.e. the address of the array is the address of the first array element. See this simple code:

import std.stdio;
import std.array;

void main()
{
    // static array
    ulong [4] ulArr1 = [0,1,2,3];
    ulong *p1 = ulArr1.ptr;
    ulong *p2 = &(ulArr1[0]);
    ulong [4] *p3 = &ulArr1;
    writeln("same pointers: ", cast(void *)p1 == cast(void *)p2);
    writeln("same pointers: ", cast(void *)p3 == cast(void *)p2);
    writeln("");
    // dynamic array
    ulong [] ulArr2 = [0,1,2,3];
    p1 = ulArr2.ptr;
    p2 = &(ulArr2[0]);
    ulong [] *p5 = &ulArr2;
    writeln("same pointers: ", cast(void *)p1 == cast(void *)p2);
    writeln("same pointers: ", cast(void *)p5 == cast(void *)p2);
} // end main()

This produces (with dmd):

same pointers: true
same pointers: true

same pointers: true
same pointers: false
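
For comparison with the C side of this thread, here is a sketch of the same address checks in C. The helper names are made up for illustration. A C array behaves like the D static array: the array object and its first element share an address. A pointer variable, like a D dynamic array, is a separate object that merely refers to the elements.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: does the array object start at its first element? */
bool array_starts_at_first_element(void)
{
    unsigned long arr[4] = {0, 1, 2, 3};
    return (void *)&arr == (void *)&arr[0];
}

/* Hypothetical helper: is a pointer variable stored apart from its target? */
bool pointer_has_own_storage(void)
{
    unsigned long arr[4] = {0, 1, 2, 3};
    unsigned long *p = arr;          /* the array decays to &arr[0] */
    assert(p == &arr[0]);
    return (void *)&p != (void *)p;  /* &p is where p lives, not where it points */
}
```

Both helpers return true. The second one is the situation a `const(char)*` declaration describes: a pointer-sized object stored at the symbol's address, which is not what the C library placed there.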
Sep 01 2018
prev sibling parent nkm1 <t4nk074 openmailbox.org> writes:
On Friday, 31 August 2018 at 06:20:09 UTC, James Blachly wrote:
 Hi all,

 I am linking to a C library which defines a symbol,

 const char seq_nt16_str[] = "=ACMGRSVTWYHKDBN";

 In the C sources, this is an array of 16 bytes (17 I guess, 
 because it is written as a string).

 In the C headers, it is listed as extern const char 
 seq_nt16_str[];

 When linking to this library from another C program, I am able 
 to treat seq_nt16_str as any other array, and being defined as 
 [] fundamentally it is a pointer.
No. This is a misconception. Fundamentally, it's an array.
 When linking to this library from D, I have declared it as:

 extern __gshared const(char)* seq_nt16_str;

 ***But this segfaults when I treat it like an array (e.g. by 
 accessing members by index).***

 Because I know the length, I can instead declare:

 extern __gshared const(char)[16] seq_nt16_str;

 My question is: Why can I not treat it opaquely and use it 
 declared as char* ? Does this have anything to do with it being 
 a global stored in the static data segment?
For the same reason you can't do it in C.

--- main.c ---
#include <stdio.h>

extern const char* array; /* then try array[] */

int main(void)
{
    printf("%.5s\n", array);
    return 0;
}

--- lib.c ---
const char array[] = "hello world";

Segmentation fault

You need to declare your extern array as an array in D and also in C, so that the compiler knows what it is (an array, not a pointer). In many situations the C compiler will silently convert an array into a pointer (when it already knows it's dealing with an array), but it won't convert a pointer into an array.
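
The one-way nature of that conversion shows up in `sizeof`. In this single-file C sketch (the helper names are illustrative, and the array repeats the lib.c definition), the array type knows its full size, while a pointer obtained by decay only knows the size of a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Same definition as lib.c in the example, placed in one file. */
const char array[] = "hello world";

/* Hypothetical helper: size as seen through the array type. */
size_t size_seen_as_array(void)
{
    return sizeof array;   /* 11 characters plus the terminating NUL */
}

/* Hypothetical helper: size as seen after array-to-pointer conversion. */
size_t size_seen_after_decay(void)
{
    const char *p = array; /* implicit conversion, array to pointer */
    assert(p == &array[0]);
    return sizeof p;       /* size of the pointer, not of the data */
}
```

The first helper returns 12; the second returns `sizeof(const char *)`, whatever that is on the target. Nothing in the pointer's type lets the compiler get back to 12, which is why the declaration has to say "array" up front.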
Aug 31 2018