
digitalmars.D.learn - pointers, assignments, Garbage Collection Oh My?

reply "JohnnyK" <johnnykinsey comcast.net> writes:
I hope you like the subject matter and I hope it is not too 
simplistic or has been answered before.
   Anyway, I have a question about how the garbage collector works 
in a very specific situation: passing a string to a function in a 
shared library or DLL, assigning it to a variable of type string 
inside the function, and returning that internal string. Such as 
this:

export string mytest(string tstStr)
{
   string st = tstStr;
   /* abbreviated to protect the innocent but other operations
      such as concatenating and deleting may be done to st before
      the return
   */
   return st;
}

Is the string type a pointer or is it something else?  In the 
line where tstStr is assigned to st does it copy the address in 
tstStr to st or does it copy the value in tstStr?  I am just a 
bit confused about string types since I come from a C background 
and C has no type like this.  Also what is returned by this 
function?  Does this function return a pointer or the contents of 
an array?  If I do export this what does it do to the Garbage 
Collection?  Does the Garbage Collection collect tstStr or st?  
Also notice the comment in the function.
Jul 10 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-07-10 19:18, JohnnyK wrote:
 I hope you like the subject matter and I hope it is not too simplistic
 or have been answered before.
    Anyway I have a question about how the garbage collector works in a
 very specific situation.  When passing string type to a function in a
 shared library or DLL and assigning it to a variable of type string
 inside the function and returning the internal string.  Such as this.

 export string mytest(string tstStr)
 {
    string st = tstStr;
    /* abbreviated to protect the innocent but other operations
       such as concatenating and deleting may be done to st before the
 return
    */
    return st;
 }

 Is the string type a pointer or is it something else?  In the line where
 tstStr is assigned to st does it copy the address in tstStr to st or
 does it copy the value in tstStr?  I am just a bit confused about string
 types since I come from a C background and C has no type like this.
 Also what is returned by this function?  Does this function return a
 pointer or the contents of an array?  If I do export this what does it
 do to the Garbage Collection?  Does the Garbage Collection collect
 tstStr or st? Also notice the comment in the function.
A string in D, and all arrays, is a struct looking like this:

struct Array (T)
{
    T* ptr;
    size_t length;
}

"string" happens to be an alias looking like this:

alias immutable(char)[] string;

That means you can do all operations on a string except assigning to
specific elements:

string str = "foo";
str ~= "bar"; // ok
auto str2 = str ~ "foo"; // ok
assert(str[0] == 'f'); // ok, reading an element
str[0] = 'a'; // error, cannot assign to immutable
auto str3 = str[1 .. 3]; // ok

The GC won't collect anything as long as it is in scope. If a variable
goes out of scope and nothing else points to that data, the GC will
collect it (eventually).

Hope this helps a bit.

-- 
/Jacob Carlborg
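To make the "address or value?" part of the question concrete, here is a minimal sketch. The helper name sameSlice is made up for illustration; the point is only that assigning one string to another copies the pointer and the length, not the characters.

```d
import std.stdio;

// Hypothetical helper: true when two slices refer to the very same
// memory (same start address, same length). No characters are compared.
bool sameSlice(string a, string b)
{
    return a.ptr is b.ptr && a.length == b.length;
}

void main()
{
    string tstStr = "hello";
    string st = tstStr;              // copies ptr + length only

    assert(sameSlice(st, tstStr));   // both views share the characters

    // A slice is two machine words, however long the string is.
    static assert(string.sizeof == 2 * size_t.sizeof);

    writeln("st.ptr is tstStr.ptr: ", st.ptr is tstStr.ptr);
}
```

So the assignment in mytest is cheap: it is roughly what copying a (char*, size_t) pair would be in C.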
Jul 10 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
 A string in D, and all arrays, is a struct looking like this:

 struct Array (T)
 {
     T* ptr;
     size_t length;
 }
I always thought it looks like this:

struct Array(T)
{
    T* ptr;
    size_t length, capacity;
}
Jul 10 2013
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Jul 10, 2013, at 10:45 AM, Namespace <rswhite4 googlemail.com> wrote:

 A string in D, and all arrays, is a struct looking like this:
 struct Array (T)
 {
    T* ptr;
    size_t length;
 }
 I always thought it looks like this:

 struct Array(T)
 {
     T* ptr;
     size_t length, capacity;
 }

Sadly, no. The only way to determine the capacity of an array is to query the GC.
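A small sketch of what "query the GC" looks like from user code, assuming the built-in .capacity property and the reserve function from druntime's object module. The variable names are arbitrary.

```d
import std.stdio;

void main()
{
    int[] arr;
    arr.reserve(16);             // ask the runtime for room for at least 16 ints
    assert(arr.length == 0);
    assert(arr.capacity >= 16);  // looked up by the runtime, not stored in the slice

    int[] lit = [1, 2, 3];
    int[] head = lit[0 .. 2];    // stops before the end of the used data
    assert(head.capacity == 0);  // 0 means: appending here would reallocate

    writeln("capacity after reserve: ", arr.capacity);
}
```

The slice itself still holds only a pointer and a length; .capacity is answered by asking the runtime about the memory block the slice points into.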
Jul 10 2013
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 07/10/2013 11:10 AM, Sean Kelly wrote:

 On Jul 10, 2013, at 10:45 AM, Namespace <rswhite4 googlemail.com> wrote:

 A string in D, and all arrays, is a struct looking like this:

 struct Array (T)
 {
     T* ptr;
     size_t length;
 }
 I always thought it looks like this:

 struct Array(T)
 {
     T* ptr;
     size_t length, capacity;
 }

Sadly, no. The only way to determine the capacity of an array is to
query the GC.

And to be pedantic, length comes first:

struct Array (T)
{
     size_t length;
     T* ptr;
}

Which is actually property-like because assigning to length does pretty 
complex stuff. So the member cannot be named as 'length':

struct Array (T)
{
     size_t length_;
     T* ptr;
}

Anyway... :)

Ali
Jul 10 2013
next sibling parent reply "JohnnyK" <johnnykinsey comcast.net> writes:
On Wednesday, 10 July 2013 at 18:22:24 UTC, Ali Çehreli wrote:
 On 07/10/2013 11:10 AM, Sean Kelly wrote:

 On Jul 10, 2013, at 10:45 AM, Namespace
<rswhite4 googlemail.com> wrote:
 A string in D, and all arrays, is a struct looking like
this:
 struct Array (T)
 {
     T* ptr;
     size_t length;
 }
 I always thought it looks like this:

 struct Array(T)
 {
     T* ptr;
     size_t length, capacity;
 }

Sadly, no. The only way to determine the capacity of an
array is to query the GC.

 And to be pedantic, length comes first:

 struct Array (T)
 {
     size_t length;
     T* ptr;
 }

 Which is actually property-like because assigning to length 
 does pretty complex stuff. So the member cannot be named as 
 'length':

 struct Array (T)
 {
     size_t length_;
     T* ptr;
 }

 Anyway... :)

 Ali
Reminds me of how Delphi (aka Pascal) strings work. Thanks, everyone, this answers some of my questions. Now what about when the return type of a function is a string? Is D returning a pointer to the string structure, or is it returning the structure itself?
Jul 10 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jul 10, 2013 at 08:38:40PM +0200, JohnnyK wrote:
[...]
 Reminds me of how Delphi (aka Pascal) strings are work.  Thanks
 everyone this answers some of my questions.  Now what about when the
 return type of a function is a string?  Is D returning the pointer
 to the string structure or is it returning the structure?
D structs are always passed by value, so it's the length+pointer pair
that's being returned. There's no extra indirection involved here. What
it points to stays where it is on the GC heap. So for example:

int[] data = [1,2,3];

int[] f() {
    return data;
}

void main() {
    auto arr = f(); // returns copy of data's ptr+length
    arr.length--; // modifies this copy
    assert(arr == [1,2]);
    assert(data == [1,2,3]); // data itself hasn't changed
}

Does this help?

T

-- 
Be in denial for long enough, and one day you'll deny yourself of things you wish you hadn't.
Jul 10 2013
parent "JohnnyK" <johnnykinsey comcast.net> writes:
On Wednesday, 10 July 2013 at 18:45:56 UTC, H. S. Teoh wrote:
 On Wed, Jul 10, 2013 at 08:38:40PM +0200, JohnnyK wrote:
 [...]
 Reminds me of how Delphi (aka Pascal) strings are work.  Thanks
 everyone this answers some of my questions.  Now what about 
 when the
 return type of a function is a string?  Is D returning the 
 pointer
 to the string structure or is it returning the structure?
 D structs are always passed by value, so it's the length+pointer pair
 that's being returned. There's no extra indirection involved here. What
 it points to stays where it is on the GC heap. So for example:

 int[] data = [1,2,3];

 int[] f() {
     return data;
 }

 void main() {
     auto arr = f(); // returns copy of data's ptr+length
     arr.length--; // modifies this copy
     assert(arr == [1,2]);
     assert(data == [1,2,3]); // data itself hasn't changed
 }

 Does this help?

I understand thanks for the information.
Jul 10 2013
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, July 10, 2013 20:38:40 JohnnyK wrote:
 Reminds me of how Delphi (aka Pascal) strings are work. Thanks
 everyone this answers some of my questions. Now what about when
 the return type of a function is a string? Is D returning the
 pointer to the string structure or is it returning the structure?
Absolutely everywhere that you see an array (or string) in code, it's really the struct underneath the hood. In no case is just a pointer passed, since a pointer would not be an array. To get a pointer to the array's data, you have to use its ptr property. - Jonathan M Davis
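For someone coming from C, a short sketch of where the ptr property matters in practice: handing a D string to a C function. This assumes the standard bindings core.stdc.stdio.printf and std.string.toStringz; the string contents are arbitrary.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

void main()
{
    string s = "hello";

    // D strings are not guaranteed to be zero-terminated,
    // so pass the length along with the pointer.
    printf("%.*s\n", cast(int) s.length, s.ptr);

    // Or make a zero-terminated version for C APIs that expect one.
    const(char)* z = s.toStringz;
    printf("%s\n", z);
}
```

The first call uses the two fields of the slice directly; the second is the safer habit when the C side insists on a terminating '\0'.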
Jul 10 2013
prev sibling next sibling parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Wednesday, 10 July 2013 at 18:22:24 UTC, Ali Çehreli wrote:
 And to be pedantic, length comes first:

 struct Array (T)
 {
     size_t length;
     T* ptr;
 }

 Which is actually property-like because assigning to length 
 does pretty complex stuff. So the member cannot be named as 
 'length':

 struct Array (T)
 {
     size_t length_;
     T* ptr;
 }

 Anyway... :)

 Ali
To be pedantic, dynamic arrays are implemented in D simply as

struct Array
{
    size_t length;
    void* ptr;
}

and there is no type parametrization, since the handling of such arrays is opaque to users (in druntime they are treated as void[]). Parametrization can be useful on the user side, since performing any operations with the structure above (void*) will lead to errors. But on the user side there is no point in manipulating the structure directly, as it can be done using the usual properties/druntime without template bloat. By the way, the way the ABI page is written allows an implementation to have more fields.
Jul 10 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-07-11 04:59, Maxim Fomin wrote:

 To be pedantic dynamic arrays are implemented in D simply as

 struct Array
 {
      size_t length;
      void* ptr;
 }

 and there is no type parametrization since such arrays handling is
 opaque for users (in druntime they are treated as void[]).
 Parametrization can be useful in user side since performing any
 operations with structure above (void*) will lead to errors. But in user
 side there is no point in manipulating the structure directly, as it can
 be done using usual properties/druntime without template bloat. By the
 way, the style ABI page is written, allows implementation to have more
 fields.
typeof("asd".ptr) gives back immutable(char)*, not immutable(void)*. So from a user point of view it's as if the array is templated. -- /Jacob Carlborg
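Both views can be checked from user code. The following is only a sketch: the static asserts show the typed view the compiler presents, and the void[] conversion gives a rough idea of the untyped view described for the runtime (it is not how druntime is actually written).

```d
void main()
{
    // The element type is part of the slice's static type.
    static assert(is(typeof("asd".ptr) == immutable(char)*));
    static assert(is(string == immutable(char)[]));

    int[] nums = [1, 2, 3];
    static assert(is(typeof(nums.ptr) == int*));

    // Any mutable slice converts implicitly to an untyped view;
    // its length is then counted in bytes rather than elements.
    void[] raw = nums;
    assert(raw.length == nums.length * int.sizeof);
    assert(raw.ptr is cast(void*) nums.ptr);
}
```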
Jul 11 2013
parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Thursday, 11 July 2013 at 07:13:50 UTC, Jacob Carlborg wrote:
 On 2013-07-11 04:59, Maxim Fomin wrote:

 To be pedantic dynamic arrays are implemented in D simply as

 struct Array
 {
     size_t length;
     void* ptr;
 }

 and there is no type parametrization since such arrays 
 handling is
 opaque for users (in druntime they are treated as void[]).
 Parametrization can be useful in user side since performing any
 operations with structure above (void*) will lead to errors. 
 But in user
 side there is no point in manipulating the structure directly, 
 as it can
 be done using usual properties/druntime without template 
 bloat. By the
 way, the style ABI page is written, allows implementation to 
 have more
 fields.
typeof("asd".ptr) gives back immutable(char)*, not immutable(void)*. So from a user point of view it's as if the array is templated.
That's on the user side. In druntime it is void[] + typeinfo. I am not aware of any part of dmd/druntime where arrays are represented as templates (or strongly typed) as depicted in this discussion. And the current treatment can barely be called templatization, as there are no templates, no template instantiations and none of the typical horrible mangling. A more precise description is not templatization but some kind of implicit conversion from an array of a specific type in source code to a void array plus typeinfo in the runtime library. If a user tries to use struct Array(T) {...} instead of the usual arrays, he will gain no benefit, only useless template bloat.
Jul 11 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-07-11 09:43, Maxim Fomin wrote:

 It's in the user side. In druntime it is void[] + typeinfo. I am not
 aware of any part in dmd/druntime where arrays are repsented as
 templates (or strongly typed) as depicted in this dicsussion. And
 current treatment can be barely called templatization as there is no
 templates, template instantiations and typicall horrible mangling at
 all. More precise description is not templatization but some kind of
 implicit conversion from array of specific type in source code to void
 array plus typeinfo in runtime library. If user tries to use struct
 Array(T) {...} instead of usual arrays he will gain no benefit but
 useless template bloat.
Yes, but that is easier to type. All the above or:

struct Array (T)
{
    size_t length;
    T* ptr;
}

-- 
/Jacob Carlborg
Jul 11 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-07-11 13:19, Jacob Carlborg wrote:

 Yes, but that is easier to type. All the above or:

 struct Array (T)
 {
      size_t length;
      T* ptr;
 }
"that" should have been "what". -- /Jacob Carlborg
Jul 11 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-07-10 20:22, Ali Çehreli wrote:

 And to be pedantic, length comes first:

 struct Array (T)
 {
      size_t length;
      T* ptr;
 }
I thought "ptr" came first, that's the reason you could cast to the pointer type. Not that one should do that. Perhaps there's some compiler/runtime magic involved. -- /Jacob Carlborg
Jul 11 2013
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 07/11/2013 12:23 AM, Jacob Carlborg wrote:

 On 2013-07-10 20:22, Ali Çehreli wrote:

 And to be pedantic, length comes first:

 struct Array (T)
 {
      size_t length;
      T* ptr;
 }
I thought "ptr" came first, that's the reason you could cast to the pointer type. Not that one should do that. Perhaps there's some compiler/runtime magic involved.
There must be a little magic, and that magic should be the same as getting the .ptr property. Otherwise, the "value" of a struct object cannot be cast to a pointer type:

struct S
{
    int *p;
}

auto s = S();
int *p = cast(int*)s;

Error: e2ir: cannot cast s of type S to type int*

Ali
Jul 11 2013
parent reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Thursday, 11 July 2013 at 17:07:18 UTC, Ali Çehreli wrote:
 On 07/11/2013 12:23 AM, Jacob Carlborg wrote:

 On 2013-07-10 20:22, Ali Çehreli wrote:

 And to be pedantic, length comes first:

 struct Array (T)
 {
      size_t length;
      T* ptr;
 }
I thought "ptr" came first, that's the reason you could cast
to the
 pointer type. Not that one should do that. Perhaps there's
some
 compiler/runtime magic involved.
 There must be little magic and that magic should be the same as getting the .ptr property. Otherwise, the "value" of a struct object cannot be casted to pointer type:

 struct S
 {
     int *p;
 }

 auto s = S();
 int *p = cast(int*)s;

 Error: e2ir: cannot cast s of type S to type int*

 Ali

In the context of slices, cast(int*)arr is essentially arr.ptr. There is no magic beyond accessing the right field of the struct.
Jul 11 2013
parent Ali Çehreli <acehreli yahoo.com> writes:
On 07/11/2013 10:20 AM, Maxim Fomin wrote:

 On Thursday, 11 July 2013 at 17:07:18 UTC, Ali Çehreli wrote:
 that magic should be the same as
 getting the .ptr property.
Yes.
 In context of slices cast(int*)arr is essentially s.ptr.
And yes. :) Ali
Jul 11 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jul 10, 2013 at 07:45:25PM +0200, Namespace wrote:
A string in D, and all arrays, is a struct looking like this:

struct Array (T)
{
    T* ptr;
    size_t length;
}
 I always thought it looks like this:

 struct Array(T)
 {
     T* ptr;
     size_t length, capacity;
 }

Nope, the capacity is an attribute of the GC memory that the array is pointing to, not the array "itself", which is merely a slice of this GC memory.

When you append to an array, basically what happens is the GC is asked "is this the end of the GC block and can we extend it please?" If it's not the end of the GC block, a new block is allocated; otherwise, it is extended, then the new data is written into it.

T

-- 
Those who've learned LaTeX swear by it. Those who are learning LaTeX swear at it. -- Pete Bleackley
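A rough sketch of the append behaviour described above. The variable names are arbitrary, and whether the second append extends in place is left to the runtime, so the example only prints that outcome instead of asserting it.

```d
import std.stdio;

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a[0 .. 2];       // shares memory with a, but stops before its end

    b ~= 99;                   // cannot grow in place without overwriting a[2]
    assert(b.ptr !is a.ptr);   // so b was moved to a fresh block
    assert(a == [1, 2, 3]);    // a is untouched
    assert(b == [1, 2, 99]);

    int[] c = a;               // c ends exactly where the used data ends
    auto before = c.ptr;
    c ~= 4;                    // may extend in place or reallocate
    assert(c == [1, 2, 3, 4]);
    assert(a == [1, 2, 3]);    // a keeps its own length, so it still sees 3 elements

    writeln("extended in place: ", c.ptr is before);
}
```

Either way, the slice that was not appended to never changes: it keeps its own pointer and length.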
Jul 10 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 10 July 2013 at 17:18:09 UTC, JohnnyK wrote:
 I hope you like the subject matter and I hope it is not too 
 simplistic or have been answered before.
   Anyway I have a question about how the garbage collector 
 works in a very specific situation.  When passing string type 
 to a function in a shared library or DLL and assigning it to a 
 variable of type string inside the function and returning the 
 internal string.  Such as this.

 export string mytest(string tstStr)
 {
   string st = tstStr;
   /* abbreviated to protect the innocent but other operations
      such as concatenating and deleting may be done to st 
 before the return
   */
   return st;
 }

 Is the string type a pointer or is it something else?  In the 
 line where tstStr is assigned to st does it copy the address in 
 tstStr to st or does it copy the value in tstStr?  I am just a 
 bit confused about string types since I come from a C 
 background and C has no type like this.  Also what is returned 
 by this function?  Does this function return a pointer or the 
 contents of an array?  If I do export this what does it do to 
 the Garbage Collection?  Does the Garbage Collection collect 
 tstStr or st?  Also notice the comment in the function.
Others have answered about how strings and other arrays work (also, see http://dlang.org/arrays.html and http://dlang.org/d-array-article.html), so I'll address the garbage collection question. Shared libraries don't make a difference here AFAIK, so we won't worry about them. The GC will only collect something that has no live references to it. If an array gets reallocated (e.g. by appending to it beyond its capacity) and there's no reference anywhere to the old data, then the old data can be collected (no guarantee it will be).
Jul 10 2013
prev sibling parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Wednesday, 10 July 2013 at 17:18:09 UTC, JohnnyK wrote:
 export string mytest(string tstStr)
 {
   string st = tstStr;
   /* abbreviated to protect the innocent but other operations
      such as concatenating and deleting may be done to st 
 before the return
   */
   return st;
 }
Arrays are complex; as mentioned, see: http://dlang.org/d-array-article.html

In the snippet shown, st references the same data as tstStr. The comment mentions "concatenating"; once that happens, st will generally no longer refer to the same data as tstStr, while tstStr itself stays unchanged.
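A sketch of both cases, using a stand-in for the function from the question. The suffix parameter is an assumption, added only to make the concatenation step visible; the export attribute is left out since it does not affect the behaviour shown.

```d
import std.stdio;

// Hypothetical stand-in for mytest from the question.
string mytest(string tstStr, string suffix = "")
{
    string st = tstStr;        // st and tstStr share the same characters
    if (suffix.length)
        st ~= suffix;          // st now views longer data, possibly in a new block
    return st;                 // the ptr+length pair is returned by value
}

void main()
{
    string input = "abc".idup;        // GC-allocated copy of the literal
    string same = mytest(input);
    assert(same.ptr is input.ptr);    // nothing was copied

    string longer = mytest(input, "def");
    assert(longer == "abcdef");
    assert(input == "abc");           // the caller's string is unchanged

    writeln(same, " ", longer);
}
```

In both cases the caller's string stays valid and unchanged; the GC keeps whatever memory either the caller or the returned slice still refers to.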
Jul 10 2013