www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Testing array ptr for offset 0...

reply Era Scarecrow <rtcvb32 yahoo.com> writes:
  So I'm experimenting and want to override how arrays are managed 
for a short duration, this is for optimization of not having to 
make another array to keep track of lengths and then shorten them 
when i can in fact just manage it myself *very* briefly.

  So, I need to make sure the structure is compatible with the 
built-in array structure (since the remainder of the calls are 
via normal convention and no allocation/resizing is done except 
by progressively smaller slices, so it should be compatible as 
long as it's a valid pointer).

  But i seem to have an issue trying to test if ptr is at offset 0 
or not. Having the wrong order would be catastrophic, and if the 
source changes (for whatever reason) i don't want to have to fix 
it.

  So... the obvious test: static if ([].ptr.offsetof == 0)

  If the ptr is at offset 0, we declare it first, otherwise 
second, simple... Except this fails since "no property 'offsetof' 
for type 'void*'". SO... I make a template to test for it instead.

template isArrayPtrOffsetZero() {
   auto check() {
     static assert([].sizeof == (size_t.sizeof*2));
     size_t[] arr = new size_t[1];      //length == 1.
     return *(cast(size_t*) &arr) != 1;
   }
   enum isArrayPtrOffsetZero = check();
}

struct X {
   static if (isArrayPtrOffsetZero!()) {
     int* ptr;
     size_t length;
   } else { ... }
}

  Except this blows up and always gives the wrong answer... ==1 
length test reverses it but is now the wrong test, and changing 
it to isArrayLengthOffsetZero will in turn give the wrong answer 
again...

  Am i going about this the wrong way?
May 26 2016
next sibling parent reply Kagamin <spam here.lot> writes:
try this:

struct X
{
	byte[] data;
	alias data this;
}
May 26 2016
parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 26 May 2016 at 09:16:48 UTC, Kagamin wrote:
 try this:

 struct X
 {
 	byte[] data;
 	alias data this;
 }
This doesn't help me with what I want, and effectively does nothing as far as I can tell. Although testing is showing managing the pointer & length directly is a little slower for whatever reason.
May 26 2016
prev sibling next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 05/26/2016 09:51 AM, Era Scarecrow wrote:
   So I'm experimenting and want to override how arrays are managed for a
 short duration, this is for optimization of not having to make another
 array to keep track of lengths and then shorten them when i can in fact
 just manage it myself *very* briefly.

   So, I need to make sure the structure is compatible with the built-in
 array structure (since the remainder of the calls are via normal
 convention and no allocation/resizing is done except by progressively
 smaller slices, so it should be compatible as long as it's a valid
 pointer).
I don't follow. Why can't you use a built-in array? What can you do with the stand-in that you can't do with the array itself? [...]
 template isArrayPtrOffsetZero() {
    auto check() {
      static assert([].sizeof == (size_t.sizeof*2));
      size_t[] arr = new size_t[1];      //length == 1.
      return *(cast(size_t*) &arr) != 1;
    }
    enum isArrayPtrOffsetZero = check();
 }
[...]
   Except this blows up and always gives the wrong answer... ==1 length
 test reverses it but is now the wrong test, and changing it to
 isArrayLengthOffsetZero will in turn give the wrong answer again...
Here's how I understand that code: isArrayPtrOffsetZero is supposed to check if the offset of ptr is 0. You do that by setting the length of some array to 1, and then checking if the first half of the array structure is not 1. If it's 1, then length's offset is 0, so ptr's offset can't be 0. If it's not 1, then length's offset isn't 0, so ptr's offset must be 0. Seems a bit back and forth, but as far as I can see it works as you want. Why do you think the answer it gives is wrong? What answer does it give you?
May 26 2016
parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 26 May 2016 at 11:47:13 UTC, ag0aep6g wrote:
 I don't follow. Why can't you use a built-in array? What can 
 you do with the stand-in that you can't do with the array 
 itself?
I can do the job with the built-in arrays, however the need for another temporary array to keep track of lengths so I can slice them after the fact is what i want to avoid. I'm also slicing static arrays. I'm trying to see if this code would be more efficient or not (if it is, it's not by much, which I'm sure has to do with the cache more than anything else). To do what I want currently it's something like... enum Size = 1024, Other = 128; Data[Size][Other] staticarray; //stack allocation Data[][] sliced = staticarray[]; scan(sliced, condition); void scan(ref Data[][] data, Condition cond) { int lengths[Size]; foreach(i; ...) { if (cond) data[i][lengths[i]++] = ... } //cleanup/shrink foreach(i, l; lengths) data[i] = data[i][0 .. l]; } By adding a struct overload for opOpAssign I can shrink it all down to this, and avoid lengths entirely... As such internally the length starts at 0, and checks are in place to ensure the bounds are never exceeded. void scan(ref Data[][] data, Condition cond) { foreach(i; ...) { if (cond) data[i] ~= ... } }
 Seems a bit back and forth, but as far as I can see it works as 
 you want. Why do you think the answer it gives is wrong? What 
 answer does it give you?
The answer is wrong _because_ the code blows up. I inverted the check to check if length is at offset 0 since (the length can only be 1 and a better test), however it _still_ gives the wrong answer. Curiously CTFE is fine with this casting as long as the pointer stays in that one spot. template isArrayLengthOffsetZero() { bool check() { size_t[] arr = new size_t[1]; return *(cast(size_t*) &arr) == 1; } enum isArrayLengthOffsetZero = check(); //wrong answer // enum isArrayLengthOffsetZero = !check(); //right answer but logically wrong } //test struct struct S(T) { static if (isArrayLengthOffsetZero!()) { size_t length; T* ptr; } else { T* ptr; size_t length; } this(T* p) { ptr = p; } //opOpAssign } unittest { import std.stdio; //union to get our overlapping data types union X { string str; S!(immutable char) s_str;} X x; x.s_str = S!(immutable char)("t".ptr); //only ptr assigned //writeln(x.str); //blows up writeln(x.str.length); //not 0... assert(x.str.length == 0); //fails } Most likely the internal array structure CTFE uses for the array is inverted, and if that's the case the proper answer to how to get the simple offset of ptr (or length) is a mess. Forcing the inverted answer works, but this is a lot of overhead to try and make an algorithm faster. Speedups are minimal at best, so I'm about ready to drop this line of thought.
May 26 2016
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 05/26/2016 11:13 PM, Era Scarecrow wrote:
    By adding a struct overload for opOpAssign I can shrink it all down
 to this, and avoid lengths entirely... As such internally the length
 starts at 0, and checks are in place to ensure the bounds are never
 exceeded.

    void scan(ref Data[][] data, Condition cond) {
      foreach(i; ...) {
        if (cond)
          data[i] ~= ...
      }
    }
Sorry, I'm still lost. Why can't you do whatever you're doing in opOpAssign directly there, or in a free function? Does the pseudo-array contain any additional data? Would a wrapper around a built-in array not work? [...]
   The answer is wrong _because_ the code blows up. I inverted the check
 to check if length is at offset 0 since (the length can only be 1 and a
 better test), however it _still_ gives the wrong answer. Curiously CTFE
 is fine with this casting as long as the pointer stays in that one spot.
Oh, I see. Yeah, the answer is garbage. Seeing the spec section that Adam D. Ruppe linked, the length field actually comes first. That CTFE errors out when you move the pointer to a variable screams that something is wrong. The code seems to be seriously misinterpreted in CTFE. ---- enum x = () { size_t[] arr = [13]; return *(cast(size_t*) &arr); } (); pragma(msg, x); ---- That prints "13LU". Apparently &arr is mistaken for arr here. I've filed an issue: https://issues.dlang.org/show_bug.cgi?id=16081
May 26 2016
parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 26 May 2016 at 22:15:42 UTC, ag0aep6g wrote:
 On 05/26/2016 11:13 PM, Era Scarecrow wrote:
    void scan(ref Data[][] data, Condition cond) {
      foreach(i; ...) {
        if (cond)
          data[i] ~= ...
      }
    }
Sorry, I'm still lost. Why can't you do whatever you're doing in opOpAssign directly there, or in a free function? Does the pseudo-array contain any additional data? Would a wrapper around a built-in array not work?
Well a few things at work. First the array is originally a static array, so I pass it as a slice; But if you pass it as a slice you get the entire length, and modifying the length requires either changing length or appending, both which may cause allocation (something I want to avoid). By making my own compatible structure I can manage the length and pointer directly and precisely. Second I would prefer the length to start at 0 and grow UP TO the size of the static array. If it was just a single array this wouldn't be an issue, but I'm working with a couple hundred of these. Plus I don't want to add more data to the memory or structure than necessary to treat it as an array after the fact. I shouldn't have to. Truthfully the same result can be done with shrinking it afterwards, but I was trying to see if I could make it any faster by managing the pointer and length directly and avoiding the large temporaries. So far any speed gains are so minimal it's worth dropping.
 Oh, I see. Yeah, the answer is garbage. Seeing the spec section 
 that Adam D. Ruppe linked, the length field actually comes 
 first.

 That CTFE errors out when you move the pointer to a variable 
 screams that something is wrong. The code seems to be seriously 
 misinterpreted in CTFE.

 I've filed an issue:
 https://issues.dlang.org/show_bug.cgi?id=16081
Glad I found a bug! :) And all because [].ptr.offsetof wasn't an option... I just wanted to guarantee it was the correct order (in case it gets changed in the future). But ultimately it isn't worth pursuing.
May 26 2016
parent reply Marc =?UTF-8?B?U2Now7x0eg==?= <schuetzm gmx.net> writes:
On Thursday, 26 May 2016 at 22:47:02 UTC, Era Scarecrow wrote:
 On Thursday, 26 May 2016 at 22:15:42 UTC, ag0aep6g wrote:
 Sorry, I'm still lost. Why can't you do whatever you're doing 
 in opOpAssign directly there, or in a free function? Does the 
 pseudo-array contain any additional data? Would a wrapper 
 around a built-in array not work?
Well a few things at work. First the array is originally a static array, so I pass it as a slice; But if you pass it as a slice you get the entire length, and modifying the length requires either changing length or appending, both which may cause allocation (something I want to avoid). By making my own compatible structure I can manage the length and pointer directly and precisely.
You can do that with arrays, too, without causing allocations: assert(slice.length < static_array.length); slice = slice.ptr[0 .. slice.length+1]; Of course that's unsafe, but your pointer magic certainly is, too.
May 27 2016
parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Friday, 27 May 2016 at 09:18:47 UTC, Marc Schütz wrote:
 You can do that with arrays, too, without causing allocations:

     assert(slice.length < static_array.length);
     slice = slice.ptr[0 .. slice.length+1];

 Of course that's unsafe, but your pointer magic certainly is, 
 too.
Initial impressions suggest it's actually slower. The multiple dereferences to do it since it's a 2d array...
May 27 2016
prev sibling parent Kagamin <spam here.lot> writes:
On Thursday, 26 May 2016 at 21:13:14 UTC, Era Scarecrow wrote:
  To do what I want currently it's something like...

   enum Size = 1024, Other = 128;
   Data[Size][Other] staticarray;  //stack allocation
   Data[][] sliced = staticarray[];
   scan(sliced, condition);

   void scan(ref Data[][] data, Condition cond) {
     int lengths[Size];

     foreach(i; ...) {
       if (cond)
         data[i][lengths[i]++] = ...
     }

     //cleanup/shrink
     foreach(i, l; lengths)
       data[i] = data[i][0 .. l];
   }
Like this: static void rawlength(ref Data[] r, size_t len) { r = r.ptr[0..len]; } void scan(ref Data[][] data, Condition cond) { foreach(i; ...) { if (cond) data[i].rawlength = ... } }
May 27 2016
prev sibling next sibling parent reply Alex Parrill <initrd.gz gmail.com> writes:
On Thursday, 26 May 2016 at 07:51:46 UTC, Era Scarecrow wrote:
 ...
This smells like an XY problem [0]. Why exactly do you need the internal layout of the array structure? The line "not having to make another array to keep track of lengths and then shorten them" is fairly vague. "Shortening" an array via slicing is basically free (it's just some integer arithmetic), but I'm not sure if that's what you meant. [0]: http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem
May 26 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 26 May 2016 at 12:30:30 UTC, Alex Parrill wrote:
 The line "not having to make another array to keep track of 
 lengths and then shorten them" is fairly vague. "Shortening" an 
 array via slicing is basically free (it's just some integer 
 arithmetic), but I'm not sure if that's what you meant.
Indeed, creating a slice from a static array, another slice, or a pointer is free too...
May 26 2016
parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 26 May 2016 at 12:33:31 UTC, Adam D. Ruppe wrote:
 On Thursday, 26 May 2016 at 12:30:30 UTC, Alex Parrill wrote:
 The line "not having to make another array to keep track of 
 lengths and then shorten them" is fairly vague. "Shortening" 
 an array via slicing is basically free (it's just some integer 
 arithmetic), but I'm not sure if that's what you meant.
Indeed, creating a slice from a static array, another slice, or a pointer is free too...
But managing the lengths separately and then re-slicing it afterwards is not free. It does have a low constant cost though; but can I remove that cost entirely (including the foreach loop overhead)? There's also the annoyance that on windows machines the stack size seems to be fairly small. Sometimes in my tests the program will crash before it starts because the requested size of a static array on the stack is too big. This means if I have a big stack allocation already, I want to try and avoid further allocations if I can avoid it. byte[1024*1024] buffer; //on my machine, it just crashes when this is allocated.
May 26 2016
prev sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 26 May 2016 at 07:51:46 UTC, Era Scarecrow wrote:
  If the ptr is at offset 0, we declare it first, otherwise 
 second, simple... Except this fails since "no property 
 'offsetof' for type 'void*'". SO... I make a template to test 
 for it instead.
Built-in arrays are kinda magical and thus their members aren't ordinary members you can inspect like that.
     return *(cast(size_t*) &arr) != 1;
And I'm pretty sure that cast is illegal in CTFE. Perhaps the bug here is that dmd isn't throwing an error when it is supposed to... The value being compared gives 0 when I try it, which wouldn't make sense for either .length or .ptr.
  Am i going about this the wrong way?
I don't think there's a need for a test because the ABI spec says length is always defined to be at offset zero anyway: http://dlang.org/spec/abi.html#arrays If you want to test it, a runtime assert would probably be enough, at least then it won't do the wrong thing and your message can tell the user to recompile. Runtime assert is prolly better anyway just in case ctfe disagrees with the execution platform.
May 26 2016