digitalmars.D.learn - shared arrray problem

Charles Hixson via Digitalmars-d-learn (22/22) Nov 19 2016 I have a piece of code that looks thus:

Nicolas Gurrola (6/8) Nov 19 2016 This method should do what you want. You are only returning a

Charles Hixson via Digitalmars-d-learn (23/28) Nov 19 2016 It's worse than that, if they modify the length the array may be

ag0aep6g (11/15) Nov 19 2016 Arguably, any D programmer must be aware that appending to a dynamic

Charles Hixson via Digitalmars-d-learn (11/26) Nov 19 2016 Yes. I was hoping someone would pop up with some syntax making the

ag0aep6g (8/14) Nov 19 2016 Yup, head const is not part of the language. You'd have to find a

Charles Hixson via Digitalmars-d-learn (14/28) Nov 19 2016 Whether you would call the change "break things for your code" might be

ag0aep6g (31/38) Nov 20 2016 I don't see how it's dubious. It's an error by the user. When users are

Charles Hixson via Digitalmars-d-learn (7/45) Nov 20 2016 Well, that precise approach wouldn't work. (The traits aren't a part of...

ag0aep6g (2/4) Nov 20 2016 What do you mean by "traits"?

Charles Hixson via Digitalmars-d-learn (46/84) Nov 20 2016 Thinking it over a bit more, the item returned would need to be a

ag0aep6g (17/59) Nov 20 2016 Instead of extra 'start' and 'end' fields you can slice the array. A

Charles Hixson via Digitalmars-d-learn (12/71) Nov 21 2016 While you are definitely correct about slices, I really feel more

Charles Hixson via Digitalmars-d-learn writes:

I have a piece of code that looks thus:

/**    Returns an editable file header missing the header length and data
  * length portions.  Those cannot be edited by a routine outside this class.
  * Access to them is available via the lenHead and lenRec functions.
  * Warning:  Do NOT change the size of the header.  If you do the size
  * will be reset to the current value before it is saved, and results are
  * unpredictable.  Also do not replace it.  This class maintains its own
  * pointer to the header, and your replacement will be ignored. */
ubyte[]    header()    @property    {    return fHead[4..$];    }

I want other classes to be able to modify the tail of the array, but not 
to resize it.  Eventually it should be written to a file. This way works 
(well, should work) but looks radically unsafe for the reasons indicated 
in the comments.  I'm sure there must be a better way, but can't think 
of what it would be.  The only alternative I've been able to think of 
is, approx:

bool    saveToHeader (ubyte[] rec)    { ... buf[4..$] = rec[0..$]; ... }

but that would be guaranteed to have an extra copying step, and it's 
bool because if the length of the passed parameter isn't correct 
saveToHeader would fail.  It may still be better since in this case the 
copying would be rather minimal, but the general problem bothers me 
because I don't see a good solution.

Nov 19 2016

Nicolas Gurrola <padresfan11 gmail.com> writes:

On Saturday, 19 November 2016 at 18:51:05 UTC, Charles Hixson 
wrote:

 ubyte[]    header()    @property    {    return fHead[4..$];    }

This method should do what you want. You are only returning a 
slice of the fHead array, so if the caller modifies the length it 
will only affect the return value, and not the length of fHead 
itself.
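
A minimal sketch of that behaviour. The class name Record and the helpers totalLength and peek are hypothetical, added only so the private buffer can be inspected; fHead and header come from the question. It can be run with dmd -unittest -main -run.

```d
/// Hypothetical stand-in for the class in the question.
class Record
{
    private ubyte[] fHead;

    this(size_t len)
    {
        assert(len >= 4);
        fHead = new ubyte[len];
    }

    /// Editable tail of the header; the first four bytes stay private.
    @property ubyte[] header() { return fHead[4 .. $]; }

    /// Total length of the private buffer (hypothetical helper).
    @property size_t totalLength() const { return fHead.length; }

    /// Read-only peek at one byte of the private buffer (hypothetical helper).
    ubyte peek(size_t i) const { return fHead[i]; }
}

unittest
{
    auto r = new Record(16);
    auto h = r.header;

    h[0] = 42;                     // element writes go through to fHead[4]
    assert(r.peek(4) == 42);

    h.length = 2;                  // shrinks only the caller's slice
    assert(h.length == 2);
    assert(r.totalLength == 16);   // the class's array is untouched
    assert(r.header.length == 12);
}
```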

Nov 19 2016

Charles Hixson via Digitalmars-d-learn writes:

On 11/19/2016 11:10 AM, Nicolas Gurrola via Digitalmars-d-learn wrote:
 On Saturday, 19 November 2016 at 18:51:05 UTC, Charles Hixson wrote:

 ubyte[]    header()    @property {    return fHead[4..$];    }

 This method should do what you want. You are only returning a slice of 
 the fHead array, so if the caller modifies the length it will only 
 affect the return value, and not the length of fHead itself.

It's worse than that, if they modify the length the array may be 
reallocated in RAM so that the pointers held by the containing class do 
not point to the changed values.  (Read the header comments...it's not 
nice at all.)
More, the class explicitly requires the array to be a particular length 
as it needs to fit into a spot in a binary file, so I really want to 
forbid any replacement of the array for any reason.  The first four 
bytes are managed by the class and returned via alternate routines which 
do not allow the original values to be altered, and this is necessary.  
I could make them immutable, but actually what I do is return, e.g., an 
int of 256 * fHead[0] + fHead[1], which is the length of the header.  
It's an int to allow negative values to be returned in case of error.

So what I'll probably eventually decide on is some variation of 
saveToHeader, e.g.:
     bool    saveToHeader    (ubyte[] rec)
     {  if    (rec.length + 4 > dheader)    return    false;
         fHead[4..rec.length + 4] = rec[0..$];
         return    true;
     }
unless I can think of something better.  Actually for this particular 
case that's not a bad approach, but for the larger problem it's a lousy 
kludge.
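
A fuller sketch of the copying approach, under stated assumptions: the class name HeaderBlock, its constructor, and editableCopy are hypothetical, and only the header length (bytes 0 and 1, high byte first, as in the 256 * fHead[0] + fHead[1] expression above) is modelled. It can be run with dmd -unittest -main -run.

```d
/// Hypothetical container; only fHead, lenHead and saveToHeader are
/// names taken from the thread.
class HeaderBlock
{
    private ubyte[] fHead;

    /// `total` is the full header size, including the 4 reserved bytes.
    this(ushort total)
    {
        assert(total >= 4);
        fHead = new ubyte[total];
        fHead[0] = cast(ubyte)(total >> 8);    // high byte of the length
        fHead[1] = cast(ubyte)(total & 0xFF);  // low byte of the length
    }

    /// Header length as stored in the first two bytes.
    int lenHead() const { return 256 * fHead[0] + fHead[1]; }

    /// Copies `rec` into the editable part; fails if it does not fit.
    bool saveToHeader(const(ubyte)[] rec)
    {
        if (rec.length + 4 > fHead.length) return false;
        fHead[4 .. 4 + rec.length] = rec[];
        return true;
    }

    /// Copy of the editable part, for inspection (hypothetical helper).
    ubyte[] editableCopy() const { return fHead[4 .. $].dup; }
}

unittest
{
    auto hb = new HeaderBlock(260);
    assert(hb.lenHead() == 260);               // 1 * 256 + 4

    assert(hb.saveToHeader([1, 2, 3]));
    ubyte[] expected = [1, 2, 3];
    assert(hb.editableCopy()[0 .. 3] == expected);

    assert(!hb.saveToHeader(new ubyte[257]));  // 257 + 4 > 260
    assert(hb.lenHead() == 260);               // reserved bytes untouched
}
```

The caller never holds a reference into fHead here, so neither the length nor the reserved bytes can be disturbed; the price is the one copy per call.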

Nov 19 2016

ag0aep6g <anonymous example.com> writes:

On 11/19/2016 10:26 PM, Charles Hixson via Digitalmars-d-learn wrote:
 It's worse than that, if they modify the length the array may be
 reallocated in RAM so that the pointers held by the containing class do
 not point to the changed values.  (Read the header comments...it's not
 nice at all.)

Arguably, any D programmer must be aware that appending to a dynamic 
array potentially means making a copy of the data, and that changes to 
length are not visible to other views of the data.
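
A small illustration of the copy on append. It deliberately uses a slice from the middle of the buffer, where the runtime has to reallocate; for a slice that ends at the end of the array the append may or may not happen in place, so nothing is asserted about that case. It can be run with dmd -unittest -main -run.

```d
unittest
{
    ubyte[] buf = new ubyte[8];

    // A slice from the middle of the buffer shares its memory.
    ubyte[] view = buf[2 .. 4];
    view[0] = 1;
    assert(buf[2] == 1);

    // Appending cannot grow this slice in place, because that would
    // overwrite buf[4]; the runtime copies the two bytes elsewhere.
    view ~= ubyte(9);
    assert(view.ptr !is buf.ptr + 2);

    // From here on, writes through `view` no longer reach `buf`.
    view[0] = 5;
    assert(buf[2] == 1);
    assert(buf.length == 8);
}
```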

But it's an opportunity to mess up, for sure. You could return a wrapper 
around the array that supports editing the data but not changing the 
length or appending.

Looks like std.experimental.typecons.Final [1] is supposed to be that 
wrapper. But in a little test I can still set the length. Not sure if 
that's a bug, or if Final has slightly different goals.


[1] https://dlang.org/phobos/std_experimental_typecons.html#.Final

Nov 19 2016

Charles Hixson via Digitalmars-d-learn writes:

On 11/19/2016 01:50 PM, ag0aep6g via Digitalmars-d-learn wrote:
 On 11/19/2016 10:26 PM, Charles Hixson via Digitalmars-d-learn wrote:
 It's worse than that, if they modify the length the array may be
 reallocated in RAM so that the pointers held by the containing class do
 not point to the changed values.  (Read the header comments...it's not
 nice at all.)

 Arguably, any D programmer must be aware that appending to a dynamic 
 array potentially means making a copy of the data, and that changes to 
 length are not visible to other views of the data.

 But it's an opportunity to mess up, for sure. You could return a 
 wrapper around the array that supports editing the data but not 
 changing the length or appending.

 Looks like std.experimental.typecons.Final [1] is supposed to be that 
 wrapper. But in a little test I can still set the length. Not sure if 
 that's a bug, or if Final has slightly different goals.


 [1] https://dlang.org/phobos/std_experimental_typecons.html#.Final

Yes.  I was hoping someone would pop up with some syntax making the 
array, but not its contents, const or immutable, which I couldn't figure 
out how to do, and which is what I really hoped would be the answer, but 
it appears that this isn't part of the syntax.  If the array is 
constant, so are its contents.  I really *can't* allow the length to be 
changed, and if the array is reallocated, it won't get saved.  But the 
contents of the array are intended to be changed by the calling routines.

Again, for this particular problem the kludge of copying the values into 
the array works fine (and is what I've decided to do), but that's not a 
good general solution to this kind of problem.

Nov 19 2016

ag0aep6g <anonymous example.com> writes:

On 11/20/2016 01:33 AM, Charles Hixson via Digitalmars-d-learn wrote:
 Yes.  I was hoping someone would pop up with some syntax making the
 array, but not its contents, const or immutable, which I couldn't figure
 out how to do, and which is what I really hoped would be the answer, but
 it appears that this isn't part of the syntax.

Yup, head const is not part of the language. You'd have to find a 
library solution or write something yourself.
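
For illustration, the two const forms the language does offer for a ubyte array, checked at compile time. Neither gives "fixed length, mutable elements", because const in D is transitive. It can be run with dmd -unittest -main -run.

```d
unittest
{
    ubyte[] data = new ubyte[4];

    // Tail const: the slice can be re-pointed, its elements cannot change.
    const(ubyte)[] tail = data;
    static assert( __traits(compiles, tail = tail[1 .. $]));
    static assert(!__traits(compiles, tail[0] = 1));

    // Full const: neither the slice nor its elements can change.
    const(ubyte[]) full = data;
    static assert(!__traits(compiles, full = full[1 .. $]));
    static assert(!__traits(compiles, full.length = 2));
    static assert(!__traits(compiles, full[0] = 1));

    // The underlying memory is still mutable through `data`.
    data[0] = 7;
    assert(full[0] == 7);
}
```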

 I really *can't* allow the length to be
 changed,

Your emphasis suggests that a user could break things for your code. They 
can't. Any changes to the length will only affect the slice on the 
user's end. They can only fool themselves. That may be bad enough to 
warrant a more restricted return type, but for your code it's safe to 
return a plain dynamic array.

Nov 19 2016

Charles Hixson via Digitalmars-d-learn writes:

On 11/19/2016 05:52 PM, ag0aep6g via Digitalmars-d-learn wrote:
 On 11/20/2016 01:33 AM, Charles Hixson via Digitalmars-d-learn wrote:
 Yes.  I was hoping someone would pop up with some syntax making the
 array, but not its contents, const or immutable, which I couldn't figure
 out how to do, and which is what I really hoped would be the answer, but
 it appears that this isn't part of the syntax.

 Yup, head const is not part of the language. You'd have to find a 
 library solution or write something yourself.

 I really *can't* allow the length to be
 changed,

 Your emphasis suggests that user could break things for your code. 
 They can't. Any changes to the length will only affect the slice on 
 the user's end. They can only fool themselves. That may be bad enough 
 to warrant a more restricted return type, but for your code it's safe 
 to return a plain dynamic array.

Whether you would call the change "break things for your code" might be 
dubious.  It would be effectively broken, even if technically my code 
was doing the correct thing.  But my code wouldn't be storing the data 
that needed storing, so effectively it would be broken. "Write something 
for yourself" is what I'd like to do, given that the language doesn't 
have that built-in support, but I can't see how to do it.  I want to end 
up with a continuous array of ubytes of a given length with certain 
parts reserved to only be directly accessible to the defining class, and 
other parts accessible to the calling class(es).  And the length of the 
array isn't known until run time.  So I guess the only safe solution is 
to do an extra copy...which isn't a problem in this particular 
application as I only need to do it twice per file opening (once on 
opening, once on closing), but for other applications would be a real drag.

Nov 19 2016

ag0aep6g <anonymous example.com> writes:

On 11/20/2016 04:34 AM, Charles Hixson via Digitalmars-d-learn wrote:
 Whether you would call the change "break things for your code" might be
 dubious.  It would be effectively broken, even if technically my code
 was doing the correct thing.  But my code wouldn't be storing the data
 that needed storing, so effectively it would be broken.

I don't see how it's dubious. It's an error by the user. When users are 
given a dynamic array (and not by reference), they cannot expect that 
your code sees changes to length. That's just not how arrays work. When 
a user has that wrong expectation, and writes wrong code because of it, 
then it's arguably their own fault. However, if you want you can hold 
their hand a bit and make the mistake less likely.

 "Write something
 for yourself" is what I'd like to do, given that the language doesn't
 have that built-in support, but I can't see how to do it.

Wrap the array in a struct that has indexing, but doesn't allow setting 
the length or appending. Here's a quick prototype:

----
struct ConstLengthArray(E)
{
     private E[] data;
     this(E[] arr) { this.data = arr; }
     ref inout(E) opIndex(size_t i) inout { return data[i]; }
     @property size_t length() const { return data.length; }
}

void main()
{
     auto cla = ConstLengthArray!ubyte([1, 2, 3, 4, 5]);

     /* Mutating elements is allowed: */
     cla[0] = 10;
     assert(cla[0] == 10);

     /* No setting length, no appending: */
     static assert(!__traits(compiles, cla.length = 3));
     static assert(!__traits(compiles, cla ~= 6));
}
----

You might want to add support for slicing, concatenation, etc. Maybe 
allow implicit conversion to const(E[]), though that would also allow 
conversion to const(E)[] and that has a settable length again.
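
One way the slicing support might look, as a sketch rather than a finished design. The opDollar and opSlice members are additions for illustration (they are the standard operator overloading hooks, but they are not part of the prototype above). It can be run with dmd -unittest -main -run.

```d
struct ConstLengthArray(E)
{
    private E[] data;
    this(E[] arr) { this.data = arr; }

    ref inout(E) opIndex(size_t i) inout { return data[i]; }
    @property size_t length() const { return data.length; }

    /// Lets `$` work inside index and slice expressions.
    size_t opDollar() const { return data.length; }

    /// Slicing yields another fixed-length view over the same memory.
    ConstLengthArray opSlice(size_t lo, size_t hi)
    {
        return ConstLengthArray(data[lo .. hi]);
    }
}

unittest
{
    ubyte[] buf = [1, 2, 3, 4, 5];
    auto cla = ConstLengthArray!ubyte(buf);

    auto mid = cla[1 .. $ - 1];    // view of the elements 2, 3, 4
    assert(mid.length == 3);

    mid[0] = 20;                   // writes through to the shared buffer
    assert(buf[1] == 20);
    assert(cla[$ - 1] == 5);

    /* The sub-view is just as restricted as the original: */
    static assert(!__traits(compiles, mid.length = 1));
    static assert(!__traits(compiles, mid ~= 6));
}
```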

Nov 20 2016

Charles Hixson via Digitalmars-d-learn writes:

On 11/20/2016 03:42 AM, ag0aep6g via Digitalmars-d-learn wrote:
 On 11/20/2016 04:34 AM, Charles Hixson via Digitalmars-d-learn wrote:
 Whether you would call the change "break things for your code" might be
 dubious.  It would be effectively broken, even if technically my code
 was doing the correct thing.  But my code wouldn't be storing the data
 that needed storing, so effectively it would be broken.

 I don't see how it's dubious. It's an error by the user. When users 
 are given a dynamic array (and not by reference), they cannot expect 
 that your code sees changes to length. That's just not how arrays 
 work. When a user has that wrong expectation, and writes wrong code 
 because of it, then it's arguably their own fault. However, if you 
 want you can hold their hand a bit and make the mistake less likely.

 "Write something
 for yourself" is what I'd like to do, given that the language doesn't
 have that built-in support, but I can't see how to do it.

 Wrap the array in a struct that has indexing, but doesn't allow 
 setting the length or appending. Here's a quick prototype:

 ----
 struct ConstLengthArray(E)
 {
     private E[] data;
     this(E[] arr) { this.data = arr; }
     ref inout(E) opIndex(size_t i) inout { return data[i]; }
     @property size_t length() const { return data.length; }
 }

 void main()
 {
     auto cla = ConstLengthArray!ubyte([1, 2, 3, 4, 5]);

     /* Mutating elements is allowed: */
     cla[0] = 10;
     assert(cla[0] == 10);

     /* No setting length, no appending: */
     static assert(!__traits(compiles, cla.length = 3));
     static assert(!__traits(compiles, cla ~= 6));
 }
 ----

 You might want to add support for slicing, concatenation, etc. Maybe 
 allow implicit conversion to const(E[]), though that would also allow 
 conversion to const(E)[] and that has a settable length again.

Well, that precise approach wouldn't work.  (The traits aren't a part of 
the struct, e.g.), but returning a struct (or perhaps a class) rather 
than an actual array has promise.  It could even allow separate callers 
to have separate views of the data based on some sort of registered key, 
which they could share on an as-needed basis.  That's too much overhead 
work for this project, but has promise for the more general problem.

Nov 20 2016

ag0aep6g <anonymous example.com> writes:

On 11/20/2016 08:30 PM, Charles Hixson via Digitalmars-d-learn wrote:
 Well, that precise approach wouldn't work.  (The traits aren't a part of
 the struct, e.g.),

What do you mean by "traits"?

Nov 20 2016

Charles Hixson via Digitalmars-d-learn writes:

On 11/20/2016 03:42 AM, ag0aep6g via Digitalmars-d-learn wrote:
 On 11/20/2016 04:34 AM, Charles Hixson via Digitalmars-d-learn wrote:
 Whether you would call the change "break things for your code" might be
 dubious.  It would be effectively broken, even if technically my code
 was doing the correct thing.  But my code wouldn't be storing the data
 that needed storing, so effectively it would be broken.

 I don't see how it's dubious. It's an error by the user. When users 
 are given a dynamic array (and not by reference), they cannot expect 
 that your code sees changes to length. That's just not how arrays 
 work. When a user has that wrong expectation, and writes wrong code 
 because of it, then it's arguably their own fault. However, if you 
 want you can hold their hand a bit and make the mistake less likely.

 "Write something
 for yourself" is what I'd like to do, given that the language doesn't
 have that built-in support, but I can't see how to do it.

 Wrap the array in a struct that has indexing, but doesn't allow 
 setting the length or appending. Here's a quick prototype:

 ----
 struct ConstLengthArray(E)
 {
     private E[] data;
     this(E[] arr) { this.data = arr; }
     ref inout(E) opIndex(size_t i) inout { return data[i]; }
     @property size_t length() const { return data.length; }
 }

 void main()
 {
     auto cla = ConstLengthArray!ubyte([1, 2, 3, 4, 5]);

     /* Mutating elements is allowed: */
     cla[0] = 10;
     assert(cla[0] == 10);

     /* No setting length, no appending: */
     static assert(!__traits(compiles, cla.length = 3));
     static assert(!__traits(compiles, cla ~= 6));
 }
 ----

 You might want to add support for slicing, concatenation, etc. Maybe 
 allow implicit conversion to const(E[]), though that would also allow 
 conversion to const(E)[] and that has a settable length again.

Thinking it over a bit more, the item returned would need to be a 
struct, but the struct wouldn't contain the array, it would just contain 
a reference to the array and a start and end offset.  The array would 
need to live somewhere else, in the class (or struct...but class is 
better as you don't want the array evaporating by accident) that created 
the returned value.  This means you are dealing with multiple levels of 
indirection, so it's costly compared to array access, but cheap compared 
to lots of large copies.  So the returned value would be something like:
struct
{
     private:
     /** this is a reference to the data that lives elsewhere.  It
      *  should be a pointer, but I don't like the syntax */
     ubyte[]  data;
     int    start, end;    ///    first and last valid indices into data
     public:
     this (ubyte[] data, int start, int end)
     {    this.data = data; this.start = start; this.end = end;}
     ...
     // various routines to access the data, but to limit the access to
     // the spec'd range, and nothing to change the bounds
}
Which is really the answer you already posted, but just a bit more 
detail on the construct, and what it meant.  (Yeah, I could allow types 
other than ubyte as the base case, but I don't want to.  I'm thinking of 
this mainly as a means of sharing a buffer between applications where 
different parts have exclusive access to different parts of the buffer, 
and where the buffer will be written to a file with a single fwrite, or 
since the underlying storage will be an array, it could even be 
rawWrite).  I don't want to specify any more than I must about how the 
methods calling this will format the storage, and this means that those 
with access to different parts may well use different collections of 
types, but all types eventually map down to ubytes (or bytes), so ubytes 
is the common ground.  Perhaps I'll need to write inbuffer, outbuffer 
methods/wrappings, but that's far in the future.

P.S.:  The traits that I mentioned previously were those given by:
     static assert(!__traits(compiles, cla.length = 3));
     static assert(!__traits(compiles, cla ~= 6));
in your main routine.  I assumed that they were validity tests.  I don't 
understand why they were static.  I've never happened to use static 
asserts, but I would assume that when they ran cla wouldn't be defined.

N.B.:  Even this much is just thinking about design, not something I'll 
actually do at the moment.  But this is a problem I keep coming up 
against, so a bit of thought now seemed a good idea.

Nov 20 2016

ag0aep6g <anonymous example.com> writes:

On 11/20/2016 09:09 PM, Charles Hixson via Digitalmars-d-learn wrote:
 Thinking it over a bit more, the item returned would need to be a
 struct, but the struct wouldn't contain the array, it would just contain
 a reference to the array and a start and end offset.  The array would
 need to live somewhere else, in the class (or struct...but class is
 better as you don't want the array evaporating by accident) that created
 the returned value.  This means you are dealing with multiple levels of
 indirection, so it's costly compared to array access, but cheap compared
 to lots of large copies.  So the returned value would be something like:
 struct
 {
     private:
     /** this is a reference to the data that lives elsewhere.  It should
      *  be a pointer, but I don't like the syntax */
     ubyte[]  data;
     int    start, end;    ///    first and last valid indices into data
     public:
     this (ubyte[] data, int start, int end)
     {    this.data = data; this.start = start; this.end = end;}
     ...
     // various routines to access the data, but to limit the access to
     // the spec'd range, and nothing to change the bounds
 }

Instead of extra 'start' and 'end' fields you can slice the array. A 
dynamic array already is just a reference coupled with a length, i.e. a 
pointer with restricted indexing. So you can slice the original array 
with your offsets and create the struct with that slice.
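
A sketch of that idea, assuming a hypothetical view type named ByteView whose constructor takes the same three arguments as the struct quoted above but stores only the resulting slice. It can be run with dmd -unittest -main -run.

```d
/// Hypothetical view type: same constructor arguments as the quoted
/// struct, but only the slice is stored.
struct ByteView
{
    private ubyte[] data;

    this(ubyte[] whole, size_t start, size_t end)
    {
        data = whole[start .. end];   // pointer + length, no copy
    }

    ref ubyte opIndex(size_t i) { return data[i]; }
    @property size_t length() const { return data.length; }
}

unittest
{
    ubyte[] buf = new ubyte[16];

    // A slice is just an offset pointer paired with a length.
    auto s = buf[4 .. 12];
    assert(s.ptr == buf.ptr + 4);
    assert(s.length == 8);

    // The view indexes relative to its own start and shares the memory.
    auto v = ByteView(buf, 4, 12);
    v[0] = 7;
    assert(buf[4] == 7);
    assert(v.length == 8);
}
```

Indexing past the end of the view is then caught by the slice's own bounds check (when bounds checking is enabled), so no separate comparison against start and end is needed.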

I feel like there is a misunderstanding somewhere, but I'm not sure on 
whose side. As far as I can tell, your understanding of dynamic arrays 
may be lacking, or maybe I don't understand what you're getting at.

 Which is really the answer you already posted, but just a bit more
 detail on the construct, and what it meant.  (Yeah, I could allow types
 other than ubyte as the base case, but I don't want to.  I'm thinking of
 this mainly as a means of sharing a buffer between applications where
 different parts have exclusive access to different parts of the buffer,
 and where the buffer will be written to a file with a single fwrite, or
 since the underlying storage will be an array, it could even be
 rawwrite).  I don't want to specify any more than I must about how the
 methods calling this will format the storage, and this means that those
 with access to different parts may well use different collections of
 types, but all types eventually map down to ubytes (or bytes), so ubytes
 is the common ground.  Perhaps I'll need to write inbuffer,outbuffer
 methods/wrappings, but that's far in the future.

Sure, go with a specialized type instead of a template, if that makes 
more sense for your use case. As far as I see, the concept is 
independent of the element type, so it seemed natural to make it a 
template, but a special type is perfectly fine and probably has fewer 
pitfalls.

 P.S.:  The traits that I mentioned previously were those given by:
     static assert(!__traits(compiles, cla.length = 3));
     static assert(!__traits(compiles, cla ~= 6));
 in your main routine.  I assumed that they were validity tests.  I don't
 understand why they were static.  I've never happened to use static
 asserts, but I would assume that when they ran cla wouldn't be defined.

Those are tests to ensure that cla's length cannot be set and that it 
cannot be appended to. The asserts check that the code does not compile, 
i.e. that you cannot do those things. They're static simply because they 
can be. The code inside is not executed at run-time.
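
To make the compile-time nature concrete, here is a small sketch with a hypothetical struct named Fixed. The checks only need the type of the variable, not a run-time value, which is why they also work at module scope. It can be run with dmd -unittest -main -run.

```d
/// Hypothetical type with a readable but not settable length.
struct Fixed
{
    private int[] data;
    @property size_t length() const { return data.length; }
}

// Evaluated while compiling, even at module scope; nothing in the
// braces is ever executed.
static assert( __traits(compiles, { Fixed f; auto n = f.length; }));
static assert(!__traits(compiles, { Fixed f; f.length = 3; }));

unittest
{
    Fixed f;

    // Inside a function the declaration of `f` is enough for the
    // compiler to decide whether these expressions would type-check.
    static assert(!__traits(compiles, f ~= 1));
    enum canRead = __traits(compiles, f.length);
    static assert(canRead);

    // An ordinary assert, by contrast, is evaluated at run time.
    assert(f.length == 0);
}
```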

Nov 20 2016

Charles Hixson via Digitalmars-d-learn writes:

On 11/20/2016 12:41 PM, ag0aep6g via Digitalmars-d-learn wrote:
 On 11/20/2016 09:09 PM, Charles Hixson via Digitalmars-d-learn wrote:
 Thinking it over a bit more, the item returned would need to be a
 struct, but the struct wouldn't contain the array, it would just contain
 a reference to the array and a start and end offset.  The array would
 need to live somewhere else, in the class (or struct...but class is
 better as you don't want the array evaporating by accident) that created
 the returned value.  This means you are dealing with multiple levels of
 indirection, so it's costly compared to array access, but cheap compared
 to lots of large copies.  So the returned value would be something like:
 struct
 {
     private:
     /** this is a reference to the data that lives elsewhere. It should
      *  be a pointer, but I don't like the syntax */
     ubyte[]  data;
     int    start, end;    ///    first and last valid indices into data
     public:
     this (ubyte[] data, int start, int end)
     {    this.data = data; this.start = start; this.end = end;}
     ...
     // various routines to access the data, but to limit the access to
     // the spec'd range, and nothing to change the bounds
 }

 Instead of extra 'start' and 'end' fields you can slice the array. A 
 dynamic array already is just a reference coupled with a length, i.e. 
 a pointer with restricted indexing. So you can slice the original 
 array with your offsets and create the struct with that slice.

 I feel like there is a misunderstanding somewhere, but I'm not sure on 
 whose side. As far as I can tell, your understanding of dynamic arrays 
 may be lacking, or maybe I don't understand what you're getting at.

While you are definitely correct about slices, I really feel more 
comfortable with the start and end fields.  I keep being afraid some 
smart optimizer is going to decide I don't really need the entire 
array.  This is probably unjustified, but I find start and end fields 
easier to think about.  I don't think either of us misunderstands what's 
going on, we just feel differently about methods that are approximately 
equivalent.  Slices would probably be marginally more efficient  (well, 
certainly so as you'd need two fewer fields), so if this were a public 
library there would be an excellent argument for doing it your way.  I 
keep trying to think of a reason that start and end fields are better, 
and failing.  All I've got is that I feel more comfortable with them.

 Which is really the answer you already posted, but just a bit more
 detail on the construct, and what it meant.  (Yeah, I could allow types
 other than ubyte as the base case, but I don't want to.  I'm thinking of
 this mainly as a means of sharing a buffer between applications where
 different parts have exclusive access to different parts of the buffer,
 and where the buffer will be written to a file with a single fwrite, or
 since the underlying storage will be an array, it could even be
 rawwrite).  I don't want to specify any more than I must about how the
 methods calling this will format the storage, and this means that those
 with access to different parts may well use different collections of
 types, but all types eventually map down to ubytes (or bytes), so ubytes
 is the common ground.  Perhaps I'll need to write inbuffer,outbuffer
 methods/wrappings, but that's far in the future.

 Sure, go with a specialized type instead of a template, if that makes 
 more sense for your use case. As far as I see, the concept is 
 independent of the element type, so it seemed natural to make it a 
 template, but a special type is perfectly fine and probably has fewer 
 pitfalls.

 P.S.:  The traits that I mentioned previously were those given by:
     static assert(!__traits(compiles, cla.length = 3));
     static assert(!__traits(compiles, cla ~= 6));
 in your main routine.  I assumed that they were validity tests. I don't
 understand why they were static.  I've never happened to use static
 asserts, but I would assume that when they ran cla wouldn't be defined.

 Those are tests to ensure that cla's length cannot be set and that it 
 cannot be appended to. The asserts check that the code does not 
 compile, i.e. that you cannot do those things. They're static simply 
 because they can be. The code inside is not executed at run-time.

Nov 21 2016
