digitalmars.D - Negative array indices?

Norbert Nemec (28/28) May 06 2004 Hi there,

Matthew (4/10) May 06 2004 That stinks
J Anderson (4/9) May 06 2004 Yes it has.

Matthew (10/22) May 06 2004 To be less facetious. The reason it's a very bad idea is that array subs...

Norbert Nemec (14/23) May 06 2004 OK, I understand this. How about finding another syntax for describing a

J Anderson (5/32) May 06 2004 This has been discussed before, a few times. It's only syntax sugar but...
Mark (8/14) May 06 2004 How about defining a symbol for the str.length such as '$', then

Matthew (4/19) May 06 2004 Weird city. I was thinking just the same thing, but without the erudite
Norbert Nemec (10/19) May 06 2004 I like that one! It would not only be more readable but also more flexib...
Derek (6/25) May 07 2004 Agreed. I have been advocating this exact same thing for the Euphoria

Harvey Stroud (24/48) May 09 2004 It seems to me that supporting the negative array index of C for the sak...

Stewart Gordon (15/29) May 10 2004 There would be a performance hit if we had to check at runtime if every
Norbert Nemec (17/25) May 10 2004 For pointers, negative indices actually make sense. If you allow indexin...

Stewart Gordon (38/43) May 10 2004 "The former possibility to assign to the .size property of an array

Norbert Nemec (35/79) May 10 2004 Of course. Sorry about that error.

J Anderson (8/12) May 10 2004 It won't get uncontrollable if your careful not to create persistent

Norbert Nemec (21/32) May 10 2004 What does the following routine return:

J Anderson (16/60) May 10 2004 That is not true, its quite easy to learn how D arrays behave. You only...

Stewart Gordon (21/33) May 11 2004 If you want to guarantee that it's a separate copy, of course you'd call

Norbert Nemec (20/38) May 11 2004 For strings, it might not be very useful. For arrays in general, though,

Stewart Gordon (18/24) May 11 2004

Norbert Nemec (14/35) May 11 2004 Guess, it is just a question of documenting clearly what happens. It sho...

Walter (4/6) May 10 2004 http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.h...

Norbert Nemec (7/17) May 10 2004 I have a long list of changes and corrections I want to work in first.

Harvey Stroud (43/67) May 11 2004 ----- Original Message -----

Norbert Nemec (6/22) May 11 2004 Thanks. The basic idea still is rather simple, but explaining it in deta...

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?

* How about allowing to drop one of the bounds of a range to indicate
beginning or end of the array?

An example should make clear what I mean:

        char[] str = "0123456789";
        assert(str[2] == '2');
        assert(str[-3] == '7');
        assert(str[1..3] == "12");
        assert(str[..4] == "0123"); // just for completeness;
                                    // could be written as str[0..4] just as well
        str[-11]; // throws ArrayBoundsError

        assert(str[7..] == "789");
        assert(str[-4..] == "6789");
        assert(str[4..-2] == "4567");

One question I could not resolve for myself: should illegal ranges throw an
ArrayBoundsError or return an empty/truncated string? One way would be an
extremely tolerant behaviour like:

        assert(str[4..2] == "");
        assert(str[-2..3] == "");

        assert(str[7..22] == "789");
        assert(str[-7..8] == "34567");

Alternatively, one could argue that each case should throw an
ArrayBoundsError. What is the current behaviour?
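
For comparison, the back-counting rule can be emulated without any new syntax. The following is only a sketch under assumptions: it targets a present-day D compiler, uses `string` instead of `char[]` for the literal, and the helper name `resolve` is hypothetical, not a library function.

```d
// Hypothetical helper, not part of D: resolves an index that may count
// back from the end (-1 is the last element) into an ordinary index.
size_t resolve(size_t length, ptrdiff_t i)
{
    if (i >= 0)
        return cast(size_t) i;
    // A value that is too negative wraps around to a huge number here,
    // so the normal bounds check still rejects it when it is used.
    return length - cast(size_t)(-i);
}

void main()
{
    string str = "0123456789";
    assert(str[resolve(str.length, 2)] == '2');
    assert(str[resolve(str.length, -3)] == '7');
    assert(str[resolve(str.length, -4) .. str.length] == "6789");
    assert(str[4 .. resolve(str.length, -2)] == "4567");
}
```

The sketch takes the strict view on illegal ranges: anything outside the array is left to the ordinary bounds check rather than being truncated.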

Ciao,
Nobbi

May 06 2004

"Matthew" <matthew.hat stlsoft.dot.org> writes:

"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:c7co59$11i4$1 digitaldaemon.com...
 Hi there,

 I wonder whether this has been discussed before:

 * How about allowing negative array indices to count backward from the end
 of the array?

That stinks

 * How about allowing to drop one of the bounds of a range to indicate
 beginning or end of the array?

That doesn't.

May 06 2004

J Anderson <REMOVEanderson badmama.com.au> writes:

Norbert Nemec wrote:

Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?
  

Yes it has.

-- 
-Anderson: http://badmama.com.au/~anderson/

May 06 2004

"Matthew" <matthew.hat stlsoft.dot.org> writes:

To be less facetious. The reason it's a very bad idea is that array subscripting
in C and C++ and D can be done with signed integers because it is legal _and
meaningful_ to pass a -ve subscript to mean prior to the given base (pointer
and/or array).

Since D's support of C constructs most certainly encompasses this very
important, albeit dangerous, construct, it would be nonsensical to have
built-in D arrays use a back-relative offset and pointers use -ve offset.
It would be a total killer.
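
A short sketch of the pointer case described above, assuming a present-day D compiler and ordinary (unchecked) pointer code. The array and variable names are made up for illustration.

```d
void main()
{
    int[5] data = [10, 20, 30, 40, 50];
    int* p = &data[2];          // p points at the middle element (30)

    assert(p[0] == 30);
    assert(p[-1] == 20);        // one element before p, still inside data
    assert(p[-2] == 10);
    assert(*(p - 2) == p[-2]);  // pointer indexing is plain pointer arithmetic
}
```

Here a negative subscript means "before the base", which is exactly the meaning that would clash with "counted from the end" on arrays.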

"J Anderson" <REMOVEanderson badmama.com.au> wrote in message
news:c7d5bc$1lsl$1 digitaldaemon.com...
 Norbert Nemec wrote:

Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?

 Yes it has.

 -- 
 -Anderson: http://badmama.com.au/~anderson/

May 06 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Matthew wrote:

 To be less facetious. The reason it's a very bad idea is that array
 subscripting in C and C++ and D can be done with signed integers because
 it is legal _and meaningful_ to pass a -ve subscript to mean prior to the
 given base (pointer and/or array).
 
 Since D's support of C constructs most certainly encompasses this very
 important, albeit dangerous, construct, it would be nonsensical to have
 built-in D arrays use a back-relative offset and pointers use -ve offset.
 It would be a total killer.

OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");

The < as a prefix does not exist yet and would only be valid for indices or
range bounds, just like the .. infix operator. Alternative syntax proposals
are probably easy to find.

In general, some way to indicate this "counting back from the end" really
improves string handling qualities of a language.

Ciao,
Nobbi

May 06 2004

J Anderson <REMOVEanderson badmama.com.au> writes:

Norbert Nemec wrote:

Matthew wrote:

  

To be less facetious. The reason it's a very bad idea is that array
subscripting in C and C++ and D can be done with signed integers because
it is legal _and meaningful_ to pass a -ve subscript to mean prior to the
given base (pointer and/or array).

Since D's support of C constructs most certainly encompasses this very
important, albeit dangerous, construct, it would be nonsensical to have
built-in D arrays use a back-relative offset and pointers use -ve offset.
It would be a total killer.
    

OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");

The < as a prefix does not exist yet and would only be valid for indices or
range bounds, just like the .. infix operator. Alternative syntax proposals
are probably easy to find.

In general, some way to indicate this "counting back from the end" really
improves string handling qualities of a language.

Ciao,
Nobbi
  

This has been discussed before, a few times.  It's only syntax sugar but 
a good idea.

-- 
-Anderson: http://badmama.com.au/~anderson/

May 06 2004

Mark <Mark_member pathlink.com> writes:

OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");


How about defining a symbol for the str.length such as '$', then

char[] str = "0123456789";
assert(str[$-3] == '7');
assert(str[$-4..$] == "6789");
assert(str[4..$-2] == "4567");

and since '$' already has the meaning "end of line" in other contexts
readability is maintained without the clutter of 'str.length'.
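
Present-day D accepts `$` inside an index or slice expression as the length of the array being indexed, so the idea can be tried directly. The sketch assumes such a compiler and uses `string` rather than `char[]`, because current string literals are immutable.

```d
void main()
{
    string str = "0123456789";
    assert(str[$ - 3] == '7');
    assert(str[$ - 4 .. $] == "6789");
    assert(str[4 .. $ - 2] == "4567");
    assert(str[0 .. $ / 2] == "01234");   // any expression in $ works
}
```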

Mark.

May 06 2004

"Matthew" <matthew.hat stlsoft.dot.org> writes:

"Mark" <Mark_member pathlink.com> wrote in message
news:c7dmj7$2lsu$1 digitaldaemon.com...
OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");


 How about defining a symbol for the str.length such as '$', then

 char[] str = "0123456789";
 assert(str[$-3] == '7');
 assert(str[$-4..$] == "6789");
 assert(str[4..$-2] == "4567");

 and since '$' already has the meaning "end of line" in other contexts
 readability is maintained without the clutter of 'str.length'.

Weird city. I was thinking just the same thing, but without the erudite
rationale. Consider yourself "hear, hear"'d!

May 06 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Mark wrote:
 How about defining a symbol for the str.length such as '$', then
 
 char[] str = "0123456789";
 assert(str[$-3] == '7');
 assert(str[$-4..$] == "6789");
 assert(str[4..$-2] == "4567");
 
 and since '$' already has the meaning "end of line" in other contexts
 readability is maintained without the clutter of 'str.length'.

I like that one! It would not only be more readable but also more flexible
than my idea. Consider for example things like:

        str[..$/2]

It would make even more sense for multidimensional arrays, since there
str.length would become str.range[i] with i indexing the different
dimensions. Just consider:
        mymatrix[..$-2,..$-2]
as syntactic sugar for
        mymatrix[0..mymatrix.range[0]-2,0..mymatrix.range[1]-2]

May 06 2004

Derek <ddparnell bigpond.com> writes:

On Thu, 6 May 2004 15:45:43 +0000 (UTC), Mark wrote:

OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");

 
 
 How about defining a symbol for the str.length such as '$', then
 
 char[] str = "0123456789";
 assert(str[$-3] == '7');
 assert(str[$-4..$] == "6789");
 assert(str[4..$-2] == "4567");
 
 and since '$' already has the meaning "end of line" in other contexts
 readability is maintained without the clutter of 'str.length'.

Agreed. I have been advocating this exact same thing for the Euphoria
language for ages now.


-- 
Derek
Melbourne, Australia

May 07 2004

"Harvey Stroud" <hstroud ntlworld.com> writes:

It seems to me that supporting the negative array index of C for the sake of
backward compatibility goes against the design philosophy of D, which, as I
see it, is to keep the general look and feel of C++ while discarding
dubious features, of which -ve array indexing is surely one. Wouldn't it
make sense to remove this dangerous behaviour from the language, or better,
to replace it with an alternative meaning? Besides, how many people out
there actually use indexing in this way? For pointer manipulation it could
be useful, albeit error prone.

Introducing a special operator ($) to denote the length strikes me as
ungainly, making the code more perl-like, but perhaps that's just my dislike
of non-C symbols.

Has anybody given any thought to an [optional] stride value:

int[] x = a[1..10 : 2];    // Gets every other element of the array
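
The `a[1..10 : 2]` form is a suggested syntax only. As a rough equivalent, the sketch below assumes a present-day D compiler with the Phobos standard library, whose `std.range.stride` yields a lazy strided view of a slice; it returns a range, not an `int[]`.

```d
import std.algorithm.comparison : equal;
import std.range : stride;

void main()
{
    int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // Every other element of a[1 .. 10], i.e. indices 1, 3, 5, 7, 9.
    auto x = a[1 .. 10].stride(2);
    assert(x.equal([1, 3, 5, 7, 9]));

    // The view is lazy and refers to the original storage.
    a[3] = 33;
    assert(x.equal([1, 33, 5, 7, 9]));
}
```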

Cheers,
Harvey.


"Matthew" <matthew.hat stlsoft.dot.org> wrote in message
news:c7d5p2$1n2f$1 digitaldaemon.com...
 To be less facetious. The reason it's a very bad idea is that array
 subscripting in C and C++ and D can be done with signed integers because
 it is legal _and meaningful_ to pass a -ve subscript to mean prior to the
 given base (pointer and/or array).

 Since D's support of C constructs most certainly encompasses this very
 important, albeit dangerous, construct, it would be nonsensical to have
 built-in D arrays use a back-relative offset and pointers use -ve offset.
 It would be a total killer.

 "J Anderson" <REMOVEanderson badmama.com.au> wrote in message
 news:c7d5bc$1lsl$1 digitaldaemon.com...
 Norbert Nemec wrote:

Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?

 Yes it has.

 -- 
 -Anderson: http://badmama.com.au/~anderson/

May 09 2004

Stewart Gordon <smjg_1998 yahoo.com> writes:

Harvey Stroud wrote:
 It seems to me that supporting the negative array index of C for the sake of
 backward compatibility goes against the design philosophy for D, which as I
 see it,  is the keeping of the general look and feel of C++ while discarding
 dubious features of which -ve array indexing is surely one?

We already do discard this.  It's called array bounds checking.

 Wouldn't it make sense to remove this dangerous behaviour from the language,
 or better to replace it with an alternative meaning.

There would be a performance hit if we had to check at runtime if every 
index is +ve or -ve, wherever it can't be determined at compile time.

 Besides, how many people out there actually use indexing in this way,
 although maybe for pointer manipulation it could be useful, albeit error
 prone.

Well, D&DP in general is almost bound to be error prone.  But since 
preserving D&DP support is part of D's design strategy, perhaps it 
should be kept.

 Introducing a special operator ($) to denote the length strikes me as
 ungainly, making the code more perl-like, but perhaps that's just my dislike
 of non-C symbols.

Do you have an idea for a nicer symbol to use for this?

 Has anybody given any thought to an [optional] stride value:
 
 int[] x = a[1..10 : 2];    // Gets every other element of the array

<snip top of upside-down reply>

Not yet AFAIK.

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the 
unfortunate victim of intensive mail-bombing at the moment.  Please keep 
replies on the 'group where everyone may benefit.

May 10 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Harvey Stroud wrote:

 It seems to me that supporting the negative array index of C for the sake
 of backward compatibility goes against the design philosophy for D...

For pointers, negative indices actually make sense. If you allow indexing of
raw pointers (which I think is a good idea) then prohibiting negative
indices would be strange. For arrays, negative indices are, of course,
caught by the range checking mechanism.

Raw pointers, of course, are error prone. Anyway it's the philosophy of D to
give the developer all the tools to shoot himself in the foot, but make it
clear what the dangerous tools are, and encourage him to avoid these tools
completely.

 Introducing a special operator ($) to denote the length strikes me as
 ungainly, making the code more perl-like, but perhaps that's just my
 dislike of non-C symbols.

That's just personal taste. $ has no meaning in D so far, and it is a plain
ASCII character. Why not put it to use?

B.t.w: in the suggested meaning, $ would not be a normal operator at all,
but something special that does not exist in D so far: a "zero-ary
operator" or however you want to call it.

 Has anybody given any thought to an [optional] stride value:
 
 int[] x = a[1..10 : 2];    // Gets every other element of the array

See my multidimension array proposal at

http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html

it contains strided slices and much more.

May 10 2004

Stewart Gordon <smjg_1998 yahoo.com> writes:

Norbert Nemec wrote:

<snip>
 See my multidimension array proposal at
 
 http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 
 it contains strided slices and much more.

"The former possibility to assign to the .size property of an array 
should be dropped since it is obscuring what actually happens:"

Did you mean .size or .length?

If .length, then I'm not sure I'd agree with you.

"upsizing on the other hand, means allocating new memory and copying the 
existing data. For this operation, a different syntax should be found 
that makes clear what is happening."

Not necessarily.  It could mean simply changing the range, filling up 
already allocated growing space.

The point of D isn't to have the programmer concern him/herself with the 
inner workings of everything.  If they wanted that, they'd probably use 
assembly.  Or maybe compromise with plain old C.

The idea of D is to support syntax that makes sense to the human 
programmer, while allowing the compiler to implement it efficiently.

         "M[a][b] = new mytype();

Be aware of the difference between the type declaration mytype[B][A] and 
the dereferenciation mytype[a][b]."

You mean "the dereferenciation M[a][b]"?

"In its full generality, this internal representation would, of course, 
allow all kinds of weird shapes and self-overlappings."

And word/dword-alignment of rows where the individual elements may be 
byte-aligned, if there are benefits to that.

"The property .diag() sums up all strides and returns a one-dimensional 
array reference corresponding to the total diagonal of the original array."

What if the array isn't square/cube/tesseract/general hypercube?  Would 
it just count the shortest dimension, i.e. as far as the diagonal 
remains inside the array?

What conversions would be allowed between multidimensional arrays and 
old-fashioned D linear arrays?  Even something that can be understood by 
third-party foreign code?

I'd also suggest running the proposal through a spellchecker at some point.

Stewart.


May 10 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Stewart Gordon wrote:

 Norbert Nemec wrote:
 
 <snip>
 See my multidimension array proposal at
 


http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 
 it contains strided slices and much more.

 
 "The former possibility to assign to the .size property of an array
 should be dropped since it is obscuring what actually happens:"
 
 Did you mean .size or .length?

I meant .length, of course. Sorry about that error.

 If .length, then I'm not sure I'd agree with you.
 
 "upsizing on the other hand, means allocating new memory and copying the
 existing data. For this operation, a different syntax should be found
 that makes clear what is happening."
 
 Not necessarily.  It could mean simply changing the range, filling up
 already allocated growing space.

Even worse! If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets
completely uncontrollable.

The current situation is: dynamic arrays actually are references to the
heap. Two arrays may reference the same portion of the heap, so changing
one will change the other. Anyhow, the language does everything to obscure
this fact and make it rather hard to predict, when it happens. Unless you
really know the details, you will often call .dup without need, and, in the
other way, you will have trouble if you trust that two arrays refer to the
same space.
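
A minimal sketch of the reference behaviour described here, assuming a present-day D compiler. The variable names are illustrative only.

```d
void main()
{
    int[] a = [1, 2, 3, 4];
    int[] b = a;          // b refers to the same elements as a
    int[] c = a.dup;      // c is an independent copy

    b[0] = 99;
    assert(a[0] == 99);   // the change is visible through a
    assert(c[0] == 1);    // the copy is unaffected
    assert(a.ptr is b.ptr && a.ptr !is c.ptr);
}
```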

 The point of D isn't to have the programmer concern him/herself with the
 inner workings of everything.  If they wanted that, they'd probably use
 assembly.  Or maybe compromise with plain old C.

That is true, but I'm not talking about implementation details but about
semantics. Dynamic arrays are references and behave like references. The
language tries to hide this from the developer but does not do so
completely, resulting in behaviour that is hard to predict and hard to
understand.

 The idea of D is to support syntax that makes sense to the human
 programmer, while allowing the compiler to implement it efficiently.

True, but if you hide implementation details, this should be done
completely. There is a semantic difference between reference and value
types. Currently, dynamic arrays behave somewhere in between, making it
very confusing to use them efficiently.

          "M[a][b] = new mytype();
 
 Be aware of the difference between the type declaration mytype[B][A] and
 the dereferenciation mytype[a][b]."
 
 You mean "the dereferenciation M[a][b]"?

True, another typo.

 "The property .diag() sums up all strides and returns a one-dimensional
 array reference corresponding to the total diagonal of the original
 array."
 
 What if the array isn't square/cube/tesseract/general hypercube?  Would
 it just count the shortest dimension, i.e. as far as the diagonal
 remains inside the array?

True, I thought about picking the smallest range. Should have said so, I
guess.
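
The "smallest range" rule can be sketched on a flat, row-by-row array. This is an illustration under assumptions, written for a present-day D compiler: the function `diag` below is hypothetical and, unlike the `.diag()` property in the proposal, it copies the elements instead of returning a strided reference.

```d
import std.algorithm.comparison : min;

// Hypothetical helper: collects the main diagonal of a rows x cols matrix
// stored row by row in a flat array, stopping at the shorter dimension.
T[] diag(T)(const(T)[] data, size_t rows, size_t cols)
{
    assert(data.length == rows * cols);
    T[] result;
    foreach (i; 0 .. min(rows, cols))
        result ~= data[i * cols + i];   // consecutive elements are cols + 1 apart
    return result;
}

void main()
{
    // 2 x 3 matrix: [1 2 3; 4 5 6]
    int[] m = [1, 2, 3, 4, 5, 6];
    assert(diag(m, 2, 3) == [1, 5]);
}
```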

 What conversions would be allowed between multidimensional arrays and
 old-fashioned D linear arrays?

Old-fashioned D arrays, or "lightweight array references" as they are called
in the proposal, are implicitly cast to mytype[[1]] (trivially setting
the stride to 1). In the other direction, a direct cast does not make sense,
since the stride might be != 1. I was thinking about having mytype[[1]].dup
return a mytype[] reference. This would avoid the need for another language
construct. Not sure about it yet.

 Even something that can be understood by third-party foreign code?

Conversion to Fortran arrays is already trivially possible. (A few
convenience functions might make it even more comfortable.) Everything else
should be easy to implement.

 I'd also suggest running the proposal through a spellchecker at some
 point.

Sorry - missed out on that, I guess... :-(

May 10 2004

J Anderson <REMOVEanderson badmama.com.au> writes:

Norbert Nemec wrote:

If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets
completely uncontrollable.
  

It won't get uncontrollable if you're careful not to create persistent
aliases of the same memory location. You would have to do that any way you
look at it.  Just because C had malloc and realloc didn't change this 
problem at all.  Please give a good source example of where D arrays 
fail you.

-- 
-Anderson: http://badmama.com.au/~anderson/

May 10 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

J Anderson wrote:

 Norbert Nemec wrote:
 
If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets
completely uncontrollable.
  

 It won't get uncontrollable if you're careful not to create persistent
 aliases of the same memory location. You would have to do that any way you
 look at it.  Just because C had malloc and realloc didn't change this
 problem at all.  Please give a good source example of where D arrays
 fail you.

What does the following routine return:

---------------------------
char myroutine(char[] input, uint param) {
        char[] strB = input;    // strB aliases input
        strB.length = param;    // may or may not reallocate strB
        input[0] = 'x';
        return strB[0];         // 'x' only if strB still aliases input
}
---------------------------

admittedly, you will probably call strB a persistent alias and tell me to
avoid it, but how would I know? The language spec sounds like: If you want
to make sure your array is unique, call .dup - otherwise nothing is
guaranteed. This will result in people calling .dup unnecessarily, just
because they are frightened of the "nothing is guaranteed". Actually: if
you don't know the implementation details, you just have to build in .dups
that are probably unnecessary.
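
The uncertainty can be made visible by comparing pointers after a resize. This is a sketch only, assuming a present-day D compiler and runtime; `sameStart` is a hypothetical helper, and whether the growing case reallocates is deliberately not assumed.

```d
// Returns true when both slices start at the same memory address.
bool sameStart(const(char)[] x, const(char)[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    char[] input = "hello".dup;

    // Shrinking never moves the data, so the alias survives.
    char[] shrunk = input;
    shrunk.length = 2;
    assert(sameStart(input, shrunk));

    // Growing may extend in place or reallocate; which one happens is
    // up to the runtime, so the code checks instead of assuming.
    char[] grown = input;
    grown.length = input.length + 1000;
    bool aliased = sameStart(input, grown);

    input[0] = 'x';
    assert(shrunk[0] == 'x');
    assert((grown[0] == 'x') == aliased);
}
```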

Furthermore, sometimes, you might actually be interested in having
definitely aliased arrays. The language spec, though, does not give you
much certainty that an implementation might not suddenly call .dup for some
reason.

May 10 2004

J Anderson <REMOVEanderson badmama.com.au> writes:

Norbert Nemec wrote:

J Anderson wrote:

  

Norbert Nemec wrote:

    

If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets
completely uncontrollable.
 
      

It won't get uncontrollable if you're careful not to create persistent
aliases of the same memory location. You would have to do that any way you
look at it.  Just because C had malloc and realloc didn't change this
problem at all.  Please give a good source example of where D arrays
fail you.
    

What does the following routine return:

---------------------------
char myroutine(char[] input, uint param) {
        char[] strB = input;    // strB aliases input
        strB.length = param;    // may or may not reallocate strB
        input[0] = 'x';
        return strB[0];         // 'x' only if strB still aliases input
}
---------------------------
  

admittedly, you will probably call strB a persistent alias and tell me to
avoid it, but how would I know? 

You've just answered your own question.

The language spec sounds like: If you want
to make sure your array is unique, call .dup - otherwise nothing is
guaranteed. This will result in people calling .dup unnecessarily, just
because they are frightened of the "nothing is guaranteed". 
  

That is not true; it's quite easy to learn how D arrays behave.  You only
need to use dup if you want to modify a copy of the array, that is, when you
don't want to modify the original array.

Actually: if
you don't know the implementation details, you just have to build in .dups
that are probably unnecessary.

You should know what the function you call does, otherwise why call it?
Functions that modify the size of an array are generally very easy to
spot and are rare (most of the array resizing should be encapsulated within
its own module).

Changing the name does not help things one bit.  It's performance reasons
that make arrays like this necessary.  If you want guaranteed array
positions, wrap the array in a class and create something like STL's slow
vector class.

Furthermore, sometimes, you might actually be interested in having
definitely aliased arrays. The language spec, though, does not give you
much certainty that an implementation might not suddenly call .dup for some
reason.
  

This is a design issue.  Use pointers to the array, or wrap them in classes.

-- 
-Anderson: http://badmama.com.au/~anderson/

May 10 2004

Stewart Gordon <smjg_1998 yahoo.com> writes:

Norbert Nemec wrote:

 Stewart Gordon wrote:

<snip>
 The current situation is: dynamic arrays actually are references to 
 the heap. Two arrays may reference the same portion of the heap, so 
 changing one will change the other. Anyhow, the language does 
 everything to obscure this fact and make it rather hard to predict, 
 when it happens. Unless you really know the details, you will often 
 call .dup without need,

If you want to guarantee that it's a separate copy, of course you'd call
dup.

Of course, a decent compiler would coalesce two statements

	int[] qwert = yuiop.dup;
	qwert.length = asdfg;

into a single allocation operation.

 and, in the other way, you will have trouble if you trust that two 
 arrays refer to the same space.

To which someone might say, "Don't do that then!"

At the moment I can see little use for wanting to access one array by
what's effectively another, longer array.

<snip>
 Conversion to Fortran arrays is already trivially possible. (A few 
 convenience functions might make it even more comfortable.) 
 Everything else should be easy to implement.

<snip>

True, if the strides remain those for a new array.  But if you've been
playing with strided/block/diagonal slicing, then unless Fortran arrays
support striding on this level, you'd need to do some rearrangement.

Stewart.


May 11 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Stewart Gordon wrote:

 Norbert Nemec wrote:
 
 and, in the other way, you will have trouble if you trust that two
 arrays refer to the same space.

 
 To which someone might say, "Don't do that then!"
 
 At the moment I can see little use for wanting to access one array by
 what's effectively another, longer array.

For strings, it might not be very useful. For arrays in general, though,
there are many cases where it really is extremely useful. Imagine a 1GB
array in memory, maybe representing a huge multidimensional matrix or
whatever. You would really want to be able to handle multiple references to
portions of that data in a comfortable way without the risk of suddenly
getting a copy unintentionally.

 <snip>
 Conversion to Fortran arrays is already trivially possible. (A few
 convenience functions might make it even more comfortable.)
 Everything else should be easy to implement.

 <snip>
 
 True, if the strides remain those for a new array.  But if you've been
 playing with strided/block/diagonal slicing, then unless Fortran arrays
 support striding on this level, you'd need to do some rearrangement.

Of course. If a given fortran routine expects data aligned in memory in a
given way, you might need to copy the data to that alignment before passing
a reference to the fortran routine. Anyhow: if D is able to handle arrays
in arbitrary alignment and striding, you may often be able to handle the
data in Fortran alignment for a long time without necessary conversions. 

Example: get an array from Fortran, use a D-library function on it, pass it
back to Fortran. No conversion necessary, because the D library can easily
handle the array no matter how it is aligned in memory, because the
alignment information is fully enclosed in the array reference with minimal
(with good optimization: negligible) overhead in terms of access time.
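
How an array reference that carries its own strides can read Fortran-ordered data without copying may be sketched as follows. The `View2D` type is hypothetical and much simpler than the proposal's arrays; the code assumes a present-day D compiler.

```d
// Hypothetical view type: describes a 2-D array by explicit strides, so the
// same code can read row-major (C-style) and column-major (Fortran-style) data.
struct View2D
{
    double[] data;
    size_t rowStride, colStride;

    ref double opIndex(size_t i, size_t j)
    {
        return data[i * rowStride + j * colStride];
    }
}

void main()
{
    // The 2 x 3 matrix [1 2 3; 4 5 6] in both layouts.
    double[] rowMajor = [1, 2, 3, 4, 5, 6];
    double[] colMajor = [1, 4, 2, 5, 3, 6];

    auto c = View2D(rowMajor, 3, 1);
    auto f = View2D(colMajor, 1, 2);

    foreach (i; 0 .. 2)
        foreach (j; 0 .. 3)
            assert(c[i, j] == f[i, j]);
}
```

Only the two stride values differ between the views; the element data is never rearranged.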

Furthermore: writing a wrapper for a Fortran library, the wrapper can do all
necessary conversions automatically, without doing any unnecessary
conversions back and forth.

May 11 2004

Stewart Gordon <smjg_1998 yahoo.com> writes:

Norbert Nemec wrote:

<snip>
 For strings, it might not be very useful. For arrays in general, though,
 there are many cases where it really is extremely useful. Imagine a 1GB
 array in memory, maybe representing a huge multidimensional matrix or
 whatever. You would really want to be able to handle multiple references to
 portions of that data in a comfortable way without the risk of suddenly
 getting a copy unintentionally.

<snip>

You can, if you allocate the matrix first and then start creating 
windows of it.  Slice references don't unintentionally turn into copies. 
   (Of course, I'm not sure what happens if you increase the length of a 
slice reference, but if that's an issue you'd avoid it anyway for this 
kind of work.)  As long as the matrix doesn't grow, you're safe.
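
A small sketch of the "allocate first, then create windows" approach, assuming a present-day D compiler. The sizes and names are illustrative.

```d
void main()
{
    enum size_t rows = 3;
    enum size_t cols = 4;
    auto matrix = new double[](rows * cols);   // allocated once, never resized
    matrix[] = 0.0;

    // A window onto row 1; no elements are copied.
    double[] row1 = matrix[1 * cols .. 2 * cols];
    row1[2] = 7.5;

    assert(matrix[1 * cols + 2] == 7.5);    // the write went to the shared storage
    assert(row1.ptr is matrix.ptr + cols);  // the window starts inside the matrix
}
```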

If the matrix wants to be variable in size, you can still treat it as 
being one size (a reasonable maximum, whatever that may be) for 
allocation purposes.  Of course, if no maximum is reasonable, or you 
bump into an unreasonable circumstance, you'd need to deal with 
reallocation whether the .length property is there and assignable or not.

Stewart.


May 11 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Stewart Gordon wrote:

 Norbert Nemec wrote:
 
 <snip>
 For strings, it might not be very useful. For arrays in general, though,
 there are many cases where it really is extremely useful. Imagine a 1GB
 array in memory, maybe representing a huge multidimensional matrix or
 whatever. You would really want to be able to handle multiple references
 to portions of that data in a comfortable way without the risk of
 suddenly getting a copy unintentionally.

 <snip>
 
 You can, if you allocate the matrix first and then start creating
 windows of it.  Slice references don't unintentionally turn into copies.
    (Of course, I'm not sure what happens if you increase the length of a
 slice reference, but if that's an issue you'd avoid it anyway for this
 kind of work.)  As long as the matrix doesn't grow, you're safe.

I guess it is just a question of documenting clearly what happens. It should
just be absolutely clear which operations might copy data.

By now, I have even been convinced to cut the paragraph about making .length
read only. Anyhow: it definitely has to be documented in which way it
works, what exactly .dup does, etc.

 If the matrix wants to be variable in size, you can still treat it as
 being one size (a reasonable maximum, whatever that may be) for
 allocation purposes.  Of course, if no maximum is reasonable, or you
 bump into an unreasonable circumstance, you'd need to deal with
 reallocation whether the .length property is there and assignable or not.

OK. I guess I'll just accept that the behaviour of upsizing by assigning to
.length is not predictable if you don't know where the array reference came
from.
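The uncertainty can be probed with a small sketch like the one below. It deliberately makes no claim about whether growing the window reallocates, since that depends on the runtime and on where the slice came from; it only reports what happened. The helper `sameStart` is a made-up name for illustration.

```d
import std.stdio : writeln;

/// Hypothetical helper: do two slices begin at the same address?
bool sameStart(const(int)[] a, const(int)[] b)
{
    return a.ptr is b.ptr;
}

void main()
{
    int[] whole = new int[10];
    int[] window = whole[0 .. 4];

    // Before any resizing the window is a plain view: nothing was copied.
    assert(sameStart(whole, window));

    // Growing the window by assigning to .length: the runtime is free to
    // extend in place or to move the data to a new block.
    window.length = 6;
    writeln("window still aliases whole after growing: ",
            sameStart(whole, window));

    // .dup is the operation that guarantees separate storage,
    // whatever the history of the slice.
    int[] separate = whole.dup;
    assert(!sameStart(whole, separate));
}
```

Code that must keep a view should therefore never assign to its `.length`, and code that needs its own copy should call `.dup` rather than rely on a reallocation.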

B.t.w.: assigning to range[] in my multidimensional arrays is even more
tricky, since you have to consider the full shape to see whether upsizing
in place might be possible. I'm still not sure whether it might be best to
allow assignment to length in one-dimensional arrays, but leave the range
property read-only.

May 11 2004

"Walter" <newshound digitalmars.com> writes:

"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:c7nj4j$218d$1 digitaldaemon.com...
 See my multidimension array proposal at

http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 it contains strided slices and much more.

Could you put that into the D wiki?

May 10 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Walter wrote:

 
 "Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
 news:c7nj4j$218d$1 digitaldaemon.com...
 See my multidimension array proposal at


http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 it contains strided slices and much more.

 
 Could you put that into the D wiki?

I have a long list of changes and corrections I want to work in first.
Stewart Gordon and Ben Hinkle gave me some very valuable input that should
find its way into the proposal. Once I'm satisfied with it and the most
controversial points are solved, it is, of course, up to you, Walter, to do
with it what you like.

May 10 2004

"Harvey Stroud" <hstroud ntlworld.com> writes:

----- Original Message ----- 
From: "Norbert Nemec" <Norbert.Nemec gmx.de>
Newsgroups: digitalmars.D
Sent: Monday, May 10, 2004 10:48 AM
Subject: Re: Negative array indices?

 Harvey Stroud wrote:

 It seems to me that supporting the negative array index of C for the sake
 of backward compatibility goes against the design philosophy for D...

 For pointers, negative indices actually make sense. If you allow indexing of
 raw pointers (which I think is a good idea) then prohibiting negative
 indices would be strange. For arrays, negative indices are, of course,
 caught by the range checking mechanism.

I think I should have read the language spec more before posting, as I was
assuming from the following that -ve indices were valid for arrays:

  "The reason it's a very bad idea is that array subscripting
  in C and C++ and D can be done with signed integers because it is legal
  _and meaningful_ to pass a -ve subscript to mean prior to the given base
  (pointer and/or array)."

Of course, this isn't quite the case with arrays as runtime bounds checking
won't allow this, although whether switching off this mechanism via a
compiler switch would circumvent this I'm not sure.

I can see why the introduction of -ve indices to have a different behaviour
would impose (slight) overhead on the runtime, and while this overhead must
be already present with bounds checking, at least the latter is optional and
may be compiled out. With -ve indexing implying a different semantic the
checking would always have to remain regardless, unless it was only allowed
for (compile time detectable) literals, which is bad as it wouldn't be
orthogonal.
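To make the distinction concrete, here is a rough sketch of the three cases discussed: a negative index on a raw pointer, a negative index on an array, and a hypothetical "count from the end" meaning for negative indices. The functions `previous`, `rejected` and `wrapped` are invented for illustration. `rejected` assumes a build with bounds checking enabled and would not behave this way with the checks compiled out.

```d
import core.exception : RangeError;
import std.stdio : writeln;

/// Raw pointer: p[-1] means "the element before p", i.e. *(p - 1).
/// Only valid if p does not point at the start of its memory block.
int previous(const(int)* p)
{
    return p[-1];
}

/// Array: a negative index converts to a huge unsigned value, which the
/// bounds check rejects (assumption: bounds checking is enabled).
bool rejected(const(int)[] a, int i)
{
    try
    {
        int unused = a[i];
        return false;
    }
    catch (RangeError e)
    {
        return true;
    }
}

/// Hypothetical "negative counts from the end" semantics.
int wrapped(const(int)[] a, ptrdiff_t i)
{
    // This comparison is the overhead under discussion: it is part of the
    // meaning of the index, so it cannot be compiled out like a bounds check.
    immutable size_t k = i < 0 ? a.length - cast(size_t)(-i) : cast(size_t) i;
    return a[k];
}

void main()
{
    int[] a = [10, 20, 30, 40];
    assert(previous(&a[2]) == 20);
    assert(wrapped(a, -1) == 40);
    assert(wrapped(a, 1) == 20);
    writeln("a[-1] rejected by the bounds check: ", rejected(a, -1));
}
```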

 Raw pointers, of course, are error prone. Anyway it's the philosophy of D to
 give the developer all the tools to shoot himself in the foot, but make it
 clear what the dangerous tools are, and encourage him to avoid these tools
 completely.

Yup, I completely agree. If the programmer still wants the raw power of
pointers then let them have it. Btw, I wasn't suggesting that -ve indexing
for such pointers should be prohibited - that would just be wacky.

 Introducing a special operator ($) to denote the length strikes me as
 ungainly, making the code more perl-like, but perhaps that's just my
 dislike of non-C symbols.

 That's just personal taste. $ has no meaning in D so far, and it is a plain
 ASCII character. Why not put it to use?

Agreed, just my preference.  I think what I don't like about it is that it's
an arbitrary symbol denoting some magic value. To the uninitiated it looks
odd. OK, it wouldn't take long to get used to but still, it seems a step
in the direction perl has taken in using such arbitrary symbols, and look how
unreadable that is. Probably a very minor point though.
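For readers judging the readability question: in D as it exists today, `$` inside square brackets stands for the length of the array being indexed. A short sketch of how that reads follows; the helper names `last` and `tail` are invented for illustration.

```d
import std.stdio : writeln;

/// Last element, written with `$` instead of a.length.
int last(const(int)[] a)
{
    return a[$ - 1];
}

/// The final n elements as a view (no copy).
const(int)[] tail(const(int)[] a, size_t n)
{
    return a[$ - n .. $];
}

void main()
{
    int[] a = [1, 2, 3, 4, 5];

    assert(last(a) == 5);
    assert(tail(a, 2) == [4, 5]);

    // `$` is only shorthand: both slices below denote the same elements.
    assert(a[1 .. $] == a[1 .. a.length]);

    writeln("last = ", last(a), ", tail = ", tail(a, 2));
}
```

Because `$` is resolved to the length of the enclosing array expression, it needs no runtime test on the sign of the index.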

 B.t.w: in the suggested meaning, $ would not be a normal operator at all,
 but something special that does not exist in D so far: a "zero-ary
 operator" or however you want to call it.

 Has anybody given any thought to an [optional] stride value:

 int[] x = a[1..10 : 2];    // Gets every other element of the array

 See my multidimension array proposal at

http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 it contains strided slices and much more.

Wow, there's a lot to chew over in that doc!  I've only had chance to skim
it so far but it looks like a lot of good thought's gone into it. I really
like the notation of the indices being within the same set of brackets
a[m,n] for rectangular arrays as this suggests a tighter coupling of the
array elements than the a[n][m] notation for dynamic arrays; both notations
are appropriate to reflect the underlying nature of the data types. I look
forward to seeing your next draft.

Cheers,
Harvey.
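The strided slice quoted above, `a[1..10 : 2]`, is proposed syntax and not something the compiler accepts. As a rough stand-in, the same selection can be sketched with the standard library's `std.range.stride`. The wrapper `everyNth` is a made-up name, and unlike the proposal it copies the selected elements into a new array instead of returning a strided view.

```d
import std.array : array;
import std.range : stride;
import std.stdio : writeln;

/// Hypothetical helper: elements lo, lo+step, lo+2*step, ... below hi.
/// `stride` itself is lazy; `.array` copies the result into a new int[].
int[] everyNth(int[] a, size_t lo, size_t hi, size_t step)
{
    return a[lo .. hi].stride(step).array;
}

void main()
{
    int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // Stand-in for the proposed a[1..10 : 2].
    int[] x = everyNth(a, 1, 10, 2);
    assert(x == [1, 3, 5, 7, 9]);

    writeln(x);
}
```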

May 11 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Harvey Stroud wrote:

 ----- Original Message -----
 From: "Norbert Nemec" <Norbert.Nemec gmx.de>
 See my multidimension array proposal at

http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html

 it contains strided slices and much more.

 
 Wow, there's a lot to chew over in that doc!  I've only had chance to skim
 it so far but it looks like a lot of good thought's gone into it. I really
 like the notation of the indices being within the same set of brackets
 a[m,n] for rectangular arrays as this suggests a tighter coupling of the
 array elements than the a[n][m] notation for dynamic arrays; both
 notations are appropriate to reflect the underlying nature of the data
 types. I look forward to seeing your next draft.

Thanks. The basic idea still is rather simple, but explaining it in detail
really took more effort than I myself would have expected.

I would suggest waiting for the next version of the proposal before reading
it in detail. I have a number of changes to make already, and running it
through a spellchecker might also improve readability...

May 11 2004
