D - Thought on Array Type Syntax

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

What if we declare array variables and types with the indices on the 
left?  Then we can still "build up" from basic types but still have 
in-order array indices:
	[type_foo][]int var;
which would be an associative array (indexed by 'type_foo') of dynamic 
arrays of int.  var would be accessed in the same index-order as the 
declaration:
	var[foo_index][int_index] = <val>;

Interestingly, this also makes it easy and nonambiguous when you are 
mixing array and pointer modifiers:
	[type_foo]*[]*int var2;
would be an associative array of pointers to dynamic arrays to pointers 
to int.

I know that this looks horribly backwards to all of us who grew up on C. 
But the reverse-index problem is also horribly backwards, and it is 
NOT immediately obvious to new programmers (or even to old programmers, 
like me).  Maybe this ugliness is better than the current one?
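
For comparison, here is a minimal sketch of how the existing declaration order reads. It assumes present-day D (the `string` alias and array literals are newer than this thread), and the names `table`, `nested`, and `grid` are illustrative only.

```d
// Illustrative names only; written against present-day D, not the 2003 compiler.

// Type suffixes are read right to left: an associative array keyed by
// string, whose values are dynamic arrays of int.
int[][string] table;

// Right to left again: an associative array of pointers to dynamic arrays
// of pointers to int (the current spelling of the var2 example).
int*[]*[string] nested;

// 3 rows, each row a static array of 4 ints.
int[4][3] grid;

static assert(is(typeof(grid[0]) == int[4]));

void main()
{
    table["primes"] = [2, 3, 5];
    assert(table["primes"][1] == 3);   // key first, then element index

    assert(grid.length == 3);          // the outer dimension is the rightmost one
    grid[2][3] = 7;                    // row 2 (of 3), column 3 (of 4)
    assert(grid[2][3] == 7);
}
```

The indices at the point of use appear in the opposite order from the suffixes in the declaration, which is the reverse-index problem described above.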

AN INTERESTING SIDE-EFFECT is that now it is (more) obvious from the 
syntax that all variables in a multiple-declaration are the same type. 
In the current syntax,
	int *var3,var4;
declares two pointers, but looks (at first glance) like it's one pointer 
and one int.  But this syntax is more obvious (at least to me):
	*int var3,var4;

Thoughts, anyone?

Jun 30 2003

Mark Evans <Mark_member pathlink.com> writes:

The question is really, Why use square brackets for so many purposes?  Any
syntax element with multiple meanings will be painful.  In C and D alike, square
brackets both declare and index arrays.  The confusion grows in D with the
proliferation of arrays:  VLA's, associative array declarations, strings, and
slicing.  D is overloading the [] syntax beyond reasonable limits.

One can hack through the undergrowth in a simple way.  Separate type
declarations from indexing.  Reserve [] for indexing and slicing only.  There
are other ways to declare types.

A phrase like 'array(int,N,fixed)' could declare an N-dimensional array of int,
'array(double,N,variable)' a VLA of doubles with initial size N.  A mixed case
might read 'array(array(int,N,fixed),M,variable)'.  If you don't like these
notions, invent your own.  There are no limits except to keep [] out of the type
declarations.
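
One way to picture the idea is as a compile-time function from parameters to a type. The sketch below approximates it with a template in present-day D; `Kind`, `Array`, and `make` are hypothetical names invented for illustration, not part of the proposal or of any library.

```d
// Hypothetical names (Kind, Array, make); a sketch in present-day D.
enum Kind { fixed, variable }

// Parameters go in, a type comes out.
template Array(T, size_t n, Kind kind)
{
    static if (kind == Kind.fixed)
        alias Array = T[n];   // static array of exactly n elements
    else
        alias Array = T[];    // dynamic array; n is only an initial size
}

// Build a value of the requested type, using n as the initial length.
auto make(T, size_t n, Kind kind)()
{
    static if (kind == Kind.fixed)
        return (T[n]).init;
    else
        return new T[n];
}

static assert(is(Array!(int, 4, Kind.fixed) == int[4]));
static assert(is(Array!(double, 8, Kind.variable) == double[]));

// The mixed case: a variable-length array of fixed 4-element arrays.
static assert(is(Array!(Array!(int, 4, Kind.fixed), 2, Kind.variable) == int[4][]));

void main()
{
    auto v = make!(double, 8, Kind.variable)();
    assert(v.length == 8);
}
```

Note that the initial size of a variable array cannot live in the type itself, so this sketch moves it into the hypothetical `make` helper.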

If D is finally going to break with C syntax (hooray), then go all the way and
do it right.  I'm not holding my breath, but that would be my input.

Mark

Jun 30 2003

"Sean L. Palmer" <palmer.sean verizon.net> writes:

You may be on the right track, but overloading ( ) more to avoid overloading
[ ] doesn't make much sense.

We need more brackets.

I think Unicode may have a few more brackets that may be useful.   ;)

Some mileage may be gained by using double square brackets for declarations,
thusly:

[[4]][[]]int wierdarray;

Since it's normally invalid syntax to have nested square brackets, this
should be unambiguous.

Sean

Jun 30 2003

Mark Evans <Mark_member pathlink.com> writes:

You may be on the right track, but overloading ( ) more to avoid overloading
[ ] doesn't make much sense.
We need more brackets.

It matters little except to avoid [].  Maybe (), <>, {}, (()), or XMLisms could
work.  The main idea is to use a self-closing, nesting syntax instead of C's
flat syntax.

A pseudo-functional form stands to reason because the type declaration is a kind
of compile-time function.  Parameters go in, a type comes out.  One could argue
that <> makes more sense from a C++ familiarity and semantics standpoint.  I
would not quibble over such details.

Here is my quibble.  Instead of making the input parameters clear, C and D use
cryptic, subtle clues:  [] vs. [N], embedding the symbol inside its own type
signature, and using implicit rules of precedence and associativity.  Explicit
parameters make more sense.

Writing very involved C and C++ type definitions teaches one to compose
typedefs with each other, avoiding C syntax completely at almost every step.
This procedure is tantamount to shutting the language down and suggests that
something is very wrong with it.

http://compilers.iecc.com/comparch/article/03-06-010

"This is very true. When computer languages skirt the edge of
ambiguity, people often write things they think are correct, but which
are actually logical errors. For example, most people assume
left-associative exponentiation, but right-associative exponentiation
is also a valid interpretation of the mathematics and concepts
involved.

So if your language has an exponentiation operator, you have to make
an explicit decision and specify it: is exponentiation
left-associative, right-associative, or do you require parens or the
equivalent to make it explicit? And after getting bitten a few times
anyway, which inevitably happens, most programmers learn to use
parentheses defensively, to prevent exactly that kind of ambiguity,
even when the language has a rule for resolving it. That is, even in a
language that has a rule for resolving a semantic ambiguity, people
have to think about it and defend against misinterpretation - as much
by themselves as by the language system.

I've been bitten this way by C's address-of and dereferencing
operators not associating the way I expect them to and requiring
parentheses to disambiguate, many times. And now I just use parens as
part of those operators because I don't want to sweat out some obscure
bug caused by me taking one view of how something would be parsed and
the compiler taking another (my LISP background shows here, I guess)."
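
To make the associativity point concrete, here is a small sketch using the `^^` operator that present-day D provides (it did not exist when this thread was written). The `std.math` import is a defensive assumption, since some compiler versions have required it for `^^`.

```d
import std.math;   // imported defensively; some compiler versions need it for ^^

// In present-day D, ^^ is right-associative.
static assert(2 ^^ 3 ^^ 2 == 512);     // parsed as 2 ^^ (3 ^^ 2)

// The left-associative reading has to be spelled out with parentheses.
static assert((2 ^^ 3) ^^ 2 == 64);

// ^^ also binds tighter than unary minus.
static assert(-2 ^^ 2 == -4);          // parsed as -(2 ^^ 2)

void main() {}
```

Both groupings are legal expressions with different values, which is why defensive parentheses remain a reasonable habit even when the language fixes a rule.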

Jul 01 2003

"Fabian Giesen" <rygNO SPAMgmx.net> writes:

 The question is really, Why use square brackets for so many purposes?
 Any syntax element with multiple meanings will be painful.  In C and
 D alike, square brackets both declare and index arrays.  The
 confusion grows in D with the proliferation of arrays:  VLA's,
 associative array declarations, strings, and slicing.  D is
 overloading the [] syntax beyond reasonable limits.

I don't really think so. VLAs are still arrays, and so are strings (which
inherit the C notion of being an "array of characters", even though the
way D specifies arrays makes it *much* safer than the C variant). That
associative arrays behave like their non-associative counterparts is,
given the name, pretty obvious to me.

That leaves the issue of slicing - but when you go the other way round and
view array indexing as a special case of slicing (which it is), we're at
exactly 2 uses: array declaration and array slicing. The whole point of
C-style declarations being that a declaration looks just like the actual
use (cf. pointers), I don't see much of an issue with that.
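
The "indexing is a special case of slicing" view can be sketched as follows, assuming present-day D array literals; the variable names are illustrative only.

```d
void main()
{
    int[] a = [10, 20, 30, 40];

    int[] mid = a[1 .. 3];             // slice: elements 1 and 2
    assert(mid == [20, 30]);

    // Indexing behaves like taking a one-element slice and reading
    // its only element.
    assert(a[2 .. 3].length == 1);
    assert(a[2 .. 3][0] == a[2]);

    // A slice is a view: writing through it changes the original array.
    mid[0] = 99;
    assert(a[1] == 99);
}
```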

 One can hack through the undergrowth in a simple way.  Separate type
 declarations from indexing.  Reserve [] for indexing and slicing
 only.  There are other ways to declare types.

 A phrase like 'array(int,N,fixed)' could declare an N-dimensional
 array of int, 'array(double,N,variable)' a VLA of doubles with
 initial size N.  A mixed case might read
 'array(array(int,N,fixed),M,variable)'.  If you don't like these
 notions, invent your own.  There are no limits except to keep [] out
 of the type declarations.

I really don't see much point in this - as said, the idea about C
declaration syntax *was* to look and behave like actual code, and
the awful syntax of function pointers aside I think this both makes
sense and is intuitive. What is your problem with that notion, exactly?

-fg

Jul 01 2003

"Sean L. Palmer" <palmer.sean verizon.net> writes:

This is a terse version of Pascal type specifier syntax, which reads left to
right.  array[0..3] of ^ foo, or something of that nature... my Pascal days
are rapidly becoming a faded memory.

It is also almost exactly the method I chose for my old scripting language,
now known as Scrap.  ;)

Very simple to parse.  Very easy to remember, or to read.  I recommend it
thoroughly.

Sean

Jun 30 2003

"Fabian Giesen" <rygNO SPAMgmx.net> writes:

 AN INTERESTING SIDE-EFFECT is that now it is (more) obvious from the
 syntax that all variables in a multiple-declaration are the same type.
 In the current syntax,
 int *var3,var4;
 declares two pointers, but looks (at first glance) like it's one
 pointer and one int.  But this syntax is more obvious (at least to
 me): *int var3,var4;

I'd rather simply not use the "int *var3,var4" notation in D code anymore,
but the perfectly legal and far more descriptive variant "int* var3,var4"
instead :)

-fg

Jul 01 2003

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

Fabian Giesen wrote:
AN INTERESTING SIDE-EFFECT is that now it is (more) obvious from the
syntax that all variables in a multiple-declaration are the same type.
In the current syntax,
int *var3,var4;
declares two pointers, but looks (at first glance) like it's one
pointer and one int.  But this syntax is more obvious (at least to
me): *int var3,var4;

 I'd rather simply not use the "int *var3,var4" notation in D code anymore,
 but the perfectly legal and far more descriptive variant "int* var3,var4"
 instead :)
 
 -fg

I heard you, but then you've violated one of the fundamental assumptions 
of the C family: that whitespace is only used for delimiting tokens, not 
for syntax. :(

Jul 01 2003

"Fabian Giesen" <rygNO SPAMgmx.net> writes:

 I heard you, but then you've violated one of the fundamental
 assumptions
 of the C family: that whitespace is only used for delimiting tokens,
 not for syntax. :(

It is only being used for delimiting tokens. D *always* groups * with
the type, regardless of whitespace (C/C++ always group with the variable,
again regardless of whitespace).

For grouping with variables, the descriptive (and intuitive) way to write it
is int *x,y; For grouping with types, int* x,y; makes far more sense. Both
variants are absolutely equal in C/C++/D as far as parsing is concerned and I
do not propose to change that - however, as said, it's just more natural to
write "int* x,y" the way D parses declarations.
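
A quick way to check the grouping rule is to ask the compiler for the declared types. This is a minimal sketch assuming a present-day D compiler; the variable names are illustrative.

```d
// The * belongs to the type, so both names are pointers to int.
int* x, y;
static assert(is(typeof(x) == int*));
static assert(is(typeof(y) == int*));

// Spacing does not change the parse: this declares two pointers as well.
int *p, q;
static assert(is(typeof(p) == int*));
static assert(is(typeof(q) == int*));

void main()
{
    int value = 5;
    x = &value;
    y = &value;
    *y = 6;            // y really is a pointer, not a plain int
    assert(*x == 6);
}
```

In C or C++ the same two declarations would each give one pointer and one plain int, which is the difference being discussed.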

-fg

Jul 01 2003

Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:

Fabian Giesen wrote:
I heard you, but then you've violated one of the fundamental
assumptions
of the C family: that whitespace is only used for delimiting tokens,
not for syntax. :(

 It is only being used for delimiting tokens. D *always* groups * with
 the type, regardless of whitespace (C/C++ always group with the variable,
 again regardless of whitespace).
 
 For grouping with variables, the descriptive (and intuitive) way to write it
 is int *x,y; For grouping with types, int* x,y; makes far more sense. Both
 variants are absolutely equal in C/C++/D as far as parsing is concerned and I
 do not propose to change that - however, as said, it's just more natural to
 write "int* x,y" the way D parses declarations.
 
 -fg

Oh, so you're just talking about a coding convention?  Yeah, I totally 
agree, and I already do that. :)

Russ

Jul 01 2003
