digitalmars.D - complement to $
- Steven Schveighoffer (27/27) May 14 2010 Currently, D supports the special symbol $ to mean the end of a
- strtr (2/29) May 14 2010 I like it, if only for its happy aesthetic properties.
- Simen kjaeraas (6/9) May 14 2010 I like ^ for this usage, due (as you said) to its symmetry with
- Robert Jacques (4/13) May 14 2010 ? is the ternary operator
- Simen kjaeraas (5/22) May 14 2010 Yeah, I found that the mesh wouldn't work. What should be no
- Michel Fortin (10/13) May 14 2010 coll[µ..$];
- Steven Schveighoffer (11/18) May 14 2010 Not exactly, µ would have to be a global with the same type/meaning
- Simen kjaeraas (6/14) May 14 2010 €. :-)
- Walter Bright (3/5) May 14 2010 I'd just go with accepting the literal 0. Let's see how far that goes fi...
- Steven Schveighoffer (17/23) May 14 2010 Do you have specific objections, or does it just look horrendous to you ...
- Walter Bright (7/34) May 14 2010 The problem is D already has a lot of syntax. More syntax just makes the...
- KennyTM~ (2/37) May 14 2010 Why a map type (sorted associative array)'s key must start at zero?
- Walter Bright (2/3) May 14 2010 You can special case the [0..$], or simply use [] to represent the entir...
- Simen kjaeraas (4/8) May 15 2010 Of course, but assume you want the first 15 elements, what do you do?
- Walter Bright (3/13) May 15 2010 For a map, does the first 15 elements even make any sense? There is no o...
- Simen kjaeraas (6/18) May 15 2010 std::map is ordered. Other data structures might make more sense.
- Walter Bright (2/22) May 15 2010 If it's ordered, then why doesn't [0..15] make sense to get the first 15...
- KennyTM~ (9/32) May 15 2010 auto a = new OrderedDict!(int, string);
- Walter Bright (2/37) May 15 2010 Good question.
- bearophile (13/29) May 15 2010 D slicing syntax and indexing isn't able to represent what you can in Py...
- Walter Bright (3/20) May 15 2010 Sure you can:
- bearophile (12/14) May 15 2010 Probably my post was nearly useless, because it doesn't help much the de...
- Simen kjaeraas (23/42) May 15 2010 Ah, but if you then change the length of a, last_index is no longer
- Kagamin (2/13) May 17 2010 half of a slice: a[None+len(a) : None]
- bearophile (4/5) May 17 2010 Python is strictly typed, so you can't sum None with an int.
- Kagamin (2/4) May 17 2010 I meant, Python is not a silver bullet and its abstraction of first and ...
- bearophile (17/18) May 15 2010 This is an interesting topic in general (the following are general notes...
- Steven Schveighoffer (35/65) May 17 2010 In a lot of cases, this is somewhat true. On the other hand though,
- Nick Sabalausky (6/33) May 14 2010 I think 0 makes perfect sense for any ordered container, or, really,
- div0 (18/23) May 14 2010 -----BEGIN PGP SIGNED MESSAGE-----
- Ali Çehreli (9/11) May 14 2010 Speaking of regex, [^ sequence starts a set of excluded characters. :)
- bearophile (9/10) May 14 2010 The $ is not elegant, but it's a good solution to a design problem, how ...
- Nick Sabalausky (22/33) May 14 2010 Once upon a time, there was a book called "Writing Solid Code". It seeme...
- Mike Parker (3/22) May 15 2010 It's available on safari, for anyone who has a subscription.
- bearophile (20/28) May 15 2010 "Writing Solid Code" is a book about programming, but its examples are i...
- Nick Sabalausky (35/78) May 15 2010 Yea. The book is heavily C, of course, because C was heavily used at the...
- Steven Schveighoffer (10/20) May 14 2010 Well, for true contiguous ranges such as arrays, you need to have ways o...
- Pelle (2/8) May 15 2010 coll[uniform(0,$)]; // <-- awesome.
- Don (6/44) May 15 2010 If we were to have something like this (and I'm quite unconvinced that
- eles (5/5) May 16 2010 I am not sure that is necessary to have a symbol for begining ($
- Steven Schveighoffer (58/93) May 17 2010 slicing implies order, that is for sure. But mapping to natural numbers...
- Kagamin (2/13) May 17 2010 If your collection supports the array interface, you can provide it expl...
- Ellery Newcomer (4/15) May 17 2010 Does your collections library allow for code like
- Steven Schveighoffer (14/35) May 17 2010 No. begin and end return cursors, which are essentially non-movable
- Ellery Newcomer (3/14) May 17 2010 emphasis on the semantics (slice starting at second element), not the
Currently, D supports the special symbol $ to mean the end of a container/range. However, there is no analogous symbol to mean "beginning of a container/range". For arrays, there is none necessary, 0 is always the first element. But not all containers are arrays. I'm running into a dilemma for dcollections, I have found a way to make all containers support fast slicing (basically by imposing some limitations), and I would like to support *both* beginning and end symbols. Currently, you can slice something in dcollections via: coll[coll.begin..coll.end]; I could replace that end with $, but what can I replace coll.begin with? 0 doesn't make sense for things like linked lists, maps, sets, basically anything that's not an array. One thing that's nice about opDollar is I can make it return coll.end, so I control the type. With 0, I have no choice, I must take a uint, which means I have to check to make sure it's always zero, and throw an exception otherwise. Would it make sense to have an equivalent symbol for the beginning of a container/range? In regex, ^ matches beginning of the line, $ matches end of the line -- would there be any parsing ambiguity there? I know ^ is a binary op, and $ means nothing anywhere else, so the two are not exactly equivalent. I'm not very experienced on parsing ambiguities, but things like ~ can be unambiguous as binary and unary ops, so maybe it is possible. So how does this look: coll[^..$]; Thoughts? other ideas? -Steve
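A minimal sketch of the mechanism being discussed (this is not dcollections' actual API; the Cursor type, the begin member and the container are invented for illustration, and it assumes a compiler that supports the opDollar overload):

struct Cursor { size_t pos; }   // hypothetical cursor type

struct List
{
    int[] data;

    // $ inside an index or slice expression lowers to opDollar(), so the
    // container chooses the type -- this is the control the post refers to
    Cursor opDollar() { return Cursor(data.length); }

    // stand-in for the missing "beginning" symbol: it must be spelled out
    Cursor begin() { return Cursor(0); }

    int[] opSlice(Cursor from, Cursor to) { return data[from.pos .. to.pos]; }
}

void main()
{
    auto c = List([1, 2, 3, 4]);
    auto all = c[c.begin .. $];   // today: the left end needs the member call
    assert(all == [1, 2, 3, 4]);  // the proposal would allow c[^ .. $]
}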
May 14 2010
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s articleCurrently, D supports the special symbol $ to mean the end of a container/range. However, there is no analogous symbol to mean "beginning of a container/range". For arrays, there is none necessary, 0 is always the first element. But not all containers are arrays. I'm running into a dilemma for dcollections, I have found a way to make all containers support fast slicing (basically by imposing some limitations), and I would like to support *both* beginning and end symbols. Currently, you can slice something in dcollections via: coll[coll.begin..coll.end]; I could replace that end with $, but what can I replace coll.begin with? 0 doesn't make sense for things like linked lists, maps, sets, basically anything that's not an array. One thing that's nice about opDollar is I can make it return coll.end, so I control the type. With 0, I have no choice, I must take a uint, which means I have to check to make sure it's always zero, and throw an exception otherwise. Would it make sense to have an equivalent symbol for the beginning of a container/range? In regex, ^ matches beginning of the line, $ matches end of the line -- would there be any parsing ambiguity there? I know ^ is a binary op, and $ means nothing anywhere else, so the two are not exactly equivalent. I'm not very experienced on parsing ambiguities, but things like ~ can be unambiguous as binary and unary ops, so maybe it is possible. So how does this look: coll[^..$]; Thoughts? other ideas? -SteveI like it, if only for its happy aesthetic properties.
May 14 2010
Steven Schveighoffer <schveiguy yahoo.com> wrote:So how does this look: coll[^..$]; Thoughts? other ideas? -SteveI like ^ for this usage, due (as you said) to its symmetry with There should be no ambiguities with the caret as far as I can see. -- Simen
May 14 2010
On Fri, 14 May 2010 09:32:31 -0400, Simen kjaeraas <simen.kjaras gmail.com> wrote:Steven Schveighoffer <schveiguy yahoo.com> wrote:? is the ternary operatorSo how does this look: coll[^..$]; Thoughts? other ideas? -SteveI like ^ for this usage, due (as you said) to its symmetry with There should be no ambiguities with the caret as far as I can see.
May 14 2010
Robert Jacques <sandford jhu.edu> wrote:On Fri, 14 May 2010 09:32:31 -0400, Simen kjaeraas <simen.kjaras gmail.com> wrote:Yeah, I found that the mesh wouldn't work. What should be no problem, though. -- SimenSteven Schveighoffer <schveiguy yahoo.com> wrote:? is the ternary operatorSo how does this look: coll[^..$]; Thoughts? other ideas? -SteveI like ^ for this usage, due (as you said) to its symmetry with There should be no ambiguities with the caret as far as I can see.
May 14 2010
On 2010-05-14 09:20:10 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:So how does this look: coll[^..$]; Thoughts? other ideas?coll[µ..$]; The funny thing is that you can probably make it work today if you want since 'µ' is a valid identifier. Unfortunately you can't use €. :-) Other characters you could use: ø, Ø, ß... -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 14 2010
On Fri, 14 May 2010 10:25:23 -0400, Michel Fortin <michel.fortin michelf.com> wrote:On 2010-05-14 09:20:10 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:Not exactly, µ would have to be a global with the same type/meaning everywhere. I want to control the type per container, so the compiler would still have to treat it special, or I would have to use coll[coll.µ..$]. If I didn't want to control the type, I could of course use 0 to that same effect. Besides, I can't type that character or any of those others (had to copy-paste), so I don't see it being a viable alternative :) -SteveSo how does this look: coll[^..$]; Thoughts? other ideas?coll[µ..$]; The funny thing is that you can probably make it work today if you want since 'µ' is a valid identifier. Unfortunately you can't use €. :-)
May 14 2010
Michel Fortin <michel.fortin michelf.com> wrote:On 2010-05-14 09:20:10 -0400, "Steven Schveighoffer" <schveiguy yahoo.com> said:So how does this look: coll[^..$]; Thoughts? other ideas?coll[µ..$]; The funny thing is that you can probably make it work today if you want since 'µ' is a valid identifier. Unfortunately you can't use €. :-)Other characters you could use: ø, Ø, ß...Sadly, ¢ is not on the list. -- Simen
May 14 2010
Steven Schveighoffer wrote:So how does this look: coll[^..$];nooooooooo <g>Thoughts? other ideas?I'd just go with accepting the literal 0. Let's see how far that goes first.
May 14 2010
On Fri, 14 May 2010 13:33:57 -0400, Walter Bright <newshound1 digitalmars.com> wrote:Steven Schveighoffer wrote:Do you have specific objections, or does it just look horrendous to you :) Would another symbol be acceptable?So how does this look: coll[^..$];nooooooooo <g>I thought of a counter case: auto tm = new TreeMap!(int, uint); tm[-1] = 5; tm[1] = 6; What does tm[0..$] mean? What about tm[0]? If it is analogous to "beginning of collection" then it doesn't make any sense for a container with a key of numeric type. Actually any map type where the indexes don't *always* start at zero is a problem. I can make 0 work for LinkList and ArrayList, but not any of the others. Even with TreeSet, I allow using element values as slice arguments. I guess I should have pointed this out in my first post... sorry. -SteveThoughts? other ideas?I'd just go with accepting the literal 0. Let's see how far that goes first.
May 14 2010
Steven Schveighoffer wrote:On Fri, 14 May 2010 13:33:57 -0400, Walter Bright <newshound1 digitalmars.com> wrote:The problem is D already has a lot of syntax. More syntax just makes the language more burdensome after a certain point, even if in isolation it's a good idea. One particular problem regex has is that few can remember its syntax unless they use it every day.Steven Schveighoffer wrote:Do you have specific objections, or does it just look horrendous to you :) Would another symbol be acceptable?So how does this look: coll[^..$];nooooooooo <g>I'd question the design of a map type that has the start at something other than 0.I thought of a counter case: auto tm = new TreeMap!(int, uint); tm[-1] = 5; tm[1] = 6; What does tm[0..$] mean? What about tm[0]? If it is analogous to "beginning of collection" then it doesn't make any sense for a container with a key of numeric type. Actually any map type where the indexes don't *always* start at zero are a problem.Thoughts? other ideas?I'd just go with accepting the literal 0. Let's see how far that goes first.
May 14 2010
On May 15, 10 13:07, Walter Bright wrote:Steven Schveighoffer wrote:Why a map type (sorted associative array)'s key must start at zero?On Fri, 14 May 2010 13:33:57 -0400, Walter Bright <newshound1 digitalmars.com> wrote:The problem is D already has a lot of syntax. More syntax just makes the language more burdensome after a certain point, even if in isolation it's a good idea. One particular problem regex has is that few can remember its syntax unless they use it every day.Steven Schveighoffer wrote:Do you have specific objections, or does it just look horrendous to you :) Would another symbol be acceptable?So how does this look: coll[^..$];nooooooooo <g>I'd question the design of a map type that has the start at something other than 0.I thought of a counter case: auto tm = new TreeMap!(int, uint); tm[-1] = 5; tm[1] = 6; What does tm[0..$] mean? What about tm[0]? If it is analogous to "beginning of collection" then it doesn't make any sense for a container with a key of numeric type. Actually any map type where the indexes don't *always* start at zero are a problem.Thoughts? other ideas?I'd just go with accepting the literal 0. Let's see how far that goes first.
May 14 2010
KennyTM~ wrote:Why a map type (sorted associative array)'s key must start at zero?You can special case the [0..$], or simply use [] to represent the entire range.
May 14 2010
Walter Bright <newshound1 digitalmars.com> wrote:KennyTM~ wrote:Of course, but assume you want the first 15 elements, what do you do? -- SimenWhy a map type (sorted associative array)'s key must start at zero?You can special case the [0..$], or simply use [] to represent the entire range.
May 15 2010
Simen kjaeraas wrote:Walter Bright <newshound1 digitalmars.com> wrote:For a map, does the first 15 elements even make any sense? There is no order in a map.KennyTM~ wrote:Of course, but assume you want the first 15 elements, what do you do?Why a map type (sorted associative array)'s key must start at zero?You can special case the [0..$], or simply use [] to represent the entire range.
May 15 2010
Walter Bright <newshound1 digitalmars.com> wrote:Simen kjaeraas wrote:std::map is ordered. Other data structures might make more sense. A weird example would be a trie - slice all from the start to ['f','o','o'], for instance. -- SimenWalter Bright <newshound1 digitalmars.com> wrote:For a map, does the first 15 elements even make any sense? There is no order in a map.KennyTM~ wrote:Of course, but assume you want the first 15 elements, what do you do?Why a map type (sorted associative array)'s key must start at zero?You can special case the [0..$], or simply use [] to represent the entire range.
May 15 2010
Simen kjaeraas wrote:Walter Bright <newshound1 digitalmars.com> wrote:If it's ordered, then why doesn't [0..15] make sense to get the first 15 elements?Simen kjaeraas wrote:std::map is ordered. Other data structures might make more sense. A weird example would be a trie - slice all from the start to ['f','o','o'], for instance.Walter Bright <newshound1 digitalmars.com> wrote:For a map, does the first 15 elements even make any sense? There is no order in a map.KennyTM~ wrote:Of course, but assume you want the first 15 elements, what do you do?Why a map type (sorted associative array)'s key must start at zero?You can special case the [0..$], or simply use [] to represent the entire range.
May 15 2010
On May 16, 10 02:01, Walter Bright wrote:Simen kjaeraas wrote:auto a = new OrderedDict!(int, string); a[-3] = "negative three"; a[-1] = "negative one"; a[0] = "zero"; a[3] = "three"; a[4] = "four"; assert(a[0] == "zero"); return a[0..4]; // which slice should it return?Walter Bright <newshound1 digitalmars.com> wrote:If it's ordered, then why doesn't [0..15] make sense to get the first 15 elements?Simen kjaeraas wrote:std::map is ordered. Other data structures might make more sense. A weird example would be a trie - slice all from the start to ['f','o','o'], for instance.Walter Bright <newshound1 digitalmars.com> wrote:For a map, does the first 15 elements even make any sense? There is no order in a map.KennyTM~ wrote:Of course, but assume you want the first 15 elements, what do you do?Why a map type (sorted associative array)'s key must start at zero?You can special case the [0..$], or simply use [] to represent the entire range.
May 15 2010
KennyTM~ wrote:On May 16, 10 02:01, Walter Bright wrote:Good question.Simen kjaeraas wrote:auto a = new OrderedDict!(int, string); a[-3] = "negative three"; a[-1] = "negative one"; a[0] = "zero"; a[3] = "three"; a[4] = "four"; assert(a[0] == "zero"); return a[0..4]; // which slice should it return?Walter Bright <newshound1 digitalmars.com> wrote:If it's ordered, then why doesn't [0..15] make sense to get the first 15 elements?Simen kjaeraas wrote:std::map is ordered. Other data structures might make more sense. A weird example would be a trie - slice all from the start to ['f','o','o'], for instance.Walter Bright <newshound1 digitalmars.com> wrote:For a map, does the first 15 elements even make any sense? There is no order in a map.KennyTM~ wrote:Of course, but assume you want the first 15 elements, what do you do?Why a map type (sorted associative array)'s key must start at zero?You can special case the [0..$], or simply use [] to represent the entire range.
May 15 2010
KennyTM~:auto a = new OrderedDict!(int, string); a[-3] = "negative three"; a[-1] = "negative one"; a[0] = "zero"; a[3] = "three"; a[4] = "four"; assert(a[0] == "zero"); return a[0..4]; // which slice should it return?D slicing syntax and indexing isn't able to represent what you can in Python, where you can store the last index in a variable: last_index = -1 a = ['a', 'b', 'c', 'd'] assert a[last_index] == 'd' In D you represent the last index as $-1, but you can't store that in a variable. If you introduce a symbol like ^ to represent the start, you can't store it in a variable. Another example is range bounds, you can omit them, or, what is exactly the same, they can be None: a = ['a', 'b', 'c', 'd'] assert a[None : 2] == ['a', 'b'] assert a[ : 2] == ['a', 'b'] assert a[0 : 2] == ['a', 'b'] idx = None assert a[idx : 2] == ['a', 'b'] assert a[2 : None] == ['c', 'd'] assert a[2 : ] == ['c', 'd'] assert a[2 : len(a)] == ['c', 'd'] You can store a None in a Python variable, so you can use it to represent the empty start or end of a slice. But currently D indexes are a size_t, so they can't represent a null. Bye, bearophile
May 15 2010
bearophile wrote:KennyTM~:Sure you can: last_index = a.length - 1;auto a = new OrderedDict!(int, string); a[-3] = "negative three"; a[-1] = "negative one"; a[0] = "zero"; a[3] = "three"; a[4] = "four"; assert(a[0] == "zero"); return a[0..4]; // which slice should it return?D slicing syntax and indexing isn't able to represent what you can in Python, where you can store the last index in a variable: last_index = -1 a = ['a', 'b', 'c', 'd'] assert a[last_index] == 'd' In D you represent the last index as $-1, but you can't store that in a variable.
May 15 2010
Walter Bright:Sure you can: last_index = a.length - 1;Probably my post was nearly useless, because it doesn't help the development of D much, so you can ignore most of it. The only part of it that can be meaningful is that it seems Python designers have thought that having a way to specify the _generic_ idea of start of a slice can be useful (with a syntax like a[:end] or a[None:end]) so in theory an equivalent syntax (like a[^..end]) can be added to D, but I have no idea if this is so commonly useful in D programs. If you notice in this thread I have not said that I like or dislike the 'complement to $' feature. Regarding your specific answer here, storing a.length-1 in the index is not able to represent the idea of "last item". Another example to show you better what I meant: last_index = -1 a = ['a', 'b', 'c'] b = [1.5, 2.5, 3.5, 4.5] assert a[last_index] == 'c' assert b[last_index] == 4.5 I am not asking for this in D, I don't think there are simple ways to add this, and I don't need this often in Python either. I am just saying that the semantics of negative indexes in Python is a superset of the D one. Bye, bearophile
May 15 2010
On Sat, 15 May 2010 23:46:12 +0200, Walter Bright <newshound1 digitalmars.com> wrote:bearophile wrote:Ah, but if you then change the length of a, last_index is no longer correct. Now, if we had a special index type... enum slice_base { START, END } struct index { ptrdiff_t pos; slice_base base; // Operator overloads here, returns typeof( this ) if +/- // integral, ptrdiff_t if subtracted from // typeof( this ). } immutable $ = index( 0, slice_base.END ); immutable ^ = index( 0, slice_base.START ); auto last_index = $ - 1; auto third_index = ^ + 2; These would then stay valid no matter what you did to the container. -- SimenKennyTM~:Sure you can: last_index = a.length - 1;auto a = new OrderedDict!(int, string); a[-3] = "negative three"; a[-1] = "negative one"; a[0] = "zero"; a[3] = "three"; a[4] = "four"; assert(a[0] == "zero"); return a[0..4]; // which slice should it return?D slicing syntax and indexing isn't able to represent what you can in Python, where you can store the last index in a variable: last_index = -1 a = ['a', 'b', 'c', 'd'] assert a[last_index] == 'd' In D you represent the last index as $-1, but you can't store that in a variable.
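A compilable variant of the same idea, for illustration only: D does not allow '$' or '^' as identifiers, so 'first' and 'last' stand in for them, and resolve() is an invented helper that turns the relative index into a concrete offset once a length is known:

enum SliceBase { start, end }

struct RelIndex
{
    ptrdiff_t pos;
    SliceBase base;

    // supports first + 2, last - 1, etc.
    RelIndex opBinary(string op)(ptrdiff_t n) const if (op == "+" || op == "-")
    {
        return RelIndex(mixin("pos " ~ op ~ " n"), base);
    }

    // resolve against a concrete length at the point of use
    size_t resolve(size_t length) const
    {
        return base == SliceBase.start ? pos : length + pos;
    }
}

immutable first = RelIndex(0, SliceBase.start);
immutable last  = RelIndex(0, SliceBase.end);

void main()
{
    auto a = [10, 20, 30, 40, 50];
    auto lastIndex  = last - 1;    // stays valid even if a.length changes later
    auto thirdIndex = first + 2;
    assert(a[lastIndex.resolve(a.length)]  == 50);
    assert(a[thirdIndex.resolve(a.length)] == 30);
}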
May 15 2010
bearophile Wrote:Another example are range bounds, you can omit them, or exactly the same, they can be None: a = ['a', 'b', 'c', 'd']half of a slice: a[None+len(a) : None]assert a[None : 2] == ['a', 'b'] assert a[ : 2] == ['a', 'b'] assert a[0 : 2] == ['a', 'b'] idx = None assert a[idx : 2] == ['a', 'b'] assert a[2 : None] == ['c', 'd'] assert a[2 : ] == ['c', 'd'] assert a[2 : len(a)] == ['c', 'd']
May 17 2010
Kagamin:half of a slice: a[None+len(a) : None]Python is strictly typed, so you can't sum None with an int. Bye, bearophile
May 17 2010
bearophile Wrote:Kagamin:I meant, Python is not a silver bullet and its abstraction of first and last indexes leaks (you like it, don't you?)half of a slice: a[None+len(a) : None]
May 17 2010
Walter Bright:The problem is D already has a lot of syntax. More syntax just makes the language more burdensome after a certain point, even if in isolation it's a good idea.<This is an interesting topic in general (the following are general notes, not specific to the complement to $). I agree that what can be good in isolation can be less good for the whole syntax ecology of a language. Sometimes adding a syntax reduces the language complexity for the programmer. When you have added struct constructors to D2 you have removed one special case, making classes and structs more uniform, removing one thing the programmer/manual has to remember. Other times a syntax can mean about net zero complexity added, because it increases some complexity but reduces some other complexity in the language. For example named function arguments add a little complexity to the language, but they are very easy to learn and if used wisely they can make the code (at the calling point of functions) more readable, and reduce mistakes caused by wrong arguments or wrong argument order. So I see them as good. There are languages like the Scheme family that have a very small amount of syntax, as Lisp had when it was common, but today in practice quite syntax-rich languages seem to have "won". D too is syntax-rich. So many things show me that programmers are able to use/learn good amounts of syntax (especially when they already know a language with a similar syntax), so syntax is not an evil thing. Yet, today C++ usage is decreasing, maybe even quickly. I think the cause is that not all syntax is created equal. This is the C++ syntax for an abstract function: virtual int foo() = 0; The same in D: abstract int foo(); Both have a syntax to represent this, but the D syntax is better, because it's specific, it's not used for other purposes, and because it uses a readable word to denote it, instead of arbitrary generic symbols. So I think just saying "lot of syntax" as you have done is not meaningful enough. In my opinion programmers are able to learn and use a lot of syntax (maybe even more than the amount of syntax currently present in D) if such syntax is:
- Readable. So for example it uses an English word (example: abstract), or it's common in normal mathematics, or it's another example of the same syntax already present in another part of the language (example: tuple slicing syntax is the same as array slicing syntax). This makes it easy to remember, easy/quick to read, and unambiguous. If a syntax is a bit wordy this is often not a problem in a modern language used with a modern monitor (so I don't care if 'abstract' is several chars long and a symbol like ^? could be used for the same purpose, saving a few chars. It's not a true gain for the programmer).
- Specific. Using the same syntax for several different or subtly different purposes in different parts of the language is bad. A specific syntax is something that can't be confused with anything else in the language, it's used for just one purpose and does it well.
- Safe. There is a natural enough way to use it, and the compiler catches all improper usages of it. There are no subtly wrong usages that do something very bad in the program. (This is why I have asked for the compiler to enforce the presence in the code of only the correct symbol strings in the new D2 operator overloading regime. It's too easy to write something wrong, in my opinion).
Bye, bearophile
May 15 2010
On Sat, 15 May 2010 01:07:50 -0400, Walter Bright <newshound1 digitalmars.com> wrote:Steven Schveighoffer wrote:In a lot of cases, this is somewhat true. On the other hand though, shortcut syntaxes like this are not as bad. What I mean by shortcut is that 1) its a shortcut for an existing syntax (e.g. $ is short for coll.length), and 2) it doesn't affect or improves readability. A good example of shortcut syntax is the recent inout changes. At first, the objection was "we already have too mcuh const", but when you look at the result, it is *less* const because you don't have to worry about the three cases, only one. The burden for such shortcuts is usually on readers of such code, not writers. But a small lesson from the docs is all that is needed. Any new developer will already be looking up $ when they encounter it, if you put ^ right there with it, it's not so bad. Once you understand the meanings, it reads just as smoothly (and I'd say even smoother) as the alternative syntax. I'll also say that I'm not in love with ^, it's just a suggestion. I'd not be upset if something else were to be used. But 0 cannot be it.On Fri, 14 May 2010 13:33:57 -0400, Walter Bright <newshound1 digitalmars.com> wrote:The problem is D already has a lot of syntax. More syntax just makes the language more burdensome after a certain point, even if in isolation it's a good idea.Steven Schveighoffer wrote:Do you have specific objections, or does it just look horrendous to you :) Would another symbol be acceptable?So how does this look: coll[^..$];nooooooooo <g>One particular problem regex has is that few can remember its syntax unless they use it every day.I don't use it every day, in fact, I almost always have to look up syntax if I want to get fancy. But I always remember several things: 1. [^abc] means none of these 2. . means any character 3. * means 0 or more of the previous characters and + means 1 or more of the previous characters 4. ^ and $ mean beginning and end of line. I usually have to look up which one means which :) point 4 may suggest a special error message if someone does coll[^-1] or coll[$..^]Then I guess you question the AA design? Or STL's std::map? Or Java's TreeMap and HashMap? Or dcollections' map types? I don't think you meant this. The whole *point* of a map is to have arbitrary indexes, requiring them to start at 0 would defeat the whole purpose. -SteveI'd question the design of a map type that has the start at something other than 0.I thought of a counter case: auto tm = new TreeMap!(int, uint); tm[-1] = 5; tm[1] = 6; What does tm[0..$] mean? What about tm[0]? If it is analogous to "beginning of collection" then it doesn't make any sense for a container with a key of numeric type. Actually any map type where the indexes don't *always* start at zero are a problem.Thoughts? other ideas?I'd just go with accepting the literal 0. Let's see how far that goes first.
May 17 2010
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message news:op.vco5zwhreav7ka localhost.localdomain...Currently, D supports the special symbol $ to mean the end of a container/range. However, there is no analogous symbol to mean "beginning of a container/range". For arrays, there is none necessary, 0 is always the first element. But not all containers are arrays. I'm running into a dilemma for dcollections, I have found a way to make all containers support fast slicing (basically by imposing some limitations), and I would like to support *both* beginning and end symbols. Currently, you can slice something in dcollections via: coll[coll.begin..coll.end]; I could replace that end with $, but what can I replace coll.begin with? 0 doesn't make sense for things like linked lists, maps, sets, basically anything that's not an array. One thing that's nice about opDollar is I can make it return coll.end, so I control the type. With 0, I have no choice, I must take a uint, which means I have to check to make sure it's always zero, and throw an exception otherwise. Would it make sense to have an equivalent symbol for the beginning of a container/range? In regex, ^ matches beginning of the line, $ matches end of the line -- would there be any parsing ambiguity there? I know ^ is a binary op, and $ means nothing anywhere else, so the two are not exactly equivalent. I'm not very experienced on parsing ambiguities, but things like ~ can be unambiguous as binary and unary ops, so maybe it is possible. So how does this look: coll[^..$]; Thoughts? other ideas?I think 0 makes perfect sense for any ordered container, or, really, anything for which $ makes sense (plus some things for which $ doesn't make sense, like an right-infinite range). However, the rest of your argument convinced me.
May 14 2010
Nick Sabalausky wrote:I think 0 makes perfect sense for any ordered container, or, really, anything for which $ makes sense (plus some things for which $ doesn't make sense, like an right-infinite range). However, the rest of your argument convinced me.What if you order by > instead of <, then 0 should be the end surely. Also what if you have an array that you can access with a negative index? Then 0 is at some indeterminate point in the range. Ok it's pretty bloody rare to have arrays like that, but I've done it a couple of times before now. -- My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk
May 14 2010
Steven Schveighoffer wrote:In regex, ^ matches beginning of the line, $ matches end of the lineSo far so good... :)So how does this look: coll[^..$];Speaking of regex, [^ sequence starts a set of excluded characters. :) $ has always bugged me anyway, so how about no character at all: coll[..n]; // beginning to n coll[n..]; // n to end coll[..]; // all of it I like it! :) Ali
May 14 2010
Ali Çehreli:$ has always bugged me anyway:The $ is not elegant, but it's a good solution to a design problem: how to represent items from the bottom of the array. In Python you write: a[-5] In D you write: a[$-5] This small one-char difference has an important effect for a system language: the presence of $ allows you to avoid a conditional each time you want to access an array item :-) So I think of it as one of the smartest details of D design :-) Bye, bearophile
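A rough illustration of that point using plain arrays (getPythonStyle is a hypothetical wrapper, not a real library function): with $ the "count from the end" intent is resolved where the index is written, while a Python-style negative index has to be tested on every access.

// hypothetical wrapper emulating Python's negative-index semantics
int getPythonStyle(int[] a, ptrdiff_t i)
{
    return i < 0 ? a[a.length + i] : a[i];   // extra conditional per access
}

void main()
{
    auto a = [1, 2, 3, 4, 5, 6, 7];
    assert(a[$ - 5] == 3);               // no branch needed
    assert(getPythonStyle(a, -5) == 3);  // the branch hides in the wrapper
}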
May 14 2010
"bearophile" <bearophileHUGS lycos.com> wrote in message news:hskd8g$2bcd$1 digitalmars.com...Ali Çehreli:Once upon a time, there was a book called "Writing Solid Code". It seemed that anyone who was an established, respectable programmer swore by it and proclaimed it should be required reading by all programmers. These days, I sometimes feel like I'm the only one who's ever heard of it (let alone read it). So much of the book has made such an impact on me as a programmer, that from the very first time I ever heard of a language (probably Python) using "someArray[-5]" to denote an index from the end, I swear, the very first thought that popped into my head was "Candy-Machine Interface". I instantly disliked it, and still consider it a misguided design. For anyone who doesn't see the problem with Python's negative indices (or anyone who wants to delve into one of the forerunners to great books like "Code Craft" or "The Pragmatic Programmer"), I *highly* recommend tracking down a copy of "Writing Solid Code" and reading "The One-Function Memory Manager" and "Wishy-Washy Inputs", both in the "Candy-Machine Interfaces" chapter. (Although, the book did have such an impact on the programming world at the time, that many of the cautions in it sound like no-brainers today, like not using return values to indicate error codes. But even for those, it goes into much more detail on the "why" than what you usually hear.)
May 14 2010
Nick Sabalausky wrote:Once upon a time, there was a book called "Writing Solid Code". It seemed that anyone who was an established, respectable programmer swore by it and proclaimed it should be required reading by all programmers. These days, I sometimes feel like I'm the only one who's ever heard of it (let alone read it). So much of the book has made such an impact on me as a programmer, that from the very first time I ever heard of a language (probably Python) using "someArray[-5]" to denote an index from the end, I swear, the very first thought that popped into my head was "Candy-Machine Interface". I instantly disliked it, and still consider it a misguided design. For anyone who doesn't see the the problem with Python's negative indicies (or anyone who wants to delve into one of the forerunners to great books like "Code Craft" or "The Pragmatic Programmer"), I *highly* recommend tracking down a copy of "Writing Solid Code" and reading "The One-Function Memory Manager" and "Wishy-Washy Inputs", both in the "Candy-Machine Interfaces" chapter.It's available on safari, for anyone who has a subscription. http://safari.oreilly.com/
May 15 2010
Nick Sabalausky:Once upon a time, there was a book called "Writing Solid Code".<I have not read this book. Thank you for the suggestion, I will read it. There are tons of books about programming and computer science, but the good books are very uncommon, I can probably write a list of less than a dozen titles. So given a few years it's easy to read them all.It seemed that anyone who was an established, respectable programmer swore by it and proclaimed it should be required reading by all programmers.<"Writing Solid Code" is a book about programming, but its examples are in C and it's focused on C programming. Today some people write code all day and they don't even know how to write ten lines of C code. Time goes on, and what once was regarded as indispensable, today is less important (in my university the C language is taught only at about the third year, often in just one course and the teacher is not good at all, the code written on the blackboard can sometimes be used by me as examples of how not to write C code). This happens in all fields of human knowledge. In practice given enough time, almost no books will be indispensable. Books that are interesting for centuries are uncommon, and they are usually about human nature (like novels), which changes very little as years pass :-)and reading "The One-Function Memory Manager"<C99 has changed things a bit:In both C89 and C99, realloc with length 0 is a special case. The C89 standard explicitly states that the pointer given is freed, and that the return is either a null pointer or a pointer to the newly allocated space. The C99 standard says that realloc deallocates its pointer argument (regardless of the size value) and allocates a new one of the specified size.<I agree that C realloc is a function that tries to do too many things. C libs are not perfect."Wishy-Washy Inputs",<Python has named arguments (and I hope they will be added in D3), this can reduce the bug count because you can state what an argument is at the calling point too. The code of CopySubStr is bad: - Abbreviated names for functions (and often variables too) are bad. - Unless very useful, it's better to avoid pointers and to use normal array syntax []. - There is no need to put () around the return value in C.that many of the cautions in it sound like no-brainers today, like not using return values to indicate error codes.<Generally in Python when some function argument is not acceptable, an exception is raised. Exceptions are used in D for similar purposes. But in D you also have contracts that I am starting to use more and more.So much of the book has made such an impact on me as a programmer, that from the very first time I ever heard of a language (probably Python) using "someArray[-5]" to denote an index from the end, I swear, the very first thought that popped into my head was "Candy-Machine Interface". I instantly disliked it, and still consider it a misguided design.<A negative value to index items from the end of an array is a bad idea in C (and D), it slows down the code and it's unsafe. But you must understand that what's unsafe in C is not equally unsafe in Python, and the other way around is true too. A well enough designed computer language is not a collection of features, it's a coherent thing. So its features are adapted to each other. So even if a[5] in Python looks like the same syntax as a[5] in C, in practice they are very different things. Python arrays are not pointers, and out-of-bound exceptions are always present.
And often in Python you don't actually use a[i], you use something like: for x in a: do_something(x) As you see there are no indexes visible here (as in D foreach). What I am trying to say you is that while I agree that negative indexes can be tricky and they are probably too much unsafe in C programs (given how all other things work in C programs), they are not too much unsafe in Python programs (given how all other things work in Python programs). In Python you have to be a little careful when you use them, but they usually don't cause disasters in my code.But even for those, it goes into much more detail on the "why" than what you usually hear.)<It looks like a good book. Bye, bearophile
May 15 2010
"bearophile" <bearophileHUGS lycos.com> wrote in message news:hsm7l9$27nt$1 digitalmars.com...Nick Sabalausky:Yea. The book is heavily C, of course, because C was heavily used at the time. But, I think another reason for all the focus on C is that the typical (at least at the time) C-style and the standard C lib are filled with great examples of "what not to do". ;)It seemed that anyone who was an established, respectable programmer swore by it and proclaimed it should be required reading by all programmers.<"Writing Solid Code" is a book about programming, but its examples are in C and it's focused on C programming. Today some people write code all day and they don't even know how to write ten lines of C code. Time goes on, and what once was regarded as indispensable, today is less important (in my university the C language is taught only at about the third year, often in just one course and the teacher is not good at all, the code written on the blackboard can be sometimes used by me as examples of how not write C code). This happens in all fields of human knowledge. In practice given enough time, almost no books will be indispensable. Books that are interesting for centuries are uncommon, and they are usually about the human nature (like novels) that changes very little as years pass :-)Yea, non-named-arguments-only has been feeling more and more antiquated to me lately."Wishy-Washy Inputs",<in D3), this can reduce the bug count because you can state what an argument is at the calling point too.The code of CopySubStr is bad: - Abbreviated names for functions (and often variables too) are bad.There are two major schools of thought on that. On one side are those who say full names are more clear and less prone to misinterpretation. The other side feels that using a few obvious and consistent abbreviations makes code much easier to read at a glance and doesn't cause misinterpretation unless misused. Personally, I lean more towards the latter group. (Some people also say abbreviations are bad because the number of bytes saved is insignificant on moden hardware. But I find that to be a bit of a strawman since everybody on *both* sides agrees with that and the people who still use abbreviations generally don't do so for that particular reason anymore.)- Unless very useful, it's better to avoid pointers and to use normal array syntax [].Heh, yea. Well, that's old-school C for you ;)Yea, since the book was written, exceptions have pretty much become the de facto standard way of handling errors. There are times when exceptions aren't used, or can't be used, but those cases are rare (dare I say, they're "exceptions"? ;) ), and the most compelling arguments against exceptions are only applicable to languages that don't have a "finally" clause.that many of the cautions in it sound like no-brainers today, like not using return values to indicate error codes.<Generally in Python when some function argument is not acceptable, an exception is raised. Exceptions are used in D for similar purposes. But in D you also have contracts that I am starting to use more and more.Python certainly makes the consequences of getting the index wrong less severe than in C, and less likely. But it still stikes me as a bit of a "dual-purpose" input, and therefore potentally error-prone. 
For instance, suppose it's your intent to get the fifth element before the one that matches "target" (and you already have the index of "target"): leeloo = collection[targetIndex-5] Then, suppose your collection, unexpectedly, has "target" in the third position (either because of a bug elsewhere, or because you just forgot to take into account the possibility that "target" might be one of the first five). With bounds-checking that ensures no negatives, you find out instantly. With Python-style, you're happily given the second-to-last element and a silent bug.
May 15 2010
On Fri, 14 May 2010 16:42:28 -0400, Ali Çehreli <acehreli yahoo.com> wrote:Steven Schveighoffer wrote: > In regex, ^ matches beginning of the line, $ matches end of the line So far so good... :) > So how does this look: coll[^..$]; Speaking of regex, [^ sequence starts a set of excluded characters. :)Yeah, that is a good counter-argument :)$ has always bugged me anyway, so how about no character at all: coll[..n]; // beginning to n coll[n..]; // n to end coll[..]; // all of it I like it! :)Well, for true contiguous ranges such as arrays, you need to have ways of adding or subtracting values. For example: a[0..$-1]; How does that look with your version? a[0..-1]; not good. I think we need something to denote "end" and I would also like something to denote "beginning", and I think that can't be empty space. -Steve
May 14 2010
On 05/14/2010 10:42 PM, Ali Çehreli wrote:$ has always bugged me anyway, so how about no character at all: coll[..n]; // beginning to n coll[n..]; // n to end coll[..]; // all of it I like it! :) Alicoll[uniform(0,$)]; // <-- awesome.
May 15 2010
Steven Schveighoffer wrote:Currently, D supports the special symbol $ to mean the end of a container/range. However, there is no analogous symbol to mean "beginning of a container/range". For arrays, there is none necessary, 0 is always the first element. But not all containers are arrays. I'm running into a dilemma for dcollections, I have found a way to make all containers support fast slicing (basically by imposing some limitations), and I would like to support *both* beginning and end symbols. Currently, you can slice something in dcollections via: coll[coll.begin..coll.end]; I could replace that end with $, but what can I replace coll.begin with? 0 doesn't make sense for things like linked lists, maps, sets, basically anything that's not an array. One thing that's nice about opDollar is I can make it return coll.end, so I control the type. With 0, I have no choice, I must take a uint, which means I have to check to make sure it's always zero, and throw an exception otherwise. Would it make sense to have an equivalent symbol for the beginning of a container/range? In regex, ^ matches beginning of the line, $ matches end of the line -- would there be any parsing ambiguity there? I know ^ is a binary op, and $ means nothing anywhere else, so the two are not exactly equivalent. I'm not very experienced on parsing ambiguities, but things like ~ can be unambiguous as binary and unary ops, so maybe it is possible. So how does this look: coll[^..$]; Thoughts? other ideas? -SteveIf we were to have something like this (and I'm quite unconvinced that it is desirable), I'd suggest something beginning with $, eg $begin. But, it seems to me that the slicing syntax assumes that the slicing index can be mapped to the natural numbers. I think in cases where that's not true, slicing syntax just shouldn't be used.
May 15 2010
I am not sure that it is necessary to have a symbol for the beginning ($ represents the length, not the end, right?). Anyway, instead of $begin and $end, I would rather have: $$ and $ (or vice versa). Thoughts?
May 16 2010
On Sun, 16 May 2010 02:24:55 -0400, Don <nospam nospam.com> wrote:Steven Schveighoffer wrote:This would be better than nothing.Currently, D supports the special symbol $ to mean the end of a container/range. However, there is no analogous symbol to mean "beginning of a container/range". For arrays, there is none necessary, 0 is always the first element. But not all containers are arrays. I'm running into a dilemma for dcollections, I have found a way to make all containers support fast slicing (basically by imposing some limitations), and I would like to support *both* beginning and end symbols. Currently, you can slice something in dcollections via: coll[coll.begin..coll.end]; I could replace that end with $, but what can I replace coll.begin with? 0 doesn't make sense for things like linked lists, maps, sets, basically anything that's not an array. One thing that's nice about opDollar is I can make it return coll.end, so I control the type. With 0, I have no choice, I must take a uint, which means I have to check to make sure it's always zero, and throw an exception otherwise. Would it make sense to have an equivalent symbol for the beginning of a container/range? In regex, ^ matches beginning of the line, $ matches end of the line -- would there be any parsing ambiguity there? I know ^ is a binary op, and $ means nothing anywhere else, so the two are not exactly equivalent. I'm not very experienced on parsing ambiguities, but things like ~ can be unambiguous as binary and unary ops, so maybe it is possible. So how does this look: coll[^..$]; Thoughts? other ideas? -SteveIf we were to have something like this (and I'm quite unconvinced that it is desirable), I'd suggest something beginning with $, eg $begin.But, it seems to me that the slicing syntax assumes that the slicing index can be mapped to the natural numbers. I think in cases where that's not true, slicing syntax just shouldn't be used.slicing implies order, that is for sure. But mapping to natural numbers may be too strict. I look at slicing in a different way. Hopefully you can follow my train of thought. dcollections, as a D2 lib, should support ranges, I think that makes the most sense. All containers in dcollections are classes, so they can't also be ranges (my belief is that a reference-type based range is too awkward to be useful). The basic operation to get a range from a container is to get all the elements as a range (a struct with the range interface). So what if I want a subrange? Well, I can pick off the ends of the range until I get the right elements as the end points. But if it's possible, why not allow slicing as a better means of doing this? However, slicing should be a fast operation. Slicing quickly isn't always feasible, for example, LinkList must walk through the list until you find the right element, so that's an O(n) operation. So my thought was to allow slicing, but with the index being a cursor (i.e. pointer) to the elements you want to be the end points. Well, if we are to follow array convention, and want to try not to enforce memory safety, we should verify those end points make sense, we don't want to return an invalid slice. In some cases, verifying the end points are in the correct order is slow, O(n) again. But, you always have reasonably quick access to the first and last elements of a container, and you *know* their order relative to any other element in the container. 
So in dcollections, I support slicing on all collections based on two cursors, and in all collections, if you make the first cursor the beginning cursor, or the second cursor the end cursor, it will work. In some cases, I support slicing on arbitrary cursors, where I can quickly determine validity of the cursors. The only two cases which allow this are the ArrayList, which is array based, and the Tree classes (TreeMap, TreeSet, TreeMultiset), where determining validity is at most a O(lgN) operation. Essentially, I see slicing as a way to create a subrange of a container, where the order of the two end points can be quickly verified. auto dict = new TreeMap!(string, string); // TreeMap is sorted ... auto firstHalf = dict["A".."M"]; (You say that slicing using anything besides natural numbers shouldn't be used. You don't see any value in the above?) But "A" may not be the first element, there could be strings that are less than it (for example, strings that start with _), such is the way with arbitrary maps. So a better way to get the first half may be: auto firstHalf = dict[dict.begin.."M"]; What does the second half look like? auto secondHalf = dict["M"..dict.end]; Well, if we are to follow array convention, the second half can be shortcutted like this: auto secondHalf = dict["M"..$]; Which looks and reads rather nicely. But there is no equivalent "begin" shortcut because $ was invented for arrays, which always have a way to access the first element -- 0. Arbitrary maps have no such index. So although it's not necessary, a shortcut for begin would also be nice. Anyways, that's what led me to propose we have some kind of short cut. If nothing else, at least I hope you now see where I'm coming from, and hopefully you can see that slicing is useful in cases other than natural number indexes. -Steve
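To make the shape of that interface concrete, here is a small sketch; it is not dcollections' real code, the names (TreeMapSketch, Cursor, begin) are invented, and an associative array sorted on demand stands in for a real tree:

import std.algorithm : filter, sort;
import std.array : array;

struct TreeMapSketch
{
    string[string] impl;          // stand-in storage; a real tree keeps keys ordered

    struct Cursor { bool atEnd; } // opaque position marker

    Cursor begin()    { return Cursor(false); }
    Cursor opDollar() { return Cursor(true); }

    // key .. key slice: every key in [from, to)
    auto opSlice(string from, string to)
    {
        return sort(impl.keys).filter!(k => from <= k && k < to).array;
    }

    // key .. $ slice
    auto opSlice(string from, Cursor to)
    {
        return sort(impl.keys).filter!(k => from <= k).array;
    }

    // begin .. key slice
    auto opSlice(Cursor from, string to)
    {
        return sort(impl.keys).filter!(k => k < to).array;
    }
}

void main()
{
    auto dict = TreeMapSketch(["Alpha": "a", "Mike": "m", "_x": "?"]);
    auto secondHalf = dict["M" .. $];            // the $ shortcut reads well
    auto firstHalf  = dict[dict.begin .. "M"];   // no shortcut exists for 'begin'
    assert(secondHalf == ["Mike", "_x"]);        // "_x" sorts after the letters,
    assert(firstHalf  == ["Alpha"]);             // so "A" is not "the start"
}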
May 17 2010
KennyTM~ Wrote:If your collection supports the array interface, you can provide it explicitly, when zero means the start.If it's ordered, then why doesn't [0..15] make sense to get the first 15 elements?auto a = new OrderedDict!(int, string); a[-3] = "negative three"; a[-1] = "negative one"; a[0] = "zero"; a[3] = "three"; a[4] = "four"; assert(a[0] == "zero"); return a[0..4]; // which slice should it return?
May 17 2010
On 05/14/2010 08:20 AM, Steven Schveighoffer wrote:Currently, D supports the special symbol $ to mean the end of a container/range. However, there is no analogous symbol to mean "beginning of a container/range". For arrays, there is none necessary, 0 is always the first element. But not all containers are arrays. I'm running into a dilemma for dcollections, I have found a way to make all containers support fast slicing (basically by imposing some limitations), and I would like to support *both* beginning and end symbols. Currently, you can slice something in dcollections via: coll[coll.begin..coll.end]; -SteveDoes your collections library allow for code like coll[coll.begin + 1 .. coll.end] ?
May 17 2010
On Mon, 17 May 2010 11:02:05 -0400, Ellery Newcomer <ellery-newcomer utulsa.edu> wrote:On 05/14/2010 08:20 AM, Steven Schveighoffer wrote:No. begin and end return cursors, which are essentially non-movable pointers. The only collection where adding an integer to a cursor would be feasible is ArrayList, which does support slicing via indexes (and indexes can be added/subtracted as needed). Note, I'm not trying to make slicing dcollections as comprehensive as slicing with arrays, I'm just looking to avoid the verbosity of re-stating the container's symbol when specifying the beginning or end. In all dcollections containers, one can always slice where one of the endpoints is begin or end. I probably should at least add opDollar to ArrayList... -SteveCurrently, D supports the special symbol $ to mean the end of a container/range. However, there is no analogous symbol to mean "beginning of a container/range". For arrays, there is none necessary, 0 is always the first element. But not all containers are arrays. I'm running into a dilemma for dcollections, I have found a way to make all containers support fast slicing (basically by imposing some limitations), and I would like to support *both* beginning and end symbols. Currently, you can slice something in dcollections via: coll[coll.begin..coll.end]; -SteveDoes your collections library allow for code like coll[coll.begin + 1 .. coll.end]
May 17 2010
On 05/17/2010 10:15 AM, Steven Schveighoffer wrote:On Mon, 17 May 2010 11:02:05 -0400, Ellery Newcomer <ellery-newcomer utulsa.edu> wrote:emphasis on the semantics (slice starting at second element), not the arithmetic, sorry.Does your collections library allow for code like coll[coll.begin + 1 .. coll.end]No. begin and end return cursors, which are essentially non-movable pointers. The only collection where adding an integer to a cursor would be feasible is ArrayList, which does support slicing via indexes (and indexes can be added/subtracted as needed).
May 17 2010