digitalmars.D - Re: Revamping associative arrays

bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

I like how you leave no stone unturned. Regardless of what the "final" quality
of the D2 language will be, I will always think you have done your best to
improve it. I will say so to the people I know too.

I have already discussed D AAs in the past, but I guess it was not the
right moment to do so.

Do we want to keep them in the language? It's easy to write:
    import std.collections: HashMap;
And a flexible language is supposed to allow for a handy syntax even for
user-defined collections.
But the built-ins have a nicer syntax and a little different purpose. So I
think having the built-in syntax is good.

That syntax may map to built-in logic, or to library code. Both options are
acceptable, but I think having built-in logic is a little better.
But I also need templated HashMaps and TreeMaps in a std.collections module.

Using templates leads to faster code, so the hash in collections can be
templated. But templates slow down compilation and inflate binary size. Built-in
AAs have a handy syntax, so they get used more often, even where you put only 6
items into them. So I'd like built-in AAs to be optimized for a very small
number of items too. There are several ways to do this. Built-in AAs must be
the most general and very flexible, easy to use, and not bug-prone. I don't
need built-in AAs to be the fastest ones, or the ones that use the least memory.
That is an optimization, so for such more demanding needs I can use the HashMap
from the collections module.

So a good thing about built-in AAs is that they don't inflate code a lot, and
compilation is fast, so they can be used everywhere in the program where
performance in both speed and memory isn't critical (and this happens).

I like how the current AA design never degenerates to O(n^2) behaviour
(Python dicts are much faster than the current D AAs, but in corner cases
Python dicts go quadratic). But this quality has the disadvantage of requiring
an opCmp too for items that I want to put in an AA (because final chaining is
resolved with an ordered tree), while in Python and other languages I need just
an opEq and an opHash.
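
To make the Python side of that comparison concrete, here is a minimal sketch of a user-defined key type that defines only equality and hashing. The Point class and the labels dict are hypothetical names chosen for illustration; no ordering method is defined at all.

```python
class Point:
    """Hypothetical key type: only equality and hashing are defined."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Two points are equal when both coordinates match.
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Must agree with __eq__: equal points hash to the same value.
        return hash((self.x, self.y))


labels = {Point(0, 0): "origin", Point(1, 0): "unit x"}

# A distinct but equal object finds the entry; no comparison operator is needed.
lookup = labels[Point(0, 0)]
```

Trying to sort such keys would raise a TypeError, since no ordering is defined, yet they work fine as dict keys.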

Regarding iteration, I'd like the design used in Python3, where keys and
values (and items, which are pairs, if you want) return a small iterable object
that's a view of the dict. I may also want methods that return a true array of
values/items/keys as now, because once in a while you need this too (Python3
doesn't have this, and you need to do list(d.keys())).
In Python3 keys() returns a light, read-only set-like object; this is very good
and very handy. Doing set operations is a godsend in certain kinds of code.
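
A short sketch of the Python3 behaviour referred to here, with made-up dict contents. Note that the keys view is a live view rather than a frozen snapshot: it cannot be modified directly, but it reflects later changes to the dict.

```python
stock = {"apple": 3, "pear": 0, "plum": 7}
wanted = {"plum": 1, "kiwi": 2}

keys_view = stock.keys()              # a view onto the dict, not a copy

# Set operations work directly on key views and produce ordinary sets.
common = keys_view & wanted.keys()    # keys present in both dicts
missing = wanted.keys() - keys_view   # keys wanted but not in stock

# The view tracks the dict: a key added later is visible through it.
stock["kiwi"] = 5
kiwi_visible = "kiwi" in keys_view

# When a real list is needed, it has to be requested explicitly.
key_list = sorted(stock.keys())
```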

To delete a key-value pair, another possible syntax is:
delete d[key]
But d.delete(key) is acceptable too.

The LDC compiler is able to optimize away double lookups if they are done
nearby, so it's even possible for "in" to return a boolean. (But I like to
have both !in and !is too.)

I don't use .rehash often, and it takes some time to run. So I don't know how
useful it is.

I need a .clear method too. And maybe a .copy too. AAs *must* support opEquals
too, as in LDC, because I have to know when a function returns the correct AA
inside unittests.
(Python dicts even support opCmp, but this is a little tricky, and less useful,
so this can be omitted).
These features don't require 100 lines of code, but they make AAs much more
useful and handy.
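
As an illustration of why equality matters for tests, this is roughly how the check looks with Python dicts. The count_words function is a hypothetical example, not something from dlibs.

```python
def count_words(text):
    """Hypothetical function under test: counts word occurrences."""
    counts = {}
    for word in text.split():
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


result = count_words("b a b")
expected = {"a": 1, "b": 2}

# Dict equality compares key-value pairs and ignores insertion order.
same = result == expected
```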

A method like Python's .get(key[, default]) is quite useful: it returns default
if the key is missing. In D the default probably can't be optional.
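
For reference, a minimal sketch of the Python method in question, using a made-up dict:

```python
ages = {"ann": 31}

known = ages.get("ann", -1)     # key present: the stored value is returned
unknown = ages.get("bob", -1)   # key missing: the supplied default is returned
optional = ages.get("bob")      # default omitted: None is returned

# Unlike indexing with ages["bob"], get() never raises KeyError
# and never inserts the missing key.
still_missing = "bob" not in ages
```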

fromkeys, update (and maybe pop and popitem too) methods can be useful, see
Python docs: 
http://docs.python.org/library/stdtypes.html#dict
.update can also be written as ~, but there it works only with another AA, and
it's not commutative.
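
A small sketch of the Python methods named above, with made-up data. The last part shows why update is not commutative: when both dicts share a key, the argument's value wins.

```python
# fromkeys builds a dict from a sequence of keys and one shared value.
flags = dict.fromkeys(["a", "b"], False)

# update merges another dict in place, overwriting existing keys.
flags.update({"b": True, "c": True})

# pop removes a key and returns its value; a default avoids KeyError.
removed = flags.pop("a")
fallback = flags.pop("zzz", None)

# popitem removes and returns one (key, value) pair.
last_key, last_value = flags.popitem()

# update is not commutative: the right-hand dict wins on shared keys.
left, right = {"k": 1}, {"k": 2}
left_then_right = dict(left)
left_then_right.update(right)
right_then_left = dict(right)
right_then_left.update(left)
```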

I'd like to have a literal for an empty AA, I use AA!(K,V) in my dlibs.

An old idea: if the value type of an AA is void, the AA need not allocate
memory for the values, so the AA becomes a set (and it needs a few basic set
operations, with operator overloading).

I may like an AA to be false when empty.

I'd like to have a standard way to, given a pointer to an AA value, return its
key. I have a function that does this in my dlibs (module "extra") but it's not
official, so it will break when the AA implementation changes. This is
useful if I want to implement a richer data structure on top of built-in AAs,
for example an "ordered AA" that, when iterated on, yields items in insertion
order, so values are kept linked in a doubly linked list. To do this and keep
the data structure flexible, I need to be able to find a key given a pointer
to a value.
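
The "ordered AA" idea can be sketched in Python as follows. This is only an illustration under stated assumptions: OrderedMap and _Node are hypothetical names, and each node stores its own key, which stands in for the "key from a pointer to the value" lookup that the D version would need. (Current Python dicts already preserve insertion order, so this is purely to show the structure.)

```python
class _Node:
    """Hypothetical list node; it carries its key so unlinking can find it."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class OrderedMap:
    """Hash map whose iteration follows insertion order."""

    def __init__(self):
        self._nodes = {}    # key -> node, the underlying hash map
        self._head = None   # oldest node
        self._tail = None   # newest node

    def __setitem__(self, key, value):
        node = self._nodes.get(key)
        if node is not None:
            node.value = value      # overwriting keeps the original position
            return
        node = _Node(key, value)
        self._nodes[key] = node
        node.prev = self._tail      # append to the end of the linked list
        if self._tail is None:
            self._head = node
        else:
            self._tail.next = node
        self._tail = node

    def __getitem__(self, key):
        return self._nodes[key].value

    def __delitem__(self, key):
        node = self._nodes.pop(key)  # raises KeyError if the key is absent
        if node.prev is None:
            self._head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self._tail = node.prev
        else:
            node.next.prev = node.prev

    def __iter__(self):
        node = self._head
        while node is not None:
            yield node.key
            node = node.next


m = OrderedMap()
m["c"] = 1
m["a"] = 2
m["b"] = 3
del m["a"]      # unlink from the middle of the list
m["c"] = 9      # overwrite without moving
order = list(m)
```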

I may like a "freeze" method that turns an AA into an immutable one. Ruby has
this.

A .reserve method may be useful to speed up AA creation when you know you have
to add tons of pairs.

I'd really like the "default" iteration on an AA to yield its keys, instead of
values as currently done. Because if I have a key I can find its value, while
the opposite is not possible, so having keys is much more useful. This is true
in Python too. In my dlibs all iterables and functions behave like this. The
current D design is just a design mistake and Walter was wrong on this.

Reducing the number of small memory allocations done by the AAs currently used
may be positive. Making them a little more precise for the GC will be very good
as D matures.

The management of AAs at compile-time can be better. Having literals that work
at compile time, or compile-time functions used to create runtime AAs is
positive. It can even be possible to use perfect hashes for immutable AAs built
at compile time.

Bye,
bearophile
Oct 17 2009
Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 I'd really like the "default" iteration on an AA to yield its keys,
 instead of values as currently done. Because if I have a key I can
 find its value, while the opposite is not possible, so having keys is
 much more useful. This is true in Python too. In my dlibs all
 iterables and functions behave like this. The current D design is
 just a design mistake and Walter was wrong on this.

You can iterate over both keys and values with:
     foreach (key, value; aa)
Oct 18 2009
bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:

 You can iterate over both keys and values with:
      foreach (key, value; aa)

I know this, of course, but you are missing the point by a mile. An explicit foreach is not the only way you may want to iterate over an AA. If I have a higher-order function like map, I must be able to use it on any kind of iterable: an array, an associative array, a set, a list, and so on. I must be able to change the data structure I give to map, if the need arises while I change the code, and it has to keep working. If the mapping function takes a single argument, map has to choose what to iterate on: keys or values (or even both, as pairs). In that case iterating on keys is more useful. That's how all my dlibs higher-order functions work when you give them an AA: they iterate on AA keys, array items, set items, list items, etc.

If you still think I am not right, you may ask for a poll here, to see how many prefer the default AA iteration to be on keys or on values.

Bye,
bearophile
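
The argument can be illustrated with Python, where the built-in map accepts any iterable and a dict iterates over its keys by default. The lengths helper and the sample data are hypothetical.

```python
def lengths(iterable):
    """Hypothetical helper: maps len over any iterable of strings."""
    return sorted(map(len, iterable))


names_list = ["ann", "robert"]
names_set = {"ann", "robert"}
names_dict = {"ann": 31, "robert": 47}   # same names, now with values attached

# The same call works unchanged on all three containers,
# because iterating the dict yields its keys.
from_list = lengths(names_list)
from_set = lengths(names_set)
from_dict = lengths(names_dict)
```

Had the dict iterated over its values instead, the last call would fail here, since len is not defined for integers.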
Oct 18 2009
Piotrek <starpit tlen.pl> writes:
bearophile Wrote:

 I'd really like the "default" iteration on an AA to yield its keys, instead of
values as currently done. Because if I have a key I can find its value, while
the opposite is not possible, so having keys is much more useful. This is true
in Python too. In my dlibs all iterables and functions behave like this. The
current D design is just a design mistake and Walter was wrong on this.
 

No! No! No! Maybe you are wrong. Or it's a matter of taste. I remember spending many hours finding a bug in a Python script I wrote at my job. The reason was the Python behaviour you describe. At the time I couldn't understand why the hell iterating on a collection returns a key in the first place. It's so unintuitive. Your explanation is not even close to convincing me. If I wanted keys I would write:

  foreach (key, value; set) or (key; set.keys)

Now I know why I don't like Python, and I hope I will never need it again. For scripting (but not at work, since I don't do scripting anymore) I prefer D (rdmd).

bearophile, I like your great commitment to D development, but I don't like pushing D toward the pythonish world. (Of course some ideas from the Python project could be successfully used in D.)

Cheers
Piotrek
Oct 18 2009
bearophile <bearophileHUGS lycos.com> writes:
Piotrek:

No! No! No! Maybe you are wrong.<

Life is complex so I am usually wrong, because it's hard to consider all sides of a thing, but sometimes other people are even more wrong :-)

I remember that I spent many hours on finding bug in python's script written by
me in my job. The reason was the python's behaviour described by you.<

I have followed many Python programmers for a long time and I think this problem of yours is not common. But I'll keep an eye open for other people with your problem.

Then I couldn't understand why the hell iterating on collection returns a key
in the first place. It's so not intuitive.<

What's intuitive about iterating on values? Well, I think Walter agrees with you, I remember his explanation (iterating on a normal array doesn't yield its indexes), but besides what's intuitive you also have to keep in mind what's handy, and iterating on keys is more useful.

Besides the things I have said, think of associative arrays as sets where there is a value associated with each set item (this is how Python dicts are implemented, and how the light iterable objects they give you when you ask for keys in Python3 behave). When you iterate on the set you get the set items, so if you extend the set with values you keep iterating on the set items...

 Your explanation is not even close in convincing me. If I wanted keys I would
write:
  foreach (key, value; set) or (key; set.keys)

In D set.keys needs a good amount of memory and time, so it's useful only in special situations, for example when you are sure your AA is small. Andrei will probably push to add/replace something to get the keys in a lazy way.

Now I know why I don't like Python and I hope I will never have to need it
again. For scripting (but not at work since I don't do scripting anymore) I
prefer D (rdmd).<

Refusing forever to use a popular (and usually quite appreciated) language just because you don't like a single small feature is stupid. I don't want to force you to like Python, but the choice of a language must be based on a somewhat more global evaluation of it. Every language under the sun has plenty of warts. C++ has enough warts (far larger than the one you have listed) that you can write a big book on them, but people still keep using it.

"Not doing scripting any more" is probably not a smart thing to say either, because in most programming jobs I've seen there are small files to munge, commands to automate, things to show or plot, and so on. A scripting language (even just shell scripting) is designed for such things.

bearophile, I like your great commitment in D development but I don't like
pushing D toward pythonish world. (Of course some ideas from Python project
could be successfully used in D)<

Thank you :-) I've known several languages (I think this is normal in this newsgroup), and I try to suggest what my experience shows me :-) I'm often wrong anyway.

Bye,
bearophile
Oct 18 2009
Sergey Gromov <snake.scaly gmail.com> writes:
Sun, 18 Oct 2009 06:18:34 -0400, bearophile wrote:

 Then I couldn't understand why the hell iterating on collection
 returns a key in the first place. It's so not intuitive.<

What's intuitive on iterating on values? Well, I think Walter agrees with you, I remember his explanation (iterating on a normal array doesn't yield its indexes), but beside what's intuitive you have also to keep in mind what's handy, and iterating on keys is more useful.

It's easy to see what's intuitive if you consider what a collection contains. To me, it contains *values*, always. These values may be indexed: by an arbitrary key (AA), by an integral index (array), or not at all (singly linked list). But the index is not the point, it's only a way to access values. And when I iterate over a collection, I definitely want to iterate over the values it contains, regardless of the indexing scheme this particular collection uses.
Oct 18 2009
Piotrek <starpit tlen.pl> writes:
bearophile wrote:
 Piotrek:
 
 No! No! No! Maybe you are wrong.<

Life is complex so I am usually wrong, because it's hard to consider all sides of a thing, but sometimes other people are even more wrong :-)

I didn't mean to be offensive. All people (including me) have a tendency to claim that their point is the only right one.

 In D set.keys needs a good amount of memory and time, it's useful only in
special situations, for example when you are sure your AA is small. Andrei will
probably push to add/replace something to get the keys in a lazy way.
 

Yes, that could be good. Views and ranges are an option, of course. However, I was referring to semantics, not mechanics.

 Refusing forever to use a popular (and usually quite appreciated) language
just because you don't like a single small feature is stupid. I don't want to
force you to like Python, but the choice of a language must be based on a bit
more global evaluation of it. Every language under the sun has plenty warts.
C++ has enough warts (far larger than the one you have listed) that you can
write a big book on them, but people keep using it still.

The longer I live, the more stupid things I find in my behaviour. But to be honest, I have a longer list of reasons why I don't like Python:
- dynamic typing
- whitespace indentation
- performance
- not friendly for C-syntax boys like me
- lack of many features that are in D
and some more.

I find D to be a sufficient alternative to all scripting languages (a good std library is the cure). I said it before but I can say it again: using many languages when one should do is a waste of my time and efficiency ;)

C++ (or C in embedded systems) is used because of the extremely big momentum built up over the years, expressed in money invested in learning, software development, etc. But hopefully everything can be stopped. It's a matter of time and force.

 "Not doing scripting any more" too is probably a not smart thing to say,
because in most programming jobs I've seen, there are small files to munge,
commands to automate, things to show or plot, and so on, etc. A scripting
language (even just shell scripting) is designed for such things.

It's not my choice and I'm glad about it. My duties don't include scripting any more. And to me, scripting languages are some kind of joke (with D around).

Although it could seem (at first look) that I'm against you, I'm not (except in the area connected with those creatures). I really like the benchmarking you do, and I really appreciate how much time (much, much more than me, of course) you have given this community. First I must pay some debts, then maybe I will start the project I have been thinking of for a long time.

Cheers
Piotrek
Oct 18 2009
grauzone <none example.net> writes:
Piotrek wrote:
 bearophile Wrote:
 
 I'd really like the "default" iteration on an AA to yield its keys, instead of
values as currently done. Because if I have a key I can find its value, while
the opposite is not possible, so having keys is much more useful. This is true
in Python too. In my dlibs all iterables and functions behave like this. The
current D design is just a design mistake and Walter was wrong on this.

No! No! No! Maybe you are wrong. Or it's a matter of taste. I remember spending many hours finding a bug in a Python script I wrote at my job. The reason was the Python behaviour you describe. At the time I couldn't understand why the hell iterating on a collection returns a key in the first place. It's so unintuitive. Your explanation is not even close to convincing me. If I wanted keys I would write: foreach (key, value; set) or (key; set.keys)

In a perfect world, iterating over an AA would yield a tuple (key, value). You could iterate over either the keys or values by iterating over a "view" on the key or value list. I'm surprised Python doesn't do that.

(I'm expecting that Andrei will replace the .keys and .values properties with "lazy" ranges that don't allocate memory, so that aspect will be all right. But too bad the gods don't see the need for better tuple support.)
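
For reference, Python3 does offer pair iteration, just not as the default: items() returns a view of (key, value) tuples alongside the keys() and values() views. A minimal sketch with made-up data:

```python
prices = {"tea": 2, "coffee": 3}

# Default iteration yields keys only.
default_iteration = sorted(prices)

# items() is a view of (key, value) tuples; values() is a view of the values.
pairs = sorted(prices.items())
total = sum(prices.values())

# Pair iteration with tuple unpacking in the loop header.
lines = []
for name, price in sorted(prices.items()):
    lines.append(f"{name}: {price}")
```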
Oct 18 2009
bearophile <bearophileHUGS lycos.com> writes:
grauzone:

 In a perfect world, iterating over an AA would yield a tuple (key, 
 value). You could iterate over either the keys or values by iterating 
 over a "view" on the key or value list. I'm surprised Python doesn't do 
 that.

I don't know why Python originally chose to iterate on just keys. Maybe it's a performance optimization: iterating on pairs is slower, because a single reference needs no allocation while a tuple of 2 may need one (in practice CPython reuses the same memory for the tuple, avoiding an allocation at every loop cycle). Or maybe it's for one of the reasons I've explained. I don't know if D can make looping on key-value pairs of built-in AAs as fast as iterating on just keys or just values. Iterating on AA keys or values is a very common operation that must be fast.

 (I'm expecting that Andrei will replace the .keys and .values properties
 by "lazy" ranges, that don't allocate memory; so that aspect will be
 alright.

At least, the range of the keys should support an O(1) opIn_r, I hope :-)

Bye,
bearophile
Oct 18 2009
Christopher Wright <dhasenan gmail.com> writes:
Piotrek wrote:
 bearophile Wrote:
 
 I'd really like the "default" iteration on an AA to yield its keys, instead of
values as currently done. Because if I have a key I can find its value, while
the opposite is not possible, so having keys is much more useful. This is true
in Python too. In my dlibs all iterables and functions behave like this. The
current D design is just a design mistake and Walter was wrong on this.

No! No! No! Maybe you are wrong. Or it's a matter of taste. I remember spending many hours finding a bug in a Python script I wrote at my job. The reason was the Python behaviour you describe. At the time I couldn't understand why the hell iterating on a collection returns a key in the first place. It's so unintuitive. Your explanation is not even close to convincing me. If I wanted keys I would write: foreach (key, value; set) or (key; set.keys)

Why not mandate using both keys and values? That should eliminate ambiguity. Essentially, an associative array would be a Tuple!(Tkey, Tvalue)[] with some extra accessors.
Oct 18 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 
 I like how you leave no stone unturned. Regardless of what the
 "final" quality of the D2 language will be, I will always think you
 have done your best to improve it. I will say so to the people I know
 too.

Well, it should be said Walter is doing his best too, I'm just bitching more :o).

 I have already discussed about D AAs in the past, but I guess it was
 not the right moment to do.

My thoughts are aligned with yours. I'm not sure whether much or any of this is attainable within D2.

Andrei
Oct 18 2009
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

I'm not sure whether much or any of this is attainable within D2.<

The things I've written about aren't all equally important and they don't require the same amount of work to implement. I am sorry for writing that gazpacho. There are two things that I think are more important and that can be implemented now: opEquals among AAs, and default iteration with foreach on the keys. The other things can wait. The opEquals among AAs probably requires less than 20 lines of code.

Two persons (plus Walter, of course) have said they don't like to iterate on the keys first. The other people have kept their muzzles shut so I can't tell yet.

Bye,
bearophile
Oct 18 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 
 I'm not sure whether much or any of this is attainable within D2.<

The things I've written about aren't all equally important and they don't require the same amount of work to implement. I am sorry for writing that gazpacho. There are two things that I think are more important and that can be implemented now: opEquals among AAs, and default iteration with foreach on the keys. The other things can wait. The opEquals among AAs probably requires less than 20 lines of code.

You may want to submit those to bugzilla so they don't get forgotten.

 Two persons (plus Walter, of course) have said they don't like to
 iterate on the keys first. The other people have kept muzzle shut so
 I can't tell yet.

Well, clearly sometimes both are needed. I think it would be limiting to, e.g., offer only iteration on keys and then require one extra lookup to fetch the value.

Andrei
Oct 18 2009
Piotrek <starpit tlen.pl> writes:
Bill Baxter wrote:
 I think the default should be to iterate over whatever 'in' looks at.
 

I was almost convinced, because that rule makes sense. But treating normal arrays and associative arrays the same way makes more sense to me.

  void fun(SomeObject object) {
    foreach (element; object.arr1) { // normal, but how do I know at first look
      // just do something with element
    }
    foreach (element; object.arr2) { // assoc, but how do I know at first look
      // just do something with element, hopefully not index
    }
  }

Cheers
Piotrek
Oct 18 2009
bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:
It shouldn't be harder to port D1 code to D2 than C code!<

In the world, for every line of D1 code you may want to port to D2 there are probably 100 or 1000+ lines of C code that you may want to port to D2, so the situation is not the same. Keeping compatibility with C is far more important than keeping compatibility with D1.

Bye,
bearophile
Oct 18 2009
bearophile <bearophileHUGS lycos.com> writes:
Denis Koroskin:

Why would you want to port C code to D, if you can easily interface with it?<

First of all you have to consider programmer experience: they know C, so keeping the language backwards compatible with C helps them avoid bugs and learn D faster. If you ignore what programmers know, and you assume an easy interface between a new language and C, then you don't need to design a language like C++/D that keeps a lot of little compatibilities with C; you are more free, and you can avoid warts like the design of C's switch().

I have ported some thousands of lines of C code to D. Surely there are situations where keeping a large amount of C code is the best thing to do, and saves you a lot of time and work. But other times you may want to port the C code.

Some of the reasons you may have to port C code to D:
- because I like the look of D code more than C code;
- because the original C code may be so old and ugly that keeping it in my project hurts my aesthetic sense;
- because D gives me more safety than certain C code;
- because D allows me to use a GC that may be safer than the original manual memory management;
- because I may use my dlibs and shorten the original C code. Less code is usually a good thing;
- because I will probably need to change and improve the code and I prefer to do it on D code that's nicer and allows me to program in a faster way;
- because I use bud to compile small D projects and in them adding a C dependency requires more time than translating 20 lines of C code and adding it into an already existing D module;
- because I am creating some pure D library, to keep things simpler and tidy;
- because I'd even like to see the official D2 compiler written in D1/D2 (and a little assembly).

Bye,
bearophile
Oct 18 2009
Piotrek <starpit tlen.pl> writes:
Bill Baxter wrote:
 On Sun, Oct 18, 2009 at 1:12 PM, Piotrek <starpit tlen.pl> wrote:
 Bill Baxter wrote:
 I think the default should be to iterate over whatever 'in' looks at.

 But treating normal arrays and associative arrays the same way makes more sense to me.

   void fun(SomeObject object) {
     foreach (element; object.arr1) { // normal, but how do I know at first look
       // just do something with element
     }
     foreach (element; object.arr2) { // assoc, but how do I know at first look
       // just do something with element, hopefully not index
     }
   }

That sounds like an argument that there should be no default, because either way it's not clear whether you're iterating over keys or values.

Really?! That wasn't my intention :) In both cases I wish it were values ;)

 Just get rid of the one-argument foreach over AAs altogether and force the
 user to be explicit about it.

I wouldn't do so. Would anybody make the mistake of thinking that foreach (elem; table) should iterate over keys?

Maybe I'm not thinking correctly, but for me an assoc array is just an array with additional key (index) features, thanks to which I save space and/or have more indexing methods than only integers. E.g.:

Normal array:

  No.  Item
  0    George
  1    Fred
  2    Dany
  3    Lil

Index/key is inferred from position (offset).

Now assoc array:

  No.   Item
  10    Lindsey
  21    Romeo
  1001  C-Jay

Or:

  No.     Item
  first   Europe
  second  South America
  third   Australia

Or names occurrence frequency:

  No.   Item
  Andy  21
  John  23
  Kate  12

And the only difference is the need to use a hash function for value lookup (to calculate the position), which should not bother a user who doesn't care.

Then when you ask somebody to iterate over the tables, what will he almost certainly do? If it were me, you know... values all the time. Even in the last example the most important values are those numbers (although in this case they're meaningless without keys).

Cheers
Piotrek
Oct 18 2009
Piotrek <starpit tlen.pl> writes:
Bill Baxter wrote:
 Just get rid of the one-argument foreach over AAs altogether and force
 the user to be
 explicit about it.

 Would anybody do an error by thinking that foreach (elem; table) should iterate over keys?

Bearophile. And anyone coming from python, at the least. And anyone who agrees with the logic of connecting 'in' with what gets iterated.

And only Python, as far as I am aware. Java, C#, and PHP (which account for most programmers) default to values unless explicitly stated otherwise.

 Maybe I'm not thinking correctly but for me an assoc array is just an array
 with additional key (index) features thanks to which I save space and/or
 have more indexing method than only integers.

It can also be thought of as a set with some ancillary data associated with each element. In that case the keys are the set elements, and the values are just some extra stuff hanging off the elements. --bb

Sorry in advance, I couldn't resist. From Wikipedia:

"From the perspective of a computer programmer, an associative array can be viewed as a generalization of an array. While a regular array maps an index to an arbitrary data type such as integers, other primitive types, or even objects, an associative array's keys can be arbitrarily typed. The values of an associative array do not need to be the same type, although this is dependent on the programming language"

So it is almost the same as what I said. I hadn't seen the wiki entry before, nor did I change the article's text ;) So now you can see that for most people (except Python guys and maybe some more) it should behave like a normal array. It's just intuitive. What you are talking about is a side effect of AAs (or rather a derivative feature), and for that we use the keys property or the key, value pair.

Cheers
Piotrek
Oct 19 2009
KennyTM~ <kennytm gmail.com> writes:
On Oct 20, 09 03:40, Piotrek wrote:
 Bill Baxter wrote:
 Just get rid of the one-argument foreach over AAs altogether and
 force
 the user to be
 explicit about it.

 Would anybody do an error by thinking that foreach (elem; table) should iterate over keys?

Bearophile. And anyone coming from python, at the least. And anyone who agrees with the logic of connecting 'in' with what gets iterated.

And only Python, as far as I am aware. Java, C#, and PHP (which account for most programmers) default to values unless explicitly stated otherwise.

And JavaScript. And Objective-C.

 Maybe I'm not thinking correctly but for me an assoc array is just an
 array
 with additional key (index) features thanks to which I save space and/or
 have more indexing method than only integers.

It can also be thought of as a set with some ancillary data associated with each element. In that case the keys are the set elements, and the values are just some extra stuff hanging off the elements. --bb

Sorry in advance, I couldn't resist. From Wikipedia: "From the perspective of a computer programmer, an associative array can be viewed as a generalization of an array. While a regular array maps an index to an arbitrary data type such as integers, other primitive types, or even objects, an associative array's keys can be arbitrarily typed. The values of an associative array do not need to be the same type, although this is dependent on the programming language" So it is almost the same as what I said. I hadn't seen the wiki entry before, nor did I change the article's text ;) So now you can see that for most people (except Python guys and maybe some more) it should behave like a normal array. It's just intuitive. What you are talking about is a side effect of AAs (or rather a derivative feature), and for that we use the keys property or the key, value pair. Cheers Piotrek

Oct 19 2009
Piotrek <starpit tlen.pl> writes:
Bill Baxter wrote:
 
 So C# avoids the ambiguity.
 
 Unless there's some other AA type in C#.
 
 --bb

True. Java does it too. My fault.

Cheers
Piotrek
Oct 21 2009
Pelle Månsson <pelle.mansson gmail.com> writes:
Piotrek wrote:
 Bill Baxter wrote:
 On Sun, Oct 18, 2009 at 1:12 PM, Piotrek <starpit tlen.pl> wrote:
 Bill Baxter wrote:
 I think the default should be to iterate over whatever 'in' looks at.

 But treating normal arrays and associative arrays the same way makes more sense to me.

   void fun(SomeObject object) {
     foreach (element; object.arr1) { // normal, but how do I know at first look
       // just do something with element
     }
     foreach (element; object.arr2) { // assoc, but how do I know at first look
       // just do something with element, hopefully not index
     }
   }

 That sounds like an argument that there should be no default, because either way it's not clear whether you're iterating over keys or values.

 Really?! That wasn't my intention :) In both cases I wish it were values ;)

 Just get rid of the one-argument foreach over AAs altogether and force the user to be explicit about it.

 I wouldn't do so. Would anybody make the mistake of thinking that foreach (elem; table) should iterate over keys?

 Maybe I'm not thinking correctly, but for me an assoc array is just an array with additional key (index) features, thanks to which I save space and/or have more indexing methods than only integers. E.g.:

 Normal array:

   No.  Item
   0    George
   1    Fred
   2    Dany
   3    Lil

 Index/key is inferred from position (offset).

 Now assoc array:

   No.   Item
   10    Lindsey
   21    Romeo
   1001  C-Jay

 Or:

   No.     Item
   first   Europe
   second  South America
   third   Australia

 Or names occurrence frequency:

   No.   Item
   Andy  21
   John  23
   Kate  12

 And the only difference is the need to use a hash function for value lookup (to calculate the position), which should not bother a user who doesn't care.

 Then when you ask somebody to iterate over the tables, what will he almost certainly do? If it were me, you know... values all the time. Even in the last example the most important values are those numbers (although in this case they're meaningless without keys).

 Cheers
 Piotrek

Put it this way: Is there any time you are interested in the values without the keys? Is there any time you are interested in the keys without the values? If you're not interested in the keys, the real question would be why you are using an associative array instead of just an array. I can think of at least one example of when you want key iteration, which would be when using a bool[T] as a set.
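The bool[T]-as-set idiom mentioned above could look roughly like the sketch below. It is only an illustration: `makeSet` is a hypothetical helper, not anything from the thread or from Phobos, and `.keys` is used because it makes the key iteration explicit whatever the one-argument foreach default is.

```d
import std.stdio : writeln;

// Hypothetical helper: builds a set of strings as bool[string].
bool[string] makeSet(string[] items)
{
    bool[string] set;
    foreach (item; items)
        set[item] = true; // the value is a dummy; only the key matters
    return set;
}

void main()
{
    auto seen = makeSet(["apple", "pear", "apple"]);

    // Membership test: `in` looks at the keys.
    assert("apple" in seen);
    assert("plum" !in seen);
    assert(seen.length == 2); // the duplicate "apple" collapses into one key

    // Iterating the set means iterating the keys; the values are all `true`.
    foreach (word; seen.keys)
        writeln(word);
}
```

Here iterating over the values would only ever yield `true`, which is why this use case wants keys.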
Oct 21 2009
parent reply Piotrek <starpit tlen.pl> writes:
Pelle Månsson wrote:
 Piotrek wrote:
 Bill Baxter wrote:
 On Sun, Oct 18, 2009 at 1:12 PM, Piotrek <starpit tlen.pl> wrote:
 Bill Baxter wrote:
 I think the default should be to iterate over whatever 'in' looks at.

normal arrays and associative array has more sense to me.

fun (SomeObject object) {
    foreach (element; object.arr1) { // normal, but how do I know at first look
        // just do something with element
    }
    foreach (element; object.arr2) { // assoc, but how do I know at first look
        // just do something with element, hopefully not index
    }
}

That sounds like an argument that there should be no default, because either way it's not clear whether you're iterating over keys or values.

Really?! That wasn't my intention :) In both cases I wish it were values ;)

 Just get rid of the one-argument foreach over AAs altogether and force the user to be
 explicit about it.

I wouldn't do so. Would anybody make an error by thinking that foreach (elem; table) should iterate over keys?

Maybe I'm not thinking correctly but for me an assoc array is just an array with additional key (index) features thanks to which I save space and/or have more indexing method than only integers.

e.g. Normal array

No.   Item
0     George
1     Fred
2     Dany
3     Lil

Index/key is inferred from position (offset)

Now Assoc array:

No.    Item
10     Lindsey
21     Romeo
1001   C-Jay

Or

No.      Item
first    Europe
second   South America
third    Australia

Or Names occurrence frequency:

No.    Item
Andy   21
John   23
Kate   12

And the only difference is the need for using a hash function for value lookup (calculate position) which should not bother a user when he doesn't care.

Then when you ask somebody to iterate over the tables, what he will do almost for certain? If it would be me, you know... values all the time. Even for last example most important values are those numbers (despite in this case they're meaningless without keys).

Cheers
Piotrek

Put it this way: Is there any time you are interested in the values without the keys?

Yes!
  Is there any time you are interested in the keys without the values?
 

Yes!
 If you're not interested in the keys, the real question would be why you 
 are using an associative array instead of just an array.
 

The answer is simple. I can reuse an AA in many different functions. Sometimes I need the keys, other times the values, and even... both :) That isn't the issue. The question was what the short version of foreach over an AA should return.
 I can think of at least one example of when you want key iteration, 
 which would be when using a bool[T] as a set.

See above. Cheers Piotrek
Oct 21 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Piotrek:

 The problem was about what should return short version of 
 foreach over AA.

Relax. Even though both you and Walter are wrong, things will not change, and D will probably keep the wrong design you like :-)

Bye,
bearophile
Oct 21 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Sun, Oct 18, 2009 at 10:56 AM, bearophile <bearophileHUGS lycos.com> wrote:

 The opEquals among AAs requires probably less than 20 lines of code.
 Two persons (plus Walter, of course) have said they don't like to iterate on
the keys first. The other people have kept muzzle shut so I can't tell yet.

I typed a long post that weighed lots of pros and cons of different options, but then I hit upon a simple rule that I think makes a lot of sense:

I think the default should be to iterate over whatever 'in' looks at. And conversely, I think 'in' should compare against whatever default iteration iterates over.

Proposed:
  arrays -- default iteration over values, "x in A" answers if x is one of the values.
  assoc arrays -- default iteration over keys, "x in AA" answers if x is one of the keys
  sets -- iteration over keys (or call 'em values, could be either), "x in S" answers if x is one of them

I think this is what Python uses, actually. But I just think it makes a lot of sense to say that 'x in Y' should be some kind of shorthand for

    foreach(thing; Y) { if (x == thing) { return something useful } }

I guess that's even clearer in Python where you iterate by writing "for thing in Y:"

This looks to me like a general rule that trumps a rule which merely stems from the happenstantial syntactic similarity between arrays and associative arrays.

--bb
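The proposed rule can be sketched in D as follows. This is only an illustration under stated assumptions: both helper functions are hypothetical, D's built-in `in` works on associative arrays but not on plain arrays, and the AA loop goes through `.keys` so it does not depend on what the one-argument foreach default is.

```d
// Hypothetical helper: "x in A" for a plain array, spelled out as a loop
// over the values that default array iteration yields.
bool containsByIteration(T)(T[] haystack, T needle)
{
    foreach (thing; haystack)
        if (thing == needle)
            return true;
    return false;
}

// Hypothetical helper: the same loop over an AA's keys, which is what the
// built-in `in` operator looks at.
bool containsKeyByIteration(V, K)(V[K] aa, K needle)
{
    foreach (key; aa.keys)
        if (key == needle)
            return true;
    return false;
}

void main()
{
    int[] arr = [10, 20, 30];
    int[string] ages = ["Andy": 21, "John": 23];

    assert(containsByIteration(arr, 20));
    assert(!containsByIteration(arr, 21));

    // The key loop agrees with built-in `in`, which returns a pointer
    // to the value, or null when the key is absent.
    assert(containsKeyByIteration(ages, "Andy") == (("Andy" in ages) !is null));
    assert(containsKeyByIteration(ages, "Kate") == (("Kate" in ages) !is null));
}
```

Under the rule, a one-argument foreach over `ages` would visit exactly the things that `in` can find, that is, the keys.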
Oct 18 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Sun, Oct 18, 2009 at 1:12 PM, Piotrek <starpit tlen.pl> wrote:
 Bill Baxter wrote:
 I think the default should be to iterate over whatever 'in' looks at.

I was almost convinced, because that rule has a sense. But treating normal arrays and associative array has more sense to me.

fun (SomeObject object) {
    foreach (element; object.arr1) { // normal, but how do I know at first look
        // just do something with element
    }
    foreach (element; object.arr2) { // assoc, but how do I know at first look
        // just do something with element, hopefully not index
    }
}

That sounds like an argument that there should be no default, because either way it's not clear whether you're iterating over keys or values.

That's reasonable too, I think. Just get rid of the one-argument foreach over AAs altogether and force the user to be explicit about it. Probably much less error-prone than quietly changing the D1 default, for sure. :-)

As much as people go on about making it easy to port C code, really ya gots to think about all the D1 code too. It shouldn't be harder to port D1 code to D2 than C code!

So for a new language I would go for what I said before. But for D, I think the better move is to get rid of the one-arg foreach and require .keys / .values explicitly. (And make that efficient, of course).

--bb
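What "require .keys / .values explicitly" would look like at the call site can be sketched like this. The helpers `sortedKeys` and `sumValues` are hypothetical names made up for the illustration; `.keys` and `.values` are the existing AA properties, each of which returns a newly allocated array.

```d
import std.algorithm : sort;
import std.stdio : writeln;

// Hypothetical helper: explicit key iteration via .keys, sorted for stable output.
string[] sortedKeys(int[string] aa)
{
    auto ks = aa.keys; // a fresh array holding the keys
    sort(ks);
    return ks;
}

// Hypothetical helper: explicit value iteration via .values.
int sumValues(int[string] aa)
{
    int total = 0;
    foreach (count; aa.values)
        total += count;
    return total;
}

void main()
{
    int[string] freq = ["Andy": 21, "John": 23, "Kate": 12];

    foreach (name; sortedKeys(freq))
        writeln(name);

    assert(sumValues(freq) == 56);

    // The two-argument form names both parts, so nothing is left to a default.
    foreach (name, count; freq)
        writeln(name, " -> ", count);
}
```

Because `.keys` and `.values` copy into a new array, the "make that efficient" remark matters; current D releases also provide lazy `byKey` and `byValue` ranges for that purpose.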
Oct 18 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Mon, 19 Oct 2009 01:37:36 +0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Bill Baxter:
 It shouldn't be harder to port D1 code to D2 than C code!<

In the world for every line of D1 code you may want to port to D2, there are probably 100 or 1000+ lines of C code that you may want to port to D2, so the situation is not the same. Keeping compatibility with C is far more important than keeping compatibility with D1. Bye, bearophile

Why would you want to port C code to D, if you can easily interface with it?
Oct 18 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Sun, Oct 18, 2009 at 3:28 PM, Piotrek <starpit tlen.pl> wrote:
 Bill Baxter wrote:
 On Sun, Oct 18, 2009 at 1:12 PM, Piotrek <starpit tlen.pl> wrote:
 Bill Baxter wrote:
 I think the default should be to iterate over whatever 'in' looks at.

normal arrays and associative array has more sense to me.

fun (SomeObject object) {
    foreach (element; object.arr1) { // normal, but how do I know at first look
        // just do something with element
    }
    foreach (element; object.arr2) { // assoc, but how do I know at first look
        // just do something with element, hopefully not index
    }
}

That sounds like an argument that there should be no default, because either way it's not clear whether you're iterating over keys or values.

Really?! That wasn't my intention :) In both cases I wish it were values ;)

Got that, but we got two explanations for the "most logical behavior". I like my explanation, you like yours. Given that in a sample size of like 4 here we can't agree, I don't see much hope of there being overwhelming agreement on the right behavior in the population at large.

Seems like there's interest in making it harder to make mistakes with D (see the T[new] discussion), and there's a genuine ambiguity here so the user should be made to specify what they want. Otherwise it's easy to make mistakes. I know I've gotten it wrong before. To the point where I pretty much always just use the foreach(k,v; AA) form now just to be sure. Even if I don't need the values. Or the keys or whichever it is ;-)
 Just get rid of the one-argument foreach over AAs altogether and force
 the user to be
 explicit about it.

I wouldn't do so. Would anybody make an error by thinking that foreach (elem; table) should iterate over keys?

Bearophile. And anyone coming from Python, at the least. And anyone who agrees with the logic of connecting 'in' with what gets iterated.
 Maybe I'm not thinking correctly but for me an assoc array is just an array
 with additional key (index) features thanks to which I save space and/or
 have more indexing method than only integers.

It can also be thought of as a set with some ancillary data associated with each element. In that case the keys are the set elements, and the values are just some extra stuff hanging off the elements. --bb
Oct 18 2009
prev sibling next sibling parent language_fan <foo bar.com.invalid> writes:
Sun, 18 Oct 2009 18:01:51 -0400, bearophile thusly wrote:

 Denis Koroskin:
 
Why would you want to port C code to D, if you can easily interface with
it?<

First of all you have to consider programmer experience, they know C, so keeping the language backwards compatible with C helps them avoid bugs and learn D faster.

Is this a joke? Being backwards compatible with C often means that the compiler does not test for as many bugs at compile time. You probably have some kind of vague idea of the improvements over traditional C. Are you talking about competent programmers here? One part of mastering the skill of programming means that you can easily switch languages if need be. You are a sucky novice if the only languages you know are C & C++.
Oct 19 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Oct 19, 2009 at 1:58 PM, KennyTM~ <kennytm gmail.com> wrote:
 On Oct 20, 09 03:40, Piotrek wrote:
 Bill Baxter wrote:
 Just get rid of the one-argument foreach over AAs altogether and
 force
 the user to be
 explicit about it.

I wouldn't do so. Would anybody make an error by thinking that foreach (elem; table) should iterate over keys?

Bearophile. And anyone coming from Python, at the least. And anyone who agrees with the logic of connecting 'in' with what gets iterated.

And only Python, as far as I am aware. Java, C#, PHP (which account for most programmers) default to values unless explicitly stated.

and Javascript. and Objective-C.

I took a look at C# yesterday, and I don't think it is true there. As far as I can tell IDictionary is the closest thing to AA in C#, and for it default iteration yields Key,Value pairs called DictionaryEntry. MSDN Says: "Since each element of the IDictionary object is a key/value pair, the element type is not the type of the key or the type of the value. Instead, the element type is DictionaryEntry." So C# avoids the ambiguity. Unless there's some other AA type in C#. --bb
Oct 19 2009
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
Bill Baxter, on October 19 at 14:18 you wrote to me:
 On Mon, Oct 19, 2009 at 1:58 PM, KennyTM~ <kennytm gmail.com> wrote:
 On Oct 20, 09 03:40, Piotrek wrote:
 Bill Baxter wrote:
 Just get rid of the one-argument foreach over AAs altogether and
 force
 the user to be
 explicit about it.

I wouldn't do so. Would anybody make an error by thinking that foreach (elem; table) should iterate over keys?

Bearophile. And anyone coming from Python, at the least. And anyone who agrees with the logic of connecting 'in' with what gets iterated.

And only Python, as far as I am aware. Java, C#, PHP (which account for most programmers) default to values unless explicitly stated.

and Javascript. and Objective-C.

I took a look at C# yesterday, and I don't think it is true there. As far as I can tell IDictionary is the closest thing to AA in C#, and for it default iteration yields Key,Value pairs called DictionaryEntry. MSDN Says: "Since each element of the IDictionary object is a key/value pair, the element type is not the type of the key or the type of the value. Instead, the element type is DictionaryEntry."

I guess everybody knows, but this is what C++ does =P

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Did you know the originally a Danish guy invented the burglar-alarm
unfortunately, it got stolen
Oct 19 2009