digitalmars.D - Empty array literals

bearophile (44/44) Mar 06 2010 First part:

Walter Bright (2/5) Mar 07 2010 Actually, that's how D2 associative arrays are done.
Don (4/14) Mar 07 2010 That IS what Walter thinks, actually. It's being moved to libraries

Michael Rynn (29/32) Mar 07 2010 Sounds good. So the source is in druntime?

Lutger (3/48) Mar 07 2010 Perhaps this dsource project will interest you:

grauzone (6/53) Mar 08 2010 I suppose the compiler automatically maps AAs to this struct:

bearophile (5/11) Mar 07 2010 But the point of this thread was not this one :-(

Nick Sabalausky (29/40) Mar 09 2010 I had to think about that for awhile. Frankly, I'm not sure how I feel a...

bearophile <bearophileHUGS lycos.com> writes:

First part:

I prefer a language that uses literals that are as much as possible clear,
readable and unambiguous. In D "null" can represent empty pointers, empty class
references, and empty dynamic arrays:


int[] foo(int[] a) {
    return null;
}
int[] foo2(int[] a) {
    return [];
}
void main() {
    foo(cast(int[])null);
    foo(null);
    foo([]);
    foo2(cast(int[])null);
    foo2(null);
    foo2([]);
}


But "null" is not a clear way to represent an empty array, that is a struct of
two items (this reminds me of the untidy usage of 0 to represent a null pointer
in C). So I propose to remove/deprecate the usage of "null" to represent an
empty dynamic array (the Delight D-derived language already allows only [] for
such purpose).


Note: [] is not fully unambiguous, it doesn't specify a type. Specifying a type
can be useful, because you can use it to give as argument to a template
function an array of the correct type:


import std.stdio: writeln;
void foo(T)(T[] a) {
    writeln(typeid(T));
}
void main() {
    // OK, int[]
    foo(cast(int[])null);

    // OK, int[]
    foo(cast(int[])[]);

    // Partially OK, inside foo 'a' is of type void[]
    foo([]);

    // Bad, Error: template test.foo(T) does not match any function template
declaration
    foo(null);
}


In such situations cast(int[])[] is enough. I think this idiom isn't common in
D code.

------------------------------------------

Second part (this is less nice than the first part):

In D null can also be used to represent an empty associative array. If you
think of AAs as instances of a class to be used by reference, then using a null
to represent an empty AA is meaningful (the main difference is that you don't
need "new" to allocate it at the beginning).

But I think currently associative arrays are not instances of a normal class,
so this is a fourth different overload of the meaning of "null". To remove some
semantic overload, the empty untyped associative array can be represented with
a different literal, for example like:

[:]

Where needed, for templates, you can use that literal with types too, as in the
dynamic array case:

cast(int[float])[:]

If in future associative arrays will become true instances of a templated
class, then it's better to restore the usage of 'null'.


If you think there is something good in what I have written, then later I am
willing to create a bug report from the first part (or both parts) of this post.

Bye,
bearophile

Mar 06 2010

Walter Bright <newshound1 digitalmars.com> writes:

retard wrote:
 The associative array is a sad type in D. It has nasty semantics. I'd 
 like to see it as a real library defined type with just syntactic sugar 
 for constructing literals. Too bad that's not what Walter thinks.

Actually, that's how D2 associative arrays are done.

Mar 07 2010

Don <nospam nospam.com> writes:

retard wrote:
 Sat, 06 Mar 2010 21:30:35 -0500, bearophile wrote:
 
 In D null can also be used to represent an empty associative array. If
 you think of AAs as instances of a class to be used by reference, then
 using a null to represent an empty AA is meaningful (the main difference
 is that you don't need "new" to allocate it at the beginning).

 
 The associative array is a sad type in D. It has nasty semantics. I'd 
 like to see it as a real library defined type with just syntactic sugar 
 for constructing literals. Too bad that's not what Walter thinks.

That IS what Walter thinks, actually. It's being moved to libraries 
(which is why there have been some nasty AA regressions in the past few 
compiler releases :-( ).

Mar 07 2010

Michael Rynn <michaelrynn optusnet.com.au> writes:

 That IS what Walter thinks, actually. It's being moved to libraries
 (which is why there have been some nasty AA regressions in the past few
 compiler releases :-( ).

Sounds good. So the  source is in druntime?
Does that mean I can find out how to implement opIn ?.

I found the file druntime/src/rt/aaA.d 

The code looks nice, and its got a function _aaIn, which "Determine if 
key is in aa".

I want to implement opIn for my own class, but I have not found an 
example.

This week I have made a RadixTree implementation, for immutable array 
keys based on char or ubyte only. Values no such restriction.

I measured its lookups can be ~20x faster than AA equivalent.
Insertions somewhat slower than AA, even with a node preallocation heap 
for speedup.

I am sure I read about a radix tree in the past, but have not seen an 
implementation in current libraries.  It would be nice to have one, if 
only to stop me or others wasting time doing another. If its simple 
enough for me to make, there should be one there already.

I took a pretty recursive Java implementation (because I was supposed to 
be studying java) and reworked it.

So although AA are "in the library", there is not yet an obvious facility 
to completely replace the AA implementation with another equivalent (or 
even same) using non- druntime code and the facilities of struct, 
operator overrides etc.  

Whats the magic hook that will connect D source AA type operator "in"  to 
a class or struct function of a Tree, AA or trie?  Is there any great 
advantage to the AA being struct rather than class based?

I might like to upload the radix tree implementation, to see how it can 
be improved.

Thanks,
Michael R.

Mar 07 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Michael Rynn wrote:

 
 That IS what Walter thinks, actually. It's being moved to libraries
 (which is why there have been some nasty AA regressions in the past few
 compiler releases :-( ).

 
 Sounds good. So the  source is in druntime?
 Does that mean I can find out how to implement opIn ?.
 
 I found the file druntime/src/rt/aaA.d
 
 The code looks nice, and its got a function _aaIn, which "Determine if
 key is in aa".
 
 I want to implement opIn for my own class, but I have not found an
 example.
 
 This week I have made a RadixTree implementation, for immutable array
 keys based on char or ubyte only. Values no such restriction.
 
 I measured its lookups can be ~20x faster than AA equivalent.
 Insertions somewhat slower than AA, even with a node preallocation heap
 for speedup.
 
 I am sure I read about a radix tree in the past, but have not seen an
 implementation in current libraries.  It would be nice to have one, if
 only to stop me or others wasting time doing another. If its simple
 enough for me to make, there should be one there already.
 
 I took a pretty recursive Java implementation (because I was supposed to
 be studying java) and reworked it.
 
 So although AA are "in the library", there is not yet an obvious facility
 to completely replace the AA implementation with another equivalent (or
 even same) using non- druntime code and the facilities of struct,
 operator overrides etc.
 
 Whats the magic hook that will connect D source AA type operator "in"  to
 a class or struct function of a Tree, AA or trie?  Is there any great
 advantage to the AA being struct rather than class based?
 
 I might like to upload the radix tree implementation, to see how it can
 be improved.
 
 Thanks,
 Michael R.

Perhaps this dsource project will interest you: 
http://www.dsource.org/projects/aa

Mar 07 2010

grauzone <none example.net> writes:

Lutger wrote:
 Michael Rynn wrote:
 
 That IS what Walter thinks, actually. It's being moved to libraries
 (which is why there have been some nasty AA regressions in the past few
 compiler releases :-( ).

 Sounds good. So the  source is in druntime?
 Does that mean I can find out how to implement opIn ?.

 I found the file druntime/src/rt/aaA.d

 The code looks nice, and its got a function _aaIn, which "Determine if
 key is in aa".

 I want to implement opIn for my own class, but I have not found an
 example.

 This week I have made a RadixTree implementation, for immutable array
 keys based on char or ubyte only. Values no such restriction.

 I measured its lookups can be ~20x faster than AA equivalent.
 Insertions somewhat slower than AA, even with a node preallocation heap
 for speedup.

 I am sure I read about a radix tree in the past, but have not seen an
 implementation in current libraries.  It would be nice to have one, if
 only to stop me or others wasting time doing another. If its simple
 enough for me to make, there should be one there already.

 I took a pretty recursive Java implementation (because I was supposed to
 be studying java) and reworked it.

 So although AA are "in the library", there is not yet an obvious facility
 to completely replace the AA implementation with another equivalent (or
 even same) using non- druntime code and the facilities of struct,
 operator overrides etc.

 Whats the magic hook that will connect D source AA type operator "in"  to
 a class or struct function of a Tree, AA or trie?  Is there any great
 advantage to the AA being struct rather than class based?


I suppose the compiler automatically maps AAs to this struct: 
http://dsource.org/projects/druntime/browser/trunk/import/object.di#L304

The struct simply calls the "old" runtime AA methods like in D1.

Maybe one can simply edit the object.di to plug in his own implementation.

 I might like to upload the radix tree implementation, to see how it can
 be improved.

 Thanks,
 Michael R.

 
 Perhaps this dsource project will interest you: 
 http://www.dsource.org/projects/aa

Shouldn't this measure memory usage as well?

Mar 08 2010

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright Wrote:

 retard wrote:
 The associative array is a sad type in D. It has nasty semantics. I'd 
 like to see it as a real library defined type with just syntactic sugar 
 for constructing literals. Too bad that's not what Walter thinks.

 
 Actually, that's how D2 associative arrays are done.

But the point of this thread was not this one :-(
I have added a little enhancement request...
http://d.puremagic.com/issues/show_bug.cgi?id=3889

bearophile

Mar 07 2010

"Nick Sabalausky" <a a.a> writes:

"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:hn0ure$1q0s$1 digitalmars.com...
 Walter Bright Wrote:

 retard wrote:
 The associative array is a sad type in D. It has nasty semantics. I'd
 like to see it as a real library defined type with just syntactic sugar
 for constructing literals. Too bad that's not what Walter thinks.

 Actually, that's how D2 associative arrays are done.

 But the point of this thread was not this one :-(
 I have added a little enhancement request...
 http://d.puremagic.com/issues/show_bug.cgi?id=3889

 bearophile

I had to think about that for awhile. Frankly, I'm not sure how I feel about 
it. My thoughts:

- Having zero elements and being null are two conceptually different things. 
Null should not be thought of as meaning "empty", it's more like 
"undefined/unknown/not-applicable". Like in a well-designed database schema: 
Empty string and null string are *deliberately* two different things. It's 
good to be able to distinguish between the two, and we should be able to do 
so.

- A D array is a struct. D structs are value types. Value types in D aren't 
nullable. So it seems weird that this value type can be null.

- The makeup of a D struct is a length and a pointer. Pointers can obviously 
be null. In fact, a D array can be unallocated, in which case the pointer 
part is null. So allowing null for arrays can be considered sugar for 
setting the pointer field to null or comparing the pointer field to null (I 
assume that's how it actually works currently).

- It can be useful in the general case for a value type to, optionally, be 
nullable (Isn't there a Nullable!() wrapper in Phobos for that?) In fact, 
it's useful to have "nullable vs non-nullable" be completely orthogonal to 
"value-type vs reference type". So maybe it would be better to just simply 
have a generalized system of "if you want variable X to be nullable, do this 
/ if you want variable X to be non-nullable, do that". Then, if you have 
reason for your array to be nullable, then you use the standard "nullable 
type" approach that's already applicable to any type, and if you have reason 
to keep it non-nullable, you use the stanard "non-nullable" approach that's 
already applicable to any type.

So, my conclusion on the proposal in the OP:
...Beats me.

Mar 09 2010

D Programming

C/C++ Programming

Other

digitalmars.D - Empty array literals