digitalmars.D - Re: range and algorithm-related stuff

bearophile <bearophileHUGS lycos.com> Jan 26 2009

bearophile <bearophileHUGS lycos.com> Jan 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

Probably array and assign will make the cut as well.<


assign() of my libs is an ugly hack, I'd like such feature to be built-in in
the language, with the comma syntax:

a, b = foo(...)
I use such syntax all the time in Python and it's quite handy.
Something like my Record/record can be quite useful as built-in with a
destructuring syntax too.


Other are already present in similar forms and sometimes with different names
(but I think that e.g. "chain" is better than my "span").<


As you may have seen xchain allows to chain two or more lazy or eager
iterables. If you nest xchains, they look inside what you give them, and
simplify, so if you write:
xchain(a, xchain(b, c))
In my libs that's exactly as writing:
xchain(a, b, c)
That improves performance.

In my libs there is also the Chainable!() mixing, that you can put into your
iterable classes (and all lazy generators contain it already), so you can
already to:

auto d = xrange(5) ~ xfilter([1,0,3]) ~ [3, 5];

In D2 I think such chaining syntax can be built-in. There is already the
relative class method to overload.


The problem with xslice is that it has a O(n) setup time on a input or forward
range, and that that cost is somewhat hidden in a non-decomposable manner.
People who need something like slice can call drop to advance the range in
documented linear time, and then use take.<


In the end what I was looking for is this syntax:

auto e = xfilter([1,0,3,5,7]);
// now  array(e) == [1, 3, 5, 7]
auto e2 = e[1..3];
// now  array(e2) == [3, 5]

That is, to allow the concat and slicing syntax among lazy (and eager)
iterables too.
Such stuff is present in Python and Haskell (but in Python the syntax is a bit
worse, you need xchain and islice functions to do that).

Also, take a look at my boolean() function, it's quite handy.

------------------

Oh, it's the name that threw me off. I automatically assumed that "frequency"
is normalized counts.<


You are right. I have an idea regarding how to fix that: I already have a
count() function that given an iterable (lazy or not) and an item, counts how
many of such items are present, and returns their count.

So I can remove the frequency() function and make count more flexible: when you
give one item (beside the iterable), it gives its count, when you give just the
iterable it returns the associative arrays of items:count.

An optimization: currently D associative arrays are very slow when they are
small (you can scan an array of length about 85+ items in the same time you can
access an associative array of 85 pairs), so at the beginning, when the number
of item-count pairs is low it uses internally just one array (or an
ArrayBuilder). And switches to using an associative array when the number of
keys is more than about 50. At the end such improved count()returns an an
associative array when you give it only an iterable (so creates the associative
array if not already present).

--------------

Regarding such topic, there are other things that will be very handy (I have
already explained such things in the past, but if you want more info ask):
- Improve a lot the == among arrays and among associative arrays and among
structs. This is a large hole in the language.
- foreach scan on associative arrays keys first, and values then. This will
simplify *LOT* of code.
- A very clean syntax to define lazy/eager arrays/iterables (like in Python) as
single expressions.
- A clean syntax to define lazy functions, better than the current opApply.
- More intelligent printing function.

At the moment D2 needs to become a little more handy regarding those things.
The most common idioms must be so simple that you don't need to look for them
into the manual each time. A purpose of a good language is to "force" you to
look in the manual only when you are doing something uncommon (for example for
performance).

Bye,
bearophile

Jan 26 2009

bearophile <bearophileHUGS lycos.com> writes:

I was forgetting, Clojure (the nice lisp variant that works on the JavaVM) also
has pmap(), a parallel map() function, that executes the functions in parallel,
on different CPU cores. Such stuff can become quite useful in D2 too:

http://clojure.org/api#toc571

Bye,
bearophile

Jan 26 2009

D Programming

C/C++ Programming

Other

digitalmars.D - Re: range and algorithm-related stuff