www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Huffman coding comparison

reply bearophile <bearophileHUGS lycos.com> writes:
Now and then it's very useful to compare how to write in D2 a small interesting
program written in another language, the comparison can help find problems,
possible improvements, performance problems, and so on.

Time ago I have tried to convert to D2 a small Python program that finds the
Huffman encoding of a given string, but Phobos2 was not mature enough, and I
didn't produce a good enough D2 program. I have tried again and this time
things are good enough.

The original Python code comes from the rosettacode site (that contains
comparisons of many different programs implemented in many different
languages), the point of the Python version is not performance, but to show
elegant & compact code (there are ways to write this code shorter in Python but
this code must has to be readable, it's not a code golf site).

This the Python 2.x version:


from heapq import heappush, heappop, heapify
from collections import defaultdict

def encode(symb2freq):
    """Huffman encode the given dict mapping symbols to weights"""
    heap = [[float(wt), [sym, ""]] for sym, wt in symb2freq.iteritems()]
    heapify(heap)
    while len(heap) > 1:
        lo = heappop(heap)
        hi = heappop(heap)
        for pair in lo[1:]:
            pair[1] = '0' + pair[1]
        for pair in hi[1:]:
            pair[1] = '1' + pair[1]
        heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return sorted(heappop(heap)[1:], key=lambda p: (len(p[1]), p))

txt = "this is an example for huffman encoding"
symb2freq = defaultdict(int)
for ch in txt:
    symb2freq[ch] += 1
huff = encode(symb2freq)
print "Symbol\tWeight\tHuffman Code"
for p in huff:
    print "%s\t%s\t%s" % (p[0], symb2freq[p[0]], p[1])


It contains a demo string, and it prints (tabs set to 4 spaces):

Symbol  Weight  Huffman Code
    6   101
n   4   010
a   3   1001
e   3   1100
f   3   1101
h   2   0001
i   3   1110
m   2   0010
o   2   0011
s   2   0111
g   1   00000
l   1   00001
p   1   01100
r   1   01101
t   1   10000
u   1   10001
x   1   11110
c   1   111110
d   1   111111


After few experiments I have produced this working D2 version that uses just
Phobos2:


import std.stdio, std.algorithm, std.typecons;

/// Huffman encode the given dict mapping symbols to weights
auto encode(TFreq, TSymb)(TFreq[TSymb] symb2freq) {
    alias Tuple!(TSymb, "symb", string, "code") Pair;
    alias Tuple!(TFreq, "freq", Pair[], "pairs") Block;
    Block[] blocks;
    foreach (symb, freq; symb2freq)
        blocks ~= Block(freq, [Pair(symb, "")]);
    auto heap = BinaryHeap!(Block[], "b < a")(blocks);

    while (heap.length > 1) {
        auto lo = heap.pop(1)[0];
        auto hi = heap.pop(1)[0];
        foreach (ref pair; lo.pairs)
            pair.code = '0' ~ pair.code;
        foreach (ref pair; hi.pairs)
            pair.code = '1' ~ pair.code;
        heap.push(Block(lo.freq + hi.freq, lo.pairs ~ hi.pairs));
    }
    auto r = heap.pop(1)[0].pairs;
    schwartzSort!((x){ return x.code.length; })(r);
    return r;
}

void main() {
    auto txt = "this is an example for huffman encoding";
    int[char] symb2freq;
    foreach (c; txt)
        symb2freq[c]++;
    auto huff = encode(symb2freq);
    writeln("Symbol\tWeight\tHuffman Code");
    foreach (p; huff)
        writefln("%s\t%s\t%s", p.symb, symb2freq[p.symb], p.code);
}


That prints:

Symbol  Weight  Huffman Code
n   4   010
    6   101
a   3   1001
e   3   1100
o   2   0011
i   3   1110
h   2   0001
f   3   1101
s   2   0111
m   2   0010
t   1   10000
g   1   00000
r   1   01101
p   1   01100
l   1   00001
u   1   10001
x   1   11110
d   1   111111
c   1   111110


Some notes (later I can send them to Andrei):

The D2 version is longer, but this size difference can be acceptable. The D2
version looks good enough (this is just a small test program. In a real D2
program I write code in a less compressed way, I add comments, preconditions,
unittests, and when it's necessary I try to write more efficient code).


It is not fully equivalent to the Python version, to print the same thing the
Schwartz sorting has to use  (x){return tuple(x.code.length, x);}.


For me for one thing the D program is better than the Python version: the
Python std lib doesn't define a mutable named tuple like that one in D2 (there
is only one immutable and dynamically typed one), this forces the Python code
to use all those [0] and [1:]. Anonymous structures sometimes make Python code
shorter, but they can be less explicit too. Of course it's easy to define in
Python a function like Tuple of Phobos2 that accepts optional field names too.


The "std.typecons" module, that contains Tuple and tuple can be named something
simpler to remember as "std.tuples".


I will keep missing array comprehensions in D. In the meantime other languages
have got some forms of them (but Python ones use the best syntax still).


In translating the program I have not found significant problems. The only bug
I have found was caused by the different ordering of the Python/Phobos2 heaps,
so I have had to use "b<a" in D2. This is not a bug, but I have to keep it in
mind in future usages of heaps across the two languages. I don't know why the
two languages have chosen different orderings here, if someone of you knows I'd
like to know it.


Another small problem I have found was caused by BinaryHeap.pop(). I'd like it
to pop and return a single item, instead of an array of given length of items,
because this is done quite more often than popping many items. If the need to
pop many items is important, then a different method can be defined, as npop().


Maybe BinaryHeap can be moved to the collections module.


array(map(...)) is so common that an amap(...) can be considered.


A helper function like this one can be useful:
auto binaryHeap(alias less="a < b", Range)(Range items) {
    return BinaryHeap!(Range, less)(items);
}
Note that the 'less' is before the Range, so in that program you can write:
auto heap = binaryHeap!("b < a")(s2w);
Instead of:
auto heap = BinaryHeap!(Block[], "b < a")(s2w);
And in many cases you just write:
auto heap = binaryHeap(items);


I have seen that this doesn't work, but something like this can be useful (but
I am not sure, I don't love strings used this way):
schwartzSort!("a.code.length")(r);


In Python there is both list.sort() and sorted(), the second creates a new
list. In my dlibs1 I have similar functions. They allow you to write:
return schwartzSorted!((x){ return x.code.length; })(...items);
Instead of:
auto r = ...items;
schwartzSort!((x){ return x.code.length; })(items);
return items;
So I'd like sorted and schwartzSorted in Phobos2.


While debugging this program I have found that array(iota(0, 10)) is an uint[]
instead of being an int[]. In most cases I prefer that syntax to produce an
array of true ints. In the uncommon situations when I want an array of uints I
can ask it with a syntax like array(iota(0U, 10U)). If this specification is
not possible then I prefer iota to always yield signed values, because they are
much *safer* in D.


I'd like iota(100) to be the same as iota(0, 100), as in the Python range().


To improve a *lot* my debugging in code like this I'd like this line of code:
  writeln(tuple(['x': 1.0, 'y': 2.0], "hello", [[1, 2], [3, 4]]));
To print (better):
  tuple(['x': 1.0, 'y': 2.0], "hello", [[1, 2], [3, 4]])
Or even:
  Tuple!(double[char], string, int[][])(['x': 1.0, 'y': 2.0], "hello", [[1, 2],
[3, 4]])
instead of:
  Tuple!(double[char],string,int[][])([x:1, y:2], hello, 1 2 3 4)

Note that the double 1.0 is printed on default with a leading zero to denote
it's a floating point, the string if inside a collection is printed with "" on
its sides, the char inside '' (just like the way you have to write them in the
D code), the arrays with [] and commas, and associative arrays have a bit more
spaces to help readability.

From various D2 programs I have seen that I usually don't need the full type of
the tuple in the printout, they clutter the printing too much. For example if
you try to print the 'heap' variable inside the encode() function just before
the return you get the following text, this is too much noisy and unreadable
(notice a funny detail, strings are correctly printed inside "" if they are
tuple field names!):

BinaryHeap!(Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")[],"b
< a")(1, Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(39,
Tuple!(char,"c",string,"code")(g, 00000) Tuple!(char,"c",string,"code")(l,
00001) Tuple!(char,"c",string,"code")(h, 0001)
Tuple!(char,"c",string,"code")(m, 0010) Tuple!(char,"c",string,"code")(o, 0011)
Tuple!(char,"c",string,"code")(n, 010) Tuple!(char,"c",string,"code")(p, 01100)
Tuple!(char,"c",string,"code")(r, 01101) Tuple!(char,"c",string,"code")(s,
0111) Tuple!(char,"c",string,"code")(t, 10000)
Tuple!(char,"c",string,"code")(u, 10001) Tuple!(char,"c",string,"code")(a,
1001) Tuple!(char,"c",string,"code")( , 101) Tuple!(char,"c",string,"code")(e,
1100) Tuple!(char,"c",string,"code")(f, 1101) Tuple!(char,"c",string,"code")(i,
1110) Tuple!(char,"c",string,"code")(x, 11110)
Tuple!(char,"c",string,"code")(c, 111110) Tuple!(char,"c",string,"code")(d,
111111)) Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(16,
Tuple!(char,"c",string,"code")(g, 00000) Tuple!(char,"c",string,"code")(l,
00001) Tuple!(char,"c",string,"code")(h, 0001)
Tuple!(char,"c",string,"code")(m, 0010) Tuple!(char,"c",string,"code")(o, 0011)
Tuple!(char,"c",string,"code")(n, 010) Tuple!(char,"c",string,"code")(p, 01100)
Tuple!(char,"c",string,"code")(r, 01101) Tuple!(char,"c",string,"code")(s,
0111)) Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(11,
Tuple!(char,"c",string,"code")(t, 0000) Tuple!(char,"c",string,"code")(u, 0001)
Tuple!(char,"c",string,"code")(a, 001) Tuple!(char,"c",string,"code")( , 01))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(8,
Tuple!(char,"c",string,"code")(g, 0000) Tuple!(char,"c",string,"code")(l, 0001)
Tuple!(char,"c",string,"code")(h, 001) Tuple!(char,"c",string,"code")(m, 010)
Tuple!(char,"c",string,"code")(o, 011))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(6,
Tuple!(char,"c",string,"code")(e, 00) Tuple!(char,"c",string,"code")(f, 01))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(5,
Tuple!(char,"c",string,"code")(t, 000) Tuple!(char,"c",string,"code")(u, 001)
Tuple!(char,"c",string,"code")(a, 01))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(4,
Tuple!(char,"c",string,"code")(n, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(4,
Tuple!(char,"c",string,"code")(g, 000) Tuple!(char,"c",string,"code")(l, 001)
Tuple!(char,"c",string,"code")(h, 01))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(3,
Tuple!(char,"c",string,"code")(i, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(3,
Tuple!(char,"c",string,"code")(e, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(2,
Tuple!(char,"c",string,"code")(t, 00) Tuple!(char,"c",string,"code")(u, 01))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(2,
Tuple!(char,"c",string,"code")(p, 00) Tuple!(char,"c",string,"code")(r, 01))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(2,
Tuple!(char,"c",string,"code")(m, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(2,
Tuple!(char,"c",string,"code")(g, 00) Tuple!(char,"c",string,"code")(l, 01))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(1,
Tuple!(char,"c",string,"code")(x, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(1,
Tuple!(char,"c",string,"code")(t, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(1,
Tuple!(char,"c",string,"code")(p, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(1,
Tuple!(char,"c",string,"code")(g, 0))
Tuple!(double,"freq",Tuple!(char,"c",string,"code")[],"pairs")(1,
Tuple!(char,"c",string,"code")(c, 0)))


This is what Python prints in the same situation, it's more readable for my
eyes (but this is not a fair comparison, in D the structures are Tuples with
named fields, and so on):

[[39.0, ['g', '00000'], ['l', '00001'], ['h', '0001'], ['m', '0010'], ['o',
'0011'], ['n', '010'], ['p', '01100'], ['r', '01101'], ['s', '0111'], ['t',
'10000'], ['u', '10001'], ['a', '1001'], [' ', '101'], ['e', '1100'], ['f',
'1101'], ['i', '1110'], ['x', '11110'], ['c', '111110'], ['d', '111111']]]



The purpose of this program was to be short and "elegant". To test the
performance I have replaced in both programs the input text with something that
contains one hundred thousand different symbols.

Python:
txt = range(10000) + range(10000) + range(100000)

D2:
auto txt = array(iota(0, 10000)) ~ array(iota(0, 10000)) ~ array(iota(0,
100000));

The timings, seconds:
  Python:       2.98
  Python+Psyco: 2.84
  D2:           1.97

In this benchmark most of the running time is spent inside the encode(), the
creation of symb2freq takes 0.05-0.16 seconds.

Bye,
bearophile
May 28 2010
next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
bearophile, el 28 de mayo a las 18:01 me escribiste:
 For me for one thing the D program is better than the Python version:
 the Python std lib doesn't define a mutable named tuple like that one in
 D2 (there is only one immutable and dynamically typed one), this forces
 the Python code to use all those [0] and [1:]. Anonymous structures
 sometimes make Python code shorter, but they can be less explicit too.
 Of course it's easy to define in Python a function like Tuple of Phobos2
 that accepts optional field names too.
Just for fun :) def NamedTuple(*fields): class T: def __init__(self, *args): for i in range(len(args)): setattr(self, fields[i], args[i]) return T Pair = NamedTuple("symb", "code") Block = NamedTuple("freq", "pair") p = Pair(1, 2) b = Block(3, p) print p.symb, p.code, b.freq, b.pair.symb, b.pair.code
 I will keep missing array comprehensions in D. In the meantime other
 languages have got some forms of them (but Python ones use the best
 syntax still).
They are wonderful, and it's *very* hard to get used to live in a language that doesn't support them once you start using it often. PS: Sorry about the Python-love mail... =P -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- "Lidiar" no es lo mismo que "holguear"; ya que "lidiar" es relativo a "lidia" y "holguear" es relativo a "olga". -- Ricardo Vaporeso
May 28 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Leandro Lucarella:

 def NamedTuple(*fields):
 	class T:
 		def __init__(self, *args):
 			for i in range(len(args)):
 				setattr(self, fields[i], args[i])
 	return T
That's not enough, those tuples must support the cmp for the heap. This barely works, but it's not good code yet: def NamedTuple(*fields): class Foo: def __init__(self, *args): for field, arg in zip(fields, args): setattr(self, field, arg) def __cmp__(self, other): return cmp([getattr(self, f) for f in fields], [getattr(other, f) for f in fields]) return Foo Bye, bearophile
May 28 2010
prev sibling next sibling parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
bearophile <bearophileHUGS lycos.com> wrote:

 I will keep missing array comprehensions in D. In the meantime other  
 languages have got some forms of them (but Python ones use the best  
 syntax still).
I'm now so tired of hearing this, I started writing something that does it: auto r = list!"2 * a" | [1,2,3,4,5] & where!"a & 1" ); Takes any kind of range, and list and where are basically map and filter in disguise. Only thing I did was add operator overloading. Now, granted, I have little to no experience with list comprehension, and as such this might not be what is needed. Still, it was fun to play around with, and I kinda like what it turned out as. import std.algorithm; import std.range; import std.array; struct listImpl( alias pred ) { auto opBinary( string op : "|", Range )( Range other ) { return map!pred(other); } } struct whereImpl( alias pred ) { auto opBinaryRight( string op : "&", R )( R range ) { return filter!pred(range); } } property whereImpl!pred where( alias pred )( ) { whereImpl!pred result; return result; } property listImpl!pred list( alias pred )( ) { listImpl!pred result; return result; } unittest { assert( equal( list!"2 * a" | [1,2,3,4,5] & where!"a & 1", [2,6,10] ) ); assert( equal( list!"2 * a" | [1,2,3], [2,4,6] ) ); } -- Simen
May 28 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Simen kjaeraas:

 I'm now so tired of hearing this, I started writing something that
 does it:
Array comprehensions are syntax sugar. From what I have seen there is a very narrow range of usable syntaxes for them. Outside that range, they get much less useful or useless. Even Haskell has got them sub-quality. At first sight they seem just able to allow you write one line of code instead of three or four, so no big deal. In practice they allow the programmer to turn those three lines of code into a single chunk. So the programmer can grasp and think about more code at the same time. This improves programming. The word 'chunk' comes from mind sciences, it's the thing referred to in the famous "The Magical Number Seven, Plus or Minus Two", those are numbers of chunks. http://en.wikipedia.org/wiki/Chunking_%28psychology%29 Eventually Walter will understand that. Walter is sometimes slow in improving his understanding of things, but he never stops moving forward: I think his meta-learning capabilities are not high, but they are present and they are higher than most of other people you find around (that often have near zero meta-learning capabilities) :-) Bye, bearophile
May 28 2010
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sat, May 29, 2010 at 03:27, bearophile <bearophileHUGS lycos.com> wrote:

Now and then it's very useful to compare how to write in D2 a small
interesting program
written in another language, the comparison can help find problems,
possible improvements,
performance problems, and so on.

Time ago I have tried to convert to D2 a small Python program that finds
the Huffman encoding
 of a given string, but Phobos2 was not mature enough, and I didn't produce
a good enough D2 program.
I have tried again and this time things are good enough.
(snip) Hey, cool. And you made it generic to boot. I'd vote that to be put in std.algorithm. Simen kjaeraas:
 I'm now so tired of hearing this, I started writing something that
 does it:
Now, granted, I have little to no experience with list comprehension, and as such this might not be what is needed. Still, it was fun to play around with, and I kinda like what it turned out as.
That's fun to see, the syntax is cool. I never thought of using operator overloading for this. I did a range comprehension also, a few months ago. It's called comp in http://svn.dsource.org/projects/dranges/trunk/dranges/docs/algorithm2.html Like yours, it takes any range and is lazy, the main difference it that it's multirange: you can give it any number of range as input, it will create the combinations of elements (aka product of ranges), filter it with the predicate and map the generative function on it. Syntax is comp!(mappingFunction, predicate)(ranges...) The classical example for Haskell list comprehension is generating the pythagorean triplets: pyth n = [ ( a, b, c ) | a <- [1..n], -- Pythagorean Triples b <- [1..n], c <- [1..n], a + b + c <= n, a^2 + b^2 == c^2 ] In my case, it would be: auto pyth(int n) { return comp ! ("tuple(a,b,c)", "a*a+b*b==c*c && a<b") (iota(1,n),iota(1,n), iota(1,n)); } pyth(11) -> (3,4,5), (6,8,10) Philippe
May 29 2010
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Philippe Sigaud <philippe.sigaud gmail.com> wrote:

 That's fun to see, the syntax is cool. I never thought of using operat=
or
 overloading for this.
I just had a look at Wikipedia's article on it, found that the syntax wa= s somewhat approachable in D's operator overloading, and thought 'hey, that's neat'. Just wish I could use curly brackets around it, but parentheses work: { 2=C2=B7x | x =E2=88=88 N, x=C2=B2 > 3 } =3D> ( list!"2*a" | iota(1,int.max) * where!"x^^2 > 3" )
 I did a range comprehension also, a few months ago. It's called comp i=
n
 http://svn.dsource.org/projects/dranges/trunk/dranges/docs/algorithm2.=
html
 Like yours, it takes any range and is lazy, the main difference it tha=
t =
 it's
 multirange: you can give it any number of range as input, it will crea=
te =
 the
 combinations of elements (aka product of ranges), filter it with the
 predicate and map the generative function on it.
[snip]
 auto pyth(int n)
 {
     return comp ! ("tuple(a,b,c)", "a*a+b*b=3D=3Dc*c && a<b")
 (iota(1,n),iota(1,n), iota(1,n));
 }
This reminded me that Phobos lacks a combinatorial range, taking two or more (forward) ranges and giving all combinations thereof: combine([0,1],[2,3]) =3D> (0,2), (0,3), (1,2), (1,3) Work has commenced on implementing just that. -- = Simen
May 30 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/30/2010 10:27 AM, Simen kjaeraas wrote:
 Philippe Sigaud <philippe.sigaud gmail.com> wrote:

 That's fun to see, the syntax is cool. I never thought of using operator
 overloading for this.
I just had a look at Wikipedia's article on it, found that the syntax was somewhat approachable in D's operator overloading, and thought 'hey, that's neat'. Just wish I could use curly brackets around it, but parentheses work: { 2·x | x ∈ N, x² > 3 } => ( list!"2*a" | iota(1,int.max) * where!"x^^2 > 3" )
 I did a range comprehension also, a few months ago. It's called comp in
 http://svn.dsource.org/projects/dranges/trunk/dranges/docs/algorithm2.html


 Like yours, it takes any range and is lazy, the main difference it
 that it's
 multirange: you can give it any number of range as input, it will
 create the
 combinations of elements (aka product of ranges), filter it with the
 predicate and map the generative function on it.
[snip]
 auto pyth(int n)
 {
 return comp ! ("tuple(a,b,c)", "a*a+b*b==c*c && a<b")
 (iota(1,n),iota(1,n), iota(1,n));
 }
This reminded me that Phobos lacks a combinatorial range, taking two or more (forward) ranges and giving all combinations thereof: combine([0,1],[2,3]) => (0,2), (0,3), (1,2), (1,3) Work has commenced on implementing just that.
Yah, that would be useful. If Philippe agrees to adapt his work, maybe that would be the fastest route. And don't forget - the gates of Phobos are open. Andrei
May 30 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
And don't forget - the gates of Phobos are open.<
My original post contains numerous notes and annotations, I was about to send it to you through email. You can think about reading it and thinking abut the things I have written regarding Phobos2. Bye, bearophile
May 30 2010
prev sibling next sibling parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 This reminded me that Phobos lacks a combinatorial range, taking two or
 more (forward) ranges and giving all combinations thereof:

 combine([0,1],[2,3])
 =>
 (0,2), (0,3), (1,2), (1,3)

 Work has commenced on implementing just that.
Yah, that would be useful. If Philippe agrees to adapt his work, maybe that would be the fastest route. And don't forget - the gates of Phobos are open.
Too late for that, as I've already written this. :p Current problems: back and popBack not implemented. I'm not sure they even should be. Doing so would be a tremendous pain the region below the spine. May very well be there are other problems, I don't know. If anyone finds any, please let me know. -- Simen
May 30 2010
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, May 30, 2010 at 19:01, Simen kjaeraas <simen.kjaras gmail.com>wrote:

 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 This reminded me that Phobos lacks a combinatorial range, taking two or
 more (forward) ranges and giving all combinations thereof:

 combine([0,1],[2,3])
 =>
 (0,2), (0,3), (1,2), (1,3)

 Work has commenced on implementing just that.
Yah, that would be useful. If Philippe agrees to adapt his work, maybe that would be the fastest route. And don't forget - the gates of Phobos are open.
Too late for that, as I've already written this. :p
Hey, cool, lots of static foreach. Man, it's like I'm reading my own code. I'm happy to see convergent evolution: this must be the D way to do this.
 Current problems: back and popBack not implemented. I'm not sure they
 even should be. Doing so would be a tremendous pain the region below the
 spine.
Ow. I *guess* it's doable, but I for sure don't want to do it.
 May very well be there are other problems, I don't know. If anyone finds
 any, please let me know.
Well, I like the way you dealt with popFront. You length is more costly than mine, but I cheated: I count the numbers of popFronts and substract them from the original length. Your empty use .length in the foreach loop. You should use .empty instead, I think. And with an infinite range in the mix, the resulting product is infinte also, so you can define your combine to be infinite. Then I can give you something to munch over :-) When one range is infinite, the resulting combination is infinite. What's sad is that you'll never 'traverse' the infinite range: auto c = combine([0,1,2], cycle([2,3,4])); -> (0,2),(0,3),(0,4),(0,2),(0,3), ... You never reach the (1,x) (2,x) parts, as it should be seeing how we both defined combinations: iterating on the ranges as if they were digits in a number. But there is another way to iterate: diagonally, by cutting successive diagonal slices: c is: (0,2),(0,3),(0,4),(0,2),... (1,2),(1,3),(1,4),(1,2),... (2,2),(2,3),(2,4),(2,2),... -> (0,2),(0,3)(1,2),(0,4),(1,3),(2,2) ... It's even better if all ranges are infinite. I never coded this, but it seems doable for two ranges. Lot thougher for any number of ranges... Pretty obscure, maybe? btw, I think we should divert this discussion in another thread if you wish to continue: this is bearophile's Huffman thread, after all. Philippe
May 30 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/30/2010 04:43 PM, Philippe Sigaud wrote:
 On Sun, May 30, 2010 at 19:01, Simen kjaeraas <simen.kjaras gmail.com
 <mailto:simen.kjaras gmail.com>> wrote:

     Andrei Alexandrescu <SeeWebsiteForEmail erdani.org
     <mailto:SeeWebsiteForEmail erdani.org>> wrote:

             This reminded me that Phobos lacks a combinatorial range,
             taking two or
             more (forward) ranges and giving all combinations thereof:

             combine([0,1],[2,3])
             =>
             (0,2), (0,3), (1,2), (1,3)

             Work has commenced on implementing just that.


         Yah, that would be useful. If Philippe agrees to adapt his work,
         maybe that would be the fastest route. And don't forget - the
         gates of Phobos are open.


     Too late for that, as I've already written this. :p


 Hey, cool, lots of static foreach. Man, it's like I'm reading my own
 code. I'm happy to see convergent evolution: this must be the D way to
 do this.

     Current problems: back and popBack not implemented. I'm not sure they
     even should be. Doing so would be a tremendous pain the region below the
     spine.


 Ow.
 I *guess* it's doable, but I for sure don't want to do it.

     May very well be there are other problems, I don't know. If anyone finds
     any, please let me know.


 Well, I like the way you dealt with popFront.  You length is more costly
 than mine, but I cheated: I count the numbers of popFronts and substract
 them from the original length.

 Your empty use .length in the foreach loop. You should use .empty
 instead, I think. And with an infinite range in the mix, the resulting
 product is infinte also, so you can define your combine to be infinite.

 Then I can give you something to munch over :-)

 When one range is infinite, the resulting combination is infinite.
 What's sad is that you'll never 'traverse' the infinite range:
 auto c = combine([0,1,2], cycle([2,3,4]));
 ->
 (0,2),(0,3),(0,4),(0,2),(0,3), ...

 You never reach the (1,x) (2,x) parts, as it should be seeing how we
 both defined combinations: iterating on the ranges as if they were
 digits in a number.

 But there is another way to iterate: diagonally, by cutting successive
 diagonal slices:

 c is:
 (0,2),(0,3),(0,4),(0,2),...
 (1,2),(1,3),(1,4),(1,2),...
 (2,2),(2,3),(2,4),(2,2),...
 ->

 (0,2),(0,3)(1,2),(0,4),(1,3),(2,2) ...

 It's even better if all ranges are infinite.
 I never coded this, but it seems doable for two ranges. Lot thougher for
 any number of ranges... Pretty obscure, maybe?

 btw, I think we should divert this discussion in another thread if you
 wish to continue: this is bearophile's Huffman thread, after all.
Yah, there's no argument that infinite ranges must be allowed by a n-way cross-product. It reminds one of Cantor's diagonalization, just in several dimensions. Shouldn't be very difficult, but it only works if all ranges except one are forward ranges (one can be an input range). Andrei
May 30 2010
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Yah, there's no argument that infinite ranges must be allowed by a n-way  
 cross-product. It reminds one of Cantor's diagonalization, just in  
 several dimensions. Shouldn't be very difficult, but it only works if  
 all ranges except one are forward ranges (one can be an input range).
Might I coerce you into indulging some more detail on this idea? I'm afraid my knowledge of the diagonal method is sadly lacking, and some reading on the subject did not give me satisfactory understanding of its application in the discussed problem. Way I thought of doing it is save the highest position this far of each range, then in popFront see if we're past it. If we are, reset this range, and pop from the next range up, recursively. -- Simen
May 31 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/31/2010 08:07 PM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Yah, there's no argument that infinite ranges must be allowed by a
 n-way cross-product. It reminds one of Cantor's diagonalization, just
 in several dimensions. Shouldn't be very difficult, but it only works
 if all ranges except one are forward ranges (one can be an input range).
Might I coerce you into indulging some more detail on this idea? I'm afraid my knowledge of the diagonal method is sadly lacking, and some reading on the subject did not give me satisfactory understanding of its application in the discussed problem. Way I thought of doing it is save the highest position this far of each range, then in popFront see if we're past it. If we are, reset this range, and pop from the next range up, recursively.
I thought about this some more, and it's more difficult and more interesting than I thought. Cantor enumerated rational numbers the following way: first come all fractions that have numerator + denominator = 1. That's only one rational number, 1/1. Then come all fractions that have num + denom = 2. That gives 1/2 and 2/1. Then come all fractions that have num + denom = 3, and so on. Using this enumeration method he proved that rational numbers are countable so in some way they are not more "numerous" than natural numbers, which was an important result. With ranges, the trick should be similar: although individual ranges may be infinite, they are composed in a simple, countable manner. Generalizing Cantor's enumeration technique to n ranges, note that the enumeration goes through elements such that the sum of their offsets from the beginning of the ranges is constant. So for two ranges, we first select pairs that have their offsets sum to 0. That is (0, 0). Then we select pairs of offsets summing to 1: (0, 1) and (1, 0). Then same for 2: (0, 2), (1, 1), (2, 0). And so on. The problem is that in order to make sure offsets are constant, forward ranges are not enough. If all ranges involved had random access, the problem would be trivially solvable. The trick is pushing that envelope; for example, it's possible to make things work with one forward range and one random-access range, and so on. This is bound to be an interesting problem. Please discuss any potential solutions here. Andrei
May 31 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/31/2010 08:54 PM, Andrei Alexandrescu wrote:
 On 05/31/2010 08:07 PM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Yah, there's no argument that infinite ranges must be allowed by a
 n-way cross-product. It reminds one of Cantor's diagonalization, just
 in several dimensions. Shouldn't be very difficult, but it only works
 if all ranges except one are forward ranges (one can be an input range).
Might I coerce you into indulging some more detail on this idea? I'm afraid my knowledge of the diagonal method is sadly lacking, and some reading on the subject did not give me satisfactory understanding of its application in the discussed problem. Way I thought of doing it is save the highest position this far of each range, then in popFront see if we're past it. If we are, reset this range, and pop from the next range up, recursively.
I thought about this some more, and it's more difficult and more interesting than I thought. Cantor enumerated rational numbers the following way: first come all fractions that have numerator + denominator = 1. That's only one rational number, 1/1. Then come all fractions that have num + denom = 2. That gives 1/2 and 2/1. Then come all fractions that have num + denom = 3, and so on.
I'm off by one there, but you got the idea :o). Andrei
May 31 2010
prev sibling parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 I thought about this some more, and it's more difficult and more  
 interesting than I thought.

 Cantor enumerated rational numbers the following way: first come all  
 fractions that have numerator + denominator = 1. That's only one  
 rational number, 1/1. Then come all fractions that have num + denom = 2.  
 That gives 1/2 and 2/1. Then come all fractions that have num + denom =  
 3, and so on.

 Using this enumeration method he proved that rational numbers are  
 countable so in some way they are not more "numerous" than natural  
 numbers, which was an important result.

 With ranges, the trick should be similar: although individual ranges may  
 be infinite, they are composed in a simple, countable manner.

 Generalizing Cantor's enumeration technique to n ranges, note that the  
 enumeration goes through elements such that the sum of their offsets  
 from the beginning of the ranges is constant.

 So for two ranges, we first select pairs that have their offsets sum to  
 0. That is (0, 0). Then we select pairs of offsets summing to 1: (0, 1)  
 and (1, 0). Then same for 2: (0, 2), (1, 1), (2, 0). And so on.

 The problem is that in order to make sure offsets are constant, forward  
 ranges are not enough. If all ranges involved had random access, the  
 problem would be trivially solvable. The trick is pushing that envelope;  
 for example, it's possible to make things work with one forward range  
 and one random-access range, and so on.

 This is bound to be an interesting problem. Please discuss any potential  
 solutions here.
All forward ranges should be doable, too. One input range also, as long as no other range is infinite. In the case of all forward ranges, elements of each should be visited in order, though. I believe this to be possible, but not at all easy. I had to abandon my original idea, as it proved fundamentally flawed. I believe that for the case of 0 or 1 infinite ranges, the approach used in the already supplied version is good enough. It also works if one range is an (optionally infinite) input range. Well, except that it does not without some rewrites... For more than one infinite range, Cantor's diagonalization approach should work, but only if all ranges or all except one are random-access. -- Simen
May 31 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/31/2010 10:51 PM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 I thought about this some more, and it's more difficult and more
 interesting than I thought.

 Cantor enumerated rational numbers the following way: first come all
 fractions that have numerator + denominator = 1. That's only one
 rational number, 1/1. Then come all fractions that have num + denom =
 2. That gives 1/2 and 2/1. Then come all fractions that have num +
 denom = 3, and so on.

 Using this enumeration method he proved that rational numbers are
 countable so in some way they are not more "numerous" than natural
 numbers, which was an important result.

 With ranges, the trick should be similar: although individual ranges
 may be infinite, they are composed in a simple, countable manner.

 Generalizing Cantor's enumeration technique to n ranges, note that the
 enumeration goes through elements such that the sum of their offsets
 from the beginning of the ranges is constant.

 So for two ranges, we first select pairs that have their offsets sum
 to 0. That is (0, 0). Then we select pairs of offsets summing to 1:
 (0, 1) and (1, 0). Then same for 2: (0, 2), (1, 1), (2, 0). And so on.

 The problem is that in order to make sure offsets are constant,
 forward ranges are not enough. If all ranges involved had random
 access, the problem would be trivially solvable. The trick is pushing
 that envelope; for example, it's possible to make things work with one
 forward range and one random-access range, and so on.

 This is bound to be an interesting problem. Please discuss any
 potential solutions here.
All forward ranges should be doable, too.
How would the spanning strategy work for two forward infinite ranges? Andrei
Jun 01 2010
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 All forward ranges should be doable, too.
How would the spanning strategy work for two forward infinite ranges?
In a complex, horrible way? 0. Save initial states (position = 0) 1. Pop first until past the position of the other 2. Restore other 3. Pop other until past the position of the first 4. Restore first 6. Goto 1. Screw that, as you cannot save positions. Well, I guess a long should be enough for most, but it might not be. As long as there is no way to compare positions, 'tis unpossible. If we accept saving the position (the number of pops), this approach should be applicable to any number of ranges. -- Simen
Jun 01 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/01/2010 09:23 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 All forward ranges should be doable, too.
How would the spanning strategy work for two forward infinite ranges?
In a complex, horrible way? 0. Save initial states (position = 0) 1. Pop first until past the position of the other 2. Restore other 3. Pop other until past the position of the first 4. Restore first 6. Goto 1. Screw that, as you cannot save positions. Well, I guess a long should be enough for most, but it might not be. As long as there is no way to compare positions, 'tis unpossible. If we accept saving the position (the number of pops), this approach should be applicable to any number of ranges.
It's ok to store positions in ulong objects. Most uses of infinite range combinations use only the first few elements; the computation will anyway take a very, very long time when any of the ranges overflows a ulong. That being said, I didn't understand the algorithm. What is its complexity? From what I understand there's a lot of looping going on. We need something that makes progress in O(1). Andrei
Jun 01 2010
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 06/01/2010 09:23 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 All forward ranges should be doable, too.
How would the spanning strategy work for two forward infinite ranges?
In a complex, horrible way? 0. Save initial states (position = 0) 1. Pop first until past the position of the other 2. Restore other 3. Pop other until past the position of the first 4. Restore first 6. Goto 1. Screw that, as you cannot save positions. Well, I guess a long should be enough for most, but it might not be. As long as there is no way to compare positions, 'tis unpossible. If we accept saving the position (the number of pops), this approach should be applicable to any number of ranges.
It's ok to store positions in ulong objects. Most uses of infinite range combinations use only the first few elements; the computation will anyway take a very, very long time when any of the ranges overflows a ulong.
With the algorithm (attempted) explained above, it's actually 2^128, which is acceptably high, I think.
 That being said, I didn't understand the algorithm. What is its  
 complexity? From what I understand there's a lot of looping going on. We  
 need something that makes progress in O(1).
Should be O(1). Each pop in the described algorithm is a call to popFront: void popFront( ) { ranges[active].popFront; position[active]++; if (position[active] > position[inactive]) { active = !active; position[active] = 0; ranges[active] = savedRange[active]; } } -- Simen
Jun 01 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/01/2010 11:34 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 06/01/2010 09:23 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 All forward ranges should be doable, too.
How would the spanning strategy work for two forward infinite ranges?
In a complex, horrible way? 0. Save initial states (position = 0) 1. Pop first until past the position of the other 2. Restore other 3. Pop other until past the position of the first 4. Restore first 6. Goto 1. Screw that, as you cannot save positions. Well, I guess a long should be enough for most, but it might not be. As long as there is no way to compare positions, 'tis unpossible. If we accept saving the position (the number of pops), this approach should be applicable to any number of ranges.
It's ok to store positions in ulong objects. Most uses of infinite range combinations use only the first few elements; the computation will anyway take a very, very long time when any of the ranges overflows a ulong.
With the algorithm (attempted) explained above, it's actually 2^128, which is acceptably high, I think.
 That being said, I didn't understand the algorithm. What is its
 complexity? From what I understand there's a lot of looping going on.
 We need something that makes progress in O(1).
Should be O(1). Each pop in the described algorithm is a call to popFront: void popFront( ) { ranges[active].popFront; position[active]++; if (position[active] > position[inactive]) { active = !active; position[active] = 0; ranges[active] = savedRange[active]; } }
I still haven't understood your explanation, but I think I figured a different way to reach the same solution. The idea is to crawl the space by adding layers on top of a hypercube of increasing side, starting from the origin corner. This is an absolutely awesome problem. I attach an implementation for two ranges. It's quite a bit more messy than I'd hope, so generalization to n ranges should be preceded by cleaning up the abstractions used. I think your solution already has said cleanups! The output of the program is: 0 Tuple!(uint,uint)(0, 0) 1 Tuple!(uint,uint)(0, 1) 2 Tuple!(uint,uint)(1, 1) 3 Tuple!(uint,uint)(1, 0) 4 Tuple!(uint,uint)(0, 2) 5 Tuple!(uint,uint)(1, 2) 6 Tuple!(uint,uint)(2, 2) 7 Tuple!(uint,uint)(2, 0) 8 Tuple!(uint,uint)(2, 1) 9 Tuple!(uint,uint)(0, 3) 10 Tuple!(uint,uint)(1, 3) 11 Tuple!(uint,uint)(2, 3) 12 Tuple!(uint,uint)(0, 4) 13 Tuple!(uint,uint)(1, 4) 14 Tuple!(uint,uint)(2, 4) Andrei
Jun 01 2010
next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Jun 1, 2010 at 19:54, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 I thought about this some more, and it's more difficult and more
interesting than I thought. Yeah, that's why I proposed it as a challenge. You guys sure seem to like this :) What's interesting, is that it may even be useful :) It's ok to store positions in ulong objects. Most uses of infinite
 range combinations use only the first few elements; the computation
 will anyway take a very, very long time when any of the ranges
 overflows a ulong.
Quite...
 I still haven't understood your explanation, but I think I figured a
 different way to reach the same solution. The idea is to crawl the space by
 adding layers on top of a hypercube of increasing side, starting from the
 origin corner.
Like this? 01234 11234 22234 33334 44444 In fact, any method to iterate on a finite but growing space without going onto the already produced values is OK. You could iterate on an hypercube, then duplicate it to make it n times bigger and recurse: 01 22 3333 11 22 3333 22 22 3333 22 22 3333 3333 3333 ... 3333 3333 I think your solution is viable, because you always iterate on 'square' shapes, that's a good idea. Its only drawback is that the elements are produced in a less natural order, but really, who cares: you just want to take a finite part of an infinite range without closing any door while doing so. Going full diagonal is more difficult: you must find a way to peel of the slices. I encountered this while reading some on Haskell and enumerating infinite ranges, but the only implentation I know of for diagonals is here: http://hackage.haskell.org/packages/archive/level-monad/0.3.1/doc/html/Control-Monad-Levels.html Related to this, I've something that takes 'hyperslices' from ranges of ranges of ... (of whatever rank), but only 'vertical' or 'horizontal' (perpendicular to the normal basis of ranges of ranges...) I also have a n-dim generalization of take: take(rangeOfRanges, 2,3,4, ...) and for drop and slice. I think it's doable to preduce something akin to your layers than way, or maybe my suggestion on blocks. A combination of takes and drops should produce finite cubic shapes that can then be iterated.
 This is an absolutely awesome problem.
Man, don't you have some boring stuff to do instead? I also found the (simple?) problem to enumerate a range of ranges to be more interesting than I thought. Given a range of ranges ... of ranges, of whatever rank, recursively enumerate it: [[a,b,c],[d,e,f],[g,h,i]] produce: (a,0,0),(b,0,1), (c,0,2),(d,1,0), ... the coordinates of each element, if you will. I have a working version now, but it sure taught me that my vision of recursion was flawed. I attach an implementation for two ranges. It's quite a bit more messy than
 I'd hope, so generalization to n ranges should be preceded by cleaning up
 the abstractions used. I think your solution already has said cleanups!
It's good if you can make it to more than two ranges, but I think 2 is already quite useful.
 The output of the program is:

 0       Tuple!(uint,uint)(0, 0)
 1       Tuple!(uint,uint)(0, 1)
 2       Tuple!(uint,uint)(1, 1)
 3       Tuple!(uint,uint)(1, 0)
 4       Tuple!(uint,uint)(0, 2)
 5       Tuple!(uint,uint)(1, 2)
 6       Tuple!(uint,uint)(2, 2)
 7       Tuple!(uint,uint)(2, 0)
 8       Tuple!(uint,uint)(2, 1)
 9       Tuple!(uint,uint)(0, 3)
 10      Tuple!(uint,uint)(1, 3)
 11      Tuple!(uint,uint)(2, 3)
 12      Tuple!(uint,uint)(0, 4)
 13      Tuple!(uint,uint)(1, 4)
 14      Tuple!(uint,uint)(2, 4)


 Seems good to me. I see the layers, like in the Matrix.
If you ever put that in Phobos, make it a secondary function compared to a standard combination, or use a policy to let the user dicide to iterate along layers. Most of the time, you just want the standard 'digit-like' way to iterate on n ranges. I'll have a look at your code. Philippe
Jun 01 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 01 Jun 2010 13:54:01 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 The output of the program is:

 0	Tuple!(uint,uint)(0, 0)
 1	Tuple!(uint,uint)(0, 1)
 2	Tuple!(uint,uint)(1, 1)
 3	Tuple!(uint,uint)(1, 0)
 4	Tuple!(uint,uint)(0, 2)
 5	Tuple!(uint,uint)(1, 2)
 6	Tuple!(uint,uint)(2, 2)
 7	Tuple!(uint,uint)(2, 0)
 8	Tuple!(uint,uint)(2, 1)
 9	Tuple!(uint,uint)(0, 3)
 10	Tuple!(uint,uint)(1, 3)
 11	Tuple!(uint,uint)(2, 3)
 12	Tuple!(uint,uint)(0, 4)
 13	Tuple!(uint,uint)(1, 4)
 14	Tuple!(uint,uint)(2, 4)
It looks like you're missing some iterations there. -Steve
Jun 01 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/01/2010 04:06 PM, Steven Schveighoffer wrote:
 On Tue, 01 Jun 2010 13:54:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 The output of the program is:

 0 Tuple!(uint,uint)(0, 0)
 1 Tuple!(uint,uint)(0, 1)
 2 Tuple!(uint,uint)(1, 1)
 3 Tuple!(uint,uint)(1, 0)
 4 Tuple!(uint,uint)(0, 2)
 5 Tuple!(uint,uint)(1, 2)
 6 Tuple!(uint,uint)(2, 2)
 7 Tuple!(uint,uint)(2, 0)
 8 Tuple!(uint,uint)(2, 1)
 9 Tuple!(uint,uint)(0, 3)
 10 Tuple!(uint,uint)(1, 3)
 11 Tuple!(uint,uint)(2, 3)
 12 Tuple!(uint,uint)(0, 4)
 13 Tuple!(uint,uint)(1, 4)
 14 Tuple!(uint,uint)(2, 4)
It looks like you're missing some iterations there. -Steve
There should be 5 * 3 = 15 iterations. Andrei
Jun 01 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 01 Jun 2010 17:12:29 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 06/01/2010 04:06 PM, Steven Schveighoffer wrote:
 On Tue, 01 Jun 2010 13:54:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 The output of the program is:

 0 Tuple!(uint,uint)(0, 0)
 1 Tuple!(uint,uint)(0, 1)
 2 Tuple!(uint,uint)(1, 1)
 3 Tuple!(uint,uint)(1, 0)
 4 Tuple!(uint,uint)(0, 2)
 5 Tuple!(uint,uint)(1, 2)
 6 Tuple!(uint,uint)(2, 2)
 7 Tuple!(uint,uint)(2, 0)
 8 Tuple!(uint,uint)(2, 1)
 9 Tuple!(uint,uint)(0, 3)
 10 Tuple!(uint,uint)(1, 3)
 11 Tuple!(uint,uint)(2, 3)
 12 Tuple!(uint,uint)(0, 4)
 13 Tuple!(uint,uint)(1, 4)
 14 Tuple!(uint,uint)(2, 4)
It looks like you're missing some iterations there. -Steve
There should be 5 * 3 = 15 iterations.
Oh, I didn't realize the input was not two infinite ranges. I was looking at this as the first 15 lines from the output of 2 infinite ranges. I expected to see (3, 0), (3, 1), (3, 2), and (3, 3). My bad, I guess I should have read the code. -Steve
Jun 01 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/01/2010 04:20 PM, Steven Schveighoffer wrote:
 On Tue, 01 Jun 2010 17:12:29 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 06/01/2010 04:06 PM, Steven Schveighoffer wrote:
 On Tue, 01 Jun 2010 13:54:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 The output of the program is:

 0 Tuple!(uint,uint)(0, 0)
 1 Tuple!(uint,uint)(0, 1)
 2 Tuple!(uint,uint)(1, 1)
 3 Tuple!(uint,uint)(1, 0)
 4 Tuple!(uint,uint)(0, 2)
 5 Tuple!(uint,uint)(1, 2)
 6 Tuple!(uint,uint)(2, 2)
 7 Tuple!(uint,uint)(2, 0)
 8 Tuple!(uint,uint)(2, 1)
 9 Tuple!(uint,uint)(0, 3)
 10 Tuple!(uint,uint)(1, 3)
 11 Tuple!(uint,uint)(2, 3)
 12 Tuple!(uint,uint)(0, 4)
 13 Tuple!(uint,uint)(1, 4)
 14 Tuple!(uint,uint)(2, 4)
It looks like you're missing some iterations there. -Steve
There should be 5 * 3 = 15 iterations.
Oh, I didn't realize the input was not two infinite ranges. I was looking at this as the first 15 lines from the output of 2 infinite ranges. I expected to see (3, 0), (3, 1), (3, 2), and (3, 3). My bad, I guess I should have read the code. -Steve
Funniest thing is the infinite ranges code was _much_ easier to get right. I got it rolling in a few minutes. It was the limit conditions for the finite ranges that killed me. Andrei
Jun 01 2010
prev sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Jun 1, 2010 at 23:12, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 06/01/2010 04:06 PM, Steven Schveighoffer wrote:

 It looks like you're missing some iterations there.

 -Steve
There should be 5 * 3 = 15 iterations. Andrei
The example is: auto c = xproduct(iota(3), iota(5)); so the first index only goes from 0 to 2. Philippe
Jun 01 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/01/2010 04:20 PM, Philippe Sigaud wrote:
 On Tue, Jun 1, 2010 at 23:12, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     On 06/01/2010 04:06 PM, Steven Schveighoffer wrote:

         It looks like you're missing some iterations there.

         -Steve


     There should be 5 * 3 = 15 iterations.

     Andrei


 The example is:

   auto c = xproduct(iota(3), iota(5));

 so the first index only goes from 0 to 2.


 Philippe
And by the way, I added the feature that will ease bearophile's coding by one order of magnitude: iota with one argument. Andrei
Jun 01 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 And by the way, I added the feature that will ease bearophile's coding 
 by one order of magnitude: iota with one argument.
Thank you. There's a long way to go still :o) Bye, bearophile
Jun 01 2010
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Jun 1, 2010 at 23:45, bearophile <bearophileHUGS lycos.com> wrote:

 Andrei Alexandrescu:
 And by the way, I added the feature that will ease bearophile's coding
 by one order of magnitude: iota with one argument.
Thank you. There's a long way to go still :o) Bye, bearophile
What, do you also need the no-arg version of iota? :-p
Jun 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Philippe Sigaud:
 What, do you also need the no-arg version of iota?
 
 :-p
I'd like a generator (range) similar to the Python itertools.count, that yields numbers starting from the given number (defaulting to zero) and just goes on and on. You can use it in many situations: http://docs.python.org/library/itertools.html#itertools.count Bye, bearophile
Jun 02 2010
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Jun 2, 2010 at 19:57, bearophile <bearophileHUGS lycos.com> wrote:

 Philippe Sigaud:
 What, do you also need the no-arg version of iota?

 :-p
I'd like a generator (range) similar to the Python itertools.count, that yields numbers starting from the given number (defaulting to zero) and just goes on and on. You can use it in many situations: http://docs.python.org/library/itertools.html#itertools.count
Yes, it's handy. It's one of the first range I made, a year ago.
Jun 02 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/02/2010 02:29 PM, Philippe Sigaud wrote:
 On Wed, Jun 2, 2010 at 19:57, bearophile <bearophileHUGS lycos.com
 <mailto:bearophileHUGS lycos.com>> wrote:

     Philippe Sigaud:
      > What, do you also need the no-arg version of iota?
      >
      > :-p

     I'd like a generator (range) similar to the Python itertools.count,
     that yields numbers starting from the given number (defaulting to
     zero) and just goes on and on. You can use it in many situations:
     http://docs.python.org/library/itertools.html#itertools.count


 Yes, it's handy. It's one of the first range I made, a year ago.
iota(n, n.max) is close. Well, it's not infinite, but cycle(iota(n, n.max)) is. Probably a version using BigInt would be most sensible. Andrei
Jun 02 2010
next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Jun 2, 2010 at 21:41, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 06/02/2010 02:29 PM, Philippe Sigaud wrote:

 On Wed, Jun 2, 2010 at 19:57, bearophile <bearophileHUGS lycos.com
 <mailto:bearophileHUGS lycos.com>> wrote:

    Philippe Sigaud:
     > What, do you also need the no-arg version of iota?
     >
     > :-p

    I'd like a generator (range) similar to the Python itertools.count,
    that yields numbers starting from the given number (defaulting to
    zero) and just goes on and on. You can use it in many situations:
    http://docs.python.org/library/itertools.html#itertools.count


 Yes, it's handy. It's one of the first range I made, a year ago.
iota(n, n.max) is close. Well, it's not infinite, but cycle(iota(n, n.max)) is. Probably a version using BigInt would be most sensible.
As it's mostly used to enumerate/index ranges or generate some simple numbers, iota(n,n.max) is largely good enough, except, as you said it's not infinite. I never used iota much, it regularly crashed on me in the beginning. I can't remember why. I don't know if your suggestion of BigInt is a joke or not, but the truth is, I have a version using any numerical type, I called it numberz. numbers is the standard version, numberz!T the templated. It uses T for indexing also (it's a random-access range), which allowed me to do auto r = numberz(BigInt("1000000000"), BigInt("10000000000")); auto n = r[BigInt("3000000000")]; Hmm, this doesn't work: auto i = iota(BigInt("0"),BigInt("10"), BigInt("1")); But honestly, I just used it to learn ranges and D, and never had a real use for it. It's just the kind of things that's useful to try, to break out of some preconceptions (using a size_t to index..., 1u as a step). As you said, it already takes loooots of iterations to exhaust a long. Philippe
Jun 02 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 iota(n, n.max) is close. Well, it's not infinite, but cycle(iota(n, 
 n.max)) is. Probably a version using BigInt would be most sensible.
An enumerate() too can be useful (not much tested yet): import std.range: isForwardRange, hasLength, isBidirectionalRange, ElementType; import std.typecons: Tuple; import std.array: front, back, empty, popFront, popBack; /// ... struct Enumerate(R) if (isForwardRange!R) { alias Tuple!(size_t, "index", ElementType!R, "item") TPair; R _range; size_t _index; static if (isBidirectionalRange!R && hasLength!R) immutable size_t _originalEnd; this(R input) { _range = input; static if (isBidirectionalRange!R && hasLength!R) _originalEnd = _range.length - 1; } /// Range primitive implementations. ref TPair front() { return TPair(_index, _range.front()); } /// Ditto bool empty() { return _range.empty(); } /// Ditto void popFront() { _range.popFront(); _index++; } static if (isBidirectionalRange!R && hasLength!R) { /// Ditto ref TPair back() { return TPair(_originalEnd + _index, _range.back()); } } static if (isBidirectionalRange!R && hasLength!R) { /// Ditto void popBack() { _range.popBack(); _index--; } } static if (hasLength!R) { /// Ditto property auto length() { return _range.length; } } } /// Ditto Enumerate!R enumerate(R)(R input) if (isForwardRange!R) { return Enumerate!R(input); } import std.stdio: write, writeln; void main() { string[] arr = ["y", "n", "y", "n", "y"]; foreach (el; enumerate(arr)) write(el, " "); writeln("\n"); foreach_reverse (el; enumerate(arr)) write(el, " "); writeln("\n"); } I don't know if a reverse iteration on an enumerate can be useful. ----------------------- I have used that to try to implement in D2 this Python code:
 arr = ["y", "n", "y", "n", "y"]
 [i for i,el in enumerate(arr) if el == "y"]
[0, 2, 4] This is a basic D version, Appender not used: import std.stdio: writeln; void main() { // input data string[] arr = ["y", "n", "y", "n", "y"]; // int[] indexes; foreach (i, item; arr) if (item == "y") indexes ~= i; writeln(indexes); writeln(); } ----------------------- This is a more functional quite bad-looking D2 version, that doesn't work (see http://d.puremagic.com/issues/show_bug.cgi?id=4264 ): import std.stdio: writeln; import std.algorithm: filter, array, map; void main() { // input data string[] arr = ["y", "n", "y", "n", "y"]; auto r1 = filter!((p){ return p.item == "y"; })(enumerate(arr)); auto r2 = map!((p){ return p.index; })(r1); writeln(array(r2)); } Bye, bearophile
Jun 02 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/02/2010 07:24 PM, bearophile wrote:
 Andrei Alexandrescu:
 iota(n, n.max) is close. Well, it's not infinite, but cycle(iota(n,
 n.max)) is. Probably a version using BigInt would be most sensible.
An enumerate() too can be useful (not much tested yet):
No need to write enumerate() - it's zip(iota(0, size_t.max), r). Andrei
Jun 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 No need to write enumerate() - it's zip(iota(0, size_t.max), r).
What does it happen if someone gives you the sequence produced by: zip(r, iota(0, size_t.max)) ? You have can require an adaptor because it's not the standard convention. While enumerate() always gives you indexes at the first place. And the index/item fields have a standard name that you remember in your code. Finding the minimal number of pieces to do something is useful, but sometimes if an operation is common enough it can be useful to give it a new name. It's called chunking. Inside an expression that can contain other stuff do you prefer to use: zip(iota(0, size_t.max), items) or enumerate(items) ? The second makes the expression shorter and more readable in its purpose. Bye, bearophile
Jun 02 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/02/2010 08:04 PM, bearophile wrote:
 Andrei Alexandrescu:
 No need to write enumerate() - it's zip(iota(0, size_t.max), r).
What does it happen if someone gives you the sequence produced by: zip(r, iota(0, size_t.max)) ? You have can require an adaptor because it's not the standard convention. While enumerate() always gives you indexes at the first place. And the index/item fields have a standard name that you remember in your code. Finding the minimal number of pieces to do something is useful, but sometimes if an operation is common enough it can be useful to give it a new name. It's called chunking. Inside an expression that can contain other stuff do you prefer to use: zip(iota(0, size_t.max), items) or enumerate(items) ? The second makes the expression shorter and more readable in its purpose. Bye, bearophile
My point was that there's no need to sit down and implement functionality. Just write auto enumerate(R)(R r) if (isInputRange!R) { return zip(iota(0, size_t.max), r); } Andrei
Jun 02 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 My point was that there's no need to sit down and implement 
 functionality. Just write
 
 auto enumerate(R)(R r) if (isInputRange!R)
 {
      return zip(iota(0, size_t.max), r);
 }
Right, thank you. A std lib can contain even small functions/HOFs (that are the result of combination of few other things), if they are very commonly useful. Bye, bearophile
Jun 03 2010
prev sibling parent reply Graham Fawcett <fawcett uwindsor.ca> writes:
On Wed, 02 Jun 2010 20:21:49 -0500, Andrei Alexandrescu wrote:
 
 My point was that there's no need to sit down and implement
 functionality. Just write
 
 auto enumerate(R)(R r) if (isInputRange!R) {
      return zip(iota(0, size_t.max), r);
 }
Should this work with v2.043? I get an error if I try to enumerate an array, for example: /* example.d */ import std.range; auto enumerate(R)(R r) if (isInputRange!R) { return zip(iota(0, size_t.max), r); } void main() { auto e = enumerate([10,20,30]); } $ dmd example.d /usr/include/d/dmd/phobos/std/range.d(1773): Error: cannot implicitly convert expression (&this.ranges._field_field_0.front) of type uint delegate() to uint* (The error message is unfortunate: it doesn't indicate the offending expression in example.d.) Thanks, Graham
Jun 03 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/03/2010 10:01 AM, Graham Fawcett wrote:
    import std.range;

      auto enumerate(R)(R r) if (isInputRange!R) {
           return zip(iota(0, size_t.max), r);
      }

      void main() {
        auto e = enumerate([10,20,30]);
      }
I cry bug. Andrei
Jun 03 2010
parent reply Graham Fawcett <fawcett uwindsor.ca> writes:
On Thu, 03 Jun 2010 10:15:33 -0500, Andrei Alexandrescu wrote:

 On 06/03/2010 10:01 AM, Graham Fawcett wrote:
    import std.range;

      auto enumerate(R)(R r) if (isInputRange!R) {
           return zip(iota(0, size_t.max), r);
      }

      void main() {
        auto e = enumerate([10,20,30]);
      }
I cry bug.
LOL! Andrei, you are a very terse guy. :) Do you cry a bug in my example, in std.range, or D 2.043? Graham
Jun 03 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/03/2010 10:25 AM, Graham Fawcett wrote:
 On Thu, 03 Jun 2010 10:15:33 -0500, Andrei Alexandrescu wrote:

 On 06/03/2010 10:01 AM, Graham Fawcett wrote:
     import std.range;

       auto enumerate(R)(R r) if (isInputRange!R) {
            return zip(iota(0, size_t.max), r);
       }

       void main() {
         auto e = enumerate([10,20,30]);
       }
I cry bug.
LOL! Andrei, you are a very terse guy. :) Do you cry a bug in my example, in std.range, or D 2.043?
My code sucks. Andrei
Jun 03 2010
prev sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 03 Jun 2010 17:25:47 +0200, Graham Fawcett <fawcett uwindsor.ca>  
wrote:

 On Thu, 03 Jun 2010 10:15:33 -0500, Andrei Alexandrescu wrote:

 On 06/03/2010 10:01 AM, Graham Fawcett wrote:
    import std.range;

      auto enumerate(R)(R r) if (isInputRange!R) {
           return zip(iota(0, size_t.max), r);
      }

      void main() {
        auto e = enumerate([10,20,30]);
      }
I cry bug.
LOL! Andrei, you are a very terse guy. :) Do you cry a bug in my example, in std.range, or D 2.043?
This is Bugzilla 3123. http://d.puremagic.com/issues/show_bug.cgi?id=3123 -- Simen
Jun 03 2010
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Jun 1, 2010 at 22:54, Philippe Sigaud <philippe.sigaud gmail.com>wrote:

 I also found the (simple?) problem to enumerate a range of ranges to be
 more interesting than I thought.
 Given a range of ranges ... of ranges, of whatever rank, recursively
 enumerate it:

 [[a,b,c],[d,e,f],[g,h,i]]

 produce:

 (a,0,0),(b,0,1), (c,0,2),(d,1,0), ...

 the coordinates of each element, if you will. I have a working version now,
 but it sure taught me that my vision of recursion was flawed.
Ow, I meant, while conserving the original topology, not as a flat range: [[(a,0,0),(b,0,1),(c,0,2)], [(d,1,0), (e,1,1),(f,1,2)], [(g,2,0), (h,2,1),(i,2,2)]] Lazily, of course. I wanted the to conserve the rank because 1- that was the goal of the module: map, produce, etc range of ranges 2- it allowed me to iterate to a 'line' and then descend into this particular line. 3- the original idea was to generate an infinite n-dim grid of coordinates Philippe
Jun 01 2010
prev sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, May 30, 2010 at 18:00, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 05/30/2010 10:27 AM, Simen kjaeraas wrote:

 { 2=C2=B7x | x =E2=88=88 N, x=C2=B2 > 3 }
 =3D>
 ( list!"2*a" | iota(1,int.max) * where!"x^^2 > 3" )
 return comp ! ("tuple(a,b,c)", "a*a+b*b=3D=3Dc*c && a<b")
 This reminded me that Phobos lacks a combinatorial range, taking two or
 more (forward) ranges and giving all combinations thereof
Simen, could you please have a look at my dranges project? http://www.dsource.org/projects/dranges/ Look at the algorithm and range sections, you'll find plenty of code that try to tackle this. In particular combinations() in algorithm2, and also tensorialProduct() in rangeofranges combinations is a higher-order range that takes any number of forward range= s and output their combinations as tuples. tensorialProduct does the same, but creates 'topology' : the combination of three ranges is a range, whereas the tensorial product of three (linear) ranges is a range of ranges of ranges: tensorialProduct([0,1],[2,3]) -> [[(0,2),(0,3)], [(1,2),(1,3)]] (where (x,y) indicates a tuple) Yah, that would be useful. If Philippe agrees to adapt his work, maybe that
 would be the fastest route. And don't forget - the gates of Phobos are op=
en.
 Andrei
Concerning helping Phobos, I'd be delighted to if the Phobos team deems it possible. It's just that my job and formation are far from coding (I'm more a physicist), so I'm not sure my code would be up to it without being reviewed. I'm OK with pretty much everything concerning licensing and adaptation. I'l= l begin by putting a Boost licence in my code, when dsource is up again. If someone is interested by some of it, I'm quite ready to adapt it to conform to D/Phobos evolutions. That's what I'm doing anyway. Philippe
May 30 2010
parent reply "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Philippe Sigaud <philippe.sigaud gmail.com> wrote:
 Simen, could you please have a look at my dranges project?

 http://www.dsource.org/projects/dranges/

 Look at the algorithm and range sections, you'll find plenty of code that
 try to tackle this.
 In particular combinations() in algorithm2, and also tensorialProduct()  
 in
 rangeofranges

 combinations is a higher-order range that takes any number of forward  
 ranges
 and output their combinations as tuples.

 tensorialProduct does the same, but creates 'topology' : the combination  
 of
 three ranges is a range, whereas the tensorial product of three (linear)
 ranges is a range of ranges of ranges:

 tensorialProduct([0,1],[2,3])
 ->
 [[(0,2),(0,3)],
  [(1,2),(1,3)]]

 (where (x,y) indicates a tuple)
Indeed. I just had a quick look at the traits2 module last time you linked it. This does seem to cover all bases. I have to say I'm impressed! -- Simen
May 30 2010
parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, May 30, 2010 at 19:29, Simen kjaeraas <simen.kjaras gmail.com>wrote:

 Indeed. I just had a quick look at the traits2 module last time you
 linked it. This does seem to cover all bases. I have to say I'm impressed!
If you ever have time to look at it, I'm always eager for comments/criticisms/new ideas. Even though my todo list for this project has more than 200 entries left :-) Philippe
May 30 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/28/2010 05:01 PM, bearophile wrote:
[snip]
 Some notes (later I can send them to Andrei):
Thanks for your notes. I removed those for which I have no notable answer.
 The "std.typecons" module, that contains Tuple and tuple can be named
 something simpler to remember as "std.tuples".
Phobos' general approach to naming modules is of coarse granularity. I wanted to massage in std.typecons everything that constructs new, generally useful types.
 I will keep missing array comprehensions in D. In the meantime other
 languages have got some forms of them (but Python ones use the best
 syntax still).
Looks like we're having two proposals.
 In translating the program I have not found significant problems. The
 only bug I have found was caused by the different ordering of the
 Python/Phobos2 heaps, so I have had to use "b<a" in D2. This is not a
 bug, but I have to keep it in mind in future usages of heaps across
 the two languages. I don't know why the two languages have chosen
 different orderings here, if someone of you knows I'd like to know
 it.
a < b is sort of the default comparison operator in D, so it would have been surprising if I'd created a max heap.
 Another small problem I have found was caused by BinaryHeap.pop().
 I'd like it to pop and return a single item, instead of an array of
 given length of items, because this is done quite more often than
 popping many items. If the need to pop many items is important, then
 a different method can be defined, as npop().
There exists a pop() function that only pops one element.
 Maybe BinaryHeap can be moved to the collections module.
Agreed.
 array(map(...)) is so common that an amap(...) can be considered.
I don't know.
 A helper function like this one can be useful: auto binaryHeap(alias
 less="a<  b", Range)(Range items) { return BinaryHeap!(Range,
 less)(items); } Note that the 'less' is before the Range, so in that
 program you can write: auto heap = binaryHeap!("b<  a")(s2w); Instead
 of: auto heap = BinaryHeap!(Block[], "b<  a")(s2w); And in many cases
 you just write: auto heap = binaryHeap(items);
Sounds good.
 I have seen that this doesn't work, but something like this can be
 useful (but I am not sure, I don't love strings used this way):
 schwartzSort!("a.code.length")(r);


 In Python there is both list.sort() and sorted(), the second creates
 a new list. In my dlibs1 I have similar functions. They allow you to
 write: return schwartzSorted!((x){ return x.code.length;
 })(...items); Instead of: auto r = ...items; schwartzSort!((x){
 return x.code.length; })(items); return items; So I'd like sorted and
 schwartzSorted in Phobos2.
I'm not crazy about functions that return large arrays by value. I'd have sorted() return a range (a heap!) that lazily spans the input in sorted order.
 While debugging this program I have found that array(iota(0, 10)) is
 an uint[] instead of being an int[]. In most cases I prefer that
 syntax to produce an array of true ints. In the uncommon situations
 when I want an array of uints I can ask it with a syntax like
 array(iota(0U, 10U)). If this specification is not possible then I
 prefer iota to always yield signed values, because they are much
 *safer* in D.


 I'd like iota(100) to be the same as iota(0, 100), as in the Python
 range().
Sounds good.
 To improve a *lot* my debugging in code like this I'd like this line
 of code: writeln(tuple(['x': 1.0, 'y': 2.0], "hello", [[1, 2], [3,
 4]])); To print (better): tuple(['x': 1.0, 'y': 2.0], "hello", [[1,
 2], [3, 4]]) Or even: Tuple!(double[char], string, int[][])(['x':
 1.0, 'y': 2.0], "hello", [[1, 2], [3, 4]]) instead of:
 Tuple!(double[char],string,int[][])([x:1, y:2], hello, 1 2 3 4)
I've never had a clear view on what the target audience for writeln() is. You seem to want it to output debug strings; I'm not sure that's the best way to purpose it. Andrei
May 30 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei:

Thanks for your notes.<
My pleasure, if you like them I will try to do this thing some more times. There are probably tens or hundreds of small details that can be improved in Phobos. Some of such changes can improve the usage patterns. In past I have put some of them in Bugzilla. One good way to find such possible improvements is to use Phobos, to write small programs, and keep eyes open.
Looks like we're having two proposals.<
I am sceptic that this can be done with no compiler/language support to offer the good enough syntax sugar: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=110613 In my dlibs1 (for D1) I have implemented and then later sometimes used an expanded and improved an idea by Henning Hasemann shown in 2007, but this (you are free to use it if you want, code shown in this newsgroup page is free to use, I think): - is not efficient - you have to define the iteration variable types before this array comp - the code that uses this array comp is not so easy to read - this can't be done (in D1) for lazy comphrensions. TA[] select(TA, TI, TC)(lazy TA mapper, ref TI iter1, TC items1) { static if(is( TC == void[0] )) { return null; } else { auto aux1 = iter1; // save original iteration variable static if (HasLength!(TC)) { auto result = new TA[items1.length]; uint i; static if (IsAA!(TC)) { foreach (k, v; items1) { iter1 = k; result[i] = mapper(); i++; } } else { // Then items1 is an iterable with attribute length // (an array, xrange, etc) // don't use foreach (i,el;items1), it's less general foreach (el; items1) { iter1 = el; result[i] = mapper(); i++; } } iter1 = aux1; // restore original iteration variable return result; } else { // Then items1 is an iterable object // when TA isn't an AA, geometric grow can be used to speed up ArrayBuilder!(TA) result; foreach (el; items1) { iter1 = el; result ~= mapper(); } iter1 = aux1; // restore original iteration variable return result.toarray; } } } TA[] select(TA, TI, TC, TP)(lazy TA mapper, ref TI iter1, TC items1, lazy TP where) { ... TA[] select(TA, TI1, TC1, TI2, TC2)(lazy TA mapper, ref TI1 iter1, TC1 items1, ref TI2 iter2, lazy TC2 items2) { ... TA[] select(TA, TI1, TC1, TI2, TC2, TP)(lazy TA mapper, ref TI1 iter1, TC1 items1, ref TI2 iter2, lazy TC2 items2, lazy TP where) { ...
There exists a pop() function that only pops one element.<
This is how it is implemented: /** Pops the largest element (according to the predicate $(D less)). */ void pop() { enforce(_length); if (_length > 1) swap(_store.front, _store[_length - 1]); --_length; percolateDown(_store[0 .. _length]); } I'd like it to also return the popped item, a ElementType!Range, is this possible? Popping one item out is one of the most common operations I have to perform on an heap.
array(map(...)) is so common that an amap(...) can be considered.<<
I don't know.<
A too much long list of function (that is a too much large API) is bad, but I have found that for the most common higher order functions (map and filter, they are common because array comps aren't present) I like a short cut for the eager version, amap/afilter. But they are not essential, we can survive without them :-)
I'm not crazy about functions that return large arrays by value. I'd have
sorted() return a range (a heap!) that lazily spans the input in sorted order.<
When I need only the few items in sorted order I can use just pop(n), or many pop(). Functional languages return data copies, but they are sometimes lazy (Haskell) or thy try to avoid using arrays and use more functional-friendly data structures that reduce the need for copying lot of data. sorted() and schwartzSorted() can be handy because they can be used as expressions in a functional style. You can write: foreach (x; sorted(items)) {... Instead of: auto sortedItems = items.dup; sortedItems.sorted(); foreach (x; sortedItems) {... If items is long then sorted() is not the top efficiency in both memory and time, but in many situations you don't have many items. Most arrays are short. If you have a 5 items long array and the items are small (like numbers) then using sorted() is not so bad unless it's in the middle of a critical piece of code. And in this case using a standard sort is probably wrong anyway. So sorted/schwartzSorted are not for every situation, they are more for situations where you prefer handy and short code, and you don't need max performance. You don't have to abuse them, as most other things.
I've never had a clear view on what the target audience for writeln() is. You
seem to want it to output debug strings; I'm not sure that's the best way to
purpose it.<
Usages of the printing functions: - To debug code. For this purpose the text shown has to be human-readable, the writeln has to be as automatic as possible (to reduce time needed to add the printing statements), and the text shown has to be "nice" to show the data types but not too much noisy, otherwise the text can become useless. There are more modern ways to show data structures, even GUI-based, but having a fall-back strategy with a good writeln is good. - To show output in small script-like programs or medium command line programs. I think this is the same case as the debug code one. - To print a lot of numbers or simple symbols, for later processing with other programs. In this case printf() is better because it's faster than writeln. - To print many strings. In this case in D printf()/puts() can be suboptimal or unfit. Some very simple, very fast and not templated function similar to puts() but designed for D strings. - For (textual) serialization? In this case it's better to use functions more specialized for this purpose, and to avoid the writeln. So I don't see why it's better for this command: writeln(tuple(['x': 1.0, 'y': 2.0], "hello", [[1, 2], [3, 4]])); To print: Tuple!(double[char],string,int[][])([x:1, y:2], hello, 1 2 3 4) Instead of something more fitter for humans, that can show the things well, as: tuple(['x': 1.0, 'y': 2.0], "hello", [[1, 2], [3, 4]]) Or: Tuple!(double[char], string, int[][])(['x': 1.0, 'y': 2.0], "hello", [[1, 2], [3, 4]]) Bye, bearophile
May 31 2010
parent bearophile <bearophileHUGS lycos.com> writes:
 I'd like it to also return the popped item, a ElementType!Range, is this
possible?
 Popping one item out is one of the most common operations I have to perform on
an heap.
I have read some pages, trying to understand why your pop doesn't return the item. In a page I have read:
Pop returns void for the sake of speed: SGI FAQ, and to prevent exceptions that
may be thrown by the objects copy constructor.<
A quotation from this page: http://www.sgi.com/tech/stl/FAQ.html
Why do the pop member functions return void? All of the STL's pop member
functions (pop_back in vector, list, and deque; pop_front in list, slist, and
deque; pop in stack, queue, and priority_queue) return void, rather than
returning the element that was removed. This is for the sake of efficiency. If
the pop member functions were to return the element that was removed then they
would have to return it by value rather than by reference. (The element is
being removed, so there wouldn't be anything for a reference to point to.)
Return by value, however, would be inefficient; it would involve at least one
unnecessary copy constructor invocation. The pop member functions return
nothing because it is impossible for them to return a value in a way that is
both correct and efficient.<
Time ago I have suggested about a possible enum boolean value defined inside nonvoid functions, that is true/false if the function result is used or not: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=97424 So this value can be used with a static if to compute the result. So such functions are like templates, they get copied in two (generic function pointers have to refer to the nornal version of the function that computes and returns the result). A simpler solution is to use two different names for two functions. The most common operation is to pop an item from a heap and then to use it. So the pop() can return the popped item. And then the heap can define another method that just pops the top item and doesn't return it, it can be named for example "drop", something similar to this (I don't know if this is right): /** Pops the largest element (according to the predicate $(D less)) and returns it. */ ElementType!Range pop() { enforce(_length); auto tmp = _store.front; enforce(_length); if (_length > 1) swap(_store.front, _store[_length - 1]); --_length; percolateDown(_store[0 .. _length]); return tmp; } /** Drops the largest element (according to the predicate $(D less)). */ void drop() { enforce(_length); if (_length > 1) swap(_store.front, _store[_length - 1]); --_length; percolateDown(_store[0 .. _length]); } Bye, bearophile
May 31 2010