digitalmars.D - Re: Proposal: Hide the int in opApply from the user

Paul Findlay <r.lph50+d gmail.com> Jan 08 2008

bearophile <bearophileHUGS lycos.com> Jan 09 2008

Bill Baxter <dnewsgroup billbaxter.com> Jan 09 2008

Dan <murpsoft hotmail.com> Jan 09 2008
bearophile <bearophileHUGS lycos.com> Jan 10 2008

bearophile <bearophileHUGS lycos.com> Jan 10 2008

BC <notmi_emayl_adreznot hotmail.com.remove.not> Jan 11 2008

Paul Findlay <r.lph50+d gmail.com> writes:

Sorry, to use the web interface to the newsgroup..

This is an old (1993), but I hope related, discussion:

Iterators: Signs of Weakness in Object-Oriented Languages
http://home.pipeline.com/~hbaker1/Iterator.html

 - Paul

Jan 08 2008

bearophile <bearophileHUGS lycos.com> writes:

Paul Findlay Wrote:
 This is an old (1993), but I hope related, discussion:
 Iterators: Signs of Weakness in Object-Oriented Languages
 http://home.pipeline.com/~hbaker1/Iterator.html


Thank you for the link, it's quite interesting.
From that article:

Iterator functions for a collection can be "member functions" of the C++
collection class itself, but this policy creates a problem if one wishes to
enumerate the same collection for different purposes at the same time--e.g., if
one wishes to enumerate the Cartesian product of a collection with itself. The
problem is that the "state" of the iterator must be stored somewhere, and the
only two places are in the instance of the collection or the class of the
collection. If the state is stored in the class, then only one iterator can be
active for the entire class, while if the state is stored in the instance, then
only one iterator can be active for the collection. Neither of these policies
provides the flexibility required for our Cartesian product problem. The
solution usually proposed for C++ is to provide every collection class with a
"friend" class called its "iterator class".<


If you put a D opApply as a metod of a collection class, it allows you to
enumerate the Cartesian product of a collection with itself too.

Bye,
bearophile

Jan 09 2008

Bill Baxter <dnewsgroup billbaxter.com> writes:

bearophile wrote:
 Paul Findlay Wrote:
 This is an old (1993), but I hope related, discussion:
 Iterators: Signs of Weakness in Object-Oriented Languages
 http://home.pipeline.com/~hbaker1/Iterator.html


 Thank you for the link, it's quite interesting.
 From that article:
 
 The problem is that the "state" of the iterator must be stored somewhere, and
the only two places are in the instance of the collection or the class of the
collection. 


 If you put a D opApply as a metod of a collection class, it allows you to
enumerate the Cartesian product of a collection with itself too.


In other words, the author seems to have overlooked the other place 
where state can be stored: on the stack.

--bb

Jan 09 2008

Dan <murpsoft hotmail.com> writes:

Bill Baxter Wrote:

 bearophile wrote:
 Paul Findlay Wrote:
 This is an old (1993), but I hope related, discussion:
 Iterators: Signs of Weakness in Object-Oriented Languages
 http://home.pipeline.com/~hbaker1/Iterator.html


 Thank you for the link, it's quite interesting.
 From that article:
 
 The problem is that the "state" of the iterator must be stored somewhere, and
the only two places are in the instance of the collection or the class of the
collection. 


 If you put a D opApply as a metod of a collection class, it allows you to
enumerate the Cartesian product of a collection with itself too.


 In other words, the author seems to have overlooked the other place 
 where state can be stored: on the stack.
 
 --bb


Or the other place; in registers.

Typically when you go through a loop, you generate this (on x86):

    mov ecx, length;
top:
    bla bla bla
    dec ecx;
    jnz top;

This is considerably more efficient for storing iterator state than any other
means.

Jan 09 2008

bearophile <bearophileHUGS lycos.com> writes:

Bill Baxter>In other words, the author seems to have overlooked the other place
where state can be stored: on the stack.<

In other parts of that article the author is aware of the stack too, he talks
about the alloca() function too.
What I don't like is the formatting of those snippets of C code :-)

void maptimes(fn,limit) void (*fn)(int); int limit;
{int i; for (i=0; i<limit; i++) (*fn)(i);}

The article presents some problems of iterating on a collection, and how to
solve them in Lisp, C and C++. The main problems are:
1) Iterating on a collection
2) Given a collection (like a tree), enumerate the Cartesian product of it with
itself.
3) Given two collections, like two trees, find if they have the same leaves,
ignoring the other structure of the tree. The purpose is to do this computation
efficiently, keeping the collection data encapsulation, etc.
4) I think the article tries to talk about a problem where you need to do both
2) and 3).

Putting an opApply method into the collection class is enough to solve the 1)
and 2) problem too.

The problem 3) can be solved in many ways, the simpler way is to scan both
trees to accumulate their leaves into an array, and then to see if the two (or
more) arrays are equal. But that ways usually wastes too much memory and time
if the trees aren't small. So a lazy approach is better. In Haskell and in
recent versions of Python it's easy to perform lazy computations. (In Python
you can write a function/method generator, that yields the leaves, then you
just have to write a little function that calls the two generators in parallel.
If the two generators give equal sequences, and they stop at the same time,
then the two trees have the same fringe). The D opApply doesn't allow you to do
that, I presume (fibers/threads may help, but so far I haven't used them to
solve this problem of parallel lazy scan).

But to solve the samefringe problem in D (if you have a Tree data structure
class) you can add three methods to that class, like "next", "isfinished" and
"reset" (or something similar) that use a little part of the state of the tree
object to iterate on the nodes, where the "next" method gives the currently
seen node. Then you can write a little function that calls "next" on two
different trees, to solve the samefringe problem (so you need "next",
"isfinished", "reset" to solve parallel scan problems, and you can add
"opApply" method too, for a faster single scan, because I presume opApply is
faster than using the next/isfinished pair).

To solve the problem 4) (that's not stated in the article, I think) the
reset/next/isfinished methods don't suffice, because the object stores just one
iteration state. So you may need to give the tree class yet another method that
returns a newly created object with reset/next/isfinished methods and a new
iteration state. This is probably closer to the solution used by C++ (but it's
a friend class, not a nested class). You may want to avoid putting the
reset/next/isfinished methods in the tree class, and just use this smaller
iteration object for the problem 3) too.

I don't know if D may use some standard way (like opApply) to allow parallel
lazy scans (that is to manage the yield like Python does).

I am learning D still, so if you see something wrong, silly or non D-thonic
among the things I have written here, please tell me :-)

Bye,
bearophile

Jan 10 2008

bearophile <bearophileHUGS lycos.com> writes:

bearophile:
 I don't know if D may use some standard way (like opApply) to allow
 parallel lazy scans (that is to manage the yield like Python does).


D 2.x has full closures too, so to solve those problems it may be used one of
the full functional approaches discussed in the article too (I presume they
aren't much used yet among D programmers).

Bye,
bearophile

Jan 10 2008

BC <notmi_emayl_adreznot hotmail.com.remove.not> writes:

On Wed, 09 Jan 2008 21:15:48 -0000, bearophile <bearophileHUGS lycos.com>  
wrote:

 Paul Findlay Wrote:
 This is an old (1993), but I hope related, discussion:
 Iterators: Signs of Weakness in Object-Oriented Languages
 http://home.pipeline.com/~hbaker1/Iterator.html


 Thank you for the link, it's quite interesting.
 From that article:

 Iterator functions for a collection can be "member functions" of the  
 C++ collection class itself, but this policy creates a problem if one  
 wishes to enumerate the same collection for different purposes at the  
 same time--e.g., if one wishes to enumerate the Cartesian product of a  
 collection with itself. The problem is that the "state" of the iterator  
 must be stored somewhere, and the only two places are in the instance  
 of the collection or the class of the collection. If the state is  
 stored in the class, then only one iterator can be active for the  
 entire class, while if the state is stored in the instance, then only  
 one iterator can be active for the collection. Neither of these  
 policies provides the flexibility required for our Cartesian product  
 problem. The solution usually proposed for C++ is to provide every  
 collection class with a "friend" class called its "iterator class".<


 If you put a D opApply as a metod of a collection class, it allows you  
 to enumerate the Cartesian product of a collection with itself too.

 Bye,
 bearophile


It's difficult to iterate over two collections without buffering.
A tree iterator needs a tree with parent links on all the nodes or it
needs to hold its own stack. Unfortunately if you try to assign those
iterators around, the stack is copied by reference and your program
is broken. So you have to explicitly dup those iterators in all your
generic code.

Jan 11 2008

D Programming

C/C++ Programming

Other

digitalmars.D - Re: Proposal: Hide the int in opApply from the user