## digitalmars.D - Iterating over multiple collections in parallel

• Koroskin Denis (25/25) Jul 03 2008 If found myself solving the same problem again and again:
• Jarrett Billingsley (12/37) Jul 03 2008 It's something Walter mentioned before, but without any kind of public
• BCS (5/35) Jul 03 2008 Can a singe thread have more than one stack? If someone can figure out h...
• Jarrett Billingsley (5/9) Jul 03 2008 Of course, they're called stackthreads or Fibers. Downs has implemented...
• downs (23/36) Jul 04 2008 They're pretty stable now, but they depend on a Phobos patch to run acce...
• bearophile (4/11) Jul 04 2008 Nice :-)
• Fawzi Mohamed (43/43) Jul 05 2008 Iterating over many collection in parallel is very nice to have, what
• bearophile (7/17) Jul 05 2008 I may have lost you a bit here, anyway in real Python that code is:
• Steven Schveighoffer (47/72) Jul 03 2008 The issue here is how to formulate each iteration. foreach works becaus...
• Walter Bright (8/11) Jul 05 2008 It's a well-known problem, and not so easy to solve in an intuitive,
• Manfred_Nowak (16/20) Jul 05 2008 What's wrong with
• Yigal Chripun (7/39) Jul 05 2008 what about:
• Manfred_Nowak (8/9) Jul 05 2008 - Maybe some knowledge of some types of disagreeing and their relation
• Yigal Chripun (14/27) Jul 06 2008 What's so strong about the word "should" ?!
• Koroskin Denis (27/47) Jul 05 2008 On Sun, 06 Jul 2008 01:45:08 +0400, Manfred_Nowak ...
"Koroskin Denis" <2korden gmail.com> writes:
```If found myself solving the same problem again and again:
how to implement simultaneous iteration over two (or more collections)?

suppose, we have the arrays:
int[] a = [1, 2, 3];
int[] b = [1, 4, 9];

and would like to multiply them per-component, like this:

int[] c = new int[a.length];
for (int i = 0; i < a.length; ++j) {
c[i] = a[i] * b[i];
}

Since foreach is the prefered (and sometimes the only) way to iterate over
collection, there should be a way to iterate over multiple collections
simultaneously, like (just an example) this:

int[] c = new int[a.length];
foreach (i : a; j : b; ref k : c) {
k = a + b;
}

But this isn't supported (yet?)
More complicated example would be an iterating over two (or more)
collection with *no* random access iterators available, opApply only.

I spend a lot of time on this and have found no solution how to do it with
existing D feature set, but this is surely doable with a few
inter-function gotos and an exception if collections sizes don't match.
It's just a small layer written in asm, nothing more.

How about an enhancement proposal? :)
```
Jul 03 2008
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
```"Koroskin Denis" <2korden gmail.com> wrote in message
news:op.udp4pcjxenyajd proton.creatstudio.intranet...
If found myself solving the same problem again and again:
how to implement simultaneous iteration over two (or more collections)?

suppose, we have the arrays:
int[] a = [1, 2, 3];
int[] b = [1, 4, 9];

and would like to multiply them per-component, like this:

int[] c = new int[a.length];
for (int i = 0; i < a.length; ++j) {
c[i] = a[i] * b[i];
}

Since foreach is the prefered (and sometimes the only) way to iterate over
collection, there should be a way to iterate over multiple collections
simultaneously, like (just an example) this:

int[] c = new int[a.length];
foreach (i : a; j : b; ref k : c) {
k = a + b;
}

But this isn't supported (yet?)
More complicated example would be an iterating over two (or more)
collection with *no* random access iterators available, opApply only.

I spend a lot of time on this and have found no solution how to do it with
existing D feature set, but this is surely doable with a few
inter-function gotos and an exception if collections sizes don't match.
It's just a small layer written in asm, nothing more.

How about an enhancement proposal? :)

It's something Walter mentioned before, but without any kind of public
"plan" of these ideas he has, there's no way to know if it's even on his
plate.

I think the syntax he used was something like this:

foreach(a, b; c)(d, e; f) { blah }

In any case, it's difficult in the general case to do parallel iteration
with D's inside-out iterators.  To do it generally, you'd have to use
coroutines.  But you can currently write a templated zip to iterate over
arrays simultaneously, only because they're so easy to iterate over.  I'm
sure downs will post something horrid-looking that you can look over.
```
Jul 03 2008
```Reply to Koroskin,

If found myself solving the same problem again and again: how to
implement simultaneous iteration over two (or more collections)?

suppose, we have the arrays:
int[] a = [1, 2, 3];
int[] b = [1, 4, 9];
and would like to multiply them per-component, like this:

int[] c = new int[a.length];
for (int i = 0; i < a.length; ++j) {
c[i] = a[i] * b[i];
}
Since foreach is the prefered (and sometimes the only) way to iterate
over  collection, there should be a way to iterate over multiple
collections  simultaneously, like (just an example) this:

int[] c = new int[a.length];
foreach (i : a; j : b; ref k : c) {
k = a + b;
}
But this isn't supported (yet?)
More complicated example would be an iterating over two (or more)
collection with *no* random access iterators available, opApply only.
I spend a lot of time on this and have found no solution how to do it
with  existing D feature set, but this is surely doable with a few
inter-function gotos and an exception if collections sizes don't
match.  It's just a small layer written in asm, nothing more.

How about an enhancement proposal? :)

Can a singe thread have more than one stack? If someone can figure out how
to do that, then this wouldn't be "to hard" to do in a lib.

call dg1 on first stack, body jumps to stack 2 and calls dg 2, ..., last
dg is called on the threads normal stack and calls the real body.
```
Jul 03 2008
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
```"BCS" <ao pathlink.com> wrote in message
news:55391cb32ecb08caab0811331728 news.digitalmars.com...
Can a singe thread have more than one stack? If someone can figure out how
to do that, then this wouldn't be "to hard" to do in a lib.

call dg1 on first stack, body jumps to stack 2 and calls dg 2, ..., last
dg is called on the threads normal stack and calls the real body.

Of course, they're called stackthreads or Fibers.  Downs has implemented
them in his tools package (in an unstable state) for Phobos, and they've
been in Tango for more than two years.
```
Jul 03 2008
downs <default_357-line yahoo.de> writes:
```Jarrett Billingsley wrote:
"BCS" <ao pathlink.com> wrote in message
news:55391cb32ecb08caab0811331728 news.digitalmars.com...
Can a singe thread have more than one stack? If someone can figure out how
to do that, then this wouldn't be "to hard" to do in a lib.

call dg1 on first stack, body jumps to stack 2 and calls dg 2, ..., last
dg is called on the threads normal stack and calls the real body.

Of course, they're called stackthreads or Fibers.  Downs has implemented
them in his tools package (in an unstable state) for Phobos, and they've
been in Tango for more than two years.

They're pretty stable now, but they depend on a Phobos patch to run acceptably
fast (500 cycles per context switch on my Pentium M).

And I don't know if they run on DMD; there was some problem with the GDC
assembler statements. If somebody finds a problem with it, tell me in a reply
and I'll try to fix it.

Anyway, here's parallel iteration over two foreachables:

gentoo-pc ~/d \$ cat test33.d && echo "----" && rebuild test33.d && ./test33
module test33;

void main() {
auto it1 = [1, 2, 3];
auto it2 = [1, 4, 9];
auto iter2 = generator((void delegate(int) yield) {
foreach (entry; it2) yield(entry);
});

foreach (entry1; it1) {
auto entry2 = iter2();
writefln(entry1, " - ", entry2);
}
}
----
1 - 1
2 - 4
3 - 9
gentoo-pc ~/d \$
```
Jul 04 2008
bearophile <bearophileHUGS lycos.com> writes:
```downs:
auto iter2 = generator((void delegate(int) yield) {
foreach (entry; it2) yield(entry);
});

foreach (entry1; it1) {
auto entry2 = iter2();
writefln(entry1, " - ", entry2);

Nice :-)

Bye,
bearophile
```
Jul 04 2008
Fawzi Mohamed <fmohamed mac.com> writes:
```Iterating over many collection in parallel is very nice to have, what
would be nice to have with a cleaner syntax for it.

Aldor had it, and since then I always missed it.
It worked like this:
Iterator/loops can be declared like this:

for a in [1,2,3]
for b in [1,0,7,3]
for i in 1..
while (b<7)
repeat {
...
}

which mean follow *all* iterators in parallel, and execute ..., when
any iterator stop, stop everything

very clear, nice and flexible.

Other syntaxes look like a kludge with respect to it.

Python for example has generators/iterators, and builds derived
iterators one on the top of the other

for a in [1,2,3]
for i in 0..
repeat { ... }

becomes
for i,a in enumerate([1,2,3].iterator()):
...

C++ iterators can be used in parallel, because all the difficulty of
keeping the current position is moved to the iterator writer, but the
syntax is just plain ugly and writing efficient iterators for complex
structures can be painful.

Anyway leaving away that syntax, in D without changes one can already
have a reasonable syntax (even if not as nice).
Using fibers, as in the nice example of downs it should also be
possible to convert automatically an opApply in a generator, so
technically it should be feasible, even if 500 cycles per conetext
switch, might be too much for some applications (tight loops).
So in D one could have a something workable
foreach(a,b,i;collectLoop([1,2,3],[2,3,7],Range(1,infinity))){
...
}

where collectLoop is a variadic template function that converts opApply
to a generator and returns an object having a opApply that loops in
parallel.
Implementation of it is left as exercice to the reader ;-)

Fawzi
```
Jul 05 2008
bearophile <bearophileHUGS lycos.com> writes:
```Fawzi Mohamed:

for i,a in enumerate([1,2,3].iterator()):

I may have lost you a bit here, anyway in real Python that code is:

for idx, el in enumerate([1, 2, 3]):
print idx, el

Using fibers, as in the nice example of downs it should also be
possible to convert automatically an opApply in a generator, so
technically it should be feasible, even if 500 cycles per conetext
switch, might be too much for some applications (tight loops).
So in D one could have a something workable
foreach(a,b,i;collectLoop([1,2,3],[2,3,7],Range(1,infinity))){
...
}
Implementation of it is left as exercice to the reader ;-)

In my libs there are zip() and azip() for something like that, but they don't
use fibers yet... Once the fibers are added, the code can be made smart at
compile-time, so it can use the current fast code when possible, and fibers
when it must.

Bye,
bearophile
```
Jul 05 2008
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
```"Koroskin Denis" wrote
If found myself solving the same problem again and again:
how to implement simultaneous iteration over two (or more collections)?

suppose, we have the arrays:
int[] a = [1, 2, 3];
int[] b = [1, 4, 9];

and would like to multiply them per-component, like this:

int[] c = new int[a.length];
for (int i = 0; i < a.length; ++j) {
c[i] = a[i] * b[i];
}

Since foreach is the prefered (and sometimes the only) way to iterate over
collection, there should be a way to iterate over multiple collections
simultaneously, like (just an example) this:

int[] c = new int[a.length];
foreach (i : a; j : b; ref k : c) {
k = a + b;
}

But this isn't supported (yet?)
More complicated example would be an iterating over two (or more)
collection with *no* random access iterators available, opApply only.

I spend a lot of time on this and have found no solution how to do it with
existing D feature set, but this is surely doable with a few
inter-function gotos and an exception if collections sizes don't match.
It's just a small layer written in asm, nothing more.

How about an enhancement proposal? :)

The issue here is how to formulate each iteration.  foreach works because it
is well defined "iterate over each value in the container".  But iterating
over multiple containers, you could want different orders of iteration,
iterating for each unique set of indexes, etc.

Or you could want what you had originally stated, in which cases the lengths
must be correct.  We already have foreach_reverse, which is like foreach,
but iterates in the opposite direction.  I don't want foreach_X for
everyones preferred iteration method of choice.

I think the best solution here is a fruct (foreach struct) in a library:

struct myfruct
{
static myfruct opCall(int[] a, int[] b, int[] c)
{
myfruct result;
result.a = a;
result.b = b;
result.c = c;
assert(a.length == b.length && a.length == c.length);
}

int[] a;
int[] b;
int[] c;
int opApply(int delegate(ref int i, ref int j, ref int k) dg)
{
int result = 0;
for(int i = 0; i < a.length; i++)
{
if((result = dg(a[i], b[i], c[i])) != 0)
break;
}
return result;
}
}

usage:

foreach(i, j, ref k; myfruct(a, b, c))
{
k = i * j;
}

Now, the issue here is, how can this be made generic?  The answer is, we
need iterators.  Otherwise you cannot iterate over multiple containers that
are not index-based (such as AA's or maps).  With iterators, that have a
well defined interface (like opApply), that allows one to increment
individual indexes, and not have to know what indexes are made of how how
they work.  They just define opInc and opStar.  Oh, and we would need
reference return types too (so opStar can return an actual reference).

-Steve
```
Jul 03 2008
Walter Bright <newshound1 digitalmars.com> writes:
```Koroskin Denis wrote:
I spend a lot of time on this and have found no solution how to do it
with existing D feature set, but this is surely doable with a few
inter-function gotos and an exception if collections sizes don't match.

It's a well-known problem, and not so easy to solve in an intuitive,
non-klunky way. Some of the issues are what do you do when the
collections are of different sizes? Error? Iterate to the minimum?
Iterate to the maximum and use default values for the shorter collections?

The best I can offer at the moment is:

foreach (i; 0 .. a.length)
a[i] = b[i] + c[i];
```
Jul 05 2008
"Manfred_Nowak" <svv1999 hotmail.com> writes:
```Koroskin Denis wrote:

int[] c = new int[a.length];
foreach (i : a; j : b; ref k : c) {
k = a + b;
}

What's wrong with

auto c= a .* b;

except, that it is neither suggested nor supported?

Or

auto c= a (*) b;

which was suggested some long time ago, but has not got through?

Is it horrible then to define:

void opDotMul(T)(out T c, T x, T y){
assert( x.length ==  y.length);
c.length= x.length;
foreach( i,v; x){
c[i]= x[i] * y[i];
}
}

-manfred
```
Jul 05 2008
Yigal Chripun <yigal100 gmail.com> writes:
```Manfred_Nowak wrote:
Koroskin Denis wrote:

int[] c = new int[a.length];
foreach (i : a; j : b; ref k : c) {
k = a + b;
}

What's wrong with

auto c= a .* b;

except, that it is neither suggested nor supported?

Or

auto c= a (*) b;

which was suggested some long time ago, but has not got through?

Is it horrible then to define:

void opDotMul(T)(out T c, T x, T y){
assert( x.length ==  y.length);
c.length= x.length;
foreach( i,v; x){
c[i]= x[i] * y[i];
}
}

-manfred

int[100] a, b, c;
for (int i = 0; i < 100; ++i) {
a[i] = myCrazyFunkyFunction(b[i], c[i]);
}

D does not and should not have an opMyCrazyFunkyFunction.
```
Jul 05 2008
"Manfred_Nowak" <svv1999 hotmail.com> writes:
```Yigal Chripun wrote:

D does not and should not have an opMyCrazyFunkyFunction.

- Maybe some knowledge of some types of disagreeing and their relation
can turn out to be useful:
http://blog.createdebate.com/2008/04/07/writing-strong-arguments/

- Maybe a DigitalMars_funky_extension.opMyCrazyFunkyFunction is allowed
to exist? At least it's in the language specification:
http://www.digitalmars.com/d/1.0/pragma.html

-manfred
```
Jul 05 2008
Yigal Chripun <yigal100 gmail.com> writes:
```Manfred_Nowak wrote:
Yigal Chripun wrote:

D does not and should not have an opMyCrazyFunkyFunction.

- Maybe some knowledge of some types of disagreeing and their relation
can turn out to be useful:
http://blog.createdebate.com/2008/04/07/writing-strong-arguments/

- Maybe a DigitalMars_funky_extension.opMyCrazyFunkyFunction is allowed
to exist? At least it's in the language specification:
http://www.digitalmars.com/d/1.0/pragma.html

-manfred

What's so strong about the word "should" ?!
If I wanted to make a "strong argument" as you say, I would've used the
word "must".

Also, I didn't talk about pragmas and compiler specific extensions but
rather meant the core language.

maybe you want to suggest a way to define infix functions?

Last thing: I find it ridiculous you try to teach me the proper way to
post because you didn't like my phrasing while other people use insults
and other bad language on the NG and you're fine with that. That's just
smells of hypocrisy. If you really want to educate someone here, start
with those that curse, swear and troll.

disappointed,
Yigal
```
Jul 06 2008
"Koroskin Denis" <2korden gmail.com> writes:
```On Sun, 06 Jul 2008 01:45:08 +0400, Manfred_Nowak <svv1999 hotmail.com> =
=

wrote:

Koroskin Denis wrote:

int[] c =3D new int[a.length];
foreach (i : a; j : b; ref k : c) {
k =3D a + b;
}

What's wrong with

auto c=3D a .* b;

except, that it is neither suggested nor supported?

Or

auto c=3D a (*) b;

which was suggested some long time ago, but has not got through?

Is it horrible then to define:

void opDotMul(T)(out T c, T x, T y){
assert( x.length =3D=3D  y.length);
c.length=3D x.length;
foreach( i,v; x){
c[i]=3D x[i] * y[i];	=

}
}

-manfred

Nothing wrong. It was just brought as an example.
Replace int[] with a List!(int) that exposes opApply() but not iterators=
=

and try again :)

The only way (apart from using stackthreads) you can do it now is  =

something like this:

List!(int) a, b, c;

int[] acopy, bcopy;
foreach (i; a) {
acopy ~=3D i;
}

foreach (j; b) {
bcopy ~=3D j;
}

int i =3D 0;
foreach (ref k; c) {
k =3D acopy[i] + bcopy[i];
}

compare with the following:

List!(int) a, b, c;
foreach (i; a) (j; b)(ref k; c) {
c =3D a + b;
}

which one would you prefer?
```
Jul 05 2008
"Manfred_Nowak" <svv1999 hotmail.com> writes:
```Koroskin Denis wrote:

List!(int) a, b, c;

[...]
which one would you prefer?

Still

c= a .+ b;

To do componentwise operations imposes some requirements for the
involved types. These are at least:
- an order on the elements, and
- definition(s) for the operation(s) on the elements

If an opApply cannot be interpreted as representing an ordering, than