digitalmars.D - foreach syntax, std.mixin

reply dsimcha <dsimcha yahoo.com> writes:
What are the chances that D gets auto tuple unpacking for foreach loops before
D2 goes gold?  In other words, it would be nice to write:

uint[] foo = [1,2,3,4,5];
uint[] bar = [6,7,8,9,10];

foreach(a, b; zip(foo, bar)) {
    // Does what you think it does.
}

Also, how about foreach over ranges with an index variable?  For example:

foreach(index, elem; chain(foo, bar)) {
   // Does what you think it does.
}

If these aren't going to happen, would it be worth having in Phobos a
std.mixin module (I've already considered such a thing a few times here) that
would allow someone writing a range to add goodies like these to structs and
classes with a single line of code?  For example:

struct MyRange {
    Tuple!(float, float) _front;
    bool _empty;

    typeof(_front) front() {
        return _front;
    }

    void popFront() {
        // do stuff.
    }

    bool empty() {
        return _empty;
    }

    mixin(UnpackTupleForeach);

    mixin(IndexForeach);
}

The mixin(UnpackTupleForeach) would make this work:
foreach(float float1, float float2; MyRange.init) {}

The mixin(IndexForeach) would make this work:
foreach(size_t index, Tuple!(float, float) floatTuple; MyRange.init) {}

This would work via opApply, which some don't like for efficiency reasons.
It's been shown a while back that, while opApply does have some overhead, it's
pretty small and LDC actually optimizes it out.
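
A minimal sketch of how such a mixin could be built on opApply. This is not the std.mixin proposed above: UnpackTupleForeach is written here as a hypothetical mixin template (rather than a string mixin) that takes the field types explicitly, and Pairs is an illustrative range.

```d
// Try with: dmd -unittest -main -run unpack_mixin_sketch.d
import std.typecons : Tuple, tuple;

// Hypothetical mixin (not in Phobos): adds an opApply that passes each
// field of a Tuple-valued front to the loop body as its own variable.
mixin template UnpackTupleForeach(Fields...)
{
    int opApply(scope int delegate(Fields) dg)
    {
        for (; !empty; popFront())
        {
            auto item = front();
            // A nonzero result means the loop body hit break or return.
            if (auto result = dg(item.expand))
                return result;
        }
        return 0;
    }
}

// Illustrative range whose front is a Tuple!(float, float).
struct Pairs
{
    Tuple!(float, float)[] data;

    bool empty() { return data.length == 0; }
    Tuple!(float, float) front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }

    mixin UnpackTupleForeach!(float, float);
}

unittest
{
    auto r = Pairs([tuple(1.0f, 2.0f), tuple(3.0f, 4.0f)]);
    float sum = 0;
    foreach (float a, float b; r)
        sum += a * b;
    assert(sum == 14.0f);
}
```

Because the struct has both opApply and the range primitives, foreach picks opApply, which is where the overhead discussed above comes from.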
Nov 08 2009
next sibling parent reply Bill Baxter <wbaxter gmail.com> writes:
On Sun, Nov 8, 2009 at 9:10 AM, dsimcha <dsimcha yahoo.com> wrote:
 What are the chances that D gets auto tuple unpacking for foreach loops before
 D2 goes gold?  In other words, it would be nice to write:

 uint[] foo = [1,2,3,4,5];
 uint[] bar = [6,7,8,9,10];

 foreach(a, b; zip(foo, bar)) {
     // Does what you think it does.
 }
Probably obvious, but if added, it should probably work by means of .tupleof on the iterated type, rather than by any particular knowledge of types used in std.range. And I guess just public members would be included.

The main downside is that it could introduce ambiguities with explicitly specified opApply overloads. If you have an opApply that takes uint,uint already, which gets used? Or is it an error? I'm not sure what would be least likely to cause trouble. On the one hand the class author should be able to override the behavior, but on the other hand class users will assume that tuple unpacking works and may be surprised if it has been overridden to do something different. So either rule -- specific opApply wins, or it's an error -- seems to have an argument for it.
 Also, how about foreach over ranges with an index variable?  For example:

 foreach(index, elem; chain(foo, bar)) {
     // Does what you think it does.
 }
This will create too many ambiguities if the tuple unpacking is also implemented, so I'm against it:

foreach(index,elem; zip(arrayOfInts, bar)) {
    // I don't know what this does
}

I think the index variant would better be done as an enumerate() function like in Python, which does something like zip(iota(1,bar.length),bar).
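
A rough sketch of that enumerate() idea, built only from zip and iota. The name enumerateFromZero is made up for illustration. Python's enumerate counts from zero by default, so the sketch uses iota(range.length) rather than iota(1, bar.length), and it assumes the range has a length.

```d
// Try with: dmd -unittest -main -run enumerate_sketch.d
import std.range : iota, zip;

// Hypothetical helper: pairs each element of a range that has a length
// with its zero-based position, as a lazy range of tuples.
auto enumerateFromZero(R)(R range)
{
    return zip(iota(range.length), range);
}

unittest
{
    string[] bar = ["a", "b", "c"];
    size_t expected = 0;
    foreach (pair; enumerateFromZero(bar))
    {
        assert(pair[0] == expected);      // the index
        assert(pair[1] == bar[expected]); // the element
        ++expected;
    }
    assert(expected == 3);
}
```

Each element is a tuple, so the loop still has to pick the index and value out by position; that is the unpacking question this thread is about.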
 If these aren't going to happen, would it be worth having in Phobos a
 std.mixin module (I've already considered such a thing a few times here) that
 would allow someone writing a range to add goodies like these to structs and
 classes with a single line of code?  For example:
These look less controversial, but I would still like foreach to be able to tease apart structs/tuples.

A further issue, and this may nuke the idea till we have tuple literals, is what to do about tuples of tuples. In Python you can do

for i,(f,b) in enumerate(zip(foo,bar)):

I don't think we want to introduce the bad idea of automatic flattening yet again just to get this kind of thing to work:

foreach(i,f,b; enumerate(zip(foo,bar)))

It should really be something like:

foreach(i,(f,b); enumerate(zip(foo,bar)))

But that requires D to get some tuple syntax. ... Or I suppose it could be a special syntax in "foreach" for now.

Another approach to disambiguating tuple iteration vs opApply iteration could be to require tuple parens around the arguments:

foreach((f,b); zip(foo,bar)) {} // tuple iteration
foreach(f,b; zip(foo, bar)) {}  // only works if zip returns something with the right kind of opApply

--bb
Nov 08 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 This will create too many ambiguities if the tuple unpacking is also
 implemented, so I'm against it:

 foreach(index,elem; zip(arrayOfInts, bar)) {
     // I don't know what this does
 }

 I think the index variant would better be done as an enumerate() function
 like in Python, which does something like zip(iota(1,bar.length),bar).
Enumerate is a great idea. It's probably much better than requiring every range struct to mix something in to enable this behavior. Thanks.

I guess the same thing could be applied to unpack(). Instead of making the range implementer mix in something to enable this (he/she will probably forget and it could lead to ambiguities), do it on the caller end.

Makes me wonder why no one thought of this until now, or maybe someone did and I forgot. How about:

foreach(fooElem, barElem; unpack(zip(foo, bar))) {}

or:

foreach(i, elem; enumerate(chain(foo, bar))) {}
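
To illustrate the caller-side approach, here is a small sketch of a wrapper that adds the index through opApply. It is not the code from the prototype linked later in the thread; Enumerated and enumerated are hypothetical names chosen to avoid clashing with anything in Phobos.

```d
// Try with: dmd -unittest -main -run caller_side_sketch.d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

// Hypothetical wrapper: supplies a running index from the caller's side,
// so the wrapped range itself needs no mixin.
struct Enumerated(R) if (isInputRange!R)
{
    R source;

    int opApply(scope int delegate(size_t, ElementType!R) dg)
    {
        size_t index = 0;
        for (; !source.empty; source.popFront(), ++index)
        {
            // A nonzero result means the loop body hit break or return.
            if (auto result = dg(index, source.front))
                return result;
        }
        return 0;
    }
}

auto enumerated(R)(R range) if (isInputRange!R)
{
    return Enumerated!R(range);
}

unittest
{
    import std.range : chain;

    uint[] foo = [1, 2, 3];
    uint[] bar = [6, 7];
    size_t lastIndex;
    uint total;
    foreach (i, elem; enumerated(chain(foo, bar)))
    {
        lastIndex = i;
        total += elem;
    }
    assert(lastIndex == 4);
    assert(total == 19);
}
```

The range author does nothing here; whoever writes the loop opts in by wrapping the range.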
Nov 08 2009
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, Nov 8, 2009 at 21:14, dsimcha wrote:

 I think the index variant would better be done as an enumerate()
 function like in python. Which does something like
 zip(iota(1,bar.length),bar).

 Enumerate is a great idea. It's probably much better than requiring every
 range struct to mix something in to enable this behavior. Thanks.

FWIW, I coded enumerate and such for D2, and will post them on dsource as soon as I can get my hands on svn. In a few days at most, I hope.

Personally, I then use a 'tuplify' template which takes a standard function and transforms it into a tuple-accepting one. So if I have:

int foo(int a, double b, string c) {...}

tuplify!foo is:

int tfoo(Tuple!(int, double, string) t) { /* extracts a, b & c from t, and returns foo(a,b,c) */ }

So, as I can't unpack the tuple:

foreach(a,b,c; zip(range1, range2, range3)) { /* do something with foo on (a,b,c) */ }

I tuplify foo and do:

foreach(tup; zip(range1, range2, range3)) { /* do something with tuplify!foo(tup) */ }

There is no real limit on the number of ranges that can be acted upon in parallel that way, though I admit the syntax is a bit cumbersome.

I also use unzip!index(someZip) to get back the original range from inside zip.
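
The tuplify code itself was not posted, so the following is only a guess at its shape: a sketch that assumes std.traits.Parameters and ReturnType, with an illustrative body for foo.

```d
// Try with: dmd -unittest -main -run tuplify_sketch.d
import std.traits : Parameters, ReturnType;
import std.typecons : Tuple;

// Hypothetical reconstruction: wrap fun so that it takes a single Tuple
// instead of separate arguments.
template tuplify(alias fun)
{
    ReturnType!fun tuplify(Tuple!(Parameters!fun) t)
    {
        return fun(t.expand);
    }
}

// Illustrative three-argument function; the body is made up.
int foo(int a, double b, string c)
{
    return a + cast(int) b + cast(int) c.length;
}

unittest
{
    import std.algorithm.comparison : equal;
    import std.algorithm.iteration : map;
    import std.range : zip;

    // Each zip element is a Tuple!(int, double, string).
    auto results = zip([1, 2], [0.5, 1.5], ["x", "yz"]).map!(tuplify!foo);
    assert(equal(results, [2, 5]));
}
```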
 I guess the same thing could be applied to unpack().  Instead of making the
 range implementer mixin something to enable this (he/she will probably forget
 and it could lead to ambiguities), do it on the caller end.

 Makes me wonder why no one thought of this until now, or maybe someone did
 and I forgot.  How's:

 foreach(fooElem, barElem; unpack(zip(foo, bar))) {}, or:

 foreach(i, elem; enumerate(chain(foo, bar))) {} ?
Can that be done for more than two ranges? The syntax is so nice, I'd be delighted to have that.

Philippe
Nov 08 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Philippe Sigaud (philippe.sigaud gmail.com)'s article
 dsimcha wrote:
 Makes me wonder why no one thought of this until now, or maybe someone did
 and I forgot.  How's:

 foreach(fooElem, barElem; unpack(zip(foo, bar))) {}, or:

 foreach(i, elem; enumerate(chain(foo, bar))) {} ?

 Can that be done for more than two ranges? The syntax is so nice, I'd be
 delighted to have that.
Hot off the press and VERY prototype-ish:

Code:
http://pastebin.com/m2087e524

Docs:
http://cis.jhu.edu/~dsimcha/unpackEnumerate.html

Does this look like a good addition to std.range? The elegance of it is that it solves the problem of providing syntactic sugar to ranges w/ zero ripple effects either in the compiler or in the rest of Phobos. I'll file it somewhere more official after people review it a little and refine the idea, but I definitely think something similar to this has a legit place in std.range.

If you're wondering how unpack works and don't want to grovel through all the code, it's tons of string mixin magic. That's about the only way I was able to make it work.
Nov 08 2009
next sibling parent reply Bill Baxter <wbaxter gmail.com> writes:
On Sun, Nov 8, 2009 at 1:43 PM, dsimcha <dsimcha yahoo.com> wrote:
 Hot off the press and VERY prototype-ish:

 Code:
 http://pastebin.com/m2087e524

 Docs:
 http://cis.jhu.edu/~dsimcha/unpackEnumerate.html

 Does this look like a good addition to std.range?  The elegance of it is it solves
 the problem of providing syntactic sugar to ranges w/ zero ripple effects either
 in the compiler or in the rest of Phobos.  I'll file it somewhere more official
 after people review it a little and refine the idea, but I definitely think
 something similar to this has a legit place in std.range.

 If you're wondering how unpack works and don't want to grovel through all the
 code, it's tons of string mixin magic.  That's about the only way I was able to
 make it work.
What's the overhead like? That would be the thing that would keep me from using unpack or enumerate. As Andrei is fond of saying "expensive abstractions are a dime a dozen". If it's not too bad then this sounds like a decent solution to me. --bb
Nov 08 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
What's the overhead like? That would be the thing that would keep me from using unpack or enumerate. As Andrei is fond of saying "expensive abstractions are a dime a dozen". If it's not too bad then this sounds like a decent solution to me. --bb
import std.stdio, std.perf;

void main() {
    auto myRange = replicate(100_000_000, 0);

    scope pc = new PerformanceCounter;
    pc.start;
    foreach(i, num; enumerate(myRange)) {}
    pc.stop;
    writeln("Enumerate:  ", pc.milliseconds);

    pc.start;
    foreach(num; myRange) {}
    pc.stop;
    writeln("Raw Range:  ", pc.milliseconds);

    pc.start;
    foreach(i; 0..100_000_000) {
        int num = 0;
    }
    pc.stop;
    writeln("Plain old for loop:  ", pc.milliseconds);
}

Results (milliseconds):

Enumerate:  1207
Raw Range:  940
Plain old for loop:  112

In other words, it's not a zero-cost abstraction, but it's not what you'd call expensive either, especially since in real-world code you'd actually have a loop body. Also, apparently LDC inlines opApply, proving that a sufficiently smart but realistically implementable compiler can make this a zero-cost abstraction. (Someone who uses LDC please confirm this.)

To put this in perspective, ranges are not a free abstraction on DMD either, probably because DMD's inliner isn't sufficiently aggressive. Really, the raw range case shouldn't take nearly as long as it does either, as the plain old for loop test proves.

IMHO the details of how DMD's optimizer currently works should not dictate the design of the standard library unless either performance is absurdly bad or we have good reason to believe that common implementations will never be any better. LDC proves that inlining opApply can be done. If you have something that absolutely must be as fast as possible now, you may not want to use this (or ranges either), but in the bigger picture I think it's efficient enough to have a legitimate place in Phobos.
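
For anyone who wants to repeat this kind of measurement, here is a rough sketch of the same timing pattern, assuming a Phobos that provides std.datetime.stopwatch and std.range.repeat. The helper elapsedMsecs is a made-up name, the iteration count is much smaller than in the run above, and the enumerate case is left out because that prototype is not reproduced here. Timings will vary by compiler and flags, so no expected numbers are given.

```d
// Try with: dmd -unittest -main -run timing_sketch.d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.range : repeat;

// Illustrative helper: run work() once and return the elapsed milliseconds.
long elapsedMsecs(scope void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes);
    work();
    sw.stop();
    return sw.peek().total!"msecs";
}

unittest
{
    import std.stdio : writeln;

    enum n = 1_000_000; // far fewer iterations than the original run
    auto myRange = repeat(0, n);

    writeln("Raw range:  ", elapsedMsecs(() { foreach (num; myRange) {} }));
    writeln("Plain loop: ", elapsedMsecs(() { foreach (i; 0 .. n) { int num = 0; } }));
}
```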
Nov 08 2009
parent reply Bill Baxter <wbaxter gmail.com> writes:
On Sun, Nov 8, 2009 at 5:13 PM, dsimcha <dsimcha yahoo.com> wrote:
 import std.stdio, std.perf;

 void main() {
     auto myRange = replicate(100_000_000, 0);

     scope pc = new PerformanceCounter;
     pc.start;
     foreach(i, num; enumerate(myRange)) {}
     pc.stop;
     writeln("Enumerate:  ", pc.milliseconds);

     pc.start;
     foreach(num; myRange) {}
     pc.stop;
     writeln("Raw Range:  ", pc.milliseconds);

     pc.start;
     foreach(i; 0..100_000_000) {
         int num = 0;
     }
     pc.stop;
     writeln("Plain old for loop:  ", pc.milliseconds);
 }

 Enumerate:  1207
 Raw Range:  940
 Plain old for loop:  112

 In other words, it's not a zero-cost abstraction, but it's not what you'd call
 expensive either, especially since in real-world code you'd actually have a loop
 body.  Also, apparently LDC inlines opApply, proving that a sufficiently smart but
 realistically implementable compiler can make this a zero-cost abstraction.
 (Someone who uses LDC please confirm this.)

 To put this in perspective, ranges are not a free abstraction on DMD either,
 probably because DMD's inliner isn't sufficiently aggressive.  Really, the raw
 range case shouldn't take nearly as long as it does either, as the plain old for
 loop test proves.

 IMHO the details of how DMD's optimizer currently works should not dictate the
 design of the standard library unless either performance is absurdly bad or we
 have good reason to believe that common implementations will never be any better.
 LDC proves that inlining opApply can be done.  If you have something that
 absolutely must be as fast as possible now, you may not want to use this (or
 ranges either), but in the bigger picture I think it's efficient enough to have a
 legitimate place in Phobos.
I agree. Those numbers don't seem so bad, particularly if inlining is possible in the future. But there's still the issue of how to get both an enumeration and an unpacked tuple together. --bb
Nov 09 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 I agree.  Those numbers don't seem so bad, particularly if inlining is
 possible in the future.
 But there's still the issue of how to get both an enumeration and an
 unpacked tuple together.
 --bb
Yeah, I thought about this, and I didn't have any good answer other than make an absurdly baroque enumerateUnpack function. I'd be open to suggestions here.
Nov 09 2009
parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Nov 9, 2009 at 6:46 AM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Bill Baxter (wbaxter gmail.com)'s article
 I agree.  Those numbers don't seem so bad, particularly if inlining is
 possible in the future.
 But there's still the issue of how to get both an enumeration and an
 unpacked tuple together.
 --bb
 Yeah, I thought about this, and I didn't have any good answer other than make an
 absurdly baroque enumerateUnpack function.  I'd be open to suggestions here.

unpack2nd(enumerate(zip(foo,bar))) ? :-P

Or more seriously, you could have a flatten() type of HOF, which flattens any tuples at the top level. So (1, Tuple(a,Tuple(b,c))) becomes (1,a,Tuple(b,c)).

--bb
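
One way the top-level flatten could look on a single tuple value, as a sketch only: flattenTop is a hypothetical name, and applying it across a range (the higher-order part) would be a map over this function.

```d
// Try with: dmd -unittest -main -run flatten_sketch.d
import std.typecons : isTuple, tuple;

// Hypothetical helper: splices tuples found at the top level into the
// result and leaves anything nested deeper untouched.
auto flattenTop(T)(T t) if (isTuple!T && T.Types.length > 0)
{
    // First field, spliced in if it is itself a tuple.
    static if (isTuple!(T.Types[0]))
        auto head = t[0];
    else
        auto head = tuple(t[0]);

    static if (T.Types.length == 1)
    {
        return head;
    }
    else
    {
        // Flatten the remaining fields the same way, then join the pieces.
        auto rest = flattenTop(tuple(t.expand[1 .. $]));
        return tuple(head.expand, rest.expand);
    }
}

unittest
{
    // (1, Tuple(a, Tuple(b, c))) becomes (1, a, Tuple(b, c)).
    auto nested = tuple(1, tuple(2.5, tuple("b", 'c')));
    assert(flattenTop(nested) == tuple(1, 2.5, tuple("b", 'c')));
}
```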
Nov 09 2009
prev sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Sun, Nov 8, 2009 at 22:43, dsimcha <dsimcha yahoo.com> wrote:

 Hot off the press and VERY prototype-ish:

 Code:
 http://pastebin.com/m2087e524

 Docs:
 http://cis.jhu.edu/~dsimcha/unpackEnumerate.html

Cool! Thanks a lot.

I looked at opApply for D1 maybe two years ago, but never used it myself. I'll go and read this part of the docs.

*test it*

Hmm, I can get enumerate to work, but the Unpack part doesn't compile. It complains elem.at!(0) is not an lvalue.

You know, the Proxy part of std.range.zip is really annoying. I'd prefer zip to return std.typecons.Tuples, even if that means stopping at the shortest range. That's what other languages do and it seems enough for most uses.

 Does this look like a good addition to std.range? The elegance of it is it solves
 the problem of providing syntactic sugar to ranges w/ zero ripple effects either
 in the compiler or in the rest of Phobos.  I'll file it somewhere more official
 after people review it a little and refine the idea, but I definitely think
 something similar to this has a legit place in std.range.

 If you're wondering how unpack works and don't want to grovel through all the
 code, it's tons of string mixin magic.  That's about the only way I was able to
 make it work.
I'll read it with pleasure(!). I'm there to learn anyway. Heck, I can't read my own string mixins a week after writing them, so it'll be a good exercise.

As to making it an addition to std.range, I'm all for it. If people don't want to use it because they are leery of using opApply, too bad for them. Your small benchmark was interesting: overhead exists, but it's not awful.

Philippe
Nov 11 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Philippe Sigaud (philippe.sigaud gmail.com)'s article
 On Sun, Nov 8, 2009 at 22:43, dsimcha <dsimcha yahoo.com> wrote:
 Hot off the press and VERY prototype-ish:

 Code:
 http://pastebin.com/m2087e524

 Docs:
 http://cis.jhu.edu/~dsimcha/unpackEnumerate.html
 Cool! Thanks a lot.
I looked at opApply for D1 maybe two years ago, but never used it myself. I'll go and read this part of the docs. *test it* Hmm, I can get enumerate to work, but the Unpack part doesn't compile. It complains elem.at!(0) is not an lvalue.
Argh. That's because I was hacking around with Zip in my copy of Phobos right before I wrote this lib and forgot to change some stuff back to stock when testing. If you uncomment a /*ref*/ in there somewhere, which was leftover from a compiler bug a long time ago, it seems to work. The real problem is that that bit of cruft hasn't been removed from Phobos yet.
Nov 11 2009
next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Nov 11, 2009 at 16:48, dsimcha <dsimcha yahoo.com> wrote:


 *test it*
 Hmm, I can get enumerate to work, but the Unpack part doesn't compile. It
 complains elem.at!(0) is not an lvalue.
Argh. That's because I was hacking around with Zip in my copy of Phobos right before I wrote this lib and forgot to change some stuff back to stock when testing.
How do you deal with successive versions of DMD? Do you have parallel installations? I know I hacked around Zip, Chain and such a few times, forgot about it, only to have it crushed by my next download :-(
  If you uncomment a /*ref*/ in there somewhere, which was leftover from a
 compiler bug a long time ago, it seems to work.  The real problem is that
 that bit
 of cruft hasn't been removed from Phobos yet.
OK, thanks. I get the impression that ref is viral... Either you have it everywhere or it'll block some compositions (chain(map(...)), ...). Am I wrong?

Thanks again for your code, I'll test it and tell you how it went.

Philippe
Nov 11 2009
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Nov 11, 2009 at 17:44, Philippe Sigaud <philippe.sigaud gmail.com> wrote:

 On Wed, Nov 11, 2009 at 16:48, dsimcha <dsimcha yahoo.com> wrote:

  If you uncomment a /*ref*/ in there somewhere, which was leftover from a
 compiler bug a long time ago, it seems to work.  The real problem is that
 that bit
 of cruft hasn't been removed from Phobos yet.
So, it works. Thanks!

Strangely, on my computer it's frequently faster to use your enumerate than a raw range (re: your speed tests). Is there something I don't get? I thought (i.e., read here) that opApply was slower than other means of iteration? My own enumerate, which produces a tuple(uint, T) as a lazy range, is thrice as slow :(

And, I just discovered that I can simply unpack a tuple with .field or .expand.

auto t = tuple('a', 1, 2.0);
int foo(char a, int b, double c) { return to!int(a) * b * to!int(c); }
foo(t.expand); // works.

Is that common knowledge? It'd be a nice addition to the std.typecons docs.

Gosh, when I think of the time I spent to find a way to extract this information with template recursion, to map an n-ary function on tuple-producing ranges. And all this time, I could have done tuple-mapping as a simple map on "foo(a.expand)". Time for some heavy refactoring...

Philippe
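
A short sketch of the .expand trick, including the "simple map" over a tuple-producing range mentioned above. It assumes std.typecons.tuple, std.range.zip and std.algorithm's map; foo uses casts in place of to!int, and scale is a made-up second function for the zip case.

```d
// Try with: dmd -unittest -main -run expand_sketch.d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.range : zip;
import std.typecons : tuple;

int foo(char a, int b, double c)
{
    return cast(int) a * b * cast(int) c;
}

// Illustrative two-argument function for the zip example.
double scale(int n, double factor)
{
    return n * factor;
}

unittest
{
    auto t = tuple('a', 1, 2.0);
    // expand passes the fields as three separate arguments.
    assert(foo(t.expand) == 194); // 'a' is 97, so 97 * 1 * 2

    // Mapping an n-ary function over a range of tuples.
    auto scaled = zip([1, 2], [2.0, 3.0]).map!(x => scale(x.expand));
    assert(equal(scaled, [2.0, 6.0]));
}
```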
Nov 11 2009
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:

 Or I suppose it could
 be a special syntax in "foreach" for now.
D already has a very large amount of "for now" inside. Let's start designing things correctly & tidy from the start instead, for a change (and let's generalize some of the already present things).

Bye,
bearophile
Nov 08 2009