www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - range and algorithm-related stuff

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I'm working on the new range stuff and the range-based algorithm. In all 
likelihood, you all might be pleased with the results.

I wanted to gauge opinions on a couple of issues. One is, should the 
empty() member function for ranges be const? On the face of it it 
should, but I don't want that to be a hindrance. I presume non-const 
empty might be necessary sometimes, e.g. figuring out if a stream is 
empty effectively means fetching an element off it.

Second, there are arguably some range-related constructs that do not 
really qualify as "algorithms" (some of these are inspired from 
Haskell's standard library):

1. repeat(x) => returns an infinite range consisting of the element x 
repeated.

http://www.zvon.org/other/haskell/Outputprelude/repeat_f.html

2. take(n, range) => takes at most n elements out of a range (very 
useful with infinite ranges!)

http://www.zvon.org/other/haskell/Outputprelude/take_f.html

3. cycle(range)

http://www.zvon.org/other/haskell/Outputprelude/cycle_f.html

and a few others. I defined a new module called std.range that contains 
range fundamentals. Should I put these functions in there, or in 
std.algorithm? Or should I just merge them both to avoid confusion? If 
not, where to I draw the line between "it's an algorithm" and "it's a 
range utility"?


Thanks,

Andrei
Jan 24 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I'm working on the new range stuff and the range-based algorithm. In all
 likelihood, you all might be pleased with the results.
 I wanted to gauge opinions on a couple of issues. One is, should the
 empty() member function for ranges be const? On the face of it it
 should, but I don't want that to be a hindrance. I presume non-const
 empty might be necessary sometimes, e.g. figuring out if a stream is
 empty effectively means fetching an element off it.
Play devil's advocate with me. Given that making empty() non-const would make the internal implementation of things more flexible, and that next() has to be non-const, so the range interface as a whole is non-const, what is the practical advantage to making empty() const?
 Second, there are arguably some range-related constructs that do not
 really qualify as "algorithms" (some of these are inspired from
 Haskell's standard library):
 1. repeat(x) => returns an infinite range consisting of the element x
 repeated.
 http://www.zvon.org/other/haskell/Outputprelude/repeat_f.html
 2. take(n, range) => takes at most n elements out of a range (very
 useful with infinite ranges!)
 http://www.zvon.org/other/haskell/Outputprelude/take_f.html
 3. cycle(range)
 http://www.zvon.org/other/haskell/Outputprelude/cycle_f.html
 and a few others. I defined a new module called std.range that contains
 range fundamentals. Should I put these functions in there, or in
 std.algorithm? Or should I just merge them both to avoid confusion? If
 not, where to I draw the line between "it's an algorithm" and "it's a
 range utility"?
I think these Haskell inspired functions belong in std.range. To me an algorithm is something that manipulates the contents of a range, i.e. the actual values contained in the range. A range utility is something that doesn't care what's in the range, and just affects the way the range is iterated over, etc. A good test for this would be to assume that the contents of the range are null references. A range utility would still work because it doesn't care what the actual contents are. An algorithm (at least one that does anything useful other than initialize the references) could not work in this case.
Jan 24 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I'm working on the new range stuff and the range-based algorithm. In all
 likelihood, you all might be pleased with the results.
 I wanted to gauge opinions on a couple of issues. One is, should the
 empty() member function for ranges be const? On the face of it it
 should, but I don't want that to be a hindrance. I presume non-const
 empty might be necessary sometimes, e.g. figuring out if a stream is
 empty effectively means fetching an element off it.
Play devil's advocate with me. Given that making empty() non-const would make the internal implementation of things more flexible, and that next() has to be non-const, so the range interface as a whole is non-const, what is the practical advantage to making empty() const?
If empty, head, toe, opIndex, and length are const, there's plenty you can do with the const range. The problem is that there are higher-order range that need to make assumptions about the underlying range. Consider a simple range Retro that iterates another range in reverse order: struct Retro(R) { private R _original; ... bool empty() { return _original.empty; } } Should Retro make empty const or non-const? If it does, then ranges that make const non-empty won't work with Retro. If it doesn't, then users can't use empty for a const Retro.
 Second, there are arguably some range-related constructs that do not
 really qualify as "algorithms" (some of these are inspired from
 Haskell's standard library):
 1. repeat(x) => returns an infinite range consisting of the element x
 repeated.
 http://www.zvon.org/other/haskell/Outputprelude/repeat_f.html
 2. take(n, range) => takes at most n elements out of a range (very
 useful with infinite ranges!)
 http://www.zvon.org/other/haskell/Outputprelude/take_f.html
 3. cycle(range)
 http://www.zvon.org/other/haskell/Outputprelude/cycle_f.html
 and a few others. I defined a new module called std.range that contains
 range fundamentals. Should I put these functions in there, or in
 std.algorithm? Or should I just merge them both to avoid confusion? If
 not, where to I draw the line between "it's an algorithm" and "it's a
 range utility"?
I think these Haskell inspired functions belong in std.range. To me an algorithm is something that manipulates the contents of a range, i.e. the actual values contained in the range. A range utility is something that doesn't care what's in the range, and just affects the way the range is iterated over, etc. A good test for this would be to assume that the contents of the range are null references. A range utility would still work because it doesn't care what the actual contents are. An algorithm (at least one that does anything useful other than initialize the references) could not work in this case.
That's an excellent principle! Range functions deal with the range's topology, algorithms deal with the range's content. Thanks a lot. Andrei
Jan 24 2009
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Andrei Alexandrescu wrote:
 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 article
 I'm working on the new range stuff and the range-based algorithm. In all
 likelihood, you all might be pleased with the results.
 I wanted to gauge opinions on a couple of issues. One is, should the
 empty() member function for ranges be const? On the face of it it
 should, but I don't want that to be a hindrance. I presume non-const
 empty might be necessary sometimes, e.g. figuring out if a stream is
 empty effectively means fetching an element off it.
Play devil's advocate with me. Given that making empty() non-const would make the internal implementation of things more flexible, and that next() has to be non-const, so the range interface as a whole is non-const, what is the practical advantage to making empty() const?
If empty, head, toe, opIndex, and length are const, there's plenty you can do with the const range. The problem is that there are higher-order range that need to make assumptions about the underlying range. Consider a simple range Retro that iterates another range in reverse order: struct Retro(R) { private R _original; ... bool empty() { return _original.empty; } } Should Retro make empty const or non-const? If it does, then ranges that make const non-empty won't work with Retro. If it doesn't, then users can't use empty for a const Retro.
Couldn't you do something like this? struct Retro(R) { private R _original; ... // Don't know how to write the isMemberConst test... static if( isMemberConst!( R, "empty" ) ) { const bool empty() { return _original.empty; } } else { bool empty() { return _original.empty; } } } Propogate const-ness where possible, but don't prevent it from functioning otherwise. Incidentally, I'd have proposed something like: protectionOf!(_original.empty) bool empty() { return _original.empty; } But I doubt that's possible with templates :P
 [snip]
I'm with dsimcha on the distinction between std.algorithm and std.range. -- Daniel
Jan 24 2009
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-01-24 20:09:07 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 I'm working on the new range stuff and the range-based algorithm. In 
 all likelihood, you all might be pleased with the results.
 
 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
Another case where you want to propagate the constness depending on the arguments... do we need something akin equivariant templates, with constness propagation?
 Second, there are arguably some range-related constructs that do not 
 really qualify as "algorithms" (some of these are inspired from 
 Haskell's standard library):
 
 1. repeat(x) => returns an infinite range consisting of the element x repeated.
 
 http://www.zvon.org/other/haskell/Outputprelude/repeat_f.html
 
 2. take(n, range) => takes at most n elements out of a range (very 
 useful with infinite ranges!)
 
 http://www.zvon.org/other/haskell/Outputprelude/take_f.html
 
 3. cycle(range)
 
 http://www.zvon.org/other/haskell/Outputprelude/cycle_f.html
 
 and a few others.
I'd add a second optional argument to repeat and cycle where you can specify how many times you want to loop. When argument is omitted, it's infinite.
 I defined a new module called std.range that contains range 
 fundamentals. Should I put these functions in there, or in 
 std.algorithm? Or should I just merge them both to avoid confusion? If 
 not, where to I draw the line between "it's an algorithm" and "it's a 
 range utility"?
I'd go with std.range. In fact, I'd remove std.algorithm and put everything into std.range. After all, all those algorithms apply to ranges. After all, if we are going to have algorithms for other thing than ranges -- like tuples -- then they should be in their own module -- like std.tuple -- no? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jan 24 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michel Fortin wrote:
 On 2009-01-24 20:09:07 -0500, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:
 
 I'm working on the new range stuff and the range-based algorithm. In 
 all likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
Another case where you want to propagate the constness depending on the arguments... do we need something akin equivariant templates, with constness propagation?
 Second, there are arguably some range-related constructs that do not 
 really qualify as "algorithms" (some of these are inspired from 
 Haskell's standard library):

 1. repeat(x) => returns an infinite range consisting of the element x 
 repeated.

 http://www.zvon.org/other/haskell/Outputprelude/repeat_f.html

 2. take(n, range) => takes at most n elements out of a range (very 
 useful with infinite ranges!)

 http://www.zvon.org/other/haskell/Outputprelude/take_f.html

 3. cycle(range)

 http://www.zvon.org/other/haskell/Outputprelude/cycle_f.html

 and a few others.
I'd add a second optional argument to repeat and cycle where you can specify how many times you want to loop. When argument is omitted, it's infinite.
Repeat a finite number of times is called replicate in Haskell, and the D implementation is eerily close to Haskell's: auto replicate(T)(size_t n, T value) { return take(n, repeat(value)); } It even compiles and runs. Just you can't document it because ddoc doesn't work with "auto" functions :o).
 I defined a new module called std.range that contains range 
 fundamentals. Should I put these functions in there, or in 
 std.algorithm? Or should I just merge them both to avoid confusion? If 
 not, where to I draw the line between "it's an algorithm" and "it's a 
 range utility"?
I'd go with std.range. In fact, I'd remove std.algorithm and put everything into std.range. After all, all those algorithms apply to ranges. After all, if we are going to have algorithms for other thing than ranges -- like tuples -- then they should be in their own module -- like std.tuple -- no?
I guess, but today there are algorithms that don't operate on ranges inside std.algorithm, such as move and swap. Andrei
Jan 24 2009
prev sibling next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 24 Jan 2009 17:09:07 -0800, Andrei Alexandrescu wrote:

 I'm working on the new range stuff and the range-based algorithm. In all 
 likelihood, you all might be pleased with the results.
 
 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
I have a hard time imagining a use for a const range. They're supposed to be structs, right? Const value argument is not a very useful idiom. Also you cannot do much with a const range. OTOH you never know what downs will invent next time. Supporting const transparency as much as possible seems to be the right solution.
Jan 25 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Sat, 24 Jan 2009 17:09:07 -0800, Andrei Alexandrescu wrote:
 
 I'm working on the new range stuff and the range-based algorithm. In all 
 likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
I have a hard time imagining a use for a const range.
Read-only arrays for example.
 They're supposed
 to be structs, right?
Yah.
 Const value argument is not a very useful idiom.
Not in C++ you mean! In D the ballgame is rather different.
 Also you cannot do much with a const range.
If it has random access, it's effectively a read-only array.
 OTOH you never know what downs will invent next time.  Supporting const
 transparency as much as possible seems to be the right solution.
Yah. And (as I also discovered yesterday without joy) ref transparency as well... Andrei
Jan 25 2009
parent reply Christopher Wright <dhasenan gmail.com> writes:
Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Sat, 24 Jan 2009 17:09:07 -0800, Andrei Alexandrescu wrote:

 I'm working on the new range stuff and the range-based algorithm. In 
 all likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
I have a hard time imagining a use for a const range.
Read-only arrays for example.
A range is essentially an iterator. It has to change its internal state to move to the next element. So a const range will not allow you to iterate over the members of a const array. It will allow you to iterate over a single element, either once or an infinite number of times. You could change ranges to have "Range next()" rather than "void next()", and that would allow you to use a const Range for reasons other than checking whether it's empty. Iterating over a range would then look like: for (auto range = getRange(); !range.empty; range = range.next) {} I don't see much purpose to this. If you want polymorphic ranges, you're going to use a class for ranges. This will incur a lot of allocations, which will be dog slow. The current design would only allocate once if your range is a class.
Jan 25 2009
next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sun, 25 Jan 2009 23:32:13 +0300, Christopher Wright <dhasenan gmail.com>
wrote:

 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Sat, 24 Jan 2009 17:09:07 -0800, Andrei Alexandrescu wrote:

 I'm working on the new range stuff and the range-based algorithm. In  
 all likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the  
 empty() member function for ranges be const? On the face of it it  
 should, but I don't want that to be a hindrance. I presume non-const  
 empty might be necessary sometimes, e.g. figuring out if a stream is  
 empty effectively means fetching an element off it.
I have a hard time imagining a use for a const range.
Read-only arrays for example.
A range is essentially an iterator. It has to change its internal state to move to the next element. So a const range will not allow you to iterate over the members of a const array. It will allow you to iterate over a single element, either once or an infinite number of times. You could change ranges to have "Range next()" rather than "void next()", and that would allow you to use a const Range for reasons other than checking whether it's empty. Iterating over a range would then look like: for (auto range = getRange(); !range.empty; range = range.next) {} I don't see much purpose to this. If you want polymorphic ranges, you're going to use a class for ranges. This will incur a lot of allocations, which will be dog slow. The current design would only allocate once if your range is a class.
I belive he was talking about random-access ranges. A range.subrange() could also be const, though.
Jan 25 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Sat, 24 Jan 2009 17:09:07 -0800, Andrei Alexandrescu wrote:

 I'm working on the new range stuff and the range-based algorithm. In 
 all likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
I have a hard time imagining a use for a const range.
Read-only arrays for example.
A range is essentially an iterator. It has to change its internal state to move to the next element. So a const range will not allow you to iterate over the members of a const array. It will allow you to iterate over a single element, either once or an infinite number of times.
A range offering random access can give me any element without the range actually changing.
 You could change ranges to have "Range next()" rather than "void 
 next()", and that would allow you to use a const Range for reasons other 
 than checking whether it's empty. Iterating over a range would then look 
 like:
 for (auto range = getRange(); !range.empty; range = range.next) {}
 
 I don't see much purpose to this. If you want polymorphic ranges, you're 
 going to use a class for ranges. This will incur a lot of allocations, 
 which will be dog slow. The current design would only allocate once if 
 your range is a class.
I agree. Range.next currently mutates the range in-place. Andrei
Jan 25 2009
parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sun, 25 Jan 2009 13:53:43 -0800, Andrei Alexandrescu wrote:

 Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Sat, 24 Jan 2009 17:09:07 -0800, Andrei Alexandrescu wrote:

 I'm working on the new range stuff and the range-based algorithm. In 
 all likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
I have a hard time imagining a use for a const range.
Read-only arrays for example.
A range is essentially an iterator. It has to change its internal state to move to the next element. So a const range will not allow you to iterate over the members of a const array. It will allow you to iterate over a single element, either once or an infinite number of times.
A range offering random access can give me any element without the range actually changing.
This restricts you to algorithms working exclusively with random-access ranges. Any more generic algos won't work with const ranges. How many of those do you have?
Jan 26 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Sun, 25 Jan 2009 13:53:43 -0800, Andrei Alexandrescu wrote:
 
 Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Sat, 24 Jan 2009 17:09:07 -0800, Andrei Alexandrescu wrote:

 I'm working on the new range stuff and the range-based algorithm. In 
 all likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
I have a hard time imagining a use for a const range.
Read-only arrays for example.
A range is essentially an iterator. It has to change its internal state to move to the next element. So a const range will not allow you to iterate over the members of a const array. It will allow you to iterate over a single element, either once or an infinite number of times.
A range offering random access can give me any element without the range actually changing.
This restricts you to algorithms working exclusively with random-access ranges. Any more generic algos won't work with const ranges. How many of those do you have?
Stable partition, everything related to binary search, heap functions, and radial iteration come to mind. Anyhow, for now let's go with requiring ref returns and we'll see how that fares. I'm almost done with std.algorithm. (I am cheating because Walter emailed me a fixed dmd that allows ref returns from template funs.) Very exciting. I will post the documentation as soon as I'm done. Andrei
Jan 26 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
You can take a look at my dlibs, they may give you ideas, because contain all
such lazy functions and some more.

I presume you have chosen to not tell apart the lazy functions from the eager
ones.


 2. take(n, range) => takes at most n elements out of a range (very 
 useful with infinite ranges!)
I think that an xslice is better than an xtake (it's essentially a lazy version of a more powerful version of the slicing that in D you can do on arrays). Bye, bearophile
Jan 25 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu: You can take a look at my dlibs, they may give
 you ideas, because contain all such lazy functions and some more.
You mean the libs in my signature? Sure. :o) In http://www.fantascienza.net/leonardo/so/dlibs/func.html, I like functions such as any and all; they aren't readily implementable by combining other primitives. Predicates will be passed as aliases though. Probably array and assign will make the cut as well. I like frequency a lot, would be very useful for my NLP code (although probably I'd replace it with counts and let the user normalize). Other are already present in similar forms and sometimes with different names (but I think that e.g. "chain" is better than my "span").
 I presume you have chosen to not tell apart the lazy functions from
 the eager ones.
Well, so far it's been pretty clear what should be lazy and what shouldn't, e.g. map is lazy and reduce is eager :o). Also, signature can also distinguish rather easily between lazy and eager. For example, you have lazyAnd and lazyOr, but there's no necessity for eagerAnd and eagerOr; those would be rather silly. You do have all and any, which is eager (in a sense), but also has a signature that makes it not compete with lazyAnd and lazyOr.
 2. take(n, range) => takes at most n elements out of a range (very
  useful with infinite ranges!)
I think that an xslice is better than an xtake (it's essentially a lazy version of a more powerful version of the slicing that in D you can do on arrays).
The problem with xslice is that it has a O(n) setup time on a input or forward range, and that that cost is somewhat hidden in a non-decomposable manner. People who need something like slice can call drop to advance the range in documented linear time, and then use take. Andrei -- http://www.fantascienza.net/leonardo/so/dlibs/all.html
Jan 26 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 You mean the libs in my signature? Sure. :o)
:-]
 I like frequency a 
 lot, would be very useful for my NLP code (although probably I'd replace 
 it with counts and let the user normalize).
It doesn't perform a normalization (I can add to it few more functionalities can be added, but not this one), I'll improve its documentation.
 Also, signature can also distinguish rather easily between lazy and eager.
Well, if you are using an IDE that's easy. Otherwise you may need your memory or a manual. Several of my lazy functions are inspired by: http://docs.python.org/library/itertools.html I don't like to show people programs that require my large dlibs to run, so I hope 80-90+% of my dlibs will be obsolete in D2. D2 supports immutable data structures, so the D Std lib may gain some immutable data structures, like finger trees, etc. Multi-threading, pure functions, immutable data structures, and lazy computations are forms of computation quite different from the usual ones done in C. I am sure the current D2 Std lib is using only a small part of the potential usages of such programming styles :-) I think some of such usages have yet to be invented (when C++ developers have added templates to C++ surely they didn't know all the weird and creative usages of them used for example in Blitz++ or Boost). Seeing the work of Walter, your work here, and the community in this newsgroup, and on the other hand seeing how a significant percentage of computer science is done in academy, and what it produces, I'd say that there's more innovation and creativity here :-) Lot of CS is out of the world, not doing useful things, and sometimes not even much intelligent. Yet, I know many smart folks working in CS in academy, and I think the design of D2 may be improved by some "hard" thinking done by that researchers. It's a pity there are so little communication between the two groups. Bye, bearophile
Jan 26 2009
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
bearophile:
 and sometimes not even much intelligent.
I meant: "sometimes not even much interesting."
Jan 26 2009
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 I like frequency a lot, would be very useful for my NLP code
 (although probably I'd replace it with counts and let the user
 normalize).
It doesn't perform a normalization (I can add to it few more functionalities can be added, but not this one), I'll improve its documentation.
Oh, it's the name that threw me off. I automatically assumed that "frequency" is normalized counts.
 Several of my lazy functions are inspired by: 
 http://docs.python.org/library/itertools.html
 
 I don't like to show people programs that require my large dlibs to
 run, so I hope 80-90+% of my dlibs will be obsolete in D2.
 
 D2 supports immutable data structures, so the D Std lib may gain some
 immutable data structures, like finger trees, etc.
 
 Multi-threading, pure functions, immutable data structures, and lazy
 computations are forms of computation quite different from the usual
 ones done in C. I am sure the current D2 Std lib is using only a
 small part of the potential usages of such programming styles :-) I
 think some of such usages have yet to be invented (when C++
 developers have added templates to C++ surely they didn't know all
 the weird and creative usages of them used for example in Blitz++ or
 Boost).
 
 Seeing the work of Walter, your work here, and the community in this
 newsgroup, and on the other hand seeing how a significant percentage
 of computer science is done in academy, and what it produces, I'd say
 that there's more innovation and creativity here :-) Lot of CS is out
 of the world, not doing useful things, and sometimes not even much
 intelligent. Yet, I know many smart folks working in CS in academy,
 and I think the design of D2 may be improved by some "hard" thinking
 done by that researchers. It's a pity there are so little
 communication between the two groups.
 
 Bye, bearophile
That's good to read, and a definite departure from your discouragement of a few weeks ago. I agree, there are vast areas as of yet unexplored in D2, and compile-time introspection will open up even many more. (By the way, I figured how to do introspection, I need to discuss details with Walter and Sean after which it'll be "a simple matter of implementation".) I agree that run-of-the-mill academic work may not be that good, but then there's a lot of run-of-the-mill practitioner work. I'd say our work on D2 as of now is not quite up to par with top academic work. The communication is mostly due to us doing little to publicize D2. But that will improve starting soon. Andrei
Jan 26 2009
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 I'm working on the new range stuff and the range-based algorithm. In all 
 likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it should, 
 but I don't want that to be a hindrance. I presume non-const empty might 
 be necessary sometimes, e.g. figuring out if a stream is empty effectively 
 means fetching an element off it.
Ranges are structs. It should not matter if you want to make some const and some non-const. Basically, it depends on the range implementation. If you can make it const, make it const, if not, don't make it const. It shouldn't break any APIs. For example, an array range might have empty be const, but a stream range might not. What matters is what functions you can use those ranges in, but those are generally templated functions, so the compiler will tell you whether it can be used or not when it tries to compile it. Personally, I see no benefit to having empty() be const. What benefits do you gain by specifically making empty const and the other functions not const? Presumably, the underlying container must be not const in order for head, next, etc. to work properly, so there is no requirement there. -Steve
Jan 26 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 I'm working on the new range stuff and the range-based algorithm. In all 
 likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it should, 
 but I don't want that to be a hindrance. I presume non-const empty might 
 be necessary sometimes, e.g. figuring out if a stream is empty effectively 
 means fetching an element off it.
Ranges are structs. It should not matter if you want to make some const and some non-const. Basically, it depends on the range implementation. If you can make it const, make it const, if not, don't make it const. It shouldn't break any APIs.
The problem is "higher-order" ranges - ranges that take other ranges as argument. For example, consider Retro, a range that iterates another range backwards. struct Retro(Range) { Range _input; ... bool empty() { return _input.empty; } } If Retro.empty is const and Range.empty is not, that won't compile. If Retro.empty is non-const and Range.empty is const, it will compile, but passing a const Retro won't work as well as passing a const Range.
 For example, an array range might have empty be const, but a stream range 
 might not.  What matters is what functions you can use those ranges in, but 
 those are generally templated functions, so the compiler will tell you 
 whether it can be used or not when it tries to compile it.
 
 Personally, I see no benefit to having empty() be const.  What benefits do 
 you gain by specifically making empty const and the other functions not 
 const?  Presumably, the underlying container must be not const in order for 
 head, next, etc. to work properly, so there is no requirement there.
If you have a constant range with random access, empty, length, and opIndex should be enough for you to look at anything you want without altering the range itself. Andrei
Jan 26 2009
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 I'm working on the new range stuff and the range-based algorithm. In all 
 likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it 
 should, but I don't want that to be a hindrance. I presume non-const 
 empty might be necessary sometimes, e.g. figuring out if a stream is 
 empty effectively means fetching an element off it.
Ranges are structs. It should not matter if you want to make some const and some non-const. Basically, it depends on the range implementation. If you can make it const, make it const, if not, don't make it const. It shouldn't break any APIs.
The problem is "higher-order" ranges - ranges that take other ranges as argument. For example, consider Retro, a range that iterates another range backwards. struct Retro(Range) { Range _input; ... bool empty() { return _input.empty; } } If Retro.empty is const and Range.empty is not, that won't compile. If Retro.empty is non-const and Range.empty is const, it will compile, but passing a const Retro won't work as well as passing a const Range.
Hm... you need the template code to take the place of the designer who looks at code and decides that something should or should not be const. For the case of Retro, you'd need some sort of const detection on Range's empty function. But in general, when is it necessary for Range.empty to be const (and therefore Retro.empty to be const)? If Range is working with a const container, wouldn't you expect that Range itself would not be const, just the reference to the container is const? A range that doesn't mutate doesn't seem like a very useful idiom.
 For example, an array range might have empty be const, but a stream range 
 might not.  What matters is what functions you can use those ranges in, 
 but those are generally templated functions, so the compiler will tell 
 you whether it can be used or not when it tries to compile it.

 Personally, I see no benefit to having empty() be const.  What benefits 
 do you gain by specifically making empty const and the other functions 
 not const?  Presumably, the underlying container must be not const in 
 order for head, next, etc. to work properly, so there is no requirement 
 there.
If you have a constant range with random access, empty, length, and opIndex should be enough for you to look at anything you want without altering the range itself.
Why not just pass the container itself if the range isn't going to mutate? The container probably already has opIndex, and length (empty AFAIK plays no role here). If it doesn't it seems like a bad design to me. If you have a good example where this is useful, I'd like to hear it. -Steve
Jan 26 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 If Retro.empty is const and Range.empty is not, that won't compile. If 
 Retro.empty is non-const and Range.empty is const, it will compile, but 
 passing a const Retro won't work as well as passing a const Range.
Hm... you need the template code to take the place of the designer who looks at code and decides that something should or should not be const. For the case of Retro, you'd need some sort of const detection on Range's empty function.
Exactly. A sort of "auto-qualify". But I guess at that point some people will throw their hands in the air, and those who'd thrown their hands already will now throw their feet too. :o)
 But in general, when is it necessary for Range.empty to be const (and 
 therefore Retro.empty to be const)?  If Range is working with a const 
 container, wouldn't you expect that Range itself would not be const, just 
 the reference to the container is const?  A range that doesn't mutate 
 doesn't seem like a very useful idiom.
I agree, but then I don't want to make many assumptions about future designs. In general I'm not sure we can always assume the type of the range and the environment make it easy to obtain mutable ranges for immutable elements. Probably many people would expect to just write this: void foo(in SomeRange range) { ... } and have foo look at the range and its elements no problem, without actually iterating through the range.
 For example, an array range might have empty be const, but a stream range 
 might not.  What matters is what functions you can use those ranges in, 
 but those are generally templated functions, so the compiler will tell 
 you whether it can be used or not when it tries to compile it.

 Personally, I see no benefit to having empty() be const.  What benefits 
 do you gain by specifically making empty const and the other functions 
 not const?  Presumably, the underlying container must be not const in 
 order for head, next, etc. to work properly, so there is no requirement 
 there.
If you have a constant range with random access, empty, length, and opIndex should be enough for you to look at anything you want without altering the range itself.
Why not just pass the container itself if the range isn't going to mutate? The container probably already has opIndex, and length (empty AFAIK plays no role here). If it doesn't it seems like a bad design to me. If you have a good example where this is useful, I'd like to hear it.
I need to think that through. Andrei
Jan 26 2009
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Probably many people would expect to just write this:

 void foo(in SomeRange range) { ... }

 and have foo look at the range and its elements no problem, without 
 actually iterating through the range.
Consider that some type const Range!(T), where T is a reference that contains the data to be iterated can be copied without casting to a Range!(const T) I think you even filed an enhancement on this, if I'm not mistaken. So a void foo(in SomeRange range) can be rewritten as void foo(ConstSomeRange), and you should technically be able to pass a const(SomeRange) into it. This is because a range is a struct, and aside from the reference to the data, everything else is stack local. Assuming ConstSomeRange is the same as SomeRange but with the data reference being const. -Steve
Jan 26 2009
prev sibling parent reply Denis Koroskin <2korden gmail.com> writes:
Steven Schveighoffer Wrote:

 "Andrei Alexandrescu" wrote
 I'm working on the new range stuff and the range-based algorithm. In all 
 likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the 
 empty() member function for ranges be const? On the face of it it should, 
 but I don't want that to be a hindrance. I presume non-const empty might 
 be necessary sometimes, e.g. figuring out if a stream is empty effectively 
 means fetching an element off it.
Ranges are structs. It should not matter if you want to make some const and some non-const. Basically, it depends on the range implementation. If you can make it const, make it const, if not, don't make it const. It shouldn't break any APIs. For example, an array range might have empty be const, but a stream range might not. What matters is what functions you can use those ranges in, but those are generally templated functions, so the compiler will tell you whether it can be used or not when it tries to compile it. Personally, I see no benefit to having empty() be const. What benefits do you gain by specifically making empty const and the other functions not const? Presumably, the underlying container must be not const in order for head, next, etc. to work properly, so there is no requirement there. -Steve
Checking if a range is empty() prior to accessing its head is useful. If empty() is const, you can't do that.
Jan 26 2009
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Mon, 26 Jan 2009 23:37:25 +0300, Denis Koroskin <2korden gmail.com> wrote:

 Steven Schveighoffer Wrote:

 "Andrei Alexandrescu" wrote
 I'm working on the new range stuff and the range-based algorithm. In  
all
 likelihood, you all might be pleased with the results.

 I wanted to gauge opinions on a couple of issues. One is, should the
 empty() member function for ranges be const? On the face of it it  
should,
 but I don't want that to be a hindrance. I presume non-const empty  
might
 be necessary sometimes, e.g. figuring out if a stream is empty  
effectively
 means fetching an element off it.
Ranges are structs. It should not matter if you want to make some const and some non-const. Basically, it depends on the range implementation. If you can make it const, make it const, if not, don't make it const. It shouldn't break any APIs. For example, an array range might have empty be const, but a stream range might not. What matters is what functions you can use those ranges in, but those are generally templated functions, so the compiler will tell you whether it can be used or not when it tries to compile it. Personally, I see no benefit to having empty() be const. What benefits do you gain by specifically making empty const and the other functions not const? Presumably, the underlying container must be not const in order for head, next, etc. to work properly, so there is no requirement there. -Steve
Checking if a range is empty() prior to accessing its head is useful. If empty() is const, you can't do that.
Err.. if empty() is not const and you have a const range reference.
Jan 26 2009
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Denis Koroskin" wrote
 On Mon, 26 Jan 2009 23:37:25 +0300, Denis Koroskin <2korden gmail.com> 
 wrote:
 Checking if a range is empty() prior to accessing its head is useful. If 
 empty() is const, you can't do that.
Err.. if empty() is not const and you have a const range reference.
empty not being const does not imply that you can't access a const member. Empty not being const allows all access. The question is whether a wrapping range (or any range for that matter) should implement empty as const, as an empty that IS const cannot call a non-const function of a member. -Steve
Jan 26 2009
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 27 Jan 2009 05:13:31 +0300, Steven Schveighoffer <schveiguy yahoo.com>
wrote:

 "Denis Koroskin" wrote
 On Mon, 26 Jan 2009 23:37:25 +0300, Denis Koroskin <2korden gmail.com>
 wrote:
 Checking if a range is empty() prior to accessing its head is useful.  
 If
 empty() is const, you can't do that.
Err.. if empty() is not const and you have a const range reference.
empty not being const does not imply that you can't access a const member. Empty not being const allows all access. The question is whether a wrapping range (or any range for that matter) should implement empty as const, as an empty that IS const cannot call a non-const function of a member. -Steve
I didn't say non-const empty restricts any member access, did I? My point is, if empty() returns false (and you can't know it because empty() is not const) you may get either an invalid result, access violation or an exception on member access: void foo(Range)(const Range r) { // is r empty? can I /safely/ access its head? // let's cross our fingers and try it // hoping that it is either not empty or throws an exception auto head = r.head; ... }
Jan 27 2009
next sibling parent Christopher Wright <dhasenan gmail.com> writes:
Denis Koroskin wrote:
 My point is, if empty() returns false (and you can't know it because 
 empty() is not const) you may get either an invalid result, access 
 violation or an exception on member access:
 
 void foo(Range)(const Range r)
 {
    // is r empty? can I /safely/ access its head?
    // let's cross our fingers and try it
    // hoping that it is either not empty or throws an exception
    auto head = r.head;
    ...
 }
 
Then it's good design to implement empty() as const. In some cases, you might not be able to (I believe someone mentioned IO streams as such a case). There, empty() might also throw.
Jan 27 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Denis Koroskin" wrote
 On Tue, 27 Jan 2009 05:13:31 +0300, Steven Schveighoffer 
 <schveiguy yahoo.com> wrote:

 "Denis Koroskin" wrote
 On Mon, 26 Jan 2009 23:37:25 +0300, Denis Koroskin <2korden gmail.com>
 wrote:
 Checking if a range is empty() prior to accessing its head is useful. 
 If
 empty() is const, you can't do that.
Err.. if empty() is not const and you have a const range reference.
empty not being const does not imply that you can't access a const member. Empty not being const allows all access. The question is whether a wrapping range (or any range for that matter) should implement empty as const, as an empty that IS const cannot call a non-const function of a member. -Steve
I didn't say non-const empty restricts any member access, did I?
Sorry, I sort of misinterpreted what you said. You said, "If empty() is const you can't do that." and then sent out a followup email saying "Err.. if empty() is not const and you have a const range reference." I interpreted that as you replacing the statement "If empty() is const" with "if empty() is not const and you have a const range reference," making the complete statement "if empty() is not const and you have a const range reference, you can't do that." Which means I thought you meant, if you have an empty that is not const and a const range reference (which I interpreted as the reference to the data inside the range), then empty cannot access the const member. I see now what you were saying. However, I don't think this is a common situation. A range is a struct so it can be a value type. Having a const reference to a range means you can easily copy everything out of the const range to a non-const range with a const reference to the data. Note that if I parameterize my range based on the data type, const Range!(T) should be implicitly convertable to Range!(const T), where the range itself is made mutable, but the data reference is const. So there is little point to making a range itself const. That being said, I'll also point out that creating a const range reference to access a single element from a range is a very questionable practice, the design might be better served to just access that element directly from the data container itself. Similarly with random access, any container that provides a random access range to its data will most likely also provide an opIndex to get the data directly. Ranges being structs, they are only really useful in functions that already know the data type, or template functions which are told the data type. I don't see any reason why you can't substitute the data structure itself that supplies a length and opIndex member instead of the range. -Steve
Jan 27 2009
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

Probably array and assign will make the cut as well.<
assign() of my libs is an ugly hack, I'd like such feature to be built-in in the language, with the comma syntax: a, b = foo(...) I use such syntax all the time in Python and it's quite handy. Something like my Record/record can be quite useful as built-in with a destructuring syntax too.
Other are already present in similar forms and sometimes with different names
(but I think that e.g. "chain" is better than my "span").<
As you may have seen xchain allows to chain two or more lazy or eager iterables. If you nest xchains, they look inside what you give them, and simplify, so if you write: xchain(a, xchain(b, c)) In my libs that's exactly as writing: xchain(a, b, c) That improves performance. In my libs there is also the Chainable!() mixing, that you can put into your iterable classes (and all lazy generators contain it already), so you can already to: auto d = xrange(5) ~ xfilter([1,0,3]) ~ [3, 5]; In D2 I think such chaining syntax can be built-in. There is already the relative class method to overload.
The problem with xslice is that it has a O(n) setup time on a input or forward
range, and that that cost is somewhat hidden in a non-decomposable manner.
People who need something like slice can call drop to advance the range in
documented linear time, and then use take.<
In the end what I was looking for is this syntax: auto e = xfilter([1,0,3,5,7]); // now array(e) == [1, 3, 5, 7] auto e2 = e[1..3]; // now array(e2) == [3, 5] That is, to allow the concat and slicing syntax among lazy (and eager) iterables too. Such stuff is present in Python and Haskell (but in Python the syntax is a bit worse, you need xchain and islice functions to do that). Also, take a look at my boolean() function, it's quite handy. ------------------
Oh, it's the name that threw me off. I automatically assumed that "frequency"
is normalized counts.<
You are right. I have an idea regarding how to fix that: I already have a count() function that given an iterable (lazy or not) and an item, counts how many of such items are present, and returns their count. So I can remove the frequency() function and make count more flexible: when you give one item (beside the iterable), it gives its count, when you give just the iterable it returns the associative arrays of items:count. An optimization: currently D associative arrays are very slow when they are small (you can scan an array of length about 85+ items in the same time you can access an associative array of 85 pairs), so at the beginning, when the number of item-count pairs is low it uses internally just one array (or an ArrayBuilder). And switches to using an associative array when the number of keys is more than about 50. At the end such improved count()returns an an associative array when you give it only an iterable (so creates the associative array if not already present). -------------- Regarding such topic, there are other things that will be very handy (I have already explained such things in the past, but if you want more info ask): - Improve a lot the == among arrays and among associative arrays and among structs. This is a large hole in the language. - foreach scan on associative arrays keys first, and values then. This will simplify *LOT* of code. - A very clean syntax to define lazy/eager arrays/iterables (like in Python) as single expressions. - A clean syntax to define lazy functions, better than the current opApply. - More intelligent printing function. At the moment D2 needs to become a little more handy regarding those things. The most common idioms must be so simple that you don't need to look for them into the manual each time. A purpose of a good language is to "force" you to look in the manual only when you are doing something uncommon (for example for performance). Bye, bearophile
Jan 26 2009
parent bearophile <bearophileHUGS lycos.com> writes:
I was forgetting, Clojure (the nice lisp variant that works on the JavaVM) also
has pmap(), a parallel map() function, that executes the functions in parallel,
on different CPU cores. Such stuff can become quite useful in D2 too:

http://clojure.org/api#toc571

Bye,
bearophile
Jan 26 2009