digitalmars.D - [RFC] Add an operator for ranges to D. Pros and cons?


"Dejan Lekic" <dejan.lekic gmail.com> writes:

Dear D community, I do not know about you, but I certainly do not 
like writing code like:

inRange.fooRange(param).barRange
   .bazRange(param1, param2).outRange;

I also tried using the operators ">>" and "~", but these make it 
confusing and hard to understand what the statement actually does.

Therefore I would like to know what you think about the idea 
of having an additional operator made exclusively for ranges. This 
operator would make it obvious that data are "streamed" (for lack 
of a better term) among ranges.

The first name I could come up with was "opArrow", but "opData" 
could also be okay, and the operator would be either "~>" or "->".

This would give us an obvious, unambiguous statement:

Console.in ~> filter1(param) ~> fooRange ~> Console.out;
// Console is an imaginary class/struct

Or:
arr ~> odd ~> random ~> randomOdd;

I humbly believe that ranges are one of the most important 
concepts in D. That, plus the increase in readability, gives two 
valid reasons for having this new operator.

I am also asking this because my point of view is strictly 
pragmatic: there may be technical reasons why we should not have 
this, or why we should do it some other way, so please 
share your opinion.

Kind regards

Nov 07 2012
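
For readers wondering what the ">>" experiment mentioned above might look like in today's D, here is a minimal sketch. It is not the poster's actual code: the `Stage` wrapper and the stage names `odd` and `squared` are hypothetical and exist only for this illustration. It relies on `opBinaryRight`, which D tries when the left operand has no matching `opBinary`.

```d
import std.algorithm : equal, filter, map;
import std.range : isInputRange;

// Hypothetical wrapper (not part of Phobos): turns a range-to-range function
// into a pipeline stage that can sit on the right-hand side of ">>".
struct Stage(alias fun)
{
    auto opBinaryRight(string op, R)(R input)
        if (op == ">>" && isInputRange!R)
    {
        return fun(input);
    }
}

// Illustrative stages; the names are made up for this sketch.
enum odd = Stage!(r => r.filter!(n => n % 2 == 1))();
enum squared = Stage!(r => r.map!(n => n * n))();

void main()
{
    int[] arr = [1, 2, 3, 4, 5];

    // Reads left to right, like the proposed "arr ~> odd ~> ...".
    auto result = arr >> odd >> squared;
    assert(result.equal([1, 9, 25]));
}
```

The sketch only changes how the chain is spelled. Each stage still calls the same lazy Phobos functions that the dotted form would call.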

"Dejan Lekic" <dejan.lekic gmail.com> writes:

On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic wrote:
 Dear D community, I do not know about You, but I certainly do 
 not like writing code like:

 inRange.fooRange(param).barRange
   .bazRange(param1, param2).outRange;

 I also tried to use operators ">>" and "~" but these make it 
 confusing and hard to understand what the statement actually 
 does.

 Therefore I would like to know what do you think about the idea 
 of having additional operator exclusively made for ranges? This 
 operator would make it obvious that data are "streamed" (lack 
 of better term) among ranges.

 The first name I could come up with was "opArrow" but "opData" 
 could also be okay, and operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

 Or:
 arr ~> odd ~> random ~> randomOdd;

 I humbly believe that ranges are one of the most important 
 concepts in D and that, plus the readability increase are two 
 valid reasons for having this new operator.

 I am also asking this because my point of view is strictly 
 pragmatic - there may be technical reasons why we should not 
 have this, or why we should have it done some other way, so 
 please share your opinion.

 Kind regards

EDIT:

I really dislike that word "streamed" that I used. "Chained" 
would perhaps be a better one. :)

Nov 07 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic wrote:
 Therefore I would like to know what do you think about the idea 
 of having additional operator exclusively made for ranges? This 
 operator would make it obvious that data are "streamed" (lack 
 of better term) among ranges.

 The first name I could come up with was "opArrow" but "opData" 
 could also be okay, and operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

 Or:
 arr ~> odd ~> random ~> randomOdd;

I'm confused. So does this new operator just do the same thing as 
dot, but only work with ranges? Or does it have additional useful 
semantics?

Nov 07 2012

"Dejan Lekic" <dejan.lekic gmail.com> writes:

On Wednesday, 7 November 2012 at 13:21:36 UTC, Peter Alexander 
wrote:
 On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic 
 wrote:
 Therefore I would like to know what do you think about the 
 idea of having additional operator exclusively made for 
 ranges? This operator would make it obvious that data are 
 "streamed" (lack of better term) among ranges.

 The first name I could come up with was "opArrow" but "opData" 
 could also be okay, and operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

 Or:
 arr ~> odd ~> random ~> randomOdd;

 I'm confused. So does this new operator just do the same thing 
 as dot, but only work with ranges? Or does it have additional 
 useful semantics?

UFCS is what produces the code mess I started with.
Imagine ranges that are members of some objects. I already gave the 
example of Console.in and Console.out. But say they are nested even 
deeper, so you have to refer to them using obj1.member.range 
notation, and now imagine using the dot operator in some complex 
operation where you chain 5 or more ranges... All those 
dots and parentheses can make your head boil (at least they make my 
head boil, not to mention that my colleague can't easily 
understand such a statement written using UFCS).

Nov 07 2012

"Tavi Cacina" <octavian.cacina outlook.com> writes:

On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic wrote:
 I humbly believe that ranges are one of the most important 
 concepts in D

Yeah, range chaining is quite cool. I am just starting with 
D, and I find this formatting the most appealing (taken from a comment 
on an article about components).

   Console.in         // get some input
   .filter1(param)    // filter it based on X
   .fooRange          // tweak it some more
   .Console.out;      // beam it up, scotty

It lets you describe the chain in place. You do have a point, 
though: if you are combining the ranges in a long 'sausage', a 
dedicated operator may increase readability.

Nov 07 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Dejan Lekic:

 Dear D community, I do not know about You, but I certainly do 
 not like writing code like:

 inRange.fooRange(param).barRange
   .bazRange(param1, param2).outRange;

I suggest formatting it this way; it's more readable:

auto something = inRange
                  .fooRange(param)
                  .barRange()
                  .bazRange(param1, param2)
                  .outRange();


 Therefore I would like to know what do you think about the idea 
 of having additional operator exclusively made for ranges? This 
 operator would make it obvious that data are "streamed" (lack 
 of better term) among ranges.

 The first name I could come up with was "opArrow" but "opData" 
 could also be okay, and operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

I don't think it gives a significant improvement. But maybe
there are more interesting use cases.

I'd like D ranges to support "~" (using a template mixin to
give them such an operator), acting like chain. So instead of
writing:

range1.chain(range2)

You write:

range1 ~ range2

It would also be nice to have lazy lists, maybe based on fibers, with
a few operators to concatenate them, etc.

Bye,
bearophile

Nov 07 2012
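
A minimal sketch of how the "~" idea above could be tried out, assuming a hypothetical `Concatenable` mixin template and a made-up `Counter` range (neither is part of Phobos). The only library pieces used are `std.range.chain`, `std.range.isInputRange` and `std.algorithm.equal`.

```d
import std.algorithm : equal;
import std.range : chain, isInputRange;

// Hypothetical mixin (not part of Phobos): gives a range type a binary "~"
// that lazily concatenates it with any other input range via chain.
mixin template Concatenable()
{
    auto opBinary(string op, R)(R rhs)
        if (op == "~" && isInputRange!R)
    {
        return chain(this, rhs);
    }
}

// Made-up example range: counts from `current` up to, but excluding, `end`.
struct Counter
{
    int current;
    int end;

    @property bool empty() const { return current >= end; }
    @property int front() const { return current; }
    void popFront() { ++current; }

    mixin Concatenable;
}

void main()
{
    // Equivalent to Counter(0, 3).chain(Counter(10, 12)).
    auto joined = Counter(0, 3) ~ Counter(10, 12);
    assert(joined.equal([0, 1, 2, 10, 11]));
}
```

Note that this only works for user-defined range types that mix the operator in. For built-in arrays, "~" keeps its existing meaning and allocates a new array, whereas `chain` is lazy and allocates nothing.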

Dejan Lekic <dejan.lekic gmail.com> writes:

bearophile wrote:

 Dejan Lekic:
 
 Dear D community, I do not know about You, but I certainly do
 not like writing code like:

 inRange.fooRange(param).barRange
   .bazRange(param1, param2).outRange;

 
 I suggest to format it this way, it's more readable:
 
 auto something = inRange
                   .fooRange(param)
                   .barRange()
                   .bazRange(param1, param2)
                   .outRange();
 
 
 Therefore I would like to know what do you think about the idea
 of having additional operator exclusively made for ranges? This
 operator would make it obvious that data are "streamed" (lack
 of better term) among ranges.

 The first name I could come up with was "opArrow" but "opData"
 could also be okay, and operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

 
 I think it doesn't give a significant improvement. But maybe
 there are more interesting use cases.
 
 I'd like D ranges to support the "~" (using a template mixin to
 give them such operator), that acts like chain. So instead of
 writing:
 
 range1.chain(range2)
 
 You write:
 
 range1 ~ range2
 
 It's also nice to have lazy lists, maybe based on fibers, with
 few operators to concat them, etc.
 
 Bye,
 bearophile

I already tried using the tilde operator for a while, then I realised that 
people were getting confused, thinking the line was concatenating strings, 
only to realise later that those are ranges... That is exactly the reason why 
I asked the D community what they think about having a new operator only for 
ranges...

I also do what you suggest quite a lot. In fact, I write it almost the same way 
you do in your example. But think about a potential scenario where you pass 
members of some structure as parameters:

auto something = inRange
                   .fooRange(someObject.someMember.membersMember)
                   .barRange(SomeClass.staticMember)
                   .bazRange(/* etc */)
                   .jarRange(param1, param2)
                   .outRange(/* etc */);

Moreover, what if the developer does not add "Range" to the name (the typical 
case)? Imagine the confusion between such UFCS methods and properties...

-- 
Dejan Lekic - http://dejan.lekic.org

Nov 07 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Dejan Lekic:

 I already tried using the tilde operator for a while, then I 
 realised that people were getting confused, thinking the line 
 was concatenating strings, only to realise later that those 
 are ranges...

"~" is used for all arrays (while array.Appender uses put for 
mysterious reasons). If more and more D code starts using "~" to 
concatenate ranges or arrays, I think D programmers will get used 
to this more general meaning.

Bye,
bearophile

Nov 07 2012

Dejan Lekic <dejan.lekic gmail.com> writes:

bearophile wrote:

 Dejan Lekic:
 
 I already tried using the tilde operator for a while, then I
 realised that people were getting confused, thinking the line
 was concatenating strings, only to realise later that those
 are ranges...

 
 "~" is used for all arrays (while array.Appender used put for
 mysterious reasons). If more and more D code starts using "~" to
 concatenate ranges or arrays, I think D programmers will get used
 to this more general meaning.
 
 Bye,
 bearophile

I fear that ~ with arrays has different semantics.

-- 
Dejan Lekic - http://dejan.lekic.org

Nov 07 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, November 07, 2012 14:07:12 Dejan Lekic wrote:
 Therefore I would like to know what do you think about the idea
 of having additional operator exclusively made for ranges? This
 operator would make it obvious that data are "streamed" (lack of
 better term) among ranges.

As far as I can tell, it adds zero functionality. It's purely a matter of 
trying to create cleaner looking code. That being the case, I would think that 
suggestions like

auto something = inRange
 .fooRange(param)
 .barRange()
 .bazRange(param1, param2)
 .outRange();

solve the problem quite nicely, though honestly, I have no problem with 
simply doing

auto something = outRange(bazRange(barRange(fooRange(inRange, param)), param1, param2));

though with that many chained items and several of them taking multiple 
parameters, something like

auto tempSomething = barRange(fooRange(inRange, param));
auto something = outRange(bazRange(tempSomething, param1, param2));

would probably be better. The first approach using UFCS seems rather popular 
though, and it's _very_ clean. My main gripe with it is that the flow is 
backwards, but I seem to be in the minority in thinking that.

Regardless, there are ways to format code so that it's quite clean without 
making any language changes. I don't see how adding an operator would really 
help. It just complicates the language further.

- Jonathan M Davis

Nov 07 2012
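
To make the comparison above concrete, here is a small sketch that uses real Phobos functions in place of the placeholder fooRange/barRange names. The function names `oddSquaresUfcs` and `oddSquaresNested` are made up for this illustration. Both spellings build the same lazy pipeline, since UFCS only rewrites `a.f(b)` into `f(a, b)`.

```d
import std.algorithm : equal, filter, map;
import std.range : iota, take;

// Pipeline written with UFCS: reads left to right, one stage per line.
auto oddSquaresUfcs(int limit)
{
    return iota(limit)
        .filter!(n => n % 2 == 1)
        .map!(n => n * n)
        .take(3);
}

// The same pipeline as ordinary nested calls: reads from the inside out.
auto oddSquaresNested(int limit)
{
    return take(map!(n => n * n)(filter!(n => n % 2 == 1)(iota(limit))), 3);
}

void main()
{
    assert(oddSquaresUfcs(100).equal([1, 9, 25]));
    assert(oddSquaresUfcs(100).equal(oddSquaresNested(100)));
}
```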

Dejan Lekic <dejan.lekic gmail.com> writes:

 
 auto something = outRange(bazRange(barRange(fooRange(inRange, param)), param1,
 param2));

It looks readable in this case, but to keep it that clean your parameters 
have to be variables. Otherwise, imagine what it would look like if some of 
the arguments for those ranges came from function calls: 
...fooRange(someObject.getInRange(), foo!(bla)(param1...)),param1...

If I were the author, all would be fine, but if I gave that code to someone 
else, he/she would need time to understand what is actually happening...

-- 
Dejan Lekic - http://dejan.lekic.org

Nov 07 2012
