digitalmars.D.bugs - [Issue 8672] New: %% operator


d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672

           Summary: %% operator
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



This proposal is an alternative to Issue 7728.

I suggest adding an operator %% that performs a modulus operation closer to
the mathematical definition, as defined in Python.

%% acts as % when both its arguments are unsigned or when they are both
positive signed numbers.

I think here a built-in operator is better than a Phobos function (as discussed
in Issue 7728): the built-in "%" operator is bug-prone, if you use it with
negative numbers. In theory both a Phobos function and a built-in operator are
usable to avoid such bugs. But I think having a built-in operator allows
programmers to better remember the problem with the standard % derived from C
and more often avoid its usage when numbers are negative, compared to a library
function.
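
As a rough illustration of the difference the proposal is about, the following Python sketch contrasts the two conventions. The names `c_style_rem` and `floored_mod` are made up for this example: `c_style_rem` emulates the truncating remainder of C99 and D, while Python's own `%` already gives the floored result that `%%` is proposed to return.

```python
def c_style_rem(a: int, b: int) -> int:
    """Emulate the C99/D '%': the quotient is truncated toward zero,
    so the result takes the sign of the dividend a."""
    q = abs(a) // abs(b)          # magnitude of the truncated quotient
    if (a < 0) != (b < 0):
        q = -q                    # restore the quotient's sign
    return a - b * q


def floored_mod(a: int, b: int) -> int:
    """Python's '%': the quotient is floored, so the result takes
    the sign of the divisor b."""
    return a % b


# The two agree whenever both operands are non-negative ...
assert all(c_style_rem(a, b) == floored_mod(a, b)
           for a in range(0, 50) for b in range(1, 10))

# ... and can differ when the operands have different signs.
print(c_style_rem(-1, 6), floored_mod(-1, 6))   # -1 5
print(c_style_rem(7, -3), floored_mod(7, -3))   # 1 -2
```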

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------

Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla digitalmars.com
         Resolution|                            |WONTFIX



14:45:34 PDT ---
Adding a new operator for an incredibly rare operation is not justified. A
library function should be used for this, if it matters at all, and I am not
convinced it does.

As Don pointed out to me, there is no "mathematical" definition of modulus. And
as http://en.wikipedia.org/wiki/Modulo_operation makes clear, there is no
consistent definition of it in programming languages, with four different
definitions of it in use, not including "implementation defined" ones.

To say one version of modulus is "bug prone" and the other is not, is itself
erroneous.

There is simply no getting around the fact that the programmer needs to be
aware of what result he is trying to achieve with negative numbers.


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672




14:46:20 PDT ---
*** Issue 7728 has been marked as a duplicate of this issue. ***


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672





 Adding a new operator for an incredibly rare operation is not justified.

It's not incredibly rare, I use it all the time in Python (it's the default
one!) and I used to use it often in Delphi, and I use a function to do it in D,
so in practice I use it much more often than the C-style one. It's rare in C/D
programs just because C made a wrong design decision, and this is the built-in
:-) Programmers usually stick with what's offered to them by the language,
often even when it's not good. This is also why programmers sometimes switch to
newer/better languages: to have a whole new set of _better_ defaults/features.


 A library function should be used for this, if it matters at all, and I am not
 convinced it does.
 
 As Don pointed out to me, there is no "mathematical" definition of modulus.

This enhancement request refers to the Python one...


 And
 as http://en.wikipedia.org/wiki/Modulo_operation makes clear, there is no
 consistent definition of it in programming languages, with four different
 definitions of it in use, not including "implementation defined" ones.

Making it "implementation defined" makes it nearly useless, killing code
portability. Even C99 designers have understood this.

Most languages use one of two versions. D uses one of those two, and Python the
other one. Ada has both (and I think Ada doesn't have any other ones).


 To say one version of modulus is "bug prone" and the other is not, is itself
 erroneous.
 
 There is simply no getting around the fact that the programmer needs to be
 aware of what result he is trying to achieve with negative numbers.

While this is generally right, in my opinion in this case this is not true.

Despite Python not being C, it doesn't break C99 "compatibility" randomly: Python
designers chose the currently used modulus, preferring it over the C99
version. And I believe they have made the right choice.


The topic was discussed a bit in this thread:
http://forum.dlang.org/thread/k05nqv$2s3v$1 digitalmars.com

There I have shown an example, reduced from a larger program. This little
function has to look for the next true value in a circular array 'data', before
or after a given starting index, according to an int 'movement' variable that is
allowed to be only +1 or -1:


This is Python code, it correctly performs the wrap-around in all cases:


def find_next(data, index, movement):
     assert movement == 1 or movement == -1
     while True:
         index = (index + movement) % len(data)
         if data[index]:
             break
     return index


data = [False, False, False, False, True, False]




The D code doesn't work, because of the C %:


size_t findNext(in bool[] data, size_t index, in int movement)
pure nothrow in {
     assert(movement == 1 || movement == -1);
} body {
     do {
         index = (index + movement) % data.length;
     } while (!data[index]);
     return index;
}

void main() {
     import std.stdio;
     //              0      1      2      3      4     5
     const data = [false, false, false, false, true, false];

     writeln(findNext(data, 5, -1)); // OK, prints 4
     writeln(findNext(data, 0, -1)); // not OK
}


To make the D code work, you first of all have to be aware of the C99 semantics
of modulus, and then you have to introduce a workaround, with longer code. In
Python the code works naturally even when numbers are negative.

Using the D built-in % operator is a landmine, it's safe for unsigned integers.
For signed integers that can be negative you have to take extra care. So you
are of course right that "the programmer needs to be aware", but the same phrase
is true for everything. Being not bug-prone means the programmer needs to be less
aware, needs to keep in mind a smaller number of things to produce correct
code.
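
To see what actually happens in the failing call, here is a Python sketch that emulates the D arithmetic. It assumes a 64-bit `size_t`, under which the `int` value -1 is converted to 2^64 - 1 before the addition. `SIZE_T_MASK`, `d_step` and `d_step_fixed` are names invented for this illustration, and the second function shows only one possible workaround (adding the length before taking the remainder), not necessarily the one the author had in mind.

```python
SIZE_T_MASK = (1 << 64) - 1   # assumption: size_t is 64 bits wide


def d_step(index: int, movement: int, length: int) -> int:
    """Emulate 'index = (index + movement) % data.length' with size_t operands."""
    return ((index + movement) & SIZE_T_MASK) % length


def d_step_fixed(index: int, movement: int, length: int) -> int:
    """Possible workaround: add the length first so the sum never drops below zero."""
    return ((index + length + movement) & SIZE_T_MASK) % length


print(d_step(5, -1, 6))        # 4, the expected neighbour
print(d_step(0, -1, 6))        # 3, not 5: (2**64 - 1) % 6 == 3
print(d_step_fixed(0, -1, 6))  # 5
```

Under these assumptions, stepping backward from index 0 visits 3, 2, 1, 0, 3, ... and never reaches the true value at index 4, so the loop in `findNext` would not terminate for that input.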

In the end I understand that adding a new operator is asking a lot, so a
library function is enough. But I don't agree that the built-in modulus
operator is "good enough".


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672




16:35:40 PDT ---
The point is, the programmer has to take "extra care" regardless.

All definitions of modulus are a "land mine", as they are all arbitrary. There
is no such thing as a mathematically correct one. Nor is there any such thing
as a "natural" result of modulus. It may be natural for your particular code
example, and for your experience, but that doesn't mean it is generally true
for uses of modulus.

The diversity of definitions for it strongly suggests that there is no
"natural" definition.

D improves on the C/C++ situation by making it defined behavior, rather than
undefined.


Sep 16 2012

"ixid" <nuaccount gmail.com> writes:

Is there any use for the way C-style modulus interacts with 
negative numbers? It seems little more than broken on the basis 
of making positive number modulus operations efficient back when 
C was created.

Sep 16 2012

"bearophile" <bearophileHUGS lycos.com> writes:

On Monday, 17 September 2012 at 01:15:56 UTC, ixid wrote:
 Is there any use for the way C-style modulus interacts with 
 negative numbers? It seems little more than broken on the basis 
 of making positive number modulus operations efficient back 
 when C was created.

This is not a group for discussions. If you want to comment you 
have to add it to the bug report.

Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672





 The point is, the programmer has to take "extra care" regardless.

I am not the only one making mistakes with the C-style modulus, I know of other
people that have had similar bugs. And I know teachers of C/Java languages that
have to take care to explain the unintuitive nature of the C modulus.

But as usual I don't have real statistics that show that the Python-style
modulus leads to fewer bugs.


 It may be natural for your particular code example,

What I have shown is not an arbitrary and very specific example, it's an
example of a common usage. It's a scan of a cyclic structure (a circular
array), where sometimes you go forward and sometimes you go backward, with
negative steps. While the presence of an array is specific, the need to walk
circular sequences is a common enough use case for the modulus among negative
numbers.


 The diversity of definitions for it strongly suggests that there is no
 "natural" definition.

That's the not-adaptationist explanation
(http://en.wikipedia.org/wiki/Adaptationism ). An alternative explanation is
that they are just copying the semantics from earlier (C, Algol?) designs.
In the case of D the explanation is that it has copied C99, for practical
backward compatibility purposes.


 D improves on the C/C++ situation by making it defined behavior, rather than
 undefined.

C makes its behaviour defined since C99 :-)

I will keep using my function in D, because it has saved me from some bugs. Thank
you for your answers.


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com



PDT ---
The whole point is that there's nothing wrong with how % works, since there is
no standard definition for it, and D implements it in one of the ways that's
very common, even if it's not what you might want. And it's not worth adding
another operator for another variation of how modulus could work. A function
does the job just fine. It's not something that merits yet another operator.

At this point, unless there's a huge advantage in putting something in the
language, we're just not going to do it. Some features may merit that, but
something that a simple function can easily do doesn't.


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672

since there is no standard definition for it,

Most computer languages have one of two definitions. C99 and D use one, I am
talking about the other one. So you can say there are two "standards". Many of
the languages that use the first one are designed like this just because they
are copying C, not because the C % is the better one.

The whole point is that there's nothing wrong with how % works,

The definition of % in C is not the most intuitive, it's not the one you
usually want when you code using negative numbers, and I think it's more
bug-prone because you sometimes forget all this.

Python and Haskell designers think that it's bad, they have not chosen randomly
one of the two designs throwing a coin. Python designers have broken the C
"compatibility" on purpose on this. It's documented.

Despite some flaws, Python and Haskell are two of the best-designed
languages around; when they do something the same way, it's probably not a
random occurrence :-)

If you perform a Google search using some right words you find many results
that show/discuss the problem, it's not a problem created by my mind:

http://stackoverflow.com/questions/1082917/mod-of-negative-number-is-melting-my-brain

http://stackoverflow.com/questions/9050209/weird-results-of-modulo-arithmetic-with-negative-numbers-and-unsigned-denominato

and D implements it in one of the ways that's
very common, even if it's not what you might want.

This is right, it's very common. But you can see I have never asked to change
the % operator of D. I am aware of the importance of keeping it compatible with
C.

And it's not worth adding
another operator for another variation of how modulus could work.

I can agree to this.

A function does the job just fine. It's not something that merits yet another
operator.

If you take a look at my issue 7728, Walter has closed it too. issue 7728 asks
for a function that is intrinsic. I think this is better than a normal function
because D programmers will be more willing to use the alternative modulus if
they know it's implemented by a very small number of _inlined_ asm instructions
(or even one, on the CPUs that support this operation natively). This is a
*basic* operation, it can't afford being slow. See here:
http://code.google.com/p/go/issues/detail?id=448

There are two ways to do it: doing an extra (expensive) division

((m % n) + n) % n;

or doing an extra (expensive) conditional

temp = m % n
if temp < 0: temp += n

Hopefully the compiler will see that it can do a conditional move in the second
case, but it's not obvious.
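
The two formulations quoted above can be compared side by side. This is only a Python sketch: since Python's `%` is already floored, the C-style remainder is emulated by a helper, and the names `trunc_rem`, `mod_double` and `mod_branch` are invented for the illustration. It also assumes a positive divisor `n`, as in the circular-array use case.

```python
def trunc_rem(m: int, n: int) -> int:
    """Emulate the C-style '%' (result has the sign of the dividend m)."""
    r = abs(m) % abs(n)
    return -r if m < 0 else r


def mod_double(m: int, n: int) -> int:
    """First variant: a second remainder operation folds the result into [0, n)."""
    return trunc_rem(trunc_rem(m, n) + n, n)


def mod_branch(m: int, n: int) -> int:
    """Second variant: a conditional adjusts a negative remainder."""
    temp = trunc_rem(m, n)
    if temp < 0:
        temp += n
    return temp


# Both variants match Python's built-in floored '%' for positive n.
for m in range(-20, 21):
    for n in range(1, 8):
        assert mod_double(m, n) == mod_branch(m, n) == m % n
```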

A function-looking compiler intrinsic avoids this. Once such intrinsic is
present in Phobos I will stop using % in my D code :-)

At this point, unless there's a huge advantage in putting
something in the language, we're just not going to do it.

Even Fortran is now and then adding important features today. Generally only
dead languages stop changing. Please keep this in mind.

Saying that "the programmer needs to be aware of what result he is trying to
achieve with negative numbers" misses the fact that the C % is usually not
useful when you are working with negative numbers.

This thread shows a lost opportunity for D. I am not going to agree with Walter
or you on this. Python and Haskell are better on this, their % operator is more
useful and safer.


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672




PDT ---
 Even Fortran is now and then adding important features today. Generally only
 dead languages stop changing. Please keep this in mind.

If something is truly worth adding to the language, then we'll add it, but
there has to be a solid reason for it. And given how we're trying to stabilize
things, we're less likely to add features just because they'd be nice. Once
everything's more stable, we'll probably be more willing to add
backwards-compatible features which add real value. But I wouldn't expect %% to
be added regardless, because it doesn't add enough value over simply adding a
function which does the same.


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672




20:53:43 PDT ---


 
 since there is no standard definition for it,

 
 C99 and D use one,

C99 leaves it as "implementation defined". D defines it in the specification.

Those are fundamentally different.

 So you can say there are two "standards".

As I cited, there are FOUR different standards on this, not including the C99
"implementation defined" one.


Sep 16 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672






 C99 leaves it as "implementation defined". D defines it in the specification.
 
 Those are fundamentally different.

In the table of the Wikipedia page you have linked there is written:

C (ISO 1990)     %     Implementation defined

C (ISO 1999)     %     Dividend[1]

Where the reference [1] is:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf, section 6.5.5

Paragraph 5 of section 6.5.5 says:

The result of the / operator is the quotient from the division of the first
operand by the second; the result of the % operator is the remainder. In both
operations, if the value of the second operand is zero, the behavior is
undefined.
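
The paragraph quoted above only says that % yields "the remainder". The sign is fixed by the next paragraph of the same section (6.5.5, paragraph 6), which states that / discards the fractional part of the quotient and that `(a/b)*b + a%b` equals `a` whenever `a/b` is representable. The Python sketch below, with the made-up helper names `trunc_div` and `c99_rem`, shows how that identity forces a dividend-signed remainder, and prints Python's floored pair from `divmod` next to it for comparison.

```python
def trunc_div(a: int, b: int) -> int:
    """Quotient with the fractional part discarded (rounded toward zero)."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def c99_rem(a: int, b: int) -> int:
    """Remainder forced by the identity (a/b)*b + a%b == a."""
    return a - trunc_div(a, b) * b


for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    q, r = trunc_div(a, b), c99_rem(a, b)
    fq, fr = divmod(a, b)   # Python's floored quotient and modulus
    # Both conventions satisfy the same identity; they differ in the rounding of q.
    assert q * b + r == a and fq * b + fr == a
    print(a, b, "truncated:", (q, r), "floored:", (fq, fr))
```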


 As I cited, there are FOUR different standards on this, not including the C99
 "implementation defined" one.

But by far the most common ones are two of them (counting languages: 45+
Dividend, 31+ Divisor, and 8 something different).


Sep 17 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672




12:02:34 PDT ---
I didn't realize that C99 did specify it. Thanks for the correction.


Sep 17 2012

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8672






 To say one version of modulus is "bug prone" and the other is not, is itself
 erroneous.

I have just found another bug in my code caused by it, and it's a bug that goes
away replacing it with the Python modulus operation. The built-in % operator
_is_ bug prone.


May 27 2013
