
digitalmars.D.bugs - [Issue 360] New: Compile-time floating-point calculations are sometimes inconsistent

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360

           Summary: Compile-time floating-point calculations are sometimes
                    inconsistent
           Product: D
           Version: 0.167
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: digitalmars-com baysmith.com


The following code should print false before it exits.

import std.stdio;

void main() {
        const float STEP_SIZE = 0.2f;


        float j = 0.0f;
        while (j <= ( 1.0f / STEP_SIZE)) {
                j += 1.0f;
                writefln(j <= ( 1.0f / STEP_SIZE));
        }

}

This problem does not occur when:
1. the code is optimized
2. STEP_SIZE is not a const
3. STEP_SIZE is a real
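
A minimal sketch of workaround 3 from the list above (untested; assumes the DMD
of the time): with STEP_SIZE declared as real, the folded constant and the
runtime comparison use the same precision, so the final iteration should print
false.

import std.stdio;

void main() {
        // Workaround 3: a real constant, so folding and comparison agree.
        const real STEP_SIZE = 0.2;

        float j = 0.0f;
        while (j <= (1.0f / STEP_SIZE)) {
                j += 1.0f;
                writefln(j <= (1.0f / STEP_SIZE));      // last line should be false
        }
}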


-- 
Sep 21 2006
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360


bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID





The example is mixing up 3 different precisions - 32, 64, and 80 bit. Each
involves different rounding of unrepresentable numbers like 0.2. In this case,
the 1.0f/STEP_SIZE is calculated at different precisions based on how things
are compiled. Constant folding, for example, is done at compile time and done
at max precision even if the variables involved are floats.
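
A small sketch of what that means for this example (assuming the arithmetic is
right): 1.0f/0.2f rounds to exactly 5.0f at float precision, but stays just
below 5 at double or real precision, which is why a comparison against j ==
5.0f can flip depending on where the quotient was rounded.

import std.stdio;

void main() {
        // The same quotient rounded at three precisions.
        float  f = 1.0f / 0.2f;   // rounds to exactly 5.0f
        double d = 1.0f / 0.2f;   // just below 5 (about 4.9999999255)
        real   r = 1.0f / 0.2f;   // just below 5, with more digits

        writefln(f == 5.0f);      // expected: true
        writefln(d < 5.0);        // expected: true
        writefln(r < 5.0L);       // expected: true
}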

The D language allows this, the guiding principle is that algorithms should be
designed to not fail if precision is increased.

Not a bug.


-- 
Sep 21 2006
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360






*** Bug 361 has been marked as a duplicate of this bug. ***


-- 
Sep 21 2006
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360






Why are the expressions in the while and writefln statements calculated at
different precisions?

Wouldn't the constant folding be done the same for both?


-- 
Sep 21 2006
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360






while (j <= (1.0f/STEP_SIZE)) is at double precision,
writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.


-- 
Sep 21 2006
prev sibling next sibling parent reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360







 while (j <= (1.0f/STEP_SIZE)) is at double precision,
 writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.
I don't understand where the double precision comes from. Since all the values
are floats, the only precisions that make sense are float and reals.

Really, 0.2f should not be the same number as 0.2. When you put the 'f' suffix
on, surely you're asking the compiler to truncate the precision. It can be
expanded to real precision later without problems. Currently, there's no way to
get a low-precision constant at compile time. (In fact, you should be able to
write real a = 0.2 - 0.2f; to get the truncation error).

Here's how I think it should work:

const float A = 0.2;  // infinitely accurate 0.2, but type inference on A
                      // should return a float.
const float B = 0.2f; // a 32-bit approximation to 0.2
const real C = 0.2;   // infinitely accurate 0.2
const real D = 0.2f;  // a 32-bit approximation to 0.2, but type inference will
                      // give an 80-bit quantity.

--
Sep 22 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
d-bugmail puremagic.com wrote:


 while (j <= (1.0f/STEP_SIZE)) is at double precision,
 writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.
I don't understand where the double precision comes from. Since all the values are floats, the only precisions that make sense are float and reals.
The compiler is allowed to evaluate intermediate results at a greater precision than that of the operands.
 Really, 0.2f should not be the same number as 0.2.
0.2 is not representable exactly, the only question is how much precision is there in the representation.
 When you put the 'f' suffix
 on, surely you're asking the compiler to truncate the precision.
Not in D. The 'f' suffix only indicates the type. The compiler may maintain internally as much precision as possible, for purposes of constant folding. Committing the actual precision of the result is done as late as possible.
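
A small sketch of what that can imply for the example above (a reading of the
behaviour described here, not verified against DMD of the time):

import std.stdio;

void main() {
        // Since the 'f' suffix only sets the type, 0.2f may be folded at full
        // precision here, so a can come out as 0 rather than the float
        // truncation error (roughly -3e-9) that C-style semantics would give.
        real a = 0.2 - 0.2f;
        writefln(a);
}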
 It can be
 expanded to real precision later without problems. Currently, there's no way to
 get a low-precision constant at compile time.
You can by putting the constant into a static, non-const variable. Then it cannot be constant folded.
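
A sketch of that suggestion, which is also workaround 2 from the original
report (untested against the compiler of the time):

import std.stdio;

// Not const, so the value cannot be constant folded; the stored 32-bit
// float is what gets used at runtime.
static float STEP_SIZE = 0.2f;

void main() {
        float j = 0.0f;
        while (j <= (1.0f / STEP_SIZE)) {
                j += 1.0f;
                writefln(j <= (1.0f / STEP_SIZE));
        }
}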
 (In fact, you should be able to write real a = 0.2 - 0.2f; to get the
 truncation error).
Not in D, where the compiler is allowed to evaluate using as much precision as
possible for purposes of constant folding. The vast majority of calculations
benefit from delaying rounding as long as possible, hence D's bias towards
using as much precision as possible.

The way to write robust floating point calculations in D is to ensure that
increasing the precision of the calculations will not break the result.

Early versions of Java insisted that rounding to precision of floating point
intermediate results always happened. While this ensured consistency of
results, it mostly resulted in consistently getting inferior and wrong answers.
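
For instance, a precision-robust rewrite of the original loop might look like
the sketch below (an illustration only, not from the original posts): driving
the loop with an integer trip count means termination no longer hinges on how
1.0f/STEP_SIZE is rounded.

import std.stdio;

void main() {
        const float STEP_SIZE = 0.2f;

        // Compute the trip count once as an integer; rounding 1.0f/STEP_SIZE
        // at 32, 64, or 80 bits can then no longer change how many iterations
        // run, so increasing precision cannot break the loop.
        int steps = cast(int)(1.0f / STEP_SIZE + 0.5f);

        for (int i = 0; i <= steps; i++)
                writefln(i * STEP_SIZE);
}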
Sep 22 2006
next sibling parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 d-bugmail puremagic.com wrote:


 while (j <= (1.0f/STEP_SIZE)) is at double precision,
 writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.
I don't understand where the double precision comes from. Since all the values are floats, the only precisions that make sense are float and reals.
The compiler is allowed to evaluate intermediate results at a greater precision than that of the operands.
 Really, 0.2f should not be the same number as 0.2.
0.2 is not representable exactly, the only question is how much precision is there in the representation.
 When you put the 'f' suffix
 on, surely you're asking the compiler to truncate the precision.
Not in D. The 'f' suffix only indicates the type.
And therefore, it only matters in implicit type deduction, and in function
overloading. As I discuss below, I'm not sure that it's necessary even there.
In many cases, it's clearly a programmer error. For example, in

real BAD = 0.2f;

the f has absolutely no effect.

 The compiler may maintain internally as much precision as possible, for
 purposes of constant folding. Committing the actual precision of the result
 is done as late as possible.
 
 It can be
 expanded to real precision later without problems. Currently, there's 
 no way to
 get a low-precision constant at compile time.
You can by putting the constant into a static, non-const variable. Then it cannot be constant folded.
Actually, in this case you still want it to be constant folded.
 
 (In fact, you should be able to write real a = 0.2 - 0.2f; to get the
 truncation error).
Not in D, where the compiler is allowed to evaluate using as much precision as possible for purposes of constant folding. The vast majority of calculations benefit from delaying rounding as long as possible, hence D's bias towards using as much precision as possible. The way to write robust floating point calculations in D is to ensure that increasing the precision of the calculations will not break the result. Early versions of Java insisted that rounding to precision of floating point intermediate results always happened. While this ensured consistency of results, it mostly resulted in consistently getting inferior and wrong answers.
I agree. But it seems that D is currently in a halfway house on this issue.
Somehow, 'double' is privileged, and I don't think it's got any right to be.

    const XXX = 0.123456789123456789123456789f;
    const YYY = 1 * XXX;
    const ZZZ = 1.0 * XXX;

    auto xxx = XXX;
    auto yyy = YYY;
    auto zzz = ZZZ;

// now xxx and yyy are floats, but zzz is a double.
Multiplying by '1.0' causes a float constant to be promoted to double.

    real a = xxx;
    real b = zzz;
    real c = XXX;

Now a, b, and c all have different values. Whereas the same operation at
runtime causes it to be promoted to real.

Is there any reason why implicit type deduction on a floating point constant
doesn't always default to real? After all, you're saying "I don't particularly
care what type this is" -- why not default to maximum accuracy?

Concrete example:

    real a = sqrt(1.1);

This only gives a double precision result. You have to write

    real a = sqrt(1.1L);

instead. It's easier to do the wrong thing than the right thing.

IMHO, unless you specifically take other steps, implicit type deduction should
always default to the maximum accuracy the machine could do.
Sep 22 2006
next sibling parent Dave <Dave_member pathlink.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 d-bugmail puremagic.com wrote:


 while (j <= (1.0f/STEP_SIZE)) is at double precision,
 writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.
I don't understand where the double precision comes from. Since all the values are floats, the only precisions that make sense are float and reals.
The compiler is allowed to evaluate intermediate results at a greater precision than that of the operands.
 Really, 0.2f should not be the same number as 0.2.
0.2 is not representable exactly, the only question is how much precision is there in the representation.
 When you put the 'f' suffix
 on, surely you're asking the compiler to truncate the precision.
Not in D. The 'f' suffix only indicates the type.
And therefore, it only matters in implicit type deduction, and in function overloading. As I discuss below, I'm not sure that it's necessary even there. In many cases, it's clearly a programmer error. For example in real BAD = 0.2f; where the f has absolutely no effect. The compiler may
 maintain internally as much precision as possible, for purposes of 
 constant folding. Committing the actual precision of the result is 
 done as late as possible.

 It can be
 expanded to real precision later without problems. Currently, there's 
 no way to
 get a low-precision constant at compile time.
You can by putting the constant into a static, non-const variable. Then it cannot be constant folded.
Actually, in this case you still want it to be constant folded.
 (In fact, you should be able to write real a = 0.2 - 0.2f; to get the
 truncation error).
Not in D, where the compiler is allowed to evaluate using as much precision as possible for purposes of constant folding. The vast majority of calculations benefit from delaying rounding as long as possible, hence D's bias towards using as much precision as possible. The way to write robust floating point calculations in D is to ensure that increasing the precision of the calculations will not break the result. Early versions of Java insisted that rounding to precision of floating point intermediate results always happened. While this ensured consistency of results, it mostly resulted in consistently getting inferior and wrong answers.
I agree. But it seems that D is currently in a halfway house on this issue. Somehow, 'double' is privileged, and don't think it's got any right to be. const XXX = 0.123456789123456789123456789f; const YYY = 1 * XXX; const ZZZ = 1.0 * XXX; auto xxx = XXX; auto yyy = YYY; auto zzz = ZZZ; // now xxx and yyy are floats, but zzz is a double. Multiplying by '1.0' causes a float constant to be promoted to double. real a = xxx; real b = zzz; real c = XXX; Now a, b, and c all have different values. Whereas the same operation at runtime causes it to be promoted to real. Is there any reason why implicit type deduction on a floating point constant doesn't always default to real? After all, you're saying "I don't particularly care what type this is" -- why not default to maximum accuracy? Concrete example: real a = sqrt(1.1); This only gives a double precision result. You have to write real a = sqrt(1.1L); instead. It's easier to do the wrong thing, than the right thing. IMHO, unless you specifically take other steps, implicit type deduction should always default to the maximum accuracy the machine could do.
Great point.
Sep 22 2006
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 Not in D. The 'f' suffix only indicates the type.
And therefore, it only matters in implicit type deduction, and in function overloading. As I discuss below, I'm not sure that it's necessary even there. In many cases, it's clearly a programmer error. For example in real BAD = 0.2f; where the f has absolutely no effect.
It may come about as a result of source code generation, though, so I'd be reluctant to make it an error.
 You can by putting the constant into a static, non-const variable. 
 Then it cannot be constant folded.
Actually, in this case you still want it to be constant folded.
A static variable's value can change, so it can't be constant folded. To have it participate in constant folding, it needs to be declared as const.
 I agree. But it seems that D is currently in a halfway house on this 
 issue. Somehow, 'double' is privileged, and don't think it's got any 
 right to be.
 
     const XXX = 0.123456789123456789123456789f;
     const YYY = 1 * XXX;
     const ZZZ = 1.0 * XXX;
 
    auto xxx = XXX;
    auto yyy = YYY;
    auto zzz = ZZZ;
 
 // now xxx and yyy are floats, but zzz is a double.
 Multiplying by '1.0' causes a float constant to be promoted to double.
That's because 1.0 is a double. A double*float => double.
    real a = xxx;
    real b = zzz;
    real c = XXX;
 
 Now a, b, and c all have different values.
 
 Whereas the same operation at runtime causes it to be promoted to real.
 
 Is there any reason why implicit type deduction on a floating point 
 constant doesn't always default to real? After all, you're saying "I 
 don't particularly care what type this is" -- why not default to maximum 
 accuracy?
 
 Concrete example:
 
 real a = sqrt(1.1);
 
 This only gives a double precision result. You have to write
 real a = sqrt(1.1L);
 instead.
 It's easier to do the wrong thing, than the right thing.
 
 IMHO, unless you specifically take other steps, implicit type deduction 
 should always default to the maximum accuracy the machine could do.
It is a good idea, but isn't that way for the reasons:

1) It's the way C, C++, and Fortran work. Changing the promotion rules would
mean that, when translating solid, reliable libraries from those languages to
D, one would have to be very, very careful.

2) Float and double are expected to be implemented in hardware. Longer
precisions are often not available. I wanted to make it practical for a D
implementation on those machines to provide a software long precision floating
point type, rather than just making real==double. Such a type would be very
slow compared with double.

3) Real, even in hardware, is significantly slower than double. Doing constant
folding at max precision at compile time won't affect runtime performance, so
it is 'free'.
Sep 22 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 Walter Bright wrote:
 Not in D. The 'f' suffix only indicates the type.
And therefore, it only matters in implicit type deduction, and in function overloading. As I discuss below, I'm not sure that it's necessary even there. In many cases, it's clearly a programmer error. For example in real BAD = 0.2f; where the f has absolutely no effect.
It may come about as a result of source code generation, though, so I'd be reluctant to make it an error.
 You can by putting the constant into a static, non-const variable. 
 Then it cannot be constant folded.
Actually, in this case you still want it to be constant folded.
A static variable's value can change, so it can't be constant folded. To have it participate in constant folding, it needs to be declared as const.
But if it's const, then it's not float precision! I want both!
 I agree. But it seems that D is currently in a halfway house on this 
 issue. Somehow, 'double' is privileged, and don't think it's got any 
 right to be.

     const XXX = 0.123456789123456789123456789f;
     const YYY = 1 * XXX;
     const ZZZ = 1.0 * XXX;

    auto xxx = XXX;
    auto yyy = YYY;
    auto zzz = ZZZ;

 // now xxx and yyy are floats, but zzz is a double.
 Multiplying by '1.0' causes a float constant to be promoted to double.
That's because 1.0 is a double. A double*float => double.
    real a = xxx;
    real b = zzz;
    real c = XXX;

 Now a, b, and c all have different values.

 Whereas the same operation at runtime causes it to be promoted to real.

 Is there any reason why implicit type deduction on a floating point 
 constant doesn't always default to real? After all, you're saying "I 
 don't particularly care what type this is" -- why not default to 
 maximum accuracy?

 Concrete example:

 real a = sqrt(1.1);

 This only gives a double precision result. You have to write
 real a = sqrt(1.1L);
 instead.
 It's easier to do the wrong thing, than the right thing.

 IMHO, unless you specifically take other steps, implicit type 
 deduction should always default to the maximum accuracy the machine 
 could do.
It is a good idea, but isn't that way for the reasons: 1) It's the way C, C++, and Fortran work. Changing the promotion rules would mean that, when translating solid, reliable libraries from those languages to D, one would have to be very, very careful.
That's very important. Still, those languages don't have implicit type
deduction. Also, none of those languages guarantee accuracy of decimal->binary
conversions, so there's always some error in decimal constants. Incidentally, I
recently read that GCC uses something like 160 bits for constant folding, so
it's always going to give results that are different to those on other
compilers.

Why doesn't D behave like C with respect to 'f' suffixes? (Ie, do the
conversion, then truncate it to float precision.) Actually, I can't imagine
many cases where you'd actually want a 'float' constant instead of a 'real'
one.
 2) Float and double are expected to be implemented in hardware. Longer 
 precisions are often not available. I wanted to make it practical for a 
 D implementation on those machines to provide a software long precision 
 floating point type, rather than just making real==double. Such a type 
 would be very slow compared with double.
Interesting. I thought that 'real' was supposed to be the highest accuracy fast
floating point type, and would therefore be either 64, 80, or 128 bits. So it
could also be a double-double?

For me, the huge benefit of the 'real' type is that it guarantees that
optimisation won't change the results. In C, using doubles, it's quite
unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In
D, if you stick to real, you're guaranteed that nothing weird will happen. I'd
hate to lose that.
 3) Real, even in hardware, is significantly slower than double. Doing 
 constant folding at max precision at compile time won't affect runtime 
 performance, so it is 'free'.
In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants. The original code was an example where weird things happened because that wasn't respected.
Sep 23 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 A static variable's value can change, so it can't be constant folded. 
 To have it participate in constant folding, it needs to be declared as 
 const.
But if it's const, then it's not float precision! I want both!
You can always use hex float constants. I know they're not pretty, but the point of them is to be able to specify exact floating point bit patterns. There are no rounding errors with them.
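
For example (the digits below are one reading of the bits of 0.2f and worth
double-checking; this is a sketch, not verified against DMD of the time):

import std.stdio;

// The 32-bit approximation of 0.2 written as a hex float literal, so the bit
// pattern is specified directly and no decimal rounding is left to the
// compiler.
const float STEP_SIZE = 0x1.99999Ap-3f;   // = 0.20000000298023223876953125

void main() {
        writefln(STEP_SIZE);
}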
 1) It's the way C, C++, and Fortran work. Changing the promotion rules 
 would mean that, when translating solid, reliable libraries from those 
 languages to D, one would have to be very, very careful.
That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers. Why doesn't D behave like C with respect to 'f' suffixes? (Ie, do the conversion, then truncate it to float precision). Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.
A float constant would be desirable to keep the calculation all floats for speed reasons. I can't think of many reasons one would want reduced precision.
 2) Float and double are expected to be implemented in hardware. Longer 
 precisions are often not available. I wanted to make it practical for 
 a D implementation on those machines to provide a software long 
 precision floating point type, rather than just making real==double. 
 Such a type would be very slow compared with double.
Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double? For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.
I don't see how one would lose that if real were done in software.
 3) Real, even in hardware, is significantly slower than double. Doing 
 constant folding at max precision at compile time won't affect runtime 
 performance, so it is 'free'.
In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants. The original code was an example where weird things happened because that wasn't respected.
Weird things always happen with floating point. It's just a matter of where one
chooses the seams to show (you pointed out where seams show in C with temporary
precision). I've seen a lot of cases where people were surprised that 0.2f (or
similar) was even rounded off, and got caught by the roundoff error.

I used to work in mechanical engineering where a lot of numerical calculations
were done. Accumulating roundoff errors were a huge problem, and a lot (most?)
engineers didn't understand it. They were using calculators for long chains of
calculation, and rounding off after each step instead of carrying the full
calculator precision. They were mystified by getting answers at the end that
were way off.

It's my experience with that (and also in college where we were taught to never
round off anything but the final answer) that led to the D design decision to
internally carry around consts in full precision, regardless of type.
Deliberately reduced precision is something that only experts would want, and
only for special cases. So it's reasonable that that would be harder to do
(i.e. using hex float constants).

P.S. I also did some digital electronic design work long ago. The cardinal rule
there was that since TTL devices got faster all the time, and old slower TTL
parts became unavailable, one designed so that swapping in a faster chip would
not cause the failure of the system. Hence the rule that increasing the
precision of a calculation should not cause the program to fail <g>.
Sep 23 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 Walter Bright wrote:
 A static variable's value can change, so it can't be constant folded. 
 To have it participate in constant folding, it needs to be declared 
 as const.
But if it's const, then it's not float precision! I want both!
You can always use hex float constants. I know they're not pretty, but the point of them is to be able to specify exact floating point bit patterns. There are no rounding errors with them.
 1) It's the way C, C++, and Fortran work. Changing the promotion 
 rules would mean that, when translating solid, reliable libraries 
 from those languages to D, one would have to be very, very careful.
That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers. Why doesn't D behave like C with respect to 'f' suffixes? (Ie, do the conversion, then truncate it to float precision). Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.
A float constant would be desirable to keep the calculation all floats for speed reasons. I can't think of many reasons one would want reduced precision.
Me, too. In fact I've seen a lot of code where ignorant programmers were adding
'f' to the end of every floating point constant. It could be that the number of
cases where you actually care about the precision is so small that hex
constants are adequate.
 2) Float and double are expected to be implemented in hardware. 
 Longer precisions are often not available. I wanted to make it 
 practical for a D implementation on those machines to provide a 
 software long precision floating point type, rather than just making 
 real==double. Such a type would be very slow compared with double.
Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double? For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.
I don't see how one would lose that if real were done in software.
 3) Real, even in hardware, is significantly slower than double. Doing 
 constant folding at max precision at compile time won't affect 
 runtime performance, so it is 'free'.
In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants. The original code was an example where weird things happened because that wasn't respected.
Weird things always happen with floating point. It's just a matter of where one chooses the seams to show (you pointed out where seams show in C with temporary precision). I've seen a lot of cases where people were surprised that 0.2f (or similar) was even rounded off, and got caught by the roundoff error. I used to work in mechanical engineering where a lot of numerical calculations were done. Accumulating roundoff errors were a huge problem, and a lot (most?) engineers didn't understand it. They were using calculators for long chains of calculation, and rounding off after each step instead of carrying the full calculator precision. They were mystified by getting answers at the end that were way off. It's my experience with that (and also in college where we were taught to never round off anything but the final answer) that led to the D design decision to internally carry around consts in full precision, regardless of type. Deliberately reduced precision is something that only experts would want, and only for special cases. So it's reasonable that that would be harder to do (i.e. using hex float constants).
OK, you've convinced me. It needs to be better documented, though.
 P.S. I also did some digital electronic design work long ago. The 
 cardinal rule there was that since TTL devices got faster all the time, 
 and old slower TTL parts became unavailable, one designed so that 
 swapping in a faster chip would not cause the failure of the system. 
 Hence the rule that increasing the precision of a calculation should not 
 cause the program to fail <g>.
I think it would be useful to specify more precisely what happens in constant
folding. Eg, mention that all constant folding will be done in IEEE
round-to-nearest, ties-to-even.

In the longer term, I've been wondering if the precision for real constants
even needs to be the same as for the 'real' type. I can see some distinct
benefits that would come if the precision of literals was defined to always be
IEEE quadruple precision. Of course they'd always be rounded to 64 or 80-bit
reals when the time came for them to actually be used.

Looking at the spec for the forthcoming IEEE 754R standard, and the state of
SSE3 on AMD-64, it seems that Intel/AMD could very easily add a quadruple
precision type (they already have 16 128 bit registers, two 64 bit mantissa
units, and the quadruple exponent is the same as for x87. So I don't think it
would require much silicon, and it would mean they could emulate the x87 stuff
entirely on SSE). Some forward-compatibility things to consider in DMD 2.0;
ignore for now.
Sep 24 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 OK, you've convinced me. It needs to be better documented, though.
I agree with you and Bradley Smith on that.
 P.S. I also did some digital electronic design work long ago. The 
 cardinal rule there was that since TTL devices got faster all the 
 time, and old slower TTL parts became unavailable, one designed so 
 that swapping in a faster chip would not cause the failure of the 
 system. Hence the rule that increasing the precision of a calculation 
 should not cause the program to fail <g>.
I think it would be useful to specify more precisely what happens in constant folding. Eg, mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.
Yes.
 In the longer term, I've been wondering if the precision for real 
 constants even needs to be the same as for the 'real' type. I can see 
 some distinct benefits that would come if the precision of literals was 
 defined to always be IEEE quadruple precision. Of course they'd always 
 be rounded to 64 or 80-bit reals when the time came for them to actually 
 be used.
I agree.
 Looking at the spec for the forthcoming IEEE 754R standard, and the 
 state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add a 
 quadruple precision type (they already have 16 128 bit registers, two 64 
 bit mantissa units, and the quadruple exponent is the same as for x87. 
 So I don't think it would require much silicon, and it would mean they 
 could emulate the x87 stuff entirely on SSE). Some forward-compatibility 
 things to consider in DMD 2.0; ignore for now.
I was disappointed in the AMD-64 because it didn't do 128 bit floats, in fact, it relegated 80 bit floats to a backwater in the instruction set. Few computer people seem to understand the value in high precision floating point.
Sep 24 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 Walter Bright wrote:
 OK, you've convinced me. It needs to be better documented, though.
I agree with you and Bradley Smith on that.
 P.S. I also did some digital electronic design work long ago. The 
 cardinal rule there was that since TTL devices got faster all the 
 time, and old slower TTL parts became unavailable, one designed so 
 that swapping in a faster chip would not cause the failure of the 
 system. Hence the rule that increasing the precision of a calculation 
 should not cause the program to fail <g>.
I think it would be useful to specify more precisely what happens in constant folding. Eg, mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.
Yes.
 In the longer term, I've been wondering if the precision for real 
 constants even needs to be the same as for the 'real' type. I can see 
 some distinct benefits that would come if the precision of literals 
 was defined to always be IEEE quadruple precision. Of course they'd 
 always be rounded to 64 or 80-bit reals when the time came for them to 
 actually be used.
I agree.
One consequence of that would be in the name mangling for floating point constants in templates. Currently it's 20 hex characters, which only makes sense for a system with 80-bit reals; might be better to make it 32 hex characters, even if the extra 12 are all '0'.
 
 Looking at the spec for the forthcoming IEEE 754R standard, and the 
 state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add 
 a quadruple precision type (they already have 16 128 bit registers, 
 two 64 bit mantissa units, and the quadruple exponent is the same as 
 for x87. So I don't think it would require much silicon, and it would 
 mean they could emulate the x87 stuff entirely on SSE). Some 
 forward-compatibility things to consider in DMD 2.0; ignore for now.
I was disappointed in the AMD-64 because it didn't do 128 bit floats, in fact, it relegated 80 bit floats to a backwater in the instruction set. Few computer people seem to understand the value in high precision floating point.
Intel seems to be better than AMD in this regard. Intel added an 82 bit floating point type to the Itanium so that it could do 80-bit hypot() without overflow (in fact, Itanium seems to have by far the best floating point support that I've seen); AMD's 3DNow! didn't even support subnormals, infinity, or NaN.
Sep 24 2006
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 One consequence of that would be in the name mangling for floating point 
  constants in templates. Currently it's 20 hex characters, which only 
 makes sense for a system with 80-bit reals; might be better to make it 
 32 hex characters, even if the extra 12 are all '0'.
I'm reluctant to do that because there are already problems with the mangled names getting too long.
Sep 25 2006
parent reply xs0 <xs0 xs0.com> writes:
Walter Bright wrote:
 Don Clugston wrote:
 One consequence of that would be in the name mangling for floating 
 point  constants in templates. Currently it's 20 hex characters, which 
 only makes sense for a system with 80-bit reals; might be better to 
 make it 32 hex characters, even if the extra 12 are all '0'.
I'm reluctant to do that because there are already problems with the mangled names getting too long.
What if you used characters other than A-F to compress the zeros?

G = 2 * '0'
H = 3 * '0'
...
Z = 21 * '0'


xs0
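
A rough sketch of that run-length idea, purely illustrative (this is not how
DMD actually encodes anything):

// Replace each run of 2..21 consecutive '0' digits in a hex string with a
// single letter 'G'..'Z' (G = 2 zeros, ..., Z = 21 zeros).
// E.g. "4000000000000000000A" (19 zeros) would become "4XA".
char[] compressZeros(char[] hex)
{
    char[] result;
    for (int i = 0; i < hex.length; )
    {
        if (hex[i] == '0')
        {
            int run = 0;
            while (i + run < hex.length && hex[i + run] == '0' && run < 21)
                run++;
            if (run >= 2)
                result ~= cast(char)('G' + run - 2);  // encode the whole run
            else
                result ~= '0';                        // single zero stays as-is
            i += run;
        }
        else
        {
            result ~= hex[i];
            i++;
        }
    }
    return result;
}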
Sep 25 2006
parent Walter Bright <newshound digitalmars.com> writes:
xs0 wrote:
 Walter Bright wrote:
 Don Clugston wrote:
 One consequence of that would be in the name mangling for floating 
 point  constants in templates. Currently it's 20 hex characters, 
 which only makes sense for a system with 80-bit reals; might be 
 better to make it 32 hex characters, even if the extra 12 are all '0'.
I'm reluctant to do that because there are already problems with the mangled names getting too long.
What if you used characters other than A-F to compress the zeros? G = 2 * '0' H = 3 * '0' ... Z = 21 * '0'
Compression is one solution.
Sep 25 2006
prev sibling parent Sean Kelly <sean f4.ca> writes:
Don Clugston wrote:
 Walter Bright wrote:
 I was disappointed in the AMD-64 because it didn't do 128 bit floats, 
 in fact, it relegated 80 bit floats to a backwater in the instruction 
 set. Few computer people seem to understand the value in high 
 precision floating point.
Intel seems to be better than AMD in this regard. Intel added an 82 bit floating point type to the Itanium so that it could do 80-bit hypot() without overflow (in fact, Itanium seems to have by far the best floating point support that I've seen); AMD's 3DNow! didn't even support subnormals, infinity, or NaN.
I think AMD simply set its sights on the game industry as the battleground,
which seems to be supported by the presence of forums on LAN parties and system
modding (http://forums.amd.com/). This stands in contrast with Intel, which has
an entire set of forums for software development
(http://softwareforums.intel.com/). I decided to ask whether AMD has another
location for software development discussion.

I have no idea whether science-minded software companies or developers
communicate to AMD that they'd like improved floating-point support, but a bit
more couldn't hurt.


Sean
Sep 25 2006
prev sibling parent Bradley Smith <digitalmars-com baysmith.com> writes:
To summarize: ---

The compiler is allowed to evaluate intermediate results at a greater 
precision than that of the operands. The literal type suffix (like 'f') 
only indicates the type. The compiler may maintain internally as much 
precision as possible, for purposes of constant folding. Committing the 
actual precision of the result is done as late as possible.

For a low-precision constant, put the value into a static, non-const
variable. Since this is not really a constant, it cannot be constant
folded and is therefore not affected by a possible compile-time increase
in precision. However, if mixed with a higher precision at runtime, an
increase in precision will still occur.

The way to write robust floating point calculations in D is to ensure
that increasing the precision of the calculations will not break the 
result.

--- end of summary

This is the explanation I was looking for. Although it was clear that at
runtime D evaluates intermediate results at high precision, the
compile-time behavior (namely using a const) is different from the
runtime behavior (using a static), and I don't think that is clearly
explained in the documentation.

Would you please add this information to the D documentation? Perhaps an 
addition to the Floating Point page 
(http://www.digitalmars.com/d/float.html). Of course, if any of the 
above is incorrect, please change as necessary.

A follow-on question would be: How does one create a low-precision
constant that is ensured to actually stay constant? A static won't do,
since a static is really non-const, and a programming error could change
the value.


Thanks,
   Bradley
Sep 22 2006
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360


smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg iname.com






 const float A = 0.2;  // infinitely accurate 0.2, but type inference on A
 should return a float.
 
 const float B = 0.2f; // a 32-bit approximation to 0.2
 const real C = 0.2; // infinitely accurate 0.2
 const real D = 0.2f; // a 32-bit approximation to 0.2, but type inference will
 give an 80-bit quantity.
I agree. Only I'm not sure about A. If you want it to be "infinitely accurate",
then why would you declare it to be a float? It appears to me to be a means by
which a float can hold more precision than it really can.

On the other hand, D should definitely generate a 32-bit approximation to 0.2.
By using the 'f' suffix, this is exactly what the programmer asked for.

--
Sep 22 2006