digitalmars.D - Casting double to ulong weirdness
- Márcio Martins (16/16) Aug 24 2015 I'm posting this here for visibility. This was silently
- anonymous (13/30) Aug 24 2015 1.2 is not representable exactly in binary. Try printing it with a lot o...
- Steven Schveighoffer (12/26) Aug 24 2015 Yes. This is part of the issue of floating point. 1.2 cannot be
- Márcio Martins (10/42) Aug 24 2015 I am familiar with floating-point representations and their
- rumbu (29/32) Aug 24 2015 Visual C++ 19.00.23026, x86, x64:
- rumbu (9/9) Aug 24 2015 BTW, 1.2 and 12.0 are directly representable as double
- Justin Whear (12/25) Aug 24 2015 No it's not, this must be some sort of constant-folding or precision
- Warwick (3/19) Aug 24 2015 Maybe the constant folding is using a different rounding mode to
- Ali Çehreli (14/23) Aug 24 2015 Output is one thing. The issue is with the representation of 1.2. You
- John Colvin (3/4) Aug 24 2015 12.0 is representable, but I'm pretty sure, if you work it out,
- Steven Schveighoffer (33/42) Aug 24 2015 I don't think they are directly representable as floating point, because...
- Steven Schveighoffer (29/71) Aug 24 2015 More data:
- bachmeier (8/13) Aug 24 2015 I don't see anything that needs to be fixed, because I don't
- Steven Schveighoffer (9/21) Aug 24 2015 What is surprising, and possibly buggy, is that none of these operations...
- H. S. Teoh via Digitalmars-d (7/22) Aug 24 2015 +1.
- bachmeier (7/23) Aug 24 2015 I would not describe to!ulong as a "work-around". You just
- Steven Schveighoffer (8/33) Aug 24 2015 real y = x * 10.0;
- bachmeier (4/12) Aug 24 2015 Yes, I was mistaken. You have to use roundTo or std.math.round.
- Márcio Martins (27/68) Aug 24 2015 Whatever the issue is, it is not unavoidable, because as has been
- Steven Schveighoffer (19/82) Aug 24 2015 Your other examples use doubles, not reals. It's not apples to apples.
- Steven Schveighoffer (11/14) Aug 24 2015 #include
- Márcio Martins (18/126) Aug 24 2015 All my examples are doubles, and I have tested them all in C++ as
- H. S. Teoh via Digitalmars-d (6/8) Aug 24 2015 std.math.trunc.
- Márcio Martins (12/18) Aug 24 2015 import std.stdio;
- bachmeier (15/28) Aug 24 2015 There's no guarantee that it will be done consistently or
- Steven Schveighoffer (4/15) Aug 25 2015 auto result = cast(ulong)(x * 10.0 + x.epsilon);
- Márcio Martins (20/41) Aug 25 2015 import std.stdio;
- bachmeier (4/22) Aug 25 2015 What you are attempting to do is impossible. Is there a reason
- Steven Schveighoffer (14/41) Aug 25 2015 Sorry, I misunderstood what epsilon was (I think it's the smallest
- Márcio Martins (7/25) Aug 25 2015 I didn't convert to double. My computations are all in double to
- Matthias Bentrup (6/9) Aug 25 2015 The same program "fails" in gcc too, if you use x87 math. Usually
- deadalnix (4/16) Aug 25 2015 That's because of floating point exception. It is very
- Ola Fosheim Grøstad (11/12) Aug 25 2015 Are you sure you follow IEEE 754 recommendations? Floating point
- Steven Schveighoffer (24/33) Aug 25 2015 I'm not an expert on floating point, but I have written code that uses
- Ola Fosheim Grøstad (19/26) Aug 25 2015 I don't think C specifies how it should be done, but some
- bachmeier (10/18) Aug 25 2015 As long as it doesn't change from one release of the compiler to
- Warwick (13/17) Aug 25 2015 Probably because DMD is spewing out x87 code. The x87 FPU
- rumbu (13/19) Aug 25 2015 True word:
- Timon Gehr (2/4) Aug 25 2015 No, we don't. There are multiple platforms.
- Timon Gehr (3/7) Aug 25 2015 Oh, and multiple compilers. We don't "have" reproducibility unless it's
- Ola Fosheim Grøstad (4/9) Aug 25 2015 You don't get portable results for some builtin float functions,
- bachmeier (4/14) Aug 25 2015 I haven't looked at any of this in years. It sounds like the
- bachmeier (8/29) Aug 25 2015 That will work in this case (or maybe not, as Marcio's other post
- deadalnix (2/18) Aug 24 2015 http://www.smbc-comics.com/?id=2999
- Matthias Bentrup (8/21) Aug 25 2015 Internally the first case calculates x * 10.0 in real precision
I'm posting this here for visibility. This was silently corrupting our data, and might be doing the same for others as well.

    import std.stdio;

    void main()
    {
        double x = 1.2;
        writeln(cast(ulong)(x * 10.0));
        double y = 1.2 * 10.0;
        writeln(cast(ulong)y);
    }

Output:

    11
    12

to!ulong instead of the cast does the right thing, and is a viable work-around.

Issue: https://issues.dlang.org/show_bug.cgi?id=14958
Aug 24 2015
On Monday 24 August 2015 18:52, Márcio Martins wrote:
> import std.stdio;
>
> void main()
> {
>     double x = 1.2;
>     writeln(cast(ulong)(x * 10.0));
>     double y = 1.2 * 10.0;
>     writeln(cast(ulong)y);
> }
>
> Output:
> 11
> 12
>
> to!ulong instead of the cast does the right thing, and is a viable work-around.
>
> Issue: https://issues.dlang.org/show_bug.cgi?id=14958

1.2 is not representable exactly in binary. Try printing it with a lot of decimal places:

    writefln("%.20f", x); /* prints "1.19999999999999995559" */

Multiply that by 10: ~11.999; cast to ulong: 11.

Interestingly, printing x * 10.0 that way shows exactly 12:

    writefln("%.20f", x * 10.0); /* 12.00000000000000000000 */

But cast one operand to real and you're back at 11.9...:

    writefln("%.20f", cast(real)x * 10.0); /* 11.99999999999999955591 */

So, apparently, real precision is used in your code. This is not unexpected; compilers are allowed to use higher precision than requested for floating point operations. I think people have argued against it in the past, but so far Walter has been adamant about it being the right choice.
Aug 24 2015
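The representation claim in the post above is easy to check without relying on intermediate precision at all. The following is a minimal sketch, assuming a reasonably recent Phobos whose `format` handles `%a` and `%.20f` for `double`; the helper names `toDecimal20` and `toHexFloat` are made up for illustration and do not come from the thread.

```d
import std.format : format;
import std.stdio : writeln;

/// Hypothetical helper: decimal rendering with 20 fractional digits.
string toDecimal20(double v)
{
    return format("%.20f", v);
}

/// Hypothetical helper: hexadecimal rendering of the stored value.
string toHexFloat(double v)
{
    return format("%a", v);
}

void main()
{
    double x = 1.2;
    writeln(toDecimal20(x));   // the thread reports 1.19999999999999995559
    writeln(toHexFloat(x));    // the thread reports 0x1.3333333333333p+0
    writeln(toHexFloat(12.0)); // the thread reports 0x1.8p+3
}
```

Only values already stored in a `double` are printed here, so the result should not depend on whether the compiler evaluates expressions in 80-bit precision.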
On 8/24/15 12:52 PM, Márcio Martins <marcioapm gmail.com> wrote:
> I'm posting this here for visibility. This was silently corrupting our data, and might be doing the same for others as well.
>
> import std.stdio;
>
> void main()
> {
>     double x = 1.2;
>     writeln(cast(ulong)(x * 10.0));
>     double y = 1.2 * 10.0;
>     writeln(cast(ulong)y);
> }
>
> Output:
> 11
> 12

Yes. This is part of the issue of floating point. 1.2 cannot be represented accurately. The second case is done via real, not double, and at compile time (i.e. constant folding). There may be other reasons why this works.

You are better off adding a small epsilon:

    writeln(cast(ulong)(x * 10.0 + 0.1));

> to!ulong instead of the cast does the right thing, and is a viable work-around.

to!ulong likely adds the epsilon, but you'd have to look to be sure.

Note, this is NOT a D problem, this is a problem with floating point. And by problem, I mean feature-that-you-should-avoid :)

-Steve
Aug 24 2015
On Monday, 24 August 2015 at 17:26:12 UTC, Steven Schveighoffer wrote:
> On 8/24/15 12:52 PM, Márcio Martins <marcioapm gmail.com> wrote:
>> [...]
>> Output:
>> 11
>> 12
>
> Yes. This is part of the issue of floating point. 1.2 cannot be represented accurately. The second case is done via real, not double, and at compile time (i.e. constant folding). There may be other reasons why this works.
>
> You are better off adding a small epsilon:
>
>     writeln(cast(ulong)(x * 10.0 + 0.1));
>
>> to!ulong instead of the cast does the right thing, and is a viable work-around.
>
> to!ulong likely adds the epsilon, but you'd have to look to be sure.
>
> Note, this is NOT a D problem, this is a problem with floating point. And by problem, I mean feature-that-you-should-avoid :)
>
> -Steve

I am familiar with floating-point representations and their pitfalls, and I think that is not the issue here.

The issue I am trying to illustrate is the fact that the same exact operation returns different results. Both operations are x * 10.0, except one of them passes through the stack before the cast. I would expect this to be consistent, as I believe is the case in C/C++.
Aug 24 2015
On Monday, 24 August 2015 at 17:26:12 UTC, Steven Schveighoffer wrote:
> Note, this is NOT a D problem, this is a problem with floating point. And by problem, I mean feature-that-you-should-avoid :)
>
> -Steve

Visual C++ 19.00.23026, x86, x64:

    int _tmain(int argc, _TCHAR* argv[])
    {
        double x = 1.2;
        printf("%d\r\n", (unsigned long long)(x * 10.0));
        double y = 1.2 * 10.0;
        printf("%d\r\n", ((unsigned long long)y));
        return 0;
    }

Output:

    12
    12

Same output in debugger for an ARM Windows App.

    static void Main(string[] args)
    {
        double x = 1.2;
        WriteLine((ulong)(x * 10.0));
        double y = 1.2 * 10.0;
        WriteLine((ulong)y);
    }

Output:

    12
    12

Same output in debugger for ARM in all flavours (Android, iOS, Windows).

It seems like a D problem.
Aug 24 2015
BTW, 1.2 and 12.0 are directly representable as double.

In C++:

    printf("%.20f\r\n", 1.2);
    printf("%.20f\r\n", 12.0);

will output:

    1.20000000000000000000
    12.00000000000000000000

Either upcasting to real is the wrong decision here, or the writeln string conversion is wrong.
Aug 24 2015
On Mon, 24 Aug 2015 18:06:07 +0000, rumbu wrote:
> BTW, 1.2 and 12.0 are directly representable as double
>
> In C++:
>
>     printf("%.20f\r\n", 1.2);
>     printf("%.20f\r\n", 12.0);
>
> will output:
>
>     1.20000000000000000000
>     12.00000000000000000000
>
> Either upcasting to real is the wrong decision here, either the writeln string conversion is wrong.

No it's not, this must be some sort of constant-folding or precision increase.

    $ cat test.c
    #include "stdio.h"

    int main(int nargs, char** args)
    {
        double x = 1.2;
        printf("%.20f\n", x);
    }

    $ clang test.c && ./a.out
    1.19999999999999995559
Aug 24 2015
On Monday, 24 August 2015 at 18:16:44 UTC, Justin Whear wrote:
> On Mon, 24 Aug 2015 18:06:07 +0000, rumbu wrote:
>> BTW, 1.2 and 12.0 are directly representable as double
>> [...]
>
> No it's not, this must be some sort of constant-folding or precision increase.

Maybe the constant folding is using a different rounding mode to the runtime?
Aug 24 2015
On 08/24/2015 11:06 AM, rumbu wrote:
> BTW, 1.2 and 12.0 are directly representable as double

12 is but 1.2 is not.

> In C++:
>
>     printf("%.20f\r\n", 1.2);
>     printf("%.20f\r\n", 12.0);
>
> will output:
>
>     1.20000000000000000000
>     12.00000000000000000000
>
> Either upcasting to real is the wrong decision here, either the writeln string conversion is wrong.

Output is one thing. The issue is with the representation of 1.2. You need infinite digits. D's %a helps with visualizing it:

    import std.stdio;

    void main()
    {
        writefln("%a", 1.2);
        writefln("%a", 12.0);
    }

Outputs:

    0x1.3333333333333p+0
    0x1.8p+3

Ali
Aug 24 2015
On Monday, 24 August 2015 at 18:06:08 UTC, rumbu wrote:
> BTW, 1.2 and 12.0 are directly representable as double

12.0 is representable, but I'm pretty sure, if you work it out, 1.2 isn't.
Aug 24 2015
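"Working it out" can be done with integer arithmetic only. 1.2 is 6/5, and the factor 5 in the denominator is not a power of two, so the binary expansion repeats forever and has to be cut off at 53 significant bits. The sketch below reads the stored bits of the double and checks how far they are from an exact 6/5; `significand` and `exponent` are hypothetical helper names, and the code assumes the usual IEEE 754 binary64 layout for `double`.

```d
import std.stdio : writefln;

/// Hypothetical helper: 53-bit significand of a positive, normal double,
/// with the implicit leading 1 made explicit.
ulong significand(double v)
{
    immutable ulong bits = *cast(ulong*) &v;
    return (bits & ((1UL << 52) - 1)) | (1UL << 52);
}

/// Hypothetical helper: unbiased binary exponent of a normal double.
int exponent(double v)
{
    immutable ulong bits = *cast(ulong*) &v;
    return cast(int)((bits >> 52) & 0x7FF) - 1023;
}

void main()
{
    double x = 1.2;
    immutable ulong m = significand(x);
    // exponent(x) is 0, so the stored value is exactly m / 2^52.
    // An exact 1.2 = 6/5 would require 5 * m == 6 * 2^52.
    writefln("exponent = %d", exponent(x));
    writefln("5 * m    = %d", 5 * m);
    writefln("6 * 2^52 = %d", 6UL << 52);
}
```

The two printed products differ by one, so the stored value sits 1/(5 * 2^52), roughly 4.4e-17, below 1.2. For 12.0 the same helpers give a significand of 3 * 2^51 with exponent 3, which is exact.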
On 8/24/15 2:06 PM, rumbu wrote:
> BTW, 1.2 and 12.0 are directly representable as double
>
> In C++:
>
>     printf("%.20f\r\n", 1.2);
>     printf("%.20f\r\n", 12.0);
>
> will output:
>
>     1.20000000000000000000
>     12.00000000000000000000
>
> Either upcasting to real is the wrong decision here, either the writeln string conversion is wrong.

I don't think they are directly representable as floating point, because they have factors other than 2 in the decimal portion. From my understanding, anything that only has to do with powers of 2 is representable in floating point, just like you cannot represent 1/3 in decimal exactly.

But there is definitely something weird going on with the casting.

I wrote this program:

testfp.d:

    extern(C) void foo(double x);

    void main()
    {
        double x = 1.2;
        foo(x);
    }

testfp2.d:

    extern(C) void foo(double x)
    {
        import std.stdio;
        writeln(cast(ulong)(x * 10.0));
    }

testfp2.c:

    #include <stdio.h>

    void foo(double x)
    {
        printf("%lld\n", (unsigned long long)(x * 10));
    }

If I link testfp.d against testfp2.c, then it outputs 12. If I link against testfp2.d, it outputs 11.

I have faith that printf and writeln properly output ulongs. Something different happens with the cast. There can be no constant folding operations or optimizations going on here, as this is done via separate compilation.

I'll re-open the bug report.

-Steve
Aug 24 2015
On 8/24/15 2:38 PM, Steven Schveighoffer wrote:
> On 8/24/15 2:06 PM, rumbu wrote:
>> BTW, 1.2 and 12.0 are directly representable as double
>> [...]
>
> I don't think they are directly representable as floating point, because they are have factors other than 2 in the decimal portion. [...]
>
> But there is definitely something weird going on with the casting.
>
> I wrote this program: [...]
>
> If I link testfp.d against testfp2.c, then it outputs 12. If I link against testfp2.d, it outputs 11.

More data:

It definitely has something to do with the representation of 1.2 * 10.0 in *real*. I changed the code so that it writes the result of the multiplication to a shared double. In this case it *works* and prints 12, just like C does.

This also works:

    double x = 1.2;
    double y = x * 10.0;
    writeln(cast(ulong)y); // 12

However, change y to a real, and you get 11.

Note that if I first convert from real to double, then convert to ulong, it works.

This code:

    double x = 1.2;
    double x2 = x * 10.0;
    real y = x * 10.0;
    real y2 = x2;
    double y3 = y;
    writefln("%a, %a, %a", y, y2, cast(real)y3);

outputs:

    0xb.ffffffffffffep+0, 0xcp+0, 0xcp+0

So some rounding happens in the conversion from real to double, that doesn't happen in the conversion from real to ulong.

All this gets down to: FP cannot accurately represent decimal. Should this be fixed? Can it be fixed? I don't know. But I would be very cautious about converting anything FP to integers without some epsilon.

-Steve
Aug 24 2015
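The `%a` output above (0xb.ffffffffffffep+0 versus 0xcp+0) can be reproduced on paper. A sketch of that arithmetic follows, using integers only so that it does not depend on how the compiler evaluates floating point. It assumes the binary64 significand 0x13333333333333 for 1.2 (from the `%a` output earlier in the thread) and the x87 extended format with a 64-bit significand; the names `one`, `significandOf1_2`, `exactProduct` and `roundToSpacing` are illustrative, not from the thread.

```d
import std.stdio : writefln;

// Everything below is counted in units of 2^-52.
enum ulong one = 1UL << 52;
// Significand of the double nearest to 1.2 (0x1.3333333333333p+0).
enum ulong significandOf1_2 = 0x13333333333333UL;

/// Exact value of (double 1.2) * 10, in units of 2^-52.
ulong exactProduct()
{
    return significandOf1_2 * 10;
}

/// Round to the nearest multiple of `spacing`. Ties go up, which is
/// sufficient here because the value examined below is not a tie.
ulong roundToSpacing(ulong units, ulong spacing)
{
    immutable rem = units % spacing;
    return rem * 2 < spacing ? units - rem : units - rem + spacing;
}

void main()
{
    immutable p = exactProduct();
    writefln("gap below 12: %d units of 2^-52", 12 * one - p);
    // double: 53-bit significand, so values in [8, 16) are 2^-49 = 8 units apart.
    writefln("rounded to double, then truncated: %d", roundToSpacing(p, 8) / one);
    // x87 real: 64-bit significand, spacing 2^-60, so p is stored exactly.
    writefln("kept at real precision, then truncated: %d", p / one);
}
```

The exact product is 12 - 2^-51. That is a quarter of the double spacing below 12, so rounding to double lands on 12, while the 80-bit format can hold it exactly and truncation then yields 11. This matches the 12 versus 11 split reported in the posts.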
On Monday, 24 August 2015 at 18:59:58 UTC, Steven Schveighoffer wrote:
> All this gets down to: FP cannot accurately represent decimal. Should this be fixed? Can it be fixed? I don't know. But I would be very cautious about converting anything FP to integers without some epsilon.
>
> -Steve

I don't see anything that needs to be fixed, because I don't think anything is broken - there is nothing that violates my understanding of floating point precision in D. cast is not round. What is broken is a program that attempts to convert a double to an integer type using a cast rather than the functions that were written to do it correctly.
Aug 24 2015
On 8/24/15 3:15 PM, bachmeier wrote:
> On Monday, 24 August 2015 at 18:59:58 UTC, Steven Schveighoffer wrote:
>> All this gets down to: FP cannot accurately represent decimal. Should this be fixed? Can it be fixed? I don't know. But I would be very cautious about converting anything FP to integers without some epsilon.
>>
>> -Steve
>
> I don't see anything that needs to be fixed, because I don't think anything is broken - there is nothing that violates my understanding of floating point precision in D. cast is not round. What is broken is a program that attempts to convert a double to an integer type using a cast rather than the functions that were written to do it correctly.

What is surprising, and possibly buggy, is that none of these operations involve real, but the issue only happens because under the hood, real is used instead of double for the multiplication.

I pretty much agree with you that the code is written incorrectly. But it is unfortunate it differs in the way it handles this from C.

I think this issue has been brought up before on the newsgroup, especially where CTFE is involved.

-Steve
Aug 24 2015
On Mon, Aug 24, 2015 at 07:15:43PM +0000, bachmeier via Digitalmars-d wrote:
> On Monday, 24 August 2015 at 18:59:58 UTC, Steven Schveighoffer wrote:
>> All this gets down to: FP cannot accurately represent decimal. Should this be fixed? Can it be fixed? I don't know. But I would be very cautious about converting anything FP to integers without some epsilon.
>>
>> -Steve
>
> I don't see anything that needs to be fixed, because I don't think anything is broken - there is nothing that violates my understanding of floating point precision in D. cast is not round. What is broken is a program that attempts to convert a double to an integer type using a cast rather than the functions that were written to do it correctly.

+1. Floating-point != mathematical real numbers. Don't expect it to behave the same.

T

--
It's amazing how careful choice of punctuation can leave you hanging:
Aug 24 2015
On Monday, 24 August 2015 at 16:52:54 UTC, Márcio Martins wrote:
> I'm posting this here for visibility. This was silently corrupting our data, and might be doing the same for others as well.
>
> [...]
>
> Output:
> 11
> 12
>
> to!ulong instead of the cast does the right thing, and is a viable work-around.
>
> Issue: https://issues.dlang.org/show_bug.cgi?id=14958

I would not describe to!ulong as a "work-around". You just discovered one of the reasons to! exists: it is the right way to do it and cast(ulong) is the wrong way. As the others have noted, floating point is tricky business, and you need to use the right tools for the job.

std.math.round also works.
Aug 24 2015
On 8/24/15 1:43 PM, bachmeier wrote:
> On Monday, 24 August 2015 at 16:52:54 UTC, Márcio Martins wrote:
>> [...]
>> to!ulong instead of the cast does the right thing, and is a viable work-around.
>
> I would not describe to!ulong as a "work-around". You just discovered one of the reasons to! exists: it is the right way to do it and cast(ulong) is the wrong way. As the others have noted, floating point is tricky business, and you need to use the right tools for the job.

    real y = x * 10.0;
    writeln(y.to!ulong); // 11

to! does not do anything different than cast. What is happening here is the implicit cast from real to double. D treats the result of x * 10.0 as type double, but it's done at real precision. In that conversion, the error is hidden by a rounding automatically done by the processor I think.

-Steve
Aug 24 2015
On Monday, 24 August 2015 at 19:23:44 UTC, Steven Schveighoffer wrote:
>     real y = x * 10.0;
>     writeln(y.to!ulong); // 11
>
> to! does not do anything different than cast. What is happening here is the implicit cast from real to double. D treats the result of x * 10.0 as type double, but it's done at real precision. In that conversion, the error is hidden by a rounding automatically done by the processor I think.
>
> -Steve

Yes, I was mistaken. You have to use roundTo or std.math.round. to! and cast both truncate.
Aug 24 2015
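For the case where the nearest integer is what is actually wanted, the two functions named above can be used as in the following sketch. `std.conv.roundTo` and `std.math.round` are existing Phobos functions; the wrapper `nearestInteger` is a made-up name for illustration.

```d
import std.conv : roundTo, to;
import std.math : round;
import std.stdio : writeln;

/// Hypothetical wrapper: nearest integer, halves rounded away from zero.
ulong nearestInteger(double v)
{
    return roundTo!ulong(v);
}

void main()
{
    double x = 1.2;
    // Whether x * 10.0 is held as 12 or as 11.999..., the nearest integer is 12.
    writeln(nearestInteger(x * 10.0));
    writeln(round(x * 10.0).to!ulong);
}
```

Rounding is insensitive to the tiny error discussed in this thread, but it is a different operation from truncation: 11.5 becomes 12 here, whereas a truncating conversion would give 11.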
On Monday, 24 August 2015 at 19:23:44 UTC, Steven Schveighoffer wrote:
> On 8/24/15 1:43 PM, bachmeier wrote:
>> I would not describe to!ulong as a "work-around". [...]
>
>     real y = x * 10.0;
>     writeln(y.to!ulong); // 11
>
> to! does not do anything different than cast. What is happening here is the implicit cast from real to double. D treats the result of x * 10.0 as type double, but it's done at real precision. In that conversion, the error is hidden by a rounding automatically done by the processor I think.
>
> -Steve

Whatever the issue is, it is not unavoidable, because as has been shown, other languages do it correctly.

From the data presented so far, it seems like the issue is that the mul is performed in 80-bit precision, storing it before the cast forces a truncation down to 64-bit. Similarly, passing it to a function will also truncate to 64-bit, due to ABIs. This is why to! works as expected.

Please do keep in mind that the issue is not one of precision, but one of inconsistency. They are not the same thing. The result being 11 or 12 is irrelevant to this issue. It should just be the same for two instances of the same expression.

In an attempt to make things more obvious, consider this example, which also illustrates why to! works, despite apparently doing nothing extra at all.

    double noop(double z) { return z; }

    void main()
    {
        double x = 1.2;
        writeln(cast(ulong)(x * 10.0));
        writeln(cast(ulong)noop(x * 10.0));
    }

Outputs:

    11
    12
Aug 24 2015
On 8/24/15 4:34 PM, Márcio Martins <marcioapm gmail.com> wrote:
> On Monday, 24 August 2015 at 19:23:44 UTC, Steven Schveighoffer wrote:
>> [...]
>> to! does not do anything different than cast. What is happening here is the implicit cast from real to double. D treats the result of x * 10.0 as type double, but it's done at real precision. In that conversion, the error is hidden by a rounding automatically done by the processor I think.
>
> Whatever the issue is, it is not unavoidable, because as has been shown, other languages do it correctly.

Your other examples use doubles, not reals. It's not apples to apples.

> From the data presented so far, it seems like the issue is that the mul is performed in 80-bit precision, storing it before the cast forces a truncation down to 64-bit.

Not just truncation, rounding too.

> Similarly, passing it to a function will also truncate to 64-bit, due to ABIs. This is why to! works as expected.
>
> Please do keep in mind that the issue is not one of precision, but one of inconsistency.

It is an issue of precision. In order to change from real to double, some bits must be lost. Since certain numbers cannot be represented, the CPU must round or truncate.

> They are not the same thing. The result being 11 or 12 is irrelevant to this issue. It should just be the same for two instances of the same expression.

They are not the same expression. One goes from double through multiplication to real, then back to double, then to ulong. The other skips the real to double conversion and goes directly to ulong.

The real issue here is that you are not correctly converting from a floating point number to an integer.

> In an attempt to make things more obvious, consider this example, which also illustrates why to! works, despite apparently doing nothing extra at all.
>
>     double noop(double z) { return z; }
>
>     void main()
>     {
>         double x = 1.2;
>         writeln(cast(ulong)(x * 10.0));
>         writeln(cast(ulong)noop(x * 10.0));
>     }
>
> Outputs:
> 11
> 12

I understand the inconsistency, and I agree it is an issue that should be examined. But the issue is entirely avoidable by not using incorrect methods to convert from floating point to integer after floating point operations introduce some small level of error.

Perhaps there is some way to make it properly round in this case, but I guarantee it will not fix all floating point errors.

-Steve
Aug 24 2015
On 8/24/15 5:03 PM, Steven Schveighoffer wrote:
>> Whatever the issue is, it is not unavoidable, because as has been shown, other languages do it correctly.
>
> Your other examples use doubles, not reals. It's not apples to apples.

    #include <stdio.h>

    int main()
    {
        long double x = 1.2;
        x *= 10.0;
        printf("%lld\n", (unsigned long long)x);
    }

output:

    11

-Steve
Aug 24 2015
On Monday, 24 August 2015 at 21:03:50 UTC, Steven Schveighoffer wrote:
> On 8/24/15 4:34 PM, Márcio Martins <marcioapm gmail.com> wrote:
>> Whatever the issue is, it is not unavoidable, because as has been shown, other languages do it correctly.
>
> Your other examples use doubles, not reals. It's not apples to apples.

All my examples are doubles, and I have tested them all in C++ as well, using doubles. It is indeed apples to apples :)

>> From the data presented so far, it seems like the issue is that the mul is performed in 80-bit precision, storing it before the cast forces a truncation down to 64-bit.
>
> Not just truncation, rounding too.

What? If rounding was performed, then it would work as expected. i.e. both outputs would be 12.

>> Similarly, passing it to a function will also truncate to 64-bit, due to ABIs. This is why to! works as expected.
>>
>> Please do keep in mind that the issue is not one of precision, but one of inconsistency.
>
> It is an issue of precision. In order to change from real to double, some bits must be lost. Since certain numbers cannot be represented, the CPU must round or truncate.

There is no mention of real anywhere in any code. The intent is clearly stated in the code and while I accept precision and rounding errors, especially because DMD has no way to select a floating point model, that I am aware of, at least, it's very hard for me to accept the inconsistency.

>> They are not the same thing. The result being 11 or 12 is irrelevant to this issue. It should just be the same for two instances of the same expression.
>
> They are not the same expression. One goes from double through multiplication to real, then back to double, then to ulong. The other skips the real to double conversion and goes directly to ulong.

There is only 1 floating-point operation and one cast per expression. They are effectively the same except one value is stored in a temporary before casting. The intent expressed in the code is absolutely the same. All values are the same, operation order is the same, and types are all the same.

> The real issue here is that you are not correctly converting from a floating point number to an integer.
>
>> In an attempt to make things more obvious, consider this example, which also illustrates why to! works, despite apparently doing nothing extra at all.
>> [...]
>> Outputs:
>> 11
>> 12
>
> I understand the inconsistency, and I agree it is an issue that should be examined. But the issue is entirely avoidable by not using incorrect methods to convert from floating point to integer after floating point operations introduce some small level of error.
>
> Perhaps there is some way to make it properly round in this case, but I guarantee it will not fix all floating point errors.
>
> -Steve

What is the correct way to truncate, not round, a floating-point value to an integer?
Aug 24 2015
On Mon, Aug 24, 2015 at 09:34:22PM +0000, Márcio Martins via Digitalmars-d wrote:
[...]
> What is the correct way to truncate, not round, a floating-point value to an integer?

std.math.trunc.

T

--
Having a smoking section in a restaurant is like having a peeing section in a swimming pool. -- Edward Burr
Aug 24 2015
On Monday, 24 August 2015 at 22:12:42 UTC, H. S. Teoh wrote:
> On Mon, Aug 24, 2015 at 09:34:22PM +0000, Márcio Martins via Digitalmars-d wrote:
> [...]
>> What is the correct way to truncate, not round, a floating-point value to an integer?
>
> std.math.trunc.
>
> T

    import std.stdio;
    import std.math;

    void main()
    {
        double x = 1.2;
        writeln(std.math.trunc(x * 10.0));
        double y = x * 10.0;
        writeln(std.math.trunc(y));
    }

Outputs:

    11
    12
Aug 24 2015
On Monday, 24 August 2015 at 21:34:23 UTC, Márcio Martins wrote:
>>> Whatever the issue is, it is not unavoidable, because as has been shown, other languages do it correctly.

There's no guarantee that it will be done consistently or correctly in C or C++ to my knowledge. Some compilers will do it consistently, but it's absolutely not portable.

>> It is an issue of precision. In order to change from real to double, some bits must be lost. Since certain numbers cannot be represented, the CPU must round or truncate.
>
> There is no mention of real anywhere in any code. The intent is clearly stated in the code and while I accept precision and rounding errors, especially because DMD has no way to select a floating point model, that I am aware of, at least, it's very hard for me to accept the inconsistency.

It's fully consistent with what DMD claims to do:

http://dlang.org/portability.html

While a compiler can guarantee consistency, I don't know of any way to guarantee correctness, which makes the question of consistency irrelevant. There's no way to know what will happen when you run the program.

> What is the correct way to truncate, not round, a floating-point value to an integer?

If you can be an epsilon above or below the exact answer, there's no way to guarantee correctness unless you know you're not doing something that resembles integer operations. If the exact answer is 12.2 or 12.6, you can do it correctly. If it is 12.0 or 23.0, you can get the wrong answer.
Aug 24 2015
On 8/24/15 5:34 PM, Márcio Martins <marcioapm gmail.com> wrote:
> On Monday, 24 August 2015 at 21:03:50 UTC, Steven Schveighoffer wrote:
>> I understand the inconsistency, and I agree it is an issue that should be examined. But the issue is entirely avoidable by not using incorrect methods to convert from floating point to integer after floating point operations introduce some small level of error.
>>
>> Perhaps there is some way to make it properly round in this case, but I guarantee it will not fix all floating point errors.
>
> What is the correct way to truncate, not round, a floating-point value to an integer?

    auto result = cast(ulong)(x * 10.0 + x.epsilon);

-Steve
Aug 25 2015
On Tuesday, 25 August 2015 at 11:14:35 UTC, Steven Schveighoffer wrote:
> On 8/24/15 5:34 PM, Márcio Martins wrote:
>> On Monday, 24 August 2015 at 21:03:50 UTC, Steven Schveighoffer wrote:
>>> I understand the inconsistency, and I agree it is an issue that should be examined. But the issue is entirely avoidable by not using incorrect methods to convert from floating point to integer after floating point operations introduce some small level of error. Perhaps there is some way to make it properly round in this case, but I guarantee it will not fix all floating point errors.
>> What is the correct way to truncate, not round, a floating-point value to an integer?
> auto result = cast(ulong)(x * 10.0 + x.epsilon);
> -Steve

    import std.stdio;

    void main()
    {
        double x = 1.2;
        writeln(cast(ulong)(x * 10.0 + x.epsilon));
        double y = x * 10.0;
        writeln(cast(ulong)(y + x.epsilon));
        double z = x * 10.0 + x.epsilon;
        writeln(cast(ulong)(z));
    }

Outputs:

    11
    12
    12

I leave it at this. It seems like this only bothers me, and I have no more time to argue. The workaround is not that bad, and at the end of the day, it is just one more thing on the list.
Aug 25 2015
On Tuesday, 25 August 2015 at 13:51:18 UTC, Márcio Martins wrote:
> import std.stdio;
>
> void main()
> {
>     double x = 1.2;
>     writeln(cast(ulong)(x * 10.0 + x.epsilon));
>     double y = x * 10.0;
>     writeln(cast(ulong)(y + x.epsilon));
>     double z = x * 10.0 + x.epsilon;
>     writeln(cast(ulong)(z));
> }
>
> Outputs:
> 11
> 12
> 12
>
> I leave it at this. It seems like this only bothers me, and I have no more time to argue. The workaround is not that bad, and at the end of the day, it is just one more thing on the list.

What you are attempting to do is impossible. Is there a reason you can't use std.math.round, which is the tool that was made for the task?
Aug 25 2015
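A minimal sketch of the std.math.round suggestion above, for readers who can accept rounding to nearest instead of strict truncation. The helper name `scaleRounded` is hypothetical; `std.conv.roundTo` and `std.math.round` are existing Phobos functions. Note that this changes the semantics: a value such as 12.6 becomes 13, not 12.

```d
import std.conv : roundTo;
import std.math : round;
import std.stdio : writeln;

// Hypothetical helper: scale by ten, then round to the nearest integer.
ulong scaleRounded(double x)
{
    // roundTo rounds halves away from zero and throws if the result
    // does not fit in the target type (for example, a negative value).
    return roundTo!ulong(x * 10.0);
}

void main()
{
    double x = 1.2;
    writeln(scaleRounded(x));              // 12
    writeln(cast(ulong) round(x * 10.0));  // 12
}
```

Because the product is about 4.4e-16 below 12 at worst, rounding to nearest gives 12 whether the intermediate is kept in double or in real precision.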
On 8/25/15 9:51 AM, Márcio Martins wrote:
> On Tuesday, 25 August 2015 at 11:14:35 UTC, Steven Schveighoffer wrote:
>> On 8/24/15 5:34 PM, Márcio Martins wrote:
>>> On Monday, 24 August 2015 at 21:03:50 UTC, Steven Schveighoffer wrote:
>>>> I understand the inconsistency, and I agree it is an issue that should be examined. But the issue is entirely avoidable by not using incorrect methods to convert from floating point to integer after floating point operations introduce some small level of error. Perhaps there is some way to make it properly round in this case, but I guarantee it will not fix all floating point errors.
>>> What is the correct way to truncate, not round, a floating-point value to an integer?
>> auto result = cast(ulong)(x * 10.0 + x.epsilon);
> import std.stdio;
> void main()
> {
>     double x = 1.2;
>     writeln(cast(ulong)(x * 10.0 + x.epsilon));

Sorry, I misunderstood what epsilon was (I think it's the smallest incremental value for a given floating point type with an exponent of 1). Because your number is further away than this value, it doesn't help.

You need to add something to correct for the error that might exist. The best thing to do is to add a very small number, as that will only adjust truly close numbers. In this case, the number you could add is 0.1, since it's not going to affect anything other than a slightly-off value. It depends on where you expect the error to be.

As bachmeier says, it's not something that's easy to get right.

>     double y = x * 10.0;
>     writeln(cast(ulong)(y + x.epsilon));
>     double z = x * 10.0 + x.epsilon;
>     writeln(cast(ulong)(z));

These work because you have converted to double, which appears to round up.

-Steve
Aug 25 2015
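To illustrate why adding `x.epsilon` was not enough in the post above: `double.epsilon` is the gap between 1.0 and the next larger double (about 2.2e-16), while the extended-precision product discussed in this thread falls short of 12 by about 4.4e-16. The sketch below uses the existing `std.math.nextUp`; the helper name `ulpAbove` is hypothetical.

```d
import std.math : nextUp;
import std.stdio : writefln;

// Hypothetical helper: gap between v and the next larger double.
double ulpAbove(double v)
{
    return nextUp(v) - v;
}

void main()
{
    writefln("double.epsilon = %.3e", double.epsilon);
    writefln("gap above 1.0  = %.3e", ulpAbove(1.0));
    writefln("gap above 12.0 = %.3e", ulpAbove(12.0));
}
```

The spacing of doubles grows with magnitude (the gap above 12.0 is eight times `double.epsilon`), so a fixed `epsilon` offset is only meaningful for values near 1.0. Any corrective term has to be chosen from the error expected in the actual calculation.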
On Tuesday, 25 August 2015 at 14:54:41 UTC, Steven Schveighoffer wrote:
> On 8/25/15 9:51 AM, Márcio Martins wrote:
>> [...]
> Sorry, I misunderstood what epsilon was (I think it's the smallest incremental value for a given floating point type with an exponent of 1). Because your number is further away than this value, it doesn't help. You need to add something to correct for the error that might exist. The best thing to do is to add a very small number, as that will only adjust truly close numbers. In this case, the number you could add is 0.1, since it's not going to affect anything other than a slightly-off value. It depends on where you expect the error to be. As bachmeier says, it's not something that's easy to get right.
>> [...]
> these work because you have converted to double, which appears to round up.
> -Steve

I didn't convert to double. My computations are all in double to start with, as you can see from my explicit types everywhere.

If you compile it with *GDC* it works fine. If you compile a port with clang, gcc or msvc, it works right as well. I suspect it will also work fine with LDC.
Aug 25 2015
On Tuesday, 25 August 2015 at 15:19:41 UTC, Márcio Martins wrote:
> If you compile it with *GDC* it works fine. If you compile a port with clang, gcc or msvc, it works right as well. I suspect it will also work fine with LDC.

The same program "fails" in gcc too, if you use x87 math. Usually C compilers allow excess precision for intermediate results, because the extra precision seldom hurts and changing precision on x87 is very expensive (depends on the CPU, but it is more expensive than the trigonometric functions on some models).
Aug 25 2015
On Tuesday, 25 August 2015 at 21:21:59 UTC, Matthias Bentrup wrote:
> On Tuesday, 25 August 2015 at 15:19:41 UTC, Márcio Martins wrote:
>> If you compile it with *GDC* it works fine. If you compile a port with clang, gcc or msvc, it works right as well. I suspect it will also work fine with LDC.
> The same program "fails" in gcc too, if you use x87 math. Usually C compilers allow excess precision for intermediate results, because the extra precision seldom hurts and changing precision on x87 is very expensive (depends on the CPU, but it is more expensive than the trigonometric functions on some models).

That's because of floating point exception. It is very constraining for the hardware.
Aug 25 2015
On Tuesday, 25 August 2015 at 14:54:41 UTC, Steven Schveighoffer wrote:
> As bachmeier says, it's not something that's easy to get right.

Are you sure you follow IEEE 754 recommendations? Floating point arithmetic should be reproducible according to the chosen rounding mode.

https://en.wikipedia.org/wiki/IEEE_floating_point#Reproducibility

«The reproducibility clause recommends that language standards should provide a means to write reproducible programs (i.e., programs that will produce the same result in all implementations of a language), and describes what needs to be done to achieve reproducible results.»
Aug 25 2015
On 8/25/15 11:56 AM, Ola Fosheim Grøstad wrote:
> On Tuesday, 25 August 2015 at 14:54:41 UTC, Steven Schveighoffer wrote:
>> As bachmeier says, it's not something that's easy to get right.
> Are you sure you follow IEEE 754 recommendations? Floating point arithmetics should be reproducible according to the chosen rounding mode.
> https://en.wikipedia.org/wiki/IEEE_floating_point#Reproducibility
> «The reproducibility clause recommends that language standards should provide a means to write reproducible programs (i.e., programs that will produce the same result in all implementations of a language), and describes what needs to be done to achieve reproducible results.»

I'm not an expert on floating point, but I have written code that uses it, and I have gotten it very wrong because I didn't take into account the floating point error (worst was causing an infinite loop in a corner case).

I'll note that D does exactly what C does in the case where you are using 80-bit floating point numbers.

There is definitely an issue with the fact that storing it as a double causes a change in the behavior, and that D doesn't treat expressions that are typed as doubles, as doubles. I see an issue with this:

    double x = 1.2;
    auto y = x * 10.0; // typed as double
    writefln("%s %s", cast(ulong)y, cast(ulong)(x * 10.0)); // 12 11

IMO, these two operations should be the same. If the result of an expression is detected to be double, then it should behave like one. You can't have the calculation done in 80-bit mode, and then magically throw away the rounding to get to 64-bit mode.

I think Marcio has a point that this is both surprising and troublesome. But I think this is an anecdotal instance of a toy example. I'd expect real code to use adjustments when truncating to avoid the FP error (this obviously isn't his real code).

-Steve
Aug 25 2015
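One possible shape for the "adjustments when truncating" mentioned above, as a sketch only. The helper `truncWithTolerance` and its default tolerance of 1e-9 are assumptions made for illustration, not anything proposed in the thread. It also bakes in the assumption, criticized elsewhere in this discussion, that a value sufficiently close to an integer is meant to be that integer.

```d
import std.math : fabs, floor, round;
import std.stdio : writeln;

// Hypothetical helper: truncate a non-negative value, but treat anything
// within a relative tolerance of an integer as that integer.
// The default tolerance is an arbitrary assumption; choose it from the
// error you expect in your own calculation.
ulong truncWithTolerance(double v, double relTol = 1e-9)
{
    assert(v >= 0, "negative values cannot be converted to ulong");
    immutable double nearest = round(v);
    // Scale the tolerance with the magnitude, but never below 1.0.
    immutable double scale = nearest > 1.0 ? nearest : 1.0;
    if (fabs(v - nearest) <= relTol * scale)
        return cast(ulong) nearest;   // snap values like 11.9999999999999996
    return cast(ulong) floor(v);      // otherwise truncate as usual
}

void main()
{
    double x = 1.2;
    writeln(truncWithTolerance(x * 10.0)); // 12
    writeln(truncWithTolerance(12.6));     // 12
}
```

The snap step gives the same answer whether the argument arrives as 12.0 or as a value a few ulps below it, which is the inconsistency this thread is about. It does not make truncation exact in general.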
On Tuesday, 25 August 2015 at 17:40:06 UTC, Steven Schveighoffer wrote:
> I'll note that D does exactly what C does in the case where you are using 80-bit floating point numbers.

I don't think C specifies how it should be done, but some compilers have a "precise" compilation flag that is supposed to retain order and accurate intermediate rounding.

> IMO, these two operations should be the same. If the result of an expression is detected to be double, then it should behave like one. You can't have the calculation done in 80-bit mode, and then magically throw away the rounding to get to 64-bit mode.

Yes, that is rather obvious. IEEE 754-2008 goes much further than that, though. It requires that all arithmetic have correct rounding.

Yes, I am aware that the D specification allows higher precision, but it seems to me that this gets you neither predictable results nor maximum performance. And what is the point of being able to set the rounding mode if you don't know the bit width used?

It is a practical issue in all simulations where you want reproducible results. If D is meant for scientific computing it should support correct rounding and reproducible results. If D is meant for gaming it should provide ways of expressing minimum precision or other ways of loosening the accuracy where needed.

I'm not really sure which group the current semantics appeals to. I personally either want reproducible or very fast...
Aug 25 2015
On Tuesday, 25 August 2015 at 18:15:03 UTC, Ola Fosheim Grøstad wrote:
> It is a practical issue in all simulations where you want reproducible results. If D is meant for scientific computing it should support correct rounding and reproducible results. If D is meant for gaming it should provide ways of expressing minimum precision or other ways of loosening the accuracy where needed. I'm not really sure which group the current semantics appeals to. I personally either want reproducible or very fast...

As long as it doesn't change from one release of the compiler to the next, we have reproducibility. In many cases though, reproducibility doesn't mean exact reproducibility, at least in the old days it didn't, due to floating point issues. You generally want to allow for replication of the results using other languages, so you have to allow for some differences.

I'm pretty sure Walter has stated the reason that you cannot count on exact precision, but I don't remember what it is.
Aug 25 2015
On Tuesday, 25 August 2015 at 20:00:11 UTC, bachmeier wrote:
> On Tuesday, 25 August 2015 at 18:15:03 UTC, Ola Fosheim Grøstad wrote:
> I'm pretty sure Walter has stated the reason that you cannot count on exact precision, but I don't remember what it is.

Probably because DMD is spewing out x87 code. The x87 FPU converts everything to its internal working bit depth before it does the math op. You can set it to work at different bit depths, but IIRC it's a fairly expensive operation to change the FPU flags. You really don't want to be doing it every time someone mixes a double and a float.

The compilers that don't exhibit this problem might set the x87 to work at 64 bit at startup, or more likely they are using scalar SSE. You can't mix different depth operands in SSE. You can't multiply a float by a double, for example; you have to convert one of them so they have the same type. So in SSE the bit depth of every op is always explicit.
Aug 25 2015
On Tuesday, 25 August 2015 at 21:30:03 UTC, Warwick wrote:
> The compilers that dont exhibit this problem might set the x87 to work at 64 bit at startup or more likely they are using scalar SSE. You cant mix different depth operands in SSE. You cant multiply a float by double for example, you have to convert one of them so they have the same type. So in SSE the bit depth of every op is always explicit.

True word: This is msvc compiler generated code (default configuration, debug):

    double x = 1.2;
    012F174E  movsd  xmm0,mmword ptr ds:[12F6B30h]
    012F1756  movsd  mmword ptr [x],xmm0
    unsigned long long u = (unsigned long long)(x * 10);
    012F175B  movsd  xmm0,mmword ptr [x]
    012F1760  mulsd  xmm0,mmword ptr ds:[12F6B40h]
    012F1768  call   __dtoul3 (012F102Dh)
    012F176D  mov    dword ptr [u],eax
    012F1770  mov    dword ptr [ebp-18h],edx
Aug 25 2015
On 08/25/2015 10:00 PM, bachmeier wrote:
> As long as it doesn't change from one release of the compiler to the next, we have reproducibility.

No, we don't. There are multiple platforms.
Aug 25 2015
On 08/26/2015 12:46 AM, Timon Gehr wrote:
> On 08/25/2015 10:00 PM, bachmeier wrote:
>> As long as it doesn't change from one release of the compiler to the next, we have reproducibility.
> No, we don't. There are multiple platforms.

Oh, and multiple compilers. We don't "have" reproducibility unless it's in the spec, and the opposite is in the spec.
Aug 25 2015
On Tuesday, 25 August 2015 at 20:00:11 UTC, bachmeier wrote:
> to the next, we have reproducibility. In many cases though, reproducibility doesn't mean exact reproducibility, at least in the old days it didn't, due to floating point issues. You generally want to allow for replication of the results using other languages, so you have to allow for some differences.

You don't get portable results for some builtin float functions, but otherwise I believe the 2008 edition of IEEE is exact. The latest version of ECMAScript also uses the 2008 version of IEEE.
Aug 25 2015
On Tuesday, 25 August 2015 at 23:09:07 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 25 August 2015 at 20:00:11 UTC, bachmeier wrote:
>> to the next, we have reproducibility. In many cases though, reproducibility doesn't mean exact reproducibility, at least in the old days it didn't, due to floating point issues. You generally want to allow for replication of the results using other languages, so you have to allow for some differences.
> You don't get portable results for some builtin float functions, but otherwise I believe the 2008 edition of IEEE is exact. Latest version of ECMAScript also use the 2008 version of IEEE.

I haven't looked at any of this in years. It sounds like the situation is better now.
Aug 25 2015
On Tuesday, 25 August 2015 at 11:14:35 UTC, Steven Schveighoffer wrote:
> On 8/24/15 5:34 PM, Márcio Martins wrote:
>> On Monday, 24 August 2015 at 21:03:50 UTC, Steven Schveighoffer wrote:
>>> I understand the inconsistency, and I agree it is an issue that should be examined. But the issue is entirely avoidable by not using incorrect methods to convert from floating point to integer after floating point operations introduce some small level of error. Perhaps there is some way to make it properly round in this case, but I guarantee it will not fix all floating point errors.
>> What is the correct way to truncate, not round, a floating-point value to an integer?
> auto result = cast(ulong)(x * 10.0 + x.epsilon);
> -Steve

That will work in this case (or maybe not, as Marcio's other post shows) but it's still not a general solution. You're imposing the assumption that anything sufficiently close to an integer value is that integer. Truncating a floating point number is not a well-defined exercise because you only know an interval that holds the true value.
Aug 25 2015
On Monday, 24 August 2015 at 16:52:54 UTC, Márcio Martins wrote:
> I'm posting this here for visibility. This was silently corrupting our data, and might be doing the same for others as well.
>
> import std.stdio;
> void main()
> {
>     double x = 1.2;
>     writeln(cast(ulong)(x * 10.0));
>     double y = 1.2 * 10.0;
>     writeln(cast(ulong)y);
> }
>
> Output:
> 11
> 12
>
> to!ulong instead of the cast does the right thing, and is a viable work-around.
>
> Issue: https://issues.dlang.org/show_bug.cgi?id=14958

http://www.smbc-comics.com/?id=2999
Aug 24 2015
On Monday, 24 August 2015 at 16:52:54 UTC, Márcio Martins wrote:
> I'm posting this here for visibility. This was silently corrupting our data, and might be doing the same for others as well.
>
> import std.stdio;
> void main()
> {
>     double x = 1.2;
>     writeln(cast(ulong)(x * 10.0));
>     double y = 1.2 * 10.0;
>     writeln(cast(ulong)y);
> }
>
> Output:
> 11
> 12

Internally the first case calculates x * 10.0 in real precision and casts it to ulong in truncating mode directly. As 1.2 is not representable, x is really 1.199999999999999956 and the result is trunc(11.99999999999999956) = 11.

In the second case x * 10.0 is calculated in real precision, but first converted to double in round-to-nearest mode and then the result is truncated.
Aug 25 2015
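The two paths described above can be written out with explicit types, as a sketch. The function names are hypothetical. On x86 targets where `real` is the 80-bit x87 format, the explanation above predicts 11 for the first path and 12 for the second. On targets where `real` has the same width as `double`, both paths should agree. The language also permits intermediates to be held at higher precision, so the second result is not guaranteed everywhere.

```d
import std.stdio : writeln;

// Truncate the product while it is still held in real precision.
ulong truncViaReal(double x)
{
    real product = cast(real) x * 10.0L;
    return cast(ulong) product;
}

// Store the product in a double first (round to nearest), then truncate.
// Assumption: the compiler actually narrows the value when storing it.
ulong truncViaDouble(double x)
{
    double product = x * 10.0;
    return cast(ulong) product;
}

void main()
{
    // 64 mantissa bits means real is the x87 extended format.
    writeln("real.mant_dig = ", real.mant_dig);
    writeln(truncViaReal(1.2));
    writeln(truncViaDouble(1.2));
}
```

Printing `real.mant_dig` alongside the results makes it easier to tell which floating-point format produced a given output when comparing compilers or platforms.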