digitalmars.D - Loss of precision errors in FP conversions
- bearophile (38/38) Apr 19 2011 In Bugzilla I have just added an enhancement request that asks for a lit...
- dsimcha (20/58) Apr 19 2011 Please, _NOOOOOOO!!_ The integer conversion errors are already arguably...
- bearophile (15/18) Apr 19 2011 You are right, and I am ready to close that enhancement request at once ...
- dsimcha (11/16) Apr 19 2011 Very often, actually. Basically, any time I have a lot of floating
- Andrei Alexandrescu (3/7) Apr 19 2011 Yes please. I once felt the same way, but learned better since.
- bearophile (4/5) Apr 19 2011 OK :-) But I will probably write an article about this, because I have f...
- dsimcha (2/7) Apr 19 2011 ...or write a Lint tool for D.
- Robert Jacques (14/35) Apr 19 2011 I do GP GPU work, so I use floats all the time. They're also useful for ...
- bearophile (8/15) Apr 20 2011 If you compile D1 code that doesn't contain "real" types with 32-bit LDC...
- Robert Jacques (9/13) Apr 20 2011 IIRC, the Fermi Tesla cards do doubles at about 1/2 float speed, so
- Walter Bright (12/21) Apr 19 2011 That's definitely a worry. Having a Nagging Nellie giving false alarms o...
- bearophile (7/13) Apr 19 2011 Loss of some precision bits in normal FP operations is a fact of life, b...
- Walter Bright (2/9) Apr 19 2011 Yes, I've argued strongly against warnings.
- Brad Roberts (4/18) Apr 19 2011 The stronger argument, that I agree with, is not having flag based
- Walter Bright (5/8) Apr 19 2011 True, if you have N compiler switches, you have 2^N different compilers ...
- Sean Kelly (7/13) Apr 20 2011 compilers to test! Every switch added doubles the time it takes to =
- Walter Bright (3/6) Apr 20 2011 Currently I test with all combinations of switches that affect code gen....
- Jesse Phillips (4/6) Apr 19 2011 Loosing precision on a fractional number seems like it would be a very c...
In Bugzilla I have just added an enhancement request that asks for a little change in D. I don't know if it was already discussed or if it's already present in Bugzilla:
http://d.puremagic.com/issues/show_bug.cgi?id=5864

In a program like this:

    void main() {
        uint x = 10_000;
        ubyte b = x;
    }

DMD 2.052 raises a compilation error like this, because the b=x assignment may lose some information, some bits of x:

    test.d(3): Error: cannot implicitly convert expression (x) of type uint to ubyte

I think that a safe and good system language has to help avoid unwanted (implicit) loss of information during data conversions. This is a case of loss of precision where D generates no compile errors:

    import std.stdio;
    void main() {
        real f1 = 1.0000111222222222333;
        writefln("%.19f", f1);
        double f2 = f1; // loss of FP precision
        writefln("%.19f", f2);
        float f3 = f2; // loss of FP precision
        writefln("%.19f", f3);
    }

Some information is lost, as the output shows:

    1.0000111222222222332
    1.0000111222222223261
    1.0000110864639282226

So one possible way to face this situation is to statically disallow implicit double=>float, real=>float, and real=>double conversions (on some computers real=>double conversions don't cause loss of information, but I suggest ignoring this, to increase code portability), and introduce compile-time errors like:

    test.d(5): Error: cannot implicitly convert expression (f1) of type real to double
    test.d(7): Error: cannot implicitly convert expression (f2) of type double to float

Today float values seem less useful, because with serial CPU instructions the performance difference between operations on float and double is often not important, and often you want the precision of doubles. But modern CPUs (and current GPUs) have vector operations too: they can currently perform operations on 4 floats or 2 doubles (or 8 floats or 4 doubles) at the same time with each instruction. Such vector instructions are sometimes used directly in C-GCC code through SSE intrinsics, or come out of the auto-vectorization of loops that GCC performs on normal serial C code. In this situation the usage of float instead of double gives almost a twofold performance increase, and there are programs (like certain ray-tracing code) where the precision of a float is enough. So a compile-time error that catches currently implicit double->float conversions may help the programmer avoid unwanted usages of doubles, and so allow the compiler to pack 4/8 floats in a vector register during loop vectorization.

Partially related note: currently std.math doesn't seem to use the cosf and sinf C functions, but it does use sqrtf:

    import std.math: sqrt, sin, cos;
    void main() {
        float x = 1.0f;
        static assert(is(typeof( sqrt(x) ) == float)); // OK
        static assert(is(typeof( sin(x) ) == float));  // ERR
        static assert(is(typeof( cos(x) ) == float));  // ERR
    }

Bye,
bearophile
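To make the vectorization point concrete, here is a minimal sketch of the kind of hot loop where keeping everything in single precision matters. The function and array names are illustrative only, and whether a given backend actually vectorizes it depends on the compiler and flags:

    // A hypothetical inner loop: with float operands a vectorizing backend
    // can process 4 (SSE) or 8 (AVX) elements per instruction, while double
    // operands only fit 2 or 4 per register.
    void scaleAdd(float[] a, const float[] b, float k) {
        foreach (i; 0 .. a.length)
            a[i] = a[i] * k + b[i];   // every operand is float, so the whole loop stays in single precision
    }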
Apr 19 2011
On 4/19/2011 7:49 PM, bearophile wrote:
> [full proposal snipped; see the original post above]

Please, _NOOOOOOO!!_ The integer conversion errors are already arguably too pedantic, make generic code harder to write, and get in the way about as often as they help. Floating point tends to degrade much more gracefully than integers: where integer narrowing can be silently, non-obviously and completely wrong, floating point narrowing will at least be approximately right, or become infinity and be wrong in an obvious way. I know what you suggest could prevent bugs in a lot of cases, but it also has the potential to get in the way in a lot of cases.
Generally I worry about D's type system becoming like the Boy Who Cried Wolf, where it flags so many potential errors (as opposed to things that are definitely errors) that people become conditioned to just put in whatever casts they need to shut it up. I definitely fell into that when porting some 32-bit code that was sloppy with size_t vs. int to 64-bit. I knew there was no way it was going to be a problem, because there was no way any of my arrays were going to be even within a few orders of magnitude of int.max, but the compiler insisted on nagging me about it and I reflexively just put in casts everywhere. A warning _may_ be appropriate, but definitely not an error.
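For illustration, a minimal sketch of the kind of size_t vs. int nagging described above (the variable names are made up):

    void main() {
        int[] data = new int[100];
        // On 32-bit targets data.length (a size_t) fits in an int, but on
        // 64-bit targets size_t is 64 bits wide, so this line becomes an error:
        //     int n = data.length;   // Error: cannot implicitly convert size_t to int
        // and the reflexive fix that silences the compiler is a cast:
        int n = cast(int) data.length;
    }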
Apr 19 2011
dsimcha:
> I know what you suggest could prevent bugs in a lot of cases, but it also has the potential to get in the way in a lot of cases.

You are right, and I am ready to close that enhancement request at once if the consensus is against it.

double->float and real->float cases are not so common. How often do you use floats in your code? In my code it's uncommon to use floats, generally I use doubles.

A problem may be in real->double conversions, because I think D feels free to use intermediate real values in some FP computations.

Another possible problem: generic code like this is going to produce an error, because the 2.0 literal is a double, so x*2.0 is a double even if x is a float:

    T foo(T)(T x) {
        return x * 2.0;
    }

On the other hand, when I do use floats, I have found a C lint that spots double->float conversions useful, because it has actually allowed me to speed up some code that was doing float->double->float conversions without me being aware of it.

> A warning _may_ be appropriate, but definitely not an error.

Another option is a -warn_fp_precision_loss compiler switch, that produces warnings only when you use it. For my purposes this is enough.

Regarding the actual amount of trouble these error messages are going to cause, I have recently shown a link that argues for quantitative analysis of language changes:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=135049
The idea here is to introduce those three FP conversion errors, compile Phobos and some other D2 code, and count how many problems they cause.

Bye,
bearophile
Apr 19 2011
On 4/19/2011 8:37 PM, bearophile wrote:
> double->float and real->float cases are not so common. How often do you use floats in your code? In my code it's uncommon to use floats, generally I use doubles.

Very often, actually. Basically, any time I have a lot of floating point numbers that aren't going to be extremely big or small in magnitude and I'm interested in storing them and maybe performing a few _simple_ computations with them (sorting, statistical tests, most machine learning algorithms, etc.). Good examples are gene expression levels or transformations thereof, and probabilities. Single precision is plenty unless your numbers are extremely big or small, you need a ridiculous number of significant figures, or you're performing intense computations (for example matrix factorizations) where rounding error may accumulate and turn a small loss of precision into a catastrophic one.
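A small, self-contained illustration of the "rounding error may accumulate" case, where single precision stops being enough (the values here are just an example):

    import std.stdio;

    void main() {
        // Naive summation of ten million small values: the float accumulator
        // drifts visibly, while the double stays close to the exact 1_000_000.
        float  fsum = 0.0f;
        double dsum = 0.0;
        foreach (i; 0 .. 10_000_000) {
            fsum += 0.1f;
            dsum += 0.1;
        }
        writefln("float : %s", fsum);   // noticeably off (around 1.09e6)
        writefln("double: %s", dsum);   // approximately 1_000_000
    }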
Apr 19 2011
On 4/19/11 7:37 PM, bearophile wrote:
> You are right, and I am ready to close that enhancement request at once if the consensus is against it.

Yes please. I once felt the same way, but learned better since.

Andrei
Apr 19 2011
Andrei:
> Yes please. I once felt the same way, but learned better since.

OK :-) But I will probably write an article about this, because I have found a performance problem that DMD currently doesn't help me avoid, but that a C lint did.

Bye,
bearophile
Apr 19 2011
On 4/19/2011 10:42 PM, bearophile wrote:
> OK :-) But I will probably write an article about this, because I have found a performance problem that DMD currently doesn't help me avoid, but that a C lint did.

...or write a Lint tool for D.
Apr 19 2011
On Tue, 19 Apr 2011 20:37:47 -0400, bearophile <bearophileHUGS lycos.com> wrote:
> double->float and real->float cases are not so common. How often do you use floats in your code? In my code it's uncommon to use floats, generally I use doubles.

I do GP GPU work, so I use floats all the time. They're also useful for data storage purposes.

> A problem may be in real->double conversions, because I think D feels free to use intermediate real values in some FP computations.

For your information, the x87 can only perform computations at 80 bits, so all intermediate values are computed as reals; it's just how the hardware works. I know some compilers (e.g. VS) let you set a flag which basically causes the system to avoid intermediate values altogether, or to use SIMD instructions instead, in order to be properly compliant.

> Another possible problem: generic code like this is going to produce an error, because the 2.0 literal is a double, so x*2.0 is a double even if x is a float:
>
>     T foo(T)(T x) {
>         return x * 2.0;
>     }
>
> On the other hand, when I do use floats, I have found a C lint that spots double->float conversions useful, because it has actually allowed me to speed up some code that was doing float->double->float conversions without me being aware of it.

Yes, this auto-promotion of literals is very annoying, and it would be nice if constants could smartly match the expression type. By the way, C/C++ also behave this way, which has gotten me into the habit of adding f after all my floating point constants.
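A minimal sketch of how those 80-bit intermediates can become visible; the exact output depends on whether the compiler emits x87 or SSE code and on optimization flags, so treat this as illustrative only:

    import std.stdio;

    void main() {
        float big   = 16_777_216.0f; // 2^24: beyond this, float can't represent every integer
        float small = 1.0f;
        // With strict single-precision math, big + small rounds back to big and r is 0.
        // If the subexpression is kept in an 80-bit x87 register, the 1 survives and r is 1.
        float r = (big + small) - big;
        writeln(r);
    }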
Apr 19 2011
Robert Jacques:
> I do GP GPU work, so I use floats all the time. They're also useful for data storage purposes.

Today GPUs are just starting to manage doubles efficiently (Tesla?).

> For your information, the x87 can only perform computations at 80 bits.

If you compile D1 code that doesn't contain "real" types with 32-bit LDC, it uses SSE instructions by default (just 8 registers), which means most computations are done with 64-bit doubles. And in real programs, that use trigonometry etc., this is not the whole story.

> Yes, this auto-promotion of literals is very annoying, and it would be nice if constants could smartly match the expression type.

Polysemous literals in general (here just the floating point ones) have been discussed several times in the past, but I don't know whether floating point polysemous literals can be implemented well, and what consequences they would have on D code. Maybe Don is able to give a good comment on this.

> By the way, C/C++ also behave this way, which has gotten me into the habit of adding f after all my floating point constants.

I presume that if you take a good amount of care, in C (and probably in D too) you are able to avoid the performance problems I was talking about. But given a perfect programmer most warnings become useless :-) Warnings are usually meant for programmers that make mistakes, don't know enough yet, miss things, etc.

Bye,
bearophile
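A tiny example of the literal promotion being discussed; as far as I can tell this is how current D types the expressions:

    void main() {
        float x = 1.0f;
        static assert(is(typeof(x * 2.0)  == double)); // the double literal promotes the whole expression
        static assert(is(typeof(x * 2.0f) == float));  // the f suffix keeps it in single precision
    }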
Apr 20 2011
On Wed, 20 Apr 2011 06:23:01 -0400, bearophile <bearophileHUGS lycos.com> wrote:
> Today GPUs are just starting to manage doubles efficiently (Tesla?).

IIRC, the Fermi Tesla cards do doubles at about 1/2 float speed, so relative to other GPUs they are both the most efficient and have the best performance. So if there was something that really needed doubles, I'd use them. But doubles require double the registers and double the memory bandwidth, both of which make dramatic performance differences (depending on the code). So I'll be sticking to floats and halves for the foreseeable future.
Apr 20 2011
On 4/19/2011 5:02 PM, dsimcha wrote:
> Generally I worry about D's type system becoming like the Boy Who Cried Wolf, where it flags so many potential errors (as opposed to things that are definitely errors) that people become conditioned to just put in whatever casts they need to shut it up. [...] A warning _may_ be appropriate, but definitely not an error.

That's definitely a worry. Having a Nagging Nellie giving false alarms on errors too often will:

1. cause programmers to hate D
2. lead to MORE bugs, and harder to find ones, as Bruce Eckel pointed out, because people will put in things "just to shut up the compiler"

Hence my reluctance to add in a lot of these suggestions.

As to the specific proposal of erroring on reduced precision, my talks with people who actually write a lot of FP code for a living say NO, they don't want it. Losing precision in FP calculations is a fact of life, and FP programmers simply must understand it and deal with it. Having the compiler annoy you about it would be less than helpful.
Apr 19 2011
Walter:
> Hence my reluctance to add in a lot of these suggestions.

In an earlier answer I've suggested the alternative solution of a -warn_fp_precision_loss compiler switch, that produces warnings only when you use it. In theory this avoids most of the Nagging Nellie problem, because you use this switch only in special situations. But I am aware you generally don't like warnings.

> As to the specific proposal of erroring on reduced precision, my talks with people who actually write a lot of FP code for a living say NO, they don't want it. Losing precision in FP calculations is a fact of life, and FP programmers simply must understand it and deal with it. Having the compiler annoy you about it would be less than helpful.

The loss of some precision bits in normal FP operations is a fact of life, but double->float conversions usually lose a much more significant amount of precision, and that's not a fact of life: it's the code that in some way asks for this irreversible conversion.

A related problem your answer doesn't take into account is unwanted float->double conversions (which get spotted by those error messages just because the code then performs a float->double->float conversion). Such unwanted conversions have caused a performance loss in some of my C code on a 32-bit CPU (maybe this problem is not present in 64-bit code), because the code was actually using doubles. A C lint has allowed me to spot such problems and fix them.

Thank you for your answers,
bye,
bearophile
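As a hedged sketch of the pattern that causes such a slowdown (this is illustrative code, not the original C):

    // Every iteration promotes x to double because the 0.5 literal is a double,
    // multiplies in double precision, then truncates the result back to float.
    float[] halve(float[] data) {
        foreach (ref x; data)
            x = x * 0.5;       // float -> double -> float on each element
        return data;
    }
    // Writing 0.5f keeps the whole loop in single precision; the proposed
    // -warn_fp_precision_loss switch would point at the implicit narrowing here.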
Apr 19 2011
On 4/19/2011 6:46 PM, bearophile wrote:
> In an earlier answer I've suggested the alternative solution of a -warn_fp_precision_loss compiler switch, that produces warnings only when you use it. In theory this avoids most of the Nagging Nellie problem, because you use this switch only in special situations. But I am aware you generally don't like warnings.

Yes, I've argued strongly against warnings.
Apr 19 2011
On Tue, 19 Apr 2011, Walter Bright wrote:
> Yes, I've argued strongly against warnings.

The stronger argument, the one I agree with, is against having flag-based sometimes-on warnings. The more flags you have, the more complex the matrix of landmines there are. I hate micro-management, in all its forms.
Apr 19 2011
On 4/19/2011 7:11 PM, Brad Roberts wrote:
> The stronger argument, the one I agree with, is against having flag-based sometimes-on warnings. The more flags you have, the more complex the matrix of landmines there are. I hate micro-management, in all its forms.

True: if you have N compiler switches, you have 2^N different compilers to test! Every switch added doubles the time it takes to validate the compiler. And if you have N warnings that can be independently toggled, you have 2^N different languages.
Apr 19 2011
On Apr 19, 2011, at 11:04 PM, Walter Bright wrote:
> On 4/19/2011 7:11 PM, Brad Roberts wrote:
>> The stronger argument, the one I agree with, is against having flag-based sometimes-on warnings. The more flags you have, the more complex the matrix of landmines there are. I hate micro-management, in all its forms.
>
> True: if you have N compiler switches, you have 2^N different compilers to test! Every switch added doubles the time it takes to validate the compiler.

Software testing theory has suggestions for how to reduce the number of test cases here with only a small sacrifice in general error detection. Still, the fewer switches the better :-)
Apr 20 2011
On 4/20/2011 9:28 AM, Sean Kelly wrote:
> Software testing theory has suggestions for how to reduce the number of test cases here with only a small sacrifice in general error detection. Still, the fewer switches the better :-)

Currently I test with all combinations of switches that affect code generation. Sometimes, it will unexpectedly catch an odd interaction.
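For what it's worth, a toy sketch of enumerating those 2^N switch combinations for a test run (the switch names here are just examples):

    import std.array : join;
    import std.stdio;

    void main() {
        // Every subset of the codegen switches is one compiler configuration to test.
        auto switches = ["-O", "-inline", "-release"];
        foreach (mask; 0 .. 1 << switches.length) {
            string[] combo;
            foreach (i, s; switches)
                if (mask & (1 << i))
                    combo ~= s;
            writeln("dmd ", join(combo, " "), " test.d");
        }
    }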
Apr 20 2011
bearophile Wrote:
> In Bugzilla I have just added an enhancement request that asks for a little change in D. I don't know if it was already discussed or if it's already present in Bugzilla:
> http://d.puremagic.com/issues/show_bug.cgi?id=5864

Losing precision on a fractional number seems like it would be a very common and desirable case. For one thing, even dealing with real doesn't mean you won't have rounding errors, and you end up in very tricky areas before the choice of float over double actually causes issues in a program. I'm not sure this would be the best avenue for identifying possible performance issues with FP conversions.

And last, don't forget about significant figures: this suggests it would be better to produce an error when widening, since the narrower value really doesn't have the precision of a double or real.
Apr 19 2011