digitalmars.D.learn - Why use float and double instead of real?

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

Is there ever any reason to use float or double in calculations? I mean, 
when does one *not* want maximum precision? Will code using float or 
double run faster than code using real?

I understand they are needed for I/O and compatibility purposes, so I am 
by no means suggesting they be removed from the language. I am merely 
asking out of curiosity.

-Lars

Jun 23 2009

Witold Baryluk <baryluk smp.if.uj.edu.pl> writes:

On Tue, 2009-06-23 at 14:44 +0200, Lars T. Kyllingstad wrote:
 Is there ever any reason to use float or double in calculations? I mean, 
 when does one *not* want maximum precision? Will code using float or 
 double run faster than code using real?

yes, they are faster and smaller, and accurate enough.

they can also be used with SSE.

reals can be very nondeterministic. for example:

if (f(x) != f(y)) { assert(x != y, "Boom!"); } // it will explode

 I understand they are needed for I/O and compatibility purposes, so I am 
 by no means suggesting they be removed from the language. I am merely 
 asking out of curiosity.

float and double types conform to the IEEE 754 standard. the real type does not.
and many applications (scientific computations, simulations, interval
arithmetic) absolutely need IEEE 754 semantics (correct rounding, known
error behaviour, and so on). additionally,
real has varying precision and varying size across platforms,
or is just not supported.

if you need very high precision (and still want some knowledge about
what the maximal error is), you can use double-double or quad-double
(a structure of 2 or 4 doubles).

I have implemented them in D, but they are quite slow.
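
For readers unfamiliar with the double-double idea, here is a minimal sketch in D of its basic building block, assuming Knuth's TwoSum is used to capture the rounding error of an addition. The names DoubleDouble, twoSum and add are made up for this illustration; this is not the implementation mentioned above. It also assumes that double arithmetic is really rounded to 64 bits after each operation (as with SSE2 on x86-64). If the compiler keeps intermediates in 80-bit registers, the error term may come out wrong.

```d
/// Hypothetical double-double value: represents hi + lo,
/// where lo holds the bits that did not fit into hi.
struct DoubleDouble
{
    double hi;
    double lo;
}

/// Knuth's TwoSum: s is the rounded sum and e the rounding error,
/// so that a + b == s + e exactly (barring overflow).
DoubleDouble twoSum(double a, double b)
{
    immutable double s = a + b;
    immutable double bb = s - a;
    immutable double e = (a - (s - bb)) + (b - bb);
    return DoubleDouble(s, e);
}

/// Simplified double-double addition. The low parts are added naively,
/// so this is less accurate than a full implementation would be.
DoubleDouble add(DoubleDouble x, DoubleDouble y)
{
    immutable t = twoSum(x.hi, y.hi);
    immutable double lo = t.lo + x.lo + y.lo;
    return twoSum(t.hi, lo); // renormalize so that lo stays small
}
```

Each double-double addition costs several ordinary additions, which is one plausible reason such types end up much slower than plain doubles.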

 -Lars

Jun 23 2009

BCS <none anon.com> writes:

Hello Witold,

 On Tue, 2009-06-23 at 14:44 +0200, Lars T. Kyllingstad
 wrote:
 
 Is there ever any reason to use float or double in calculations? I
 mean, when does one *not* want maximum precision? Will code using
 float or double run faster than code using real?
 

 yes, they are faster and smaller, and accurate enough.

IIRC on most systems real will only be slower as a result of I/O costs. For 
example on x86 the FPU only computes using 80-bit. 

 float and double types conform to the IEEE 754 standard. the real type does not.

I think you are in error here. IIRC IEEE-754 has some stuff about "extended 
precision" values that work like the normal types but with more bits. That 
is what 80 bit reals are. If you force rounding to 64-bits after each op, 
I think things will come out exactly the same as for a 64-bit FPU. 

 and many applications (scientific computations, simulations, interval
 arithmetic) absolutely need IEEE 754 semantics (correct rounding, known
 error behaviour, and so on).

 additionally,
 real has varying precision and varying size across platforms,
 or is just not supported.

reals are /always/ supported if the platform supports FP, even if only with 
16-bit FP types.

Jun 23 2009

Witold Baryluk <baryluk smp.if.uj.edu.pl> writes:

On Tue, 2009-06-23 at 16:01 +0000, BCS wrote:

 I think you are in error here. IIRC IEEE-754 has some stuff about "extended 
 precision" values that work like the normal types but with more bits. That 
 is what 80 bit reals are. If you force rounding to 64-bits after each op, 
 I think things will come out exactly the same as for a 64-bit FPU. 
 

this is exactly the same thing the cpu already does when dealing with
doubles and floats. internal computations are performed in extended
precision, and then written out somewhere, truncating to 64 bits.

 and many applications (scientific computations, simulations, interval
 arithmetic) absolutely need IEEE 754 semantics (correct rounding, known
 error behaviour, and so on).

 
 additionally,
 real has varying precision and varying size across platforms,
 or is just not supported.

 
 reals are /always/ supported if the platform supports FP, even if only with 
 16-bit FP types.

yes, you are absolutely right. i was thinking about reals which are
mapped to something bigger than double precision.

I sometimes use reals for intermediate values, for example
when summing a large number of values. One can also use Kahan's algorithm.
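
As an illustration of the two options just mentioned, here is a rough sketch in D: Kahan (compensated) summation that stays in double, next to a plain loop that accumulates in real. The function names kahanSum and realSum are made up for this example. The Kahan version assumes the compiler does not reorder or widen the double operations; the real version only helps on platforms where real is wider than double.

```d
/// Kahan (compensated) summation of a slice of doubles.
double kahanSum(const(double)[] values)
{
    double sum = 0.0;
    double c = 0.0; // running compensation for lost low-order bits
    foreach (v; values)
    {
        immutable double y = v - c;   // apply the correction to the next term
        immutable double t = sum + y; // low-order bits of y may be lost here
        c = (t - sum) - y;            // recover what was lost
        sum = t;
    }
    return sum;
}

/// Plain summation with a real accumulator; the caller decides
/// whether and when to round the result back to double.
real realSum(const(double)[] values)
{
    real sum = 0.0L;
    foreach (v; values)
        sum += v;
    return sum;
}
```

With the input [1e16, 1.0, 1.0], a naive double loop drops both small terms because each one falls on a rounding tie, whereas the compensated loop carries them along in c until they can be added.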

Jun 23 2009

BCS <ao pathlink.com> writes:

Reply to Witold,

 On Tue, 2009-06-23 at 16:01 +0000, BCS wrote:
 
 I think you are in error here. IIRC IEEE-754 has some stuff about
 "extended precision" values that work like the normal types but with
 more bits. That is what 80 bit reals are. If you force rounding to
 64-bits after each op, I think things will come out exactly the same
 as for a 64-bit FPU.
 

 this is exactly the same thing the cpu already does when dealing
 with doubles and floats. internal computations are performed in extended
 precision, and then written out somewhere, truncating to 64 bits.
 

You misread me. If you need computation to exactly match 32- or 64-bit math, 
you will need to round after every single operation (+, -, *, /, etc.). What 
most systems do is use full internal precision for intermediate values and 
round only when the value is stored to a variable. If you don't need bit-for-bit 
matches, then 80-bit matches IEEE-754 semantics, just with more bits of precision.

Jun 23 2009

Witold Baryluk <baryluk smp.if.uj.edu.pl> writes:

On Tue, 2009-06-23 at 17:14 +0000, BCS wrote:
 Reply to Witold,

 this is exactly the same thing the cpu already does when dealing
 with doubles and floats. internal computations are performed in extended
 precision, and then written out somewhere, truncating to 64 bits.
 

 
 You misread me. If you need computation to exactly match 32- or 64-bit math, 
 you will need to round after every single operation (+, -, *, /, etc.). What 
 most systems do is use full internal precision for intermediate values and 
 round only when the value is stored to a variable. If you don't need bit-for-bit 
 matches, then 80-bit matches IEEE-754 semantics, just with more bits of precision.
 
 

We both know this, so EOT. :)

Jun 23 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Tue, Jun 23, 2009 at 8:44 AM, Lars T. Kyllingstad
<public kyllingen.nospamnet> wrote:
 Is there ever any reason to use float or double in calculations? I mean,
 when does one *not* want maximum precision? Will code using float or double
 run faster than code using real?

As Witold mentioned, float and double are the only types SSE (and
similar SIMD instruction sets on other architectures) can deal with.
Furthermore most 3D graphics hardware only uses single or even
half-precision (16-bit) floats, so it makes no sense to use 64- or
80-bit floats in those cases.

Also keep in mind that 'real' is simply defined as the largest
supported floating-point type.  On x86, that's an 80-bit real, but on
most other architectures, it's the same as double anyway.
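
To see what real maps to on a particular machine, the built-in type properties can be queried. The helper below is a small sketch, and its name describe is made up for this example; .sizeof and .mant_dig are standard properties of D's floating-point types.

```d
import std.format : format;

/// Returns a one-line summary of a floating-point type's storage size
/// and mantissa width, as reported by the compiler for this platform.
string describe(T)()
    if (__traits(isFloating, T))
{
    return format("%s: %s bytes, %s mantissa bits",
                  T.stringof, T.sizeof, T.mant_dig);
}
```

Calling describe!real() on a target where real is the x87 80-bit format should report 64 mantissa bits, while a target where real is the same as double should report 53. The byte count can include ABI padding, so it is not a reliable indicator of precision.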

Jun 23 2009

Don <nospam nospam.com> writes:

Lars T. Kyllingstad wrote:
 Is there ever any reason to use float or double in calculations? I mean, 
 when does one *not* want maximum precision? Will code using float or 
 double run faster than code using real?
 
 I understand they are needed for I/O and compatibility purposes, so I am 
 by no means suggesting they be removed from the language. I am merely 
 asking out of curiosity.
 
 -Lars

Size. Since modern CPUs are memory-bandwidth limited, it's always going 
to be MUCH faster to use float[] instead of real[] once the array size 
gets too big to fit in the cache. Maybe around 2000 elements or so.

Rule of thumb: use real for temporary values, use float or double for 
arrays.
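
A minimal sketch of that rule of thumb in D, assuming a dot product as the example computation (the function name dot is made up for this illustration): the data lives in compact float arrays, and only the running total is kept in real.

```d
/// Dot product of two float slices, accumulated in real.
/// The arrays stay small in memory; only the temporary is wide.
real dot(const(float)[] a, const(float)[] b)
{
    assert(a.length == b.length, "slices must have equal length");
    real acc = 0.0L;
    foreach (i; 0 .. a.length)
        acc += cast(real) a[i] * b[i]; // widen before multiplying
    return acc;
}
```

The caller can round the returned value to float or double at the very end, so the extra precision costs no additional memory traffic.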

Jul 01 2009

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

Don wrote:
 Lars T. Kyllingstad wrote:
 Is there ever any reason to use float or double in calculations? I 
 mean, when does one *not* want maximum precision? Will code using 
 float or double run faster than code using real?

 I understand they are needed for I/O and compatibility purposes, so I 
 am by no means suggesting they be removed from the language. I am 
 merely asking out of curiosity.

 -Lars

 
 Size. Since modern CPUs are memory-bandwidth limited, it's always going 
 to be MUCH faster to use float[] instead of real[] once the array size 
 gets too big to fit in the cache. Maybe around 2000 elements or so.
 
 Rule of thumb: use real for temporary values, use float or double for 
 arrays.


The reason I'm asking is that I've templated the numerical routines I've 
written, so that the user can choose which floating-point type to use. 
Then I started wondering whether I should in fact always use real for 
temporary values inside the routines, for precision's sake, or whether 
this would reduce performance significantly.

From the answers I've gotten to my question (thanks everyone, BTW!), 
it's not immediately clear to me what the best choice is in general. 
(Perhaps it would be best to have two template parameters, one for 
input/output precision and one for working precision?)
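
One way the two-parameter idea could look in D is sketched below. The routine mean and the parameter names Storage and Working are hypothetical, chosen only for this illustration; Working defaults to real so that callers who do not care get the widest accumulator.

```d
/// Hypothetical example routine: arithmetic mean of a non-empty slice.
/// Storage is the input/output type, Working the accumulator type.
Storage mean(Storage, Working = real)(const(Storage)[] data)
{
    assert(data.length > 0, "mean of an empty slice is undefined");
    Working acc = 0;
    foreach (x; data)
        acc += x;
    return cast(Storage)(acc / data.length);
}
```

A call such as mean!float(values) would then accumulate in real, while mean!(float, double)(values) trades some precision for a narrower working type. Whether the difference is measurable would have to be benchmarked per routine.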

Functions in std.math are defined in a lot of different ways:
   - separate overloaded functions for float, double and real
   - like the above, only float and double versions cast to real
     and call real version
   - only real version
   - templated

Is there some rationale behind these choices?

-Lars

Jul 01 2009

BCS <none anon.com> writes:

Hello Don,

 Size. Since modern CPUs are memory-bandwidth limited, it's always
 going to be MUCH faster to use float[] instead of real[] once the
 array size gets too big to fit in the cache. Maybe around 2000
 elements or so.

I was under the impression that the memory bus could feed the CPU at least 
as fast as the CPU could process data but just with huge latency. Based on 
that, it's not how much data is loaded (bandwidth) but how many places it's 
loaded from. Is my initial assumption wrong or am I just nit picking?

Jul 01 2009

Don <nospam nospam.com> writes:

BCS wrote:
 Hello Don,
 
 Size. Since modern CPUs are memory-bandwidth limited, it's always
 going to be MUCH faster to use float[] instead of real[] once the
 array size gets too big to fit in the cache. Maybe around 2000
 elements or so.

 
 I was under the impression that the memory bus could feed the CPU at 
 least as fast as the CPU could process data but just with huge latency. 
 Based on that, it's not how much data is loaded (bandwidth) but how many 
 places it's loaded from. Is my initial assumption wrong or am I just nit 
 picking?
 

Intel Core2 can only perform one load per cycle, but can do one floating 
point add per cycle.
So in something like a[] += b[], you're limited by memory bandwidth even 
when everything is in the L1 cache.
But in practice, performance is usually dominated by cache misses.
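
To put rough numbers on this, the sketch below computes the memory footprint of an array and shows the a[] += b[] operation as a D array expression. The names footprint and addInPlace are made up for this illustration. Assuming real.sizeof is 16 (common on x86-64, but ABI dependent), 2000 reals occupy 32,000 bytes, which is about the size of a 32 KB L1 data cache, while 2000 floats need only 8,000 bytes.

```d
/// Bytes occupied by n elements of type T.
size_t footprint(T)(size_t n)
{
    return n * T.sizeof;
}

/// Element-wise a[i] += b[i], written as a D array operation.
/// Every element needs two loads and one store but only one add.
void addInPlace(T)(T[] a, const(T)[] b)
{
    assert(a.length == b.length, "slices must have equal length");
    a[] += b[];
}
```

Because the loop does so little arithmetic per byte moved, halving or quartering the element size is the main lever available for making it faster.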

Jul 02 2009
