digitalmars.D.learn - Some performance questions

Lars Kyllingstad (47/47) Feb 02 2009 I have some functions for which I want to find the nicest possible

Lars Kyllingstad (3/4) Feb 02 2009 Correction:
Jarrett Billingsley (10/54) Feb 02 2009 Any gains you get from skipping the initial calculations will be

Lars Kyllingstad (15/83) Feb 02 2009 OK. But if the object is allocated once (or seldomly, at least), and I
grauzone (3/72) Feb 02 2009 Why not use scope to allocate the class on the stack?

Jarrett Billingsley (4/6) Feb 02 2009 That's fine too, and would fit in with his needs to implement

Chris Nicholson-Sauls (10/17) Feb 02 2009 Or he's caching some very big/complex parameters in the code he's

Jarrett Billingsley (3/5) Feb 02 2009 http://d.puremagic.com/issues/show_bug.cgi?id=1909
Jarrett Billingsley (6/14) Feb 02 2009 Oh, I suppose I should also point out that if you made these functors'

grauzone (9/24) Feb 02 2009 As far as I know, interface methods can still be final methods in a

Jarrett Billingsley (16/23) Feb 02 2009 Sure, the method will be final, but it will still be virtual. The way

grauzone (10/10) Feb 02 2009 I agree. Of course using an interface to call a method always requires a...

Jarrett Billingsley (9/18) Feb 02 2009 What's the point of implementing an interface unless you plan on

Lars Kyllingstad (7/14) Feb 02 2009 You're assuming too much programming knowledge and carelessness on my

bearophile (5/6) Feb 02 2009 No amount of theory can replace actual timings of your code snippets :-)

Chris Nicholson-Sauls (20/81) Feb 02 2009 If I understand right that your main concern is with parameters that are...

Lars Kyllingstad (10/100) Feb 02 2009 Most of the time I use Tango, but in this particular case I don't want

Daniel Keep (17/25) Feb 02 2009 You're worried about a second function call which could potentially be

Lars Kyllingstad (18/49) Feb 02 2009 But that's the problem, you see. I don't know how expensive these

Chris Nicholson-Sauls (28/55) Feb 03 2009 Allocating stack memory is very cheap, because essentially the only

Lars Kyllingstad (3/67) Feb 03 2009 Thank you for a very informative reply. :)
Jarrett Billingsley (5/10) Feb 03 2009 It should be "before every allocation the garbage collector *may*

Chris Nicholson-Sauls (8/19) Feb 03 2009 Well okay, yes, it *may*. I was in a hurry and trying to be general.

Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:

I have some functions for which I want to find the nicest possible 
combination of performance and usability. I have two suggestions as to 
how they should be defined.

"Classic" style:

     real myFunction(real arg, int someParam, inout real aReturnValue)
     {
         declare temporary variables;
         do some calculations;
         store a return value in aReturnValue;
         return main return value;
     }

The user-friendly way, where the function is encapsulated in a class:

     class MyFunctionWorkspace
     {
         declare private temporary variables;

         real anotherReturnValue;

         this (int someParam)
         { ... }

         real myFunction(real arg)
         {
             do some calculations;
             store a return value in aReturnValue;
             return main return value;
         }
     }

I'm sure a lot of people will disagree with me on this, but let me first 
say why I think the last case is more user-friendly. For one thing, the 
same class can be used over and over again with the same parameter(s). 
Also, the user only has to retrieve aReturnValue if it is needed. If 
there are many such "additional" inout parameters which are seldom 
needed, it gets tedious to declare variables for them every time the 
function is called. I could overload the function, but this also has 
drawbacks if there are several inout parameters with the same type.

My questions are:

- If I do like in the second example above, and reuse temporary 
variables instead of allocating them every time the function is called, 
could this way also give the best performance? (Yes, I know this is bad 
form...)

...or, if not...

- If I (again in the second example) move the temporary variables inside 
the function, so they are allocated on the stack instead of the heap 
(?), will this improve or reduce performance?

I could write both types of code and test them against each other, but I 
am planning to use the same style for several different functions in 
several modules, and want to find the solution which is generally the 
best one.

-Lars

Feb 02 2009

Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:

Lars Kyllingstad wrote:
         real anotherReturnValue;

Correction:
     real aReturnValue;

Feb 02 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

Any gains you get from skipping the initial calculations will be
swiftly cut down by the cost of heap allocation and cache misses, if
you allocate this object several times.

A much better way to get the usability of the latter with the better
performance of the former is to use a struct instead of a class.  I
highly doubt you'll be needing to inherit these "operation objects"
anyway.  The struct will be allocated on the stack, and you still get
all the usability.
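
A minimal sketch of the struct suggestion above, assuming a current D2 compiler (in D1 a static opCall would stand in for the constructor). The field names follow the original post; the arithmetic is a placeholder, not the real calculation.

```d
// Hypothetical workspace as a struct: it lives wherever it is declared
// (here, on the stack of main), so creating it involves no heap allocation.
struct MyFunctionWorkspace
{
    private int someParam;   // parameter cached once at construction
    real aReturnValue;       // secondary result, read only when needed

    this(int someParam) { this.someParam = someParam; }

    real myFunction(real arg)
    {
        real tmp = arg * someParam;   // temporaries stay on the stack
        aReturnValue = tmp * tmp;     // store the secondary result
        return tmp + 1;               // main return value
    }
}

void main()
{
    auto ws = MyFunctionWorkspace(3);
    assert(ws.myFunction(2) == 7);
    assert(ws.aReturnValue == 36);
}
```

The caller declares nothing extra for the secondary result and only reads ws.aReturnValue when it is wanted.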

Feb 02 2009

Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:

Jarrett Billingsley wrote:
 Any gains you get from skipping the initial calculations will be
 swiftly cut down by the cost of heap allocation and cache misses, if
 you allocate this object several times.

OK. But if the object is allocated once (or seldom, at least), and I 
allocate any working variables on the stack, then the second case may 
not be half bad?

 A much better way to get the usability of the latter with the better
 performance of the former is to use a struct instead of a class.  I
 highly doubt you'll be needing to inherit these "operation objects"
 anyway.  The struct will be allocated on the stack, and you still get
 all the usability.

Thanks, I hadn't even thought of that! :) This could certainly be a 
solution. There are two problems, however:

1) In D1, structs don't have constructors, which could again make the 
initial parameter setting a tedious task. But this is not a big problem, 
as I could just define a static opCall for each struct as a kind of 
constructor.

2) Bigger problem: I was kinda hoping that all the functions could 
implement a common interface, so I can use them in generic algorithms. 
This could possibly be done with structs using templates, but plain old 
interfaces would be a cleaner solution.
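
A rough sketch of the template route mentioned in point 2, assuming D2 syntax. Scale, Shift, eval and sumOver are hypothetical names used only for illustration. The "common interface" here is just the convention that every functor has an eval(real) method; the compiler checks it when the template is instantiated.

```d
// Two unrelated structs that happen to share the same method signature.
struct Scale
{
    real factor;
    real eval(real x) { return factor * x; }
}

struct Shift
{
    real offset;
    real eval(real x) { return x + offset; }
}

// Generic algorithm: F is fixed at compile time, so each instantiation
// calls eval directly, with no vtable lookup.
real sumOver(F)(F f, real[] xs)
{
    real total = 0;
    foreach (x; xs)
        total += f.eval(x);
    return total;
}

void main()
{
    real[] xs = [1, 2, 3];
    assert(sumOver(Scale(2), xs) == 12);
    assert(sumOver(Shift(1), xs) == 9);
}
```

The trade-off is that the functor type must be known at compile time; a collection holding different functor types at run time would still need an interface or delegates.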


-Lars

Feb 02 2009

grauzone <none example.net> writes:

Jarrett Billingsley wrote:
 Any gains you get from skipping the initial calculations will be
 swiftly cut down by the cost of heap allocation and cache misses, if
 you allocate this object several times.
 
 A much better way to get the usability of the latter with the better
 performance of the former is to use a struct instead of a class.  I
 highly doubt you'll be needing to inherit these "operation objects"
 anyway.  The struct will be allocated on the stack, and you still get
 all the usability.

Why not use scope to allocate the class on the stack?
For everything else, I agree with Donald Knuth (if he really said that...)
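
A sketch of stack-allocating a class instance, assuming a current D2 toolchain, where std.typecons.scoped covers what a D1 "scope" declaration did. The class body is a placeholder modelled on the original post.

```d
import std.typecons : scoped;

class MyFunctionWorkspace
{
    private int someParam;
    real aReturnValue;

    this(int someParam) { this.someParam = someParam; }

    real myFunction(real arg)
    {
        aReturnValue = arg * someParam;   // secondary result
        return aReturnValue + 1;          // main return value
    }
}

void main()
{
    // The object's storage is part of main's stack frame; it is destroyed
    // when ws goes out of scope and must not be referenced after that.
    auto ws = scoped!MyFunctionWorkspace(3);
    assert(ws.myFunction(2) == 7);
    assert(ws.aReturnValue == 6);
}
```

Being a class, this version can still implement interfaces, at the cost of its methods being virtual unless marked final.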

Feb 02 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
 Why not use scope to allocate the class on the stack?
 For everything else, I agree with Donald Knuth (if he really said that...)

That's fine too, and would fit in with his needs to implement
interfaces.  But again, if he's worried about caching some parameters
but not worried about the overhead of virtual calls... something's off.

Feb 02 2009

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
 Why not use scope to allocate the class on the stack?
 For everything else, I agree with Donald Knuth (if he really said that...)

 
 That's fine too, and would fit in with his needs to implement
 interfaces.  But again, if he's worried about caching some parameters
 but not worried about the overhead of virtual calls.. something's off.

Or he's caching some very big/complex parameters in the code he's 
actually writing... maybe. That said: do we have any assurance that, 
were the functor class tagged as 'final', the call would cease to be 
virtual?  If so, then the only extra cost on the call is that of the 
hidden "this" sitting in ESI.  I still don't care for the memory 
allocation involved, personally, but if these are long-lived functors 
that may not be a major problem.  (I.e., if he calls foo(?,X) a million 
times, the cost of allocating one object is amortized into nearly nothing.)

-- Chris Nicholson-Sauls

Feb 02 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls
<ibisbasenji gmail.com> wrote:

 do we have any assurance that, were the functor
 class tagged as 'final', the call would cease to be virtual?

http://d.puremagic.com/issues/show_bug.cgi?id=1909

Feb 02 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls
<ibisbasenji gmail.com> wrote:
 Or he's caching some very big/complex parameters in the code he's actually
 writing... maybe. That said: do we have any assurance that, were the functor
 class tagged as 'final', the call would cease to be virtual?  If so, then
 the only extra cost on the call is that of the hidden "this" sitting in ESI.
  I still don't care for the memory allocation involved, personally, but if
 these are long-lived functors that may not be a major problem.  (Ie, if he
 calls foo(?,X) a million times, the cost of allocating one object is
 amortized into nearly nothing.)

Oh, I suppose I should also point out that if you made these functors'
methods final, they wouldn't be able to implement interfaces, since
interface implementations must be virtual.  So, at that point, you're
using a final scope class - might as well use a struct anyway.

Feb 02 2009

grauzone <none example.net> writes:

Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls
 <ibisbasenji gmail.com> wrote:
 Or he's caching some very big/complex parameters in the code he's actually
 writing... maybe. That said: do we have any assurance that, were the functor
 class tagged as 'final', the call would cease to be virtual?  If so, then
 the only extra cost on the call is that of the hidden "this" sitting in ESI.
  I still don't care for the memory allocation involved, personally, but if
 these are long-lived functors that may not be a major problem.  (Ie, if he
 calls foo(?,X) a million times, the cost of allocating one object is
 amortized into nearly nothing.)

 
 Oh, I suppose I should also point out that if you made these functors'
 methods final, they wouldn't be able to implement interfaces, since
 interface implementations must be virtual.  So, at that point, you're
 using a final scope class - might as well use a struct anyway.

As far as I know, interface methods can still be final methods in a 
class. Final methods merely cannot be overridden further, but it's 
perfectly fine to mark a method final when it overrides a method from 
a superclass. Final, so to speak, only works in one direction.

Then the compiler can optimize calls, if they are statically known to be 
final. If not, it still has to do a vtable lookup on a method call, even 
if the actually called method is final.

So it can still make sense to use a class instead of a struct.

Feb 02 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Mon, Feb 2, 2009 at 3:37 PM, grauzone <none example.net> wrote:

 As far as I know, interface methods can still be final methods in a class.
 final methods are only disallowed to be overridden further. But it's
 perfectly fine to mark a method final, that overrides a method from a super
 class. final so to say only works in one direction.

Sure, the method will be final, but it will still be virtual.  The way
interfaces work is by basically giving you a slice of the vtable.

 Then the compiler can optimize calls, if they are statically known to be
 final. If not, it still has to do a vtable lookup on a method call, even if
 the actually called method is final.

The compiler can't optimize calls on interface references away.  The
function that's using the interface reference only knows as much as
the interface tells it.  If some class implements the interface and
marks its implementation of the interface as final, it doesn't matter,
since the method is not marked final in the interface (and can't be!).

Okay, so *if* the compiler inlined the call to the function that took
the interface reference, *and* it was smart enough to recognize that
that interface reference did not escape, *and* it was smart enough to
realize that the interface really pointed to a class, *and* it knew
that the implementation of the method was final, it could inline it.
But that seems like an incredibly smart compiler and an incredibly
rare situation.  I also don't believe in relying on optimizations that
are not enforced, as it makes for nonportable code.
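
To make the two call paths concrete, here is a small sketch assuming D2. Func, Square, viaInterface and direct are hypothetical names. Whether a given compiler actually turns the second call into a direct or inlined call is not guaranteed by anything shown here.

```d
interface Func
{
    real eval(real x);
}

// A final class may still implement an interface.
final class Square : Func
{
    real eval(real x) { return x * x; }
}

// Only the interface is known here, so the call is dispatched at run time.
real viaInterface(Func f, real x) { return f.eval(x); }

// The static type is a final class, so the target is known at compile time
// and the compiler is free to call it directly.
real direct(Square s, real x) { return s.eval(x); }

void main()
{
    auto s = new Square;
    assert(viaInterface(s, 3) == 9);
    assert(direct(s, 3) == 9);
}
```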

Feb 02 2009

grauzone <none example.net> writes:

I agree. Of course using an interface to call a method always requires a 
virtual method call. It's even slower than a virtual method call, 
because it needs to convert the interface reference into an object 
reference.

But he still could call the method in question directly. Implementing an 
interface can be useful to enforce a contract. You can't do that with 
structs.

Code compiled in debug mode (or was it not-release mode) also calls the 
code to check the invariant, even if you didn't define one. I guess this 
can make calling struct methods much faster than object methods.

Feb 02 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Mon, Feb 2, 2009 at 4:55 PM, grauzone <none example.net> wrote:
 I agree. Of course using an interface to call a method always requires a
 virtual method call. It's even slower than a virtual method call, because it
 needs to convert the interface reference into an object reference.

 But he still could call the method in question directly. Implementing an
 interface can be useful to enforce a contract. You can't do that with
 structs.

What's the point of implementing an interface unless you plan on
passing instances of that class to something that expects an interface
reference?  ;)

 Code compiled in debug mode (or was it not-release mode) also calls the code
 to check the invariant, even if you didn't define one. I guess this can make
 calling struct methods much faster than object methods.

Invariants (as well as in/out contracts and assertions) are turned off
in release mode.  FWIW, struct methods also do an "assert(this !is
null);" in debug mode, so they're sort of doing an invariant check.
But struct methods are never virtual, so yes, they will in general be
faster.

Feb 02 2009

Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:

Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
 Why not use scope to allocate the class on the stack?
 For everything else, I agree with Donald Knuth (if he really said that...)

 
 That's fine too, and would fit in with his needs to implement
 interfaces.  But again, if he's worried about caching some parameters
 but not worried about the overhead of virtual calls.. something's off.

You're assuming too much programming knowledge and carelessness on my 
part. I merely wanted to know if the second solution would be 
significantly slower than the first one. Caching of the parameters would 
be a bonus, as would caching of additional output and the ability to use 
interfaces.

-Lars

Feb 02 2009

bearophile <bearophileHUGS lycos.com> writes:

Lars Kyllingstad:
 I merely wanted to know if the second solution would be significantly slower
 than the first one.

No amount of theory can replace actual timings of your code snippets :-)
(It's often true the other way too, practice doesn't replace theory. But here
there isn't too much theory, so a lot of practice suffices if you don't know the
theory).
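
One way to get such timings, sketched under the assumption of a recent Phobos that provides std.datetime.stopwatch. StructWs, ClassWs and the two test functions are hypothetical stand-ins; real measurements should use the actual calculation and optimized builds.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

struct StructWs
{
    real k;
    real eval(real x) { return k * x; }
}

class ClassWs
{
    real k;
    this(real k) { this.k = k; }
    real eval(real x) { return k * x; }
}

real sink = 0;   // accumulates results so the work is not trivially discarded

void useStruct() { auto w = StructWs(2);    sink += w.eval(1.5); }
void useClass()  { auto w = new ClassWs(2); sink += w.eval(1.5); }  // heap allocation per call

void main()
{
    // Runs each function one million times and reports the total durations.
    auto times = benchmark!(useStruct, useClass)(1_000_000);
    writefln("struct: %s", times[0]);
    writefln("class:  %s", times[1]);
}
```

The class variant deliberately allocates on every call, which is the worst case discussed earlier in the thread; allocating once outside the timed function would measure the long-lived case instead.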

Bye,
bearophile

Feb 02 2009

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

If I understand right that your main concern is with parameters that are 
used over and over and over again -- which I can empathize with -- you 
could also look into function currying.  Assuming you are using Phobos, 
the module you want to look at is std.bind, usage of which is pretty 
straightforward.  Given a function:

real pow (real base, real exp);

You could emulate a square() function via std.bind like so:

auto square = bind(&pow, _0, 2.0);
square(42.0); // same as: pow(42.0, 2.0)

If you are using Tango, I'm honestly not sure off the top of my head 
what the relevant module is, but you could always install Tangobos and 
use std.bind just fine.

All that being said, I have no experience with currying functions with 
inout parameters.  If my understanding of how std.bind works its magic 
is right, it should be fine.  I believe it wraps the call up in a 
structure, which means the actual parameter will be from a field of said 
structure... which, actually, means it could also store state.  That in 
itself could be an interesting capability.
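
std.bind belongs to D1-era Phobos. As a rough equivalent for a current D2 compiler, the same idea can be sketched with a closure; powerOf is a hypothetical helper, not a library function.

```d
import std.math : fabs, pow;

// Fixes the exponent and returns a delegate that takes only the base.
// The captured exponent is stored in a heap-allocated closure context.
real delegate(real) powerOf(real exponent)
{
    return (real base) => pow(base, exponent);
}

void main()
{
    auto square = powerOf(2);
    // Same as pow(42, 2); compared with a tolerance rather than exactly.
    assert(fabs(square(42) - 1764) < 1e-9);
}
```

As with std.bind, the bound argument is stored alongside the function, so each call goes through one extra level of indirection.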

-- Chris Nicholson-Sauls

Feb 02 2009

Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:

Most of the time I use Tango, but in this particular case I don't want 
my code to depend on either library. Also I'm not sure whether the 
std.bind functionality is even present in Tango. I could always write 
my.own.bind, though.

Your solution is nice from a usability perspective, in that it reuses 
function arguments -- possibly even inout ones. From a performance 
perspective, however, it carries with it the overhead of an extra 
function call, which I'm not sure I want.

-Lars

Feb 02 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Lars Kyllingstad wrote:
 [snip]
 From a performance
 perspective, however, it carries with it the overhead of an extra
 function call, which I'm not sure I want.
 
 -Lars

You're worried about a second function call which could potentially be
inlined, yet you're seemingly not worried about the overhead of virtual
calls or heap allocations...

Allow me to quote Donald Knuth:

 We should forget about small efficiencies, say about 97% of the time:
 premature optimization is the root of all evil.

Unless you're doing something where you *know* you're going to need
every last cycle, just go with whichever design works best.  Your
response to Jarrett implies that you've already got a design in mind,
and are just fishing for a magic "make it go faster button."

Believe me, if Walter had invented such a thing, he wouldn't be wasting
his time putting up with us; he'd be too busy smoking $100 bills from
the comfort of his SPACE FORTRESS.  :D

In any case, I'm willing to bet that if there *are* inefficiencies
you're not going to know exactly where until you've written the code,
anyway.  :P

If classes work, and make for an elegant design, go for it.

  -- Daniel

Feb 02 2009

Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:

Daniel Keep wrote:
 
 Lars Kyllingstad wrote:
 [snip]
 From a performance
 perspective, however, it carries with it the overhead of an extra
 function call, which I'm not sure I want.

 -Lars

 
 You're worried about a second function call which could potentially be
 inlined, yet you're seemingly not worried about the overhead of virtual
 calls or heap allocations...

But that's the problem, you see. I don't know how expensive these 
operations are, hence my initial question(s). (This was also why I 
posted my question in D.learn.)

For instance, I didn't know (not sure I still do) what the cost is of 
frequent allocation/deallocation/access of stack memory vs. infrequent 
allocation/deallocation and frequent access of heap memory. From the 
replies I've got, it seems heap variables make for significantly slower 
code.

Nor was I sure, as you pointed out, how expensive a virtual function 
call is vs. an extra non-virtual function call.

I'm a physicist, not a computer scientist. :)

 Allow me to quote Donald Knuth:
 
 We should forget about small efficiencies, say about 97% of the time:
 premature optimization is the root of all evil.

 
 Unless you're doing something where you *know* you're going to need
 every last cycle, just go with whichever design works best.  Your
 response to Jarrett implies that you've already got a design in mind,
 and are just fishing for a magic "make it go faster button."

I want that button, yes. :)

But seriously, I am doing numerical computations, so performance is 
absolutely an issue. The main thing I wanted to know was, can I have 
both performance and usability, or do I have to choose? With Jarrett's 
suggestion I can, to some degree, have both.

 Believe me, if Walter had invented such a thing, he wouldn't be wasting
 his time putting up with us; he'd be too busy smoking $100 bills from
 the comfort of his SPACE FORTRESS.  :D

What are you implying, that he wouldn't make it open-source? :)

 In any case, I'm willing to bet that if there *are* inefficiencies
 you're not going to know exactly where until you've written the code,
 anyway.  :P
 
 If classes work, and make for an elegant design, go for it.
 
   -- Daniel

Feb 02 2009

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

Lars Kyllingstad wrote:
 Daniel Keep wrote:
 Lars Kyllingstad wrote:
 [snip]
 From a performance
 perspective, however, it carries with it the overhead of an extra
 function call, which I'm not sure I want.

 -Lars

 You're worried about a second function call which could potentially be
 inlined, yet you're seemingly not worried about the overhead of virtual
 calls or heap allocations...

 
 But that's the problem, you see. I don't know how expensive these 
 operations are, hence my initial question(s). (This was also why I 
 posted my question in D.learn.)
 
 For instance, I didn't know (not sure I still do) what the cost is of 
 frequent allocation/deallocation/access of stack memory vs. infrequent 
 allocation/deallocation and frequent access of heap memory. From the 
 replies I've got, it seems heap variables make for significantly slower 
 code.

Allocating stack memory is very cheap, because essentially the only 
thing that has to be done is to offset a stack pointer.  Some stack 
variables are even optimized away if only used as temporaries (that is, 
their value is retained in a register until it isn't needed) and for 
short durations.

Allocating heap memory, on the other hand, is expensive for two reasons. 
The first is that the heap may have to grow, which means negotiating 
more memory from the operating system, which means switching the CPU 
back and forth between modes, sometimes several times.  Of course, 
this doesn't happen on every allocation, or even very often if you're 
careful.  The second reason is that before every allocation the garbage 
collector will perform a collection run.  This can actually be disabled 
(at least in theory) if you plan on doing several allocations in a short 
period of time, and thereafter re-enabled.

For the latter case, see Phobos 'std.gc' or Tango 'tango.core.Memory'.
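
A minimal sketch of pausing collections around a burst of allocations. It assumes a D2 runtime, where the GC interface is 'core.memory' rather than the 'std.gc' / 'tango.core.Memory' modules named above; the helper 'allocateBatch' is a made-up name for illustration.

```d
import core.memory : GC;

// Hypothetical helper: builds n arrays of len doubles with automatic
// collections paused for the duration of the burst.
double[][] allocateBatch(size_t n, size_t len)
{
    GC.disable();              // suppress automatic collection runs
    scope (exit) GC.enable();  // re-enable on every exit path, including exceptions

    auto rows = new double[][](n);
    foreach (ref row; rows)
        row = new double[](len);
    return rows;
}

void main()
{
    auto rows = allocateBatch(1000, 16);
    assert(rows.length == 1000);
    assert(rows[0].length == 16);
}
```

Whether this helps in practice depends on the allocation pattern, so it is worth timing both variants.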

Once you have memory allocated, the cost of access is generally about 
the same, except that the stack is more likely to be cached by the CPU. 
  (Since it is inevitably accessed often.)
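
To make the stack/heap distinction concrete, here is a small sketch, again assuming D2 and 'core.memory'. The types 'StackParams' and 'HeapParams' are hypothetical; the point is only that a local struct lives in the stack frame while a class instance created with 'new' comes from the GC heap.

```d
import core.memory : GC;

// Hypothetical parameter bundles, for illustration only.
struct StackParams { double a = 1.0; double b = 2.0; }
class HeapParams   { double a = 1.0; double b = 2.0; }

double sumOnStack()
{
    StackParams p;            // local struct: space comes from the stack frame
    return p.a + p.b;
}

double sumOnHeap()
{
    auto p = new HeapParams;  // class instance: 'new' asks the GC allocator
    return p.a + p.b;
}

void main()
{
    StackParams s;
    auto h = new HeapParams;

    // GC.addrOf returns null for memory the GC did not allocate.
    assert(GC.addrOf(&s) is null);
    assert(GC.addrOf(cast(void*) h) !is null);

    assert(sumOnStack() == 3.0 && sumOnHeap() == 3.0);
}
```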

 Nor was I sure, as you pointed out, how expensive a virtual function 
 call is vs. an extra non-virtual function call.

It adds an additional step.  You start with an index into the object's 
vtable (a list of pointers) rather than the function's actual address. 
It's essentially the same as the difference between assigning to an 
'int**' versus an 'int*'.
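
A small sketch of the two kinds of call, using hypothetical types ('Integrand', 'Shifted', 'SquareFn') that are not from the original question. Class methods in D are virtual by default, so a call through a base-class reference is looked up at run time; struct methods are never virtual, so the target is known at compile time.

```d
// Hypothetical types for illustration only.
class Integrand
{
    // Virtual by default: calls go through the object's vtable.
    double eval(double x) { return x * x; }
}

class Shifted : Integrand
{
    override double eval(double x) { return x * x + 1.0; }
}

struct SquareFn
{
    // Struct methods are never virtual: the call is a direct one.
    double eval(double x) { return x * x; }
}

void main()
{
    Integrand f = new Shifted;   // static type Integrand, dynamic type Shifted
    assert(f.eval(2.0) == 5.0);  // resolved at run time via the vtable

    SquareFn g;
    assert(g.eval(2.0) == 4.0);  // direct call, a candidate for inlining
}
```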

 I'm a physicist, not a computer scientist. :)
 

Which is a good thing, since D could use more experience from 
non-programmers who need to program.  That's a demographic that 
occasionally (but never completely!) gets forgotten.  I'm not exactly a 
thirty-years guru, myself.

-- Chris Nicholson-Sauls

Feb 03 2009

Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:

Chris Nicholson-Sauls wrote:
 Lars Kyllingstad wrote:
 Daniel Keep wrote:
 Lars Kyllingstad wrote:
 [snip]
 From a performance
 perspective, however, it carries with it the overhead of an extra
 function call, which I'm not sure I want.

 -Lars

 You're worried about a second function call which could potentially be
 inlined, yet you're seemingly not worried about the overhead of virtual
 calls or heap allocations...

 But that's the problem, you see. I don't know how expensive these 
 operations are, hence my initial question(s). (This was also why I 
 posted my question in D.learn.)

 For instance, I didn't know (not sure I still do) what the cost is of 
 frequent allocation/deallocation/access of stack memory vs. infrequent 
 allocation/deallocation and frequent access of heap memory. From the 
 replies I've got, it seems heap variables make for significantly 
 slower code.

 
 Allocating stack memory is very cheap, because essentially the only 
 thing that has to be done is to offset a stack pointer.  Some stack 
 variables are even optimized away if only used as temporaries (that is, 
 their value is retained in a register until it isn't needed) and for 
 short durations.
 
 Allocating heap memory, on the other hand, is expensive for two reasons. 
 The first is that the heap may have to grow, which means negotiating 
 more memory from the operating system, which means switching the CPU 
 back and forth between modes, sometimes several times.  Of course, 
 this doesn't happen on every allocation, or even very often if you're 
 careful.  The second reason is that before every allocation the garbage 
 collector will perform a collection run.  This can actually be disabled 
 (at least in theory) if you plan on doing several allocations in a short 
 period of time, and thereafter re-enabled.
 
 For the latter case, see Phobos 'std.gc' or Tango 'tango.core.Memory'.
 
 Once you have memory allocated, the cost of access is generally about 
 the same, except that the stack is more likely to be cached by the CPU. 
  (Since it is inevitably accessed often.)
 
 Nor was I sure, as you pointed out, how expensive a virtual function 
 call is vs. an extra non-virtual function call.

 
 It adds an additional step.  You start with an index into the object's 
 vtable (a list of pointers) rather than the function's actual address. 
 It's essentially the same as the difference between assigning to an 
 'int**' versus an 'int*'.
 
 I'm a physicist, not a computer scientist. :)

 
 Which is a good thing, since D could use more experience from 
 non-programmers who need to program.  That's a demographic that 
 occasionally (but never completely!) gets forgotten.  I'm not exactly a 
 thirty-years guru, myself.
 
 -- Chris Nicholson-Sauls

Thank you for a very informative reply. :)

-Lars

Feb 03 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Tue, Feb 3, 2009 at 3:44 PM, Chris Nicholson-Sauls
<ibisbasenji gmail.com> wrote:
 The
 second reason is that before every allocation the garbage collector will
 perform a collection run.  This can actually be disabled (at least in
 theory) if you plan on doing several allocations in a short period of time,
 and thereafter re-enabled.

It should be "before every allocation the garbage collector *may*
perform a collection run."  If it collected on every allocation it
would make your program's execution speed next to useless ;)

Feb 03 2009

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

Jarrett Billingsley wrote:
 On Tue, Feb 3, 2009 at 3:44 PM, Chris Nicholson-Sauls
 <ibisbasenji gmail.com> wrote:
 The
 second reason is that before every allocation the garbage collector will
 perform a collection run.  This can actually be disabled (at least in
 theory) if you plan on doing several allocations in a short period of time,
 and thereafter re-enabled.

 
 It should be "before every allocation the garbage collector *may*
 perform a collection run."  If it collected on every allocation it
 would make your program's execution speed next to useless ;)

Well okay, yes, it *may*.  I was in a hurry and trying to be general. 
;)  Chances are, though, that if you are doing so many allocations in a 
short period as to be worried about it, it probably will.  If I 
remember right, the current GC runs a collection just before requesting 
more heap, so it's actually related to the first issue.  (I may well 
remember wrong; it's been a very long time since I dove into the GC code.)

-- Chris Nicholson-Sauls

Feb 03 2009
