digitalmars.D.learn - Some performance questions
- Lars Kyllingstad (47/47) Feb 02 2009 I have some functions for which I want to find the nicest possible
- Lars Kyllingstad (3/4) Feb 02 2009 Correction:
- Jarrett Billingsley (10/54) Feb 02 2009 Any gains you get from skipping the initial calculations will be
- Lars Kyllingstad (15/83) Feb 02 2009 OK. But if the object is allocated once (or seldomly, at least), and I
- grauzone (3/72) Feb 02 2009 Why not use scope to allocate the class on the stack?
- Jarrett Billingsley (4/6) Feb 02 2009 That's fine too, and would fit in with his needs to implement
- Chris Nicholson-Sauls (10/17) Feb 02 2009 Or he's caching some very big/complex parameters in the code he's
- Jarrett Billingsley (3/5) Feb 02 2009 http://d.puremagic.com/issues/show_bug.cgi?id=1909
- Jarrett Billingsley (6/14) Feb 02 2009 Oh, I suppose I should also point out that if you made these functors'
- grauzone (9/24) Feb 02 2009 As far as I know, interface methods can still be final methods in a
- Jarrett Billingsley (16/23) Feb 02 2009 Sure, the method will be final, but it will still be virtual. The way
- grauzone (10/10) Feb 02 2009 I agree. Of course using an interface to call a method always requires a...
- Jarrett Billingsley (9/18) Feb 02 2009 What's the point of implementing an interface unless you plan on
- Lars Kyllingstad (7/14) Feb 02 2009 You're assuming too much programming knowledge and carelessness on my
- bearophile (5/6) Feb 02 2009 No amount of theory can replace actual timings of your code snippets :-)
- Chris Nicholson-Sauls (20/81) Feb 02 2009 If I understand right that your main concern is with parameters that are...
- Lars Kyllingstad (10/100) Feb 02 2009 Most of the time I use Tango, but in this particular case I don't want
- Daniel Keep (17/25) Feb 02 2009 You're worried about a second function call which could potentially be
- Lars Kyllingstad (18/49) Feb 02 2009 But that's the problem, you see. I don't know how expensive these
- Chris Nicholson-Sauls (28/55) Feb 03 2009 Allocating stack memory is very cheap, because essentially the only
- Lars Kyllingstad (3/67) Feb 03 2009 Thank you for a very informative reply. :)
- Jarrett Billingsley (5/10) Feb 03 2009 It should be "before every allocation the garbage collector *may*
- Chris Nicholson-Sauls (8/19) Feb 03 2009 Well okay, yes, it *may*. I was in a hurry and trying to be general.
I have some functions for which I want to find the nicest possible combination of performance and usability. I have two suggestions as to how they should be defined.

"Classic" style:

    real myFunction(real arg, int someParam, inout real aReturnValue)
    {
        declare temporary variables;
        do some calculations;
        store a return value in aReturnValue;
        return main return value;
    }

The user-friendly way, where the function is encapsulated in a class:

    class MyFunctionWorkspace
    {
        declare private temporary variables;
        real anotherReturnValue;

        this (int someParam) { ... }

        real myFunction(real arg)
        {
            do some calculations;
            store a return value in aReturnValue;
            return main return value;
        }
    }

I'm sure a lot of people will disagree with me on this, but let me first say why I think the last case is more user-friendly. For one thing, the same class can be used over and over again with the same parameter(s). Also, the user only has to retrieve aReturnValue if it is needed. If there are many such "additional" inout parameters which are seldom needed, it gets tedious to declare variables for them every time the function is called. I could overload the function, but this also has drawbacks if there are several inout parameters with the same type.

My questions are:

 - If I do like in the second example above, and reuse temporary variables instead of allocating them every time the function is called, could this way also give the best performance? (Yes, I know this is bad form...)

...or, if not...

 - If I (again in the second example) move the temporary variables inside the function, so they are allocated on the stack instead of the heap (?), will this improve or reduce performance?

I could write both types of code and test them against each other, but I am planning to use the same style for several different functions in several modules, and want to find the solution which is generally the best one.

-Lars
Feb 02 2009
Lars Kyllingstad wrote:
> real anotherReturnValue;

Correction:

    real aReturnValue;
Feb 02 2009
On Mon, Feb 2, 2009 at 8:31 AM, Lars Kyllingstad <public kyllingen.nospamnet> wrote:
> [snip]
> I could write both types of code and test them against each other, but I am planning to use the same style for several different functions in several modules, and want to find the solution which is generally the best one.

Any gains you get from skipping the initial calculations will be swiftly cut down by the cost of heap allocation and cache misses, if you allocate this object several times.

A much better way to get the usability of the latter with the better performance of the former is to use a struct instead of a class. I highly doubt you'll be needing to inherit these "operation objects" anyway. The struct will be allocated on the stack, and you still get all the usability.
Feb 02 2009
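A minimal sketch of the struct approach suggested above, written in current D (D2) syntax rather than the D1 of the thread. The field names follow the original post; the arithmetic inside `myFunction` is a placeholder assumption, since the thread never shows the real calculation.

```d
/// Hypothetical workspace struct; the calculation is a stand-in.
struct MyFunctionWorkspace
{
    private int someParam;   // parameter fixed once, at construction
    real aReturnValue;       // secondary result, read only when needed

    this(int someParam)
    {
        this.someParam = someParam;
    }

    real myFunction(real arg)
    {
        real temp = arg * someParam;  // temporary lives on the stack
        aReturnValue = temp + 1;      // store the secondary result
        return temp * temp;           // main return value
    }
}

unittest
{
    auto ws = MyFunctionWorkspace(3);  // a struct value, no heap allocation
    assert(ws.myFunction(2.0L) == 36.0L);
    assert(ws.aReturnValue == 7.0L);
}
```

The `unittest` block doubles as a usage example; it can be run with `dmd -unittest -main -run file.d`.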
Jarrett Billingsley wrote:
> On Mon, Feb 2, 2009 at 8:31 AM, Lars Kyllingstad <public kyllingen.nospamnet> wrote:
>> [snip]
>
> Any gains you get from skipping the initial calculations will be swiftly cut down by the cost of heap allocation and cache misses, if you allocate this object several times.

OK. But if the object is allocated once (or seldomly, at least), and I allocate any working variables on the stack, then the second case may not be half bad?

> A much better way to get the usability of the latter with the better performance of the former is to use a struct instead of a class. I highly doubt you'll be needing to inherit these "operation objects" anyway. The struct will be allocated on the stack, and you still get all the usability.

Thanks, I hadn't even thought of that! :) This could certainly be a solution. There are two problems, however:

1) In D1, structs don't have constructors, which could again make the initial parameter setting a tedious task. But this is not a big problem, as I could just define a static opCall for each struct as a kind of constructor.

2) Bigger problem: I was kinda hoping that all the functions could implement a common interface, so I can use them in generic algorithms. This could possibly be done with structs using templates, but plain old interfaces would be a cleaner solution.

-Lars
Feb 02 2009
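The "structs using templates" idea from point 2 can be sketched roughly as follows, again in current D (D2), where structs can have constructors and the static opCall workaround from point 1 is not needed. The type names `Linear` and `ShiftedSquare`, the method name `evaluate`, and the algorithm `sumOver` are all hypothetical; the thread names none of them.

```d
/// Hypothetical functor: f(x) = scale * x.
struct Linear
{
    real scale;
    real evaluate(real x) { return scale * x; }
}

/// Hypothetical functor: f(x) = x^2 + offset.
struct ShiftedSquare
{
    real offset;
    real evaluate(real x) { return x * x + offset; }
}

/// Generic algorithm: sums f(x) over x = 0, 1, ..., n-1 for any type F
/// that has an `evaluate` method taking and returning a real.
/// The constraint is checked at compile time, so no vtable is involved.
real sumOver(F)(ref F f, int n)
    if (is(typeof(F.init.evaluate(real.init)) : real))
{
    real total = 0;
    foreach (i; 0 .. n)
        total += f.evaluate(i);
    return total;
}

unittest
{
    auto lin = Linear(2);
    assert(sumOver(lin, 4) == 12.0L);   // 2 * (0 + 1 + 2 + 3)

    auto sq = ShiftedSquare(1);
    assert(sumOver(sq, 3) == 8.0L);     // (0+1) + (1+1) + (4+1)
}
```

The trade-off is the one Lars describes: the template version avoids virtual calls, but the "interface" exists only as a compile-time constraint rather than as a named type.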
Jarrett Billingsley wrote:
> On Mon, Feb 2, 2009 at 8:31 AM, Lars Kyllingstad <public kyllingen.nospamnet> wrote:
>> [snip]
>
> Any gains you get from skipping the initial calculations will be swiftly cut down by the cost of heap allocation and cache misses, if you allocate this object several times.
>
> A much better way to get the usability of the latter with the better performance of the former is to use a struct instead of a class. I highly doubt you'll be needing to inherit these "operation objects" anyway. The struct will be allocated on the stack, and you still get all the usability.

Why not use scope to allocate the class on the stack?

For everything else, I agree with Donald Knuth (if he really said that...)
Feb 02 2009
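A rough sketch of the stack-allocated class idea. The thread refers to D1's `scope` declarations; this sketch instead uses `std.typecons.scoped` from current Phobos, which is a substitution on my part and not what the posters had available. The interface `RealFunction` and class `Doubler` are hypothetical names.

```d
import std.typecons : scoped;

/// Hypothetical common interface for the function objects.
interface RealFunction
{
    real evaluate(real x);
}

/// Hypothetical implementation with one cached parameter.
class Doubler : RealFunction
{
    private int factor;
    this(int factor) { this.factor = factor; }
    real evaluate(real x) { return factor * x; }
}

/// A generic algorithm written against the interface.
real applyTwice(RealFunction f, real x)
{
    return f.evaluate(f.evaluate(x));
}

unittest
{
    auto d = scoped!Doubler(2);   // object storage lives in this stack frame
    assert(d.evaluate(1.5L) == 3.0L);

    Doubler obj = d;              // plain class reference; must not outlive `d`
    assert(applyTwice(obj, 1.0L) == 4.0L);
}
```

The object is destroyed when `d` goes out of scope, so any reference handed to other code must not be kept beyond that point.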
On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
> Why not use scope to allocate the class on the stack?
>
> For everything else, I agree with Donald Knuth (if he really said that...)

That's fine too, and would fit in with his needs to implement interfaces. But again, if he's worried about caching some parameters but not worried about the overhead of virtual calls.. something's off.
Feb 02 2009
Jarrett Billingsley wrote:
> On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
>> Why not use scope to allocate the class on the stack?
>>
>> For everything else, I agree with Donald Knuth (if he really said that...)
>
> That's fine too, and would fit in with his needs to implement interfaces. But again, if he's worried about caching some parameters but not worried about the overhead of virtual calls.. something's off.

Or he's caching some very big/complex parameters in the code he's actually writing... maybe. That said: do we have any assurance that, were the functor class tagged as 'final', the call would cease to be virtual? If so, then the only extra cost on the call is that of the hidden "this" sitting in ESI.

I still don't care for the memory allocation involved, personally, but if these are long-lived functors that may not be a major problem. (I.e., if he calls foo(?,X) a million times, the cost of allocating one object is amortized into nearly nothing.)

-- Chris Nicholson-Sauls
Feb 02 2009
On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
> do we have any assurance that, were the functor class tagged as 'final', the call would cease to be virtual?

http://d.puremagic.com/issues/show_bug.cgi?id=1909
Feb 02 2009
On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
> Or he's caching some very big/complex parameters in the code he's actually writing... maybe. That said: do we have any assurance that, were the functor class tagged as 'final', the call would cease to be virtual? If so, then the only extra cost on the call is that of the hidden "this" sitting in ESI. I still don't care for the memory allocation involved, personally, but if these are long-lived functors that may not be a major problem. (Ie, if he calls foo(?,X) a million times, the cost of allocating one object is amortized into nearly nothing.)

Oh, I suppose I should also point out that if you made these functors' methods final, they wouldn't be able to implement interfaces, since interface implementations must be virtual. So, at that point, you're using a final scope class - might as well use a struct anyway.
Feb 02 2009
Jarrett Billingsley wrote:
> On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
>> [snip]
>
> Oh, I suppose I should also point out that if you made these functors' methods final, they wouldn't be able to implement interfaces, since interface implementations must be virtual. So, at that point, you're using a final scope class - might as well use a struct anyway.

As far as I know, interface methods can still be final methods in a class. final methods are only disallowed to be overridden further. But it's perfectly fine to mark a method final, that overrides a method from a super class. final so to say only works in one direction.

Then the compiler can optimize calls, if they are statically known to be final. If not, it still has to do a vtable lookup on a method call, even if the actually called method is final.

So it can still make sense to use a class instead of a struct.
Feb 02 2009
On Mon, Feb 2, 2009 at 3:37 PM, grauzone <none example.net> wrote:
> As far as I know, interface methods can still be final methods in a class. final methods are only disallowed to be overridden further. But it's perfectly fine to mark a method final, that overrides a method from a super class. final so to say only works in one direction.

Sure, the method will be final, but it will still be virtual. The way interfaces work is by basically giving you a slice of the vtable.

> Then the compiler can optimize calls, if they are statically known to be final. If not, it still has to do a vtable lookup on a method call, even if the actually called method is final.

The compiler can't optimize calls on interface references away. The function that's using the interface reference only knows as much as the interface tells it. If some class implements the interface and marks its implementation of the interface as final, it doesn't matter, since the method is not marked final in the interface (and can't be!).

Okay, so *if* the compiler inlined the call to the function that took the interface reference, *and* it was smart enough to recognize that that interface reference did not escape, *and* it was smart enough to realize that the interface really pointed to a class, *and* it knew that the implementation of the method was final, it could inline it. But that seems like an incredibly smart compiler and an incredibly rare situation. I also don't believe in relying on optimizations that are not enforced, as it makes for nonportable code.
Feb 02 2009
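The distinction argued over in the last few posts can be made concrete with a small sketch in current D (D2). All names here are hypothetical. The same `final` class is called once through an interface reference and once through its own static type; whether the second call is actually devirtualized or inlined is up to the compiler, which is exactly Jarrett's caveat.

```d
/// Hypothetical common interface.
interface RealFunction
{
    real evaluate(real x);
}

/// A final class may still implement an interface method.
final class Square : RealFunction
{
    real evaluate(real x) { return x * x; }
}

/// The callee only sees the interface, so the call is dispatched
/// at run time through the interface's table of function pointers.
real viaInterface(RealFunction f, real x)
{
    return f.evaluate(x);
}

/// The static type is a final class, so the target is known at
/// compile time and the compiler is free to call it directly.
real viaClass(Square f, real x)
{
    return f.evaluate(x);
}

unittest
{
    auto s = new Square;
    assert(viaInterface(s, 3.0L) == 9.0L);
    assert(viaClass(s, 3.0L) == 9.0L);
}
```

Both calls give the same result; only the dispatch mechanism differs, and only a measurement can say whether that difference matters in a given program.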
I agree. Of course using an interface to call a method always requires a virtual method call. It's even slower than a virtual method call, because it needs to convert the interface reference into an object reference. But he still could call the method in question directly. Implementing an interface can be useful to enforce a contract. You can't do that with structs. Code compiled in debug mode (or was it not-release mode) also calls the code to check the invariant, even if you didn't define one. I guess this can make calling struct methods much faster than object methods.
Feb 02 2009
On Mon, Feb 2, 2009 at 4:55 PM, grauzone <none example.net> wrote:
> I agree. Of course using an interface to call a method always requires a virtual method call. It's even slower than a virtual method call, because it needs to convert the interface reference into an object reference. But he still could call the method in question directly. Implementing an interface can be useful to enforce a contract. You can't do that with structs.

What's the point of implementing an interface unless you plan on passing instances of that class to something that expects an interface reference? ;)

> Code compiled in debug mode (or was it not-release mode) also calls the code to check the invariant, even if you didn't define one. I guess this can make calling struct methods much faster than object methods.

Invariants (as well as in/out contracts and assertions) are turned off in release mode. FWIW, struct methods also do an "assert(this !is null);" in debug mode, so they're sort of doing an invariant check. But struct methods are never virtual, so yes, they will in general be faster.
Feb 02 2009
Jarrett Billingsley wrote:
> On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
>> Why not use scope to allocate the class on the stack?
>>
>> For everything else, I agree with Donald Knuth (if he really said that...)
>
> That's fine too, and would fit in with his needs to implement interfaces. But again, if he's worried about caching some parameters but not worried about the overhead of virtual calls.. something's off.

You're assuming too much programming knowledge and carelessness on my part. I merely wanted to know if the second solution would be significantly slower than the first one. Caching of the parameters would be a bonus, as would caching of additional output and the ability to use interfaces.

-Lars
Feb 02 2009
Lars Kyllingstad:
> I merely wanted to know if the second solution would be significantly slower than the first one.

No amount of theory can replace actual timings of your code snippets :-)
(It's often true the other way too: practice doesn't replace theory. But here there isn't too much theory, so a lot of practice suffices if you don't know the theory.)

Bye,
bearophile
Feb 02 2009
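One way to act on this advice is a small timing harness. The sketch below uses `StopWatch` from `std.datetime.stopwatch` in current Phobos, which did not exist in this form when the thread was written. The two workspace types and the helper `timeCalls` are hypothetical, and the trivial multiplication stands in for the real calculation.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;

/// Hypothetical struct variant of the workspace.
struct StructWorkspace
{
    int someParam;
    real myFunction(real arg) { return arg * someParam; }
}

/// Hypothetical class variant of the workspace.
class ClassWorkspace
{
    int someParam;
    this(int someParam) { this.someParam = someParam; }
    real myFunction(real arg) { return arg * someParam; }
}

/// Calls f.myFunction n times and returns the elapsed time.
/// The accumulated sum is handed back through `total` so the
/// loop has an observable result.
Duration timeCalls(F)(F f, int n, out real total)
{
    total = 0;
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. n)
        total += f.myFunction(i);
    sw.stop();
    return sw.peek();
}

unittest
{
    real total;
    auto tStruct = timeCalls(StructWorkspace(2), 4, total);
    assert(total == 12.0L);
    auto tClass = timeCalls(new ClassWorkspace(2), 4, total);
    assert(total == 12.0L);
    assert(tStruct >= Duration.zero && tClass >= Duration.zero);
}
```

For a meaningful comparison the call count would need to be far larger than in this test, the build should be an optimized release build, and the measurement should be repeated several times.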
Lars Kyllingstad wrote:
> [snip]
>
> -Lars

If I understand right that your main concern is with parameters that are used over and over and over again -- which I can empathize with -- you could also look into function currying. Assuming you are using Phobos, the module you want to look at is std.bind, usage of which is pretty straightforward. Given a function:

    real pow (real base, real exp);

You could emulate a square() function via std.bind like so:

    square = bind(&pow, _0, 2.0);
    square(42.0); // same as: pow(42.0, 2.0)

If you are using Tango, I'm honestly not sure off the top of my head what the relevant module is, but you could always install Tangobos and use std.bind just fine.

All that being said, I have no experience with currying functions with inout parameters. If my understanding of how std.bind works its magic is right, it should be fine. I believe it wraps the call up in a structure, which means the actual parameter will be from a field of said structure... which, actually, means it could also store state. That in itself could be an interesting capability.

-- Chris Nicholson-Sauls
Feb 02 2009
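For readers on current D (D2), where std.bind is no longer part of Phobos, the same idea can be sketched with `std.functional.partial` and a plain wrapper function. This is a substitute technique, not the std.bind API from the post. `partial` fixes only the first argument, so binding the exponent, as in the square() example, is done here with a wrapper. The function name `power` is hypothetical and is used to avoid clashing with `std.math.pow`.

```d
import std.functional : partial;
import std.math : isClose, pow;

/// Plain two-argument function, standing in for the post's pow.
real power(real base, real exponent)
{
    return pow(base, exponent);
}

/// `partial` binds the first argument: twoToThe(e) is power(2, e).
alias twoToThe = partial!(power, 2.0L);

/// Binding the second argument needs a small wrapper instead.
real square(real x)
{
    return power(x, 2.0L);
}

unittest
{
    assert(isClose(square(42.0L), 1764.0L));    // same as power(42, 2)
    assert(isClose(twoToThe(10.0L), 1024.0L));  // same as power(2, 10)
}
```

Both forms are resolved at compile time, so an optimizing compiler has the opportunity to inline the extra call.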
Chris Nicholson-Sauls wrote:
> Lars Kyllingstad wrote:
>> [snip]
>
> [snip]
> If you are using Tango, I'm honestly not sure off the top of my head what the relevant module is, but you could always install Tangobos and use std.bind just fine.
> [snip]

Most of the time I use Tango, but in this particular case I don't want my code to depend on either library. Also I'm not sure whether the std.bind functionality is even present in Tango. I could always write my.own.bind, though.

Your solution is nice from a usability perspective, in that it reuses function arguments -- possibly even inout ones. From a performance perspective, however, it carries with it the overhead of an extra function call, which I'm not sure I want.

-Lars
Feb 02 2009
Lars Kyllingstad wrote:
> [snip] From a performance perspective, however, it carries with it the overhead of an extra function call, which I'm not sure I want.
>
> -Lars

You're worried about a second function call which could potentially be inlined, yet you're seemingly not worried about the overhead of virtual calls or heap allocations...

Allow me to quote Donald Knuth:

> We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Unless you're doing something where you *know* you're going to need every last cycle, just go with whichever design works best. Your response to Jarrett implies that you've already got a design in mind, and are just fishing for a magic "make it go faster button."

Believe me, if Walter had invented such a thing, he wouldn't be wasting his time putting up with us; he'd be too busy smoking $100 bills from the comfort of his SPACE FORTRESS. :D

In any case, I'm willing to bet that if there *are* inefficiencies you're not going to know exactly where until you've written the code, anyway. :P If classes work, and make for an elegant design, go for it.

-- Daniel
Feb 02 2009
Daniel Keep wrote:
> Lars Kyllingstad wrote:
>> [snip] From a performance perspective, however, it carries with it the overhead of an extra function call, which I'm not sure I want.
>>
>> -Lars
>
> You're worried about a second function call which could potentially be inlined, yet you're seemingly not worried about the overhead of virtual calls or heap allocations...

But that's the problem, you see. I don't know how expensive these operations are, hence my initial question(s). (This was also why I posted my question in D.learn.)

For instance, I didn't know (not sure I still do) what the cost is of frequent allocation/deallocation/access of stack memory vs. infrequent allocation/deallocation and frequent access of heap memory. From the replies I've got, it seems heap variables make for significantly slower code.

Nor was I sure, as you pointed out, how expensive a virtual function call is vs. an extra non-virtual function call.

I'm a physicist, not a computer scientist. :)

> Allow me to quote Donald Knuth:
>
>> We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
>
> Unless you're doing something where you *know* you're going to need every last cycle, just go with whichever design works best. Your response to Jarrett implies that you've already got a design in mind, and are just fishing for a magic "make it go faster button."

I want that button, yes. :) But seriously, I am doing numerical computations, so performance is absolutely an issue. The main thing I wanted to know was, can I have both performance and usability, or do I have to choose? With Jarrett's suggestion I can, to some degree, have both.

> Believe me, if Walter had invented such a thing, he wouldn't be wasting his time putting up with us; he'd be too busy smoking $100 bills from the comfort of his SPACE FORTRESS. :D

What are you implying, that he wouldn't make it open-source? :)

> In any case, I'm willing to bet that if there *are* inefficiencies you're not going to know exactly where until you've written the code, anyway. :P If classes work, and make for an elegant design, go for it.
>
> -- Daniel
Feb 02 2009
Lars Kyllingstad wrote:
> Daniel Keep wrote:
>> [snip]
>
> But that's the problem, you see. I don't know how expensive these operations are, hence my initial question(s). (This was also why I posted my question in D.learn.) For instance, I didn't know (not sure I still do) what the cost is of frequent allocation/deallocation/access of stack memory vs. infrequent allocation/deallocation and frequent access of heap memory. From the replies I've got, it seems heap variables make for significantly slower code.

Allocating stack memory is very cheap, because essentially the only thing that has to be done is to offset a stack pointer. Some stack variables are even optimized away if only used as temporaries (that is, their value is retained in a register until it isn't needed) and for short durations.

Allocating heap memory, on the other hand, is expensive for two reasons. The first is that the heap may have to grow, which means negotiating more memory from the operating system, which means switching the CPU back and forth between modes, sometimes several iterations. Of course, this doesn't happen on every allocation, or even very often if you're careful. The second reason is that before every allocation the garbage collector will perform a collection run. This can actually be disabled (at least in theory) if you plan on doing several allocations in a short period of time, and thereafter re-enabled. For the latter case, see Phobos 'std.gc' or Tango 'tango.core.Memory'.

Once you have memory allocated, the cost of access is generally about the same, except that the stack is more likely to be cached by the CPU. (Since it is inevitably accessed often.)

> Nor was I sure, as you pointed out, how expensive a virtual function call is vs. an extra non-virtual function call.

It adds an additional step. You start with an index into the object's vtable (a list of pointers) rather than the function's actual address. It's essentially the same as the difference between assigning to an 'int**' versus an 'int*'.

> I'm a physicist, not a computer scientist. :)

Which is a good thing, since D could use more experience from non-programmers who need to program. That's a demographic that occasionally (but never completely!) gets forgotten. I'm not exactly a thirty-years guru, myself.

-- Chris Nicholson-Sauls
Feb 03 2009
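To make the points about stack allocation and vtable lookups concrete, here is a minimal sketch in D. The class `Scale` and its methods are hypothetical names made up for illustration; they are not taken from the original poster's code. It assumes a D2 compiler, where class methods are virtual by default and `scope` on a local class variable requests stack placement, as suggested earlier in the thread.

```d
import std.stdio;

// Hypothetical functor: stores a parameter once, then gets called many times.
class Scale
{
    private double factor;

    this(double factor) { this.factor = factor; }

    // Virtual by default: the call is looked up through the object's vtable.
    double apply(double x) { return factor * x; }

    // final: cannot be overridden, so the compiler can call it directly
    // and is free to inline it.
    final double applyFinal(double x) { return factor * x; }
}

void main()
{
    // Heap allocation through the garbage collector.
    auto onHeap = new Scale(2.0);

    // `scope` asks the compiler to place the instance on the stack;
    // it is destroyed when the enclosing scope exits.
    scope onStack = new Scale(2.0);

    writeln(onHeap.apply(21.0));
    writeln(onStack.applyFinal(21.0));
}
```

Whether the difference between the two calls is measurable depends on the compiler and on how hot the call site is, so timing the real code remains the only reliable answer.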
Chris Nicholson-Sauls wrote:
> [snip]

Thank you for a very informative reply. :) -Lars
Feb 03 2009
On Tue, Feb 3, 2009 at 3:44 PM, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
> The second reason is that before every allocation the garbage collector will perform a collection run. This can actually be disabled (at least in theory) if you plan on doing several allocations in a short period of time, and re-enabled thereafter.

It should be "before every allocation the garbage collector *may* perform a collection run." If it collected on every allocation it would make your program's execution speed next to useless ;)
Feb 03 2009
Jarrett Billingsley wrote:
> On Tue, Feb 3, 2009 at 3:44 PM, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
>> The second reason is that before every allocation the garbage collector will perform a collection run. This can actually be disabled (at least in theory) if you plan on doing several allocations in a short period of time, and re-enabled thereafter.
> It should be "before every allocation the garbage collector *may* perform a collection run." If it collected on every allocation it would make your program's execution speed next to useless ;)

Well okay, yes, it *may*. I was in a hurry and trying to be general. ;) Chances are, though, that if you are doing so many allocations in a short period as to be worried about it, it probably will. If I remember right, the current GC runs a collection just before requesting more heap, so it's actually related to the first issue. (I may well remember wrong; it's been a very long time since I dove into the GC code.) -- Chris Nicholson-Sauls
Feb 03 2009
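The idea of switching collections off around a burst of allocations can be sketched as follows. This assumes a current D2 runtime, where the relevant calls live in `core.memory` rather than in the 'std.gc' and 'tango.core.Memory' modules named in the thread; the helper `buildRows` and the row size of 64 are hypothetical and only there for illustration.

```d
import core.memory : GC;

// Hypothetical helper: performs many small heap allocations in a row.
double[][] buildRows(size_t count)
{
    GC.disable();              // ask the runtime to skip automatic collections
    scope (exit) GC.enable();  // re-enable on every exit path, including exceptions

    double[][] rows;
    foreach (i; 0 .. count)
        rows ~= new double[](64);   // each row is a separate GC allocation

    return rows;
}

void main()
{
    auto rows = buildRows(1_000);
    assert(rows.length == 1_000);
    assert(rows[0].length == 64);
}
```

Even with collections disabled, the runtime may still collect if it runs out of memory, so this is a hint to time and compare rather than a guarantee.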