digitalmars.D - Allocation policy in Phobos - Was: Vote for std.process

Manu <turkeyman gmail.com> writes:

On 12 April 2013 21:00, Vladimir Panteleev <vladimir thecybershadow.net> wrote:

 Consider the following hypothetical decisions and outcomes:

 1. std.process is left as is. One user is angry / turned away because it
 performs 0.1% slower than it can be.

 2. std.process is rewritten to minimize allocations. Code complexity goes
 up, new improvements are challenging to add; bugs pop up and go unfixed for
 a while because fewer programmers are qualified or willing to commit the
 effort of making correct fixes. More people are angry / turned away from D
 because its standard library is buggy.

 Of course, the above is an exaggerated illustration. But would optimizing
 all code left and right really make more D users happier?

Just to be clear, I'm not arguing optimisation for performance here, I'm
arguing intolerance for __unnecessary__ allocations as a policy, or at
least a habit.
There's a whole separate thread on the topic of fighting unnecessary
garbage, and having the ability to use D with strict control over the GC
and/or allocation in general.

If std functions have no reason to allocate, why should they?

 There's also the question of priorities. Would you rather that effort is
 spent on optimizing std.process (and dealing with all the fallout from any
 such optimizations), or working on something that is acutely missing and
 hurting D?


If it's somehow hard to put a string on the stack, then there may be a hole
in phobos. I'm not suggesting changes that are somehow hard to implement,
or obscure in some way... they should be utterly trivial.

 D is a systems programming language, there is hope that it will penetrate
 a wide range of systems and environments - sure in many cases a little bit
 of memory use or performance loss is unimportant, but for many it will be
 the decisive factor which makes D unusable there.

 This is surely an exaggeration.

 D does not attempt to please everyone out there who is choosing a
 programming language for their next project. There is no such language, nor
 can one exist. One has to accept that D has a number of goals, none of
 which are absolute, but merely point towards a certain, but not overly
 specific, point in the multidimensional matrix of trade-offs. D never was
 about achieving maximum performance in all possible cases.

And I never suggested we scrap phobos and rewrite it so it maximises
performance at all costs. I highlighted the issue and suggested trivial
changes that would make a big difference and don't hurt anyone. If it were
the habit of phobos devs to consider and try to avoid unnecessary
allocations (almost all of which could be avoided by using the stack
wherever applicable), the situation would be much better in general.
End-users can write D code however they want, but phobos should strive to be
usable in as many types of software as possible, otherwise what good is a
standard library?

Apr 12 2013

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Friday, 12 April 2013 at 12:52:39 UTC, Manu wrote:
 If it's somehow hard to put a string on the stack, then there may be a hole
 in phobos. I'm not suggesting changes that are somehow hard to implement,
 or obscure in some way... they should be utterly trivial.

Well, ironically or not, it is not something utterly trivial.

The main issue is that the stack can't hold a lot of data. This 
is not a problem with the heap, which is limited by the amount of 
memory and address space; these (usually abundant) limits are 
usually the user's concern, not the programmer's.

Did you know that Linux does not impose a limit on the size of
the environment? The default stack size seems to be 8MB... Now,
what would happen on a machine that, for one reason or another,
has an environment larger than that, if std.process did not
account for it?

So, to perform the task correctly, std.process would need to 
perform most allocations on the stack if they are up to a certain 
size, and on the heap otherwise.

What would be a good limit for stack allocations? You may want to 
choose a value based on whatever's the default stack size on 
today's Linux versions (after all, std.process is near the "leaf" 
parts of call stacks). However, certain applications create a lot 
of stacks, for example for use in lightweight threads (fibers). 
When restricted by a small address space (32-bit architecture), 
the stacks need to be much smaller than usual...

 I highlighted, and suggested trivial changes that
 would make a big difference and don't hurt anyone.

Well, why do you think they would make a big difference in 
std.process?

I don't think any of the Phobos developers are against improving 
performance when the cost is low. So, it's not that I think 
you're wrong in general, but that the std.process scapegoat (for
lack of a better word) was not the best choice.

I suggest that you file enhancement requests on Bugzilla for each 
specific component of Phobos / Druntime, improving the allocation 
behavior of which would result in a real-world benefit for you.

Apr 12 2013

Manu <turkeyman gmail.com> writes:

On 12 April 2013 23:32, Vladimir Panteleev <vladimir thecybershadow.net> wrote:

 On Friday, 12 April 2013 at 12:52:39 UTC, Manu wrote:

 If it's somehow hard to put a string on the stack, then there may be a hole
 in phobos. I'm not suggesting changes that are somehow hard to implement,
 or obscure in some way... they should be utterly trivial.

 Well, ironically or not, it is not something utterly trivial.

 The main issue is that the stack can't hold a lot of data. This is not a
 problem with the heap, which is limited by the amount of memory and address
 space; these (usually abundant) limits are usually the user's concern, not
 the programmer's.

 Did you know that Linux does not impose a limit on the size of the
 environment? The default stack size seems to be 8MB... Now, what would
 happen on a machine that, for one reason or another, has an environment
 larger than that, if std.process did not account for it?

 So, to perform the task correctly, std.process would need to perform most
 allocations on the stack if they are up to a certain size, and on the heap
 otherwise.

 What would be a good limit for stack allocations? You may want to choose a
 value based on whatever's the default stack size on today's Linux versions
 (after all, std.process is near the "leaf" parts of call stacks). However,
 certain applications create a lot of stacks, for example for use in
 lightweight threads (fibers). When restricted by a small address space
 (32-bit architecture), the stacks need to be much smaller than usual...


Yes, you're right, there's an 'if' required here to catch unreasonably
large environment blocks, but I still consider that within the realm of
'trivial'.
The environment is processed in an appending loop: just check that the next
piece fits, and if it would overflow 1kb or so of stack buffer, fall back to
the heap and continue.

I reckon helpers could be written to assist with common cases of this
(which would have to be mixin template based I guess?)...
And I really like the variable-length static array idea!
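
[Editorial sketch, not part of the original post.] The approach described above (append into a small fixed buffer and switch to the heap only when the next piece would not fit) might look roughly like the following. It is written in C rather than D purely for illustration. The names `EnvBuf`, `envbuf_append`, `envbuf_append_var` and the 1024-byte `INLINE_CAP` are assumptions made for this sketch; none of them come from std.process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed threshold: "1kb or so", as suggested in the post. */
enum { INLINE_CAP = 1024 };

/* Hypothetical append buffer: starts in inline storage, spills to the heap.
   Do not copy this struct by value while in use: data may point into
   its own inline_buf. */
typedef struct {
    char   inline_buf[INLINE_CAP]; /* on the stack when EnvBuf is a local */
    char  *data;                   /* inline_buf or a malloc'd block */
    size_t len;                    /* bytes used */
    size_t cap;                    /* bytes available in data */
} EnvBuf;

void envbuf_init(EnvBuf *b) {
    b->data = b->inline_buf;
    b->len = 0;
    b->cap = INLINE_CAP;
}

bool envbuf_on_heap(const EnvBuf *b) {
    return b->data != b->inline_buf;
}

/* Append n bytes. Returns false (buffer unchanged) if memory runs out. */
bool envbuf_append(EnvBuf *b, const char *src, size_t n) {
    assert(b != NULL && (src != NULL || n == 0));
    if (n > b->cap - b->len) {
        /* The next piece does not fit: grow geometrically on the heap. */
        size_t new_cap = b->cap;
        while (n > new_cap - b->len) {
            if (new_cap > SIZE_MAX / 2) return false;
            new_cap *= 2;
        }
        char *p;
        if (envbuf_on_heap(b)) {
            p = realloc(b->data, new_cap);
            if (p == NULL) return false;
        } else {
            p = malloc(new_cap);
            if (p == NULL) return false;
            memcpy(p, b->inline_buf, b->len); /* migrate stack contents */
        }
        b->data = p;
        b->cap = new_cap;
    }
    if (n > 0) memcpy(b->data + b->len, src, n);
    b->len += n;
    return true;
}

/* Append one "KEY=VALUE\0" entry; rolls back the length on failure. */
bool envbuf_append_var(EnvBuf *b, const char *key, const char *value) {
    size_t mark = b->len;
    if (envbuf_append(b, key, strlen(key)) &&
        envbuf_append(b, "=", 1) &&
        envbuf_append(b, value, strlen(value) + 1)) {
        return true;
    }
    b->len = mark;
    return false;
}

/* Release heap storage, if any, and reset to the inline buffer. */
void envbuf_free(EnvBuf *b) {
    if (envbuf_on_heap(b)) free(b->data);
    envbuf_init(b);
}
```

A caller would declare an `EnvBuf` as a local, call `envbuf_init`, append entries in its loop, and call `envbuf_free` on the way out. The heap is touched only when the accumulated block exceeds the inline capacity, which is the 'if' discussed above.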

 I highlighted, and suggested trivial changes that
 would make a big difference and don't hurt anyone.

 Well, why do you think they would make a big difference in std.process?

 I don't think any of the Phobos developers are against improving
 performance when the cost is low. So, it's not that I think you're wrong in
 general, but that the std.process scapegoat (for lack of a better word) was
 not the best choice.

Fuck, I've repeated myself so many times now. The point I make is a general
issue I have with phobos, I consider it an issue that should be made policy
(irrespective of module being considered), and std.process came into
question right at the moment I thought to make the point. It may not be the
strongest case for the principle, it's just the one that appeared.

I suggest that you file enhancement requests on Bugzilla for each specific
 component of Phobos / Druntime, improving the allocation behavior of which
 would result in a real-world benefit for you.

I'll start doing it myself, but I also suggest it be made a policy, and
carefully considered when considering acceptance of ANY new module. That
way, new code that suffers the unpredictable/"surprise!" allocation
problems won't be introduced.

Apr 12 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 4/12/2013 7:05 AM, Manu wrote:
 I'll start doing it myself, but I also suggest it be made a policy, and
 carefully considered when considering acceptance of ANY new module. That way,
 new code that suffers the unpredictable/"surprise!" allocation problems won't
 be introduced.

I would also expect that Phobos modules that know the lifetimes of their 
allocated data use malloc/free rather than the gc.

Of course, that entails more effort in coding the modules to ensure no leaks, 
but we can certainly expect that of phobos developers.

Apr 12 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 12 Apr 2013 13:41:57 -0400, Walter Bright <newshound2 digitalmars.com> wrote:

 On 4/12/2013 7:05 AM, Manu wrote:
 I'll start doing it myself, but I also suggest it be made a policy, and
 carefully considered when considering acceptance of ANY new module. That way,
 new code that suffers the unpredictable/"surprise!" allocation problems won't
 be introduced.

 I would also expect that Phobos modules that know the lifetimes of their  
 allocated data use malloc/free rather than the gc.

I would like a better solution.  Allocating things with malloc/free means
either that no GC references can be stored in them, or clunky
addRoot/removeRoot calls.  That is dangerous to say the least.

What about dsimcha's region allocator?

-Steve

Apr 12 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 4/12/2013 10:56 AM, Steven Schveighoffer wrote:
 On Fri, 12 Apr 2013 13:41:57 -0400, Walter Bright <newshound2 digitalmars.com> wrote:

 On 4/12/2013 7:05 AM, Manu wrote:
 I'll start doing it myself, but I also suggest it be made a policy, and
 carefully considered when considering acceptance of ANY new module. That way,
 new code that suffers the unpredictable/"surprise!" allocation problems won't
 be introduced.

 I would also expect that Phobos modules that know the lifetimes of their
 allocated data use malloc/free rather than the gc.

 I would like a better solution.  Allocating things with malloc/free means
 either that no GC references can be stored in them, or clunky
 addRoot/removeRoot calls.  That is dangerous to say the least.

Yes, it takes some engineering work to do it right.

 What about dsimcha's region allocator?

Seems like overkill for a small issue.

Apr 12 2013

"deadalnix" <deadalnix gmail.com> writes:

On Friday, 12 April 2013 at 17:41:59 UTC, Walter Bright wrote:
 On 4/12/2013 7:05 AM, Manu wrote:
 I'll start doing it myself, but I also suggest it be made a policy, and
 carefully considered when considering acceptance of ANY new module. That way,
 new code that suffers the unpredictable/"surprise!" allocation problems won't
 be introduced.

 I would also expect that Phobos modules that know the lifetimes 
 of their allocated data use malloc/free rather than the gc.

Why not use GC.free? malloc'd memory is invisible to the GC, so nothing
GCed can be stored there safely.

Apr 12 2013

"Kagamin" <spam here.lot> writes:

On Friday, 12 April 2013 at 17:41:59 UTC, Walter Bright wrote:
 I would also expect that Phobos modules that know the lifetimes 
 of their allocated data use malloc/free rather than the gc.

Or even better use new/delete. Delete also nullifies the pointer.

Apr 12 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 4/12/2013 5:52 AM, Manu wrote:
 Just to be clear, I'm not arguing optimisation for performance here, I'm
 arguing intolerance for __unnecessary__ allocations as a policy, or at least
 a habit.
 There's a whole separate thread on the topic of fighting unnecessary garbage,
 and having the ability to use D with strict control over the GC and/or
 allocation in general.

 If std functions have no reason to allocate, why should they?

Absolutely right. All phobos functions should not allocate unless absolutely 
necessary.

Apr 12 2013

"Andrej Mitrovic" <andrej.mitrovich gmail.com> writes:

On Friday, 12 April 2013 at 17:37:52 UTC, Walter Bright wrote:
 Absolutely right. All phobos functions should not allocate 
 unless absolutely necessary.

Well that came out of nowhere, when has this rule ever been
defined anywhere?

Apr 12 2013

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 12 Apr 2013 13:37:50 -0400, Walter Bright <newshound2 digitalmars.com> wrote:

 On 4/12/2013 5:52 AM, Manu wrote:
 Just to be clear, I'm not arguing optimisation for performance here, I'm
 arguing intolerance for __unnecessary__ allocations as a policy, or at least
 a habit.
 There's a whole separate thread on the topic of fighting unnecessary garbage,
 and having the ability to use D with strict control over the GC and/or
 allocation in general.

 If std functions have no reason to allocate, why should they?

 Absolutely right. All phobos functions should not allocate unless  
 absolutely necessary.

Define "absolutely."  For example, there was an objection to accepting an  
AA as an "environment" map to std.process.spawnX functions because even  
though reading the AA would not require allocation, allocation would  
certainly be required to build the AA.  Is that acceptable?  Certainly we  
could invent a new non-allocating map type and accept that instead.

I think we need clearer lines drawn here, if they are to be respected.

-Steve

Apr 12 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 4/12/2013 10:51 AM, Steven Schveighoffer wrote:
 Define "absolutely."  For example, there was an objection to accepting an AA as
 an "environment" map to std.process.spawnX functions because even though
 reading the AA would not require allocation, allocation would certainly be
 required to build the AA.  Is that acceptable?  Certainly we could invent a new
 non-allocating map type and accept that instead.

 I think we need clearer lines drawn here, if they are to be respected.

I think this is best done on a case-by-case basis using best engineering
judgement.

Apr 12 2013

Manu <turkeyman gmail.com> writes:

On 13 April 2013 03:51, Steven Schveighoffer <schveiguy yahoo.com> wrote:

 On Fri, 12 Apr 2013 13:37:50 -0400, Walter Bright <newshound2 digitalmars.com> wrote:

 On 4/12/2013 5:52 AM, Manu wrote:
 Just to be clear, I'm not arguing optimisation for performance here, I'm
 arguing intolerance for __unnecessary__ allocations as a policy, or at least
 a habit.
 There's a whole separate thread on the topic of fighting unnecessary garbage,
 and having the ability to use D with strict control over the GC and/or
 allocation in general.

 If std functions have no reason to allocate, why should they?

 Absolutely right. All phobos functions should not allocate unless
 absolutely necessary.

 Define "absolutely."  For example, there was an objection to accepting an
 AA as an "environment" map to std.process.spawnX functions because even
 though reading the AA would not require allocation, allocation would
 certainly be required to build the AA.  Is that acceptable?  Certainly we
 could invent a new non-allocating map type and accept that instead.

 I think we need clearer lines drawn here, if they are to be respected.

Great! I was raising the issue, with the intent to open it for discussion.

I never said an AA was intrinsically bad, only that it is impossible to
call the function with an environment without allocating; i.e., there is no
way to pass a literal. Since the AA is just parsed and piped straight
through to a system call, the allocation seems redundant to me.
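
[Editorial sketch, not part of the original post.] One allocation-free way to accept an environment, assuming the callee only needs to read it, is a NULL-terminated array of "KEY=VALUE" strings, which is the layout POSIX `execve` takes for its `envp` argument. Such an array can be written as a literal with static storage, so the caller allocates nothing. The helper `env_lookup` and the array `demo_env` below are hypothetical names used for illustration, and the code is C rather than D.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look up key in a NULL-terminated array of
   "KEY=VALUE" strings. Returns a pointer to the value inside the
   existing string, or NULL if the key is absent. No allocation. */
const char *env_lookup(const char *const *envp, const char *key) {
    assert(envp != NULL && key != NULL);
    size_t klen = strlen(key);
    for (; *envp != NULL; ++envp) {
        /* Match the whole key, followed immediately by '='. */
        if (strncmp(*envp, key, klen) == 0 && (*envp)[klen] == '=') {
            return *envp + klen + 1;
        }
    }
    return NULL;
}

/* An environment written as a literal: static storage, no heap use. */
const char *const demo_env[] = { "HOME=/home/user", "LANG=C", NULL };
```

Lookup here is linear, which is a deliberate trade-off in this sketch: for the handful of variables typically passed to a child process, a flat array avoids building a map at all.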

Apr 12 2013
