digitalmars.D - Was: Re: Vote for std.process
- Regan Heath (35/93) Apr 12 2013 non localised improvements have a fixed cost and ever increasing benefit...
- Lars T. Kyllingstad (32/33) Apr 12 2013 Thanks!
- Vladimir Panteleev (4/10) Apr 12 2013 Multiple chained array concatenations are performed at once,
- Lars T. Kyllingstad (4/16) Apr 12 2013 Good to know.
- Manu (9/37) Apr 12 2013 That's beautiful!
- Vladimir Panteleev (35/66) Apr 12 2013 OK, but so far my interpretation and replies were mostly in the
- Manu (36/92) Apr 12 2013 If allocating a string on the stack makes it buggy, then there is someth...
- Vladimir Panteleev (23/53) Apr 12 2013 env ~= "FOO=BAR";
- Manu (23/68) Apr 12 2013 I didn't see any attempt to index the array by key in this case. That's
- Vladimir Panteleev (87/108) Apr 12 2013 Sorry, not following.
- Manu (18/119) Apr 12 2013 I didn't see anywhere where it was possible to example the current
- Vladimir Panteleev (23/45) Apr 12 2013 There's the "environment" object. It generally acts like a
- Mr. Anonymous (37/65) Apr 12 2013 I just thought of something!
- Mr. Anonymous (5/5) Apr 13 2013 After some Googling, I found out a similar technique already
- Timothee Cour (12/18) Apr 13 2013 That would be great. Question about the type: it won't be int[n] as n
- Lars T. Kyllingstad (18/28) Apr 12 2013 Environment variables are a mapping of strings to strings. The
- Manu (6/32) Apr 12 2013 That's a good point, do AA's support literals that don't allocate? You
- Steven Schveighoffer (13/30) Apr 12 2013 No, because it would have to be COW.
I've moved this to another thread to allay complaints.

"Vladimir Panteleev" <vladimir thecybershadow.net> wrote:
> On Friday, 12 April 2013 at 10:14:35 UTC, Regan Heath wrote:
>> All true. However, complexity can and should be packaged in such a way as to localise, and this localised complex code should be tested to death and maintained by someone who understands it. It should be bracketed by sufficient comments and warnings about how/why it does what it does. The resulting packaged complexity, with its associated cost, can be re-used many times over for all the benefit it gives.
>>
>> Stack allocating the environment variables need not be a localised improvement but could be a standard library function which can be reused, for example.
>
> Performant abstractions. Like the much-awaited allocator design? Either way, they can only diminish, not remove the costs.

Non-localised improvements have a fixed cost and ever-increasing benefit - that is the point I was making here. Agreed, you cannot remove the cost, and in fact well-written reusable code often carries a slightly higher cost by its very nature.

>>> Would you still say that the above costs are worth the nearly-intangible gain?
>>
>> "nearly-intangible" is wrong. Library code is code which is used by (hopefully) millions of people, writing millions of applications, running for millions of hours, on millions of systems, creating thousands of processes, etc. In short, a little effort now pays massive dividends over its lifetime.
>>
>> So, yes, IMO the costs shown above are worth the resulting gains.
>>
>> D is constantly being compared to other languages on the basis of performance, so it's clearly an important aspect of D's success.
>>
>> Library code needs to work first time and work well, or people will roll their own, wasting time and energy and in many cases getting some aspect of it just plain wrong.
>
> But once again, you speak in vague terms.

The initial point was a vague one, not a specific one. Manu wasn't attempting to block std.process, he had a general concern which I share.

> Consider the following hypothetical decisions and outcomes:
>
> 1. std.process is left as is. One user is angry / turned away because it performs 0.1% slower than it could.

It very much matters *who* that 1 user is. And the count may be higher, and we might never "hear" from these people as they find other solutions. We're lucky that some people who try D and have issues tell us about them; they may be 5% of the total for all we know.

> 2. std.process is rewritten to minimize allocations. Code complexity goes up, new improvements are challenging to add; bugs pop up and go unfixed for a while because fewer programmers are qualified or willing to commit the effort of making correct fixes. More people are angry / turned away from D because its standard library is buggy.
>
> Of course, the above is an exaggerated illustration.

In reality the suggested improvements would add only very minor complexity and prevent none of the current crop of contributors from working with/on std.process.

> But would optimizing all code left and right really make more D users happier?

Yes, as well as the users of their applications. True, none of them will even realise they could have been less happy, so none of them will realise the effort that went into it, but all of them will be better off.

> There's also the question of priorities. Would you rather that effort is spent on optimizing std.process (and dealing with all the fallout from any such optimizations), or working on something that is acutely missing and hurting D?

Add the missing items, without a doubt - which is why no-one is suggesting blocking std.process over this issue.

> Why?

There exist platforms and environments where memory and performance are concerns. If the D standard library code is not "careful" in its use of both, then it will be less suitable than C (for example) and so D will not penetrate those platforms.

Manu is using D for games development on modern high-end gaming PCs and he is still concerned with memory and performance.

So, there are 2 very different cases where memory and performance are still a concern, and if they become too much of a concern another solution will have to be sought, and that's bad news for D.

>> D is a systems programming language, there is hope that it will penetrate a wide range of systems and environments - sure, in many cases a little bit of memory use or performance loss is unimportant, but for many it will be the decisive factor which makes D unusable there.
>
> This is surely an exaggeration. D does not attempt to please everyone out there who is choosing a programming language for their next project. There is no such language, nor can one exist. One has to accept that D has a number of goals, none of which are absolute, but merely point towards a certain, but not overly specific, point in the multidimensional matrix of trade-offs. D never was about achieving maximum performance in all possible cases.

All true, but performance is one of D's top draw cards:

<quote>The D programming language. Modern convenience. Modeling power. Native **efficiency**.</quote>

(**emphasis mine**)

So, it behoves us to make sure the standard library keeps that in mind.

R
Apr 12 2013
On Friday, 12 April 2013 at 11:37:14 UTC, Regan Heath wrote:
> I've moved this to another thread to allay complaints.

Thanks!

I completely agree that if code can be made more performant without a significant increase in complexity, then we should do so. While it is mostly (but not entirely) irrelevant in the context of std.process, it is a problem that should be tackled in Phobos as a whole. Several things could/should be done:

It would be nice to have some sugar on top of alloca(), the use of which is usually considered bad practice. Someone (bearophile?) once suggested static arrays whose length is determined at runtime, which would be a great addition to the language:

    void foo(int n)
    {
        int[n] myArr;
        ...
    }

Furthermore, we need to get the allocator design in place. In SciD, I use David Simcha's region allocator to allocate temporary workspace, and it works really well. The only time I use 'new' is when I need persistent memory (e.g. for a return value) and the user-supplied buffer is too small. Phobos would greatly benefit from doing this too.

Finally, an example from the new std.process which got some heavy criticism in the other thread:

    envz[pos++] = (var~'='~val~'\0').ptr;

I have been operating under the assumption that the compiler is smart enough to make the above a single allocation. If it isn't, I would consider it a compiler issue.

That said, I am aware that std.process could be improved in some places.

Lars
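The "user-supplied buffer, with 'new' only as a fallback" pattern Lars describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from SciD or Phobos; `formatEnvEntry` is a hypothetical name chosen for this example.

```d
/// Writes "var=val\0" into `buf` when it fits; otherwise falls back to a
/// GC allocation. `formatEnvEntry` is a hypothetical helper, not a Phobos API.
char[] formatEnvEntry(const(char)[] var, const(char)[] val, char[] buf)
{
    immutable needed = var.length + val.length + 2; // room for '=' and '\0'

    // Use the caller's buffer if it is large enough, else allocate.
    char[] dst = needed <= buf.length ? buf[0 .. needed] : new char[needed];

    dst[0 .. var.length] = var[];
    dst[var.length] = '=';
    dst[var.length + 1 .. needed - 1] = val[];
    dst[needed - 1] = '\0';
    return dst;
}

unittest
{
    char[64] scratch = void;              // stack storage owned by the caller
    auto entry = formatEnvEntry("FOO", "BAR", scratch[]);
    assert(entry == "FOO=BAR\0");
    assert(entry.ptr is scratch.ptr);     // the heap was not touched
}
```

The caller decides where the memory lives, so the common case stays on the stack while oversized input still works.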
Apr 12 2013
On Friday, 12 April 2013 at 12:30:09 UTC, Lars T. Kyllingstad wrote:
> Finally, an example from the new std.process which got some heavy criticism in the other thread:
>
>     envz[pos++] = (var~'='~val~'\0').ptr;
>
> I have been operating under the assumption that the compiler is smart enough to make the above a single allocation. If it isn't, I would consider it a compiler issue.

Multiple chained array concatenations are performed at once, using the _d_arraycatnT function in Druntime.
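To make the distinction concrete, here is a small sketch contrasting the chained form with a step-by-step build. The function names are hypothetical, and the allocation counts in the comments follow the description above rather than anything measured here.

```d
/// Builds a NUL-terminated "var=val" entry with one chained concatenation.
/// `makeEnvEntry` is an illustrative name, not part of std.process.
const(char)* makeEnvEntry(string var, string val)
{
    // A single expression: per the post above, the whole chain is handed to
    // one runtime call, which can size the result once and copy each piece.
    return (var ~ '=' ~ val ~ '\0').ptr;
}

/// Same result built piecewise; each `~=` may have to grow or reallocate
/// the array, so this form can allocate more than once.
const(char)* makeEnvEntryStepwise(string var, string val)
{
    string s = var;
    s ~= '=';
    s ~= val;
    s ~= '\0';
    return s.ptr;
}

unittest
{
    assert(makeEnvEntry("FOO", "BAR")[0 .. 8] == "FOO=BAR\0");
    assert(makeEnvEntryStepwise("FOO", "BAR")[0 .. 8] == "FOO=BAR\0");
}
```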
Apr 12 2013
On Friday, 12 April 2013 at 12:43:57 UTC, Vladimir Panteleev wrote:
> On Friday, 12 April 2013 at 12:30:09 UTC, Lars T. Kyllingstad wrote:
>> Finally, an example from the new std.process which got some heavy criticism in the other thread:
>>
>>     envz[pos++] = (var~'='~val~'\0').ptr;
>>
>> I have been operating under the assumption that the compiler is smart enough to make the above a single allocation. If it isn't, I would consider it a compiler issue.
>
> Multiple chained array concatenations are performed at once, using the _d_arraycatnT function in Druntime.

Good to know.

Lars
Apr 12 2013
On 12 April 2013 22:30, Lars T. Kyllingstad <public kyllingen.net> wrote:
> On Friday, 12 April 2013 at 11:37:14 UTC, Regan Heath wrote:
>> I've moved this to another thread to allay complaints.
>
> Thanks!
>
> I completely agree that if code can be made more performant without a significant increase in complexity, then we should do so. While it is mostly (but not entirely) irrelevant in the context of std.process, it is a problem that should be tackled in Phobos as a whole. Several things could/should be done:
>
> It would be nice to have some sugar on top of alloca(), the use of which is usually considered bad practice. Someone (bearophile?) once suggested static arrays whose length is determined at runtime, which would be a great addition to the language:
>
>     void foo(int n) { int[n] myArr; ... }

That's beautiful!

> Furthermore, we need to get the allocator design in place. In SciD, I use David Simcha's region allocator to allocate temporary workspace, and it works really well. The only times I use 'new' is when I need persistent memory (e.g. for a return value) and the user-supplied buffer is too small. Phobos would greatly benefit from doing this too.
>
> Finally, an example from the new std.process which got some heavy criticism in the other thread:
>
>     envz[pos++] = (var~'='~val~'\0').ptr;
>
> I have been operating under the assumption that the compiler is smart enough to make the above a single allocation. If it isn't, I would consider it a compiler issue.

Does it? I've not seen the compiler do that, although I'd like to think it should be possible. 1 allocation is better than 3 I guess; however, I wonder if that code could be restructured to use the stack as well.

alloca() is really underrated! I can't imagine why people don't like it. Perhaps some more helpers built around it might encourage its use? It does feel a little bit 'raw', like malloc(). It implies some annoying casts.
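For reference, raw alloca() use in D looks roughly like the sketch below, including the cast Manu mentions. `stackEntryLength` is a hypothetical function written for this illustration; the memory it obtains is only valid until the function returns, which is why it reports a length instead of returning the slice.

```d
import core.stdc.stdlib : alloca;
import core.stdc.string : strlen;

/// Hypothetical helper: assembles "var=val\0" in stack memory and returns
/// the C-string length. The slice must never escape this function.
size_t stackEntryLength(const(char)[] var, const(char)[] val)
{
    immutable n = var.length + val.length + 2;   // room for '=' and '\0'

    // alloca() hands back a void*, so a cast and a slice are needed
    // before the memory can be used as a char[].
    char[] entry = (cast(char*) alloca(n))[0 .. n];

    entry[0 .. var.length] = var[];
    entry[var.length] = '=';
    entry[var.length + 1 .. n - 1] = val[];
    entry[n - 1] = '\0';

    // A real caller would pass entry.ptr to an OS call at this point.
    return strlen(entry.ptr);
}

unittest
{
    assert(stackEntryLength("FOO", "BAR") == 7);
}
```

A helper that wraps the cast and the slicing is the kind of "sugar" being asked for; the lifetime restriction would remain either way.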
Apr 12 2013
On Friday, 12 April 2013 at 11:37:14 UTC, Regan Heath wrote:
> The initial point was a vague one, not a specific one. Manu wasn't attempting to block std.process, he had a general concern which I share.

OK, but so far my interpretation and replies were mostly in the context of std.process - this module being an example where performance improvements would have a very small real-life benefit. I agree that (generally speaking) improving the performance of the code in std.algorithm/array/range would be worth the effort and complexity.

> It very much matters *who* that 1 user is. And, the count may be higher, and we might never "hear" from these people as they find other solutions. We're lucky that some people who try D and have issues tell us about them, they may be 5% of the total for all we know.

The same applies to the other side of the argument. A buggy standard library probably leaves a worse impression than a slow standard library...

> In reality the suggested improvements would add only very minor complexity and prevent none of the current crop of contributors from working with/on std.process.

Well, how do you qualify the amount of optimization that is appropriate?

For example, the code in std.process would be even faster if it was completely written in assembler. I hope we'll agree that in practice, this would be absurd. Now, what set of well-defined arguments would conclude that rewriting it in assembler is pointless, but optimizing memory allocations is not? All three versions of std.process would perform equally well as far as the end-user can perceive.

> Yes, as well as the users of their applications. True, none of them will even realise they could have been less happy, so none of them will realise the effort that went into it, but all of them will be better off.

Absolutely - if you ignore the costs. 100%-correct faster code is always better than 100%-correct slower code, but the costs are the counter-argument.

> Add the missing items, without a doubt - which is why no-one is suggesting blocking std.process over this issue.

Blocking is one thing, but asking for faster code where it doesn't really matter - when there are areas where D could be improved at much higher gain per effort - is another.

>> Why?
>
> There exist platforms and environments where memory and performance are concerns, if the D standard library code is not "careful" in it's use of both then it will be less suitable than C (for example) and so D will not penetrate those platforms.

OK, but once again - how does that line up with the purpose of std.process? I can see how std.algorithm can be useful in low-spec embedded/gaming systems, but std.process?

> Manu is using D for games development on modern high-end gaming PCs and he is still concerned with memory and performance.

In Manu's case, every bit of performance counts in the code that runs in tight loops, e.g. for every game frame. However, does that include std.process?

> All true, but performance is one of D's top draw cards:
>
> <quote>The D programming language. Modern convenience. Modeling power. Native **efficiency**.</quote>
>
> (**emphasis mine**)
>
> So, it behoves us to make sure the standard library keeps that in mind.

Again, I don't (generally) disagree for the general case; however, I think it pays to mind the context and perspective. When the context is std.process and the perspective is the relative cost of process creation, it seems like quite a pointless argument.
Apr 12 2013
On 12 April 2013 23:08, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
> On Friday, 12 April 2013 at 11:37:14 UTC, Regan Heath wrote:
>> It very much matters *who* that 1 user is. And, the count may be higher, and we might never "hear" from these people as they find other solutions. We're lucky that some people who try D and have issues tell us about them, they may be 5% of the total for all we know.
>
> The same applies to the other side of the argument. A buggy standard library probably leaves a worse impression than a slow standard library...

If allocating a string on the stack makes it buggy, then there is something really wrong. It should be no less convenient if appropriate helpers are available.

With consideration to the string[string] argument, surely instances like that can be reconsidered? How is string[] going to produce more bugs than string[string]?

You're being paranoid, or sensationalising the effect of simple optimisation.

>> In reality the suggested improvements would add only very minor complexity and prevent none of the current crop of contributors from working with/on std.process.
>
> Well, how do you qualify the amount of optimization that is appropriate?

As much as is convenient without causing you to start obscuring your code? That's my personal rule. But I make it a habit to consider efficiency when designing code, I never retrofit it. I tend to choose designs that are both simple and efficient at the start.

> For example, the code in std.process would be even faster, if it was completely written in assembler. I hope we'll agree than in practice, this would be absurd. Now, what set of well-defined arguments would conclude that rewriting it in assembler is pointless, but optimizing memory allocations is not? All three versions of std.process would perform as well as far as the end-user can perceive.

Actually, it would probably be slower if hand-written in assembler. And again, speed is not my concern here, it's the inconsiderate allocation policy.

>> Yes, as well as the users of their applications. True, none of them will even realise they could have been less happy, so none of them will realise the effort that went into it, but all of them will be better off.
>
> Absolutely - if you ignore the costs. 100%-correct faster code is always better than 100%-correct slower code, but the costs are the counter-argument.

Can you describe the 'costs'?

>> Add the missing items, without a doubt - which is why no-one is suggesting blocking std.process over this issue.
>
> Blocking is one thing, but asking for faster code where it doesn't really matter - when there are areas where D could be improved at much higher gain per effort - is another.

I'm asking for code that doesn't needlessly allocate, as a policy/habit in phobos.

>>> Why?
>>
>> There exist platforms and environments where memory and performance are concerns, if the D standard library code is not "careful" in it's use of both then it will be less suitable than C (for example) and so D will not penetrate those platforms.
>
> OK, but once again - how does that line up with the purpose of std.process? I can see how std.algorithm can be useful in low-spec embedded/gaming systems, but std.process?

I'm interested in eliminating allocations. It's just another function that can't be called in a no-gc area. If it used the stack for its temporaries, no problem.

>> Manu is using D for games development on modern high-end gaming PCs and he is still concerned with memory and performance.
>
> In Manu's case, every bit of performance counts in the code that runs in tight loops, e.g. for every game frame. However, does that include std.process?

It was the first module that appeared for consideration since the recent discussions about irresponsible GC usage. The argument applies to everything considered for acceptance into phobos. I'd like to see it applied as a systematic consideration in the future, irrespective of the module being considered.

Avoiding allocation for temporaries shouldn't be hard; if some tools are missing, then that is something that needs further discussion I guess.

>> All true, but performance is one of D's top draw cards:
>>
>> <quote>The D programming language. Modern convenience. Modeling power. Native **efficiency**.</quote>
>>
>> (**emphasis mine**)
>>
>> So, it behoves us to make sure the standard library keeps that in mind.
>
> Again, I don't (generally) disagree for the general case, however I think it pays to mind the context and perspective. When the context is std.process and the perspective is the relative cost of process creation, it seems like quite a pointless argument.
Apr 12 2013
On Friday, 12 April 2013 at 13:39:38 UTC, Manu wrote:
> If allocating a string on the stack makes it buggy, then there is something really wrong. It should be no less convenient if appropriate helpers are available.

Please see my reply to your other post.

> With consideration to the string[string] argument, surely instances like that can be reconsidered? How is string[] going to produce more bugs than string[string]?

    env ~= "FOO=BAR";

This will probably not do what you want if there was already a line starting with "FOO=" in env.

An array of strings is a less direct representation of the environment than a string map. Certain common operations, such as finding the value of a variable, or setting / overwriting a variable, become more difficult.

> You're being paranoid, or sensationalising the effect of simple optimisation.

Strong words...

> And again, speed is not my concern here, it's inconsiderate the allocation policy.

> I'm interested in eliminating allocations. It's just another function that can't be called in a no-gc area. If it used the stack for its temporaries, no problem.

Why allocations, specifically, if not for the performance costs of allocation and garbage collection?

> Can you describe the 'costs'?

See my previous posts in today's discussions.

> As much is convenient without causing you to start obscuring your code? That's my personal rule. But I make it a habit to consider efficiency when designing code, I never retrofit it. I tend to choose designs that are both simple and efficient at the start.

OK, so if I understand you correctly: you would like Phobos to adopt a policy of avoiding heap allocations whenever possible, and this argument applies to std.process not because doing so would result in a tangible improvement of its performance or other metric, but for the purpose of being consistent across Phobos.

Assuming that the language can provide or allow implementing suitably safe abstractions for doing so without complicating the code much, I think that's a goal worth looking forward to, and we have been doing so for some time (hence the pending allocator design).
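The difference between the two representations can be shown in a few lines. This is only an illustration of the point about `env ~= "FOO=BAR"`; `countDefinitions` is a hypothetical helper introduced here, not something from std.process.

```d
import std.algorithm.searching : count, startsWith;

/// Hypothetical helper: how many "name=..." entries does the list contain?
size_t countDefinitions(string[] env, string name)
{
    return env.count!(e => e.startsWith(name ~ "="));
}

unittest
{
    // Array form: appending leaves the earlier definition in place,
    // so FOO is now defined twice.
    string[] envList = ["FOO=OLD", "HOME=/root"];
    envList ~= "FOO=BAR";
    assert(countDefinitions(envList, "FOO") == 2);

    // Map form: assignment overwrites, and lookup is a single index.
    string[string] envMap = ["FOO": "OLD", "HOME": "/root"];
    envMap["FOO"] = "BAR";
    assert(envMap.length == 2);
    assert(envMap["FOO"] == "BAR");
}
```

With the array, "set or overwrite" needs a search first; with the associative array it is the default behaviour.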
Apr 12 2013
On 13 April 2013 00:19, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
> On Friday, 12 April 2013 at 13:39:38 UTC, Manu wrote:
>> If allocating a string on the stack makes it buggy, then there is something really wrong. It should be no less convenient if appropriate helpers are available.
>
> Please see my reply to your other post.
>
>> With consideration to the string[string] argument, surely instances like that can be reconsidered? How is string[] going to produce more bugs than string[string]?
>
>     env ~= "FOO=BAR";
>
> This will probably not do what you want if there was already a line starting with "FOO=" in env.
>
> An array of strings is a less direct representation of the environment than a string map. Certain common operations, such as finding the value of a variable, or setting / overwriting a variable, become more difficult.

I didn't see any attempt to index the array by key in this case. That's what an AA is for, and it's not being used here, so it's not a job for an AA.

I wouldn't use env ~= "FOO=BAR"; I would use env ~= EnvVar("FOO", "BAR"); or whatever key/value pair structure you like.

>> You're being paranoid, or sensationalising the effect of simple optimisation.
>
> Strong words...

Well it seemed appropriate. I can't understand what's so wildly complex that it would make code utterly unmaintainable, and error prone.

>> And again, speed is not my concern here, it's inconsiderate the allocation policy.
>
>> I'm interested in eliminating allocations. It's just another function that can't be called in a no-gc area. If it used the stack for its temporaries, no problem.
>
> Why allocations, specifically, if not for the performance costs of allocation and garbage collection?

That's one aspect, but it's also about having control over the allocation patterns of your program in general. Lots of small allocations fragment the heap, and they also push the memory barrier.

I couldn't disable the GC and call into phobos functions for very long; micro-allocations of temporaries not being freed would quickly eat all the system memory.

Reducing allocations is always better where possible.

>> As much is convenient without causing you to start obscuring your code? That's my personal rule. But I make it a habit to consider efficiency when designing code, I never retrofit it. I tend to choose designs that are both simple and efficient at the start.
>
> OK, so if I understand you correctly: you would like Phobos to adopt a policy of avoiding heap allocations whenever possible, and this argument applies to std.process not because doing so would result in a tangible improvement of its performance or other metric, but for the purpose of being consistent across Phobos.
>
> Assuming that the language can provide or allow implementing suitably safe abstractions for doing so without complicating the code much, I think that's a goal worth looking forward, and we have been doing so for some time (hence the pending allocator design).

It starts as soon as the majority agree it's important enough to enforce.

Although I think a tool like: char[len] stackString; would be a super-useful tool to make this considerably more painless. Some C compilers support this.
Apr 12 2013
On Friday, 12 April 2013 at 14:58:10 UTC, Manu wrote:
> I didn't see any attempt to index the array by key in this case. That's what an AA is for, and it's not being used here, so it's not a job for an AA.

Sorry, not following. Are you suggesting to use a dynamic array for creating processes, but an associative array for examining the current process's environment?

> I wouldn't use env ~= "FOO=BAR"; I would use env ~= EnvVar("FOO", "BAR"); Or whatever key/value pair structure you like.

OK, but I don't see how it changes your argument. Also, you mentioned string[] earlier.

> Well it seemed appropriate. I can't understand what's so wildly complex that it would make code utterly unmaintainable, and error prone.

I did not say it would be _utterly_ unmaintainable or error prone. Just more so.

> It starts as soon as the majority agree it's important enough to enforce. Although I think a tool like: char[len] stackString; would be a super-useful tool to make this considerably more painless. Some C compilers support this.

I believe this is the feature: http://en.wikipedia.org/wiki/Variable-length_array

It is part of C99.

-------------------------------------------------------------

Earlier today, I wrote:
> Please rewrite some part of std.process with performance in mind, and post it here for review. This way, we can analyze the benefits and drawbacks based on a concrete example, instead of vapor and hot air.

I've tried doing this myself, for the bit of code you brought up (constructing environment variables). Here's what I ended up with:

-------------------------------------------------------------

    import std.array;

    private struct StaticAppender(size_t SIZE, T)
    {
        T[SIZE] buffer = void;
        Appender!(T[]) appender;
        alias appender this;
    }

    /// Returns a struct containing a fixed-size buffer and an
    /// appender. The appender will use the buffer (which has
    /// SIZE elements) until it runs out of space, at which
    /// point it will reallocate itself on the heap.
    auto staticAppender(size_t SIZE, T)()
    {
        StaticAppender!(SIZE, T) result;
        result.appender = appender(result.buffer[]);
        // appender() treats the slice it is given as existing content;
        // clear() resets the length so put() starts filling the buffer.
        result.appender.clear();
        // Note: the appender points into result.buffer, so this relies on
        // the result being constructed in place in the caller (NRVO).
        return result;
    }

    /// Allows allocating a T[] given a size.
    /// Contains a fixed-size buffer of T elements with SIZE
    /// length. When asked to allocate an array with
    /// length <= SIZE, the buffer is used instead of
    /// the heap.
    struct StaticArray(size_t SIZE, T)
    {
        T[SIZE] buffer = void;

        T[] get(size_t size)
        {
            if (size <= SIZE)
            {
                buffer[0..size] = T.init;
                return buffer[0..size];
            }
            else
                return new T[size];
        }
    }

    // --------------------------------------------------------

    void exec(const(char)*[] envz) { /+ ... +/ }

    void oldWay(string[string] environment)
    {
        auto envz = new const(char)*[environment.length + 1];
        int pos = 0;
        foreach (var, val; environment)
            envz[pos++] = (var~'='~val~'\0').ptr;

        exec(envz);
    }

    void newWay(string[string] environment)
    {
        auto buf = staticAppender!(4096, char)();
        StaticArray!(64, size_t) envpBuf;
        size_t[] envp = envpBuf.get(environment.length + 1);

        size_t pos;
        foreach (var, val; environment)
        {
            // Record an offset, not a pointer: the buffer may still move.
            envp[pos++] = buf.data.length;
            buf.put(var);
            buf.put('=');
            buf.put(val);
            buf.put('\0');
        }

        // Convert offsets to pointers in-place
        auto envz = cast(const(char)*[])envp;
        foreach (n; 0..pos)
            envz[n] += cast(size_t)buf.data.ptr;

        exec(envz);
    }

-------------------------------------------------------------

As you can see, the code is quite a bit more verbose, even with the helper types. It's no longer obvious at a glance what the code is doing. Perhaps you can come up with better abstractions?
Apr 12 2013
On 13 April 2013 01:29, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
> On Friday, 12 April 2013 at 14:58:10 UTC, Manu wrote:
>> I didn't see any attempt to index the array by key in this case. That's what an AA is for, and it's not being used here, so it's not a job for an AA.
>
> Sorry, not following. Are you suggesting to use a dynamic array for creating processes, but an associative array for examining the current process's environment?

I didn't see anywhere where it was possible to examine the current process's environment? I only saw the mechanism for feeding additional env vars to the system command.

Linear array of key/value pair struct would be fine.

>> I wouldn't use env ~= "FOO=BAR"; I would use env ~= EnvVar("FOO", "BAR"); Or whatever key/value pair structure you like.
>
> OK, but I don't see how it changes your argument. Also, you mentioned string[] earlier.

Sorry. I didn't think it through at the time.

>> It starts as soon as the majority agree it's important enough to enforce. Although I think a tool like: char[len] stackString; would be a super-useful tool to make this considerably more painless. Some C compilers support this.
>
> I believe this is the feature: http://en.wikipedia.org/wiki/Variable-length_array
>
> It is part of C99.
>
> -------------------------------------------------------------
>
> Earlier today, I wrote:
>> Please rewrite some part of std.process with performance in mind, and post it here for review. This way, we can analyze the benefits and drawbacks based on a concrete example, instead of vapor and hot air.
>
> I've tried doing this myself, for the bit of code you brought up (constructing environment variables). Here's what I ended up with:
>
> [...]
>
> As you can see, the code is quite more verbose, even with the helper types. It's no longer obvious at a glance what the code is doing. Perhaps you can come up with better abstractions?

Beautiful! Actually, I think if you look a couple of pages below, you'll see something rather like that already there in the windows code.

Though I'm not sure why you use a size_t array which you just cast to a char* array? I think you could also fold that into one pass, rather than 2. And I'm not sure about this line:

    envz[n] += cast(size_t)buf.data.ptr;

But those helpers make the problem rather painless. I wonder if there's opportunity for improvement by having appender support the ~ operator? Might be able to jig it to use natural concat syntax rather than put()...
Apr 12 2013
On Friday, 12 April 2013 at 16:04:09 UTC, Manu wrote:
> I didn't see anywhere where it was possible to examine the current process's environment? I only saw the mechanism for feeding additional env vars to the system command. Linear array of key/value pair struct would be fine.

There's the "environment" object. It generally acts like a string[string], and has a toAA() method that constructs a real string[string].

> Beautiful! Actually, I think if you look a couple of pages below, I think you'll see something rather like that already there in the windows code.

I see, it also uses appender, although appender's buffer will be on the heap, and there is no need for an envz array.

> Though not sure why you use a size_t array which you just cast to a char* array? I think you could also fold that into one pass, rather than 2. And I'm not sure about this line:
> envz[n] += cast(size_t)buf.data.ptr;

The problem is that the pointer to the data may change once appender reallocates the buffer when it reaches the current buffer's capacity. For this reason, we can't store pointers to the strings we append, since they can "move" around until the point that we're done appending. Storing offsets and converting them to pointers afterwards avoids this. I think this is the most common gotcha when writing / working with appenders, and it bit me once too.

As you can see, the code is not completely obvious ;)

> But those helpers make the problem rather painless. I wonder if there's opportunity for improvement by having appender support the ~ operator? Might be able to jig it to use natural concat syntax rather than put()...

Allowing put() to take multiple arguments would be an improvement as well - not just in usability, but in performance too, since it would only need to check for overflow once for all arguments.
I have this in my own appender: https://github.com/CyberShadow/ae/blob/master/utils/appender.d Rob Jacques was working on a Phobos appender replacement which also had this, I believe: http://d.puremagic.com/issues/show_bug.cgi?id=5813 Too bad nothing came out of the latter.
Apr 12 2013
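A minimal sketch of the "record offsets, convert to pointers at the end" idea from the post above. It uses only a plain std.array appender (no stack buffer), and the function name packEnv and its out parameter are made up for illustration; they are not part of std.process.

```d
import std.array : appender;

/// Hypothetical helper: packs "KEY=VALUE\0" entries into one buffer and
/// returns a null-terminated array of pointers to them.
/// `storage` receives the backing buffer and must outlive the result.
const(char)*[] packEnv(string[string] env, out const(char)[] storage)
{
    auto buf = appender!(char[])();
    size_t[] offsets;

    foreach (key, value; env)
    {
        // Remember where this entry starts. A pointer taken here could
        // dangle as soon as a later put() reallocates the buffer.
        offsets ~= buf.data.length;
        buf.put(key);
        buf.put('=');
        buf.put(value);
        buf.put('\0');
    }

    // Appending is finished, so the buffer no longer moves.
    storage = buf.data;

    // One extra slot, left as null, terminates the pointer array.
    auto envz = new const(char)*[offsets.length + 1];
    foreach (i, off; offsets)
        envz[i] = storage.ptr + off;
    return envz;
}
```

Unlike newWay above, this sketch keeps the offsets and the pointers in separate arrays, which costs one more allocation but avoids the in-place cast.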
> void oldWay(string[string] environment)
> {
>     auto envz = new const(char)*[environment.length + 1];
>     int pos = 0;
>     foreach (var, val; environment)
>         envz[pos++] = (var~'='~val~'\0').ptr;
>     exec(envz);
> }
>
> void newWay(string[string] environment)
> {
>     auto buf = staticAppender!(4096, char)();
>     StaticArray!(64, size_t) envpBuf;
>     size_t[] envp = envpBuf.get(environment.length + 1);
>
>     size_t pos;
>     foreach (var, val; environment)
>     {
>         envp[pos++] = buf.data.length;
>         buf.put(var);
>         buf.put('=');
>         buf.put(val);
>         buf.put('\0');
>     }
>
>     // Convert offsets to pointers in-place
>     auto envz = cast(const(char)*[])envp;
>     foreach (n; 0..pos)
>         envz[n] += cast(size_t)buf.data.ptr;
>
>     exec(envz);
> }

I just thought of something!

I don't comment here often, but I want to express my opinion:

Currently D, being a system language, offers full control over the allocation method, be it the stack or the heap. The helpers above show how flexible it is in doing custom optimized stuff if one wants to. But there's an obvious drawback, quoting Vladimir: "the code is quite more verbose, even with the helper types. It's no longer obvious at a glance what the code is doing."

But think for a moment - does the programmer usually need to choose the allocation method? Why is explicit the default? It could be the other way around:

void oldWay(string[string] environment)
{
    const(char)* envz[environment.length + 1]; // Compiler decides whether to use stack or heap.
    int pos = 0;
    foreach (var, val; environment)
        envz[pos++] = (var~'='~val~'\0').ptr; // Use appender. Use stack and switch to heap if necessary
    exec(envz);
}

How nifty would that be, don't you think?

Another benefit could be that the compiler could adjust the stack allocation limit per architecture, and probably it could be defined as a command line parameter, e.g. when targeting a low end device.

This principle could work in areas other than allocation, e.g. parameter passing by ref/value. The programmer only needs to specify whether he wants a copy or the actual value, and the compiler would decide the optimal way to pass it, e.g.:

immutable int n;
func(n); // by value
immutable int arr[30];
func(arr); // by ref

But these are probably suggestions for D3 or something... Too drastic :)
Apr 12 2013
After some Googling, I found out a similar technique already exists: auto_buffer in C++. http://goo.gl/3RLK6 I think it would be perfect to make it the default allocation method in D.
Apr 13 2013
> Someone (bearophile?) once suggested static arrays whose length is determined at runtime, which would be a great addition to the language:
>
> void foo(int n) { int[n] myArr; }

That would be great. Question about the type: it won't be int[n], as n is known at runtime, not compile time. Would it be int[] or some other type?

Pros of some other type: it makes possible some optimizations that benefit from the fact that it's stack allocated.
Cons: we need to be able to reuse existing algorithms that don't make such a distinction.

Maybe another type that would satisfy both type traits:
isDynamicArray!T
isAllocaArray!T

On Sat, Apr 13, 2013 at 4:19 AM, Mr. Anonymous <mailnew4ster gmail.com> wrote:
> After some Googling, I found out a similar technique already exists: auto_buffer in C++. http://goo.gl/3RLK6 I think it would be perfect to make it the default allocation method in D.
Apr 13 2013
On Friday, 12 April 2013 at 07:04:23 UTC, Manu wrote:
> string[string] is used in the main API to receive environment variables; perhaps kinda convenient, but it's impossible to supply environment variables without loads of allocations.

Environment variables are a mapping of strings to strings. The natural way to express such a mapping in D is with a string[string]. It shouldn't be necessary to allocate an AA literal, though.

> toStringz is used liberally; alternatively, alloca() could allocate the C strings on the stack and zero terminate them there, passing a pointer to the stack string to the OS functions.

It is kind of hard to use alloca() in a safe manner in D, because DMD will happily inline functions that use it. Once inlined, the allocation is only released when the enclosing function returns, so a call in a loop keeps growing the stack. The following program will overflow the stack if compiled with -inline:

import core.stdc.stdlib : alloca;

void doStuff()
{
    auto p = alloca(100);
}

void main()
{
    foreach (i; 0 .. 1_000_000)
        doStuff();
}

This is of course fixable, but until that happens, I would consider alloca() a no-go for Phobos.
Apr 12 2013
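One way to get most of the benefit of stack-allocated C strings without alloca() is a fixed-size buffer with a heap fallback. The following is only a sketch under that assumption: CStringBuf, set and ptr are hypothetical names, not anything in Phobos or in the proposed std.process.

```d
/// Hypothetical helper: keeps a zero-terminated copy of a D string,
/// using an in-struct buffer when it fits and the GC heap otherwise.
struct CStringBuf(size_t N = 256)
{
    private char[N] stack = void; // left uninitialised until set() is called
    private char[] heap;          // non-null only when the string did not fit

    /// Copies `s` and appends the terminating zero.
    void set(const(char)[] s)
    {
        if (s.length < N) // leave room for the terminator
        {
            heap = null;
            stack[0 .. s.length] = s[];
            stack[s.length] = '\0';
        }
        else
        {
            heap = new char[s.length + 1];
            heap[0 .. s.length] = s[];
            heap[s.length] = '\0';
        }
    }

    /// The pointer is computed on each call rather than stored, so it is
    /// still correct if the struct was copied or moved after set().
    /// It is only meaningful after set() has been called.
    const(char)* ptr() const
    {
        return heap is null ? stack.ptr : heap.ptr;
    }
}
```

Because the buffer is an ordinary member, its lifetime follows the variable that holds the struct, so inlining does not change how much stack is used per call. The cost is that N bytes are always reserved, even for short strings.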
On 12 April 2013 23:21, Lars T. Kyllingstad <public kyllingen.net> wrote:
> On Friday, 12 April 2013 at 07:04:23 UTC, Manu wrote:
>> string[string] is used in the main API to receive environment variables; perhaps kinda convenient, but it's impossible to supply environment variables without loads of allocations.
>
> Environment variables are a mapping of strings to strings. The natural way to express such a mapping in D is with a string[string]. It shouldn't be necessary to allocate an AA literal, though.

That's a good point, do AA's support literals that don't allocate? You can't even produce an array literal without it needlessly allocating.

>> toStringz is used liberally; alternatively, alloca() could allocate the C strings on the stack and zero terminate them there, passing a pointer to the stack string to the OS functions.
>
> It is kind of hard to use alloca() in a safe manner in D, because DMD will happily inline functions that use it. The following program will overflow the stack if compiled with -inline:
>
> void doStuff() { auto p = alloca(100); }
> void main() { foreach (i; 0 .. 1_000_000) doStuff(); }
>
> This is of course fixable, but until that happens, I would consider alloca() a no-go for Phobos.

Very good point. This is a problem. Hmmm...
Apr 12 2013
On Fri, 12 Apr 2013 09:42:43 -0400, Manu <turkeyman gmail.com> wrote:
> On 12 April 2013 23:21, Lars T. Kyllingstad <public kyllingen.net> wrote:
>> On Friday, 12 April 2013 at 07:04:23 UTC, Manu wrote:
>>> string[string] is used in the main API to receive environment variables; perhaps kinda convenient, but it's impossible to supply environment variables without loads of allocations.
>>
>> Environment variables are a mapping of strings to strings. The natural way to express such a mapping in D is with a string[string]. It shouldn't be necessary to allocate an AA literal, though.
>
> That's a good point, do AA's support literals that don't allocate? You can't even produce an array literal without it needlessly allocating.

No, because it would have to be COW (copy-on-write).

However, a string[string] literal that contains only string literals needs to allocate just the structure of the AA, not the strings themselves. The compiler might be able to optimize this by figuring out how much memory to allocate in order to contain the entire AA structure, then create one block that has all the data in it. It still needs a new block though; otherwise, what happens when you change an AA that was once a literal?

I think the best path forward is to replace the functions with a templated one that takes an indexable type as the env pointer. Then one can optimize as much as one desires.

-Steve
Apr 12 2013
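A rough sketch of what a templated environment parameter could look like. This is an assumption about the shape of such an API, not the design that was adopted: buildEnvBlock and PairList are hypothetical names, and the sketch iterates key/value pairs with foreach rather than indexing by key, since that lets the same code accept both a string[string] and a flat list of pairs.

```d
import std.array : appender;

/// Hypothetical: builds a "K=V\0K=V\0\0" block from any type that can be
/// iterated with foreach (key, value; env), e.g. an AA or a custom struct.
char[] buildEnvBlock(Env)(Env env)
{
    auto buf = appender!(char[])();
    foreach (key, value; env)
    {
        buf.put(key);
        buf.put('=');
        buf.put(value);
        buf.put('\0');
    }
    buf.put('\0'); // an empty entry terminates the block
    return buf.data;
}

/// A non-AA environment: a flat list of [key, value] pairs, along the
/// lines of the "linear array of key/value pairs" suggested in the thread.
struct PairList
{
    string[2][] pairs;

    /// Lets foreach (key, value; list) visit the pairs in order.
    int opApply(scope int delegate(string key, string value) dg)
    {
        foreach (p; pairs)
            if (auto r = dg(p[0], p[1]))
                return r;
        return 0;
    }
}
```

With this shape, a caller who already has the variables in a static table or a stack buffer can pass them without first building an associative array, while string[string] callers are unaffected.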