digitalmars.D - Feature suggestion: in-place append to array
- KF (11/11) Mar 04 2010 I hope this is the right place to post it.
- Steven Schveighoffer (10/16) Mar 04 2010 No, it does not create a copy. a = a ~ e always creates a copy, a ~= e ...
- grauzone (23/42) Mar 04 2010 Some sort of "resetAndReuse" function to clear an array, but enabling to...
- Steven Schveighoffer (25/47) Mar 04 2010 proposed usage (as checked in a couple days ago):
- Clemens (6/8) Mar 04 2010 I would prefer the name reserve(). It has precedent in the STL, and the ...
- Steven Schveighoffer (12/23) Mar 04 2010 You are correct, setCapacity ensures that *at least* the given number of...
- Mike S (22/39) Mar 31 2010 Sorry if resurrecting this thread is against netiquette, but it caught
- bearophile (4/5) Mar 31 2010 It's a nice read. I don't see them switching to D soon. If you whisper t...
- Mike S (45/52) Mar 31 2010 Hah...well, there's a reason I'm still just looking into D rather than
- bearophile (28/35) Mar 31 2010 I am not able to tell the future. Some parts of D design are already old...
- Mike S (80/117) Mar 31 2010 Yeah, you're right about the demanding tool and maturity requirements
- bearophile (10/17) Apr 01 2010 People will try to use D2 for this purpose too, for example to teaching ...
- Mike S (13/17) Apr 01 2010 I figured the eager ones wouldn't be a problem, but I wondered whether
- Andrei Alexandrescu (21/47) Apr 01 2010 The book is finished and is on schedule. It's been out of my hands for a...
- bearophile (39/42) Apr 01 2010 Walter doesn't want to change octals, so I think it's a waste of time to...
- bearophile (6/12) Apr 01 2010 From what I've seen, D3 will probably be backwards compatible with D2.
- Steven Schveighoffer (27/60) Mar 31 2010 What do you mean by nondeterministic? It's very deterministic, just not...
- Mike S (32/62) Mar 31 2010 Steven Schveighoffer wrote:
- Steven Schveighoffer (33/90) Apr 01 2010 Its abstracted to the GC, but the current GC is well defined. If you
- Mike S (24/59) Apr 01 2010 With respect to the current GC, yup. :) Sticking to powers of 2 (or
- Steven Schveighoffer (9/14) Apr 01 2010 As an aside, because the GC is implemented all in runtime, you can swap ...
- grauzone (10/50) Mar 04 2010 What shrinkToFit() does is not really clear. Does it reallocate the
- Steven Schveighoffer (23/71) Mar 04 2010 Sorry, should have added:
- grauzone (11/93) Mar 04 2010 Doesn't it conform to Andrei's ideas about memory safety? It can stomp
- Steven Schveighoffer (22/32) Mar 04 2010 I think the idea is that anything that *could* result in undefined
I hope this is the right place to post it. In my work, I often need to add elements at the end of dynamic arrays and remove them from the end. These incremental changes would most conveniently be performed by a ~= e for appending e to a, and say a = a[0..$-1] for removal. Unfortunately, ~= always creates a copy and is thus too time consuming. What I would like to suggest is a new operator, ~~=, an in-place append that does not create a copy if possible. Alternatively, new properties/methods could be defined:
- a.push(element or array) - add elements at the end of the array, without copying if possible
- a.pop(x) - remove x elements from the end of the array
- a.capacity - get or set the actual length allocated for the array
Currently, I am using appender(a).put from the std.array module, but it is very obscure and makes the code hard to understand. Cheers, KF
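[For reference, a minimal sketch of the appender workaround KF mentions, assuming the std.array API of this era, where the accumulated array was read back through the .data property:]

    import std.array : appender;

    void main()
    {
        auto app = appender!(int[])();
        foreach (i; 0 .. 1_000)
            app.put(i);        // amortized O(1) appends, no copy per element
        int[] a = app.data;    // hand the result back as an ordinary array
    }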
Mar 04 2010
On Thu, 04 Mar 2010 04:02:46 -0500, KF <kfleszar gmail.com> wrote:

I hope this is the right place to post it. In my work, I often need to add elements at the end of dynamic arrays and remove them from the end. These incremental changes would most conveniently be performed by a ~= e for appending e to a, and say a = a[0..$-1] for removal. Unfortunately, ~= always creates a copy and is thus too time consuming.

No, it does not create a copy. a = a ~ e always creates a copy; a ~= e appends in place if possible. This will still be slower than appender, but in the next release of DMD it will be much faster than in the current release. With the next release of DMD, a = a[0..$-1] will shrink the array, but the next append to a will reallocate. There will be a function to get around this, but use it only if you know that the data removed from the end isn't referenced anywhere else. -Steve
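[A small sketch of the distinction Steve draws; the exact points where ~= reallocates depend on the runtime, but concatenation with ~ is specified to allocate a fresh array:]

    void main()
    {
        int[] a;
        foreach (i; 0 .. 1_000)
            a ~= i;            // appends in place while the block has room

        int[] b = a ~ 42;      // a ~ e always allocates a new array
        assert(b.ptr !is a.ptr);
    }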
Mar 04 2010
Steven Schveighoffer wrote:

On Thu, 04 Mar 2010 04:02:46 -0500, KF <kfleszar gmail.com> wrote: I hope this is the right place to post it. In my work, I often need to add elements at the end of dynamic arrays and remove them from the end. These incremental changes would most conveniently be performed by a ~= e for appending e to a, and say a = a[0..$-1] for removal. Unfortunately, ~= always creates a copy and is thus too time consuming.

No, it does not create a copy. a = a ~ e always creates a copy; a ~= e appends in place if possible. This will still be slower than appender, but in the next release of DMD it will be much faster than in the current release. With the next release of DMD, a = a[0..$-1] will shrink the array, but the next append to a will reallocate. There will be a function to get around this, but use it only if you know that the data removed from the end isn't referenced anywhere else. -Steve

Some sort of "resetAndReuse" function to clear an array, but enabling reuse of the old memory, would be nice:

    int[] a = data;
    a = null;
    a ~= 1;           // reallocates (of course)
    a.length = 0;
    a ~= 1;           // will reallocate (for safety), used not to reallocate
    resetAndReuse(a);
    assert(a.length == 0);
    a ~= 1;           // doesn't reallocate

This can be implemented by setting both the slice and the internal runtime length fields to 0. Additionally, another function is necessary to replace the old preallocation trick:

    // preallocate 1000 elements, but don't change the actual slice length
    auto len = a.length;
    a.length = len + 1000;
    a.length = len;

As I understood it, this won't work anymore after the change. This can be implemented by enlarging the array's memory block without touching any length fields. I'm sure the function you had in mind does one of those things or both.
Mar 04 2010
On Thu, 04 Mar 2010 11:43:27 -0500, grauzone <none example.net> wrote:

Some sort of "resetAndReuse" function to clear an array, but enabling reuse of the old memory, would be nice:

    int[] a = data;
    a = null;
    a ~= 1;           // reallocates (of course)
    a.length = 0;
    a ~= 1;           // will reallocate (for safety), used not to reallocate
    resetAndReuse(a);
    assert(a.length == 0);
    a ~= 1;           // doesn't reallocate

This can be implemented by setting both the slice and the internal runtime length fields to 0. Additionally, another function is necessary to replace the old preallocation trick:

    // preallocate 1000 elements, but don't change the actual slice length
    auto len = a.length;
    a.length = len + 1000;
    a.length = len;

As I understood it, this won't work anymore after the change. This can be implemented by enlarging the array's memory block without touching any length fields. I'm sure the function you had in mind does one of those things or both.

proposed usage (as checked in a couple days ago):

    int[] a;
    a.setCapacity(10000); // pre-allocate at least 10000 elements
    foreach(i; 0..10000)
        a ~= i;           // no reallocation
    a.length = 100;
    a.shrinkToFit();      // resize "allocated" length to 100 elements
    a ~= 5;               // no reallocation

I will probably add a function to do the preallocation of a new array without having to write two statements. Also, one cool thing about this that was not available before is that increasing the capacity will not initialize the new data as long as the "no pointers" flag is set. For pointer-containing elements, the contents must be initialized to zero to prevent false positives during collection. Note that setCapacity is supposed to be a property (i.e. a.capacity = 10000), but the compiler doesn't support this; I added a bug (feature) request for it (3857). There is also a readable capacity property to get the number of elements the array could grow to without reallocation, a feature that may be useful for tuning. Also, the name "shrinkToFit" may not be final :) I wanted to call it minimize, but there were objections, and shrinkToFit is easy to search/replace later. -Steve
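[For readers coming to this thread later: these operations eventually shipped in druntime under different names. A sketch with the released spellings (reserve, the read-only capacity property, and assumeSafeAppend, the final name for what is called shrinkToFit here):]

    void main()
    {
        int[] a;
        a.reserve(10_000);             // pre-allocate room for at least 10_000 ints
        assert(a.capacity >= 10_000);  // read-only query of the reserved space
        foreach (i; 0 .. 10_000)
            a ~= i;                    // no reallocation
        a.length = 100;
        a.assumeSafeAppend();          // mark a's end as the block's used end
        a ~= 5;                        // no reallocation
        assert(a.length == 101);
    }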
Mar 04 2010
Steven Schveighoffer Wrote:

    int[] a;
    a.setCapacity(10000); // pre-allocate at least 10000 elements

I would prefer the name reserve(). It has precedent in the STL, and the method doesn't actually always set the capacity, only if the current capacity is less than the argument. Even then it may conceivably allocate more than was asked for. In other words, the name setCapacity suggests this will always succeed:

    a.setCapacity(n);
    assert(a.capacity == n);

...which is not the case as far as I can tell.
Mar 04 2010
On Thu, 04 Mar 2010 12:29:25 -0500, Clemens <eriatarka84 gmail.com> wrote:

Steven Schveighoffer Wrote:

    int[] a;
    a.setCapacity(10000); // pre-allocate at least 10000 elements

I would prefer the name reserve(). It has precedent in the STL, and the method doesn't actually always set the capacity, only if the current capacity is less than the argument. Even then it may conceivably allocate more than was asked for. In other words, the name setCapacity suggests this will always succeed:

    a.setCapacity(n);
    assert(a.capacity == n);

...which is not the case as far as I can tell.

You are correct, setCapacity ensures that *at least* the given number of elements will be available for appending. I planned on making the function a property (but a bug would not allow that); the original intended usage was:

    a.capacity = 10000;

Reserve doesn't work in this context. Can you come up with a name that does? I'll bring up reserve (as a function) as an alternative on the phobos mailing list and see what people say. I kind of liked the setter/getter idea, but you make a good point. -Steve
Mar 04 2010
Steven Schveighoffer wrote:

You are correct, setCapacity ensures that *at least* the given number of elements will be available for appending. I planned on making the function a property (but a bug would not allow that); the original intended usage was: a.capacity = 10000; Reserve doesn't work in this context. Can you come up with a name that does? I'll bring up reserve (as a function) as an alternative on the phobos mailing list and see what people say. I kind of liked the setter/getter idea, but you make a good point. -Steve

Sorry if resurrecting this thread is against netiquette, but it caught my eye, and this is my first newsgroup post in years. ;) Anyway, is there any compelling reason why setCapacity or modifying a.capacity should allocate a nondeterministic amount of storage? Depending on the application, programmers might require strict control over memory allocation patterns and strict accounting for allocated memory. Game programmers, especially console game programmers, tend to strongly prefer deterministic allocation patterns, and nondeterminism is one of the [several] common complaints about the C++ STL (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html is a good resource on this kind of issue). In the case of D (which I'm considering learning), this is especially important for dynamic arrays, partly because they're so useful by themselves, and partly because they may form the backbone of custom containers. Whereas it's easy to add "smart nondeterministic" behavior to a deterministic setCapacity function by providing a wrapper, ordinary language users can't do the opposite. Because of this, and because dynamic arrays are so central to the D language, a nondeterministic setCapacity function may deter game programmers, especially console programmers, from adopting D. Assuming you see this post, what are your thoughts here?
Mar 31 2010
Mike S:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html

It's a nice read. I don't see them switching to D soon. If you whisper to them that D is based on a GC, they will run away screaming :-) Bye, bearophile
Mar 31 2010
bearophile wrote:

Mike S: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html

It's a nice read. I don't see them switching to D soon. If you whisper to them that D is based on a GC, they will run away screaming :-) Bye, bearophile

Hah...well, there's a reason I'm still just looking into D rather than diving in headfirst! :p Actually though, I do believe the needs of game programmers should be taken seriously while considering D's evolution: Right now, D hasn't completely found its niche, but it seems to position itself as a sane successor to C++ for systems-level programming. As it stands, I believe there are only two major kinds of programmers who still use C++, and those are game programmers and masochists. ;) Odds are, D won't be replacing C anytime soon for operating system kernels and such. It's too low-level for scripting tasks, and most website designers and non-real-time applications programmers use higher-level languages and are unlikely to go back. I think D will eventually be used for writing other heavy-duty non-OS frameworks and software systems, but if it's really going to become the successor to C++, it's going to have to absorb most of C++'s user base...and that includes game programmers.

You're right that the garbage collector is a major issue - probably the biggest one inherent to the language design - but I haven't determined it's a dealbreaker, at least not yet. After all, D also allows manual memory management, and freeing memory early apparently helps speed things up anyway (http://stackoverflow.com/questions/472133/turning-off-the-d-garbage-collector). That part helps ensure control over how much memory is used/available, and the only other issue with the garbage collector is reconciling its running time with the soft real-time constraint that games have to satisfy. I can think of a few tactics which should help here:

1.) In addition to permitting better reasoning about memory allocations, freeing most memory manually should reduce the load on the garbage collector and reduce its runtime, right?

2.) On the simulation side of the game engine, I believe a constant timestep promotes a more robust design, and that means some frames (relatively idle ones) will have plenty of CPU time left over. If you can explicitly call the garbage collector to make it run during those times instead of at nondeterministic times (can you?), you can maintain a smooth framerate without any GC-induced spikes.

3.) Can user code execute simultaneously with the GC in other threads (on other cores), or does the GC halt the entire program for safety reasons? Assuming simultaneous threaded execution is permitted, it would also dramatically reduce the GC's impact on multi-core systems.

Assuming these strategies work, the garbage collector by itself shouldn't be a showstopper. In the case of dynamic arrays, resizing capacity deterministically is one of those small things that would be really helpful to anal game programmers, and it probably wouldn't hurt anyone else, either. Plus, it's easier to implement than "smart nondeterministic" resizing anyway. :)
Mar 31 2010
Mike S:

the needs of game programmers should be taken seriously while considering D's evolution:<

The short D1 history shows that designers of small games are willing to use D. Some game designers seem almost desperate to find a usable language simpler than C++. So I agree with you that D2 can be designed keeping an eye on game designers too. But those are very demanding people; it's not easy to satisfy them even with a mature language + compiler + std lib + dev tools. And currently nothing in D2 is mature. For them maybe not even the most mature thing you can find in the D world, the back-end of ldc (llvm), is mature enough :-)

Right now, D hasn't completely found its niche, but it seems to position itself as a sane successor to C++ for systems-level programming.<

I am not able to tell the future. Some parts of D design are already old-style:
- Some early design decisions make it hard to inline D virtual functions (so if you write D code in Java style, you see a significant slowdown compared to similar Java code running with HotSpot). So far no one seems to care about this; we'll see if I am right to see a problem here;
- Some of D's unsafe characteristics are being worked on to improve their safety, but there's a lot of road to travel still; for example, null-safety and integer-overflow-safety are far away still. People are still trying to explain to Walter why null-safety has some importance.
- D2 defaults to mutables. This can be acceptable, I don't know;
- Currently D2 is not designed from the start to work with an IDE (but I think this problem can be fixed with not too much work);
- The built-in unit testing and documentation are not fit for professional usage (but the documentation is easy to extend because it's just comments, so it's a smaller problem).
- etc.

A system language is something that you can use to write very small binaries, that can be used to write a kernel like Linux, device drivers for a smaller computer+CPU, etc. Such things are hard to do in D2; I don't see Linus using D2 to write his kernel, and he even thinks C++ is unfit. So I see D2 more like a "low-level application language", on a level located somewhere between C and

As it stands, I believe there are only two major kinds of programmers who still use C++, and those are game programmers and masochists. ;)<

There's also an army of legacy programmers that have to update and debug tons of C++ code. Part of human society works thanks to a mountain of C++ code. Online programming competitions are usually won by C++ code. People will find ways to use C++ for many more years; it will probably outlast us all.

It's too low-level for scripting tasks,<

I have asked several times to have Python-style array/lazy comprehensions in D :-) They help. I think their introduction can reduce the length of D2 programs by 10-30%.

I think D will eventually be used for writing other heavy-duty non-OS frameworks and software systems,<

From what I've seen so far I think D2 will appeal to some numerics folks too, so it can eat a bit of the Fortran pie too. Some improvements can make D2 more appealing to them; Don is working on this too. (Some ideas from the Chapel language can help here, but I think no one here has taken a serious look at it so far.)

You're right that the garbage collector is a major issue - probably the biggest one inherent to the language design - but I haven't determined it's a dealbreaker, at least not yet.<

The situation with the D GC is interesting. First of all, the D GC is not refined; Java VM GCs are way more advanced.
So D will need a much more modern GC. Another problem is that the current D GC is quite imprecise; this causes leaks when you use it in real programs that have to run for more than a few minutes. Part of this problem can be solved using a better GC that's more precise (this can slow it down a bit, but avoids a good amount of memory leaks). The other problem is intrinsic to the language, which makes it hard or impossible to invent a fully precise GC for D. And D makes it hard to use a modern generational moving GC. You can't just adopt a JavaVM GC in D. Even the Mono GC (which knows the notion of pinned/unpinned memory) can be unfit (because it's designed for mostly unpinned memory). This is partially caused by D being a low-level language with pointers, and it's partially caused by the D2 type system being unable to tell apart:
1) hand-managed pointers, to GC memory or C heap memory;
2) GC-managed pointers to pinned memory;
3) GC-managed pointers to unpinned memory.
I think Walter thinks that telling them apart in the language makes D too complex, and he may be right. But the current situation makes it hard to design a very efficient GC for D. So I don't think high-performance game designers will use the D GC for the next few years; they will manage most or all of the memory manually. I am ignorant, but I think D designers have to work a little harder in finding ways to allocate unpinned objects. This (with a refined GC able to move unpinned memory, that keeps a stack-like Eden plus two or three generations of objects) can help a lot for programs written in Java style. But computer science history has shown that if enough people work on a problem they can often find some partial solution. At the beginning Java was very slow. So there's a bit of hope still. Of course enough GC experts will work on the D GC only if D has some success.

In the case of dynamic arrays, resizing capacity deterministically is one of those small things that would be really helpful to anal game programmers, and it probably wouldn't hurt anyone else, either. Plus, it's easier to implement than "smart nondeterministic" resizing anyway. :)<

Don't nail your mind only on that problem; that's only one of many problems. You can think that dynamic arrays are simple, but from what I've seen there's nothing simple in D dynamic arrays. People have found a way to improve them a little only now, after years of discussions about them, and I am not fully sure yet that the recent changes are worth it; I mean, I am not sure yet that the current D2 arrays are better than the older ones + an appender helper struct. There is no good benchmarking suite yet to test if they are an improvement. Bye, bearophile
Mar 31 2010
bearophile wrote:

The short D1 history shows that designers of small games are willing to use D. Some game designers seem almost desperate to find a usable language simpler than C++. So I agree with you that D2 can be designed keeping an eye on game designers too. But those are very demanding people; it's not easy to satisfy them even with a mature language + compiler + std lib + dev tools. And currently nothing in D2 is mature. For them maybe not even the most mature thing you can find in the D world, the back-end of ldc (llvm), is mature enough :-)

Yeah, you're right about the demanding tool and maturity requirements that game studios have, but assuming people continue working on D and other people adopt it and enhance the tools, those things will flesh out over time. I'm young enough that I look forward to seeing it overtake C++ in the game world someday.

I am not able to tell the future. Some parts of D design are already old-style: - Some early design decisions make it hard to inline D virtual functions (so if you write D code in Java style, you see a significant slowdown compared to similar Java code running with HotSpot). So far no one seems to care about this; we'll see if I am right to see a problem here;

Well, writing code Java-style is certainly no problem for game devs, considering they already minimize virtual function usage, at least in lower code layers. ;)

<snip> A system language is something that you can use to write very small binaries, that can be used to write a kernel like Linux, device drivers for a smaller computer+CPU, etc. Such things are hard to do in D2; I don't see Linus using D2 to write his kernel, and he even thinks C++ is unfit. So I see D2 more like a "low-level application language", on a level located somewhere between C and

This is true, but I do recall seeing an executable size comparison somewhere, and the D version of a program (hello world?) beat out the C++ version by about a factor of two. The C version killed both, but still, perhaps D might not be eternally unfit even if C++ is. ;) Then again, maybe the C++ program was just including a superfluous amount of library code, and maybe D programs are generally larger than their C++ equivalents. Plus, if it was hello world, it obviously wasn't using a lot of higher-level features. Either way though, even if D does become fit for kernel/driver code someday, it'll still be a long time before someone actually starts from scratch to write a new kernel using it anyway.

There's also an army of legacy programmers that have to update and debug tons of C++ code. Part of human society works thanks to a mountain of C++ code.

You're right, and I actually realized I misspoke here a little bit ago while I was eating. The legacy code just might keep people using C++ until the sun dies. Still, I maintain that game developers and masochists probably comprise a large portion of programmers starting completely new projects in C++. :D

I have asked several times to have Python-style array/lazy comprehensions in D :-) They help. I think their introduction can reduce the length of D2 programs by 10-30%.

How difficult do you think that would be for the compiler devs to implement in the semantic sense? Assuming it can be done without major hardship or compromising the design of the language, that would be really cool.
Syntactically speaking, Python list comprehensions make the source so much more compact, expressive, and clean that a statically compiled language using them would really stand out. If they're implemented correctly, I can't see any reasons why the syntactic sugar would be any slower than spelling everything out explicitly, either. The syntax would have to be a bit different to feel at home in D, but the idea itself probably isn't too foreign.

I also noticed a discussion about Python tuples from October 2009 I think, and native tuples in D would also be useful...more useful than in Python, in fact. After all, Python lists can contain mixed types (unlike arrays in D), so they make tuples largely redundant except for their different conventional meanings (and except for the ability to use named tuples). In comparison, built-in tuples in D with similarly elegant syntax would fill in a much larger gap. I suppose they'd work something like implicitly generated struct types, which could be hastily constructed as lvalues or rvalues, returned from functions, packed/unpacked and passed to functions, etc. Honestly, I think there's a lot to be learned from the expressiveness of scripting languages (especially Python, given its elegant syntax without all of the Perl/PHP line noise) applied to statically compiled languages without speed hits or design compromises.

From what I've seen so far I think D2 will appeal to some numerics folks too, so it can eat a bit of the Fortran pie too. Some improvements can make D2 more appealing to them; Don is working on this too. (Some ideas from the Chapel language can help here, but I think no one here has taken a serious look at it so far.)

It interested me when you mentioned numerics above, because game engine design is also moving more in the direction of dataflow-oriented programming, where the object hierarchy is more streamlined and there's more of a focus on transforming one type of data to another. This helps with both cache efficiency and exposing data-level parallelism. With all of the vector and matrix math involved in those transformation steps (physics engines, etc.), the same improvements that appeal to FORTRAN/numerics programmers might also find some use cases here. Of course, I could be talking out of my ass, since I don't quite know what D 2.0 improvements you're referring to, but I imagine they might apply.

The situation with the D GC is interesting. First of all, the D GC is not refined; Java VM GCs are way more advanced. So D will need a much more modern GC. Another problem is that the current D GC is quite imprecise; this causes leaks when you use it in real programs that have to run for more than a few minutes. Part of this problem can be solved using a better GC that's more precise (this can slow it down a bit, but avoids a good amount of memory leaks). The other problem is intrinsic to the language, which makes it hard or impossible to invent a fully precise GC for D. And D makes it hard to use a modern generational moving GC. You can't just adopt a JavaVM GC in D. Even the Mono GC (which knows the notion of pinned/unpinned memory) can be unfit (because it's designed for mostly unpinned memory).
This is partially caused by D being a low-level language with pointers, and it's partially caused by the D2 type system being unable to tell apart: 1) hand-managed pointers, to GC memory or C heap memory; 2) GC-managed pointers to pinned memory; 3) GC-managed pointers to unpinned memory. I think Walter thinks that telling them apart in the language makes D too complex, and he may be right. But the current situation makes it hard to design a very efficient GC for D. So I don't think high-performance game designers will use the D GC for the next few years; they will manage most or all of the memory manually. I am ignorant, but I think D designers have to work a little harder in finding ways to allocate unpinned objects. This (with a refined GC able to move unpinned memory, that keeps a stack-like Eden plus two or three generations of objects) can help a lot for programs written in Java style.

Yeah, that's really a shame about the current state of the garbage collector. Any memory leaks at all are the death knell of console games, and they're also the death of pretty much any long-running application. Are the memory leaks eternal and irrevocable, or are we just talking about memory that takes a long time for the garbage collector to figure out it should free? That said, my interest in D is less about its state today and more about its state tomorrow. I'm not planning on dying anytime soon, so I imagine I'll be coding for a long time...and I hope it's not just going to be C++, C++, and more C++ for the rest of my life!

But computer science history has shown that if enough people work on a problem they can often find some partial solution. At the beginning Java was very slow. So there's a bit of hope still. Of course enough GC experts will work on the D GC only if D has some success.

That's pretty much the way I look at it too. Assuming people don't just abandon D, it's only a matter of time before the genius programmers of the world fix the rough spots.

Don't nail your mind only on that problem; that's only one of many problems. You can think that dynamic arrays are simple, but from what I've seen there's nothing simple in D dynamic arrays. People have found a way to improve them a little only now, after years of discussions about them, and I am not fully sure yet that the recent changes are worth it; I mean, I am not sure yet that the current D2 arrays are better than the older ones + an appender helper struct. There is no good benchmarking suite yet to test if they are an improvement. Bye, bearophile

You're right: It's just that when I saw this thread, I figured I could bring up this one problem of many which happens to be an easy fix. :)
Mar 31 2010
Mike S:

Well, writing code Java-style is certainly no problem for game devs,<

Right. Here I was talking about D usage more in general, sorry, like young programmers coming out of university.

it'll still be a long time before someone actually starts from scratch to write a new kernel using it anyway.<

People will try to use D2 for this purpose too, for example for teaching purposes.

How difficult do you think that would be for the compiler devs to implement in the semantic sense? Assuming it can be done without major hardship or compromising the design of the language, that would be really cool.<

They are easy to implement. Even the lazy ones. See the ShedSkin "compiler".

I also noticed a discussion about Python tuples from October 2009 I think, and native tuples in D would also be useful...<

We can talk about them again for D3. At the moment D2 needs fewer new features and better implementation/debugging of the already present features.

I think there's a lot to be learned from the expressiveness of scripting languages<

That was one of the original goals of D.

Are the memory leaks eternal and irrevocable, or are we just talking about memory that takes a long time for the garbage collector to figure out it should free?<

I am mostly talking about false pointers: values that the GC thinks are pointers while they are not. They can keep blocks of memory alive.

Assuming people don't just abandon D, it's only a matter of time before the genius programmers of the world fix the rough spots.<

But there are limits to what smart people can invent/solve. So the language designers have to work to allow them to find solutions. Bye, bearophile
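[A hedged aside on the false-pointer problem bearophile describes: druntime's core.memory API lets a block be marked as containing no pointers, so the conservative collector never scans its payload for pointer-like values. The collector then cannot mistake this block's integers for references that keep other blocks alive:]

    import core.memory;

    void main()
    {
        // A NO_SCAN block is skipped by the collector's scan phase, so
        // integer payloads here can never act as false pointers.
        auto words = cast(size_t*) GC.malloc(4096, GC.BlkAttr.NO_SCAN);
        words[0 .. 4096 / size_t.sizeof] = 0xDEADBEEF; // address-like noise, harmless
    }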
Apr 01 2010
bearophile wrote:

How difficult do you think that would be for the compiler devs to implement in the semantic sense? Assuming it can be done without major hardship or compromising the design of the language, that would be really cool.<

They are easy to implement. Even the lazy ones. See the ShedSkin "compiler".

I figured the eager ones wouldn't be a problem, but I wondered whether the lazy ones might be a pain. Guess not, so cool. :)

We can talk about them again for D3. At the moment D2 needs fewer new features and better implementation/debugging of the already present features.

That's very true...I'm looking forward to Andrei's book, but I can't imagine how he's finishing it on schedule, considering how quickly both the language itself and the compiler are evolving. If the language specification and reference compiler are both as incomplete, volatile, and partially implemented as they are now, a June release might do some real damage to D's reputation. As far as D3 goes though: Obviously nothing about it has really been discussed at length, but is the general idea that it would be another backwards-incompatible overhaul, or is the plan to make D2 the target for backwards compatibility from here on out?
Apr 01 2010
Mike S wrote:

bearophile wrote:

How difficult do you think that would be for the compiler devs to implement in the semantic sense? Assuming it can be done without major hardship or compromising the design of the language, that would be really cool.<

They are easy to implement. Even the lazy ones. See the ShedSkin "compiler".

I figured the eager ones wouldn't be a problem, but I wondered whether the lazy ones might be a pain. Guess not, so cool. :)

We can talk about them again for D3. At the moment D2 needs fewer new features and better implementation/debugging of the already present features.

That's very true...I'm looking forward to Andrei's book, but I can't imagine how he's finishing it on schedule, considering how quickly both the language itself and the compiler are evolving.

The book is finished and is on schedule. It's been out of my hands for a while - currently in the final copyedit stage. (Walter, last chance to remove octal literals.) I'll publish a schedule on my website soon. The entire genesis of TDPL and its growth in conjunction with the language itself was a very interesting process (in addition to being a near-death experience) that I hope I'll find time to write an article about soon. Bottom line, the thing looks like a million 1920 bucks. I fail to see many ways in which the people involved could have made it better. That being said, it's entirely unpredictable how the book's going to fare. I'll be very curious.

If the language specification and reference compiler are both as incomplete, volatile, and partially implemented as they are now, a June release might do some real damage to D's reputation.

Not at all. Consider the state of C++, Perl, Java, or Ruby definitions and implementations at the time when their first "hello, world" book was printed. If anything, TDPL is slightly later, not earlier, than average. Even K&R's classic does not cover the entire language (even as it was defined at the time) and has things like prototype-less calls, which C has since dropped like a bad habit.

As far as D3 goes though: Obviously nothing about it has really been discussed at length, but is the general idea that it would be another backwards-incompatible overhaul, or is the plan to make D2 the target for backwards compatibility from here on out?

The incompatibility with D1 was a one-off thing. D2 is the flagship of D going forward. If you ask me, I predict that the schism of D2 from D1 will go down as Walter's smartest gambit ever. Andrei
Apr 01 2010
Andrei Alexandrescu:

The book is finished and is on schedule. It's been out of my hands for a while - currently in the final copyedit stage. (Walter, last chance to remove octal literals.) I'll publish a schedule on my website soon.

Walter doesn't want to change octals, so I think it's a waste of time to keep talking about that. There are several other small problems in D2 that deserve a look. They are small but important things. Please talk about those. I list most of them here in approximate order of decreasing importance:

Syntax & semantics for array assigns
http://d.puremagic.com/issues/show_bug.cgi?id=3971

[module system] Tidying up the imports
http://d.puremagic.com/issues/show_bug.cgi?id=3819

Signed lengths (and other built-in values)
http://d.puremagic.com/issues/show_bug.cgi?id=3843

[missing error] Array literal length doesn't match / Array literal assign to array of different length
http://d.puremagic.com/issues/show_bug.cgi?id=3849
http://d.puremagic.com/issues/show_bug.cgi?id=3948

opCast(bool) in classes is bug-prone
http://d.puremagic.com/issues/show_bug.cgi?id=3926

Require opEquals/opCmp in a class that defines toHash
http://d.puremagic.com/issues/show_bug.cgi?id=3844

automatic joining of adjacent strings is bad
http://d.puremagic.com/issues/show_bug.cgi?id=3827

pure/nothrow functions/delegates are a subtype of the nonpure/throw ones
http://d.puremagic.com/issues/show_bug.cgi?id=3833

const arguments/instance attributes in conditions/invariants
http://d.puremagic.com/issues/show_bug.cgi?id=3856

bool opEquals() for structs instead of int opEquals()
http://d.puremagic.com/issues/show_bug.cgi?id=3967

byte ==> sbyte
http://d.puremagic.com/issues/show_bug.cgi?id=3936
http://d.puremagic.com/issues/show_bug.cgi?id=3850

A bug-prone situation with AAs
http://d.puremagic.com/issues/show_bug.cgi?id=3825

Arguments and attributes with the same name
http://d.puremagic.com/issues/show_bug.cgi?id=3878

More useful and cleaner 'is'
http://d.puremagic.com/issues/show_bug.cgi?id=3981

Those things are small breaking changes, so it's much better to think about them sooner. If you want, I can explain each one of them better. Bye, bearophile
Apr 01 2010
Mike S:

I figured the eager ones wouldn't be a problem, but I wondered whether the lazy ones might be a pain. Guess not, so cool. :)

Lazy ones are just a struct/class instance that contains an opApply method (or that follows the Range protocol methods). Very easy.

As far as D3 goes though: Obviously nothing about it has really been discussed at length, but is the general idea that it would be another backwards-incompatible overhaul, or is the plan to make D2 the target for backwards compatibility from here on out?

From what I've seen, D3 will probably be backwards compatible with D2. (But maybe D3 will feel free to fix a few backwards-incompatible warts/errors found in the meantime, I don't know, like un-flattening tuples.) Bye, bearophile
Apr 01 2010
On Wed, 31 Mar 2010 17:57:07 -0400, Mike S <mikes notarealaddresslololololol.com> wrote:

Steven Schveighoffer wrote: You are correct, setCapacity ensures that *at least* the given number of elements will be available for appending. I planned on making the function a property (but a bug would not allow that); the original intended usage was: a.capacity = 10000; Reserve doesn't work in this context. Can you come up with a name that does? I'll bring up reserve (as a function) as an alternative on the phobos mailing list and see what people say. I kind of liked the setter/getter idea, but you make a good point. -Steve

Sorry if resurrecting this thread is against netiquette, but it caught my eye, and this is my first newsgroup post in years. ;) Anyway, is there any compelling reason why setCapacity or modifying a.capacity should allocate a nondeterministic amount of storage?

What do you mean by nondeterministic? It's very deterministic, just not always easy to determine ;) However, given enough context, it's really easy to determine.

Depending on the application, programmers might require strict control over memory allocation patterns and strict accounting for allocated memory. Game programmers, especially console game programmers, tend to strongly prefer deterministic allocation patterns, and nondeterminism is one of the [several] common complaints about the C++ STL (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html is a good resource on this kind of issue). In the case of D (which I'm considering learning), this is especially important for dynamic arrays, partly because they're so useful by themselves, and partly because they may form the backbone of custom containers.

The amount of memory given is determined by the GC, and ultimately by the OS. The currently supported OSes allocate in page-sized chunks, so when you allocate any memory from the OS, you are allocating a page (4k). Most likely, you may not need a whole page for the data you are allocating, so the GC gives you more finely sized chunks by breaking up a page into smaller pieces. This strategy works well in some cases, and can be wasteful in others. The goal is to strike a balance that is "good enough" for everyday programming, but can be specialized when you need it. If you want to control memory allocation yourself, you can always do that by allocating page-sized chunks and doing the memory management on those chunks yourself. I do something very similar in dcollections to speed up allocation/destruction.

Whereas it's easy to add "smart nondeterministic" behavior to a deterministic setCapacity function by providing a wrapper, ordinary language users can't do the opposite. Because of this, and because dynamic arrays are so central to the D language, a nondeterministic setCapacity function may deter game programmers, especially console programmers, from adopting D. Assuming you see this post, what are your thoughts here?

I think D has deterministic allocation, and better ability than C++ to make custom types that look and act like builtins. Therefore, you can make an array type that suits your needs and has almost exactly the same syntax as a builtin array (except for some things reserved for builtins, like literals). Such a thing is certainly possible, even using the GC for your allocation.

BTW, I made the change to the runtime renaming the function previously known as setCapacity to reserve. It won't be a property, even if that bug is fixed. -Steve
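[A minimal sketch of the custom-type point above, with every name hypothetical: a struct can overload the array operators so user code keeps the builtin look while the storage policy stays fully deterministic:]

    // Hypothetical example type; not part of any library.
    struct FixedCapArray(T, size_t N)
    {
        private T[N] store;      // storage lives inline, no GC involvement
        private size_t len;

        void opOpAssign(string op : "~")(T value)
        {
            assert(len < N, "capacity exhausted");
            store[len++] = value;
        }

        inout(T)[] opSlice() inout { return store[0 .. len]; }
        @property size_t length() const { return len; }
    }

    unittest
    {
        FixedCapArray!(int, 8) a;
        a ~= 1;
        a ~= 2;
        assert(a[] == [1, 2] && a.length == 2);
    }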
Mar 31 2010
Steven Schveighoffer wrote:

> What do you mean by nondeterministic? It's very deterministic, just not always easy to determine ;) However, given enough context, it's really easy to determine.

When I say deterministic, I'm referring to determinism from the user's point of view, where the allocation behavior is affected solely by the parameter (the size request, e.g. 10000 objects) and not by some kind of internal state, hidden context, or arcane black magic. :p

> The amount of memory given is determined by the GC, and ultimately by the OS. The currently supported OSes allocate in page-sized chunks, so when you allocate any memory from the OS, you are allocating a page (4k). Most likely, you may not need a whole page for the data you are allocating, so the GC gives you more finely sized chunks by breaking up a page into smaller pieces. This strategy works well in some cases, and can be wasteful in others. The goal is to strike a balance that is "good enough" for everyday programming, but can be specialized when you need it.

That's understandable, and it makes sense that the actual memory being allocated would correspond to some chunk size. It's really just opaque black box behavior that poses a problem; if users are given well-defined guidelines and chunk sizes, that would work just fine. For instance, a spec like, "reserve a multiple of 512 bytes and that's exactly what you will be given," would allow users to minimize wastefulness and know precisely how much memory they're allocating.

If you want to control memory allocation yourself, you can always do that by allocating page-sized chunks and doing the memory management on those chunks yourself. I do something very similar in dcollections to speed up allocation/destruction. <snip> I think D has deterministic allocation, and better ability than C++ to make custom types that look and act like builtins. Therefore, you can make an array type that suits your needs and has almost exactly the same syntax as a builtin array (except for some things reserved for builtins, like literals). Such a thing is certainly possible, even using the GC for your allocation.

That parallels what game devs do in C++: They tend to use custom allocators a lot, and they're likely to follow the same basic strategy in D too, if/when it becomes a suitable replacement. I'm still just browsing though, and I'm not all that familiar with D. If you can't actually use the built-in dynamic arrays for this purpose, how difficult would it be to reimplement a contiguously stored dynamic container using custom allocation? I suppose you'd have to build it from the ground up using a void pointer to a custom allocated block of memory, right? Do user-defined types in D have any/many performance disadvantages compared to built-ins?

BTW, I made the change to the runtime renaming the function previously known as setCapacity to reserve. It won't be a property, even if that bug is fixed. -Steve

That's a bit of a downer, since a capacity property would have nice symmetry with the length property. I suppose there were good reasons though. Considering the name change, does that mean reserve can only reserve new space, i.e. it can't free any that's already been allocated? (That makes me wonder: Out of curiosity, how does the garbage collector know how much space is allocated to a dynamic array or especially to a void pointer? I suppose it's registered somewhere?)
Mar 31 2010
On Thu, 01 Apr 2010 01:41:02 -0400, Mike S <mikes notarealaddresslololololol.com> wrote:

Steven Schveighoffer wrote: > What do you mean by nondeterministic? It's very deterministic, just not always easy to determine ;) However, given enough context, it's really easy to determine.

When I say deterministic, I'm referring to determinism from the user's point of view, where the allocation behavior is affected solely by the parameter (the size request, e.g. 10000 objects) and not by some kind of internal state, hidden context, or arcane black magic. :p

It's abstracted to the GC, but the current GC is well defined. If you request to allocate blocks with length of a power of 2 under a page, you will get exactly that length, all the way down to 16 bytes. If you request to allocate a page or greater, you get a contiguous block of memory that is a multiple of a page. With that definition, is the allocator deterministic enough for your needs?

> The amount of memory given is determined by the GC, and ultimately by the OS. The currently supported OSes allocate in page-sized chunks, so when you allocate any memory from the OS, you are allocating a page (4k). Most likely, you may not need a whole page for the data you are allocating, so the GC gives you more finely sized chunks by breaking up a page into smaller pieces. This strategy works well in some cases, and can be wasteful in others. The goal is to strike a balance that is "good enough" for everyday programming, but can be specialized when you need it.

That's understandable, and it makes sense that the actual memory being allocated would correspond to some chunk size. It's really just opaque black box behavior that poses a problem; if users are given well-defined guidelines and chunk sizes, that would work just fine. For instance, a spec like, "reserve a multiple of 512 bytes and that's exactly what you will be given," would allow users to minimize wastefulness and know precisely how much memory they're allocating.

I think in the interest of allowing innovative freedom, such requirements should be left up to the GC implementor, not the spec or runtime. Anyone who wants to closely control memory usage should just understand how the GC they are using works.

If you want to control memory allocation yourself, you can always do that by allocating page-sized chunks and doing the memory management on those chunks yourself. I do something very similar in dcollections to speed up allocation/destruction. <snip> I think D has deterministic allocation, and better ability than C++ to make custom types that look and act like builtins. Therefore, you can make an array type that suits your needs and has almost exactly the same syntax as a builtin array (except for some things reserved for builtins, like literals). Such a thing is certainly possible, even using the GC for your allocation.

That parallels what game devs do in C++: They tend to use custom allocators a lot, and they're likely to follow the same basic strategy in D too, if/when it becomes a suitable replacement. I'm still just browsing though, and I'm not all that familiar with D.
If you can't actually use the built-in dynamic arrays for this purpose, how difficult would it be to reimplement a contiguously stored dynamic container using custom allocation? I suppose you'd have to build it from the ground up using a void pointer to a custom allocated block of memory, right? Do user-defined types in D have any/many performance disadvantages compared to built-ins?

No, you would most likely use templates, not void pointers. D's template system is far advanced past C++'s, and I used it to implement my custom allocators. It works great. User-defined types are as high performance as builtins as long as the compiler inlines properly.

That's a bit of a downer, since a capacity property would have nice symmetry with the length property. I suppose there were good reasons though. Considering the name change, does that mean reserve can only reserve new space, i.e. it can't free any that's already been allocated?

Capacity still exists as a read-only property. I did like the symmetry, but the point was well taken that the act of setting the capacity was not exact. It does mean that reserving space can only grow, not shrink. In fact, the capacity property calls the same runtime function as reserve, just passing 0 as the amount requested to get the currently reserved space. You can't use capacity to free space because that could result in dangling pointers. Freeing space is done through the delete keyword. We do not want to make it easy to accidentally free space.

(That makes me wonder: Out of curiosity, how does the garbage collector know how much space is allocated to a dynamic array or especially to a void pointer? I suppose it's registered somewhere?)

The GC can figure out what page an interior pointer belongs to, and therefore how much memory that block uses. There is a GC function to get the block info of an interior pointer, which returns a struct that contains the pointer to the block, the length of the block, and its flags (whether it contains pointers or not). This function is what the array append feature uses to determine how much capacity can be used. I believe this lookup is logarithmic in complexity. -Steve
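[The block-info lookup Steve describes is exposed in druntime as GC.query, which fills in a BlkInfo struct with the block's base pointer, size, and attribute flags; a short sketch using core.memory:]

    import core.memory;

    void main()
    {
        auto a = new int[5];                  // asks for 20 bytes of payload
        auto info = GC.query(a.ptr);
        assert(info.base !is null);           // start of the containing block
        assert(info.size >= 5 * int.sizeof);  // full block size, rounded up to a bin
        // info.attr holds flags such as GC.BlkAttr.NO_SCAN
    }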
Apr 01 2010
Steven Schveighoffer wrote:

It's abstracted to the GC, but the current GC is well defined. If you request to allocate blocks with length of a power of 2 under a page, you will get exactly that length, all the way down to 16 bytes. If you request to allocate a page or greater, you get a contiguous block of memory that is a multiple of a page. With that definition, is the allocator deterministic enough for your needs?

With respect to the current GC, yup. :) Sticking to powers of 2 (or integral numbers of pages) >= 16 bytes is an easy enough rule. Of course, I'm just starting out, so experienced game programmers might disagree, but it sounds perfectly reasonable to me. I suppose the abstraction makes it a QOI (quality of implementation) matter though, so depending on the compiler and GC used in the future it could become a question again. As it stands, I'm less interested in its precise current state and more interested in keeping an eye on its direction and how the spec/compiler/tools will mature over the next few years.

> I think in the interest of allowing innovative freedom, such requirements should be left up to the GC implementor, not the spec or runtime. Anyone who wants to closely control memory usage should just understand how the GC they are using works.

It's definitely a tradeoff...leaving things unspecified opens the door to innovation and better implementations, but it simultaneously raises issues of inconsistent behavior across multiple platforms where the same compiler/GC might not be able to be used. To give an extreme example, look where unspecified behavior regarding static initialization order got C++! ;) I guess I'll just have to see where things go over time, but the predictable allocation with the current GC is a good sign at least.

No, you would most likely use templates, not void pointers. D's template system is far advanced past C++'s, and I used it to implement my custom allocators. It works great. User-defined types are as high performance as builtins as long as the compiler inlines properly.

D's template system is pretty intriguing. I don't really know anything about it, but I've read that it's less intimidating and tricky than C++'s, and that can only be a good thing for people wanting to harness its power.

Capacity still exists as a read-only property. I did like the symmetry, but the point was well taken that the act of setting the capacity was not exact. It does mean that reserving space can only grow, not shrink. In fact, the capacity property calls the same runtime function as reserve, just passing 0 as the amount requested to get the currently reserved space. You can't use capacity to free space because that could result in dangling pointers. Freeing space is done through the delete keyword. We do not want to make it easy to accidentally free space.

It's a shame the asymmetry was necessary, but I agree it makes sense.

The GC can figure out what page an interior pointer belongs to, and therefore how much memory that block uses. There is a GC function to get the block info of an interior pointer, which returns a struct that contains the pointer to the block, the length of the block, and its flags (whether it contains pointers or not). This function is what the array append feature uses to determine how much capacity can be used. I believe this lookup is logarithmic in complexity. -Steve

Thanks!
Apr 01 2010
On Thu, 01 Apr 2010 11:01:38 -0400, Mike S <mikes notarealaddresslololololol.com> wrote:

I suppose the abstraction makes it a QOI (quality of implementation) matter though, so depending on the compiler and GC used in the future it could become a question again. As it stands, I'm less interested in its precise current state and more interested in keeping an eye on its direction and how the spec/compiler/tools will mature over the next few years.

As an aside, because the GC is implemented all in runtime, you can swap out the GC for something else that suits your needs. Therefore, the issue of having to deal with different GCs on different platforms can be circumvented by inventing a GC that works the way you want on all platforms :) Also, I don't see the major attributes of the GC changing anytime soon. -Steve
Apr 01 2010
Steven Schveighoffer wrote:

On Thu, 04 Mar 2010 11:43:27 -0500, grauzone <none example.net> wrote:

Some sort of "resetAndReuse" function to clear an array, but enabling reuse of the old memory, would be nice:

    int[] a = data;
    a = null;
    a ~= 1;           // reallocates (of course)
    a.length = 0;
    a ~= 1;           // will reallocate (for safety), used not to reallocate
    resetAndReuse(a);
    assert(a.length == 0);
    a ~= 1;           // doesn't reallocate

This can be implemented by setting both the slice and the internal runtime length fields to 0. Additionally, another function is necessary to replace the old preallocation trick:

    // preallocate 1000 elements, but don't change the actual slice length
    auto len = a.length;
    a.length = len + 1000;
    a.length = len;

As I understood it, this won't work anymore after the change. This can be implemented by enlarging the array's memory block without touching any length fields. I'm sure the function you had in mind does one of those things or both.

proposed usage (as checked in a couple days ago):

    int[] a;
    a.setCapacity(10000); // pre-allocate at least 10000 elements
    foreach(i; 0..10000)
        a ~= i;           // no reallocation
    a.length = 100;
    a.shrinkToFit();      // resize "allocated" length to 100 elements
    a ~= 5;               // no reallocation

What shrinkToFit() does is not really clear. Does it reallocate the memory block of the array such that no space is wasted? Or does it provide (almost) the same functionality as my resetAndReuse(), and make the superfluous trailing memory available for appending without reallocation? I think a resetAndReuse is really needed. I have found it can prevent "GC thrashing" in many cases. E.g. when caching frequently re-evaluated data in the form of arrays (free previous array, then allocate an array that isn't larger than the previous array).
Mar 04 2010
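A side note for anyone reading with a current compiler: as later messages
in this thread indicate, setCapacity() was renamed to the free function
reserve(), with capacity available as a read-only property. Something like
the following should hold; the pointer-stability assertion assumes the
reserved block backs the array immediately, which is how the runtime
function behaves:

void main()
{
    int[] a;
    a.reserve(10_000);              // shipped name for setCapacity
    assert(a.capacity >= 10_000);   // "at least", per the bin-size rules

    auto p = a.ptr;                 // the reserved block backs a already
    foreach (i; 0 .. 10_000)
        a ~= i;                     // fills the reservation in place
    assert(a.ptr is p);             // no reallocation occurred
}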
On Thu, 04 Mar 2010 12:43:32 -0500, grauzone <none example.net> wrote:

> Steven Schveighoffer wrote:
>> proposed usage (as checked in a couple days ago):
>>
>> int[] a;
>> a.setCapacity(10000); // pre-allocate at least 10000 elements
>> foreach(i; 0..10000)
>>     a ~= i; // no reallocation
>> a.length = 100;
>> a.shrinkToFit(); // resize "allocated" length to 100 elements
>> a ~= 5; // no reallocation
>
> What shrinkToFit() does is not really clear. Does it reallocate the
> array's memory block so that no space is wasted? Or does it provide
> (almost) the same functionality as my resetAndReuse(), and make the
> superfluous trailing memory available for appending without
> reallocation?

Sorry, should have added:

assert(a.length == 101);

Basically, shrinkToFit shrinks the "allocated" space to the length of the
array. To put it another way, you could write your resetAndReuse function
as follows:

void resetAndReuse(T)(ref T[] arr)
{
    arr.length = 0;
    arr.shrinkToFit();
}

I want to avoid assuming that shrinking the length to 0 is the only usable
idiom.

> I think a resetAndReuse is really needed. I have found it can prevent
> "GC thrashing" in many cases, e.g. when caching frequently re-evaluated
> data in the form of arrays (free the previous array, then allocate an
> array that isn't larger than the previous one).

Yes, it is useful in such cases. The only questionable part about it is
that it allows for stomping in cases like:

auto str = "hello".idup;
auto str2 = str[3..4];
str2.shrinkToFit();
str2 ~= "a";
assert(str == "hella");

So it probably should be marked as unsafe. Again, the name shrinkToFit
isn't my favorite; ideas welcome.

-Steve
Mar 04 2010
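As it happens, shrinkToFit() eventually shipped in druntime under the name
assumeSafeAppend(). Assuming that name, the resetAndReuse idiom above can
be checked end to end; the pointer assertion holds as long as nothing else
has claimed the block in the meantime:

// shrinkToFit() shipped as assumeSafeAppend(); this sketch assumes that
// name. Only sound while no other slice still references the trimmed
// elements.
void resetAndReuse(T)(ref T[] arr)
{
    arr.length = 0;
    arr.assumeSafeAppend();
}

void main()
{
    auto a = new int[1000];
    auto p = a.ptr;

    resetAndReuse(a);
    a ~= 42;                 // reuses the old block...
    assert(a.ptr is p);      // ...so no reallocation took place
    assert(a.length == 1 && a[0] == 42);
}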
Steven Schveighoffer wrote:
> Basically, shrinkToFit shrinks the "allocated" space to the length of
> the array. To put it another way, you could write your resetAndReuse
> function as follows:
>
> void resetAndReuse(T)(ref T[] arr)
> {
>     arr.length = 0;
>     arr.shrinkToFit();
> }
>
> I want to avoid assuming that shrinking the length to 0 is the only
> usable idiom.

Ah, great.

> Yes, it is useful in such cases. The only questionable part about it is
> that it allows for stomping in cases like:
>
> auto str = "hello".idup;
> auto str2 = str[3..4];
> str2.shrinkToFit();
> str2 ~= "a";
> assert(str == "hella");
>
> So it probably should be marked as unsafe.

Doesn't it conform to Andrei's ideas about memory safety? It can stomp
over other data, but it can't be used to subvert the type system. Also,
it's not the default behavior: if you don't use this function, stomping
can never happen. But it must be disabled for arrays of immutable types
(i.e. strings).

> Again, the name shrinkToFit isn't my favorite; ideas welcome.

stompOnAppend()? uniqueSlice()? trashRemainder()? I don't know either;
they all sound a bit silly.

PS: I'm glad to hear that your patch is supposed to be included in the
next release. Still waiting for dsimcha's one.
Mar 04 2010
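The "disabled for arrays of immutable types" idea could be approximated
with a template constraint. A hypothetical sketch follows:
guardedShrinkToFit is an invented name, assumeSafeAppend is the name
shrinkToFit later shipped under, and, as the next message points out, a
complete check would also have to recurse into aggregate members:

// Hypothetical guard: refuse the operation when the element type itself
// is immutable or const. A thorough version would also inspect struct
// members for immutable data.
void guardedShrinkToFit(T)(ref T[] arr)
    if (!is(T == immutable) && !is(T == const))
{
    arr.assumeSafeAppend();
}

void main()
{
    int[] a = [1, 2, 3];
    a.length = 1;
    a.guardedShrinkToFit();      // fine: mutable elements

    // string s = "hello";
    // s.guardedShrinkToFit();   // rejected: immutable(char) elements
}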
On Thu, 04 Mar 2010 13:38:30 -0500, grauzone <none example.net> wrote:

> Steven Schveighoffer wrote:
>> So it probably should be marked as unsafe.
>
> Doesn't it conform to Andrei's ideas about memory safety? It can stomp
> over other data, but it can't be used to subvert the type system.

I think the idea is that anything that *could* result in undefined
behavior is marked as unsafe. It just means it's not available from safe
functions; it can still be called from safe functions via a trusted
wrapper.

I'd call it undefined behavior, because the memory is effectively
released. Future GCs may make assumptions about that unallocated space
that would cause problems. I feel like safeD leans slightly towards
sacrificing performance for the sake of memory safety. This is not a bad
thing, and in most cases it's inconsequential.

> Also, it's not the default behavior: if you don't use this function,
> stomping can never happen.

What's nice about capturing it in a function is that you *can* classify it
as unsafe, versus the current method, which is harder to detect. I'd
categorize it like casting.

> But it must be disabled for arrays of immutable types (i.e. strings).

This is tricky: you could have partially immutable types. The function
would have to examine the types of all the contained items and make sure
there were no immutable bytes in the element. Plus, it's not truly unsafe
to use on immutable types if the given array is the only reference to that
data. You could potentially implement a stack using builtin array
appending and shrinkToFit, even on immutable elements (although I'd advise
against it :)

> stompOnAppend()? uniqueSlice()? trashRemainder()? I don't know either;
> they all sound a bit silly.

I think I like shrinkToFit better than all of those :) I still like
minimize best, but the objection was that it has mathematical
connotations.

-Steve
Mar 04 2010
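The stack Steve alludes to might look like the sketch below, written with
assumeSafeAppend (the name shrinkToFit shipped under). Per the warning
above, it stomps popped slots, so it is only sound while nothing else
retains a slice of them:

// Minimal stack sketch: push with ~=, pop by slicing off the top and
// reclaiming the slot with assumeSafeAppend so the next push appends in
// place. Popped elements get stomped; no other slice may reference them.
struct Stack(T)
{
    private T[] data;

    void push(T value)
    {
        data ~= value;               // in place whenever capacity allows
    }

    T pop()
    {
        auto top = data[$ - 1];
        data = data[0 .. $ - 1];
        data.assumeSafeAppend();     // let the next push reuse this slot
        return top;
    }

    @property bool empty() const { return data.length == 0; }
}

void main()
{
    Stack!int s;
    s.push(1);
    s.push(2);
    assert(s.pop() == 2);
    s.push(3);                       // reuses the slot that 2 occupied
    assert(s.pop() == 3 && s.pop() == 1 && s.empty);
}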