digitalmars.D.learn - How to create nogc code?

Adam Sansier <Adam.Sansier gmail.com> writes:
So, I have to create some nogc code. Basically all it uses is 
idup to create a string that is passed as a return value. It 
seems this is necessary or the string will be reused and 
corrupted.

idup uses the gc, I am currently just malloc'ing the string and 
allowing for the memory leak. This is somewhat acceptable given 
that this code should rarely be called and generally only at 
startup. It will generally waste only a few KB of memory.

Regardless, is there a better way to do this that avoids the gc 
and doesn't potentially leak considerably?  Currently the only 
option is to notify the user through comments on the function 
that it leaks and should be used carefully.

Ultimately a struct is built which includes these strings (which 
is partly the corruption issue without idup).

How does phobos deal with this type of stuff? Does it force the 
user to allocate the memory so they are at least aware that they 
have to control it?
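
A minimal sketch of the malloc-based replacement for idup described in this post, assuming the source text lives in a scratch buffer that gets reused. The names mallocDup and freeString are hypothetical helpers, not Phobos functions, and the caller is responsible for releasing the copy.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

/// Hypothetical helper: copies `src` into malloc-ed memory without the GC.
/// The caller owns the result and must release it with `freeString`.
char[] mallocDup(const(char)[] src) @nogc nothrow
{
    if (src.length == 0)
        return null;
    auto p = cast(char*) malloc(src.length);
    if (p is null)
        return null; // allocation failed; real code should report this
    memcpy(p, src.ptr, src.length);
    return p[0 .. src.length];
}

/// Releases a buffer obtained from `mallocDup` (free(null) is a no-op).
void freeString(char[] s) @nogc nothrow
{
    free(s.ptr);
}

void main() @nogc nothrow
{
    char[16] scratch = "temporary text  ";
    auto copy = mallocDup(scratch[0 .. 9]);
    scope (exit) freeString(copy);

    scratch[] = 'x';             // reusing the scratch buffer...
    assert(copy == "temporary"); // ...does not disturb the copy
}
```

Pairing the allocation with an explicit release function at least makes the ownership visible in the API, rather than only in a comment.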
Jul 10 2016
Adam Sansier <Adam.Sansier gmail.com> writes:
Also, when dealing with a complex tree-like structure, is there 
an easy way to recursively free it by free'ing all the sub 
elements?

Also, since I'm dealing with simple structs and strings, maybe a 
more intelligent string type can be used? One that uses opAssign 
to do reference counting? I imagine that the only time there are 
issues is during assignment, in most cases?
Jul 10 2016
Jonathan M Davis via Digitalmars-d-learn writes:
On Monday, July 11, 2016 00:37:39 Adam Sansier via Digitalmars-d-learn wrote:
 Also, when dealing with a complex tree-like structure, is there
 an easy way to recursively free it by free'ing all the sub
 elements?
When manually managing memory, you're dealing with basically the same constructs that you would have in C/C++. I mean, you're even using the exact same functions when you're dealing with malloc and free. So, ultimately, something is going to have to free each of those nodes individually, just like it would in C/C++, but it could be managed by destructors if RAII or reference counting is being used, just like you would in C++.
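
As a rough illustration of freeing each node individually, here is a sketch of a recursive free for a malloc-allocated binary tree. Node, makeNode and freeTree are hypothetical names chosen for this example, and the sketch assumes every allocation succeeds.

```d
import core.stdc.stdlib : calloc, free;

/// Hypothetical binary tree node whose children live on the C heap.
struct Node
{
    int value;
    Node* left;
    Node* right;
}

/// Allocates a zeroed node, so both child pointers start out null.
Node* makeNode(int value) @nogc nothrow
{
    auto n = cast(Node*) calloc(1, Node.sizeof);
    if (n !is null)
        n.value = value;
    return n;
}

/// Frees `n` and everything below it; returns the number of nodes released.
size_t freeTree(Node* n) @nogc nothrow
{
    if (n is null)
        return 0;
    // Children are released first, then the node itself.
    immutable count = 1 + freeTree(n.left) + freeTree(n.right);
    free(n);
    return count;
}

void main() @nogc nothrow
{
    // Assumes the allocations below succeed.
    auto root = makeNode(1);
    root.left = makeNode(2);
    root.right = makeNode(3);
    root.left.left = makeNode(4);
    assert(freeTree(root) == 4);
}
```

The recursion depth follows the tree depth, so a very deep or degenerate tree would probably want an iterative walk instead.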
 Also, since I'm dealing with simple structs and strings, maybe a
 more intelligent string type can be used? One that uses opAssign
 to do reference counting? I imagine that the only time there are
 issues is during assignment, in most cases?
Andrei is currently working on a ref-counted string type that he's calling RCStr that he wants to get into Phobos (or maybe druntime) once it's done, and it would not only be reference counted, but it would have small string optimizations. So, for code where that would work better than just using D's built-in arrays, that will be an option. And you could certainly code up a type yourself that wrapped whatever array type that you wanted so that it was managed via malloc and reference counting rather than with the GC, and the malloc-ed memory was encapsulated within that type.

For most code, the built-in arrays work very well though. So, while you may very well be better off using a user-defined type that used malloc internally, you might also be prematurely optimizing out of fear of the GC as folks sometimes do. But not knowing what you're doing in detail, I couldn't say.

- Jonathan M Davis
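
To make the idea of a hand-rolled, malloc-backed reference-counted string concrete, here is a minimal sketch. It is not RCStr and has no small string optimization; RcString, Payload, get and refCount are hypothetical names, and the count is not thread-safe.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

/// Hypothetical ref-counted string; a sketch, not Andrei's RCStr.
struct RcString
{
    private struct Payload
    {
        size_t refs;
        size_t length;
        // The character data follows this header in the same malloc block.
    }

    private Payload* payload;

    this(const(char)[] s) @nogc nothrow
    {
        payload = cast(Payload*) malloc(Payload.sizeof + s.length);
        if (payload is null)
            return; // allocation failed: behaves like an empty string
        payload.refs = 1;
        payload.length = s.length;
        if (s.length)
            memcpy(cast(char*)(payload + 1), s.ptr, s.length);
    }

    // Postblit: runs after a bitwise copy, so both copies share the payload.
    this(this) @nogc nothrow
    {
        if (payload !is null)
            ++payload.refs;
    }

    // The last owner to go away frees the block.
    ~this() @nogc nothrow
    {
        if (payload !is null && --payload.refs == 0)
            free(payload);
        payload = null;
    }

    /// Borrowed view; only valid while some RcString still holds the payload.
    const(char)[] get() const @nogc nothrow
    {
        if (payload is null)
            return null;
        return (cast(const(char)*)(payload + 1))[0 .. payload.length];
    }

    size_t refCount() const @nogc nothrow
    {
        return payload is null ? 0 : payload.refs;
    }
}

void main() @nogc nothrow
{
    auto a = RcString("hello");
    {
        auto b = a;              // the postblit bumps the count
        assert(a.refCount == 2);
        assert(b.get == "hello");
    }                            // b's destructor drops it again
    assert(a.refCount == 1);
}
```

With a postblit and a destructor defined, the compiler generates assignment that keeps the count consistent, so an explicit opAssign is not needed for this simple case.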
Jul 10 2016
Adam Sansier <Adam.Sansier gmail.com> writes:
On Monday, 11 July 2016 at 01:08:16 UTC, Jonathan M Davis wrote:
 On Monday, July 11, 2016 00:37:39 Adam Sansier via 
 Digitalmars-d-learn wrote:
 [...]
 When manually managing memory, you're dealing with basically the same constructs that you would have in C/C++. I mean, you're even using the exact same functions when you're dealing with malloc and free. So, ultimately, something is going to have to free each of those nodes individually, just like it would in C/C++, but it could be managed by destructors if RAII or reference counting is being used, just like you would in C++. [...]
Thanks. I'd rather prematurely optimize out the gc than have to go back and get it to work. It's not hard to write non-gc code, it's been done for ages. But having some compiler help makes things nice. It seems there is more of a phobia of writing non-gc code than the phobia of the gc.
Jul 10 2016
Jonathan M Davis via Digitalmars-d-learn writes:
On Monday, July 11, 2016 01:16:11 Adam Sansier via Digitalmars-d-learn wrote:
 Thanks. I'd rather prematurely optimize out the gc than have to
 go back and get it to work. It's not hard to write non-gc code,
 it's been done for ages.  But having some compiler help makes
 things nice. It seems there is more of a phobia of writing non-gc
 code than the phobia of the gc.
It's more a case that you're just making life harder for yourself if you avoid the GC. Some programs (like AAA games) are going to need to avoid the GC, but your average program is going to be just fine using the GC - especially if you use idiomatic D and favor structs over classes and use ranges rather than allocating a bunch of stuff on the heap like you'd do in Java, or even often in C++. So, most folks who are trying to avoid the GC are causing themselves pain by doing so without actually needing to. It's perfectly possible to avoid the GC in D, and some programs will need to, but most won't, and avoiding the GC is always more of a pain than just using it.

My advice would be that unless you're doing something where you know that a stop-the-world GC will be unacceptable, you just use the GC and not worry about it until profiling shows you that you need to do something differently. And even then, it's often the case that you just need to alter a small portion of your program so that the GC doesn't run during that piece of the code, or you make a critical thread be non-GC so that it doesn't get stopped during a collection cycle. Folks have made very performance-critical programs work that way, and it's a lot more pleasant than trying to avoid the GC everywhere.

I don't know what you're doing, so I don't know whether you should be trying to avoid the GC or not. You're obviously the one who's going to have to judge that. I'm just pointing out that odds are that you don't need to avoid it and that you're just making life harder for yourself if you do. But what you do is obviously completely up to you.

- Jonathan M Davis
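
One way to keep the GC from running during a small, critical piece of code is core.memory's GC.disable and GC.enable. The following is a sketch under that assumption; criticalWork is a hypothetical function, and the runtime may still collect in exceptional situations such as running out of memory.

```d
import core.memory : GC;

/// Hypothetical latency-sensitive routine.
int criticalWork(const(int)[] data)
{
    GC.disable();             // suppress automatic collections
    scope (exit) GC.enable(); // restore on every exit path

    int sum = 0;
    foreach (x; data)
        sum += x;
    return sum;
}

void main()
{
    auto data = new int[](1000); // GC allocation happens outside the hot path
    data[] = 2;
    assert(criticalWork(data) == 2000);
    GC.collect();                // optionally collect at a convenient moment
}
```

Calls to GC.disable nest, so each one needs a matching GC.enable, which is why the sketch uses scope (exit).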
Jul 10 2016
Adam Sansier <Adam.Sansier gmail.com> writes:
On Monday, 11 July 2016 at 02:35:13 UTC, Jonathan M Davis wrote:
 On Monday, July 11, 2016 01:16:11 Adam Sansier via 
 Digitalmars-d-learn wrote:
 [...]
 It's more a case that you're just making life harder for yourself if you avoid the GC. Some programs (like AAA games) are going to need to avoid the GC, but your average program is going to be just fine using the GC - especially if you use idiomatic D and favor structs over classes and use ranges rather than allocating a bunch of stuff on the heap like you'd do in Java, or even often in C++. So, most folks who are trying to avoid the GC are causing themselves pain by doing so without actually needing to. It's perfectly possible to avoid the GC in D, and some programs will need to, but most won't, and avoiding the GC is always more of a pain than just using it. [...]
Yes, thanks. I know. Even if I didn't, it doesn't matter ;) I think it would be more beneficial to try to address the pitfalls rather than the `what if's`, hypotheticals and cliff hangers. You know, you don't see your argument much in C++ forums. In fact, it's probably the opposite ;) Do I have to write a song called `Don't fear the GC`?
Jul 10 2016
Jonathan M Davis via Digitalmars-d-learn writes:
On Monday, July 11, 2016 03:51:27 Adam Sansier via Digitalmars-d-learn wrote:
 You know, you don't see your argument much in C++ forums. In
 fact, it's probably the opposite ;)
C++ doesn't have a GC built-in and does not have features that rely on it. So, it's in a very different situation. And when talking about the GC in D and how avoiding the GC causes problems, you're usually talking about features that C++ doesn't even have. If you never use features like D's dynamic arrays or delegates, then it's mostly a non-issue. If all you're doing is passing around int* and the like, then the situation is the same as in C and is fine. But stuff like int[] becomes problematic, because it assumes that you're using the GC. But that's stuff that doesn't exist in C++.

The main problem area in D that involves features C++ also has is allocating user-defined objects: C++'s new does not use the GC, whereas D's does, and in both cases, new is the clean and easy way to allocate an object. So, unlike in C++, in D, if you want to put user-defined objects on a non-GC heap, it can be a pain - since malloc and free don't handle that for you; they just deal with the memory itself. You need a wrapper that handles not only the allocation, but the construction and destruction correctly. But std(.experimental).allocator is where that's getting fixed. So, fortunately, that problem is going away, but without those wrappers, avoiding the GC gets miserable fast.

So, if you stick purely to features that don't use the GC at all, then you're in a similar boat to C or C++, and you don't need to worry about the GC, since you're not using it, and you're not using anything that's designed to use it. But then you have to avoid some nice features, which sucks, and makes writing your programs harder than they would be otherwise. By far the bigger gain is writing your code in a way that minimizes heap allocations in general, and that's going to be of benefit whether you're using the GC or not (and that's just as true in C++ as it is in D).

Because their feature sets are different, the situations in C++ and D are fundamentally different even though they're similar languages, and I wouldn't expect the same arguments or conclusions on all of the various topics in a C++ discussion that you'd have in a D discussion, even if the folks making those arguments were the same. Sometimes, the best way to do things is exactly the same in both languages, and sometimes it's very different.

- Jonathan M Davis
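
For reference, a small sketch of what the std.experimental.allocator wrappers look like when placing a user-defined object on the malloc heap. make and dispose are the library's functions; the Point type is a hypothetical example.

```d
import std.experimental.allocator : make, dispose;
import std.experimental.allocator.mallocator : Mallocator;

/// Hypothetical user-defined type.
struct Point
{
    int x, y;

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

void main()
{
    // make allocates from the malloc heap and runs the constructor.
    Point* p = Mallocator.instance.make!Point(3, 4);
    assert(p !is null);

    // dispose runs the destructor (if any) and returns the memory.
    scope (exit) Mallocator.instance.dispose(p);

    assert(p.x + p.y == 7);
}
```

The same make and dispose calls work with other allocators from the package, so the allocation strategy can be swapped without touching the construction logic.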
Jul 11 2016
Ola Fosheim Grøstad writes:
On Monday, 11 July 2016 at 21:28:58 UTC, Jonathan M Davis wrote:
 If all you're doing is passing around int* and the like, then 
 the situation is the same as in C and is fine. But stuff like 
 int[] becomes problematic, because it assumes that you're using 
 the GC. But that's stuff that doesn't exist in C++.
It exists in C++, but is broken into multiple separate concepts: std::string_view, std::vector, gsl::span etc. D and Go mix up a view and ownership, which is confusing.
Jul 11 2016
Jonathan M Davis via Digitalmars-d-learn writes:
On Monday, July 11, 2016 00:31:10 Adam Sansier via Digitalmars-d-learn wrote:
 So, I have to create some nogc code. Basically all it uses is
 idup to create a string that is passed as a return value. It
 seems this is necessary or the string will be reused and
 corrupted.

 idup uses the gc, I am currently just malloc'ing the string and
 allowing for the memory leak. This is somewhat acceptable given
 that this code should rarely be called and generally only at
 startup. It will generally waste only a few KB of memory.

 Regardless, is there a better way to do this that avoids the gc
 and doesn't potentially leak considerably?  Currently the only
 option is to notify the user through comments on the function
 that it leaks and should be used carefully.

 Ultimately a struct is built which includes these strings (which
 is partly the corruption issue without idup).

 How does phobos deal with this type of stuff? Does it force the
 user to allocate the memory so they are at least aware that they
 have to control it?
Usually, it either allocates an array via the GC, or it returns a range which requires no allocation. Internally, it does sometimes use malloc, but I don't think that it ever returns anything that was malloc-ed that it expects you to then free.

Straight-up mallocing dynamic arrays and passing them around is asking for trouble. It works fine when the code is nicely encapsulated, e.g.

    import core.stdc.stdlib : malloc, free;

    auto func(Arg arg)
    {
        immutable len = 24;
        auto ptr = cast(int*)malloc(len * int.sizeof);
        scope(exit) free(ptr);
        auto arr = ptr[0 .. len];

        // do stuff with arr, maybe pass it to some functions that don't keep
        // it, but definitely don't pass it to anything that would escape this
        // function.

        return blah;
    }

But if you start passing around an int[] that was allocated via malloc, you have to be _very_ careful. int[] does not manage its memory, and it gives you no way to do so. And it isn't reference counted. You have to keep track of the original, malloc-ed pointer and somehow know when it's valid to free it - and then free it. The code that operates on the int[] won't care and won't leak. It'll even work with operations like ~=, because they'll just reallocate the array with the GC (though obviously, you wouldn't want to do that in a program that was avoiding the GC).

Code in general doesn't care what memory backs a dynamic array, because all of the operations just work (though any that would require reallocating or checking the capacity of the array would require the GC). But that also means that it does nothing with managing that memory. So, if you allocate a dynamic array via malloc, pass it around a bunch, and then at some point later in the program - when you're sure that nothing is using that memory anymore - free it, then you're fine. But be aware that code in general is not going to worry about who owns or manages the memory backing a dynamic array, whether it's doing any operations that would require the GC or not.

So, if you want to pass around malloc-ed memory and have it reference-counted or have it do anything else that would actually keep track of the lifetime of that memory, then you can't pass it around as an array. You'll need to create a wrapper type that encapsulates the memory. You can, of course, write your own code using int[] just like C/C++ would use int*, and keep track of the memory in whatever way you would in C/C++, but D code in general is going to assume that it can slice and append and do all of the other fun array operations to int[] without caring about what memory backs it - and as long as whoever malloc-ed the memory frees it at the right time, then there is no problem, but it's up to the one who malloc-ed the memory to somehow know when that is. And doing that without leaking memory tends to mean that you can't just pass around a dynamic array which refers to malloc-ed memory.

- Jonathan M Davis
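
A minimal sketch of such a wrapper type, assuming single ownership rather than reference counting. IntBuffer, slice and sum are hypothetical names; the wrapper frees the memory in its destructor and only hands out borrowed slices.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical single owner of a malloc-ed int buffer.
struct IntBuffer
{
    private int* ptr;
    private size_t len;

    this(size_t n) @nogc nothrow
    {
        ptr = cast(int*) malloc(n * int.sizeof);
        len = ptr is null ? 0 : n; // a failed allocation yields an empty buffer
        ptr[0 .. len] = 0;         // zero-fill; touches nothing when len is 0
    }

    @disable this(this); // no copies, so exactly one owner frees the memory

    ~this() @nogc nothrow
    {
        free(ptr);
        ptr = null;
        len = 0;
    }

    /// Borrowed slice: callers may read and write it, but must not keep it
    /// beyond the lifetime of the IntBuffer and must not append to it.
    int[] slice() @nogc nothrow
    {
        return ptr[0 .. len];
    }
}

/// Ordinary array code neither knows nor cares what memory backs the slice.
int sum(const(int)[] values) @nogc nothrow
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main() @nogc nothrow
{
    auto buf = IntBuffer(24);
    auto view = buf.slice();
    view[] = 2;
    assert(sum(view) == 48);
}   // buf's destructor frees the memory here
```

Disabling the postblit is the simplest way to rule out double frees; a reference-counted variant would instead increment a counter on copy.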
Jul 10 2016
ag0aep6g <anonymous example.com> writes:
On 07/11/2016 02:31 AM, Adam Sansier wrote:
 idup uses the gc, I am currently just malloc'ing the string and allowing
 for the memory leak. This is somewhat acceptable given that this code
 should rarely be called and generally only at startup. It will generally
 waste only a few KB of memory.
If you can afford the memory leak, you could probably afford the GC, no? That section (only called at startup) doesn't seem to be performance-critical. It doesn't matter then if the GC does its thing during it. And keep in mind that the GC only collects when an allocation is made. So if you never allocate with it again, you're basically in the same spot as now: allocated a few KB of memory that never get freed.
Jul 10 2016