digitalmars.D - std.process: memory allocation with malloc in execv_
- kdevel (26/26) Jan 26 2023 Why is the memory in `execv_` in `std.process` allocated with
- Steven Schveighoffer (9/36) Jan 26 2023 Yep, that's 100% the problem. Introduced here:
- kdevel (10/18) Jan 27 2023 On Thursday, 26 January 2023 at 14:45:29 UTC, Steven
- Steven Schveighoffer (6/19) Jan 30 2023 It's actually fine to use GC, you are right. But use `GC.disable` before...
- kdevel (7/15) Jan 30 2023 There is no indication that the GC kicks in after patching (v0)
- Steven Schveighoffer (13/30) Jan 31 2023 Using `GC.disable` ensures the GC will not run when you allocate memory....
- kdevel (16/49) Jan 31 2023 That leads directly to an avoidable allocation failure if there
- Steven Schveighoffer (18/49) Jan 31 2023 The GC will still run if it runs out of memory as a last-ditch effort to...
Why is the memory in `execv_` in `std.process` allocated with `malloc` (`core.stdc.stdlib.malloc`) and why is this memory not registered with the GC? It seems to be responsible for the memory corruption in [1]. ``` diff --git a/process.d b/process.d index 3eaa283..cea45a9 100644 --- a/process.d +++ b/process.d -4295,7 +4295,9 private int execvp_(in string pathname, in string[] argv) import std.exception : enforce; auto argv_ = cast(const(char)**)core.stdc.stdlib.malloc((char*).sizeof * (1 + argv.length)); enforce!OutOfMemoryError(argv_ !is null, "Out of memory in std.process."); - scope(exit) core.stdc.stdlib.free(argv_); + import core.memory; + GC.addRange (argv_, (char *).sizeof * (1 + argv.length)); + scope(exit) { GC.removeRange (argv_); core.stdc.stdlib.free(argv_); } toAStringz(argv, argv_); ``` [1] https://issues.dlang.org/show_bug.cgi?id=23654 Issue 23654 - execv_: toAStringz: memory corruption
Jan 26 2023
On 1/26/23 8:32 AM, kdevel wrote:Why is the memory in `execv_` in `std.process` allocated with `malloc` (`core.stdc.stdlib.malloc`) and why is this memory not registered with the GC? It seems to be responsible for the memory corruption in [1]. ``` diff --git a/process.d b/process.d index 3eaa283..cea45a9 100644 --- a/process.d +++ b/process.d -4295,7 +4295,9 private int execvp_(in string pathname, in string[] argv) import std.exception : enforce; auto argv_ = cast(const(char)**)core.stdc.stdlib.malloc((char*).sizeof * (1 + argv.length)); enforce!OutOfMemoryError(argv_ !is null, "Out of memory in std.process."); - scope(exit) core.stdc.stdlib.free(argv_); + import core.memory; + GC.addRange (argv_, (char *).sizeof * (1 + argv.length)); + scope(exit) { GC.removeRange (argv_); core.stdc.stdlib.free(argv_); } toAStringz(argv, argv_); ``` [1] https://issues.dlang.org/show_bug.cgi?id=23654 Issue 23654 - execv_: toAStringz: memory corruptionYep, that's 100% the problem. Introduced here: https://github.com/dlang/phobos/commit/6302257b0cdc5d171511cc6f1566956ff11b09c5 Amazing it's been broken like this for 7 years! I don't like the solution of adding the range, or using the GC. A better option is to replace toAStringz with a function that creates everything for you into a type (toStringz isn't complex, just replace with one that uses malloc), that then automatically frees everything. -Steve
Jan 26 2023
On Thursday, 26 January 2023 at 14:45:29 UTC, Steven Schveighoffer wrote: [...]Yep, that's 100% the problem. Introduced here: https://github.com/dlang/phobos/commit/6302257b0cdc5d171511cc6f1566956ff11b09c5 Amazing it's been broken like this for 7 years!*shrug*I don't like the solution of adding the range, or using the GC. A better option is to replace toAStringz with a function that creates everything for you into a type (toStringz isn't complex, just replace with one that uses malloc), that then automatically frees everything.Freeing the memory is — in the "happy path" — neither required nor possible. When unhappy the GC is ready to clean up the mess. I uploaded a patch to the issue. How do I check my patched version without editing the module name in process.d? Either I don't get the functions linked into the executable or the linker complains about duplicate symbols.
Jan 27 2023
On 1/27/23 3:26 PM, kdevel wrote:On Thursday, 26 January 2023 at 14:45:29 UTC, Steven Schveighoffer wrote:It's actually fine to use GC, you are right. But use `GC.disable` before using it (with a `scope(exit)` to re-enable), because running a GC just before exec is also pointless.I don't like the solution of adding the range, or using the GC. A better option is to replace toAStringz with a function that creates everything for you into a type (toStringz isn't complex, just replace with one that uses malloc), that then automatically frees everything.Freeing the memory is — in the "happy path" — neither required nor possible. When unhappy the GC is ready to clean up the mess. I uploaded a patch to the issue.How do I check my patched version without editing the module name in process.d? Either I don't get the functions linked into the executable or the linker complains about duplicate symbols.I don't think you can. But building phobos is pretty quick. -Steve
Jan 30 2023
On Monday, 30 January 2023 at 17:19:13 UTC, Steven Schveighoffer wrote:There is no indication that the GC kicks in after patching (v0) https://issues.dlang.org/attachment.cgi?id=1868&action=diff I have a patch v1 in preparation which removes the wrappers entirely. BTW: There is a non-POSIX function execvpe in the process.d which is actually a GNU extension.[...] Freeing the memory is — in the "happy path" — neither required nor possible. When unhappy the GC is ready to clean up the mess. I uploaded a patch to the issue.It's actually fine to use GC, you are right. But use `GC.disable` before using it (with a `scope(exit)` to re-enable), because running a GC just before exec is also pointless.
Jan 30 2023
On 1/30/23 12:56 PM, kdevel wrote:On Monday, 30 January 2023 at 17:19:13 UTC, Steven Schveighoffer wrote:Using `GC.disable` ensures the GC will not run when you allocate memory. Whether it runs or not is up to the memory allocator. There is no guarantee it will run, so checking whether it did run is not conclusive. Running a collection just before replacing the entire image with another program isn't productive work. Just add: ```d GC.disable; scope(exit) GC.enable; ``` to the part where you are about to set up the call to `exec` -SteveThere is no indication that the GC kicks in after patching (v0) https://issues.dlang.org/attachment.cgi?id=1868&action=diff I have a patch v1 in preparation which removes the wrappers entirely. BTW: There is a non-POSIX function execvpe in the process.d which is actually a GNU extension.[...] Freeing the memory is — in the "happy path" — neither required nor possible. When unhappy the GC is ready to clean up the mess. I uploaded a patch to the issue.It's actually fine to use GC, you are right. But use `GC.disable` before using it (with a `scope(exit)` to re-enable), because running a GC just before exec is also pointless.
Jan 31 2023
On Tuesday, 31 January 2023 at 15:29:35 UTC, Steven Schveighoffer wrote:On 1/30/23 12:56 PM, kdevel wrote:That leads directly to an avoidable allocation failure if there is no free memory but enough memory which could be reclaimed in order to allocate `argv_` (the array of pointers to C strings).On Monday, 30 January 2023 at 17:19:13 UTC, Steven Schveighoffer wrote:Using `GC.disable` ensures the GC will not run when you allocate memory.There is no indication that the GC kicks in after patching (v0) https://issues.dlang.org/attachment.cgi?id=1868&action=diff I have a patch v1 in preparation which removes the wrappers entirely. BTW: There is a non-POSIX function execvpe in the process.d which is actually a GNU extension.[...] Freeing the memory is — in the "happy path" — neither required nor possible. When unhappy the GC is ready to clean up the mess. I uploaded a patch to the issue.It's actually fine to use GC, you are right. But use `GC.disable` before using it (with a `scope(exit)` to re-enable), because running a GC just before exec is also pointless.Whether it runs or not is up to the memory allocator.That is the way how systems with GC appear to work since the sixties? Is there a "guideline" that Phobos functions shall **not** be implemented in plain vanilla D? I mean: There is little point in using a GC managed allocation when you have to switch the GC off every now and then.There is no guarantee it will run, so checking whether it did run is not conclusive.Noone declared the intent to implement a check if the GC ran.Running a collection just before replacing the entire image with another program isn't productive work.Can you quantify the likelihood of such incidents and the impact (performance, electrical power, money loss) of a GC not switched off before `execv*`?Just add: ```d GC.disable; scope(exit) GC.enable; ``` to the part where you are about to set up the call to `exec`To me there is no benefit of doing so. However, it makes the code more complicated and hence less readable.
Jan 31 2023
On 1/31/23 1:45 PM, kdevel wrote:On Tuesday, 31 January 2023 at 15:29:35 UTC, Steven Schveighoffer wrote:The GC will still run if it runs out of memory as a last-ditch effort to reclaim some memory. https://github.com/dlang/dmd/blob/b3129254c8e4c25043bc11f820e5d8ea323ac603/druntime/src/core/internal/gc/impl/conservative/gc.d#L1947 But even if this wasn't the case, this is such a rare occurrence to try to avoid that I think the pros of not running a collection in every other case is worth this drawback.Using `GC.disable` ensures the GC will not run when you allocate memory.That leads directly to an avoidable allocation failure if there is no free memory but enough memory which could be reclaimed in order to allocate `argv_` (the array of pointers to C strings).This isn't every now and then, this is literally right before the process is about to be obliterated, and all the memory the GC just reclaimed for us will immediately be reclaimed by the OS. It's the embodiment of wasting cycles. If we can help it, we should avoid it.Whether it runs or not is up to the memory allocator.That is the way how systems with GC appear to work since the sixties? Is there a "guideline" that Phobos functions shall **not** be implemented in plain vanilla D? I mean: There is little point in using a GC managed allocation when you have to switch the GC off every now and then.What are you talking about?Running a collection just before replacing the entire image with another program isn't productive work.Can you quantify the likelihood of such incidents and the impact (performance, electrical power, money loss) of a GC not switched off before `execv*`?If you like wasting CPU cycles for no gain, go ahead and see if it works for you. This isn't going to change anything incredibly significant. The memory corruption definitely should be fixed either way. I'm done with this thread, you don't seem interested in fruitful discussion. Do what you will. -SteveJust add: ```d GC.disable; scope(exit) GC.enable; ``` to the part where you are about to set up the call to `exec`To me there is no benefit of doing so. However, it makes the code more complicated and hence less readable.
Jan 31 2023