digitalmars.D - Question about D, garbage collection and fork()
- Jerry Quinn (4/4) Mar 09 2011 Where I work, we find it very useful to start a process, load data, then...
- Steven Schveighoffer (7/17) Mar 10 2011 Do you know what causes the OS to regard that memory as read-only? Sinc...
- Jerry Quinn (5/9) Mar 10 2011 It's not that the OS considers the memory actually read-only. It uses c...
- Steven Schveighoffer (13/26) Mar 10 2011 Some pages are made of bins of smaller blocks. For example, a page may ...
- Lionello Lunesu (20/24) Mar 10 2011 D's try-catch will catch all errors, even access violations and stack
- Vladimir Panteleev (6/8) Mar 10 2011 Only on Windows.
Where I work, we find it very useful to start a process, load data, then fork() to parallelize. Our data is large, such that we'd run out of memory trying to run a complete copy on each core. Once the process is loaded, we don't need that much writable memory, so fork is appealing to share the loaded pages. It's possible to use mmap for some of the data, but inconvenient for other data, even though it's read-only at runtime.

So here's my question: In D, if I create a lot of data in the garbage-collected heap that will be read-only, then fork the process, will I get the benefit of the operating system's copy-on-write and only use a small amount of additional memory per process?

In case you're wondering why I wouldn't use threading, one argument is that if you have a bug and the process crashes, you only lose one process instead of N threads. That's actually useful for robustness.

Thoughts?
Mar 09 2011
On Wed, 09 Mar 2011 17:56:54 -0500, Jerry Quinn <jlquinn optonline.net> wrote:
> Where I work, we find it very useful to start a process, load data, then fork() to parallelize. Our data is large, such that we'd run out of memory trying to run a complete copy on each core. Once the process is loaded, we don't need that much writable memory, so fork is appealing to share the loaded pages. It's possible to use mmap for some of the data, but inconvenient for other data, even though it's read-only at runtime.
>
> So here's my question: In D, if I create a lot of data in the garbage-collected heap that will be read-only, then fork the process, will I get the benefit of the operating system's copy-on-write and only use a small amount of additional memory per process?

Do you know what causes the OS to regard that memory as read-only? Since fork() is a C system call, and D gets its heap memory the same as any other unix process (brk()), I can't see why it wouldn't work. As long as you do the same thing you do in C, I think it will work.

-Steve
Mar 10 2011
Steven Schveighoffer Wrote:
> Do you know what causes the OS to regard that memory as read-only? Since fork() is a C system call, and D gets its heap memory the same as any other unix process (brk()), I can't see why it wouldn't work. As long as you do the same thing you do in C, I think it will work.

It's not that the OS considers the memory actually read-only. It uses copy-on-write so the pages will be shared between the processes until one or the other attempts to write to the page. So if the garbage collector moves things around, it will cause the pages to be copied and unshared.

So my question is really probably whether the garbage collector will tend to dirty shared pages or not.

Jerry
Mar 10 2011
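For illustration, the load-then-fork pattern under discussion might look roughly like this in D. This is a minimal sketch, assuming a POSIX system and a single-threaded parent at the time of the fork; `loadData` and `sumSlice` are hypothetical names standing in for the real load and work steps. Whether the data pages actually stay shared depends on nothing in the child (including the collector) writing to them, which is exactly the open question here.

```d
import core.sys.posix.sys.types : pid_t;
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : fork, _exit;
import std.exception : errnoEnforce;
import std.stdio : stdout, writefln;

/// Hypothetical stand-in for the expensive load step; runs once, before forking.
const(long)[] loadData(size_t n)
{
    auto data = new long[](n);
    foreach (i, ref x; data)
        x = cast(long) i;
    return data;
}

/// Read-only work on one worker's share of the data.
long sumSlice(const(long)[] data, size_t worker, size_t workers)
{
    immutable chunk = data.length / workers;
    immutable lo = worker * chunk;
    // The last worker also takes any remainder.
    immutable hi = (worker + 1 == workers) ? data.length : lo + chunk;
    long total = 0;
    foreach (x; data[lo .. hi])
        total += x;
    return total;
}

void main()
{
    enum workers = 4;
    const data = loadData(1_000_000);
    stdout.flush(); // do not let children inherit unflushed output

    pid_t[workers] pids;
    foreach (w; 0 .. workers)
    {
        immutable pid = fork();
        errnoEnforce(pid >= 0, "fork failed");
        if (pid == 0)
        {
            // Child: only reads `data`, so it never triggers a copy of those pages itself.
            writefln("worker %s: sum = %s", w, sumSlice(data, w, workers));
            stdout.flush();
            _exit(0); // leave without running the D runtime shutdown in the child
        }
        pids[w] = pid;
    }

    // Parent: wait for every worker to finish.
    foreach (pid; pids)
    {
        int status;
        waitpid(pid, &status, 0);
    }
}
```

The child uses `_exit` rather than returning from `main` so that runtime and stdio cleanup inherited from the parent does not run a second time.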
On Thu, 10 Mar 2011 14:44:40 -0500, Jerry Quinn <jlquinn optonline.net> wrote:
> Steven Schveighoffer Wrote:
>> Do you know what causes the OS to regard that memory as read-only? Since fork() is a C system call, and D gets its heap memory the same as any other unix process (brk()), I can't see why it wouldn't work. As long as you do the same thing you do in C, I think it will work.
>
> It's not that the OS considers the memory actually read-only. It uses copy-on-write so the pages will be shared between the processes until one or the other attempts to write to the page. So if the garbage collector moves things around, it will cause the pages to be copied and unshared.
>
> So my question is really probably whether the garbage collector will tend to dirty shared pages or not.

Some pages are made of bins of smaller blocks. For example, a page may be a set of 16-byte blocks. In this case, it's entirely possible that both process-local and process-shared data can be in the same page. To get around this, allocate blocks of more than PAGESIZE/2 size. Then use those to contain your read-only data.

The GC stores its metadata in separate pages from the actual data, so you don't have to worry about pages being dirtied by the GC (for example during garbage collection) even though the data is static.

You also always have the ability to use C malloc if you prefer to avoid GC involvement.

-Steve
Mar 10 2011
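The suggestion to keep read-only data in large blocks could be sketched as follows. This is an illustration only: `allocReadMostly` is a hypothetical helper, and it assumes that rounding every request up to at least one OS page (from `sysconf(_SC_PAGESIZE)`) is a safe stand-in for the collector's own PAGESIZE/2 threshold. It also assumes the element type holds no pointers, which is why the block is marked `NO_SCAN`.

```d
import core.memory : GC;
import core.sys.posix.unistd : sysconf, _SC_PAGESIZE;
import std.traits : hasIndirections;

/// Hypothetical helper: allocates `count` zeroed elements in a GC block of at
/// least one OS page, so the data does not share a page with small allocations.
T[] allocReadMostly(T)(size_t count)
{
    static assert(!hasIndirections!T,
        "NO_SCAN is only safe for element types without pointers");

    immutable pageSize = cast(size_t) sysconf(_SC_PAGESIZE);
    size_t bytes = count * T.sizeof;
    if (bytes < pageSize)
        bytes = pageSize; // round small requests up to a whole page

    // NO_SCAN tells the collector the block contains no pointers to follow.
    auto p = cast(T*) GC.calloc(bytes, GC.BlkAttr.NO_SCAN);
    return p[0 .. count];
}

void main()
{
    import std.stdio : writefln;

    auto table = allocReadMostly!long(16);
    foreach (i, ref x; table)
        x = cast(long)(i * i);
    writefln("%s elements in a GC block of %s bytes",
        table.length, GC.sizeOf(table.ptr));
}
```

For data that should stay out of the collector's reach entirely, the same slice could instead be carved out of memory obtained from C `malloc` (available in D as `core.stdc.stdlib.malloc`), at the cost of freeing it manually.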
On 10-3-2011 6:56, Jerry Quinn wrote:
> Where I work, we find it very useful to start a process, load data, then fork() to parallelize. Our data is large, such that we'd run out of memory trying to run a complete copy on each core. Once the process is loaded, we don't need that much writable memory, so fork is appealing to share the loaded pages. It's possible to use mmap for some of the data, but inconvenient for other data, even though it's read-only at runtime.
>
> So here's my question: In D, if I create a lot of data in the garbage-collected heap that will be read-only, then fork the process, will I get the benefit of the operating system's copy-on-write and only use a small amount of additional memory per process?
>
> In case you're wondering why I wouldn't use threading, one argument is that if you have a bug and the process crashes, you only lose one process instead of N threads. That's actually useful for robustness.
>
> Thoughts?

D's try-catch will catch all errors, even access violations and stack overflow:

    import std.stdio;

    // Recurses forever to force a stack overflow.
    void so() { so(); }

    void main()
    {
        try
        {
            so();
        }
        catch (Throwable) // catch everything, including Errors
        {
        }
        writeln("graceful exit");
    }

By wrapping each thread's code in try-catch you can handle each thread going down. Of course, a thread can still corrupt the memory of another thread.

To share memory between processes you'd have to use an OS specific API. On Windows you'd use a file mapping.

L.
Mar 10 2011
On Thu, 10 Mar 2011 16:13:16 +0200, Lionello Lunesu <lio lunesu.remove.com> wrote:
> D's try-catch will catch all errors, even access violations and stack overflow:

Only on Windows.

--
Best regards,
Vladimir
mailto:vladimir thecybershadow.net
Mar 10 2011
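On POSIX systems a fatal fault normally arrives as a signal that terminates the whole process, which is where the one-process-per-worker argument from the original post comes in. The sketch below, assuming a POSIX system, shows how a parent might observe a crashed child without going down itself; `runIsolated` and `crash` are hypothetical names, and `abort()` is used only as a convenient, portable way to make the child die from a signal.

```d
import core.stdc.stdlib : abort;
import core.sys.posix.sys.wait : waitpid, WIFEXITED, WEXITSTATUS,
    WIFSIGNALED, WTERMSIG;
import core.sys.posix.unistd : fork, _exit;
import std.format : format;

/// Hypothetical helper: runs `job` in a forked child and reports how it ended.
string runIsolated(void function() job)
{
    immutable pid = fork();
    if (pid < 0)
        return "fork failed";

    if (pid == 0)
    {
        // Child: never return into the parent's code path.
        try
        {
            job();
        }
        catch (Throwable)
        {
            _exit(1);
        }
        _exit(0);
    }

    // Parent: wait for the child and inspect its termination status.
    int status;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status))
        return format("killed by signal %s", WTERMSIG(status));
    if (WIFEXITED(status))
        return format("exited with status %s", WEXITSTATUS(status));
    return "ended in an unknown state";
}

/// Simulates a fatal bug in a worker by raising SIGABRT.
void crash()
{
    abort();
}

void main()
{
    import std.stdio : writeln;

    writeln(runIsolated(&crash));         // the child dies, the parent keeps running
    writeln(runIsolated(function() { })); // a well-behaved child
}
```

The parent only learns that the child died and from which signal; any partial results the child produced would have to be passed back through some other channel, such as a pipe or a file.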