digitalmars.D.learn - core.thread.Fiber --- runtime stack overflow unlike goroutines
- Carl Sturtivant (20/20) Aug 14 2014 The default size of the runtime stack for a Fiber is 4*PAGESIZE
- Yuriy (7/28) Aug 14 2014 Infinite stack comes with a cost. Every function call first
- Sean Kelly (3/3) Aug 14 2014 On 64 bit, reserve a huge chunk of memory, set a SEGV handler and
- Dicebot (8/12) Aug 14 2014 I think using some sort of thread-local shared heap pool is
- Kagamin (4/8) Aug 15 2014 AFAIK, OS already provides this transparently: when you allocate
- Kagamin (1/7) Aug 15 2014
- Sean Kelly (3/10) Aug 15 2014 Oh handy, so there's basically no work to be done on Windows.
- Kagamin (3/5) Aug 15 2014 I heard, calloc behaves this way on linux (COW blank page mapped
- Kagamin (2/8) Aug 15 2014 Another abstraction for core.vmm.
- Sean Kelly (6/16) Aug 15 2014 It sounds like mmap (typically) works the same way on Linux, and
- Carl Sturtivant (4/8) Aug 15 2014 Very nice; the hardware VM machinery takes care of it and there's
- Brad Anderson (7/28) Aug 14 2014 Segmented stacks come with a cost. Rust abandoned them for
- Kagamin (3/7) Aug 15 2014 Pass a bigger stack size to the Fiber constructor?
- Dicebot (3/10) Aug 15 2014 Won't that kind of kill the purpose of Fiber as low-cost context
- Kagamin (2/5) Aug 15 2014 I didn't measure it.
- Sean Kelly (9/12) Aug 15 2014 As long as allocation speed is fast for large allocs (which I
- Sean Kelly (10/10) Aug 15 2014 At least on OSX, it appears that mapping memory is constant time
- Martin Nowak (14/23) Sep 21 2014 While pages are committed lazily it's not free to map more memory.
- Dicebot (4/17) Aug 15 2014 No, I was referring to the proposal to supply bigger stack size
- Sean Kelly (7/10) Aug 15 2014 I thought it did, but apparently the behavior of VirtualAlloc and
- Dicebot (2/14) Aug 15 2014 Guess that means some experimenting this weekend for me too :)
- Dicebot (22/37) Aug 22 2014 Looks like it indeed just magically works. I have used this
- Carl Sturtivant (4/16) Aug 15 2014 Just read this after posting earlier replies! Very exciting.
- Sean Kelly (4/15) Aug 15 2014 I'm sure it works the same, but reserving large chunks of memory
- Carl Sturtivant (3/10) Aug 15 2014 No good if the stack size needed depends dynamically on the
- Carl Sturtivant (4/15) Aug 15 2014 Should have read further down the thread --- you're right as the
- ketmar via Digitalmars-d-learn (5/8) Aug 15 2014 On Fri, 15 Aug 2014 20:19:18 +0000
- Kagamin (4/7) Aug 16 2014 You should only choose stack size carefully or keep data in
- ketmar via Digitalmars-d-learn (5/7) Aug 16 2014 On Sat, 16 Aug 2014 17:49:28 +0000
- Martin Nowak (4/6) Sep 21 2014 Simply choose a big enough stack size when creating your fibers.
The default size of the runtime stack for a Fiber is 4*PAGESIZE which is very small, and a quick test shows that a Fiber suffers a stack overflow that doesn't lead to a clean termination when this limit is exceeded. This makes it difficult to simulate deterministic alternation where the stack size needed is unpredictable because complex deterministic computations are going on inside Fibers. In contrast, the Go programming language's goroutines can extend their stacks as needed at runtime, and so can be used to simulate deterministic alternation without this limitation, and yet be initially executed with each having only a small stack size. There seems to be a claim that all that's needed to add D-routines (goroutines for D) is a scheduler and a Channel type, on top of Fiber. http://forum.dlang.org/thread/lphnen$1ml7$1 digitalmars.com See the initial post, point 7., as well as supporting remarks in later replies. Am I missing something? Is there a clean and simple way to get Fiber to no longer suffer a stack overflow when implementing D-routines?
Aug 14 2014
Infinite stack comes with a cost. Every function call first checks the state of the stack. This should be done with the help of compiler. And such a "tradeoff" could scare bare-metal programmers away from D. Though maybe there's a chance of making this stack check controllable, but not that i can think of. On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant wrote:The default size of the runtime stack for a Fiber is 4*PAGESIZE which is very small, and a quick test shows that a Fiber suffers a stack overflow that doesn't lead to a clean termination when this limit is exceeded. This makes it difficult to simulate deterministic alternation where the stack size needed is unpredictable because complex deterministic computations are going on inside Fibers. In contrast, the Go programming language's goroutines can extend their stacks as needed at runtime, and so can be used to simulate deterministic alternation without this limitation, and yet be initially executed with each having only a small stack size. There seems to be a claim that all that's needed to add D-routines (goroutines for D) is a scheduler and a Channel type, on top of Fiber. http://forum.dlang.org/thread/lphnen$1ml7$1 digitalmars.com See the initial post, point 7., as well as supporting remarks in later replies. Am I missing something? Is there a clean and simple way to get Fiber to no longer suffer a stack overflow when implementing D-routines?
Aug 14 2014
On 64 bit, reserve a huge chunk of memory, set a SEGV handler and commit more as needed. Basically how kernel thread stacks work. I've been meaning to do this but haven't gotten around to it yet.
Aug 14 2014
On Thursday, 14 August 2014 at 18:52:00 UTC, Sean Kelly wrote:On 64 bit, reserve a huge chunk of memory, set a SEGV handler and commit more as needed. Basically how kernel thread stacks work. I've been meaning to do this but haven't gotten around to it yet.I think using some sort of thread-local shared heap pool is better approach in general as it does need any SEGV handling overhead and simplifies fiber implementation (all fibers are guaranteed to take fixed amount of memory). It is not a silver bullet and forces you to think much more about application memory layout but I believe this is much better approach for high performance services than segmented stack like in Go.
Aug 14 2014
On Thursday, 14 August 2014 at 18:52:00 UTC, Sean Kelly wrote:On 64 bit, reserve a huge chunk of memory, set a SEGV handler and commit more as needed. Basically how kernel thread stacks work. I've been meaning to do this but haven't gotten around to it yet.AFAIK, OS already provides this transparently: when you allocate memory, OS only reserves the memory range and commits pages when they are accessed.
Aug 15 2014
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspxAllocates memory charges (from the overall size of memory and the paging files on disk) for the specified reserved memory pages. The function also guarantees that when the caller later initially accesses the memory, the contents will be zero. Actual physical pages are not allocated unless/until the virtual addresses are actually accessed.
Aug 15 2014
On Friday, 15 August 2014 at 08:36:34 UTC, Kagamin wrote:http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspxOh handy, so there's basically no work to be done on Windows. I'll have to check the behavior of mmap on Posix.Allocates memory charges (from the overall size of memory and the paging files on disk) for the specified reserved memory pages. The function also guarantees that when the caller later initially accesses the memory, the contents will be zero. Actual physical pages are not allocated unless/until the virtual addresses are actually accessed.
Aug 15 2014
On Friday, 15 August 2014 at 14:26:28 UTC, Sean Kelly wrote:Oh handy, so there's basically no work to be done on Windows. I'll have to check the behavior of mmap on Posix.I heard, calloc behaves this way on linux (COW blank page mapped to the entire range), it was discussed here some time ago.
Aug 15 2014
On Friday, 15 August 2014 at 14:28:34 UTC, Kagamin wrote:On Friday, 15 August 2014 at 14:26:28 UTC, Sean Kelly wrote:Another abstraction for core.vmm.Oh handy, so there's basically no work to be done on Windows. I'll have to check the behavior of mmap on Posix.I heard, calloc behaves this way on linux (COW blank page mapped to the entire range), it was discussed here some time ago.
Aug 15 2014
On Friday, 15 August 2014 at 14:26:28 UTC, Sean Kelly wrote:On Friday, 15 August 2014 at 08:36:34 UTC, Kagamin wrote:It sounds like mmap (typically) works the same way on Linux, and that malloc generally does as well. I'll have to test this to be sure. If so, and if doing so is fast, I'm going to increase the default stack size on 64-bit systems to something reasonably large. And here I thought I would have to do this all manually.http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspxOh handy, so there's basically no work to be done on Windows. I'll have to check the behavior of mmap on Posix.Allocates memory charges (from the overall size of memory and the paging files on disk) for the specified reserved memory pages. The function also guarantees that when the caller later initially accesses the memory, the contents will be zero. Actual physical pages are not allocated unless/until the virtual addresses are actually accessed.
Aug 15 2014
On Thursday, 14 August 2014 at 18:52:00 UTC, Sean Kelly wrote:On 64 bit, reserve a huge chunk of memory, set a SEGV handler and commit more as needed. Basically how kernel thread stacks work. I've been meaning to do this but haven't gotten around to it yet.Very nice; the hardware VM machinery takes care of it and there's only overhead when overflow occurs. D's Fibers will really be something remarkable for user-space with this facility.
Aug 15 2014
On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant wrote:The default size of the runtime stack for a Fiber is 4*PAGESIZE which is very small, and a quick test shows that a Fiber suffers a stack overflow that doesn't lead to a clean termination when this limit is exceeded. This makes it difficult to simulate deterministic alternation where the stack size needed is unpredictable because complex deterministic computations are going on inside Fibers. In contrast, the Go programming language's goroutines can extend their stacks as needed at runtime, and so can be used to simulate deterministic alternation without this limitation, and yet be initially executed with each having only a small stack size. There seems to be a claim that all that's needed to add D-routines (goroutines for D) is a scheduler and a Channel type, on top of Fiber. http://forum.dlang.org/thread/lphnen$1ml7$1 digitalmars.com See the initial post, point 7., as well as supporting remarks in later replies. Am I missing something? Is there a clean and simple way to get Fiber to no longer suffer a stack overflow when implementing D-routines?Segmented stacks come with a cost. Rust abandoned them for reasons you can read about here: https://mail.mozilla.org/pipermail/rust-dev/2013-November/006314.html I believe Go has taken steps to help mitigate stack thrashing but I don't know if they have been successful yet.
Aug 14 2014
On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant wrote:The default size of the runtime stack for a Fiber is 4*PAGESIZE which is very small, and a quick test shows that a Fiber suffers a stack overflow that doesn't lead to a clean termination when this limit is exceeded.Pass a bigger stack size to the Fiber constructor?
Aug 15 2014
On Friday, 15 August 2014 at 08:41:30 UTC, Kagamin wrote:On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant wrote:Won't that kind of kill the purpose of Fiber as low-cost context abstraction? Stack size does add up for thousands of fibers.The default size of the runtime stack for a Fiber is 4*PAGESIZE which is very small, and a quick test shows that a Fiber suffers a stack overflow that doesn't lead to a clean termination when this limit is exceeded.Pass a bigger stack size to the Fiber constructor?
Aug 15 2014
On Friday, 15 August 2014 at 14:28:34 UTC, Dicebot wrote:Won't that kind of kill the purpose of Fiber as low-cost context abstraction? Stack size does add up for thousands of fibers.I didn't measure it.
Aug 15 2014
On Friday, 15 August 2014 at 14:28:34 UTC, Dicebot wrote:Won't that kind of kill the purpose of Fiber as low-cost context abstraction? Stack size does add up for thousands of fibers.As long as allocation speed is fast for large allocs (which I have to test), I want to change the default size to be very large. The virtual address space in a 64-bit app is enormous, and if the memory is committed on demand then physical memory use should only match what the user actually requires. This should allow us to create millions of fibers and not overrun system memory, and also not worry about stack overruns, which has always been a concern with the default fiber stack size.
Aug 15 2014
At least on OSX, it appears that mapping memory is constant time regardless of size, but there is some max total memory I'm allowed to map, presumably based on the size of a vmm lookup tabe. The max block size I can allocate is 1 GB, and I can allocate roughly 131,000 of these blocks before getting an out of memory error. If I reduce the block size to 4 MB I can allocate more than 10M blocks without error. I think some default stack size around 4 MB seems about right. Increasing the size to 16 MB failed after about 800,000 allocations, which isn't enough (potential) fibers.
Aug 15 2014
On 08/15/2014 05:09 PM, Sean Kelly wrote:At least on OSX, it appears that mapping memory is constant time regardless of size, but there is some max total memory I'm allowed to map, presumably based on the size of a vmm lookup tabe. The max block size I can allocate is 1 GB, and I can allocate roughly 131,000 of these blocks before getting an out of memory error. If I reduce the block size to 4 MB I can allocate more than 10M blocks without error. I think some default stack size around 4 MB seems about right. Increasing the size to 16 MB failed after about 800,000 allocations, which isn't enough (potential) fibers.While pages are committed lazily it's not free to map more memory. For example the kernel will have to allocate and setup more page table entries. Furthermore we might need to use MAP_NORESERVE to not run into overcommit limitations, but that's not available on all OSes I think. It might make sense to raise the stack size again [1] if we had an idea what size would avoid common issues (vibe.d uses 64kB). 4MB is too much for the default size IMO. Given that the stack size is configurable let's instead try to improve the diagnostic of Fiber stack overflows first, we should also map a guard page on posix and add a segfault handler to detect overflows. [1]: https://github.com/D-Programming-Language/druntime/commit/9dca50fe65402fb3cdfbb689f1aca58dc835dce4#diff-8bb12ed976acf0a5132e877ec5a01ea8R3163
Sep 21 2014
On Friday, 15 August 2014 at 14:45:02 UTC, Sean Kelly wrote:On Friday, 15 August 2014 at 14:28:34 UTC, Dicebot wrote:No, I was referring to the proposal to supply bigger stack size to Fiber constructor - AFAIR it currently does allocate that memory eagerly (and does not use any OS CoW tools), doesn't it?Won't that kind of kill the purpose of Fiber as low-cost context abstraction? Stack size does add up for thousands of fibers.As long as allocation speed is fast for large allocs (which I have to test), I want to change the default size to be very large. The virtual address space in a 64-bit app is enormous, and if the memory is committed on demand then physical memory use should only match what the user actually requires. This should allow us to create millions of fibers and not overrun system memory, and also not worry about stack overruns, which has always been a concern with the default fiber stack size.
Aug 15 2014
On Friday, 15 August 2014 at 15:25:23 UTC, Dicebot wrote:No, I was referring to the proposal to supply bigger stack size to Fiber constructor - AFAIR it currently does allocate that memory eagerly (and does not use any OS CoW tools), doesn't it?I thought it did, but apparently the behavior of VirtualAlloc and mmap (which Fiber uses to allocate the stack) simply reserves the range and then commits it lazily, even though what you've told it to do is allocate the memory. This is really great news since it means that no code changes will be required to do the thing I wanted to do anyway.
Aug 15 2014
On Friday, 15 August 2014 at 15:40:35 UTC, Sean Kelly wrote:On Friday, 15 August 2014 at 15:25:23 UTC, Dicebot wrote:Guess that means some experimenting this weekend for me too :)No, I was referring to the proposal to supply bigger stack size to Fiber constructor - AFAIR it currently does allocate that memory eagerly (and does not use any OS CoW tools), doesn't it?I thought it did, but apparently the behavior of VirtualAlloc and mmap (which Fiber uses to allocate the stack) simply reserves the range and then commits it lazily, even though what you've told it to do is allocate the memory. This is really great news since it means that no code changes will be required to do the thing I wanted to do anyway.
Aug 15 2014
On Friday, 15 August 2014 at 15:50:30 UTC, Dicebot wrote:On Friday, 15 August 2014 at 15:40:35 UTC, Sean Kelly wrote:Looks like it indeed just magically works. I have used this simple program: import core.thread; void foo() { // int[10240] array; Fiber.yield(); for (;;) {} } void main() { foreach (_ ; 0 .. 10_000) { //auto fiber = new Fiber(&foo); auto fiber = new Fiber(&foo, 100_000); fiber.call(); } } And checked peak memory profile via `time --format="%M"`. Fiber constructor argument size did not change memory consumption at all, only changing size of fiber stack variables did.On Friday, 15 August 2014 at 15:25:23 UTC, Dicebot wrote:Guess that means some experimenting this weekend for me too :)No, I was referring to the proposal to supply bigger stack size to Fiber constructor - AFAIR it currently does allocate that memory eagerly (and does not use any OS CoW tools), doesn't it?I thought it did, but apparently the behavior of VirtualAlloc and mmap (which Fiber uses to allocate the stack) simply reserves the range and then commits it lazily, even though what you've told it to do is allocate the memory. This is really great news since it means that no code changes will be required to do the thing I wanted to do anyway.
Aug 22 2014
On Friday, 15 August 2014 at 15:40:35 UTC, Sean Kelly wrote:On Friday, 15 August 2014 at 15:25:23 UTC, Dicebot wrote:Just read this after posting earlier replies! Very exciting. I'll be doing some experiments to see how this works out. What about at 32-bits?No, I was referring to the proposal to supply bigger stack size to Fiber constructor - AFAIR it currently does allocate that memory eagerly (and does not use any OS CoW tools), doesn't it?I thought it did, but apparently the behavior of VirtualAlloc and mmap (which Fiber uses to allocate the stack) simply reserves the range and then commits it lazily, even though what you've told it to do is allocate the memory. This is really great news since it means that no code changes will be required to do the thing I wanted to do anyway.
Aug 15 2014
On Friday, 15 August 2014 at 20:17:51 UTC, Carl Sturtivant wrote:On Friday, 15 August 2014 at 15:40:35 UTC, Sean Kelly wrote:I'm sure it works the same, but reserving large chunks of memory there would eat up the address space. I think the default will have to remain some reasonably low number on 32-bit.I thought it did, but apparently the behavior of VirtualAlloc and mmap (which Fiber uses to allocate the stack) simply reserves the range and then commits it lazily, even though what you've told it to do is allocate the memory. This is really great news since it means that no code changes will be required to do the thing I wanted to do anyway.Just read this after posting earlier replies! Very exciting. I'll be doing some experiments to see how this works out. What about at 32-bits?
Aug 15 2014
On Friday, 15 August 2014 at 08:41:30 UTC, Kagamin wrote:On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant wrote:No good if the stack size needed depends dynamically on the computation in that Fiber.The default size of the runtime stack for a Fiber is 4*PAGESIZE which is very small, and a quick test shows that a Fiber suffers a stack overflow that doesn't lead to a clean termination when this limit is exceeded.Pass a bigger stack size to the Fiber constructor?
Aug 15 2014
On Friday, 15 August 2014 at 20:11:43 UTC, Carl Sturtivant wrote:On Friday, 15 August 2014 at 08:41:30 UTC, Kagamin wrote:Should have read further down the thread --- you're right as the memory is in effect merely reserved virtual memory and isn't actually allocated.On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant wrote:No good if the stack size needed depends dynamically on the computation in that Fiber.The default size of the runtime stack for a Fiber is 4*PAGESIZE which is very small, and a quick test shows that a Fiber suffers a stack overflow that doesn't lead to a clean termination when this limit is exceeded.Pass a bigger stack size to the Fiber constructor?
Aug 15 2014
On Fri, 15 Aug 2014 20:19:18 +0000 Carl Sturtivant via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:Should have read further down the thread --- you're right as the=20 memory is in effect merely reserved virtual memory and isn't=20 actually allocated.and we -- 32-bit addicts -- will run out of address space while 64-bit happy people will laugh looking at us. ;-)
Aug 15 2014
On Friday, 15 August 2014 at 22:26:54 UTC, ketmar via Digitalmars-d-learn wrote:and we -- 32-bit addicts -- will run out of address space while 64-bit happy people will laugh looking at us. ;-)You should only choose stack size carefully or keep data in TempAlloc instead of stack.
Aug 16 2014
On Sat, 16 Aug 2014 17:49:28 +0000 Kagamin via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:You should only choose stack size carefully or keep data in=20 TempAlloc instead of stack.but i can choose stack size carefully without such 'address reserving' feature. ;-)
Aug 16 2014
Am I missing something? Is there a clean and simple way to get Fiber to no longer suffer a stack overflow when implementing D-routines?Simply choose a big enough stack size when creating your fibers. http://dlang.org/library/core/thread/Fiber.this.html It's fairly cheap to use a really big stack like 4MB, because memory pages are committed lazily by the OS.
Sep 21 2014