digitalmars.D - std.concurrency and fibers
- Alex Rønne Petersen (33/33) Oct 04 2012 Hi,
- Timon Gehr (2/31) Oct 04 2012 +1, but what about TLS?
- Alex Rønne Petersen (12/51) Oct 04 2012 I think that no matter what we do, we have to simply say "don't do that"...
- Timon Gehr (6/54) Oct 04 2012 If it is not seamless, we have failed. IMO the runtime should expose an
- Alex Rønne Petersen (17/76) Oct 04 2012 I suppose it could be done.
- Dmitry Olshansky (9/26) Oct 04 2012 Agreed.
- Sean Kelly (7/10) Oct 04 2012 This is another reason I've been delaying using fibers. The correct
- Alex Rønne Petersen (8/13) Oct 04 2012 I think we'd need compiler support to be able to do it in a reasonable
- Dmitry Olshansky (13/42) Oct 04 2012 Cool, but currently it's a leaky abstraction. For instance if task is
- Alex Rønne Petersen (35/82) Oct 04 2012 Yeah, it's a problem all right. But we'll need compiler support for this...
- Dmitry Olshansky (30/73) Oct 05 2012 This just doesn't work though.
- Johannes Pfau (12/18) Oct 05 2012 We should probably do some analysis on the phobos source code to see if
- Jonathan M Davis (11/12) Oct 04 2012 std.concurrency is supposed to be designed such that it can be used for
- Sean Kelly (20/25) Oct 04 2012 encourage people to use it instead of OS threads, which is great.
- Alex Rønne Petersen (9/16) Oct 04 2012 Mostly in that everything operates on Tids (as opposed to some opaque
- Sean Kelly (27/39) Oct 05 2012 encourage people to use it instead of OS threads, which is great. However, ...
- deadalnix (2/31) Oct 04 2012 Something I wonder for a while : why not run everything in fibers ?
- Alex Rønne Petersen (9/48) Oct 04 2012 Because then we definitely need dynamic stack growth wired into both the...
Hi,

We currently have std.concurrency as a message-passing mechanism. We encourage people to use it instead of OS threads, which is great. However, what is *not* great is that spawned tasks correspond 1:1 to OS threads. This is not even remotely scalable for Erlang-style concurrency. There's a fairly simple way to fix that: Fibers.

The only problem with adding fiber support to std.concurrency is that the interface is just not flexible enough. The current interface is completely and entirely tied to the notion of threads (contrary to what its module description says).

Now, I see a number of ways we can fix this:

A) We completely get rid of the notion of threads and instead simply speak of 'tasks'. This trivially allows us to use threads, fibers, whatever to back the module. I personally think this is the best way to build a message-passing abstraction because it gives enough transparency to *actually* distribute tasks across machines without things breaking.

B) We make the module capable of backing tasks with both threads and fibers, and expose an interface that allows the user to choose what kind of task is spawned. I'm *not* convinced this is a good approach because it's extremely error-prone (imagine doing a thread-based receive inside a fiber-based task!).

C) We just swap out threads with fibers and document that the module uses fibers. See my comments in A for why I'm not sure this is a good idea.

All of these are going to break code in one way or another - that's unavoidable. But we really need to make std.concurrency grow up; other languages (Erlang, Rust, Go, ...) have had micro-threads (in some form) for years, and if we want D to be seriously usable for large-scale concurrency, we need to have them too.

Thoughts? Other ideas?

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
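For readers unfamiliar with the API under discussion, here is a minimal sketch of the message-passing style std.concurrency offers. It only uses `spawn`, `send`, `receive`, `receiveOnly` and `ownerTid` from Phobos; the function names `doubler` and `sumOfDoubles` are made up for illustration. With the thread-backed implementation described in the post above, every `spawn` call here creates a separate OS thread, which is exactly the 1:1 mapping being criticized.

```d
import std.concurrency : spawn, send, receive, receiveOnly, ownerTid, Tid;
import std.stdio : writeln;

// Worker task: waits for one int and replies to its owner with the double.
void doubler()
{
    receive((int n) { send(ownerTid, n * 2); });
}

// Spawns `count` workers, sends each one a number, and sums the replies.
// (Illustrative helper, not part of Phobos.)
int sumOfDoubles(int count)
{
    foreach (i; 0 .. count)
    {
        Tid worker = spawn(&doubler); // one new task (an OS thread here) per call
        send(worker, i);
    }

    int total = 0;
    foreach (i; 0 .. count)
        total += receiveOnly!int(); // replies may arrive in any order
    return total;
}

void main()
{
    writeln(sumOfDoubles(4)); // 2 * (0 + 1 + 2 + 3) = 12
}
```

Note that nothing in the calling code mentions threads directly except the name `Tid`, which is part of what the rest of the thread argues about.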
On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
> [snip]
> Thoughts? Other ideas?

+1, but what about TLS?
Oct 04 2012
On 04-10-2012 14:11, Timon Gehr wrote:
> On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
>> [snip]
>> Thoughts? Other ideas?
>
> +1, but what about TLS?

I think that no matter what we do, we have to simply say "don't do that" to thread-local state (it would break in distributed scenarios too, for instance).

Instead, I think we should do what the Rust folks did: Use *task*-local state and leave it up to std.concurrency to figure out how to deal with it. It won't be as 'seamless' as TLS variables in D of course, but I think it's good enough in practice.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
> On 04-10-2012 14:11, Timon Gehr wrote:
>> On 10/04/2012 01:32 PM, Alex Rønne Petersen wrote:
>>> [snip]
>>
>> +1, but what about TLS?
>
> I think that no matter what we do, we have to simply say "don't do that" to thread-local state (it would break in distributed scenarios too, for instance).
>
> Instead, I think we should do what the Rust folks did: Use *task*-local state and leave it up to std.concurrency to figure out how to deal with it. It won't be as 'seamless' as TLS variables in D of course, but I think it's good enough in practice.

If it is not seamless, we have failed. IMO the runtime should expose an interface for allocating TLS, switching between TLS instances and destroying TLS.

What about the stack? Allocating a fixed-size stack per task is costly and Walter opposes dynamic stack growth.
Oct 04 2012
On 04-10-2012 14:48, Timon Gehr wrote:
> On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
>> [snip]
>> It won't be as 'seamless' as TLS variables in D of course, but I think it's good enough in practice.
>
> If it is not seamless, we have failed. IMO the runtime should expose an interface for allocating TLS, switching between TLS instances and destroying TLS.

I suppose it could be done. But keep in mind the side effects of an approach like this: some thread-local variables (for instance, think 'chunk' inside emplace) would break (or at least behave very weirdly) if you switch the *entire* TLS context when entering a task.

Sure, we could use the runtime interface for TLS switching only for task-local state, but then we're back to square one with it not being seamless.

> What about the stack? Allocating a fixed-size stack per task is costly and Walter opposes dynamic stack growth.

Yeah, I never understood why. It's essential for functional-style code running in constrained tasks. It's not just about conserving memory; it's to make recursion feasible.

In any case, fibers currently allocate PAGE_SIZE * 4 bytes for stacks.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
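To make the fixed-size stack point concrete, here is a small sketch using druntime's `core.thread.Fiber`, whose constructor takes an optional stack size in bytes. The 64 KiB figure and the names `worker`, `trace` and `runWorker` are arbitrary choices for the illustration, not recommendations. The stack is allocated up front when the fiber is created and does not grow afterwards.

```d
import core.thread : Fiber;

int[] trace; // module-level variable, hence thread-local by default in D

// Fiber body: records a step, yields to the caller, then records another.
void worker()
{
    trace ~= 1;
    Fiber.yield(); // cooperative switch back to whoever called call()
    trace ~= 2;
}

// Creates a fiber with an explicit, fixed stack size and drives it to completion.
int[] runWorker(size_t stackBytes)
{
    trace = null;
    auto fib = new Fiber(&worker, stackBytes);

    fib.call();  // runs worker() up to the yield
    trace ~= 10; // the caller runs in between
    fib.call();  // resumes after the yield; worker() returns

    assert(fib.state == Fiber.State.TERM);
    return trace;
}

void main()
{
    import std.stdio : writeln;
    writeln(runWorker(64 * 1024)); // [1, 10, 2]
}
```

Deep recursion inside `worker` would overflow whatever size was chosen at construction, which is why dynamic stack growth keeps coming up in this thread.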
On 04-Oct-12 16:48, Timon Gehr wrote:
> On 10/04/2012 02:22 PM, Alex Rønne Petersen wrote:
>> On 04-10-2012 14:11, Timon Gehr wrote:
[snip]
>> I think that no matter what we do, we have to simply say "don't do that" to thread-local state (it would break in distributed scenarios too, for instance).
>>
>> Instead, I think we should do what the Rust folks did: Use *task*-local state and leave it up to std.concurrency to figure out how to deal with it. It won't be as 'seamless' as TLS variables in D of course, but I think it's good enough in practice.
>
> If it is not seamless, we have failed. IMO the runtime should expose an interface for allocating TLS, switching between TLS instances and destroying TLS.

Agreed.

> What about the stack? Allocating a fixed-size stack per task is costly and Walter opposes dynamic stack growth.

Allocating a fixed-size stack is costly only in terms of virtual address space, and running out of address space is a concern on 32-bit systems only. On 64-bit you may as well allocate 1 GB per task; memory will only get committed if it's actually used.

--
Dmitry Olshansky
Oct 04 2012
On Oct 4, 2012, at 5:48 AM, Timon Gehr <timon.gehr gmx.ch> wrote:
> What about the stack? Allocating a fixed-size stack per task is costly and Walter opposes dynamic stack growth.

This is another reason I've been delaying using fibers. The correct approach is probably to go the distance by reserving a large block, committing only a portion, and committing the rest dynamically as needed. The current fiber implementation does have a guard page in some cases, but doesn't go so far as to reserve/commit portions of a larger stack space.
Oct 04 2012
On 05-10-2012 01:34, Sean Kelly wrote:
> On Oct 4, 2012, at 5:48 AM, Timon Gehr <timon.gehr gmx.ch> wrote:
>> What about the stack? Allocating a fixed-size stack per task is costly and Walter opposes dynamic stack growth.
>
> This is another reason I've been delaying using fibers. The correct approach is probably to go the distance by reserving a large block, committing only a portion, and commit the rest dynamically as needed. The current fiber implementation does have a guard page in some cases, but doesn't go so far as to reserve/commit portions of a larger stack space.

I think we'd need compiler support to be able to do it in a reasonable way at all. Doing it via OS virtual memory hacks seems like a bad idea to me.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
On 04-Oct-12 15:32, Alex Rønne Petersen wrote:
> [snip]
> A) We completely get rid of the notion of threads and instead simply speak of 'tasks'. This trivially allows us to use threads, fibers, whatever to back the module. I personally think this is the best way to build a message-passing abstraction because it gives enough transparency to *actually* distribute tasks across machines without things breaking.

Cool, but currently it's a leaky abstraction. For instance, if a task is implemented with fibers, static variables will be shared among threads. Essentially, I think fibers need TLS (or rather FLS) synced with the language's 'static' keyword. Otherwise the whole TLS-by-default is a useless chunk of machinery.

> B) We make the module capable of backing tasks with both threads and fibers, and expose an interface that allows the user to choose what kind of task is spawned. I'm *not* convinced this is a good approach because it's extremely error-prone (imagine doing a thread-based receive inside a fiber-based task!).

Bleh.

> C) We just swap out threads with fibers and document that the module uses fibers. See my comments in A for why I'm not sure this is a good idea.

Seems a lot like A but with a task defined to be a fiber. I'd prefer this. However, it then needs a user-defined policy for distributing fibers across real threads (pools). Btw, A is full of this too.

> All of these are going to break code in one way or another - that's unavoidable. But we really need to make std.concurrency grow up; other languages (Erlang, Rust, Go, ...) have had micro-threads (in some form) for years, and if we want D to be seriously usable for large-scale concurrency, we need to have them too.
>
> Thoughts? Other ideas?

+1

--
Dmitry Olshansky
Oct 04 2012
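The leak described in the post above can be reproduced with a few lines of druntime code. This is only a sketch: `counter`, `bump` and `runTwoFibers` are illustrative names. A module-level variable in D is thread-local, so two fibers running on the same thread both see the same copy, whereas two std.concurrency threads would each get their own.

```d
import core.thread : Fiber;

int counter; // thread-local: one copy per OS thread, NOT one per fiber

// Each fiber increments the counter twice, yielding in between.
void bump()
{
    ++counter;
    Fiber.yield();
    ++counter;
}

// Runs two fibers interleaved on the calling thread and returns the counter.
int runTwoFibers()
{
    counter = 0;
    auto a = new Fiber(&bump);
    auto b = new Fiber(&bump);

    a.call(); b.call(); // first increment of each, then both yield
    a.call(); b.call(); // second increment of each, then both finish

    return counter;
}

void main()
{
    // Both fibers wrote to the same variable, so the result is 4.
    // With per-task ("fiber-local") statics each task would only ever see 2.
    assert(runTwoFibers() == 4);
}
```

Code that was written assuming "static means private to my task" therefore changes behavior silently once tasks are backed by fibers.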
On 04-10-2012 22:04, Dmitry Olshansky wrote:
> On 04-Oct-12 15:32, Alex Rønne Petersen wrote:
>> [snip]
>> A) We completely get rid of the notion of threads and instead simply speak of 'tasks'. [snip]
>
> Cool, but currently it's a leaky abstraction. For instance if task is implemented with fibers static variables will be shared among threads. Essentially I think Fibers need TLS (or rather FLS) synced with language 'static' keyword. Otherwise the whole TLS by default is a useless chunk of machinery.

Yeah, it's a problem all right. But we'll need compiler support for this stuff in any case.

Can't help but wonder if it's really worth it. It seems to me like a simple AA-like API based on the typeid of data would be better -- as in, much more generic -- than trying to teach the compiler and runtime how to deal with this stuff. Think something like this:

    struct Data
    {
        int foo;
        float bar;
    }

    void myTask()
    {
        auto data = Data(42, 42.42f);
        TaskStore.save(data);

        // work ...
        foo();
        // work ...
    }

    void foo()
    {
        auto data = TaskStore.load!Data();
        // work ...
    }

I admit, not as seamless as static variables, but a hell of a lot less magical.

>> B) [snip]
>
> Bleh.
>
>> C) We just swap out threads with fibers and document that the module uses fibers. See my comments in A for why I'm not sure this is a good idea.
>
> Seems a lot like A but with task defined to be a fiber. I'd prefer this. However then it needs a user-defined policy for distributing fibers across real threads (pools). Btw A is full of this too.

By choosing C we effectively give up any hope of distributed tasks and especially if we have a scheduler API. Is that really a good idea in this day and age?

>> Thoughts? Other ideas?
>
> +1

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
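The `TaskStore` in the post above is only an interface idea; no such type exists in Phobos or druntime. One way it could be backed, purely as a sketch, is to let each task be a `Fiber` subclass that carries an associative array keyed by `TypeInfo`. Everything below except `core.thread.Fiber` and `Fiber.getThis` is hypothetical, including the `TaskFiber` class and the `seen` variable used to report the result.

```d
import core.thread : Fiber;

// Hypothetical task type: a fiber that owns its own type-indexed storage.
class TaskFiber : Fiber
{
    void*[TypeInfo] slots; // one slot per stored type

    this(void function() fn) { super(fn); }
}

// Hypothetical TaskStore matching the save/load calls sketched in the post.
struct TaskStore
{
    // Finds the task that is currently executing on this thread.
    private static TaskFiber current()
    {
        auto task = cast(TaskFiber) Fiber.getThis();
        assert(task !is null, "TaskStore used outside a TaskFiber");
        return task;
    }

    // Stores a heap copy of `value` in the current task, keyed by its type.
    static void save(T)(T value)
    {
        auto box = new T[1];
        box[0] = value;
        current().slots[typeid(T)] = box.ptr;
    }

    // Returns the value of type T previously saved by the current task.
    static T load(T)()
    {
        return *cast(T*) current().slots[typeid(T)];
    }
}

struct Data { int foo; float bar; }

int seen; // thread-local scratch, only used to hand the result back to main

void myTask()
{
    TaskStore.save(Data(42, 42.42f));
    foo();
}

void foo()
{
    auto data = TaskStore.load!Data();
    seen = data.foo;
}

void main()
{
    auto task = new TaskFiber(&myTask);
    task.call();
    assert(seen == 42);
}
```

The design keeps all state on the task object, so it would move with the fiber if a scheduler migrated it to another thread. The cost, as the replies point out, is that existing library code using plain statics gets none of this for free.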
On 05-Oct-12 08:27, Alex Rønne Petersen wrote:
> On 04-10-2012 22:04, Dmitry Olshansky wrote:
>> Cool, but currently it's a leaky abstraction. For instance if task is implemented with fibers static variables will be shared among threads. Essentially I think Fibers need TLS (or rather FLS) synced with language 'static' keyword. Otherwise the whole TLS by default is a useless chunk of machinery.
>
> Yeah, it's a problem all right. But we'll need compiler support for this stuff in any case.
>
> Can't help but wonder if it's really worth it. It seems to me like a simple AA-like API based on the typeid of data would be better -- as in, much more generic than trying to teach the compiler and runtime how to deal with this stuff. Think something like this:
>
>     struct Data
>     {
>         int foo;
>         float bar;
>     }
>
>     void myTask()
>     {
>         auto data = Data(42, 42.42f);
>         TaskStore.save(data);
>
>         // work ...
>         foo();
>         // work ...
>     }
>
>     void foo()
>     {
>         auto data = TaskStore.load!Data();
>         // work ...
>     }
>
> I admit, not as seamless as static variables, but a hell of a lot less magical.

This just doesn't work though. The true problem is not in the code that you, as a programmer doing distributed stuff, write. It's library writers who typically use TLS for some persistent state inside a module, and D currently makes that easy and transparent, just like in the old non-MT days, but for threads ONLY. Now having them all pack their stuff and go about fixing globals to TaskStore.save/.load is not realistic and downright horrible.

Currently I suspect that, w.r.t. fibers, all that works is based on conventions & luck. One problem with making everything FLS is that the cost becomes darn high. On the other hand, fibers are yielded only manually (plus by the scheduler now? probably on receive/send etc.) and a lot of things can be "fiber-safe" as is.

Also, it seems that for this to work we need not only a scheduler but reworked libraries that are fiber-aware (so they don't block on I/O etc.). See e.g. vibe.d.

>>> C) We just swap out threads with fibers and document that the module uses fibers. See my comments in A for why I'm not sure this is a good idea.
>>
>> Seems a lot like A but with task defined to be a fiber. I'd prefer this. However then it needs a user-defined policy for distributing fibers across real threads (pools). Btw A is full of this too.
>
> By choosing C we effectively give up any hope of distributed tasks and especially if we have a scheduler API. Is that really a good idea in this day and age?

Why? Remote fibers should work for distributed tasks. Like I said, just make Fiber == task. As long as there is a suitable protocol for communication, it's all right.

I'm insisting on fiber as a task as this makes for simpler message-passing logic. And a scheduler is still inevitable, as fibers wait for messages and are multiplexed on only so many threads. I just don't see any other abstraction you'd want to put in place of a task. It should be a self-contained, persistent 'worker' so that message passing works transparently.

--
Dmitry Olshansky
Oct 05 2012
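The scheduler mentioned in the post above can be pictured with a deliberately tiny round-robin loop over druntime fibers. This is an illustration only: `runAll`, `taskA`, `taskB`, `events` and `demo` are made-up names, everything runs on a single thread, and a real scheduler would resume a fiber when a message arrives for it rather than blindly taking turns.

```d
import core.thread : Fiber;

string events; // thread-local; all fibers below run on the calling thread

void taskA() { events ~= "a1 "; Fiber.yield(); events ~= "a2 "; }
void taskB() { events ~= "b1 "; Fiber.yield(); events ~= "b2 "; }

// Round-robin: resume every unfinished fiber in turn until all have terminated.
void runAll(Fiber[] tasks)
{
    bool anyAlive = true;
    while (anyAlive)
    {
        anyAlive = false;
        foreach (t; tasks)
        {
            if (t.state == Fiber.State.TERM)
                continue; // finished tasks are skipped
            t.call();     // run until the task yields or returns
            anyAlive = true;
        }
    }
}

// Runs both tasks and returns the order in which their steps executed.
string demo()
{
    events = "";
    runAll([new Fiber(&taskA), new Fiber(&taskB)]);
    return events;
}

void main()
{
    assert(demo() == "a1 b1 a2 b2 ");
}
```

Because switches happen only at explicit yields, a task that blocks in an ordinary I/O call stalls every other task on the same thread, which is the reason fiber-aware libraries are needed.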
On Fri, 05 Oct 2012 12:58:18 +0400, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:
> The true problem is not in the code you as a programmer doing distributed stuff do. It's library writers that typically use TLS for some persistent state inside module and D currently makes it easy and transparent just like in the old non-MT days but for threads ONLY.

We should probably do some analysis of the Phobos source code to see if this really is the case. I thought TLS was mainly used to avoid threading issues, which works for fibers too. Things like the thread-local RNG variables work fine with the usual TLS, and even if the fiber is passed between different threads, this still works well.

I think we'd only have problems with APIs that leave TLS variables in an inconsistent state between calls to functions. But I always thought such behavior doesn't fit TLS variables well and should be abstracted into a struct with member variables as state. In the end, isn't 'global TLS' state just as bad as global state in C, and shouldn't it be avoided?
Oct 05 2012
On Thursday, October 04, 2012 13:32:01 Alex Rønne Petersen wrote:
> Thoughts? Other ideas?

std.concurrency is supposed to be designed such that it can be used for more than just threads (e.g. sending messages across the network), so if it needs to be adjusted to accommodate that, then we should do so, but we need to be careful to do it in a way that minimizes code breakage as much as reasonably possible.

- Jonathan M Davis
Oct 04 2012
On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:
> Hi,
>
> We currently have std.concurrency as a message-passing mechanism. We encourage people to use it instead of OS threads, which is great. However, what is *not* great is that spawned tasks correspond 1:1 to OS threads. This is not even remotely scalable for Erlang-style concurrency. There's a fairly simple way to fix that: Fibers.
>
> The only problem with adding fiber support to std.concurrency is that the interface is just not flexible enough. The current interface is completely and entirely tied to the notion of threads (contrary to what its module description says).

How is the interface tied to the notion of threads? I had hoped to design it with the underlying concurrency mechanism completely abstracted. The most significant reason that fibers aren't used behind the scenes today is because the default storage class of static data is thread-local, and this would really have to be made fiber-local. I'm reasonably certain this could be done and have considered going so far as to make the main thread in D a fiber, but the implementation is definitely non-trivial and will probably be slower than the built-in TLS mechanism as well.

So consider the current std.concurrency implementation to be a prototype. I'd also like to add interprocess messaging, but that will be another big task.
Oct 04 2012
On 05-10-2012 01:30, Sean Kelly wrote:
> On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:
>> [snip]
>> The only problem with adding fiber support to std.concurrency is that the interface is just not flexible enough. The current interface is completely and entirely tied to the notion of threads (contrary to what its module description says).
>
> How is the interface tied to the notion of threads? I had hoped to design it with the underlying concurrency mechanism completely abstracted. The most significant reason that fibers aren't used behind the scenes today is because the default storage class of static data is thread-local, and this would really have to be made fiber-local. I'm reasonably certain this could be done and have considered going so far as to make the main thread in D a fiber, but the implementation is definitely non-trivial and will probably be slower than the built-in TLS mechanism as well. So consider the current std.concurrency implementation to be a prototype. I'd also like to add interprocess messaging, but that will be another big task.

Mostly in that everything operates on Tids (as opposed to some opaque Cid type) and, as you mentioned, TLS. The problem is basically that people have gotten used to std.concurrency always using OS threads due to subtle things like that from day one.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012
On Oct 4, 2012, at 9:18 PM, Alex Rønne Petersen <alex lycus.org> wrote:
> On 05-10-2012 01:30, Sean Kelly wrote:
>> On Oct 4, 2012, at 4:32 AM, Alex Rønne Petersen <alex lycus.org> wrote:
>>> [snip]
>>
>> How is the interface tied to the notion of threads? I had hoped to design it with the underlying concurrency mechanism completely abstracted. [snip]
>
> Mostly in that everything operates on Tids (as opposed to some opaque Cid type) and, as you mentioned, TLS. The problem is basically that people have gotten used to std.concurrency always using OS threads due to subtle things like that from day one.

A Tid is a Cid, and in the first iteration I actually named it Cid and was asked to change it. Tid seems reasonable since it represents a logical thread anyway. It just may not actually be a kernel thread.

I think we have to make TLS work for fibers or using them isn't an option. It would be ridiculous to say "D has this cool new idea about statics but you can't use it if you're using the standard concurrency package."
Oct 05 2012
On 04/10/2012 13:32, Alex Rønne Petersen wrote:
> [snip]
> Thoughts? Other ideas?

Something I've wondered for a while: why not run everything in fibers?
Oct 04 2012
On 05-10-2012 04:14, deadalnix wrote:
> On 04/10/2012 13:32, Alex Rønne Petersen wrote:
>> [snip]
>> Thoughts? Other ideas?
>
> Something I wonder for a while : why not run everything in fibers ?

Because then we definitely need dynamic stack growth wired into both the compiler and the runtime. Not impossible, but there's a *lot* of effort required (and convincing, in Walter's case).

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Oct 04 2012