digitalmars.D - Go's march to low-latency GC
- Enamex (2/2) Jul 07 2016 https://news.ycombinator.com/item?id=12042198
- ikod (5/16) Jul 07 2016 Correct me if I'm wrong, but in D fibers allocate stack
- Martin Nowak (4/8) Jul 08 2016 Fiber stacks are just mapped virtual memory pages that the kernel only
- ikod (9/18) Jul 08 2016 But the size of fiber stack is fixed? When we call Fiber
- Dicebot (3/6) Jul 09 2016 Nope, this is exactly the point. You can demand crazy 10 MB of stack for
- ikod (2/10) Jul 09 2016 Thanks, nice to know.
- Sergey Podobry (5/13) Jul 10 2016 Remember that virtual address space is limited on 32-bit
- Dicebot (6/14) Jul 11 2016 Sorry, but someone who tries to run highly concurrent server
- Sergey Podobry (2/13) Jul 11 2016 Agreed. I don't know why golang guys bother about it.
- Russel Winder via Digitalmars-d (14/29) Jul 11 2016 Maybe because they are developing a language for the 1980s?
- Ola Fosheim Grøstad (3/5) Jul 11 2016 It is quite common for web services to run with less than 1GB.
- deadalnix (5/8) Jul 11 2016 Because they have nothing else to propose than massive goroutine
- Patrick Schluter (8/22) Jul 11 2016 Because of attitudes like shown in that thread
- jmh530 (5/12) Jul 11 2016 Why can't you use both 32bit and 64bit pointers when compiling
- Ola Fosheim Grøstad (3/17) Jul 11 2016 You can, but OSes usually give you randomized memory layout as a
- jmh530 (5/7) Jul 12 2016 What if the memory allocation scheme were something like:
- Ola Fosheim Grøstad (17/25) Jul 12 2016 One possible technique is to use contiguous "unmapped" memory
- deadalnix (2/10) Jul 12 2016 There is a mmap flag for this on linux.
- Kagamin (2/4) Jul 12 2016 And what's with address space?
- Chris Wright (6/15) Jul 09 2016 The downside is that it's difficult to release that memory. On the other...
- Andrei Alexandrescu (7/9) Jul 09 2016 A very nice article and success story. We've had similar stories with
- bob belcher (3/14) Jul 09 2016 kickstarter for improve gc :)
- Martin Nowak (22/32) Jul 09 2016 Exactly, how someone can run a big site with 2 second pauses in
- Andrei Alexandrescu (3/6) Jul 09 2016 Yah, I was thinking in a more general sense. Plenty of improvements of
- Martin Nowak (13/15) Jul 10 2016 Yes, but hardly anything that would allow us to do partial
- Dejan Lekic (11/13) Jul 09 2016 I humbly believe it is not just about amassing experts, but also
- ZombineDev (4/17) Jul 09 2016 https://github.com/dlang/druntime/blob/master/src/gc/gcinterface.d
- Dejan Lekic (5/9) Sep 26 2016 That is actually the only case that I know of that an interface
- Istvan Dobos (10/21) Jul 14 2016 Hello Andrei,
- thedeemon (7/10) Jul 16 2016 This requires drastically changing 99% of the language and it's
- The D dude (9/19) Jul 16 2016 Yes that's the case for Rust, but no one has proven yet that an
https://news.ycombinator.com/item?id=12042198 ^ reposting a link in the right place.
Jul 07 2016
On Thursday, 7 July 2016 at 22:36:29 UTC, Enamex wrote:
> https://news.ycombinator.com/item?id=12042198 ^ reposting a link in the right place.

> While a program using 10,000 OS threads might perform poorly, that number of goroutines is nothing unusual. One difference is that a goroutine starts with a very small stack, only 2kB, which grows as needed, contrasted with the large fixed-size stacks that are common elsewhere. Go’s function call preamble makes sure there’s enough stack space for the next call, and if not will move the goroutine’s stack to a larger memory area, rewriting pointers as needed, before allowing the call to continue.

Correct me if I'm wrong, but in D fibers allocate stack statically, so we have to preallocate large stacks. If yes - can we allocate stack frames on demand from some non-GC area?
Jul 07 2016
On 07/08/2016 07:45 AM, ikod wrote:
> Correct me if I'm wrong, but in D fibers allocate stack statically, so we have to preallocate large stacks. If yes - can we allocate stack frames on demand from some non-GC area?

Fiber stacks are just mapped virtual memory pages that the kernel only backs with physical memory when they're actually used. So they already are allocated on demand.
Jul 08 2016
On Friday, 8 July 2016 at 20:35:05 UTC, Martin Nowak wrote:
> On 07/08/2016 07:45 AM, ikod wrote:
>> Correct me if I'm wrong, but in D fibers allocate stack statically, so we have to preallocate large stacks. If yes - can we allocate stack frames on demand from some non-GC area?
>
> Fiber stacks are just mapped virtual memory pages that the kernel only backs with physical memory when they're actually used. So they already are allocated on demand.

But the size of the fiber stack is fixed? When we call the Fiber constructor, the second parameter is the stack size. If I guess wrong and ask for too small a stack, the program may crash. If I ask for too large a stack, I probably waste resources. So it would be nice if the programmer were not forced to make any wrong decisions about a fiber's stack size. Or maybe I'm wrong and I shouldn't care about stack size when I create a new fiber?
Jul 08 2016
On 07/09/2016 02:48 AM, ikod wrote:
> If I guess wrong and ask for too small a stack, the program may crash. If I ask for too large a stack, I probably waste resources.

Nope, this is exactly the point. You can demand a crazy 10 MB of stack for each fiber and only the part actually used will be allocated by the kernel.
Jul 09 2016
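The behaviour described in this exchange can be observed directly with `mmap`. The following is a minimal C sketch, assuming Linux or a similar POSIX system with anonymous mappings. `reserve_stack` and `touch_pages` are hypothetical helper names, and this is not druntime's `Fiber` implementation, only the kernel mechanism the posts refer to.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve `size` bytes of address space. No physical memory is
   committed here; the kernel backs a page only on first access. */
void *reserve_stack(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte into each of the first `n` pages, which makes the
   kernel back at least those pages. Returns the number touched. */
size_t touch_pages(void *base, size_t n)
{
    assert(base != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *bytes = base;
    for (size_t i = 0; i < n; i++)
        bytes[i * page] = 1;
    return n;
}
```

Calling `reserve_stack(10u * 1024 * 1024)` and then `touch_pages(p, 3)` reserves 10 MB of addresses but should leave only a few pages resident (more if transparent huge pages are in play).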
On Saturday, 9 July 2016 at 13:48:41 UTC, Dicebot wrote:
> On 07/09/2016 02:48 AM, ikod wrote:
>> If I guess wrong and ask for too small a stack, the program may crash. If I ask for too large a stack, I probably waste resources.
>
> Nope, this is exactly the point. You can demand a crazy 10 MB of stack for each fiber and only the part actually used will be allocated by the kernel.

Thanks, nice to know.
Jul 09 2016
On Saturday, 9 July 2016 at 13:48:41 UTC, Dicebot wrote:
> On 07/09/2016 02:48 AM, ikod wrote:
>> If I guess wrong and ask for too small a stack, the program may crash. If I ask for too large a stack, I probably waste resources.
>
> Nope, this is exactly the point. You can demand a crazy 10 MB of stack for each fiber and only the part actually used will be allocated by the kernel.

Remember that virtual address space is limited on 32-bit platforms. Thus spawning 2000 threads with a 1 MB stack each will occupy all available VA space and you'll get an allocation failure (even if the real memory usage is low).
Jul 10 2016
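The arithmetic behind the 32-bit concern is easy to check. A small C sketch follows; `max_stacks` is a hypothetical helper, and the 2 GiB and 3 GiB user-space budgets used in the note below are assumptions about typical 32-bit Windows and Linux configurations, not figures from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* How many fixed-size stack reservations fit into a given amount of
   virtual address space, ignoring everything else mapped into the
   process (code, heap, shared libraries). */
uint64_t max_stacks(uint64_t va_bytes, uint64_t stack_bytes)
{
    assert(stack_bytes > 0);
    return va_bytes / stack_bytes;
}
```

With a 2 GiB budget, `max_stacks(2ULL << 30, 1ULL << 20)` gives 2048, so 2000 stacks of 1 MiB leave almost nothing for anything else. With 10 MiB reservations and a 3 GiB budget the limit drops to 307.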
On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:
> On Saturday, 9 July 2016 at 13:48:41 UTC, Dicebot wrote:
>> Nope, this is exactly the point. You can demand a crazy 10 MB of stack for each fiber and only the part actually used will be allocated by the kernel.
>
> Remember that virtual address space is limited on 32-bit platforms. Thus spawning 2000 threads with a 1 MB stack each will occupy all available VA space and you'll get an allocation failure (even if the real memory usage is low).

Sorry, but someone who tries to run highly concurrent server software with thousands of fibers on a 32-bit platform is quite unwise, and there is no point in taking such a use case into account. 32-bit has its own niche with different kinds of concerns.
Jul 11 2016
On Monday, 11 July 2016 at 11:23:26 UTC, Dicebot wrote:
> On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:
>> Remember that virtual address space is limited on 32-bit platforms. Thus spawning 2000 threads with a 1 MB stack each will occupy all available VA space and you'll get an allocation failure (even if the real memory usage is low).
>
> Sorry, but someone who tries to run highly concurrent server software with thousands of fibers on a 32-bit platform is quite unwise, and there is no point in taking such a use case into account. 32-bit has its own niche with different kinds of concerns.

Agreed. I don't know why golang guys bother about it.
Jul 11 2016
On Mon, 2016-07-11 at 12:21 +0000, Sergey Podobry via Digitalmars-d wrote:
> On Monday, 11 July 2016 at 11:23:26 UTC, Dicebot wrote:
>> On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:
>>> Remember that virtual address space is limited on 32-bit platforms. Thus spawning 2000 threads with a 1 MB stack each will occupy all available VA space and you'll get an allocation failure (even if the real memory usage is low).
>>
>> Sorry, but someone who tries to run highly concurrent server software with thousands of fibers on a 32-bit platform is quite unwise, and there is no point in taking such a use case into account. 32-bit has its own niche with different kinds of concerns.
>
> Agreed. I don't know why golang guys bother about it.

Maybe because they are developing a language for the 1980s? ;-)

--
Russel.
Jul 11 2016
On Monday, 11 July 2016 at 13:05:09 UTC, Russel Winder wrote:
> Maybe because they are developing a language for the 1980s? ;-)

It is quite common for web services to run with less than 1GB. 64bit would be very wasteful.
Jul 11 2016
On Monday, 11 July 2016 at 13:05:09 UTC, Russel Winder wrote:
>> Agreed. I don't know why golang guys bother about it.

Because they have nothing else to propose than a massive goroutine orgy, so they kind of have to make it work.

> Maybe because they are developing a language for the 1980s? ;-)

It's not like they are using the Plan9 toolchain... Oh wait...
Jul 11 2016
On Monday, 11 July 2016 at 12:21:04 UTC, Sergey Podobry wrote:
> On Monday, 11 July 2016 at 11:23:26 UTC, Dicebot wrote:
>> On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:
>>> Remember that virtual address space is limited on 32-bit platforms. Thus spawning 2000 threads with a 1 MB stack each will occupy all available VA space and you'll get an allocation failure (even if the real memory usage is low).
>>
>> Sorry, but someone who tries to run highly concurrent server software with thousands of fibers on a 32-bit platform is quite unwise, and there is no point in taking such a use case into account. 32-bit has its own niche with different kinds of concerns.
>
> Agreed. I don't know why golang guys bother about it.

Because of attitudes like those shown in that thread
https://forum.dlang.org/post/ilbmfvywzktilhskpeoh@forum.dlang.org
by people who do not really understand why 32-bit systems are really problematic even if the apps don't use more than 2 GiB of memory.

Here's Linus Torvalds' classic rant about 64 bit:
https://cl4ssic4l.wordpress.com/2011/05/24/linus-torvalds-about-pae/
(it's more about PAE, but the reasons why 64 bits is a good thing in general are the same: address space!)
Jul 11 2016
On Monday, 11 July 2016 at 13:13:02 UTC, Patrick Schluter wrote:
> Because of attitudes like those shown in that thread
> https://forum.dlang.org/post/ilbmfvywzktilhskpeoh@forum.dlang.org
> by people who do not really understand why 32-bit systems are really problematic even if the apps don't use more than 2 GiB of memory.
>
> Here's Linus Torvalds' classic rant about 64 bit:
> https://cl4ssic4l.wordpress.com/2011/05/24/linus-torvalds-about-pae/
> (it's more about PAE, but the reasons why 64 bits is a good thing in general are the same: address space!)

Why can't you use both 32bit and 64bit pointers when compiling for x86_64? My guess would be that using 64bit registers precludes the use of 32bit registers.
Jul 11 2016
On Monday, 11 July 2016 at 17:14:17 UTC, jmh530 wrote:
> On Monday, 11 July 2016 at 13:13:02 UTC, Patrick Schluter wrote:
>> Because of attitudes like those shown in that thread
>> https://forum.dlang.org/post/ilbmfvywzktilhskpeoh@forum.dlang.org
>> by people who do not really understand why 32-bit systems are really problematic even if the apps don't use more than 2 GiB of memory.
>>
>> Here's Linus Torvalds' classic rant about 64 bit:
>> https://cl4ssic4l.wordpress.com/2011/05/24/linus-torvalds-about-pae/
>> (it's more about PAE, but the reasons why 64 bits is a good thing in general are the same: address space!)
>
> Why can't you use both 32bit and 64bit pointers when compiling for x86_64? My guess would be that using 64bit registers precludes the use of 32bit registers.

You can, but OSes usually give you randomized memory layout as a security measure.
Jul 11 2016
On Monday, 11 July 2016 at 17:23:49 UTC, Ola Fosheim Grøstad wrote:
> You can, but OSes usually give you randomized memory layout as a security measure.

What if the memory allocation scheme were something like: randomly pick memory locations below some threshold from the 32bit segment and then above the threshold pick from elsewhere?
Jul 12 2016
On Tuesday, 12 July 2016 at 13:28:33 UTC, jmh530 wrote:
> On Monday, 11 July 2016 at 17:23:49 UTC, Ola Fosheim Grøstad wrote:
>> You can, but OSes usually give you randomized memory layout as a security measure.
>
> What if the memory allocation scheme were something like: randomly pick memory locations below some threshold from the 32bit segment and then above the threshold pick from elsewhere?

One possible technique is to use contiguous "unmapped" memory areas that cover your worst-case number of elements with a specific base and just use indexes instead of absolute addressing. That way you often can just use 16-bit typed addressing (assuming max 65535 objects of a given type + a null index).

The base address may then be injected (during linking or by using self-modifying code if the OS allows it) into the code segments. Or you could use TLS + indexing, or whatever the OS supports.

Using global 64-bit pointers is just for generality and to keep the language implementation simple. It is not strictly hardware related if you have an MMU, nor directly related to machine language as such.

For a statically typed language you could probably get away with 16 or 32 bits for typed pointers most of the time if the OS and language don't make it difficult (like the conservative D GC scan).
Jul 12 2016
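The index-instead-of-pointer idea from the post above can be sketched in a few lines of C. Everything here (`Node`, `NodePool`, `pool_alloc` and the other names) is hypothetical and only illustrates the technique: one contiguous area per type, a 16-bit index as the "typed pointer", and index 0 reserved as null. A plain array stands in for the reserved memory area the post mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Links between nodes are 16-bit indexes, not 64-bit pointers. */
typedef struct {
    int32_t  value;
    uint16_t next;                 /* index of the next node, 0 = null */
} Node;

enum { POOL_CAPACITY = 65536 };    /* slot 0 is reserved as the null index */

typedef struct {
    Node     slots[POOL_CAPACITY]; /* contiguous area; its address is the base */
    uint32_t used;                 /* slots handed out, including slot 0 */
} NodePool;

void pool_init(NodePool *p)
{
    p->used = 1;                   /* never hand out the null slot */
}

/* Returns the index of a fresh node, or 0 if the pool is exhausted. */
uint16_t pool_alloc(NodePool *p, int32_t value, uint16_t next)
{
    if (p->used >= POOL_CAPACITY)
        return 0;
    uint16_t idx = (uint16_t)p->used++;
    p->slots[idx].value = value;
    p->slots[idx].next = next;
    return idx;
}

/* Index to address: base + idx * sizeof(Node). */
Node *pool_get(NodePool *p, uint16_t idx)
{
    assert(idx != 0 && idx < p->used);
    return &p->slots[idx];
}

/* Walk a list using indexes only. */
int32_t list_sum(const NodePool *p, uint16_t head)
{
    int32_t sum = 0;
    for (uint16_t i = head; i != 0; i = p->slots[i].next)
        sum += p->slots[i].value;
    return sum;
}
```

Each link costs 2 bytes instead of 8, and the pool can live at any base address, so address space randomization does not interfere.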
On Tuesday, 12 July 2016 at 13:28:33 UTC, jmh530 wrote:
> On Monday, 11 July 2016 at 17:23:49 UTC, Ola Fosheim Grøstad wrote:
>> You can, but OSes usually give you randomized memory layout as a security measure.
>
> What if the memory allocation scheme were something like: randomly pick memory locations below some threshold from the 32bit segment and then above the threshold pick from elsewhere?

There is a mmap flag for this on linux.
Jul 12 2016
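The post does not name the flag. The closest match on Linux is probably `MAP_32BIT`, which is only available on x86-64 and asks the kernel to place the mapping in the low 2 GiB of the address space. A hedged C sketch follows, with `map_low` and `fits_in_32_bits` as hypothetical helpers. On platforms without the flag, `map_low` simply reports failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* MAP_32BIT and MAP_ANONYMOUS are extensions */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Map `size` bytes at an address representable in 32 bits, or return
   NULL if the flag is unsupported or the mapping fails. */
void *map_low(size_t size)
{
    assert(size > 0);
#ifdef MAP_32BIT
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
    return p == MAP_FAILED ? NULL : p;
#else
    return NULL;   /* assumption: no equivalent flag on this platform */
#endif
}

/* True if the pointer could be stored in a 32-bit field without loss. */
int fits_in_32_bits(const void *p)
{
    return (uintptr_t)p <= UINT32_MAX;
}
```

A pointer returned by `map_low` can be truncated to 32 bits and widened again, which is the mixed 32/64-bit pointer scheme asked about earlier in the thread.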
On Monday, 11 July 2016 at 13:13:02 UTC, Patrick Schluter wrote:
> (it's more about PAE, but the reasons why 64 bits is a good thing in general are the same: address space!)

And what's with address space?
Jul 12 2016
On Fri, 08 Jul 2016 22:35:05 +0200, Martin Nowak wrote:
> On 07/08/2016 07:45 AM, ikod wrote:
>> Correct me if I'm wrong, but in D fibers allocate stack statically, so we have to preallocate large stacks. If yes - can we allocate stack frames on demand from some non-GC area?
>
> Fiber stacks are just mapped virtual memory pages that the kernel only backs with physical memory when they're actually used. So they already are allocated on demand.

The downside is that it's difficult to release that memory. On the other hand, Go had a lot of problems with its implementation in part because it released memory.

At some point you start telling users: if you want a fiber that does a huge recursion, dispose of it when you're done. It's cheap enough to create another fiber later.
Jul 09 2016
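On the point about releasing memory: the kernel does offer a way to hand back the physical pages behind a mapping while keeping the address range, namely `madvise` with `MADV_DONTNEED`. The hard part for a fiber runtime is deciding when a deep part of a stack is no longer needed, not the call itself. A minimal C sketch, assuming Linux semantics, with `release_pages` as a hypothetical helper:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes madvise under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Give the physical pages behind [base, base + len) back to the kernel
   while keeping the address range mapped. `base` must be page-aligned.
   Returns 0 on success, -1 on error (as madvise does). */
int release_pages(void *base, size_t len)
{
    assert(base != NULL);
    return madvise(base, len, MADV_DONTNEED);
}
```

On Linux, for a private anonymous mapping, the released range stays usable and its pages are supplied zero-filled again on the next access. Other systems may treat the advice differently.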
On 7/7/16 6:36 PM, Enamex wrote:
> https://news.ycombinator.com/item?id=12042198 ^ reposting a link in the right place.

A very nice article and success story. We've had similar stories with several products at Facebook. There is of course the opposite view - an orders-of-magnitude improvement means there was quite a lot of waste just before that.

I wish we could amass the experts able to make similar things happen for us.

Andrei
Jul 09 2016
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu wrote:
> On 7/7/16 6:36 PM, Enamex wrote:
>> https://news.ycombinator.com/item?id=12042198 ^ reposting a link in the right place.
>
> A very nice article and success story. We've had similar stories with several products at Facebook. There is of course the opposite view - an orders-of-magnitude improvement means there was quite a lot of waste just before that.
>
> I wish we could amass the experts able to make similar things happen for us.
>
> Andrei

Kickstarter to improve the GC :)
Jul 09 2016
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu wrote:
> On 7/7/16 6:36 PM, Enamex wrote:
>> https://news.ycombinator.com/item?id=12042198 ^ reposting a link in the right place.
>
> A very nice article and success story. We've had similar stories with several products at Facebook. There is of course the opposite view - an orders-of-magnitude improvement means there was quite a lot of waste just before that.

Exactly, how someone can run a big site with 2 second pauses in the GC code is beyond me.

> I wish we could amass the experts able to make similar things happen for us.

We sort of have an agreement that we don't want to pay 5% for write barriers, so the common algorithmic GC improvements aren't available to us. There is still connectivity-based GC [¹], which is an interesting idea, but AFAIK it hasn't been widely tried. Maybe someone has an idea for optional write barriers, i.e. zero cost if you don't use them. Or we agree that it's worth having different incompatible binaries.

[¹]: https://www.cs.purdue.edu/homes/hosking/690M/cbgc.pdf

In any case, now that we've made the GC pluggable we should port the forking GC. It has almost no latency at the price of higher peak memory usage and lower throughput, the same trade-offs you have with any concurrent mark phase. Moving the sweeping to background GC threads is something we should be doing anyhow.

Overall I think we should focus more on good deterministic MM alternatives, rather than investing years of engineering into our GC, or hoping for silver bullets.
Jul 09 2016
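The forking GC mentioned above rests on one operating-system property: after `fork`, the child sees a copy-on-write snapshot of the parent's heap, so it can mark at leisure while the parent keeps running. The C sketch below demonstrates only that snapshot property, not any actual D collector. `value_seen_by_snapshot` is a hypothetical name, and the exit status is used as a toy channel for the result, so values are limited to 0..255.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the parent overwrite *cell, and report the value the
   child observed in its snapshot. Returns -1 on error. */
int value_seen_by_snapshot(int *cell, int new_value)
{
    assert(cell != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(*cell & 0xff);   /* child: a collector would mark here */

    *cell = new_value;         /* parent mutates; the child's copy is unaffected */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Every page the parent writes while the child is alive gets duplicated by the kernel, which is where the higher peak memory usage in the trade-off comes from.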
On 07/09/2016 03:42 PM, Martin Nowak wrote:
> We sort of have an agreement that we don't want to pay 5% for write barriers, so the common algorithmic GC improvements aren't available to us.

Yah, I was thinking in a more general sense. Plenty of improvements of all kinds are within reach. -- Andrei
Jul 09 2016
On Saturday, 9 July 2016 at 23:12:10 UTC, Andrei Alexandrescu wrote:
> Yah, I was thinking in a more general sense. Plenty of improvements of all kinds are within reach. -- Andrei

Yes, but hardly anything that would allow us to do partial collections. And without that you always have to scan the full live heap; this can't scale to bigger heaps, as there is no way to scan a GB-sized heap fast.

So either we make it easier to get by with a small GC heap, i.e. more deterministic MM, or we spend a lot of time making some partial collection algorithm work. Ideally we do both, but the former is a simpler goal.

The connectivity-based GC would be a realistic goal as well, only somewhat more complex than the precise GC. But it's unclear how well it will work for typical applications.
Jul 10 2016
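To make the write-barrier cost discussed in this sub-thread concrete: partial (for example generational) collectors usually need to know which parts of the old heap were written since the last collection, and a common way to record that is a card table updated on every pointer store. The C sketch below shows a generic card-marking barrier. It is not D's GC, and `CardTable`, `store_pointer` and the 512-byte card size are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum {
    HEAP_BYTES = 1 << 20,                    /* toy heap size */
    CARD_SHIFT = 9,                          /* 512-byte cards */
    CARD_COUNT = HEAP_BYTES >> CARD_SHIFT
};

typedef struct {
    unsigned char *heap_base;     /* start of the tracked heap region */
    uint8_t cards[CARD_COUNT];    /* 1 = a pointer slot in this card was written */
} CardTable;

/* Allocate a table with all cards clean, plus the toy heap it tracks. */
CardTable *card_table_new(void)
{
    CardTable *t = calloc(1, sizeof *t);
    if (t == NULL)
        return NULL;
    t->heap_base = malloc(HEAP_BYTES);
    if (t->heap_base == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

/* The write barrier: every pointer store goes through this function
   instead of a plain assignment. */
void store_pointer(CardTable *t, void **slot, void *value)
{
    size_t offset = (size_t)((unsigned char *)slot - t->heap_base);
    assert(offset < HEAP_BYTES);
    *slot = value;
    t->cards[offset >> CARD_SHIFT] = 1;   /* the extra work on each store */
}

/* A partial collection would rescan only the dirty cards. */
size_t dirty_cards(const CardTable *t)
{
    size_t n = 0;
    for (size_t i = 0; i < CARD_COUNT; i++)
        n += t->cards[i];
    return n;
}
```

The extra store in `store_pointer` is the per-write overhead being weighed above. In exchange, the collector can skip every clean card instead of scanning the full live heap.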
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu wrote:
> I wish we could amass the experts able to make similar things happen for us.

I humbly believe it is not just about amassing experts, but also making it easy to do experiments. Phobos/druntime should provide a set of APIs for literally everything so people can write their own implementations of ANY standard library module(s). I wish D offered module interfaces the same way Modula-3 did...

To work on a new GC in D one needs to remove the old one and replace it with his/her new implementation, while with the competition it is more or less a matter of implementing a few interfaces and instructing the compiler to use the new GC...
Jul 09 2016
On Saturday, 9 July 2016 at 21:25:34 UTC, Dejan Lekic wrote:
> On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu wrote:
>> I wish we could amass the experts able to make similar things happen for us.
>
> I humbly believe it is not just about amassing experts, but also making it easy to do experiments. Phobos/druntime should provide a set of APIs for literally everything so people can write their own implementations of ANY standard library module(s). I wish D offered module interfaces the same way Modula-3 did...
>
> To work on a new GC in D one needs to remove the old one and replace it with his/her new implementation, while with the competition it is more or less a matter of implementing a few interfaces and instructing the compiler to use the new GC...

https://github.com/dlang/druntime/blob/master/src/gc/gcinterface.d
https://github.com/dlang/druntime/blob/master/src/gc/impl/manual/gc.d

What else do you need to start working on a new GC implementation?
Jul 09 2016
On Saturday, 9 July 2016 at 23:14:38 UTC, ZombineDev wrote:
> https://github.com/dlang/druntime/blob/master/src/gc/gcinterface.d
> https://github.com/dlang/druntime/blob/master/src/gc/impl/manual/gc.d
>
> What else do you need to start working on a new GC implementation?

That is actually the only case I know of where an interface was provided to be implemented by third parties... My reply was about Phobos in general. To repeat: Phobos should provide the API (interfaces) *and* reference implementations of those.
Sep 26 2016
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu wrote:
> On 7/7/16 6:36 PM, Enamex wrote:
>> https://news.ycombinator.com/item?id=12042198 ^ reposting a link in the right place.
>
> A very nice article and success story. We've had similar stories with several products at Facebook. There is of course the opposite view - an orders-of-magnitude improvement means there was quite a lot of waste just before that.
>
> I wish we could amass the experts able to make similar things happen for us.
>
> Andrei

Hello Andrei,

May only be slightly related, but when you talked about D vs Go vs Rust in that Quora answer (here: https://www.quora.com/Which-language-has-the-brightest-future-in-replacement-of-C-between-D-Go-and-Rust-And-Why/answer/Andrei-Alexandrescu), I was thinking, okay, so D's GC seems to have turned out not that great. But how about the idea of transplanting Rust's ownership system instead of trying to make the GC better?

Disclaimer: I know very little about D's possibly similar mechanisms.

Thanks,
Istvan
Jul 14 2016
On Thursday, 14 July 2016 at 10:58:47 UTC, Istvan Dobos wrote:
> I was thinking, okay, so D's GC seems to have turned out not that great. But how about the idea of transplanting Rust's ownership system instead of trying to make the GC better?

This requires drastically changing 99% of the language, and it brings not just the benefits but also all the pain that comes with this ownership system. Productivity goes down, learning curve goes up. And it will be a very different language in the end, so you might want to just use Rust instead of trying to make D another Rust.
Jul 16 2016
On Saturday, 16 July 2016 at 11:02:00 UTC, thedeemon wrote:
> On Thursday, 14 July 2016 at 10:58:47 UTC, Istvan Dobos wrote:
>> I was thinking, okay, so D's GC seems to have turned out not that great. But how about the idea of transplanting Rust's ownership system instead of trying to make the GC better?
>
> This requires drastically changing 99% of the language, and it brings not just the benefits but also all the pain that comes with this ownership system. Productivity goes down, learning curve goes up. And it will be a very different language in the end, so you might want to just use Rust instead of trying to make D another Rust.

Yes, that's the case for Rust, but no one has proven yet that an ownership system needs to be such a pain. In fact someone recently proposed an idea for a readable ownership system:
http://forum.dlang.org/post/ensdiijttlpcwuhdfpuu@forum.dlang.org
and I believe that it's quite possible to improve on Rust and still have a productive language. In fact the simple `scope` statements are a first and excellent step on this journey ;-)
Jul 16 2016