www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Mesh: a new approach to memory compaction for C/C++

reply Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
strangeloop talk here:

https://www.youtube.com/watch?v=c1UBJbfR-H0

and github repo here:

https://github.com/plasma-umass/Mesh

Sounds to me like making D's GC precise by default to enable 
memory compaction is worth pursuing.
Sep 17
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
I've just skimmed the paper. It seems that all this does is to defragment the memory at runtime. Which it does at regular intervals unless a steady-state heuristic stops mesh from doing that. It also seems this allocation scheme only works for small to medium size allocations done in large quantities. And therefore would not work if the application does allocation in large chunks itself (as is the case for dmd by default). I am going to look into this a bit further.
Sep 17
parent reply Emery Berger <emery cs.umass.edu> writes:
On Tuesday, 17 September 2019 at 14:06:06 UTC, Stefan Koch wrote:
 On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw 
 wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
(Mesh co-author here) FWIW, Mesh eliminates the need for precise garbage collection.
 I've just skimmed the paper.
 It seems that all this does is to defragment the memory at 
 runtime.
"all this does"?? That's a lot! :) Mesh is the first system to automatically perform defragmentation for C-like languages. -- emery PS We'd love to see it adopted by D, and we welcome comments and pull requests!
Sep 22
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Sunday, 22 September 2019 at 16:12:13 UTC, Emery Berger wrote:
 On Tuesday, 17 September 2019 at 14:06:06 UTC, Stefan Koch 
 wrote:
 On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw 
 wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
(Mesh co-author here) FWIW, Mesh eliminates the need for precise garbage collection.
 I've just skimmed the paper.
 It seems that all this does is to defragment the memory at 
 runtime.
"all this does"?? That's a lot! :) Mesh is the first system to automatically perform defragmentation for C-like languages. -- emery PS We'd love to see it adopted by D, and we welcome comments and pull requests!
Hi Emery, I didn't mean to sound patronizing. :-/ The Paper is well written and easy to understand. I wish more papers were like yours. Cheers
Sep 22
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 9/22/19 12:12 PM, Emery Berger wrote:
 On Tuesday, 17 September 2019 at 14:06:06 UTC, Stefan Koch wrote:
 On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable memory 
 compaction is worth pursuing.
(Mesh co-author here) FWIW, Mesh eliminates the need for precise garbage collection.
 I've just skimmed the paper.
 It seems that all this does is to defragment the memory at runtime.
"all this does"?? That's a lot! :) Mesh is the first system to automatically perform defragmentation for C-like languages. -- emery PS We'd love to see it adopted by D, and we welcome comments and pull requests!
Thanks, Emery. I'm cc'ing my students with this and will make the point to consider this a possible project for our upcoming students. Also, we should put this on the ideas page for GSoC and SAoC.
Sep 23
prev sibling next sibling parent reply Gregor =?UTF-8?B?TcO8Y2ts?= <gregormueckl gmx.de> writes:
On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.
Sep 18
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 18 September 2019 at 09:31:07 UTC, Gregor Mückl 
wrote:
 On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw 
 wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.
Good point. I hadn't thought about that. All the additional OS calls to copy stuff into different physical pages will also most likely also not be great for perf. If the allocator increases the runtime of perlbench by 4% that's acutally quite significant as I can't imagine perlbench runs fast.
Sep 18
prev sibling parent reply Emery Berger <emery cs.umass.edu> writes:
On Wednesday, 18 September 2019 at 09:31:07 UTC, Gregor Mückl 
wrote:
 On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw 
 wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.
You can always seed the random number generator with a fixed value to get reproducible runs. More interestingly, it's also possible to exploit an Exterminator-like approach (see https://people.cs.umass.edu/~emery/pubs/p87-novark.pdf) to automatically find the source of such bugs! This is specifically enabled by randomization. We implemented Exterminator for a previous randomizing allocator (DieHard: code here, including Exterminator -- https://github.com/plasma-umass/DieHard), and there's no technical barrier to doing the same for Mesh. -- emery (Mesh co-author)
Sep 22
parent Gregor =?UTF-8?B?TcO8Y2ts?= <gregormueckl gmx.de> writes:
On Sunday, 22 September 2019 at 16:15:08 UTC, Emery Berger wrote:
 On Wednesday, 18 September 2019 at 09:31:07 UTC, Gregor Mückl 
 wrote:
 On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw 
 wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.
You can always seed the random number generator with a fixed value to get reproducible runs. More interestingly, it's also possible to exploit an Exterminator-like approach (see https://people.cs.umass.edu/~emery/pubs/p87-novark.pdf) to automatically find the source of such bugs! This is specifically enabled by randomization. We implemented Exterminator for a previous randomizing allocator (DieHard: code here, including Exterminator -- https://github.com/plasma-umass/DieHard), and there's no technical barrier to doing the same for Mesh. -- emery (Mesh co-author)
I've recently had a cursory look into some of the work of your group. You've got some wonderfully crazy ideas and it's nice to see the proofs of concept. I'm not sure I see the practical value of Exterminator for application developers. As I understand it, MS included a derivative of DieHard on Windows to be able to cheat misbehaving applications into crashing less often. As I see it, this is a pretty desperate workaround for when you can't fix the actual bug directly. I am really curious about one thing: what can Exterminator do that valgrind cannot when I'm debugging a program locally? For the sake of discussion, let's assume that valgrind gets full heap allocation information from the GC (which currently isn't the case for the D garbage collector as I understand it; other GCs like moarvm's have that integration). Also, to put my foot squarely into my own mouth: I have a fair bit of reservation regarding probabilistic algorithms on a system level. I generally prefer the underlying systems I work with to be as deterministic as possible. I also develop Monte Carlo simulations for a living... ;)
Sep 24
prev sibling parent zoujiaqing <zoujiaqing gmail.com> writes:
On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
 strangeloop talk here:

 https://www.youtube.com/watch?v=c1UBJbfR-H0

 and github repo here:

 https://github.com/plasma-umass/Mesh

 Sounds to me like making D's GC precise by default to enable 
 memory compaction is worth pursuing.
I think, performance is often more important than momory compaction :)
Nov 11