
digitalmars.D.learn - How to debug long-lived D program memory usage?

reply Adam D. Ruppe <destructionator gmail.com> writes:
D programs are a vital part of my home computer infrastructure. I 
run some 60 D processes at almost any time.... and have recently 
been running out of memory.

Each individual process eats ~30-100 MB, but that times 60 = 
trouble. They start off small, like 5 MB, and grow over weeks or 
months, so it isn't something I can easily isolate in a debugger 
after recompiling.

I'm pretty sure this is the result of wasteful code somewhere in 
my underlying libraries, but nothing is obviously jumping out at 
me in the code. So I want to look at some of my existing 
processes and see just what is responsible for this.

I tried attaching to one and `call gc_stats()` in gdb... and it 
segfaulted. Whoops.




I am willing to recompile and run again, though I need to 
actually use the programs, so if instrumenting makes them 
unusable it won't really help. Is there a magic --DRT- argument 
perhaps? Or some trick with gdb attaching to a running process I 
don't know?

What I'm hoping to do is get an idea of which line of code 
allocates the most that isn't subsequently freed.
Apr 17 2019
next sibling parent reply Julian <julian.fondren gmail.com> writes:
On Wednesday, 17 April 2019 at 16:27:02 UTC, Adam D. Ruppe wrote:
 D programs are a vital part of my home computer infrastructure. 
 I run some 60 D processes at almost any time.... and have 
 recently been running out of memory.

 Each individual process eats ~30-100 MB, but that times 60 = 
 trouble. They start off small, like 5 MB, and grow over weeks 
 or months, so it isn't something I can easily isolate in a 
 debugger after recompiling.

 I'm pretty sure this is the result of wasteful code somewhere 
 in my underlying libraries, but nothing is obviously jumping 
 out at me in the code. So I want to look at some of my existing 
 processes and see just what is responsible for this.

 I tried attaching to one and `call gc_stats()` in gdb... and it 
 segfaulted. Whoops.




 I am willing to recompile and run again, though I need to 
 actually use the programs, so if instrumenting makes them 
 unusable it won't really help. Is there a magic --DRT- argument 
 perhaps? Or some trick with gdb attaching to a running process 
 I don't know?

 What I'm hoping to do is get an idea of which line of code 
 allocates the most that isn't subsequently freed.
One thing you can try, without recompiling, is using pmap -x on one of the bloated processes, dumping a large memory region to a file, and just looking at the binary. It might be something obvious on visual inspection.

You can dump memory with

gdb -p $pid --eval-command 'dump binary memory file.bin 0xfromLL 0xtoLL' -batch
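Putting the whole thing together, it looks roughly like this (the PID and address range are placeholders; read the real values off the pmap output first):

```shell
# Find the largest mappings for the process (column 3 is RSS in kB).
pmap -x $pid | sort -k3 -n | tail

# Dump one big anonymous region to a file without stopping the process
# for long. Substitute real start/end addresses from the pmap output.
gdb -p $pid \
    --eval-command 'dump binary memory file.bin 0xfromLL 0xtoLL' \
    -batch

# Eyeball the dumped region for recognizable data.
strings file.bin | less
```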
Apr 17 2019
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 17 April 2019 at 16:57:51 UTC, Julian wrote:
 It might be something obvious on visual inspection.
Huh, indeed this got me the biggest obvious-in-retrospect-but-i-didnt-think-to-look-there win of the day - the memory dump showed a super-bloated scrollback buffer in my terminal emulator.

I removed 24 bit color support and slashed that in half, then instituted some limits to bring the peak down a bit more. Still have more to go, but this little thing actually added up to a whopping gigabyte across my whole system.
 You can dump memory with
thanks for the tip!
Apr 17 2019
parent Kagamin <spam here.lot> writes:
If you have a slow memory leak, you can speed it up with a stress 
test. Also, the debug build of the application can run in a 
separate environment.
Apr 18 2019
prev sibling next sibling parent reply Martin Krejcirik <mk-junk i-line.cz> writes:
On Wednesday, 17 April 2019 at 16:27:02 UTC, Adam D. Ruppe wrote:
 Each individual process eats ~30-100 MB, but that times 60 = 
 trouble. They start off small, like 5 MB, and grow over weeks 
 or months, so it isn't something I can easily isolate in a
Do you run GC.minimize ?
Apr 17 2019
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 17 April 2019 at 17:03:20 UTC, Martin Krejcirik 
wrote:
 Do you run GC.minimize ?
Not explicitly, but I did try `call gc_minimize()` from the debugger when attached to processes and it made no difference.

Maybe I'll add a hook to the program to call that on a hotkey press in the future though; I can see some situations where it might make a difference. (Though I'd be kinda surprised if it didn't at least sometimes run automatically...)
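The hook could look something like this - `GC.stats` and `GC.minimize` are the documented `core.memory` API; wiring it to an actual hotkey is left out:

```d
import core.memory : GC;
import std.stdio : writefln;

// Dump GC stats and try to return unused pools to the OS.
// In the real program this would hang off a hotkey or signal handler.
void dumpGcStats()
{
    auto s = GC.stats; // used/free sizes of the GC heap
    writefln("GC used: %s bytes, free: %s bytes", s.usedSize, s.freeSize);
    GC.minimize();     // release unused pools back to the OS
}

void main()
{
    auto buf = new ubyte[](1024 * 1024); // some GC allocation to observe
    dumpGcStats();
}
```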
Apr 17 2019
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2019-04-17 18:27, Adam D. Ruppe wrote:

 I am willing to recompile and run again, though I need to actually use 
 the programs, so if instrumenting makes them unusable it won't really 
 help. Is there a magic --DRT- argument perhaps? Or some trick with gdb 
 attaching to a running process I don't know?
Perhaps try some of these flags [1] and [2]. I tried to look for other `--DRT-` flags but unfortunately it's spread across the druntime code base and not handled in a single place.

There's no documentation and there's no generic `--DRT-help` flag. It's a mess.

[1] https://dlang.org/changelog/2.067.0.html#gc-options
[2] https://dlang.org/changelog/2.068.0.html#gc-api-profile

--
/Jacob Carlborg
Apr 17 2019
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 17 April 2019 at 19:07:46 UTC, Jacob Carlborg wrote:
 Perhaps try some of these flags [1] and [2].
oooh, those are very interesting too. What I was kinda hoping for is stats on which file and line of code was responsible for the most allocations; a detailed profile. But even so, this is an interesting gem.
 There's no documentation and there's no generic `--DRT-help` 
 flag. It's a mess.
Indeed, we need to fix that. But I'm too lazy to do it myself :(
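For the record, the option from those changelogs is passed at run time, no recompile needed (`myprog` standing in for any D binary whose runtime parses its args, which is the default):

```shell
# From the linked 2.067 changelog: runtime GC options go through --DRT-gcopt.
./myprog --DRT-gcopt=profile:1   # print a GC profiling summary at exit
```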
Apr 17 2019
parent reply Meta <jared771 gmail.com> writes:
On Wednesday, 17 April 2019 at 22:37:38 UTC, Adam D. Ruppe wrote:
 On Wednesday, 17 April 2019 at 19:07:46 UTC, Jacob Carlborg 
 wrote:
 Perhaps try some of these flags [1] and [2].
 oooh, those are very interesting too. What I was kinda hoping is it would have stats for which file and line of code was responsible for most allocations; a detailed profile. But even so, this is an interesting gem.
Not at all what you want, but it may be useful for figuring out where the leaks are. Have you tried compiling with -vgc? https://dlang.org/dmd-windows.html#switch-vgc
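Usage is just a compiler flag (the exact message wording may differ by compiler version):

```shell
dmd -vgc app.d
# emits one line per potential GC allocation site, roughly:
#   app.d(12): vgc: 'new' causes a GC allocation
```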
Apr 17 2019
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 17 April 2019 at 23:57:41 UTC, Meta wrote:
 Not at all what you want, but it may be useful for figuring out 
 where the leaks are. Have you tried compiling with -vgc?
That wouldn't help me much here because I know which parts are GC allocating, and I'm ok with that. What I want to know is which parts the GC is not collecting, for whatever reason. Those parts may be malloc'd too; it isn't necessarily the GC's fault.
Apr 17 2019
prev sibling next sibling parent reply ikod <geller.garry gmail.com> writes:
On Wednesday, 17 April 2019 at 16:27:02 UTC, Adam D. Ruppe wrote:
 D programs are a vital part of my home computer infrastructure. 
 I run some 60 D processes at almost any time.... and have 
 recently been running out of memory.
I usually run the program under valgrind in this case. It will not help you debug GC problems, but it will rule out leaked malloc'd memory.
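A sketch of that run (`myprog` is a placeholder; expect some noise from the conservative GC):

```shell
valgrind --leak-check=full ./myprog
# "definitely lost" blocks point at leaked malloc'd memory;
# GC-owned memory will mostly show up as "still reachable" noise.
```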
Apr 18 2019
parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Thursday, 18 April 2019 at 12:00:10 UTC, ikod wrote:
 On Wednesday, 17 April 2019 at 16:27:02 UTC, Adam D. Ruppe 
 wrote:
 D programs are a vital part of my home computer 
 infrastructure. I run some 60 D processes at almost any 
 time.... and have recently been running out of memory.
 I usually run program under valgrind in this case. Though it will not help you to debug GC problems, but will cut off memory leaked malloc-s.
Even valgrind --tool=massif ?
Apr 21 2019
parent ikod <geller.garry gmail.com> writes:
On Sunday, 21 April 2019 at 21:04:52 UTC, Patrick Schluter wrote:
 On Thursday, 18 April 2019 at 12:00:10 UTC, ikod wrote:
 On Wednesday, 17 April 2019 at 16:27:02 UTC, Adam D. Ruppe 
 wrote:
 D programs are a vital part of my home computer 
 infrastructure. I run some 60 D processes at almost any 
 time.... and have recently been running out of memory.
 I usually run program under valgrind in this case. Though it will not help you to debug GC problems, but will cut off memory leaked malloc-s.
Even valgrind tool=massif ?
I rarely use massif. It is a heap profiler, and my primary targets are usually memory leaks. Does your question mean that massif can help debug the GC?
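For reference, massif profiles heap growth over time rather than finding leaks, which is arguably a better fit for a slowly growing heap like the one described here:

```shell
valgrind --tool=massif ./myprog
ms_print massif.out.<pid>   # text graph of heap size over time
```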
Apr 22 2019
prev sibling parent reply Alex <AJ gmail.com> writes:
On Wednesday, 17 April 2019 at 16:27:02 UTC, Adam D. Ruppe wrote:
 D programs are a vital part of my home computer infrastructure. 
 I run some 60 D processes at almost any time.... and have 
 recently been running out of memory.

 Each individual process eats ~30-100 MB, but that times 60 = 
 trouble. They start off small, like 5 MB, and grow over weeks 
 or months, so it isn't something I can easily isolate in a 
 debugger after recompiling.

 I'm pretty sure this is the result of wasteful code somewhere 
 in my underlying libraries, but nothing is obviously jumping 
 out at me in the code. So I want to look at some of my existing 
 processes and see just what is responsible for this.

 I tried attaching to one and `call gc_stats()` in gdb... and it 
 segfaulted. Whoops.




 I am willing to recompile and run again, though I need to 
 actually use the programs, so if instrumenting makes them 
 unusable it won't really help. Is there a magic --DRT- argument 
 perhaps? Or some trick with gdb attaching to a running process 
 I don't know?

 What I'm hoping to do is get an idea of which line of code 
 allocates the most that isn't subsequently freed.
Curious, what are these programs?

You might hook into the GC and just write out stats; I believe there is a stats collector somewhere. I did this by replacing new and monitoring and calculating the allocations. It didn't help for any GC issues, but at least it made sure all my allocations were correct.

What you could do is something similar and just output stuff to a text file (written every so often). If the programs are not too large, you could use named allocations that can then be graphed individually (or use the file locations, __FILE__, etc).

Search and replace all new's and allocs with your custom ones and override the GC's. Should give you a good idea of what's going on.
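A rough sketch of what I mean - `trackedArray` and `reportAllocs` are made-up names, not a standard API:

```d
import std.stdio : writefln;

size_t[string] allocCounts; // bytes allocated, keyed by "file:line"

// Drop-in replacement for `new T[](n)` that records its call site.
T[] trackedArray(T)(size_t n, string file = __FILE__, size_t line = __LINE__)
{
    import std.conv : to;
    allocCounts[file ~ ":" ~ line.to!string] += n * T.sizeof;
    return new T[](n);
}

// In a real program this would be written to a log file every so often.
void reportAllocs()
{
    foreach (site, bytes; allocCounts)
        writefln("%s: %s bytes", site, bytes);
}

void main()
{
    auto buf = trackedArray!ubyte(1024);
    reportAllocs();
}
```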
Apr 18 2019
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 19 April 2019 at 02:58:34 UTC, Alex wrote:
 Curious, what are these programs?
A terminal emulator gui (like xterm), a detachable terminal emulator (like gnu screen), a slack client, an irc client, and a bunch of http servers including d doc search, a work program, and a personal utility.

All of them would show growing memory over time, some worse than others. You can see a lot of difference in them - gui programs, terminal programs, network server programs. But, I did write it all myself, so it could be a mistake I keep on making.

So far, I basically tracked down the terminal emulator things to inefficient scrollback storage. I made the structs smaller and limited the amount saved more than before and cut this by half. The ddoc search was keeping the index in memory; that's fixed, but it still shows growing usage over time. Of course, restarting that is trivial if need be, but still, I wanna make sure I am doing it right too - especially if it is one of my underlying libraries to blame.
 You might have hook in to the GC and just write out stats, I 
 believe there is a stats collector somewhere though.
Yes, indeed. I am starting to make serious progress now - mostly the fault is me storing excessive histories inefficiently. Should have been obvious in retrospect, but I didn't realize just how much overhead there was in my designs!
Apr 18 2019
parent reply Alex <AJ gmail.com> writes:
On Friday, 19 April 2019 at 03:27:04 UTC, Adam D. Ruppe wrote:
 On Friday, 19 April 2019 at 02:58:34 UTC, Alex wrote:
 Curious, what are these programs?
 A terminal emulator gui (like xterm), a detachable terminal emulator (like gnu screen), a slack client, an irc client, and a bunch of http servers including d doc search, a work program, and a personal utility.
Ok, nothing useful ;) I was thinking you might be doing stuff like running a security system that did computer vision, or some type of advanced house monitoring and control (voice activated doors or something) ;)

Could you not, as a quick fix, just reboot and automate restarting those? Maybe design an auto-restart which saves the state, shuts itself down, and then relaunches itself and loads the data? This could be done as a failsafe when memory consumption gets too high.
 All of them would show growing memory time, some worse than 
 others. You can see a lot of difference in them - gui programs, 
 terminal programs, network server programs. But, I did write it 
 all myself, so it could be a mistake I keep on making.

 So far, I basically tracked down the terminal emulator things 
 to being inefficient scrollback storage. I made the structs 
 smaller and limited the amount saved more than before and cut 
 this by half. The ddoc search was keeping the index in memory, 
 that's fixed, but it still shows growing usage over time. Of 
 course, restarting that is trivial if need be, but still, I 
 wanna make sure I am doing it right too - especially if it is 
 one of my underlying libraries to blame.
Gonna have to be one of those things you track down because it could be something as simple as the GC or something more serious.
 You might have hook in to the GC and just write out stats, I 
 believe there is a stats collector somewhere though.
 Yes, indeed. I am starting to make serious progress now - mostly the fault is me storing excessive histories inefficiently. Should have been obvious in retrospect, but I didn't realize just how much overhead there was in my designs!
D should have a very good memory statistics library built in (I guess it has something, with the switches)... since it should have no issues tracking memory usage. Every memory allocation must have a corresponding deallocation for stable programs. All allocations and deallocations have specific file locations or occur in the GC. I don't see why it would be difficult to monitor this stuff.

As I mentioned, I generally never use new, precisely so I can track this stuff myself and have a nice printout of memory usage when I need it, and even verify the net memory allocation is zero on program exit. I haven't messed with the GC, but I imagine it shouldn't be hard to add the info to it too.
Apr 19 2019
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 19 April 2019 at 17:30:50 UTC, Alex wrote:
 I was thinking you might be doing stuff like running a security 
 system that did computer vision, or some type of advanced house 
 monitoring and control(voice activated doors or something) ;)
LOL, now *that* would be totally useless! Of course, I can restart stuff, it is just a hassle, and besides, I also wanna make sure my libs aren't too badly written.
 D should have a very good memory statistics library built(I 
 guess it has something with the switches)... since it should 
 have no issues tracking memory usage.
Indeed, I am sorta starting to make a hacky add-on module that will provide some of that info. But I need to hook deallocations too and haven't gotten to that yet. It'll be cool to get a report out of the program at any time that tells me how much outstanding allocation each line is responsible for.
Apr 19 2019