
digitalmars.D - GC pauses

"Maxime Larose" <mlarose broadsoft.com> writes:
I want to code an application in D.

The application must be (almost) real-time and needs to have *no* GC pause
throughout. Well, maybe a pause of 20 ms is acceptable, but any more than
that is not. The application will have a *lot* of memory allocated. In the
range of 3-4 GB of allocated memory.

Will the GC pause?  I guess so, as there is no way in the world a GC will
not pause for a significant amount of time with 4GB of memory allocated
(significant being >20ms).

Assuming yes to the above, I want to:
- Disable GC. This is not difficult.
- I want to use language constructs and a library that don't rely on GC
(that doesn't throw garbage around).

So, two questions:
1. Are there language constructs that do rely on GC?  Ex: AA, dynamic
arrays, etc. E.g. will doing something like the following generate garbage:
char[] a = "a"; a ~= b;
2. Does phobos in general rely on GC? I guess the answer is yes. Even if
not, I suppose there is no contract saying that it will remain that way. So,
any new version could change the no to a yes.

Given the above, I plan on developing a library that will not rely on GC. I
believe there is none currently. However, before I reinvent the wheel, is
there such a library in existence today or any efforts towards one?

Also, please do correct me if I am wrong in any of my assumptions.

Thanks,

Max
May 02 2005
Sean Kelly <sean f4.ca> writes:
In article <d55mvu$25r4$1 digitaldaemon.com>, Maxime Larose says...
> I want to code an application in D.
>
> The application must be (almost) real-time and needs to have *no* GC pause
> throughout. Well, maybe a pause of 20 ms is acceptable, but any more than
> that is not. The application will have a *lot* of memory allocated. In the
> range of 3-4 GB of allocated memory.
>
> Will the GC pause?  I guess so, as there is no way in the world a GC will
> not pause for a significant amount of time with 4GB of memory allocated
> (significant being >20ms).
>
> Assuming yes to the above, I want to:
> - Disable GC. This is not difficult.
> - I want to use language constructs and a library that don't rely on GC
> (that doesn't throw garbage around).
>
> So, two questions:
> 1. Are there language constructs that do rely on GC?  Ex: AA, dynamic
> arrays, etc. E.g. will doing something like the following generate garbage:
> char[] a = "a"; a ~= b;

Yes. Language constructs are, for the most part, implemented in the guts of Phobos. Check phobos/internal.
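
A minimal sketch of how this can be checked mechanically, assuming a present-day D compiler rather than the 2005 one discussed in this thread: the `@nogc` attribute (a later language addition) makes the compiler reject array appends, associative-array insertion and `new`. The function name `sumFixed` is hypothetical and only shows the kind of code that passes the check.

```d
/// Hypothetical example: uses only a stack-resident static array,
/// so it compiles under @nogc.
int sumFixed() @nogc nothrow
{
    int[4] buf;              // static array: lives on the stack
    foreach (i, ref v; buf)
        v = cast(int) i + 1; // fills buf with 1, 2, 3, 4
    int total = 0;
    foreach (v; buf)
        total += v;
    return total;
}

// Each of these constructs is rejected inside @nogc code,
// because it may allocate from the GC heap.
static assert(!__traits(compiles, () @nogc { char[] a; a ~= 'b'; }));
static assert(!__traits(compiles, () @nogc { int[string] aa; aa["k"] = 1; }));
static assert(!__traits(compiles, () @nogc { auto o = new Object; }));
```

The `static assert` lines are evaluated at compile time, so the module only builds if the compiler really does refuse those three constructs in `@nogc` code.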

> 2. Does phobos in general rely on GC? I guess the answer is yes. Even if
> not, I suppose there is no contract saying that it will remain that way. So,
> any new version could change the no to a yes.

Yes it does.

> Given the above, I plan on developing a library that will not rely on GC. I
> believe there is none currently. However, before I reinvent the wheel, is
> there such a library in existence today or any efforts towards one?

There is none that I'm aware of. If you're interested, Ares is intended to ultimately be a replacement standard library for D. It's still in its infancy, but I've already stripped all of the 'std' out of Phobos, so it's as minimal as possible. I have some patches that still need to be applied, but the version that's available works just fine under Windows (the patches are all Linux-related). Check http://www.dsource.org or download it directly at:

http://home.f4.ca/sean/d/ares.zip

It's worth noting that thread pausing can be prevented in one of two ways. The first is to avoid calling code that uses GC memory. The second is to launch your realtime code in a thread that is not visible to the GC. The easiest way to do this would be to create a thread using _beginthreadex or pthread_create, rather than using std.Thread. The more difficult method would be to modify std.Thread to allow threads to be 'detached' from the allThreads collection, so the GC has no way to access them during a GC run. Both methods have obvious issues if such threads are using GCed memory.

Sean
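
The pthread_create route might look roughly like the sketch below. It assumes a POSIX system and the present-day druntime binding `core.sys.posix.pthread` (not the 2005 Phobos layout); the names `realtimeWork` and `runOutsideGC` are hypothetical. The thread is never registered with the D runtime, so the collector does not know about it, and for that reason the thread body touches only memory that the GC does not own.

```d
import core.sys.posix.pthread;

/// Body of the raw thread. It is never registered with the D runtime,
/// so it must not allocate from, or hold the only reference to, GC memory.
extern (C) void* realtimeWork(void* arg) nothrow @nogc
{
    auto counter = cast(int*) arg;
    foreach (i; 0 .. 1000)
        *counter += 1;
    return null;
}

/// Starts realtimeWork on a raw POSIX thread, waits for it to finish and
/// returns the counter value, or -1 if the thread could not be created.
int runOutsideGC()
{
    int counter = 0; // on this function's stack, not in GC memory
    pthread_t tid;
    if (pthread_create(&tid, null, &realtimeWork, &counter) != 0)
        return -1;
    pthread_join(tid, null);
    return counter;
}
```

Marking the thread body `@nogc` is a design choice of this sketch: it lets the compiler enforce the "no GC memory" rule inside the thread instead of leaving it to discipline.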
May 02 2005
Kevin Bealer <Kevin_member pathlink.com> writes:
In article <d55mvu$25r4$1 digitaldaemon.com>, Maxime Larose says...
> I want to code an application in D.
>
> The application must be (almost) real-time and needs to have *no* GC pause
> throughout. Well, maybe a pause of 20 ms is acceptable, but any more than
> that is not. The application will have a *lot* of memory allocated. In the
> range of 3-4 GB of allocated memory.

(I would think using 3-4 GB of memory is going to be tricky on Linux or Windows regardless of memory strategies. The applications I've been working with lately in C++ have trouble over 2.5 GB or so, but they are doing large mmaps() so I'm not positive here..)

> Will the GC pause?  I guess so, as there is no way in the world a GC will
> not pause for a significant amount of time with 4GB of memory allocated
> (significant being >20ms).
>
> Assuming yes to the above, I want to:
> - Disable GC. This is not difficult.
> - I want to use language constructs and a library that don't rely on GC
> (that doesn't throw garbage around).
>
> So, two questions:
> 1. Are there language constructs that do rely on GC?  Ex: AA, dynamic
> arrays, etc. E.g. will doing something like the following generate garbage:
> char[] a = "a"; a ~= b;

Yes, I think it probably always will, and there is probably no helping it. If you could guarantee that this was only done a little you might even disable the GC and just ignore the small leaks. Or you could start out with big D arrays for each type:

    char[] x;
    x.length = 16*(1 << 20); // 16 MB for strings

Then divide it up; instead of allocating, use slicing of the existing memory. Track it with your own "free lists", which in turn need to be fixed-size arrays (or integrated into the objects). Not impossible per se. If you need more, you can allocate another array with malloc() or create a blank file and mmap() it.
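
The slicing-plus-free-list idea can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the thread: `BlockPool` and its members are hypothetical names, all blocks have one fixed size, and the free list is kept in a fixed-size index array allocated once alongside the buffer.

```d
/// Hypothetical fixed-size block pool carved out of one big buffer.
struct BlockPool
{
    private enum size_t none = size_t.max; // marks "no free block"
    private ubyte[] buffer;    // the single up-front allocation
    private size_t blockSize;  // size of each block in bytes
    private size_t[] nextFree; // nextFree[i]: free block that follows block i
    private size_t head;       // first free block, or `none`

    this(size_t blockSize, size_t blockCount)
    {
        this.blockSize = blockSize;
        buffer = new ubyte[blockSize * blockCount]; // allocated once
        nextFree = new size_t[blockCount];          // fixed-size free list
        foreach (i; 0 .. blockCount)
            nextFree[i] = (i + 1 < blockCount) ? i + 1 : none;
        head = blockCount > 0 ? 0 : none;
    }

    /// Hands out a slice of the buffer, or null if the pool is exhausted.
    ubyte[] allocate()
    {
        if (head == none)
            return null;
        immutable i = head;
        head = nextFree[i];
        return buffer[i * blockSize .. (i + 1) * blockSize];
    }

    /// Puts a block obtained from allocate() back on the free list.
    void release(ubyte[] block)
    {
        immutable i = (block.ptr - buffer.ptr) / blockSize;
        nextFree[i] = head;
        head = i;
    }
}

unittest
{
    auto pool = BlockPool(64, 2);
    auto a = pool.allocate();
    auto b = pool.allocate();
    assert(a.length == 64 && b.length == 64);
    assert(pool.allocate() is null);      // exhausted: no hidden allocation
    pool.release(a);
    assert(pool.allocate().ptr is a.ptr); // the released block is reused
}
```

After construction, neither `allocate` nor `release` touches the GC; running out of blocks is reported to the caller instead of triggering a new allocation.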

> 2. Does phobos in general rely on GC? I guess the answer is yes. Even if
> not, I suppose there is no contract saying that it will remain that way. So,
> any new version could change the no to a yes.

I would just guess that something like 80% of phobos functions would need to change to accept some kind of memory allocation parameter(s), or at least to have precise semantics. Many of the functions allocate memory conditionally, i.e. sometimes allocate, sometimes not, without indicating which.

> Given the above, I plan on developing a library that will not rely on GC. I
> believe there is none currently. However, before I reinvent the wheel, is
> there such a library in existence today or any efforts towards one?

Primitives and structs would probably be doable. Classes are probably completely out. The best you can do with classes (I think) is to keep a free list for each class and use a reset() method or something to re-construct objects.
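
A per-class free list with a reset() method could be sketched like this. `Packet` and `PacketPool` are hypothetical names chosen for illustration, and the link field is stored inside the object itself, which is one way of "integrating the free list into the objects".

```d
/// Hypothetical example class. reset() plays the role of a constructor
/// for a recycled instance.
class Packet
{
    int id;
    Packet next; // intrusive link, used only while the object is pooled

    void reset(int id)
    {
        this.id = id;
        next = null;
    }
}

/// Hypothetical per-class free list.
struct PacketPool
{
    private Packet freeHead; // first pooled object, or null

    /// Creates `count` objects up front so that later acquire() calls
    /// do not need to allocate.
    void preallocate(size_t count)
    {
        foreach (i; 0 .. count)
            release(new Packet);
    }

    /// Takes an object off the free list and re-initializes it.
    Packet acquire(int id)
    {
        Packet p = freeHead;
        if (p is null)
            p = new Packet; // pool empty: falls back to a GC allocation
        else
            freeHead = p.next;
        p.reset(id);
        return p;
    }

    /// Returns an object to the free list for later reuse.
    void release(Packet p)
    {
        p.next = freeHead;
        freeHead = p;
    }
}
```

The fallback to `new` in `acquire` is a choice of this sketch; a stricter variant would return null when the pool is empty, so that the real-time path can never allocate.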

> Also, please do correct me if I am wrong in any of my assumptions.

I think there are GCs for real-time code that pause for very short, maybe even configurable, amounts of time. If you have *a lot* of code to write you might be able to write your own GC or adapt an existing "RT" one.

Slab allocation with a free list of slabs might be a good step: if you could get phobos functions to allocate from a specified slab, you would not need to re-engineer them. For a regular expression, for example, if all of its 'scratch space' objects were in the slab you gave it, you can just wipe the entire slab. This is safe and lets you keep the existing algorithms.

If you use slabs, you will need to dup the data you want to keep into another slab before wiping the memory. This is how a copying garbage collector typically works: it does a deep copy of the in-use data from one "arena" into another. You can do this if you have a pretty good idea how you used the slab, which pieces are still interesting, etc.

(All in all, I don't think the parameters you describe are impossible, but your specs seem quite ambitious if it is for a 32-bit machine.)

One assumption you might revisit: single application. If you split the code into two processes, one that manages the "real time" part (IO or graphics or whatever) and does very little else, and one that is free to pause for a few seconds, you might make your job much easier. The no-pausing constraint is a very rigorous requirement, and my instinct is to wall it off where it will have a reduced impact, if your application is amenable to such a separation.

I saw the post from Sean about threads. I'm not familiar with the technique he describes, but it may be better than separate processes. Still, separate processes might have other advantages, depending on how far you went with it (i.e. each process can address its own proprietary 2-4 GB of memory).

Kevin
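
The slab idea described above (allocate scratch data from a slab, copy what should survive into another slab, then wipe the first) can be sketched as below. `Slab` and `keep` are hypothetical names, the allocator is a plain bump pointer, and alignment is ignored for brevity, so this is an illustration rather than a production allocator.

```d
/// Hypothetical bump-pointer slab. Alignment is ignored for brevity.
struct Slab
{
    private ubyte[] memory; // backing store, allocated once
    private size_t used;    // bytes handed out so far

    this(size_t capacity)
    {
        memory = new ubyte[capacity];
    }

    /// Hands out the next `n` bytes, or null if the slab is full.
    ubyte[] allocate(size_t n)
    {
        if (n > memory.length - used)
            return null;
        auto block = memory[used .. used + n];
        used += n;
        return block;
    }

    /// Frees everything allocated from this slab in one step.
    void wipe()
    {
        used = 0;
    }
}

/// Copies data worth keeping into `target` before its source slab is wiped.
/// Returns null if `target` has no room left.
ubyte[] keep(ref Slab target, const(ubyte)[] data)
{
    auto copy = target.allocate(data.length);
    if (copy is null)
        return null;
    copy[] = data[];
    return copy;
}
```

Wiping costs the same no matter how many objects were allocated, which is what makes the pattern attractive for scratch space. The caller remains responsible for not using slices that point into a wiped slab.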
May 02 2005
"Maxime Larose" <mlarose broadsoft.com> writes:
(Sean, Kevin,)

Thanks for the in-depth comments.  At this point, what I've decided is to
write some "real" code and test what the actual performance of the
GC is. If it is barely unacceptable, I will go on and plan to write an RT
GC along the way. If it is really unacceptable, I'll probably think
about rewriting the GC right away to see if what I want to do is even
feasible. (I *hate* C++, that's my problem. I'm so biased against it that
just the thought of doing a pet project in C++ makes me feel bad... ;)

The "problem" with D is that everything is allocated on the heap. Even small
arrays or temporary classes. Now _that_ creates garbage for the GC. At
least, it is feasible to change this default, even if the syntax for doing
so is extremely clumsy.


Max
May 04 2005
Sean Kelly <sean f4.ca> writes:
In article <d5ap38$26se$1 digitaldaemon.com>, Maxime Larose says...
> The "problem" with D is that everything is allocated on the heap. Even small
> arrays or temporary classes. Now _that_ creates garbage for the GC. At
> least, it is feasible to change this default, even if the syntax for doing
> so is extremly clumsy.

This bothers me as well. But if you're working with DMD (as opposed to GDC) then you might want to check out alloca for stack allocating classes. There's some discussion of it in the "memory management" section of the D spec.

Sean
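
A rough sketch of stack-allocating a class instance with alloca is given below. The spec section mentioned above is not reproduced here, so this is an assumption-laden illustration using present-day D facilities (`core.stdc.stdlib.alloca` and `core.lifetime.emplace`) rather than the mechanism that existed in 2005. `Point` and `sumOnStack` are hypothetical names.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : alloca;

/// Hypothetical example class.
class Point
{
    int x, y;

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    int sum() const
    {
        return x + y;
    }
}

/// Builds a temporary Point in stack memory instead of on the GC heap.
int sumOnStack(int x, int y)
{
    enum size = __traits(classInstanceSize, Point);
    void* raw = alloca(size);                      // freed when the function returns
    Point p = emplace!Point(raw[0 .. size], x, y); // construct in place
    immutable result = p.sum();
    destroy(p);                                    // run the destructor explicitly
    return result;
}
```

The object must not outlive the function that called alloca: returning `p`, or storing it anywhere that survives the call, would leave a dangling reference.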
May 04 2005