digitalmars.D.announce - =?ISO-8859-1?Q?DConf_2013_Day_2_Talk_5=3A_A

digitalmars.D.announce - =?ISO-8859-1?Q?DConf_2013_Day_2_Talk_5=3A_A_Precis?=

Andrei Alexandrescu (8/8) Jun 05 2013 Reddit:

bearophile (5/6) Jun 05 2013 Is this useful to make the GC precise regarding the stack too?

SomeDude (4/10) Jun 05 2013 Here is the blog post deadalnix is referring about at the very

bearophile (4/13) Jun 05 2013 I don't understand how your answer is related to my question...

Adam D. Ruppe (5/6) Jun 05 2013 At least in terms of memory leaks, preciseness is solving a

Rainer Schuetze (22/27) Jun 05 2013 I was imagining something similar too. The part about multi-threading in...

Diggory (18/18) Jun 06 2013 Interesting talk. It seems there are a few features D could

Rainer Schuetze (9/26) Jun 08 2013 There are also D functions annotated extern(C) to subvert the module

Benjamin Thaut (25/39) Jun 06 2013 I think you don't really need percise scanning which is thread safe. If

Rainer Schuetze (14/56) Jun 08 2013 Considering "string" is "immutable(char)[]", would you want to allocate

Benjamin Thaut (23/85) Jun 08 2013 Not for every pointer assignment, only for pointers that are on the

Nick Sabalausky (4/7) Jun 05 2013 Torrents up, and the other links:
nazriel (4/15) Jun 06 2013 Very cool presentation.
Juan Manuel Cabo (8/22) Jun 12 2013 Loved this talk.
Juan Manuel Cabo (8/22) Jun 12 2013 Loved this talk.

Rainer Schuetze (10/30) Jun 12 2013 A struct allocated on the heap does not need the type info in memory, it...

bearophile (12/13) Jun 27 2013 Another thing to keep in account while designing a more precise

Rainer Schuetze (4/14) Jul 01 2013 In the proposed implementation, the gc_emplace function can be used to

bearophile (28/32) Jul 01 2013 Thank you for the answer.

Rainer Schuetze (17/47) Jul 01 2013 Yes, that would be Algebraic's responsibility. An optimization is to

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Reddit: 
http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

Hackernews: https://news.ycombinator.com/item?id=5825320

Twitter: https://twitter.com/D_Programming/status/342269600689430529

Facebook: https://www.facebook.com/dlang.org/posts/651801198166898

Youtube: http://youtube.com/watch?v=LQY1m_eT37c

Please drive discussions on the social channels, they help D a lot.


Andrei

Jun 05 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

Is this useful to make the GC precise regarding the stack too?

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5570

Bye,
bearophile

Jun 05 2013

"SomeDude" <lovelydear mailmetrash.com> writes:

On Wednesday, 5 June 2013 at 14:14:45 UTC, bearophile wrote:
 Andrei Alexandrescu:

 http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

 Is this useful to make the GC precise regarding the stack too?

 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5570

 Bye,
 bearophile

Here is the blog post deadalnix is referring about at the very 
end of the video:
http://www.deadalnix.me/2012/03/05/impact-of-64bits-vs-32bits-when-using-non-precise-gc/

Jun 05 2013

"bearophile" <bearophileHUGS lycos.com> writes:

SomeDude:

 Is this useful to make the GC precise regarding the stack too?

 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5570

 Bye,
 bearophile

 Here is the blog post deadalnix is referring about at the very 
 end of the video:
 http://www.deadalnix.me/2012/03/05/impact-of-64bits-vs-32bits-when-using-non-precise-gc/

I don't understand how your answer is related to my question...

Bye,
bearophile

Jun 05 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Wednesday, 5 June 2013 at 20:14:32 UTC, bearophile wrote:
 I don't understand how your answer is related to my question...

At least in terms of memory leaks, preciseness is solving a 
problem that doesn't exist on 64 bit. I don't know if precise 
stack scanning does anything on 32 bit though and idk about the 
gc's speed difference in any case.

Jun 05 2013

Rainer Schuetze <r.sagitario gmx.de> writes:

On 05.06.2013 16:14, bearophile wrote:
 Andrei Alexandrescu:

 http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

 Is this useful to make the GC precise regarding the stack too?

 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5570

I was imagining something similar too. The part about multi-threading in 
the paper isn't too convincing, though. Especially the need for GC safe 
points in tight loops to block a thread during collection will not get 
many friends in the D community.

As parameters to function are covered by the callee, the approach won't 
work if you pass a temporary reference to a C function, which does a 
callback into code that has a safe point. The temporary on the stack 
will not be seen during a collection. Example:

/// D
void main() { c_func(toStringz(to!string(3))); }
bool gc_running() { gc_check(); return true; }

/// C
void c_func(const char* s) { if(gc_running()) printf("s=%s\n", s); }

My idea is to generate a description of the stack with information where 
pointers might be found within the functions stack space. If this 
includes parameters to called functions you might not even have problems 
with pausing a thread inside a C callback. This is slightly less precise 
because some stack fields might be used by both pointers and 
non-pointers. The runtime overhead would be similar to the 
implementation in the paper: insert/remove descriptors from a single 
linked list.

Jun 05 2013

"Diggory" <diggsey googlemail.com> writes:

Interesting talk. It seems there are a few features D could 
provide which would make for some more powerful GCs:

- Hook for when generating code which reads/writes a pointer 
allowing GC to put in read/write barriers where necessary.

- Hook for union assignment so that the GC could emplace the 
correct type bits

- Hook for passing pointers to extern(XXX) functions so that the 
GC can pin the arguments allowing a compacting GC

- Hook for function call/return so that stack type-info can be 
stored. There's actually another way to do this which would have 
less overhead - at compile time the compiler could write the 
address range of each function along with a bitmap for possible 
pointers in that function to some known location. When the GC 
runs it could unwind the stack and look each frame pointer up in 
the list of functions to find the bit pattern to use. When 
calling a function that does some low-level stuff and might 
prevent the unwinding from working correctly it would be possible 
to switch back to conservative mode until the function returns.

Jun 06 2013

Rainer Schuetze <r.sagitario gmx.de> writes:

On 06.06.2013 21:42, Diggory wrote:
 Interesting talk. It seems there are a few features D could provide
 which would make for some more powerful GCs:

 - Hook for when generating code which reads/writes a pointer allowing GC
 to put in read/write barriers where necessary.

 - Hook for union assignment so that the GC could emplace the correct
 type bits

 - Hook for passing pointers to extern(XXX) functions so that the GC can
 pin the arguments allowing a compacting GC

There are also D functions annotated extern(C) to subvert the module 
system. It might be a bit expensive to pin any pointer passed to such 
function.

 - Hook for function call/return so that stack type-info can be stored.
 There's actually another way to do this which would have less overhead -
 at compile time the compiler could write the address range of each
 function along with a bitmap for possible pointers in that function to
 some known location. When the GC runs it could unwind the stack and look
 each frame pointer up in the list of functions to find the bit pattern
 to use. When calling a function that does some low-level stuff and might
 prevent the unwinding from working correctly it would be possible to
 switch back to conservative mode until the function returns.

I was considering something like that too (AFAICT exception handling on 
Win64 works similar), but the problem is to actually properly unwind the 
stack. If you have called some C function with a callback into your D 
code, stack unwinding might not be able to find the correct stack frames 
above the C function, so that you don't scan the stack above that.

Jun 08 2013

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 06.06.2013 08:28, schrieb Rainer Schuetze:
On 05.06.2013 16:14, bearophile wrote:
Andrei Alexandrescu:

http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

Is this useful to make the GC precise regarding the stack too?

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5570

I was imagining something similar too. The part about multi-threading in
the paper isn't too convincing, though. Especially the need for GC safe
points in tight loops to block a thread during collection will not get
many friends in the D community.

I think you don't really need percise scanning which is thread safe. If
you have one pool per thread, and you can scan the thread local pools
percisley within the thread that would be enough. Because you would then
be able to do generational garbage collection for the thread local
pools. If you have to scan one of (or the) global pool, percise scanning
of the stacks is not really needed, the old impercises scanning is
sufficient, you just have to pin those memory blocks you might think are
referenced from stack memory.

But to be able to actually do thread local pools a bit more then write
barriers would be needed. For each of the following situations a call
into druntime would be needed.

1) Creating a shared or immutable object
2) Casting to shared or immutable
3) Assigning a shared, immutable, __gshared global or static variable

If you have these and you are able to percisley scan the stack for the
current thread only you will then be able to move all of the referenced
memory from the thread local pool to the global pool if any of the above
calls happen.

This would mean that most of the time only thread local garbage
collection would be neccessary and you won't have to stop other threads
from doing whatever they are doing. Only in rare cases it should be
necessary to scan the global pool.

Kind Regards
Benjamin Thaut

Jun 06 2013

Rainer Schuetze <r.sagitario gmx.de> writes:

On 06.06.2013 22:27, Benjamin Thaut wrote:
Am 06.06.2013 08:28, schrieb Rainer Schuetze:
On 05.06.2013 16:14, bearophile wrote:
Andrei Alexandrescu:

http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

Is this useful to make the GC precise regarding the stack too?

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5570

Wouldn't that mean a write-barrier for every pointer assignment?

But to be able to actually do thread local pools a bit more then write
barriers would be needed. For each of the following situations a call
into druntime would be needed.

1) Creating a shared or immutable object
2) Casting to shared or immutable
3) Assigning a shared, immutable, __gshared global or static variable

Considering "string" is "immutable(char)[]", would you want to allocate
all temporary strings on the global heap? Also, I don't like to have
possible expensive operations for casting.

I agree that a thread local pool can give good performance improvements.
But as long as you still have a global heap (which you probably cannot
eliminate), it's not a simplification to have thread local garbage
collections in addition.

The problem to implement it is that shared semantics are still pretty
undefined. AFAICT "shared" is only a type modifier that has different
conversion rules than non-shared types. There are no runtime guarantees
with "shared", even less with the absence of shared, and even if they
exist, __gshared and casting are meant to subvert them.

Jun 08 2013

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 08.06.2013 09:50, schrieb Rainer Schuetze:
On 06.06.2013 22:27, Benjamin Thaut wrote:
Am 06.06.2013 08:28, schrieb Rainer Schuetze:
On 05.06.2013 16:14, bearophile wrote:
Andrei Alexandrescu:

http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

Is this useful to make the GC precise regarding the stack too?

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5570

Wouldn't that mean a write-barrier for every pointer assignment?

Not for every pointer assignment, only for pointers that are on the
heap. Also most of these write barries could be skipped with really
cheap tests most of the time.

But to be able to actually do thread local pools a bit more then write
barriers would be needed. For each of the following situations a call
into druntime would be needed.

1) Creating a shared or immutable object
2) Casting to shared or immutable
3) Assigning a shared, immutable, __gshared global or static variable

Considering "string" is "immutable(char)[]", would you want to allocate
all temporary strings on the global heap? Also, I don't like to have
possible expensive operations for casting.

Good point, didn't think about that yet. In Order to improve the
performance of the GC we have to sacrifice performance in other places
of the language in my opinion. Given the fact that the current GC is 3
times slower then manual memory mangement (in my test case). It should
be easly possible to take a performance hit here and there but end up
beeing faster in the end anyway. The current GC is especially bad with
short lived allocations and I don't think this is going to get any
better when only using a Mark & Sweep. The D community has to see that
to get a state of the art GC some perfomance sacrifces have to be done.
If those are not wanted we can go back to manual memory management right
away, in my opinion. I recently also had a long talk with Claus
Gittinger (created the SmalltalkX VM) about the topic. He also had the
same opinion on the topic. He also stated that it is going to be very
hard if not impossible to implement a state of the Art GC into a
language like D which does not have any restrictions in the language
which help with that Task.

No one said that it is a simplification. Its always going to get more
complicated if you want a better GC.

Then maybe the semantics of shared should be defined with a GC in mind.

Jun 08 2013

Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:

On Wed, 05 Jun 2013 09:23:30 -0400
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Reddit: 
 http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/
 

Torrents up, and the other links:
http://semitwist.com/download/misc/dconf2013/

Jun 05 2013

"nazriel" <spam dzfl.pl> writes:

On Wednesday, 5 June 2013 at 13:23:30 UTC, Andrei Alexandrescu 
wrote:
 Reddit: 
 http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

 Hackernews: https://news.ycombinator.com/item?id=5825320

 Twitter: 
 https://twitter.com/D_Programming/status/342269600689430529

 Facebook: 
 https://www.facebook.com/dlang.org/posts/651801198166898

 Youtube: http://youtube.com/watch?v=LQY1m_eT37c

 Please drive discussions on the social channels, they help D a 
 lot.


 Andrei

Very cool presentation.
Thank you Rainer.

Jun 06 2013

Juan Manuel Cabo <juanmanuel.cabo gmail.com> writes:

On 06/05/2013 10:23 AM, Andrei Alexandrescu wrote:
 Reddit: http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/
 
 Hackernews: https://news.ycombinator.com/item?id=5825320
 
 Twitter: https://twitter.com/D_Programming/status/342269600689430529
 
 Facebook: https://www.facebook.com/dlang.org/posts/651801198166898
 
 Youtube: http://youtube.com/watch?v=LQY1m_eT37c
 
 Please drive discussions on the social channels, they help D a lot.
 
 
 Andrei


Loved this talk.

Would struct have an extra field in memory pointing to the
needed type info? If all of this is implemented, will this
mean that an array of structs will not have their data contiguous
in memory?

Thanks for the talk!

--jm

Jun 12 2013

Juan Manuel Cabo <juanmanuel.cabo gmail.com> writes:

On 06/05/2013 10:23 AM, Andrei Alexandrescu wrote:
 Reddit: http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/
 
 Hackernews: https://news.ycombinator.com/item?id=5825320
 
 Twitter: https://twitter.com/D_Programming/status/342269600689430529
 
 Facebook: https://www.facebook.com/dlang.org/posts/651801198166898
 
 Youtube: http://youtube.com/watch?v=LQY1m_eT37c
 
 Please drive discussions on the social channels, they help D a lot.
 
 
 Andrei


Loved this talk.

Would struct have an extra field in memory pointing to the
needed type info? If all of this is implemented, will this
mean that an array of structs will not have their data contiguous
in memory?

Thanks for the talk!

--jm

Jun 12 2013

Rainer Schuetze <r.sagitario gmx.de> writes:

On 12.06.2013 15:38, Juan Manuel Cabo wrote:
 On 06/05/2013 10:23 AM, Andrei Alexandrescu wrote:
 Reddit: http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

 Hackernews: https://news.ycombinator.com/item?id=5825320

 Twitter: https://twitter.com/D_Programming/status/342269600689430529

 Facebook: https://www.facebook.com/dlang.org/posts/651801198166898

 Youtube: http://youtube.com/watch?v=LQY1m_eT37c

 Please drive discussions on the social channels, they help D a lot.


 Andrei


 Loved this talk.

Thanks.

 Would struct have an extra field in memory pointing to the
 needed type info? If all of this is implemented, will this
 mean that an array of structs will not have their data contiguous
 in memory?

A struct allocated on the heap does not need the type info in memory, it 
is passed with the call invoked by "new" and it is used to supply the 
bitmap of pointers that is copied to the appropriate memory.

Due to the implementation details of arrays, it is a bit different for 
them: The memory layout changes slightly depending on the size of the 
array, so it is not the allocation that creates the pointer bitmap, but 
the array functions call "emplace" to fill the bitmap from an array type 
info. The memory layout of the array itself is unchanged.

Jun 12 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

Another thing to keep in account while designing a more precise 
garbage collection is a possible special casing for Algebraic 
(and Variant, and more generally for some standardized kind of 
tagged union):

http://d.puremagic.com/issues/show_bug.cgi?id=5057

In an Algebraic there is run-time information for the GC to 
decide if inside it there are pointers to follow or not. It's 
mostly a matter of letting the GC recognize and use such 
information.

Bye,
bearophile

Jun 27 2013

Rainer Schuetze <r.sagitario gmx.de> writes:

On 27.06.2013 19:33, bearophile wrote:
 Andrei Alexandrescu:

 http://www.reddit.com/r/programming/comments/1fpw2r/dconf_2013_day_2_talk_5_a_precise_garbage/

 Another thing to keep in account while designing a more precise garbage
 collection is a possible special casing for Algebraic (and Variant, and
 more generally for some standardized kind of tagged union):

 http://d.puremagic.com/issues/show_bug.cgi?id=5057

 In an Algebraic there is run-time information for the GC to decide if
 inside it there are pointers to follow or not. It's mostly a matter of
 letting the GC recognize and use such information.

In the proposed implementation, the gc_emplace function can be used to 
pass this information to the GC. This would need to be called whenever 
the location of pointers changes, so it's not high-performance.

Jul 01 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Rainer Schuetze:

 In the proposed implementation, the gc_emplace function can be 
 used to pass this information to the GC. This would need to be 
 called whenever the location of pointers changes, so it's not 
 high-performance.

Thank you for the answer.

Let me see if I understand what you are saying. If I have an 
usage of Algebraic like this, at line 4 x contains no GC-managed 
pointers, while at line 6 it contains a pointer to the array data:


import std.variant: Algebraic;
void main() {
     alias T = Algebraic!(ulong, int[]);
     T x = 1UL;  // line 4
     auto items = [1, 2];
     x = items; // line 6
     x = 2UL;   // line 7
     x = items; // line 8
}


You say that every time you change the _type_ of the contents of 
x you have to call gc_emplace.

But isn't the D GC called only when an allocation occurs? So 
what's the point of calling gc_emplace (at lines 7 and 8) if no 
garbage collection happens? Isn't it possible for the GC to use 
(and update) this information lazily, only when a collection 
occurs?

(Extra note: Algebraic is meant for functional-style programming. 
In such kind of programming data is mostly immutable. This means 
you don't change the contents and the type of an algebraic 
variable once it is assigned. So usually you call gc_emplace only 
once for one of such variables).

Bye,
bearophile

Jul 01 2013

Rainer Schuetze <r.sagitario gmx.de> writes:

On 01.07.2013 12:30, bearophile wrote:
 Rainer Schuetze:

 In the proposed implementation, the gc_emplace function can be used to
 pass this information to the GC. This would need to be called whenever
 the location of pointers changes, so it's not high-performance.

 Thank you for the answer.

 Let me see if I understand what you are saying. If I have an usage of
 Algebraic like this, at line 4 x contains no GC-managed pointers, while
 at line 6 it contains a pointer to the array data:


 import std.variant: Algebraic;
 void main() {
      alias T = Algebraic!(ulong, int[]);
      T x = 1UL;  // line 4
      auto items = [1, 2];
      x = items; // line 6
      x = 2UL;   // line 7
      x = items; // line 8
 }


 You say that every time you change the _type_ of the contents of x you
 have to call gc_emplace.

Yes, that would be Algebraic's responsibility. An optimization is to 
only do it if the respective pointer info changes, but that would tie it 
a bit more to the GC implementation.

In your example, x is a stack variable, so as long as there is no 
precise scanning of the stack, that actually generates a lot of useless 
interaction with the GC.

 But isn't the D GC called only when an allocation occurs? So what's the
 point of calling gc_emplace (at lines 7 and 8) if no garbage collection
 happens? Isn't it possible for the GC to use (and update) this
 information lazily, only when a collection occurs?

I was thinking about allowing some callback to do the scanning, but that 
has some serious drawbacks:

- it is not easily composable if structs like Algebraic are fields of a 
class/struct. You would not want to call a function for every field.

- having to call a (virtual) scanning function for every allocated 
object is probably also a performance killer, though I haven't tried it yet.

- it does not allow very unpredictable usages of memory using std.emplace.

Doing just the pointer bitmap update lazily is an interesting idea, 
though I'm not sure if it is actually less work to cache this info than 
to update the bitmap immediately.

 (Extra note: Algebraic is meant for functional-style programming. In
 such kind of programming data is mostly immutable. This means you don't
 change the contents and the type of an algebraic variable once it is
 assigned. So usually you call gc_emplace only once for one of such
 variables).

 Bye,
 bearophile

Jul 01 2013

D Programming

C/C++ Programming

Other

digitalmars.D.announce - =?ISO-8859-1?Q?DConf_2013_Day_2_Talk_5=3A_A_Precis?=