
digitalmars.D - D on next-gen consoles and for game development

reply "Brad Anderson" <eco gnuk.net> writes:
While there hasn't been anything official, I think it's a safe 
bet to say that D is being used for a major title, Remedy's 
Quantum Break, featured prominently during the announcement of 
Xbox One. Quantum Break doesn't come out until 2014 so the 
timeline seems about right (Remedy doesn't appear to work on more 
than one game at a time from what I can tell).


That's pretty huge news.


Now I'm wondering what can be done to foster this newly acquired 
credibility in games.  By far the biggest issue I hear about when 
it comes to people working on games in D is the garbage 
collector.  You can work around the GC without too much 
difficulty as Manu's experience shared in his DConf talk shows 
but a lot of people new to D don't know how to do that.  We could 
also use some tools and guides to help people identify and avoid 
GC use when necessary.

@nogc comes to mind (I believe Andrei mentioned it during one of 
the talks released). [1][2]

Johannes Pfau's work-in-progress -vgc command line option [3] 
would be another great tool to help people identify GC 
allocations.  This or something similar could also be used to 
document throughout Phobos where GC allocations can happen (and 
help eliminate them where it makes sense to).
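To make the problem concrete, here is the kind of innocent-looking code such a tool needs to flag (an illustrative sketch; the function names are invented, and the exact -vgc output format is still being worked out in the pull request above):

// Each of these lines quietly allocates from the GC heap --
// exactly what a -vgc-style switch should point out.
int delegate(int) makeAdder(int base)
{
    return (int i) => base + i;  // escaping closure: context goes on the GC heap
}

void hidden(string name)
{
    int[] a = [1, 2, 3];         // array literal: GC allocation
    a ~= 4;                      // appending may reallocate via the GC
    string s = "hello, " ~ name; // runtime concatenation allocates
}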

There was a lot of interesting stuff in Benjamin Thaut's article 
about GC versus manual memory management in a game [4] and the 
discussion about it on the forums [5].  A lot of this collective 
knowledge built up on manual memory management techniques 
specific to D should probably be formalized and added to the 
official documentation.  There is a Memory Management [6] page in 
the documentation but it appears to be rather dated at this point 
and not particularly applicable to modern D2 (no mention of 
emplace or scoped and it talks about using delete and scope 
classes).

Game development is one place D can really get a foothold but all 
too often the GC is held over D's head because people taking 
their first look at D don't know how to avoid using it and often 
don't realize you can avoid using it entirely. This is easily the 
most common issue I see raised in the #d IRC channel by newcomers 
to D with a C or C++ background (many of whom are interested in 
game dev but concerned the GC will kill their game's performance).


1: http://d.puremagic.com/issues/show_bug.cgi?id=5219
2: http://wiki.dlang.org/DIP18
3: https://github.com/D-Programming-Language/dmd/pull/1886
4: http://3d.benjamin-thaut.de/?p=20#more-20
5: http://forum.dlang.org/post/k27bh7$t7f$1 digitalmars.com
6: http://dlang.org/memory.html
May 23 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 05/23/2013 08:13 PM, Brad Anderson wrote:
 Now I'm wondering what can be done to foster this newly acquired
 credibility in games.  By far the biggest issue I hear about when it comes
 to people working on games in D is the garbage collector.  You can work
 around the GC without too much difficulty as Manu's experience shared in
 his DConf talk shows but a lot of people new to D don't know how to do
 that.  We could also use some tools and guides to help people identify and
 avoid GC use when necessary.
As a starting point, do we have a list of the Phobos functions that allocate using GC when there's no need to? That's a concern of Manu's that it ought to be possible to address relatively swiftly if the information is to hand.
May 23 2013
next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 23 May 2013 at 18:22:54 UTC, Joseph Rushton Wakeling 
wrote:
 On 05/23/2013 08:13 PM, Brad Anderson wrote:
 Now I'm wondering what can be done to foster this newly 
 acquired credibility in
 games.  By far the biggest issue I hear about when it comes to 
 people working on
 games in D is the garbage collector.  You can work around the 
 GC without too
 much difficulty as Manu's experience shared in his DConf talk 
 shows but a lot of
 people new to D don't know how to do that.  We could also use 
 some tools and
 guides to help people identify and avoid GC use when necessary.
 As a starting point, do we have a list of the Phobos functions that
 allocate using GC when there's no need to? That's a concern of Manu's
 that it ought to be possible to address relatively swiftly if the
 information is to hand.
I think that's where Johannes Pfau's -vgc can come in and help. The Phobos
unit tests have pretty good coverage, so building those with -vgc would, in
theory, point out the vast majority of places in Phobos that use the GC.
May 23 2013
prev sibling parent reply "Don" <turnyourkidsintocash nospam.com> writes:
On Thursday, 23 May 2013 at 18:22:54 UTC, Joseph Rushton Wakeling 
wrote:
 On 05/23/2013 08:13 PM, Brad Anderson wrote:
 Now I'm wondering what can be done to foster this newly 
 acquired credibility in
 games.  By far the biggest issue I hear about when it comes to 
 people working on
 games in D is the garbage collector.  You can work around the 
 GC without too
 much difficulty as Manu's experience shared in his DConf talk 
 shows but a lot of
 people new to D don't know how to do that.  We could also use 
 some tools and
 guides to help people identify and avoid GC use when necessary.
It's worth noting that our code at Sociomantic faces *exactly* the same issues. We cannot use Phobos because of its reliance on the GC. Essentially, we want to have the option of avoiding GC usage in every single function.
 As a starting point, do we have a list of the Phobos functions 
 that allocate
 using GC when there's no need to?  That's a concern of Manu's 
 that it ought to
 be possible to address relatively swiftly if the information is 
 to hand.
That is only part of the problem with Phobos. The bigger problem is with
the functions that DO need to allocate memory. In Tango, and in our code,
all such functions accept a buffer to store the results in. So that, even
though they need to allocate memory, if you call the function a thousand
times, it only allocates memory once, and keeps reusing the buffer.

I'm not sure how feasible it is to add that afterwards. I hope it can be
done without changing all the APIs, but I fear it might not be.

But anyway, after fixing the obvious Phobos offenders, another huge step
would be to get TempAlloc into druntime and used wherever possible in
Phobos.
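To illustrate the buffer-accepting pattern Don describes, here is a minimal sketch using Phobos's real format/sformat pair (the describe function itself is invented for the example; it is not Tango's or Sociomantic's actual API):

import std.format : format, sformat;

// Allocating flavor: convenient, but creates GC garbage on every call.
string describe(int id)
{
    return format("entity %s", id);
}

// Buffer-accepting flavor: the caller owns the storage and reuses it.
char[] describe(int id, char[] buf)
{
    return sformat(buf, "entity %s", id);
}

void example()
{
    char[64] buf;                     // one stack buffer for all iterations
    foreach (id; 0 .. 1000)
    {
        auto s = describe(id, buf[]); // a thousand calls, zero GC allocations
        // ... use s before the next iteration overwrites buf ...
    }
}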
May 24 2013
next sibling parent "Dicebot" <m.strashun gmail.com> writes:
On Friday, 24 May 2013 at 07:57:42 UTC, Don wrote:
 It's worth noting that our code at Sociomantic faces *exactly* 
 the same issues.
It is worth noting that _anyone_ trying to write code with either soft or hard real-time requirements faces exactly the same issues ;)
May 24 2013
Manu <turkeyman gmail.com> writes:
On 24 May 2013 17:57, Don <turnyourkidsintocash nospam.com> wrote:

 On Thursday, 23 May 2013 at 18:22:54 UTC, Joseph Rushton Wakeling wrote:

 On 05/23/2013 08:13 PM, Brad Anderson wrote:

 Now I'm wondering what can be done to foster this newly acquired
 credibility in games.  By far the biggest issue I hear about when it
 comes to people working on games in D is the garbage collector.  You can
 work around the GC without too much difficulty as Manu's experience
 shared in his DConf talk shows but a lot of people new to D don't know
 how to do that.  We could also use some tools and guides to help people
 identify and avoid GC use when necessary.

 It's worth noting that our code at Sociomantic faces *exactly* the same
 issues. We cannot use Phobos because of its reliance on the GC.
 Essentially, we want to have the option of avoiding GC usage in every
 single function.

 As a starting point, do we have a list of the Phobos functions that
 allocate using GC when there's no need to?  That's a concern of Manu's
 that it ought to be possible to address relatively swiftly if the
 information is to hand.

 That is only part of the problem with Phobos. The bigger problem is with
 the functions that DO need to allocate memory. In Tango, and in our code,
 all such functions accept a buffer to store the results in. So that, even
 though they need to allocate memory, if you call the function a thousand
 times, it only allocates memory once, and keeps reusing the buffer.
 I'm not sure how feasible it is to add that afterwards. I hope it can be
 done without changing all the APIs, but I fear it might not be.

Yeah, I've often wanted APIs in that fashion too.
I wonder if it would be worth creating overloads of allocating functions
that receive an output buffer argument, rather than return an allocated
buffer... Too messy?

 But anyway, after fixing the obvious Phobos offenders, another huge step
 would be to get TempAlloc into druntime and used wherever possible in
 Phobos.

How does that work?

One pattern I've used a lot is, since we have a regular 60hz timeslice and
a fairly regular pattern from frame to frame, we use a temp heap which
pushes allocations on the end like a stack, then wipe it clean at the
start of the next frame.
Great for any small allocations that last no longer than a single frame.
It's fast (collection is instant), and it also combats memory
fragmentation, which is also critically important when working on memory
limited systems with no virtual memory/page file.
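For illustration, a minimal sketch of such a per-frame temp heap (all names invented; a real implementation would also need to decide on growth handling and thread safety):

import core.stdc.stdlib : malloc, free;

// A per-frame bump allocator: allocation is a pointer increment, and
// "collection" is resetting a single index at the start of the next frame.
struct FrameHeap
{
    private ubyte* store;
    private size_t capacity;
    private size_t top;

    @disable this(this); // no copies: this struct owns its memory

    this(size_t capacity)
    {
        this.store = cast(ubyte*) malloc(capacity);
        this.capacity = capacity;
    }

    ~this() { free(store); }

    // Push a new allocation onto the end, like a stack.
    void[] alloc(size_t bytes)
    {
        enum size_t alignment = 16;
        const aligned = (top + alignment - 1) & ~(alignment - 1);
        assert(aligned + bytes <= capacity, "frame heap exhausted");
        top = aligned + bytes;
        return store[aligned .. top];
    }

    // Call once per frame: everything allocated last frame is reclaimed
    // instantly, with zero fragmentation.
    void reset() { top = 0; }
}

Anything allocated from it must not outlive the frame; longer-lived data has to go through the regular heap.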
May 24 2013
Timon Gehr <timon.gehr gmx.ch> writes:
On 05/24/2013 04:33 PM, Manu wrote:
     But anyway, after fixing the obvious Phobos offenders, another huge
     step would be to get TempAlloc into druntime and used wherever
     possible in Phobos.


 How does that work?

 One pattern I've used a lot is, since we have a regular 60hz timeslice
 and a fairly regular pattern from frame to frame, we use a temp heap
 which pushes allocations on the end like a stack, then wipe it clean at
 the start of the next frame.
 Great for any small allocations that last no longer than a single frame.
 It's fast (collection is instant), and it also combats memory
 fragmentation, which is also critically important when working on memory
 limited systems with no virtual memory/page file.
Yes, that is basically it.
https://github.com/dsimcha/TempAlloc/blob/master/std/allocators/region.d
May 25 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Saturday, May 25, 2013 00:33:10 Manu wrote:
 Yeah, I've often wanted API's in that fashion too.
 I wonder if it would be worth creating overloads of allocating functions
 that receive an output buffer argument, rather than return an allocated
 buffer...
 Too messy?
We already have stuff like format vs formattedWrite where one allocates and
the other takes an output range. We should adopt that practice in general.
Where possible, it should probably be done with an overload of the function,
but where that's not possible, we can simply create a new function with a
similar name.

Then any function which could allocate has the option of writing to an
output range instead (which could be a delegate or an array or whatever)
and avoid the allocation - though I'm not sure that arrays as output ranges
currently handle running out of space very well, so we might need to figure
something out there to properly deal with the case where there isn't enough
room in the output range (arguably, output ranges need a bit of work in
general though).

Regardless, the main question with regards to messiness is whether we can
get away with creating overloads for existing functions which allocate or
whether we'd be forced to create new ones (possibly using a naming scheme
similar to how we have InPlace, only which indicates that it takes an
output range or doesn't allocate or whatever).

- Jonathan M Davis
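For reference, the existing pair Jonathan mentions can be used like this (a small sketch; FixedSink is an invented output range, while format and formattedWrite are real std.format functions):

import std.format : format, formattedWrite;

// An invented fixed-capacity output range; 'put' is the primitive that
// formattedWrite drives.
struct FixedSink
{
    char[64] data;
    size_t len;
    void put(char c)
    {
        assert(len < data.length, "sink full"); // the overflow question above
        data[len++] = c;
    }
}

void example()
{
    // Allocating flavor: returns a fresh GC-allocated string.
    string s = format("pos=(%s, %s)", 3, 4);

    // Output-range flavor: writes into caller-controlled storage instead.
    FixedSink sink;
    formattedWrite(&sink, "pos=(%s, %s)", 3, 4); // no GC allocation
    auto text = sink.data[0 .. sink.len];
}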
May 24 2013
parent reply "Brad Anderson" <eco gnuk.net> writes:
On Friday, 24 May 2013 at 19:44:23 UTC, Jonathan M Davis wrote:
 We already have stuff like format vs formattedWrite where one 
 allocates and the
 other takes an output range. We should adopt that practice in 
 general. Where
 possible, it should probably be done with an overload of the 
 function, but
 where that's not possible, we can simply create a new function 
 with a similar
 name.
Sounds good to me. Should the overloads return the output range or void?
May 24 2013
next sibling parent "Diggory" <diggsey googlemail.com> writes:
On Saturday, 25 May 2013 at 02:41:00 UTC, Brad Anderson wrote:
 On Friday, 24 May 2013 at 19:44:23 UTC, Jonathan M Davis wrote:
 We already have stuff like format vs formattedWrite where one 
 allocates and the
 other takes an output range. We should adopt that practice in 
 general. Where
 possible, it should probably be done with an overload of the 
 function, but
 where that's not possible, we can simply create a new function 
 with a similar
 name.
 Sounds good to me. Should the overloads return the output range or void?

If it returned the output range it would be possible to make another
function which returns a temporary output range and then easily chain
together function calls:

CallWindowsApiW(mystr.writeUTF16z(tempBuffer()))

No GC allocation but not an unpleasant syntax either.
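A sketch of why returning the range enables that chaining (every name here is hypothetical, mirroring Diggory's example):

// Returning the sink from each writer lets calls nest without temporaries.
struct TempBuffer
{
    char[256] data;
    size_t len;
    const(char)[] result() const { return data[0 .. len]; }
}

// Hypothetical writer: appends, then returns the same buffer so another
// call can consume it directly.
ref TempBuffer write(return ref TempBuffer buf, const(char)[] s)
{
    buf.data[buf.len .. buf.len + s.length] = s[];
    buf.len += s.length;
    return buf;
}

void example()
{
    TempBuffer tmp;
    auto text = tmp.write("hello, ").write("world").result;
    assert(text == "hello, world"); // no GC allocation anywhere
}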
May 24 2013
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, May 25, 2013 04:40:58 Brad Anderson wrote:
 On Friday, 24 May 2013 at 19:44:23 UTC, Jonathan M Davis wrote:
 We already have stuff like format vs formattedWrite where one
 allocates and the
 other takes an output range. We should adopt that practice in
 general. Where
 possible, it should probably be done with an overload of the
 function, but
 where that's not possible, we can simply create a new function
 with a similar
 name.
 Sounds good to me. Should the overloads return the output range or void?

Right now, all of the functions that we have like that don't return the
output range, but I don't know that it would be a bad idea if they did.

- Jonathan M Davis
May 24 2013
prev sibling next sibling parent reply "Szymon Gatner" <noemail gmail.com> writes:
On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 While there hasn't been anything official, I think it's a safe 
 bet to say that D is being used for a major title, Remedy's 
 Quantum Break, featured prominently during the announcement of
May I ask where this intel comes from? Do you have any more details on how D is used in the project?
May 23 2013
parent reply "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 23 May 2013 at 18:43:01 UTC, Szymon Gatner wrote:
 On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 While there hasn't been anything official, I think it's a safe 
 bet to say that D is being used for a major title, Remedy's 
 Quantum Break, featured prominently during the announcement of
May I ask where this intel comes from? Do you have any more details on how D is used in the project?
You can watch Manu's talk from DConf here:
http://www.youtube.com/watch?v=FKceA691Wcg

tl;dw: They are using it as a rapid-turnaround scripting language for their
C++ engine.
May 23 2013
parent reply "Szymon Gatner" <noemail gmail.com> writes:
On Thursday, 23 May 2013 at 18:50:11 UTC, Brad Anderson wrote:
 On Thursday, 23 May 2013 at 18:43:01 UTC, Szymon Gatner wrote:
 On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 While there hasn't been anything official, I think it's a 
 safe bet to say that D is being used for a major title, 
 Remedy's Quantum Break, featured prominently during the 
 announcement of
May I ask where this intel comes from? Do you have any more details on how D is used in the project?
 You can watch Manu's talk from DConf here:
 http://www.youtube.com/watch?v=FKceA691Wcg

 tl;dw: They are using it as a rapid-turnaround scripting language for
 their C++ engine.
Ah, I did watch it. Didn't realize Manu works at Remedy. Being a small
indie game dev, I totally agree on the industry needing salvation from C++.
I've been watching D closely for a few years now, but until the compiler is
more stable (though this is less and less of a problem) and there is decent
ARM support, I still can't allow myself to switch. And the day of the
switch will be a glorious one.
May 23 2013
Manu <turkeyman gmail.com> writes:
On 24 May 2013 05:02, Szymon Gatner <noemail gmail.com> wrote:

 On Thursday, 23 May 2013 at 18:50:11 UTC, Brad Anderson wrote:

 On Thursday, 23 May 2013 at 18:43:01 UTC, Szymon Gatner wrote:

 On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:

 While there hasn't been anything official, I think it's a safe bet to
 say that D is being used for a major title, Remedy's Quantum Break,
 featured prominently during the announcement of
May I ask where this intel comes from? Do you have any more details on how D is used in the project?
 You can watch Manu's talk from DConf here:
 http://www.youtube.com/watch?v=FKceA691Wcg

 tl;dw: They are using it as a rapid-turnaround scripting language for
 their C++ engine.
 Ah, I did watch it. Didn't realize Manu works at Remedy. Being a small
 indie game dev, I totally agree on the industry needing salvation from
 C++. I've been watching D closely for a few years now, but until the
 compiler is more stable (though this is less and less of a problem) and
 there is decent ARM support, I still can't allow myself to switch. And
 the day of the switch will be a glorious one.
I really hope D on ARM gets some more attention in the near future. The day it can be used on Android will be a very significant breakthrough!
May 23 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 05/24/2013 01:25 AM, Manu wrote:
 I really hope D on ARM gets some more attention in the near future. The day it
 can be used on Android will be a very significant breakthrough!
GDC is close to being fully usable on ARM, no? And as I recall the only (albeit major) problem you had with GDC was the delay between bugfixes landing in the D frontend and carrying over to GDC. So, the solution here might be the work to properly generalize the frontend so that it will plug-and-play on top of any of the available backends.
May 23 2013
Manu <turkeyman gmail.com> writes:
On 24 May 2013 09:44, Joseph Rushton Wakeling
<joseph.wakeling webdrake.net> wrote:

 On 05/24/2013 01:25 AM, Manu wrote:
 I really hope D on ARM gets some more attention in the near future. The
 day it can be used on Android will be a very significant breakthrough!

 GDC is close to being fully usable on ARM, no?  And as I recall the only
 (albeit major) problem you had with GDC was the delay between bugfixes
 landing in the D frontend and carrying over to GDC.  So, the solution
 here might be the work to properly generalize the frontend so that it
 will plug-and-play on top of any of the available backends.

Well the compiler seems fine actually. It generates good ARM code in my
experience, ditto for PPC, MIPS, SH4 (those are all I have tested).

Druntime needs to be ported to Bionic. People have made a start, but I
recall mention of some complications that need some work?
iOS needs extern(ObjC), but it's a fairly standard posix underneath, so
should be less work on the runtime.

Systems like WiiU/Wii/PS3/XBox360, etc. all need runtimes, and those will
probably not be developed by the D community.
It would land on a general gamedev's shoulders to do those, so I would
suggest the approach here would be to make a step-by-step guide to porting
druntime. Make the process as simple as possible for individuals wanting
to support other 'niche' platforms...
May 23 2013
parent reply "Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:
On Friday, 24 May 2013 at 00:06:05 UTC, Manu wrote:
 Systems like WiiU/Wii/PS3/XBox360, etc all need runtimes, and 
 those will
 probably not be developed by the D community.
 It would land on a general gamedev's shoulders to do those, so 
 I would
 suggest the approach here would be to make a step-by-step 
 guide to porting
 druntime. Make the process as simple as possible for 
 individuals wanting to
 support other 'niche' platforms...
Do you think we could expect those ports to be given back to the community when they get written? Or is it more likely that game studios will keep their ports to themselves?
May 23 2013
Manu <turkeyman gmail.com> writes:
On 24 May 2013 10:59, Joseph Rushton Wakeling
<joseph.wakeling webdrake.net> wrote:

 On Friday, 24 May 2013 at 00:06:05 UTC, Manu wrote:

 Systems like WiiU/Wii/PS3/XBox360, etc all need runtimes, and those will
 probably not be developed by the D community.
 It would land on a general gamedev's shoulders to do those, so I would
 suggest the approach here would be to make a step-by-step guide to
 porting
 druntime. Make the process as simple as possible for individuals wanting
 to
 support other 'niche' platforms...
Do you think we could expect those ports to be given back to the community when they get written? Or is it more likely that game studios will keep their ports to themselves?
I'd like to think they'd be made available. I certainly would. But you can never predict what the suits up top will tell you that you have to do.
May 23 2013
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Fri, 24 May 2013 02:59:44 +0200
"Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> wrote:

 On Friday, 24 May 2013 at 00:06:05 UTC, Manu wrote:
 Systems like WiiU/Wii/PS3/XBox360, etc all need runtimes, and 
 those will
 probably not be developed by the D community.
 It would land on a general gamedev's shoulders to do those, so 
 I would
 suggest the approach here would be to make a step-buy-step 
 guide to porting
 druntime. Make the process as simple as possible for 
 individuals wanting to
 support other 'niche' platforms...
Do you think we could expect those ports to be given back to the community when they get written? Or is it more likely that game studios will keep their ports to themselves?
It would be prohibited by console manufacturers' NDAs/developer-licenses.
If you're an official licensed console developer, you can't provide any
console-specific code or technical specs to anyone who isn't also covered
by the same licensed developer agreement. I'm sure it could be released to,
or shared with, other licensed developers (might be paperwork involved, I
dunno), but not to the general community.

Licensed console dev is fairly cloak-and-dagger (minus the dagger, perhaps).
Game console manufacturers keep a tight enough grip on their systems to
make even Apple blush. For such a thing to be released back to the
"community" it would have to come from the homebrew scene (which AIUI could
then be used by licensed devs too... or at least that was my understanding
with GBA, so my info may be out-of-date). An officially licensed developer
would lose their license, or get sued, or something.
May 23 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 23, 2013 at 08:22:43PM +0200, Joseph Rushton Wakeling wrote:
 On 05/23/2013 08:13 PM, Brad Anderson wrote:
 Now I'm wondering what can be done to foster this newly acquired
 credibility in games.  By far the biggest issue I hear about when it
 comes to people working on games in D is the garbage collector.  You
 can work around the GC without too much difficulty as Manu's
 experience shared in his DConf talk shows but a lot of people new to
 D don't know how to do that.  We could also use some tools and
 guides to help people identify and avoid GC use when necessary.
 As a starting point, do we have a list of the Phobos functions that
 allocate using GC when there's no need to? That's a concern of Manu's
 that it ought to be possible to address relatively swiftly if the
 information is to hand.

I listened to Manu's talk yesterday, and I agree with what he said: Phobos
functions that don't *need* to allocate, shouldn't. Andrei was also
enthusiastic about std.algorithm being almost completely allocation-free.
Maybe we should file bugs (enhancement requests?) for all such Phobos
functions?

On the other hand, perhaps functions that *need* to allocate should be
labelled as such (esp. in the Phobos docs), so that users know what
they're getting into.

T

-- 
My program has no bugs! Only unintentional features...
May 23 2013
Jacob Carlborg <doob me.com> writes:
On 2013-05-23 20:43, H. S. Teoh wrote:

 On the other hand, perhaps functions that *need* to allocate should be
 labelled as such (esp. in the Phobos docs), so that users know what
 they're getting into.
Perhaps using a UDA.

-- 
/Jacob Carlborg
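Something like this hypothetical marker, say (no such attribute exists in druntime/Phobos today; hasUDA is a real std.traits helper):

import std.array : join;
import std.traits : hasUDA;

// Hypothetical marker: "this function may allocate from the GC heap".
struct allocates {}

@allocates string joinWords(string[] words)
{
    return words.join(" "); // GC-allocates the result
}

int sum(int[] xs)
{
    int total;
    foreach (x; xs) total += x;
    return total;           // allocation-free
}

// Docs or tooling could then surface the marker:
static assert( hasUDA!(joinWords, allocates));
static assert(!hasUDA!(sum, allocates));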
May 23 2013
prev sibling next sibling parent reply "Kiith-Sa" <kiithsacmp gmail.com> writes:
On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 [... full quote of the original post snipped ...]
Without official confirmation, I think it's rather early to assume D's
being used in Quantum Break. D might compile on the new consoles, but what
about druntime/phobos/etc.?

That said, I support this idea. When I get time I'll try looking at Phobos
to see if there is some low-hanging fruit with regards to GC usage and
submit pull requests (I didn't make any non-doc contribution to Phobos yet,
but I have a general idea of how its source looks).

I also think that many people overreact about the GC too much. @nogc is
certainly a good idea, but I think strategically using malloc,
disabling/reenabling the GC, using GC.free, and even just using standard GC
features *while taking care to avoid unnecessary allocations* is vastly
better than outright removing the GC.

It'd be good to have an easy-to-use way to manually allocate classes/structs
in Phobos (higher-level than emplace, something close in usability to C++
new/delete), preferably with a way to override the allocation mechanism (I
assume the fabled "allocators" have something to do with this? Maybe we'll
get them once DNF is released... ... ...)
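A rough sketch of those techniques as they exist today (GC.disable/GC.enable, malloc/free, and emplace are real; the Enemy class is invented, and its destructor is deliberately not run here):

import core.memory : GC;
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

class Enemy { int hp = 100; }

void update()
{
    GC.disable();             // no collection pauses inside the hot path
    scope(exit) GC.enable();

    // Manually place a class instance outside the GC heap.
    enum size = __traits(classInstanceSize, Enemy);
    void* mem = malloc(size);
    scope(exit) free(mem);    // note: Enemy's destructor is not run here
    Enemy e = emplace!Enemy(mem[0 .. size]);

    e.hp -= 10;
    // If Enemy held pointers into GC memory, the block would also need
    // GC.addRange so the collector could see those references.
}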
May 23 2013
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Thu, 23 May 2013 21:37:26 +0200
"Kiith-Sa" <kiithsacmp gmail.com> wrote:

 On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 [... full quote of the original post snipped ...]
 Without official confirmation, I think it's rather early to assume D's
 being used in Quantum Break. D might compile on the new consoles, but
 what about druntime/phobos/etc?

I'd like to hear an official confirmation (or denial) at this point, too
(assuming Remedy is at a point where they're comfortable making a statement
on the matter - and after all, it would make sense if they're still keeping
open the possibility of backing out of D for whatever they're using it on
by release if they end up needing to do so, even if such a possibility is
very unlikely).

However, I do think it's a safe bet: Like Brad said, Remedy is a relatively
small dev company that doesn't have a history of working on multiple AAA
titles simultaneously. They *are* known to have one other mystery title
besides Quantum Break in development, but it's for iOS - so it's not a AAA
title, and it's definitely not x86, so that one can definitely be ruled out
(unless Manu was messing with us to keep it super-secret ;) ).

As far as I'm concerned, the whole "Quantum Break uses D" thing *is*
technically a rumor, and I think it's probably best to keep it framed that
way out of respect for Manu and his employer. But it's a very convincing
rumor that I do believe.
May 23 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, May 23, 2013 21:37:26 Kiith-Sa wrote:
 It'd be good to have an easy-to-use way to manually allocate
 classes/structs in Phobos (higher-level than emplace, something
 close in usability to C++ new/delete), preferably with a way to
 override the allocation mechanism (I assume the fabled
 "allocators" have something to do with this? Maybe we'll get them
 once DNF is released... ... ...)

Presumably, we'll get that with custom allocators. So, it's probably just a
question of how long it'll take to sort those out.

- Jonathan M Davis
May 23 2013
Manu <turkeyman gmail.com> writes:
On 24 May 2013 05:37, Kiith-Sa <kiithsacmp gmail.com> wrote:

 On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:

 [... full quote of the original post snipped ...]
 Without official confirmation, I think it's rather early to assume D's
 being used in Quantum Break. D might compile on the new consoles, but
 what about druntime/phobos/etc.?

 That said, I support this idea. When I get time I'll try looking at
 Phobos to see if there is some low-hanging fruit with regards to GC usage
 and submit pull requests (I didn't make any non-doc contribution to
 Phobos yet, but I have a general idea of how its source looks).

 I also think that many people overreact about the GC too much. @nogc is
 certainly a good idea, but I think strategically using malloc,
 disabling/reenabling the GC, using GC.free and even just using standard
 GC features *while taking care to avoid unnecessary allocations* is
 vastly better than outright removing the GC.
Just to be clear, while I've heard many have, I've NEVER argued for removing
the GC. I think that's a hallmark of a modern language. I want to use the
GC in games, but it needs to have performance characteristics that are
applicable to realtime and embedded use.
Those are:
1. Can't stop the world.
2. Needs tight controls, enable/disable, and the allocators interface so
alternative memory sources can be used in many places.
3. Needs to (somehow) run incrementally. I'm happy to budget a few hundred
µs per frame, but not a millisecond every 10 frames, or 1 second every 1000.
    It can have 1-2% of overall frame time each frame, but it can't have
10-100% of random frames here and there. This results in framerate spikes.

The GC itself can be much less efficient than the existing GC if it wants;
it's only important that it can be halted at fine-grained intervals, and
that it will eventually complete its collect cycle over the long-term.
I know that an incremental GC like this is very complex, but I've never
heard of any real experiments, so maybe it's not impossible?

 It'd be good to have an easy-to-use way to manually allocate
 classes/structs in Phobos (higher-level than emplace, something close in
 usability to C++ new/delete), preferably with a way to override the
 allocation mechanism (I assume the fabled "allocators" have something to
 do with this? Maybe we'll get them once DNF is released... ... ...)
May 23 2013
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 05/24/2013 01:34 AM, Manu wrote:
 Just to be clear, while I've heard many have, I've NEVER argued for
 removing the GC. I think that's a hallmark of a modern language. I want
 to use the GC in games, but it needs to have performance characteristics
 that are applicable to realtime and embedded use.
 Those are:
 1. Can't stop the world.
 2. Needs tight controls, enable/disable, and the allocators interface so
 alternative memory sources can be used in many places.
 3. Needs to (somehow) run incrementally. I'm happy to budget a few
 hundred µs per frame, but not a millisecond every 10 frames, or 1 second
 every 1000.
     It can have 1-2% of overall frame time each frame, but it can't have
 10-100% of random frames here and there. This results in framerate
 spikes.

 The GC itself can be much less efficient than the existing GC if it
 wants; it's only important that it can be halted at fine-grained
 intervals, and that it will eventually complete its collect cycle over
 the long-term.
 I know that an incremental GC like this is very complex, but I've never
 heard of any real experiments, so maybe it's not impossible?

Maybe someone else can point to an example, but I can't think of any
language prior to D that has both the precision and speed to be useful for
games and embedded programming, and that also has GC built in. So it seems
to me that this might well be an entirely new problem, as no other GC
language or library has had the motivation to create something that
satisfies these use parameters.

This also seems to suggest that an ideal solution might be to have several
different GC strategies, the choice of which could be made at compile time
depending on what's most suitable for the application in question.
May 23 2013
Jacob Carlborg <doob me.com> writes:
On 2013-05-24 01:51, Joseph Rushton Wakeling wrote:

 This also seems to suggest that an ideal solution might be to have several
 different GC strategies, the choice of which could be made at compile time
 depending on what's most suitable for the application in question.
You can already swap the GC implementation at link time.

-- 
/Jacob Carlborg
May 24 2013
next sibling parent "Dicebot" <m.strashun gmail.com> writes:
On Friday, 24 May 2013 at 08:01:35 UTC, Jacob Carlborg wrote:
 On 2013-05-24 01:51, Joseph Rushton Wakeling wrote:

 This also seems to suggest that an ideal solution might be to 
 have several
 different GC strategies, the choice of which could be made at 
 compile time
 depending on what's most suitable for the application in 
 question.
You can already swap the GC implementation at link time.
Yep, exactly. A hard-wired GC is not the problem. The lack of alternative GCs is the problem. The lack of tools to reliably control and avoid GC calls at all is the problem.
May 24 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 24 May 2013 at 08:01:35 UTC, Jacob Carlborg wrote:
 On 2013-05-24 01:51, Joseph Rushton Wakeling wrote:

 This also seems to suggest that an ideal solution might be to 
 have several
 different GC strategies, the choice of which could be made at 
 compile time
 depending on what's most suitable for the application in 
 question.
You can already swap the GC implementation at link time.
Granted, only if the GC fits in the model. Which means no barriers, for instance.
May 24 2013
Manu <turkeyman gmail.com> writes:
On 24 May 2013 18:01, Jacob Carlborg <doob me.com> wrote:

 On 2013-05-24 01:51, Joseph Rushton Wakeling wrote:

  This also seems to suggest that an ideal solution might be to have several
 different GC strategies, the choice of which could be made at compile time
 depending on what's most suitable for the application in question.
 You can already swap the GC implementation at link time.

Sure, but there's not an established suite of options to choose from.
How do I select the incremental GC option? :)
May 24 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, May 24, 2013 at 09:34:41AM +1000, Manu wrote:
[...]
 Just to be clear, while I've heard many have, I've NEVER argued for
 removing the GC. I think that's a hallmark of a modern language. I
 want to use the GC in games, but it needs to have performance
 characteristics that are applicable to realtime and embedded use.
 Those are:
 1. Can't stop the world.
 2. Needs tight controls, enable/disable, and the allocators interface
 so alternative memory sources can be used in mane places.
 3. Needs to (somehow) run incrementally. I'm happy to budget a few
 hundred µs per frame, but not a millisecond every 10 frames, or 1
 second every 1000.
     It can have 1-2% of overall frame time each frame, but it can't
 have 10-100% of random frames here and there. This results in
 framerate spikes.
Makes sense, so basically the GC should not cause jittery framerates, but should distribute its workload across frames so that the framerate is more-or-less constant?
 The GC its self can be much less efficient than the existing GC if it
 want's, it's only important that it can be halted at fine grained
 intervals, and that it will eventually complete its collect cycle over
 the long-term.

 I know that an incremental GC like this is very complex, but I've
 never heard of any real experiments, so maybe it's not impossible?
Is there a hard upper limit to how much time the GC can take per frame? Is
it acceptable to use, say, a millisecond every frame as long as it's
*every* frame and not every 10 frames (which causes jitter)?

For me, I'm also interested in incremental GCs -- for time-sensitive
applications (even if it's just soft realtime, not hard), long
stop-the-world pauses are really disruptive. I'd rather have the option of
a somewhat larger memory footprint and a less efficient GC (in terms of
rate of memory reclamation) if it can be incremental, rather than a very
efficient GC that introduces big pauses every now and then. I'm even
willing to settle for lower framerates if it means I don't have to deal
with framerate spikes that makes the result jittery and unpleasant.

T

-- 
"I suspect the best way to deal with procrastination is to put off the
procrastination itself until later. I've been meaning to try this, but
haven't gotten around to it yet." -- swr
May 24 2013
Manu <turkeyman gmail.com> writes:
On 25 May 2013 00:58, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Fri, May 24, 2013 at 09:34:41AM +1000, Manu wrote:
 [...]
 Just to be clear, while I've heard many have, I've NEVER argued for
 removing the GC. I think that's a hallmark of a modern language. I
 want to use the GC in games, but it needs to have performance
 characteristics that are applicable to realtime and embedded use.
 Those are:
 1. Can't stop the world.
 2. Needs tight controls, enable/disable, and the allocators interface
 so alternative memory sources can be used in many places.
 3. Needs to (somehow) run incrementally. I'm happy to budget a few
 hundred µs per frame, but not a millisecond every 10 frames, or 1
 second every 1000.
     It can have 1-2% of overall frame time each frame, but it can't
 have 10-100% of random frames here and there. This results in
 framerate spikes.
 Makes sense, so basically the GC should not cause jittery framerates,
 but should distribute its workload across frames so that the framerate
 is more-or-less constant?

Precisely.

 The GC itself can be much less efficient than the existing GC if it
 wants; it's only important that it can be halted at fine-grained
 intervals, and that it will eventually complete its collect cycle over
 the long-term.

 I know that an incremental GC like this is very complex, but I've
 never heard of any real experiments, so maybe it's not impossible?

 Is there a hard upper limit to how much time the GC can take per frame?
 Is it acceptable to use, say, a millisecond every frame as long as it's
 *every* frame and not every 10 frames (which causes jitter)?

Errr, well, 1ms is about 7% of the frame; that's quite a long time.
I'd be feeling pretty uneasy about any library that claimed to want 7% of
the whole game time, and didn't offer any visual/gameplay benefits...
Maybe if the GC happened to render some sweet water effects, or perform
some awesome cloth physics or something while it was at it ;)
I'd say 7% is too much for many developers.

I think a 2% sacrifice for simplifying memory management would probably
get through without much argument.
That's ~300µs... a few hundred microseconds seems reasonable. Maybe a
little more if targeting 30fps.
If it stuck to that strictly, I'd possibly even grant it permission to
stop the world...

 For me, I'm also interested in incremental GCs -- for time-sensitive
 applications (even if it's just soft realtime, not hard), long
 stop-the-world pauses are really disruptive. I'd rather have the option
 of a somewhat larger memory footprint and a less efficient GC (in terms
 of rate of memory reclamation) if it can be incremental, rather than a
 very efficient GC that introduces big pauses every now and then. I'm
 even willing to settle for lower framerates if it means I don't have to
 deal with framerate spikes that makes the result jittery and unpleasant.

One important detail to consider for realtime usage is that it's very
unconventional to allocate at runtime at all...
Perhaps a couple of short-lived temp buffers each frame, and the occasional
change in resources as you progress through a world (which are probably not
allocated in GC memory anyway).
Surely the relatively high temporal consistency of the heap across cycles
can be leveraged here somehow to help?
May 24 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 24 May 2013 at 15:17:00 UTC, Manu wrote:
 Errr, well, 1ms is about 7% of the frame, that's quite a long 
 time.
 I'd be feeling pretty uneasy about any library that claimed to 
 want 7% of
 the whole game time, and didn't offer any visual/gameplay 
 benefits...
 Maybe if the GC happened to render some sweet water effects, or 
 perform
 some awesome cloth physics or something while it was at it ;)
 I'd say 7% is too much for many developers.

 I think 2% sacrifice for simplifying memory management would 
 probably get
 through without much argument.
 That's ~300µs... a few hundred microseconds seems reasonable. 
 Maybe a
 little more if targeting 30fps.
 If it stuck to that strictly, I'd possibly even grant it 
 permission to stop
 the world...
That is kind of biased, as you'll generally win on other aspects. You don't
free anymore, you don't need to count references (which can become quite
costly in multithreaded code), etc...

Generally, I think what is needed for games is a concurrent GC. This incurs
a memory usage overhead (floating garbage), and a tax on pointer writes,
but eliminates pauses.

That is an easy way to export a part of the load in another thread,
improving concurrency in the application with little effort.

With real time constraints, a memory overhead is better than a pause.
 One important detail to consider for realtime usage, is that 
 it's very
 unconventional to allocate at runtime at all...
 Perhaps a couple of short lived temp buffers each frame, and 
 the occasional
 change in resources as you progress through a world (which are 
 probably not
 allocated in GC memory anyway).
 Surely the relatively high temporal consistency of the heap 
 across cycles
 can be leveraged here somehow to help?
That is good because it means not a lot of floating garbage.
May 24 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, May 24, 2013 at 07:55:44PM +0200, deadalnix wrote:
 On Friday, 24 May 2013 at 15:17:00 UTC, Manu wrote:
Errr, well, 1ms is about 7% of the frame, that's quite a long time.
I'd be feeling pretty uneasy about any library that claimed to want
7% of the whole game time, and didn't offer any visual/gameplay
benefits...  Maybe if the GC happened to render some sweet water
effects, or perform some awesome cloth physics or something while it
was at it ;) I'd say 7% is too much for many developers.
OK.
I think 2% sacrifice for simplifying memory management would probably
 get through without much argument.  That's ~300µs... a few hundred
microseconds seems reasonable.  Maybe a little more if targeting
30fps.  If it stuck to that strictly, I'd possibly even grant it
permission to stop the world...
Makes sense. So basically some kind of incremental algorithm is in order.
 That is kind of biased, as you'll generally win on other aspects.
 You don't free anymore, you don't need to count references (which can
 become quite costly in multithreaded code), etc...
 
 Generally, I think what is needed for games is a concurrent GC. This
 incurs a memory usage overhead (floating garbage), and a tax on
 pointer writes, but eliminates pauses.
 
 That is an easy way to export a part of the load in another thread,
 improving concurrency in the application with little effort.
Wouldn't that require compiler support? Unless you're willing to forego nice slicing syntax and use custom types for all references / pointers.
 With real time constraint, a memory overhead is better than a pause.
 
One important detail to consider for realtime usage, is that it's
very unconventional to allocate at runtime at all...  Perhaps a
couple of short lived temp buffers each frame, and the occasional
change in resources as you progress through a world (which are
probably not allocated in GC memory anyway).  Surely the relatively
high temporal consistency of the heap across cycles can be leveraged
here somehow to help?
That is good because it means not a lot of floating garbage.
Isn't the usual solution here to use a memory pool that gets deallocated in
one shot at the end of the cycle? So during a frame, you'd create a pool,
allocate all short-lived objects on it, and at the end free the entire pool
in one shot (which could just be a no-op if you recycle the pool memory for
the temp objects in the next frame). Long-lived objects, of course, will
have to live in the heap, and since they usually aren't in GC memory
anyway, it wouldn't matter.

A naive, hackish implementation might be a function to reset all GC memory
to a clean slate. So basically, you treat the entire GC memory as your
pool, and you allocate at will during a single frame; then at the end of
the frame, you reset the GC, which is equivalent to collecting every object
from GC memory except it can probably be done much faster than a real
collection cycle. Anything that needs to live past a single frame will have
to be allocated via malloc/free. So this way, you don't need any collection
cycle at all.

Of course, this may interact badly with certain language constructs: if any
reference to GC objects lingers past a frame, you may break language
guarantees (e.g. immutable array gets reused, violating immutability when
you dereference the stale array pointer in the next frame). But if the
per-frame code has no escaping GC references, this problem won't occur.
Maybe if the per-frame code is marked pure? It doesn't work if you need to
malloc/free, though (as those are inherently impure -- the pointers need to
survive past the current frame). Can UDAs be used somehow to enforce no
escaping GC references but allow non-GC references to persist past the
frame?

T

-- 
People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
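Sketching Teoh's per-frame pool usage with the bump-allocator style shown earlier in the thread (FrameHeap is the invented type from that sketch, not a real library):

// Per-frame pool usage: all short-lived allocations die together at the
// end of the cycle; only the reset of one index is paid, no scanning.
void runFrames(ref FrameHeap pool, int frames)
{
    foreach (_; 0 .. frames)
    {
        scope(exit) pool.reset(); // frees the whole frame's garbage in one shot

        auto scratch = cast(float[]) pool.alloc(1024 * float.sizeof);
        // ... fill and use scratch within this frame only ...

        // Keeping a reference to 'scratch' beyond this iteration would be
        // exactly the stale-reference hazard described above.
    }
}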
May 24 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 25 May 2013 03:55, deadalnix <deadalnix gmail.com> wrote:

 On Friday, 24 May 2013 at 15:17:00 UTC, Manu wrote:

 Errr, well, 1ms is about 7% of the frame, that's quite a long time.
 I'd be feeling pretty uneasy about any library that claimed to want 7% of
 the whole game time, and didn't offer any visual/gameplay benefits...
 Maybe if the GC happened to render some sweet water effects, or perform
 some awesome cloth physics or something while it was at it ;)
 I'd say 7% is too much for many developers.

 I think a 2% sacrifice for simplifying memory management would probably
 get through without much argument.
 That's ~300µs... a few hundred microseconds seems reasonable. Maybe a
 little more if targeting 30fps.
 If it stuck to that strictly, I'd possibly even grant it permission to
 stop the world...

 That is kind of biased, as you'll generally win on other aspects. You
 don't free anymore, you don't need to count references (which can become
 quite costly in multithreaded code), etc...

Freeing is a no-realtime-cost operation, since memory management is usually
scheduled for between-scenes, or passed to other threads.
And I've never heard of a major title that uses smart pointers, and assigns
them around the place at runtime.
I'm accustomed to memory management having a virtually zero cost at runtime.
So I don't think it's biased at all (in the sense you say), I think I'm
being quite reasonable.

 Generally, I think what is needed for games is a concurrent GC. This
 incurs a memory usage overhead (floating garbage), and a tax on pointer
 writes, but eliminates pauses.

How much floating garbage? This might be acceptable... I don't know enough
about it.

 That is an easy way to export a part of the load in another thread,
 improving concurrency in the application with little effort.

Are you saying a concurrent GC would operate exclusively in another thread?
How does it scan the stack of all other threads?

 With real time constraints, a memory overhead is better than a pause.

I wouldn't necessarily agree. Depends on the magnitude of each.
What sort of magnitude are we talking?
If you had 64mb of ram, and no virtual memory, would you be happy to
sacrifice 20% of it? 5% of it?

 One important detail to consider for realtime usage, is that it's very
 unconventional to allocate at runtime at all...
 Perhaps a couple of short lived temp buffers each frame, and the
 occasional change in resources as you progress through a world (which are
 probably not allocated in GC memory anyway).
 Surely the relatively high temporal consistency of the heap across cycles
 can be leveraged here somehow to help?

 That is good because it means not a lot of floating garbage.

Right. But what's the overhead of a scan process (that's almost entirely
redundant work)?
May 24 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 25 May 2013 at 01:26:19 UTC, Manu wrote:
 Freeing is a no-realtime-cost operation, since memory 
 management is usually
 scheduled for between-scenes, or passed to other threads.
 And I've never heard of a major title that uses smart pointers, 
 and assigns
 them around the place at runtime.
 I'm accustomed to memory management having a virtually zero 
 cost at runtime.
 So I don't think it's biased at all (in the sense you say), I 
 think I'm
 being quite reasonable.
Same goes for the GC: if you don't allocate, it won't trigger.
 How much floating garbage? This might be acceptable... I don't 
 know enough
 about it.
It's about how much garbage you produce while the GC is collecting. That won't be collected before the next cycle. You say you don't generate a lot of garbage, so the cost should be pretty low.
 That is an easy way to export a part of the load in another thread,
 improving concurrency in the application with little effort.
Are you saying a concurrent GC would operate exclusively in another thread? How does it scan the stack of all other threads? With real time constraint, a memory overhead is better than a pause.
Yes, it implies a pause to scan stacks/registers, but then the thread can live its life and the heap gets scanned/collected. You never need to stop the world.
 I wouldn't necessarily agree. Depends on the magnitude of each.
 What sort of magnitude are we talking?
 If you had 64mb of ram, and no virtual memory, would you be 
 happy to
 sacrifice 20% of it? 5% of it?
There are many different variations here, each with pros and cons. Hard to give hard numbers. In non-VM code, you have basically 2 choices:
- A tax on every pointer write: check a flag to know whether some operation is needed; if the flag is true, you mark the old value as a root for the GC.
- Pay only while collecting, using page protection (seems like a better option for you, as you'll not be collecting that much). The cost is way higher when collecting, but it is free when you aren't.
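A rough sketch of the first option in D -- the flag, the helper, and the collector hook are all hypothetical:

    __gshared bool collecting;        // hypothetical: set by the GC during a cycle

    void markAsRoot(void* oldTarget)  // hypothetical hook into the collector
    {
        // record oldTarget so the in-flight collection keeps it alive
    }

    // The per-write tax is one flag test; real work happens only mid-collection.
    void writePointer(T)(ref T* slot, T* newValue)
    {
        if (collecting)
            markAsRoot(slot);         // slot still holds the old value here
        slot = newValue;
    }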
 Right. But what's the overhead of a scan process (that's almost 
 entirely
 redundant work)?
Roughly proportional to the live set of objects you have. It is triggered when your heap grows past a certain limit.
May 24 2013
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 25 May 2013 05:05, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Fri, May 24, 2013 at 07:55:44PM +0200, deadalnix wrote:
 On Friday, 24 May 2013 at 15:17:00 UTC, Manu wrote:
One important detail to consider for realtime usage, is that it's
very unconventional to allocate at runtime at all...  Perhaps a
couple of short lived temp buffers each frame, and the occasional
change in resources as you progress through a world (which are
probably not allocated in GC memory anyway).  Surely the relatively
high temporal consistency of the heap across cycles can be leveraged
here somehow to help?
That is good because it means not a lot of floating garbage.
Isn't the usual solution here to use a memory pool that gets deallocated in one shot at the end of the cycle? So during a frame, you'd create a pool, allocate all short-lived objects on it, and at the end free the entire pool in one shot (which could just be a no-op if you recycle the pool memory for the temp objects in the next frame). Long-lived objects, of course, will have to live in the heap, and since they usually aren't in GC memory anyway, it wouldn't matter.
This totally depends on the task. Almost every task will have its own solution. I think there are 3 common approaches though:

1. Just don't allocate. Seriously, you don't need dynamic memory anywhere near as much as you think you do. Get creative!

2. Use a pool like you say.

3. Use a scratch buffer of some sort. Allocate from this buffer linearly, and wipe it clean each frame. Similar to a pool but supporting irregularly sized allocations.

 A naïve, hackish implementation might be a function to reset all GC
 memory to a clean slate. So basically, you treat the entire GC memory as
 your pool, and you allocate at will during a single frame; then at the
 end of the frame, you reset the GC, which is equivalent to collecting
 every object from GC memory except it can probably be done much faster
 than a real collection cycle. Anything that needs to live past a single
 frame will have to be allocated via malloc/free. So this way, you don't
 need any collection cycle at all.
Problem with implementing that pattern in the GC is that it's global now. You can no longer choose the solution per problem. How do you allocate something with a long life? malloc? What do non-realtime threads do?

 Of course, this may interact badly with certain language constructs: if
 any reference to GC objects lingers past a frame, you may break language
 guarantees (e.g. immutable array gets reused, violating immutability
 when you dereference the stale array pointer in the next frame). But if
 the per-frame code has no escaping GC references, this problem won't
 occur. Maybe if the per-frame code is marked pure? It doesn't work if
 you need to malloc/free, though (as those are inherently impure -- the
 pointers need to survive past the current frame). Can UDAs be used
 somehow to enforce no escaping GC references but allow non-GC references
 to persist past the frame?


 T

 --
 People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
May 24 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 25 May 2013 11:26, Manu <turkeyman gmail.com> wrote:

 On 25 May 2013 03:55, deadalnix <deadalnix gmail.com> wrote:

 With real time constraint, a memory overhead is better than a pause.
I wouldn't necessarily agree. Depends on the magnitude of each. What sort of magnitude are we talking? If you had 64mb of ram, and no virtual memory, would you be happy to sacrifice 20% of it? 5% of it?
Actually, I don't think I've made this point clearly before, but it is of critical importance. The single biggest threat when considering unexpected memory allocation, a la that in phobos, is NOT performance, it is non-determinism. Granted, this is the biggest problem with using a GC on embedded hardware in general.

So let's say I need to keep some free memory overhead, so that I don't run out of memory when a collect hasn't happened recently... How much overhead do I need? I can't afford much/any, so precisely how much do I need? Understand, I have no virtual-memory manager, it won't page, it's not a performance problem, it will just crash if I mis-calculate this value. And does the amount of overhead required change throughout development? How often do I need to re-calibrate?

What about memory fragmentation? Functions that perform many small short-lived allocations have a tendency to fragment the heap. This is probably the most critical reason why phobos functions can't allocate internally.

General realtime code may have some small flexibility, but embedded use has hard limits. So we need to know where allocations are coming from for reasons of determinism. We need to be able to tightly control these factors to make confident use of a GC.

The more I think about it, the more I wonder if ref-counting is just better for strictly embedded use across the board...? Does D actually have a ref-counted GC? Surely it wouldn't be particularly hard? Requires compiler support though I suppose.
May 24 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:
 Understand, I have no virtual-memory manager, it won't page, 
 it's not a
 performance problem, it will just crash if I mis-calculate this 
 value.
So the GC is kind of out.
May 24 2013
parent reply Manu <turkeyman gmail.com> writes:
On 25 May 2013 15:00, deadalnix <deadalnix gmail.com> wrote:

 On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:

 Understand, I have no virtual-memory manager, it won't page, it's not a
 performance problem, it will just crash if I mis-calculate this value.
So the GC is kind of out.
Yeah, I'm wondering if that's just a basic truth for embedded. Can D implement a ref-counting GC? That would probably still be okay, since collection is immediate. Modern consoles and portables have plenty of memory; can use a GC, but simpler/embedded platforms probably just can't. An alternative solution still needs to be offered for that sort of hardware.
May 24 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 25 May 2013 at 05:18:12 UTC, Manu wrote:
 On 25 May 2013 15:00, deadalnix <deadalnix gmail.com> wrote:

 On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:

 Understand, I have no virtual-memory manager, it won't page, 
 it's not a
 performance problem, it will just crash if I mis-calculate 
 this value.
So the GC is kind of out.
Yeah, I'm wondering if that's just a basic truth for embedded. Can D implement a ref-counting GC? That would probably still be okay, since collection is immediate.
This is technically possible, but you said you make few allocations. So with the tax on pointer writes or the reference counting, you'll pay a lot to collect very little garbage. I'm not sure the tradeoff is worthwhile.

Paradoxically, when you create little garbage, GCs are really good, as they don't need to trigger often. But if you need to add a tax on each reference write/copy, you'll probably pay more tax than you get out of it.
 Modern consoles and portables have plenty of memory; can use a 
 GC, but
 simpler/embedded platforms probably just can't. An alternative 
 solution
 still needs to be offered for that sort of hardware.
May 24 2013
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 25 May 2013 15:29, deadalnix <deadalnix gmail.com> wrote:

 On Saturday, 25 May 2013 at 05:18:12 UTC, Manu wrote:

 On 25 May 2013 15:00, deadalnix <deadalnix gmail.com> wrote:

  On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:
  Understand, I have no virtual-memory manager, it won't page, it's not a
 performance problem, it will just crash if I mis-calculate this value.
So the GC is kind of out.
Yeah, I'm wondering if that's just a basic truth for embedded. Can D implement a ref-counting GC? That would probably still be okay, since collection is immediate.
This is technically possible, but you said you make few allocations. So with the tax on pointer writes or the reference counting, you'll pay a lot to collect very little garbage. I'm not sure the tradeoff is worthwhile.
But it would be deterministic, and if the allocations are few, the cost
should be negligible.

 Paradoxically, when you create little garbage, GCs are really good, as they
 don't need to trigger often. But if you need to add a tax on each reference
 write/copy, you'll probably pay more tax than you get out of it.
They're still non-deterministic though. And unless (even if?) they're precise, they might leak. What does ObjC do? It seems to work okay on embedded hardware (although not particularly memory-constrained hardware). Didn't ObjC recently reject GC in favour of refcounting?
May 24 2013
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 25.05.2013 07:52, Manu wrote:
 On 25 May 2013 15:29, deadalnix <deadalnix gmail.com> wrote:

     On Saturday, 25 May 2013 at 05:18:12 UTC, Manu wrote:

          On 25 May 2013 15:00, deadalnix <deadalnix gmail.com> wrote:

             On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:

                 Understand, I have no virtual-memory manager, it won't
                 page, it's not a
                 performance problem, it will just crash if I
                 mis-calculate this value.


             So the GC is kind of out.


         Yeah, I'm wondering if that's just a basic truth for embedded.
         Can D implement a ref-counting GC? That would probably still be
         okay, since
         collection is immediate.


     This is technically possible, but you said you make few allocations.
     So with the tax on pointer write or the reference counting, you'll
     pay a lot to collect very few garbages. I'm not sure the tradeoff is
     worthwhile.


 But it would be deterministic, and if the allocations are few, the cost
 should be negligible.


     Paradoxically, when you create few garbage, GC are really goos as
     they don't need to trigger often. But if you need to add a tax on
     each reference write/copy, you'll probably pay more tax than you get
     out of it.


 They're still non-deterministic though. And unless (even if?) they're
 precise, they might leak.

 What does ObjC do? It seems to work okay on embedded hardware (although
 not particularly memory-constrained hardware).
 Didn't ObjC recently reject GC in favour of refcounting?
Yes, but it was mainly for not being able to have a stable working GC able to cope with the Objective-C code available in the wild. It had quite a few issues.

Objective-C reference counting requires compiler and runtime support. Basically it is based on how Cocoa does reference counting, but instead of requiring the developers to manually write the [retain], [release] and [autorelease] messages, the compiler is able to infer them based on Cocoa memory access patterns. Additionally it makes use of dataflow analysis to remove superfluous use of those calls.

There is a WWDC talk on iTunes where they explain that. I can look for it if there is interest.

Microsoft did the same thing with their C++/CX language extensions and COM for WinRT.

--
Paulo
May 25 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 25 May 2013 at 05:52:23 UTC, Manu wrote:
 But it would be deterministic, and if the allocations are few, 
 the cost
 should be negligible.
You'll pay a tax on pointer writes, not on allocations! It won't be negligible!
 They're still non-deterministic though. And unless (even if?) 
 they're
 precise, they might leak.
Not if they are precise. But this is another topic.
 What does ObjC do? It seems to work okay on embedded hardware 
 (although not
 particularly memory-constrained hardware).
 Didn't ObjC recently reject GC in favour of refcounting?
ObjC is a horrible three-headed monster in that regard, and I don't think this is the way to go.
May 25 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 25 May 2013 01:52:10 -0400, Manu <turkeyman gmail.com> wrote:

 What does ObjC do? It seems to work okay on embedded hardware (although not
 particularly memory-constrained hardware).
 Didn't ObjC recently reject GC in favour of refcounting?
Having used ObjC for the last year or so working on iOS, it is a very nice memory management model.

Essentially, all objects (and only objects) are ref-counted automatically by the compiler. In code, whenever you assign or pass a pointer to an object, the compiler automatically inserts retains and releases extremely conservatively. Then, the optimizer comes along and factors out extra retains and releases, if it can prove they are unnecessary.

What I really like about this is, unlike a library-based solution where every assignment to a 'smart pointer' incurs a release/retain, the compiler knows what this means and will factor them out, removing almost all of them. It's as if you inserted the retains and releases in the most optimized way possible, and it's all for free. Also, I believe the compiler is then free to reorder retains and releases since it understands how they work. Of course, a retain/release is an atomic operation, and requires memory barriers, so the CPU/cache cannot reorder, but the compiler still can.

I asked David Nadlinger at the conference whether we could leverage this power in LDC, since LLVM is the compiler back-end used by Apple, but he said all those optimization passes are in the Objective-C front-end.

It would be cool/useful to have compiler-native reference counting. The only issue is, Objective-C is quite object-heavy, and it's statically checkable whether a pointer is an Object pointer or not. In D, you would have to conservatively use retains/releases on every pointer, since any memory block could be ref-counted. But just like Objective-C most of them could be factored out. Add in that D has the shared-ness of the pointer built into the type system, and you may have something that is extremely effective.

-Steve
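For contrast, a bare-bones sketch of the library approach in D -- every copy and destruction pays a count update, and there is no compiler pass to cancel the matching pairs (illustrative only, not Phobos code; single-threaded, so no atomics):

    import core.stdc.stdlib : malloc, free;

    struct RefCounted(T)
    {
        private T* payload;
        private size_t* count;

        this(T* p)
        {
            payload = p;
            count = cast(size_t*) malloc(size_t.sizeof);
            *count = 1;
        }

        this(this)      // postblit: the "retain" paid on every copy
        {
            if (count) ++*count;
        }

        ~this()         // the "release" paid on every destruction
        {
            if (count && --*count == 0)
            {
                free(count);
                // freeing/destroying the payload would go here
            }
        }
    }

An ARC-style compiler emits the same increments and decrements, but because it understands them it can delete the redundant pairs; a library type like this pays all of them.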
May 28 2013
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 28 May 2013 at 13:33:39 UTC, Steven Schveighoffer 
wrote:
 I asked David Nadlinger at the conference whether we could 
 leverage this power in LDC, since LLVM is the compiler back-end 
 used by Apple, but he said all those optimization passes are in 
 the Objective-C front-end.
Hm, apparently I was imprecise or I slightly misunderstood your question: The actual optimizations _are_ done in LLVM, and are part of its source tree (see lib/Transforms/ObjCARC). What I meant to say is that they are tied to the ObjC runtime function calls emitted by Clang – there is no notion of a "reference counted pointer" on the LLVM level.

Thus, we could definitely base a similar implementation for D on this, which would recognize D runtime calls (potentially accompanied by D-specific LLVM metadata) instead of Objective-C ones. It's just that there would be quite a bit of adjusting involved, as the ObjC ARC implementation isn't designed to be language-agnostic.

David
May 28 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 28 May 2013 09:50:42 -0400, David Nadlinger <see klickverbot.at>  
wrote:

 On Tuesday, 28 May 2013 at 13:33:39 UTC, Steven Schveighoffer wrote:
 I asked David Nadlinger at the conference whether we could leverage  
 this power in LDC, since LLVM is the compiler back-end used by Apple,  
 but he said all those optimization passes are in the Objective-C  
 front-end.
Hm, apparently I was imprecise or I slightly misunderstood your question:
More like I am compiler-ignorant and didn't understand/properly remember your answer :) Thanks for clarifying. -Steve
May 28 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 28 May 2013 23:33, Steven Schveighoffer <schveiguy yahoo.com> wrote:

 On Sat, 25 May 2013 01:52:10 -0400, Manu <turkeyman gmail.com> wrote:

  What does ObjC do? It seems to work okay on embedded hardware (although
 not
 particularly memory-constrained hardware).
 Didn't ObjC recently reject GC in favour of refcounting?
Having used ObjC for the last year or so working on iOS, it is a very nice memory management model. Essentially, all objects (and only objects) are ref-counted automatically by the compiler. In code, whenever you assign or pass a pointer to an object, the compiler automatically inserts retains and releases extremely conservatively. Then, the optimizer comes along and factors out extra retains and releases, if it can prove they are necessary. What I really like about this is, unlike a library-based solution where every assignment to a 'smart pointer' incurs a release/retain, the compiler knows what this means and will factor them out, removing almost all of them. It's as if you inserted the retains and releases in the most optimized way possible, and it's all for free. Also, I believe the compiler is then free to reorder retains and releases since it understands how they work. Of course, a retain/release is an atomic operation, and requires memory barriers, so the CPU/cache cannot reorder, but the compiler still can.
Right. This is almost precisely how I imagined it would be working. I wonder what it would take to have this as a GC strategy in D? I'm more and more thinking this would be the best approach for realtime software. It's deterministic, and while being safe like a GC, the programmer retains absolute control. Also, things are destroyed when you expect (again, the deterministic thing). I think this GC strategy will open D for use on much more embedded hardware. I asked David Nadlinger at the conference whether we could leverage this
 power in LDC, since LLVM is the compiler back-end used by Apple, but he
 said all those optimization passes are in the Objective-C front-end.
Yeah, this would require D front-end work I'm sure. It would be cool/useful to have compiler-native reference counting. The
 only issue is, Objective-C is quite object-heavy, and it's statically
 checkable whether a pointer is an Object pointer or not.  In D, you would
 have to conservatively use retains/releases on every pointer, since any
 memory block could be ref-counted.  But just like Objective-C most of them
 could be factored out.  Add in that D has the shared-ness of the pointer
 built into the type system, and you may have something that is extremely
 effective.
Yep, I can imagine it would work really well, if the front-end implemented the logic to factor out redundant inc/dec ref's.
May 28 2013
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 28 May 2013 at 13:56:03 UTC, Manu wrote:
 Yep, I can imagine it would work really well, if the front-end 
 implemented
 the logic to factor out redundant inc/dec ref's.
It isn't the best idea to do this sort of optimization (entirely) in the front-end, because you really want to be able to aggressively optimize away such redundant operations after inlining at what previously were function boundaries.

But then again, if you use AST-based inlining like DMD does, it might just work… ;)

David
May 28 2013
parent Manu <turkeyman gmail.com> writes:
On 29 May 2013 00:01, David Nadlinger <see klickverbot.at> wrote:

 On Tuesday, 28 May 2013 at 13:56:03 UTC, Manu wrote:

 Yep, I can imagine it would work really well, if the front-end implemented
 the logic to factor out redundant inc/dec ref's.

 It isn't the best idea to do this sort of optimization (entirely) in the
 front-end, because you really want to be able to aggressively optimize away
 such redundant operations after inlining at what previously were function
 boundaries.

 But then again, if you use AST-based inlining like DMD does, it might just
 work… ;)
Can you comment on the complexity of implementing this sort of garbage collection? (not really garbage collection, but serves the same purpose)
May 28 2013
prev sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On 28.05.2013 15:33, Steven Schveighoffer wrote:
 On Sat, 25 May 2013 01:52:10 -0400, Manu <turkeyman gmail.com> wrote:

 What does ObjC do? It seems to work okay on embedded hardware
 (although not
 particularly memory-constrained hardware).
 Didn't ObjC recently reject GC in favour of refcounting?
Having used ObjC for the last year or so working on iOS, it is a very nice memory management model. Essentially, all objects (and only objects) are ref-counted automatically by the compiler. In code, whenever you assign or pass a pointer to an object, the compiler automatically inserts retains and releases extremely conservatively. Then, the optimizer comes along and factors out extra retains and releases, if it can prove they are necessary. What I really like about this is, unlike a library-based solution where every assignment to a 'smart pointer' incurs a release/retain, the compiler knows what this means and will factor them out, removing almost all of them. It's as if you inserted the retains and releases in the most optimized way possible, and it's all for free. Also, I believe the compiler is then free to reorder retains and releases since it understands how they work. Of course, a retain/release is an atomic operation, and requires memory barriers, so the CPU/cache cannot reorder, but the compiler still can. ...
I imagine Microsoft also does a similar thing with their C++/CX language extensions (WinRT handles).
May 28 2013
parent reply Manu <turkeyman gmail.com> writes:
On 29 May 2013 03:27, Paulo Pinto <pjmlp progtools.org> wrote:

 On 28.05.2013 15:33, Steven Schveighoffer wrote:

 On Sat, 25 May 2013 01:52:10 -0400, Manu <turkeyman gmail.com> wrote:

  What does ObjC do? It seems to work okay on embedded hardware
 (although not
 particularly memory-constrained hardware).
 Didn't ObjC recently reject GC in favour of refcounting?
Having used ObjC for the last year or so working on iOS, it is a very nice memory management model. Essentially, all objects (and only objects) are ref-counted automatically by the compiler. In code, whenever you assign or pass a pointer to an object, the compiler automatically inserts retains and releases extremely conservatively. Then, the optimizer comes along and factors out extra retains and releases, if it can prove they are necessary. What I really like about this is, unlike a library-based solution where every assignment to a 'smart pointer' incurs a release/retain, the compiler knows what this means and will factor them out, removing almost all of them. It's as if you inserted the retains and releases in the most optimized way possible, and it's all for free. Also, I believe the compiler is then free to reorder retains and releases since it understands how they work. Of course, a retain/release is an atomic operation, and requires memory barriers, so the CPU/cache cannot reorder, but the compiler still can. ...
I imagine Microsoft also does a similar thing with their C++/CX language extensions (WinRT handles).
Yeah certainly. It's ref counted, not garbage collected.

And Android's V8 uses a "generational incremental collector" (see
http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Generational_GC_.28ephemeral_GC.29
and
http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Stop-the-world_vs._incremental_vs._concurrent)...
That'd be nice!

ObjC and WinRT are both used successfully on embedded hardware, I'm really wondering if this is the way to go for embedded in D.
V8 uses an incremental collector (somehow?), which I've been saying is basically mandatory for embedded/realtime use. Apparently Google agree. Clearly others have already had this quarrel, their resolutions are worth consideration.
Implementing a ref-counted GC would probably be much simpler than V8's mythical incremental collector that probably relies on Java restrictions to operate?
May 28 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 28 May 2013 20:40:03 -0400, Manu <turkeyman gmail.com> wrote:


 ObjC and WinRT are both used successfully on embedded hardware, I'm  
 really
 wondering if this is the way to go for embedded in D.
 V8 uses an incremental collector (somehow?), which I've been saying is
 basically mandatory for embedded/realtime use. Apparently Google agree.
 Clearly others have already had this quarrel, their resolutions are worth
 consideration.
An interesting thing to note, Apple tried garbage collection with Obj-C, but only on MacOS, and it's now been deprecated since automatic reference counting was introduced [1]. It never was on iOS. So that is a telling omission I think. -Steve [1] https://en.wikipedia.org/wiki/Objective-C#Garbage_collection
May 28 2013
next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 29 May 2013 at 00:46:18 UTC, Steven Schveighoffer 
wrote:
 On Tue, 28 May 2013 20:40:03 -0400, Manu <turkeyman gmail.com> 
 wrote:


 ObjC and WinRT are both used successfully on embedded 
 hardware, I'm really
 wondering if this is the way to go for embedded in D.
 V8 uses an incremental collector (somehow?), which I've been 
 saying is
 basically mandatory for embedded/realtime use. Apparently 
 Google agree.
 Clearly others have already had this quarrel, their 
 resolutions are worth
 consideration.
An interesting thing to note, Apple tried garbage collection with Obj-C, but only on MacOS, and it's now been deprecated since automatic reference counting was introduced [1]. It never was on iOS. So that is a telling omission I think. -Steve [1] https://en.wikipedia.org/wiki/Objective-C#Garbage_collection
The main reason was that the GC never worked properly given the C underpinnings of Objective-C.

Too many libraries failed to work properly with GC enabled, plus you needed to fill your code with GC-friendly annotations.

So I imagine Apple tried to find a compromise that would work better in a language with C "safety".

Even that is only supported at the Objective-C language level and it requires both compiler support and that objects inherit from NSObject as the top-most class, as far as I am aware.

Anyway it is way better than pure manual memory management.

--
Paulo
May 29 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-05-29 09:05, Paulo Pinto wrote:

 The main reason was that the GC never worked properly given the C
 underpinnings of Objective-C.

 Too many libraries failed to work properly with GC enabled, plus you
 needed to fill your code with GC friendly annotations.

 So I imagine Apple tried to find a compromises that would work better in
 a language with C "safety".

 Even that is only supported at the Objective-C language level and it
 requires both compiler support and that objects inherit from NSObject as
 top most class, as far as I am aware.

 Anyway it is way better than pure manual memory management.
I'm pretty it works for their CoreFoundation framework which is a C library. NSObject, NSString and other classes are built on top of CoreFoundation. -- /Jacob Carlborg
May 29 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-05-29 09:46:20 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-05-29 09:05, Paulo Pinto wrote:
 
 The main reason was that the GC never worked properly given the C
 underpinnings of Objective-C.
 
 Too many libraries failed to work properly with GC enabled, plus you
 needed to fill your code with GC friendly annotations.
 
 So I imagine Apple tried to find a compromises that would work better in
 a language with C "safety".
 
 Even that is only supported at the Objective-C language level and it
 requires both compiler support and that objects inherit from NSObject as
 top most class, as far as I am aware.
 
 Anyway it is way better than pure manual memory management.
I'm pretty it works for their CoreFoundation framework which is a C library. NSObject, NSString and other classes are built on top of CoreFoundation.
It does for CF types which are toll-free bridged, if you mark them to be GC managed while casting. http://developer.apple.com/library/ios/#documentation/CoreFoundation/Conceptual/CFDesignConcepts/Articles/tollFreeBridgedTypes.html For instance, CFString and NSString are just different APIs for the same underlying object, so you can cast between them. But CoreFoundation itself won't use the GC if you don't involve Objective-C APIs. The interesting thing is that objects managed by the now deprecated Objective-C GC also have a reference count, and won't be candidate for garbage collection until the reference count reaches zero. You can use CFRetain/CFRelease on GC-managed Objective-C objects if you want, it's not a noop. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
May 29 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 29.05.2013 02:46, Steven Schveighoffer wrote:
 On Tue, 28 May 2013 20:40:03 -0400, Manu <turkeyman gmail.com> wrote:


 ObjC and WinRT are both used successfully on embedded hardware, I'm
 really
 wondering if this is the way to go for embedded in D.
 V8 uses an incremental collector (somehow?), which I've been saying is
 basically mandatory for embedded/realtime use. Apparently Google agree.
 Clearly others have already had this quarrel, their resolutions are worth
 consideration.
An interesting thing to note, Apple tried garbage collection with Obj-C, but only on MacOS, and it's now been deprecated since automatic reference counting was introduced [1]. It never was on iOS. So that is a telling omission I think. -Steve [1] https://en.wikipedia.org/wiki/Objective-C#Garbage_collection
Please note that you have to deal with circular references manually in Objective-C, introducing two types of pointers, strong and weak. I don't think this is optimal. If you want to deal with circular references automatically you again need some other kind of other garbage collection running. A problem with the naive approach of atomic reference counting a counter inside the object (as usually done in COM interfaces, I don't know how it is done in Objective-C) is that it is not thread-safe to modify a pointer without locking (or a CAS2 operation that you don't have on popular processors). You can avoid that using deferred reference counting (logging pointer changes to some thread local buffer), but that introduces back a garbage collection step with possibly massive destruction. This step might be done concurrently, but that adds another layer of complexity to finding circles. Another issue might be that incrementing a reference of an object when taking an interior pointer (like you do when using slices) can be pretty expensive because you usually have to find the base of the object to access the counter. I won't dismiss RC garbage collection as impossible, but doing it efficiently and concurrently is not so easy.
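A rough sketch of that deferred scheme in D -- the Entry layout and the increment/decrement/reclaim hooks are all hypothetical:

    // Pointer writes only append to a thread-local log; counts are
    // reconciled later, off the hot path. (Non-shared statics are
    // thread-local by default in D.)
    size_t increment(void* p) { /* hypothetical: return ++count(p) */ return 1; }
    size_t decrement(void* p) { /* hypothetical: return --count(p) */ return 1; }
    void reclaim(void* p) { /* hypothetical: free the object */ }

    struct RCLog
    {
        static struct Entry { void* oldTarget; void* newTarget; }

        static Entry[4096] buffer;
        static size_t used;

        static void logWrite(void* oldTarget, void* newTarget)
        {
            if (used == buffer.length)
                flush();                  // reconcile when the log fills up
            buffer[used++] = Entry(oldTarget, newTarget);
        }

        static void flush()               // the deferred (non-deterministic) part
        {
            foreach (ref e; buffer[0 .. used])
            {
                if (e.newTarget) increment(e.newTarget);
                if (e.oldTarget && decrement(e.oldTarget) == 0)
                    reclaim(e.oldTarget); // may cascade into further frees
            }
            used = 0;
        }
    }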
May 29 2013
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 29 May 2013 at 07:18:49 UTC, Rainer Schuetze wrote:
 On 29.05.2013 02:46, Steven Schveighoffer wrote:
 On Tue, 28 May 2013 20:40:03 -0400, Manu <turkeyman gmail.com> 
 wrote:


 ObjC and WinRT are both used successfully on embedded 
 hardware, I'm
 really
 wondering if this is the way to go for embedded in D.
 V8 uses an incremental collector (somehow?), which I've been 
 saying is
 basically mandatory for embedded/realtime use. Apparently 
 Google agree.
 Clearly others have already had this quarrel, their 
 resolutions are worth
 consideration.
An interesting thing to note, Apple tried garbage collection with Obj-C, but only on MacOS, and it's now been deprecated since automatic reference counting was introduced [1]. It never was on iOS. So that is a telling omission I think. -Steve [1] https://en.wikipedia.org/wiki/Objective-C#Garbage_collection
Please note that you have to deal with circular references manually in Objective-C, introducing two types of pointers, strong and weak. I don't think this is optimal. If you want to deal with circular references automatically you again need some other kind of other garbage collection running. A problem with the naive approach of atomic reference counting a counter inside the object (as usually done in COM interfaces, I don't know how it is done in Objective-C) is that it is not thread-safe to modify a pointer without locking (or a CAS2 operation that you don't have on popular processors). You can avoid that using deferred reference counting (logging pointer changes to some thread local buffer), but that introduces back a garbage collection step with possibly massive destruction. This step might be done concurrently, but that adds another layer of complexity to finding circles. Another issue might be that incrementing a reference of an object when taking an interior pointer (like you do when using slices) can be pretty expensive because you usually have to find the base of the object to access the counter. I won't dismiss RC garbage collection as impossible, but doing it efficiently and concurrently is not so easy.
There is a nice document where it is described alongside all restrictions, https://developer.apple.com/library/mac/#releasenotes/ObjectiveC/RN-TransitioningToARC/Introduction/Introduction.html#//apple_ref/doc/uid/TP40011226
May 29 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 29 May 2013 17:18, Rainer Schuetze <r.sagitario gmx.de> wrote:

 On 29.05.2013 02:46, Steven Schveighoffer wrote:

 On Tue, 28 May 2013 20:40:03 -0400, Manu <turkeyman gmail.com> wrote:


  ObjC and WinRT are both used successfully on embedded hardware, I'm
 really
 wondering if this is the way to go for embedded in D.
 V8 uses an incremental collector (somehow?), which I've been saying is
 basically mandatory for embedded/realtime use. Apparently Google agree.
 Clearly others have already had this quarrel, their resolutions are worth
 consideration.
An interesting thing to note, Apple tried garbage collection with Obj-C, but only on MacOS, and it's now been deprecated since automatic reference counting was introduced [1]. It never was on iOS. So that is a telling omission I think. -Steve [1] https://en.wikipedia.org/wiki/**Objective-C#Garbage_collection<https://en.wikipedia.org/wiki/Objective-C#Garbage_collection>
Please note that you have to deal with circular references manually in Objective-C, introducing two types of pointers, strong and weak. I don't think this is optimal. If you want to deal with circular references automatically you again need some other kind of other garbage collection running. A problem with the naive approach of atomic reference counting a counter inside the object (as usually done in COM interfaces, I don't know how it is done in Objective-C) is that it is not thread-safe to modify a pointer without locking (or a CAS2 operation that you don't have on popular processors). You can avoid that using deferred reference counting (logging pointer changes to some thread local buffer), but that introduces back a garbage collection step with possibly massive destruction. This step might be done concurrently, but that adds another layer of complexity to finding circles. Another issue might be that incrementing a reference of an object when taking an interior pointer (like you do when using slices) can be pretty expensive because you usually have to find the base of the object to access the counter. I won't dismiss RC garbage collection as impossible, but doing it efficiently and concurrently is not so easy.
What do you think is easier, or perhaps even POSSIBLE in D? A good RC approach, or a V8 quality concurrent+incremental GC? I get the feeling either would be acceptable, but I still kinda like idea of the determinism an RC collector offers. I reckon this should probably be the next big ticket for D. The long-standing shared library problems seem to be being addressed.
May 29 2013
next sibling parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-05-29 08:06:15 +0000, Manu <turkeyman gmail.com> said:

 What do you think is easier, or perhaps even POSSIBLE in D?
 A good RC approach, or a V8 quality concurrent+incremental GC?
 I get the feeling either would be acceptable, but I still kinda like idea
 of the determinism an RC collector offers.
Given that both require calling a function of some sort on pointer assignment, I'd say they're pretty much equivalent in implementation effort. One thing the compiler should do with RC that might require some effort is cancel out redundant increment/decrement pairs inside functions, and also offer some kind of weak pointer to deal with cycles. On the GC side, well, you have to write the new GC.

Also, with RC, you have to be careful not to create cycles with closures. Those are often hard to spot absent an explicit list of captured variables.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
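To visualize the pairing optimization (RCObject, use, and the commented retain/release calls are all hypothetical):

    struct RCObject { /* hypothetical ref-counted handle */ }
    void use(RCObject o) {}

    void caller(RCObject o)
    {
        // A naive ARC-style lowering would emit:
        //     retain(o);    // for the copy passed to use()
        //     use(o);
        //     release(o);   // that copy dies immediately afterwards
        // The pair provably cancels, so the optimizer keeps only:
        use(o);
    }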
May 29 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 29.05.2013 10:06, Manu wrote:
 What do you think is easier, or perhaps even POSSIBLE in D?
 A good RC approach, or a V8 quality concurrent+incremental GC?
I think none of them is feasible without write barriers on pointer modifications in heap memory. That means extra code needs to be generated for each pointer modification (if the compiler cannot optimize it away, as LLVM seems to be doing in the case of Objective-C). As an alternative, Leandro's concurrent GC implements them with hardware support by COW, though at a pretty large granularity (page size). I'm not sure if this approach can be sensibly combined with RC or incremental collection.
 I get the feeling either would be acceptable, but I still kinda like
 idea of the determinism an RC collector offers.
If you want it to be safe and efficient, it needs to use deferred reference counting, and this ain't so deterministic anymore. The good thing about it is that you usually don't have to scan the whole heap to find candidates for reclamation.
 I reckon this should probably be the next big ticket for D. The
 long-standing shared library problems seem to be being addressed.
The GC proposed by Leandro looks very promising, though it needs support by the hardware and the OS. I think we should see how far we can get with this approach.
May 30 2013
parent reply Manu <turkeyman gmail.com> writes:
On 30 May 2013 19:50, Rainer Schuetze <r.sagitario gmx.de> wrote:

 On 29.05.2013 10:06, Manu wrote:

 What do you think is easier, or perhaps even POSSIBLE in D?
 A good RC approach, or a V8 quality concurrent+incremental GC?
I think none of them is feasible without write-barriers on pointer modifications in heap memory. That means extra code needs to be generated for each pointer modification (if the compiler cannot optimize it away as LLVM seems to be doing in case of Objectve-C). As an alternative, Leandros concurrent GC implements them with hardware support by COW, though at a pretty large granularity (page size). I'm not sure if this approach can be sensibly combined with RC or incremental collection.
I'm talking about embedded hardware. No virtualisation, tight memory limit, no significant OS. Is it possible? I get the feeling either would be acceptable, but I still kinda like
 idea of the determinism an RC collector offers.
If you want it to be safe and efficient, it needs to use deferred reference counting, and this ain't so deterministic anymore. The good thing about it is that you usually don't have to scan the whole heap to find candidates for reclamation.
Well, it's a bit more deterministic, at least you could depend on the deferred free happening within a frame let's say, rather than at some un-knowable future time when the GC feels like performing a collect... That said, I'd be interested to try it without a deferred free. Performance impact depends on the amount of temporaries/frees... I don't imagine it would impact much/at-all since there is so little memory allocation or pointer assignments in realtime software. People use horrific C++ smart pointer templates successfully, without any compiler support at all. It works because the frequency of pointer assignments is so low. RC is key to avoid scanning the whole heap, which completely destroys your dcache. I reckon this should probably be the next big ticket for D. The
 long-standing shared library problems seem to be being addressed.
The GC proposed by Leandro looks very promising, though it needs support by the hardware and the OS. I think we should see how far we can get with this approach.
His GC looked good, clearly works better for the sociomantic guys, but I can't imagine it, or anything like it, will ever work on embedded platforms? No hardware/OS support... is it possible to emulate the required features?
May 30 2013
next sibling parent reply "Dicebot" <m.strashun gmail.com> writes:
On Thursday, 30 May 2013 at 11:17:08 UTC, Manu wrote:
 His GC looked good, clearly works better for the sociomantic 
 guys, but I
 can't imagine it, or anything like it, will ever work on 
 embedded platforms?
 No hardware/OS support... is it possible to emulate the 
 requires features?
Well, anything that is done by the OS can also be done by the program itself ;) I am more curious - is it possible to have a sane design for both cases within one code base?
May 30 2013
parent reply Manu <turkeyman gmail.com> writes:
On 30 May 2013 21:20, Dicebot <m.strashun gmail.com> wrote:

 On Thursday, 30 May 2013 at 11:17:08 UTC, Manu wrote:

 His GC looked good, clearly works better for the sociomantic guys, but I
 can't imagine it, or anything like it, will ever work on embedded
 platforms?
 No hardware/OS support... is it possible to emulate the requires features?
Well, anything that is done by OS can also be done by program itself ;) I am more curious - is it possible to have a sane design for both cases within one code base?
Which 'both' cases?
May 30 2013
parent reply "Dicebot" <m.strashun gmail.com> writes:
On Thursday, 30 May 2013 at 11:31:53 UTC, Manu wrote:
 Which 'both' cases?
"OS support for fork+CoW" vs "no support, own implementation"
May 30 2013
parent reply "Diggory" <diggsey googlemail.com> writes:
On Thursday, 30 May 2013 at 11:34:20 UTC, Dicebot wrote:
 On Thursday, 30 May 2013 at 11:31:53 UTC, Manu wrote:
 Which 'both' cases?
"OS support for fork+CoW" vs "no support, own implementation"
If you can modify the DMD compiler to output a special sequence of instructions whenever you assign to a pointer type then you can do a concurrent/incremental GC with minimal OS or hardware support.
May 30 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-05-30 12:04:09 +0000, "Diggory" <diggsey googlemail.com> said:

 If you can modify the DMD compiler to output a special sequence of 
 instructions whenever you assign to a pointer type then you can do a 
 concurrent/incremental GC with minimal OS or hardware support.
This also happens to be the same requirement for automatic reference counting. I thought about implementing that for my D/Objective-C compiler (which has been stalled for a while).

The job isn't that big: just replace any pointer assignment/initialization by a call to a template function (that can be inlined) doing the assignment, and it becomes very easy to implement such things by tweaking a template in druntime.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
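A sketch of what such a druntime customization point might look like -- __ptrStore and the commented-out barrier bodies are hypothetical; no such template exists today:

    // The compiler would lower every pointer store `lhs = rhs;` into
    // `__ptrStore(lhs, rhs);`, which inlines back to a plain store by default.
    pragma(inline, true)
    void __ptrStore(T)(ref T* lhs, T* rhs)
    {
        // ARC flavour:            retain(rhs); release(lhs);
        // concurrent-GC flavour:  if (collecting) rememberOld(lhs);
        lhs = rhs;
    }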
May 30 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 30.05.2013 13:16, Manu wrote:
 On 30 May 2013 19:50, Rainer Schuetze <r.sagitario gmx.de
 <mailto:r.sagitario gmx.de>> wrote:



     On 29.05.2013 10:06, Manu wrote:


         What do you think is easier, or perhaps even POSSIBLE in D?
         A good RC approach, or a V8 quality concurrent+incremental GC?


     I think none of them is feasible without write-barriers on pointer
     modifications in heap memory. That means extra code needs to be
     generated for each pointer modification (if the compiler cannot
     optimize it away as LLVM seems to be doing in case of Objectve-C).
     As an alternative, Leandros concurrent GC implements them with
     hardware support by COW, though at a pretty large granularity (page
     size). I'm not sure if this approach can be sensibly combined with
     RC or incremental collection.


 I'm talking about embedded hardware. No virtualisation, tight memory
 limit, no significant OS. Is it possible?

         I get the feeling either would be acceptable, but I still kinda like
         idea of the determinism an RC collector offers.


     If you want it to be safe and efficient, it needs to use deferred
     reference counting, and this ain't so deterministic anymore. The
     good thing about it is that you usually don't have to scan the whole
     heap to find candidates for reclamation.


 Well, it's a bit more deterministic, at least you could depend on the
 deferred free happening within a frame let's say, rather than at some
 un-knowable future time when the GC feels like performing a collect...

 That said, I'd be interested to try it without a deferred free.
 Performance impact depends on the amount of temporaries/frees... I don't
 imagine it would impact much/at-all since there is so little memory
 allocation or pointer assignments in realtime software.
 People use horrific C++ smart pointer templates successfully, without
 any compiler support at all. It works because the frequency of pointer
 assignments is so low.
 RC is key to avoid scanning the whole heap, which completely destroys
 your dcache.

         I reckon this should probably be the next big ticket for D. The
         long-standing shared library problems seem to be being addressed.


     The GC proposed by Leandro looks very promising, though it needs
     support by the hardware and the OS. I think we should see how far we
     can get with this approach.


 His GC looked good, clearly works better for the sociomantic guys, but I
 can't imagine it, or anything like it, will ever work on embedded platforms?
 No hardware/OS support... is it possible to emulate the requires features?
I suspected embedded systems would not have enough support for COW. I think the only way to emulate it would be with write barriers, and then you can do better than emulating page protection.

The way Michel Fortin proposed to implement it (lowering pointer writes to some druntime-defined template) is also how I imagine it. A template argument that specifies whether the compiler knows that it is a stack access would be nice as well.

One possible complication: memory block operations would have to treat pointer fields differently somehow.
May 30 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
 One possible complication: memory block operations would have to treat
 pointer fields differently somehow.
Would they? Shouldn't it be possible to make this part of the post-blit constructor? Kind Regards Benjamin Thaut
May 30 2013
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 30.05.2013 22:59, Benjamin Thaut wrote:
 One possible complication: memory block operations would have to treat
 pointer fields differently somehow.
Would they? Shouldn't it be possible to make this part of the post-blit constructor?
Not in general, e.g. reference counting needs to know the state before and after the copy.
May 30 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-05-31 06:02:20 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 On 30.05.2013 22:59, Benjamin Thaut wrote:
 One possible complication: memory block operations would have to treat
 pointer fields differently somehow.
Would they? Shouldn't it be possible to make this part of the post-blit constructor?
Not in general, e.g. reference counting needs to know the state before and after the copy.
No. Reference counting would work with post-blit: you have the pointer, you just need to increment the reference count once. Also, if you're moving instead of copying there's no post-blit called but there's also no need to change the reference count so it's fine. What wouldn't work with post-blit (I think) is a concurrent GC, as the GC will likely want to be notified when pointers are moved. Post-blit doesn't help there, and the compiler currently assumes it can move things around without calling any function. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
May 31 2013
parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 31.05.2013 12:54, Michel Fortin wrote:
 On 2013-05-31 06:02:20 +0000, Rainer Schuetze <r.sagitario gmx.de> said:

 On 30.05.2013 22:59, Benjamin Thaut wrote:
 One possible complication: memory block operations would have to treat
 pointer fields differently somehow.
Would they? Shouldn't it be possible to make this part of the post-blit constructor?
Not in general, e.g. reference counting needs to know the state before and after the copy.
No. Reference counting would work with post-blit: you have the pointer, you just need to increment the reference count once. Also, if you're moving instead of copying there's no post-blit called but there's also no need to change the reference count so it's fine.
I was thinking about struct assignment through copying and then calling the postblit constructor, not copy construction. But I forgot about the swap semantics involved. If I interpret the disassembly correctly, the assignment in

    S s1, s2;
    s2 = s1;

translates to

    S tmp1, tmp2;
    memcpy(&tmp1, &s1);
    tmp1.__postblit;      // user defined this(this)
    s2.opAssign(tmp1);    // makes a copy of tmp1 on the stack
    // opAssign does:
    memcpy(&tmp2, &s2);
    memcpy(&s2, &tmp1);
    tmp1.__dtor;

There are a number of additional copies of the original structs, but the number of constructor/destructor calls is balanced. That should work for reference counting.
 What wouldn't work with post-blit (I think) is a concurrent GC, as the
 GC will likely want to be notified when pointers are moved. Post-blit
 doesn't help there, and the compiler currently assumes it can move
 things around without calling any function.
It would not allow to create a write barrier that needs to atomically change the pointer at a given location, or at least to record the old value before overwriting it with the new value. But that might not exclude concurrency, for example a concurrent GC with deferred reference counting.
May 31 2013
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 29 May 2013 at 00:40:16 UTC, Manu wrote:
 On 29 May 2013 03:27, Paulo Pinto <pjmlp progtools.org> wrote:

 Am 28.05.2013 15:33, schrieb Steven Schveighoffer:

 On Sat, 25 May 2013 01:52:10 -0400, Manu 
 <turkeyman gmail.com> wrote:

  What does ObjC do? It seems to work okay on embedded hardware
 (although not
 particularly memory-constrained hardware).
 Didn't ObjC recently reject GC in favour of refcounting?
Having used ObjC for the last year or so working on iOS, it is a very nice memory management model. Essentially, all objects (and only objects) are ref-counted automatically by the compiler. In code, whenever you assign or pass a pointer to an object, the compiler automatically inserts retains and releases extremely conservatively. Then, the optimizer comes along and factors out extra retains and releases, if it can prove they are necessary. What I really like about this is, unlike a library-based solution where every assignment to a 'smart pointer' incurs a release/retain, the compiler knows what this means and will factor them out, removing almost all of them. It's as if you inserted the retains and releases in the most optimized way possible, and it's all for free. Also, I believe the compiler is then free to reorder retains and releases since it understands how they work. Of course, a retain/release is an atomic operation, and requires memory barriers, so the CPU/cache cannot reorder, but the compiler still can. ...
I imagine Microsoft also does a similar thing with their C++/CX language extensions (WinRT handles).
Yeah certainly. It's ref counted, not garbage collected. And Android's V8 uses a "generational<http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Generational_GC_.28ephemeral_GC.29> incremental<http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Stop-the-world_vs._incremental_vs._concurrent> collector"... That'd be nice! ObjC and WinRT are both used successfully on embedded hardware, I'm really wondering if this is the way to go for embedded in D. V8 uses an incremental collector (somehow?), which I've been saying is basically mandatory for embedded/realtime use. Apparently Google agree. Clearly others have already had this quarrel, their resolutions are worth consideration. Implementing a ref-counted GC would probably be much simpler than V8's mythical incremental collector that probably relies on Java restrictions to operate?
Actually what I was implying was the cleverness of the compiler to remove unnecessary increment/decrement operations via dataflows, similar to what Clang does. Otherwise you pay too much for performance impact specially if multiple threads access the same objects. An incremental real time GC wins hands down in such scenarios. Google IO is always a nice source of information on how V8 works, https://developers.google.com/events/io/sessions/324431687 https://developers.google.com/events/io/sessions/324908972 -- Paulo
May 29 2013
prev sibling parent "Patrick Down" <patrick.down gmail.com> writes:
On Saturday, 25 May 2013 at 05:29:31 UTC, deadalnix wrote:

 This is technically possible, but you said you make few 
 allocations. So with the tax on pointer write or the reference 
 counting, you'll pay a lot to collect very few garbages. I'm 
 not sure the tradeoff is worthwhile.
Incidentally, I ran across this paper that talks about a reference counted garbage collector that claims to address this issue. Might be of interest to this group.

http://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon03Pure.pdf

From the paper:

 There are two primary problems with reference counting, namely: (1) run-time overhead of incrementing and decrementing the reference count each time a pointer is copied, particularly on the stack; and (2) inability to detect cycles and consequent necessity of including a second garbage collection technique to deal with cyclic garbage. In this paper we present new algorithms that address these problems and describe a new multiprocessor garbage collector based on these techniques that achieves maximum measured pause times of 2.6 milliseconds over a set of eleven benchmark programs that perform significant amounts of memory allocation.
May 25 2013
prev sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Sat, 25 May 2013 01:16:47 +1000
Manu <turkeyman gmail.com> wrote:
 Errr, well, 1ms is about 7% of the frame, that's quite a long time.
 I'd be feeling pretty uneasy about any library that claimed to want
 7% of the whole game time, and didn't offer any visual/gameplay
 benefits... Maybe if the GC happened to render some sweet water
 effects, or perform some awesome cloth physics or something while it
 was at it ;)
Heh, I think that'd be Nobel Prize territory. "Side Effect Oriented Development" - it'd be like old-school optimization, but it maintains safety and developer sanity. :)
 I think 2% sacrifice for simplifying memory management would probably
 get through without much argument.
 That's ~300µs... a few hundred microseconds seems reasonable. Maybe a
 little more if targeting 30fps.
 If it stuck to that strictly, I'd possibly even grant it permission
 to stop the world...
Perhaps a naive idea, but would running the GC in a fiber be a feasible approach? Every time the GC fiber is activated, it checks the time, and then it has various points where it yields if the elapsed time passes a threshold value.

I see two problems though:

1. The state of GC-controlled heaps can change while the GC fiber is yielded. I don't know how much that could screw things up, or if the issue is even solvable.

2. Does a fiber context-switch take too long? If so, what about a stackless fiber? Ex: http://dunkels.com/adam/pt/
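The time-budgeted yielding part, at least, is easy to express with core.thread today. A minimal sketch with a dummy workload standing in for the collector - this is not a GC, just the slicing mechanism:

import core.thread : Fiber;
import core.time : Duration, msecs;
import std.datetime.stopwatch : AutoStart, StopWatch;

// Incremental dummy workload: yields whenever the slice budget is spent.
void incrementalWork(int[] items, Duration budget)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (ref item; items)
    {
        item *= 2;              // stand-in for "process one object"
        if (sw.peek > budget)
        {
            Fiber.yield();      // hand control back to the caller
            sw.reset();         // fresh budget for the next slice
        }
    }
}

void main()
{
    auto items = new int[100_000];
    auto worker = new Fiber({ incrementalWork(items, 1.msecs); });
    while (worker.state != Fiber.State.TERM)
        worker.call();          // one bounded slice, e.g. once per frame
}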
May 24 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 05/24/2013 01:51 AM, Joseph Rushton Wakeling wrote:
 Maybe someone else can point to an example, but I can't think of any language
 prior to D that has both the precision and speed to be useful for games and
 embedded programming, and that also has GC built in.
 
 So it seems to me that this might well be an entirely new problem, as no other
 GC language or library has had the motivation to create something that
satisfies
 these use parameters.
Don't have the experience to judge it, but someone made a remark about Nimrod that might be relevant here: http://www.reddit.com/r/programming/comments/1fc9jt/dmd_2063_the_d_programming_language_reference/ca968xg
May 31 2013
prev sibling next sibling parent "Flamaros" <flamaros.xavier gmail.com> writes:
On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 [...]
As a game developer I would really enjoy being able to develop our games in D, and for the kind of games we do, the major issue isn't the GC but portability and linking with third-party libraries (mostly for our internal tools).

We essentially work on point & click games: https://www.facebook.com/pages/Koalabs-Studio/380167978739812?ref=stream

A lot of game companies target many architectures like ARM, x86, or PowerPC,... For our internal tools we essentially use Qt, but I haven't tried QtD for the moment. I don't get the chance to work on D during my work time at the moment.
May 23 2013
prev sibling next sibling parent Piotr Szturmaj <bncrbme jadamspam.pl> writes:
On 23.05.2013 20:13, Brad Anderson wrote:
  nogc comes to mind (I believe Andrei mentioned it during one of the
 talks released). [1][2]

 1: http://d.puremagic.com/issues/show_bug.cgi?id=5219
 2: http://wiki.dlang.org/DIP18
When I started learning D 2 years ago, I read on the D webpage that D allows manual memory management and that it's possible to disable the GC. My first thought was that the standard library was written without using the GC. This could be a kind of lowest-common-denominator solution to the problem. Later, I found it was only my wishful thinking. So, you can disable the GC, but then you can't reliably use the standard library.

The lowest common denominator has its own weaknesses, mainly that it sometimes sacrifices performance, as some algorithms may perform better using managed slices, for example, than using manually managed memory.

A @nogc attribute could be used not only to block the GC; it could be used to select between GC and non-GC code with the help of the overloading mechanism. So the programmer, instead of writing one function, would actually write two functions, one for the GC and one for manual memory management - only if they need separate code. This will surely double the effort for some functions, but certainly not for the whole library. The majority of code doesn't need separate functions, mainly because it's non-allocating code. But some, like containers, would surely need these two "branches". It would be inconvenient to do twice the work when writing programs, but I'm not so sure it is when writing a library. And this is just because it's a _library_, usually written once and then reused. I think that the double effort is not that discouraging in this case. So, I'd kindly suggest to at least think about this. I'm proposing that @nogc functions could be overloaded similarly to immutable/const functions.

The other idea is to divide threads into two thread groups: managed and unmanaged. This is like running two programs together, one written in D and one written in C++. If we can run managed and unmanaged processes separately, why not run two analogous "subprocesses" inside one process? @nogc could help with that. Obviously, such thread groups must NOT share any mutable data. They could communicate with some sort of IPC, or perhaps ITC - std.concurrency comes to mind (a rough sketch follows below). This kind of separation _inside one process_ could help many applications. Imagine a real-time sound application, with a managed GUI and an unmanaged real-time sound loop. I know, everything is possible now, but I'd rather wait for a safe and clean solution - the one in the D style :)
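The message-passing half of that thread-group idea can already be sketched with std.concurrency. Note that the spawned thread below is still fully registered with the runtime (it would take runtime support to actually exempt it from pauses), so this only illustrates the no-shared-mutable-data communication pattern:

import core.stdc.stdlib : free, malloc;
import std.concurrency : ownerTid, receiveOnly, send, spawn;

// The would-be "unmanaged" side: allocates only via malloc/free and
// hands its result across the boundary by value.
void unmanagedWorker(int n)
{
    auto buf = cast(double*) malloc(n * double.sizeof);
    assert(buf !is null);
    scope (exit) free(buf);

    double sum = 0;
    foreach (i; 0 .. n)
    {
        buf[i] = i * 0.5;
        sum += buf[i];
    }
    ownerTid.send(sum);   // a plain value crosses; no heap is shared
}

void main()
{
    spawn(&unmanagedWorker, 1000);   // the "managed" side keeps using the GC
    auto total = receiveOnly!double();
    assert(total > 0);
}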
May 23 2013
prev sibling next sibling parent reply "QAston" <qaston gmail.com> writes:
On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 There was a lot of interesting stuff in Benjamin Thaut's 
 article about GC versus manual memory management in a game [4] 
 and the discussion about it on the forums [5].  A lot of this 
 collective knowledge built up on manual memory management 
 techniques specific to D should probably be formalized and 
 added to the official documentation.  There is a Memory 
 Management [6] page in the documentation but it appears to be 
 rather dated at this point and not particularly applicable to 
 modern D2 (no mention of emplace or scoped and it talks about 
 using delete and scope classes).

 Game development is one place D can really get a foothold but 
 all too often the GC is held over D's head because people 
 taking their first look at D don't know how to avoid using it 
 and often don't realize you can avoid using it entirely. This 
 is easily the most common issue raised by newcomers to D with a 
 C or C++ background that I see in the #d IRC channel (many of 
 which are interested in game dev but concerned the GC will kill 
 their game's performance).
I think that Phobos should have some support for manual memory management. I don't mean clearing out the GC usage there, as it's fairly obvious. I rather think about something like unique_ptr/shared_ptr in the std. I think unique_ptr can't be implemented without rvalue refs, and the C++ solutions may not fit here. Anyway, right now it's not so straightforward how to live without the GC, so a standard solution would be really helpful.

Also, it should be visible that D can really deal with manual memory management conveniently - when I checked out Dlang the first time, I felt very disappointed that the "delete" operator is deprecated. "So - they advertise one can code without GC, yet they seem to deprecate the operator" - false claims discourage people from using new languages.
May 23 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 23 May 2013 16:02:05 -0400, QAston <qaston gmail.com> wrote:

 Also, it should be visible in C++/D that D can really deal with manual  
 memory management conveniently - when I checked out Dlang first time I  
 felt very disappointed that "delete" operator is deprecated. "So - they  
 advertise one can code without GC, yet they seem to deprecate the  
 operator" - false claims discourage people from using new languages.
While I'm not specifically addressing the ability or not to disable the GC (I agree D has problems there), deprecating the delete operator does NOT preclude manual memory management.

The problem with delete is that it conflates destruction with deallocation. Yes, when you deallocate you want to destroy, but manual deallocation is a very dangerous operation. Most of the time, you want to destroy WITHOUT deallocating (this is for cases where you are relying on the GC).

Then I think Andrei also had a gripe that D had a whole keyword dedicated to an unsafe operation.

You can still destroy and deallocate with destroy() and GC.free().

-Steve
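Concretely, the foot-gun version looks something like this (a sketch; the GC.free is only safe if no other references to the object remain):

import core.memory : GC;

class Connection
{
    bool open = true;
    ~this() { open = false; }   // stands in for releasing a real resource
}

void main()
{
    auto c = new Connection;

    destroy(c);               // runs the destructor now; the memory is still
                              // owned by the GC, so other references don't
                              // point at freed memory, just a "dead" object

    GC.free(cast(void*) c);   // explicit deallocation: dangerous if anything
                              // else still points at the object
    c = null;
}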
May 23 2013
parent reply "QAston" <qaston gmail.com> writes:
On Thursday, 23 May 2013 at 20:07:08 UTC, Steven Schveighoffer 
wrote:
 While I'm not specifically addressing the ability or not to 
 disable the GC (I agree D has problems there), deprecating the 
 delete operator does NOT preclude manual memory management.

 The problem with delete is it conflates destruction with 
 deallocation.  Yes, when you deallocate, you want to destroy, 
 but manual deallocation is a very dangerous operation.  Most of 
 the time, you want to destroy WITHOUT deallocating (this is for 
 cases where you are relying on the GC).

 Then I think Andrei also had a gripe that D had a whole keyword 
 dedicated to an unsafe operation.

 You can still destroy and deallocate with destroy() and 
 GC.free().

 -Steve
Yes, I know the rationale behind deprecating delete and I agree with it. But from a newcomer's point of view this looks misleading - not everyone has enough patience (or hatred towards C++) to lurk inside mailing lists, and the official website shows the deprecated way of doing things: http://dlang.org/memory.html . IMO a manual memory management howto should be in a visible place - to dispel the myths the language suffers from. Maybe even place in to the malloc-howto in Efficiency paragraph of main website.
May 23 2013
next sibling parent "QAston" <qaston gmail.com> writes:
On Thursday, 23 May 2013 at 20:15:51 UTC, QAston wrote:
 Maybe even place in to the malloc-howto in Efficiency paragraph 
 of main website.
Sorry, that should be: Maybe even place the malloc-howto in the Efficiency paragraph of the main website.
May 23 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 23, 2013 at 10:15:50PM +0200, QAston wrote:
 On Thursday, 23 May 2013 at 20:07:08 UTC, Steven Schveighoffer
 wrote:
While I'm not specifically addressing the ability or not to
disable the GC (I agree D has problems there), deprecating the
delete operator does NOT preclude manual memory management.

The problem with delete is it conflates destruction with
deallocation.  Yes, when you deallocate, you want to destroy, but
manual deallocation is a very dangerous operation.  Most of the
time, you want to destroy WITHOUT deallocating (this is for cases
where you are relying on the GC).

Then I think Andrei also had a gripe that D had a whole keyword
dedicated to an unsafe operation.

You can still destroy and deallocate with destroy() and GC.free().

-Steve
Yes, I know the rationale behind deprecating delete and I agree with it. But from a newcomer's point of view this looks misleading - not everyone has enough patience (or hatred towards C++) to lurk inside mailing lists, and the official website shows the deprecated way of doing things: http://dlang.org/memory.html . IMO a manual memory management howto should be in a visible place - to dispel the myths the language suffers from. Maybe even place in to the malloc-howto in Efficiency paragraph of main website.
Please file a bug on the bugtracker to update memory.html to reflect current usage. Misleading (or outdated) documentation is often worse than no documentation. T -- Lawyer: (n.) An innocence-vending machine, the effectiveness of which depends on how much money is inserted.
May 23 2013
parent 1100110 <0b1100110 gmail.com> writes:
On 05/23/2013 03:21 PM, H. S. Teoh wrote:
 On Thu, May 23, 2013 at 10:15:50PM +0200, QAston wrote:
 On Thursday, 23 May 2013 at 20:07:08 UTC, Steven Schveighoffer
 wrote:
 While I'm not specifically addressing the ability or not to
 disable the GC (I agree D has problems there), deprecating the
 delete operator does NOT preclude manual memory management.

 The problem with delete is it conflates destruction with
 deallocation.  Yes, when you deallocate, you want to destroy, but
 manual deallocation is a very dangerous operation.  Most of the
 time, you want to destroy WITHOUT deallocating (this is for cases
 where you are relying on the GC).

 Then I think Andrei also had a gripe that D had a whole keyword
 dedicated to an unsafe operation.

 You can still destroy and deallocate with destroy() and GC.free().

 -Steve
Yes, I know the rationale behind deprecating delete and I agree with it. But from a newcomer's point of view this looks misleading - not everyone has enough patience (or hatred towards C++) to lurk inside mailing lists, and the official website shows the deprecated way of doing things: http://dlang.org/memory.html . IMO a manual memory management howto should be in a visible place - to dispel the myths the language suffers from. Maybe even place in to the malloc-howto in Efficiency paragraph of main website.
Please file a bug on the bugtracker to update memory.html to reflect current usage. Misleading (or outdated) documentation is often worse than no documentation.

T
Agreed. Even if it's just a "Warning: deprecated" note, it would be much better.
May 24 2013
prev sibling next sibling parent reply "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 23 May 2013 at 20:02:06 UTC, QAston wrote:
 I think that Phobos should have some support for manual memory 
 management. I don't mean clearing out the gc usage there, as 
 it's fairly obvious. I rather think about something like 
 unique_ptr/shared_ptr in the std. I think unique_ptr can't be 
 implemented without rval refs, also C++ sollutions may not fit 
 here. Anyways, now it's not so straightforward how to live 
 without gc so standard sollution would be really helpful.
There is std.typecons.Unique and std.typecons.RefCounted. Unique is more cumbersome than unique_ptr, but it should work, though I've never tried to use it. Proper rvalue references would be a nice improvement here. RefCounted doesn't support classes yet, simply because nobody has taken the time to add support for them. It'd be nice to just be able to say shared_ptr = RefCounted, unique_ptr = Unique when somebody asks about smart pointers in D, though.

std.typecons.scoped is also useful but a bit buggy/cumbersome. jA_cOp (IRC handle) is working on improving it. Manu tried his hand at implementing his own version for fun (which came up because we were engaged in yet another GC argument with someone coming from C++).
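Basic usage of the two looks roughly like this (a sketch; RefCounted keeps its payload on the C heap via malloc/free, and Unique's exact API has shifted between releases):

import std.typecons : RefCounted, Unique;

struct Texture
{
    int id;
    ~this() { /* e.g. release the GPU handle here */ }
}

void main()
{
    // Shared ownership: the payload is destroyed deterministically when
    // the last copy goes out of scope - no GC involvement for the payload.
    auto a = RefCounted!Texture(42);
    {
        auto b = a;              // count: 2
        assert(b.id == 42);      // alias this forwards to the payload
    }                            // count back to 1
    assert(a.id == 42);

    // Single ownership: cannot be copied, only released to a new owner.
    Unique!Texture u = new Texture(7);
    auto v = u.release;          // explicit ownership transfer; u is empty
}                                // last RefCounted copy dies: ~Texture runs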
May 23 2013
parent "QAston" <qaston gmail.com> writes:
On Thursday, 23 May 2013 at 20:51:42 UTC, Brad Anderson wrote:
 On Thursday, 23 May 2013 at 20:02:06 UTC, QAston wrote:
 I think that Phobos should have some support for manual memory 
 management. I don't mean clearing out the gc usage there, as 
 it's fairly obvious. I rather think about something like 
 unique_ptr/shared_ptr in the std. I think unique_ptr can't be 
 implemented without rval refs, also C++ sollutions may not fit 
 here. Anyways, now it's not so straightforward how to live 
 without gc so standard sollution would be really helpful.
There is std.typecons.Unique and std.typecons.RefCounted. Unique is more cumbersome than unique_ptr but it should work though I've never tried to use it. Proper rvalue references would be a nice improvement here. RefCounted doesn't support classes yet simply because nobody has taken the time to add support for them. It'd be nice to just be able to say shared_ptr = RefCounted, unique_ptr = Unique when somebody asks about smart pointers in D though. std.typecons.scoped is also useful but a bit buggy/cumbersome. jA_cOp (IRC handle) is working on improving it. Manu tried his hand at implementing his own version for fun (which came up because we were engaged in yet another GC argument with someone coming from C++).
Thank you very much for the reply - I didn't realize those were already there.
May 23 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, May 23, 2013 22:02:05 QAston wrote:
 I think that Phobos should have some support for manual memory
 management. I don't mean clearing out the gc usage there, as it's
 fairly obvious. I rather think about something like
 unique_ptr/shared_ptr in the std. I think unique_ptr can't be
 implemented without rval refs, also C++ sollutions may not fit
 here. Anyways, now it's not so straightforward how to live
 without gc so standard sollution would be really helpful.
We have std.typecons.RefCounted, which is basically a shared pointer.
 Also, it should be visible in C++/D that D can really deal with
 manual memory management conveniently - when I checked out Dlang
 first time I felt very disappointed that "delete" operator is
 deprecated. "So - they advertise one can code without GC, yet
 they seem to deprecate the operator" - false claims discourage
 people from using new languages.
delete is only used for GC memory, and manual memory management should really be done with malloc and free rather than explicitly freeing GC memory. But if you really want to risk blowing your foot off, you can always use destroy to destroy an object in GC memory and core.memory.GC.free to free it. Also, once we get custom allocators, it should be easier to manually manage memory (e.g. I would assume that it would properly abstract doing malloc and then emplacing the object in that memory so that you do something like allocator!MyObject(args) rather than having to deal with emplace directly). - Jonathan M Davis
May 23 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 24 May 2013 09:02, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Thursday, May 23, 2013 22:02:05 QAston wrote:
 I think that Phobos should have some support for manual memory
 management. I don't mean clearing out the gc usage there, as it's
 fairly obvious. I rather think about something like
 unique_ptr/shared_ptr in the std. I think unique_ptr can't be
 implemented without rval refs, also C++ sollutions may not fit
 here. Anyways, now it's not so straightforward how to live
 without gc so standard sollution would be really helpful.
We have std.typecons.RefCounted, which is basically a shared pointer.
I've always steered away from things like this because it creates a double-indirection. I have thought of making a similar RefCounted template, but where the refCount is stored in a hash table, and the pointer is used to index the table. This means the refCount doesn't pollute the class/structure being ref-counted, or avoids a double-indirection on general access. It will be slightly slower to inc/decrement, but that's a controlled operation. I would use a system like this for probably 80% of resources.
 Also, it should be visible in C++/D that D can really deal with
 manual memory management conveniently - when I checked out Dlang
 first time I felt very disappointed that "delete" operator is
 deprecated. "So - they advertise one can code without GC, yet
 they seem to deprecate the operator" - false claims discourage
 people from using new languages.
delete is only used for GC memory, and manual memory management should really be done with malloc and free rather than explicitly freeing GC memory. But if you really want to risk blowing your foot off, you can always use destroy to destroy an object in GC memory and core.memory.GC.free to free it. Also, once we get custom allocators, it should be easier to manually manage memory (e.g. I would assume that it would properly abstract doing malloc and then emplacing the object in that memory so that you do something like allocator!MyObject(args) rather than having to deal with emplace directly).
Custom allocators will probably be very useful, but if there's one thing STL has taught me, it's hard to use them effectively, and in practise, nobody ever uses them. One problem is the implicit allocation functions (array concatenation, AA's, etc). How to force those to allocate somewhere else for the scope?
May 23 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/23/13 7:42 PM, Manu wrote:
 I've always steered away from things like this because it creates a
 double-indirection.
There's no double indirection for the payload.
 I have thought of making a similar RefCounted template, but where the
 refCount is stored in a hash table, and the pointer is used to index the
 table.
 This means the refCount doesn't pollute the class/structure being
 ref-counted, or avoids a double-indirection on general access.
But that's worse than non-intrusive refcounting, and way worse than intrusive refcounting (which should be the elective method for classes).
 Custom allocators will probably be very useful, but if there's one thing
 STL has taught me, it's hard to use them effectively, and in practise,
 nobody ever uses them.
Agreed.
 One problem is the implicit allocation functions (array concatenation,
 AA's, etc). How to force those to allocate somewhere else for the scope?
I have some ideas. Andrei
May 23 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 24 May 2013 at 00:44:14 UTC, Andrei Alexandrescu wrote:
 Custom allocators will probably be very useful, but if there's 
 one thing
 STL has taught me, it's hard to use them effectively, and in 
 practise,
 nobody ever uses them.
Agreed.
To benefit from a custom allocator, you need to have a very specific use case. Generic allocators are pretty good in most cases.
May 23 2013
parent reply Sean Cavanaugh <WorksOnMyMachine gmail.com> writes:
On 5/24/2013 12:25 AM, deadalnix wrote:
 On Friday, 24 May 2013 at 00:44:14 UTC, Andrei Alexandrescu wrote:
 Custom allocators will probably be very useful, but if there's one thing
 STL has taught me, it's hard to use them effectively, and in practise,
 nobody ever uses them.
Agreed.
To benefit from a custom allocator, you need to be under a very specific use case. Generic allocator are pretty good in most cases.
Most general allocators choke on multi-threaded code, so a large part of customizing allocations is to get rid of lock contention. While STL containers can have basic allocator templates assigned to them, if you really need performance you typically need to control all the different kinds of allocations a container does.

For example, a std::unordered_set allocates a ton of linked-list nodes to keep iterators stable across inserts and removes, but the actual data payload is another separate allocation, as is some kind of root data structure to hold the hash tables. In STL land this is all allocated through a single allocator object, making it very difficult (nearly impossible in a clean way) to allocate the payload data with some kind of fixed-size block allocator and allocate the metadata and linked-list nodes with a different allocator. Some people would complain this exposes implementation details of a class, but the class is a template; it should be able to be configured to work the way you need it to.

class tHashMapNodeDefaultAllocator
{
public:
    static void* allocateMemory(size_t size, size_t alignment)
    {
        return mAlloc(size, alignment);
    }

    static void freeMemory(void* pointer) NOEXCEPT
    {
        mFree(pointer);
    }
};

template <typename DefaultKeyType, typename DefaultValueType>
class tHashMapConfiguration
{
public:
    typedef tHashClass<DefaultKeyType> HashClass;
    typedef tEqualsClass<DefaultKeyType> EqualClass;
    typedef tHashMapNodeDefaultAllocator NodeAllocator;
    typedef tDynamicArrayConfiguration<tHashMapNode<DefaultKeyType, DefaultValueType>> NodeArrayConfiguration;
};

template <typename KeyType, typename ValueType, typename HashMapConfiguration = tHashMapConfiguration<KeyType, ValueType>>
class tHashMap
{
};

// the tHashMap also has an array inside, so there is a way to configure that too:

class tDynamicArrayDefaultAllocator
{
public:
    static void* allocateMemory(size_t size, size_t alignment)
    {
        return mAlloc(size, alignment);
    }

    static void freeMemory(void* pointer) NOEXCEPT
    {
        mFree(pointer);
    }
};

class tDynamicArrayDefaultStrategy
{
public:
    static size_t nextAllocationSize(size_t currentSize, size_t objectSize, size_t numNewItemsRequested)
    {
        // return some size to grow the array by when the capacity is reached
        return currentSize + numNewItemsRequested * 2;
    }
};

template <typename DefaultObjectType>
class tDynamicArrayConfiguration
{
public:
    typedef tDynamicArrayDefaultStrategy DynamicArrayStrategy;
    typedef tDynamicArrayDefaultAllocator DynamicArrayAllocator;
};
May 23 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 24 May 2013 at 05:49:18 UTC, Sean Cavanaugh wrote:
 Most general allocators choke on multi-threaded code, so a 
 large part of customizing allocations is to get rid lock 
 contention.
It is safe to assume that the future is multithreaded and that general allocators won't choke on that for long. They already exist; you probably don't need (and don't want, if you are not affected by NIH syndrome) to roll your own here.
May 23 2013
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-05-23 23:42:10 +0000, Manu <turkeyman gmail.com> said:

 I have thought of making a similar RefCounted template, but where the
 refCount is stored in a hash table, and the pointer is used to index the
 table.
 This means the refCount doesn't pollute the class/structure being
 ref-counted, or avoids a double-indirection on general access.
 It will be slightly slower to inc/decrement, but that's a controlled
 operation.
 I would use a system like this for probably 80% of resources.
I just want to note that this is exactly how reference counts are handled in Apple's Objective-C implementation, with a spin-lock protecting the table. Actually, on OS X (but not on iOS) there are 4 tables (if I remember well), and which table to use is determined by bits 4 & 5 of the pointer. It probably helps when you have more cores.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
May 23 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 23 May 2013 at 23:42:22 UTC, Manu wrote:
 I've always steered away from things like this because it 
 creates a
 double-indirection.
 I have thought of making a similar RefCounted template, but 
 where the
 refCount is stored in a hash table, and the pointer is used to 
 index the
 table.
 This means the refCount doesn't pollute the class/structure 
 being
 ref-counted, or avoids a double-indirection on general access.
 It will be slightly slower to inc/decrement, but that's a 
 controlled
 operation.
 I would use a system like this for probably 80% of resources.
Reference counting also tends to create a die-in-masses effect (objects tend to die in clusters) and freeze the program for a while. I'm not sure it is that much better (better than the current D GC for sure, but I'm not sure it is better than a good GC). It probably depends on the usage pattern.
May 23 2013
parent Manu <turkeyman gmail.com> writes:
On 24 May 2013 15:21, deadalnix <deadalnix gmail.com> wrote:

 On Thursday, 23 May 2013 at 23:42:22 UTC, Manu wrote:

 I've always steered away from things like this because it creates a
 double-indirection.
 I have thought of making a similar RefCounted template, but where the
 refCount is stored in a hash table, and the pointer is used to index the
 table.
 This means the refCount doesn't pollute the class/structure being
 ref-counted, or avoids a double-indirection on general access.
 It will be slightly slower to inc/decrement, but that's a controlled
 operation.
 I would use a system like this for probably 80% of resources.
Reference counting also tends to create a die-in-masses effect (objects tend to die in clusters) and freeze the program for a while. I'm not sure it is that much better (better than the current D GC for sure, but I'm not sure it is better than a good GC). It probably depends on the usage pattern.
In my experience that's fine. In realtime code, you tend not to allocate/deallocate at runtime, unless it's some short-lived temps, which tend not to cluster the way you describe... When you eventually do free some big resources, causing a clustered free, you will probably have done it at an appropriate time, where you intended such a thing to happen.
May 23 2013
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
23-May-2013 22:13, Brad Anderson wrote:
 [...]

  nogc comes to mind (I believe Andrei mentioned it during one of the
 talks released). [1][2]
I have a simple and future-proof proposal:

1. Agree on what containers should look like (the API level is fine, and std.container has it). Postpone allocators, or consider them baked into the container.

2. Then, for any function that has to allocate something (typically an array), add a compile-time parameter: the container to use. Obviously there has to be a constraint on what kinds of operations it must provide.

3. std.algorithm and std.range become usable. We can then extend this policy beyond.

Some examples to boot:

1. std.array.array - an incredibly nice tool that turns any range into an array. Let's make a construct function that does the same for any container:

auto arr = array(iota(0, 10).map....)
--->
auto arr = construct!(Array!int)(iota(0, 10).map...)

by repeatedly calling insertAny in general, and doing better things depending on the primitives available (like reserving space beforehand for array-like types). BTW users can use alias FTW:

alias toArray = construct!(Array!int); // Yay!

2. schwartzSort - allocates an array internally. We just need to pass it the right replacement type for the array:

schwartzSort!(Array!int)(...) - no GC required now.

Ditto for levenshteinDistance etc. There could be some limitations on how far such an approach can go with introducing new overloads. The alternative is new functions with some suffix/prefix.

--
Dmitry Olshansky
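A minimal sketch of that construct function under those assumptions - it uses only an insertBack primitive (std.container's arrays have one) rather than dispatching on everything available, and it skips the reserve optimisation:

import std.range.primitives : isInputRange;

// Build any container from any range via a single insertion primitive.
template construct(C)
{
    C construct(R)(R r)
    if (isInputRange!R)
    {
        C c;
        foreach (e; r)
            c.insertBack(e);   // assumed primitive; a real version would
                               // pick the best one the container offers
        return c;
    }
}

unittest
{
    import std.algorithm.iteration : map;
    import std.container.array : Array;
    import std.range : iota;

    auto arr = construct!(Array!int)(iota(0, 10).map!(x => x * x));
    assert(arr.length == 10 && arr[3] == 9);

    alias toArray = construct!(Array!int);   // the alias trick from above
    auto arr2 = toArray(iota(0, 5));
    assert(arr2[4] == 4);
}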
May 23 2013
prev sibling next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 23.05.2013 20:13, Brad Anderson wrote:
 [...]
With the increased usage of Windows Phone 8 (I know, I know) and MonoGame, I am afraid of D losing that train, even with Remedy's good example.

--
Paulo
May 23 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 05/23/2013 08:43 PM, H. S. Teoh wrote:
 I listened to Manu's talk yesterday, and I agree with what he said, that
 Phobos functions that don't *need* to allocate, shouldn't. Andrei was
 also enthusiastic about std.algorithm being almost completely
 allocation-free. Maybe we should file bugs (enhancement requests?) for
 all such Phobos functions?
I'm also in agreement with Manu. There may well already be bugs for some of them -- e.g. there is one for toUpperInPlace which he referred to, and the source of the allocation is clear and is even responsible for other bugs: http://d.puremagic.com/issues/show_bug.cgi?id=9629 I asked for a list because, even if all the cases are registered as bugs, it's not necessarily easy to find them. So, we need to either tag all the bugs so they can be found easily, or make a list somewhere.
May 23 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-05-23 23:42, Joseph Rushton Wakeling wrote:

 I'm also in agreement with Manu.  There may well already be bugs for some of
 them -- e.g. there is one for toUpperInPlace which he referred to, and the
 source of the allocation is clear and is even responsible for other bugs:
 http://d.puremagic.com/issues/show_bug.cgi?id=9629
toUpper/toLower cannot be made in-place if they should handle all of Unicode. Some characters will change their length when converted to/from uppercase. Examples of these are the German double S and some Turkish I's.

--
/Jacob Carlborg
May 24 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-May-2013 13:49, Jacob Carlborg wrote:
 On 2013-05-23 23:42, Joseph Rushton Wakeling wrote:

 I'm also in agreement with Manu.  There may well already be bugs for
 some of
 them -- e.g. there is one for toUpperInPlace which he referred to, and
 the
 source of the allocation is clear and is even responsible for other bugs:
 http://d.puremagic.com/issues/show_bug.cgi?id=9629
toUpper/lower cannot be made in place if it should handle all Unicode. Some characters will change their length when convert to/from uppercase. Examples of these are the German double S and some Turkish I.
Yes! Now we're getting somewhere. The function was a mistake to begin with. -- Dmitry Olshansky
May 24 2013
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 24 May 2013 at 09:49:40 UTC, Jacob Carlborg wrote:
 toUpper/lower cannot be made in place if it should handle all 
 Unicode. Some characters will change their length when convert 
 to/from uppercase. Examples of these are the German double S 
 and some Turkish I.
In that case it should only allocate when needed. Most strings are ASCII and will not change size.
May 24 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-05-24 12:01, Peter Alexander wrote:

 In that case it should only allocate when needed. Most strings are ASCII
 and will not change size.
What I mean is that something called "InPlace" doesn't go hand in hand with something that allocates. There's always std.ascii. -- /Jacob Carlborg
May 24 2013
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 24 May 2013 at 12:29:43 UTC, Jacob Carlborg wrote:
 On 2013-05-24 12:01, Peter Alexander wrote:

 In that case it should only allocate when needed. Most strings 
 are ASCII
 and will not change size.
What I mean is that something called "InPlace" doesn't go hand in hand with something that allocates. There's always std.ascii.
Ah right, I see your point. My bad.
May 24 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 05/24/2013 11:49 AM, Jacob Carlborg wrote:
 toUpper/lower cannot be made in place if it should handle all Unicode. Some
 characters will change their length when convert to/from uppercase. Examples of
 these are the German double S and some Turkish I.
Surely it's possible to check in-place whether the character length changes, and ensure in-place replacement without any allocation if it doesn't; a sketch follows below.

(To be honest, it feels a bit like a design flaw in Unicode that character length can change between lower- and uppercase.)
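A sketch of that check, using std.uni's simple 1:1 character mappings - overwrite in place while the UTF-8 length is preserved, and fall back to the allocating path only when a length actually changes:

import std.uni : toUpper;
import std.utf : decode, encode;

void toUpperMostlyInPlace(ref char[] s)
{
    size_t i = 0;
    while (i < s.length)
    {
        size_t j = i;
        immutable dchar c = decode(s, j);   // j advances past the code point
        immutable dchar u = toUpper(c);     // simple 1:1 mapping
        char[4] buf;
        immutable n = encode(buf, u);
        if (n != j - i)
        {
            // Rare case: the mapping changes the encoded length, so the
            // tail has to be rebuilt with an allocation after all.
            import std.conv : to;
            s = s[0 .. i] ~ toUpper(s[i .. $].idup).to!(char[]);
            return;
        }
        s[i .. j] = buf[0 .. n];            // same length: overwrite in place
        i = j;
    }
}

unittest
{
    auto s = "hello, world".dup;
    toUpperMostlyInPlace(s);
    assert(s == "HELLO, WORLD");   // pure ASCII path: no allocation
}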
May 24 2013
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 24 May 2013 at 13:37:36 UTC, Joseph Rushton Wakeling 
wrote:
 (To be honest, feels a bit of a design flaw in Unicode that 
 character length can
 change between lower- and uppercase.)
Unfortunately it's either that or lose compatibility with ASCII. Lower case dotted-i needs to be one byte for ASCII, and upper case dotted-i isn't ASCII, so it needs to be more than one byte. P.S. it's a problem with UTF-8, not Unicode.
May 24 2013
parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On 2013-05-24, 16:24, Peter Alexander wrote:

 On Friday, 24 May 2013 at 13:37:36 UTC, Joseph Rushton Wakeling wrote:
 (To be honest, feels a bit of a design flaw in Unicode that character
 length can change between lower- and uppercase.)

 Unfortunately it's either that or lose compatibility with ASCII. Lower
 case dotted-i needs to be one byte for ASCII, and upper case dotted-i
 isn't ASCII, so it needs to be more than one byte.

One could certainly have two different lowercase dotted I's, with one
mapping to I and the other to İ, and their unicode values close to the
upper-case versions.

--
Simen
May 24 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 24 May 2013 19:49, Jacob Carlborg <doob me.com> wrote:

 On 2013-05-23 23:42, Joseph Rushton Wakeling wrote:

  I'm also in agreement with Manu. There may well already be bugs for some of
  them -- e.g. there is one for toUpperInPlace which he referred to, and the
  source of the allocation is clear and is even responsible for other bugs:
  http://d.puremagic.com/issues/show_bug.cgi?id=9629

 toUpper/lower cannot be made in place if it should handle all Unicode.
 Some characters will change their length when converted to/from uppercase.
 Examples of these are the German double S and some Turkish I.

ß and SS are both actually 2 bytes, so it works in UTF-8 at least! ;)
May 24 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-May-2013 18:38, Manu wrote:
 On 24 May 2013 19:49, Jacob Carlborg <doob me.com <mailto:doob me.com>>
 wrote:

     On 2013-05-23 23:42, Joseph Rushton Wakeling wrote:

         I'm also in agreement with Manu.  There may well already be bugs
         for some of
         them -- e.g. there is one for toUpperInPlace which he referred
         to, and the
         source of the allocation is clear and is even responsible for
         other bugs:
          http://d.puremagic.com/issues/show_bug.cgi?id=9629


     toUpper/lower cannot be made in place if it should handle all
     Unicode. Some characters will change their length when convert
     to/from uppercase. Examples of these are the German double S and
     some Turkish I.


 ß and SS are both actually 2 bytes, so it works in UTF-8 at least! ;)
Okay, here you go - a UTF-8 table of cased sin :)

Codepoint - upper-case - lower-case
0x01e9e : 0x000df - 3 : 2
0x0023a : 0x02c65 - 2 : 3
0x0023e : 0x02c66 - 2 : 3
0x02c7e : 0x0023f - 3 : 2
0x02c7f : 0x00240 - 3 : 2
0x02c6f : 0x00250 - 3 : 2
0x02c6d : 0x00251 - 3 : 2
0x02c70 : 0x00252 - 3 : 2
0x0a78d : 0x00265 - 3 : 2
0x0a7aa : 0x00266 - 3 : 2
0x02c62 : 0x0026b - 3 : 2
0x02c6e : 0x00271 - 3 : 2
0x02c64 : 0x0027d - 3 : 2
0x01e9e : 0x000df - 3 : 2
0x02c62 : 0x0026b - 3 : 2
0x02c64 : 0x0027d - 3 : 2
0x0023a : 0x02c65 - 2 : 3
0x0023e : 0x02c66 - 2 : 3
0x02c6d : 0x00251 - 3 : 2
0x02c6e : 0x00271 - 3 : 2
0x02c6f : 0x00250 - 3 : 2
0x02c70 : 0x00252 - 3 : 2
0x02c7e : 0x0023f - 3 : 2
0x02c7f : 0x00240 - 3 : 2
0x0a78d : 0x00265 - 3 : 2
0x0a7aa : 0x00266 - 3 : 2

(each pair appears twice because both the upper- and lower-case member of the pair is visited)

And this is only with 1:1 mapping. Generated by:

void main()
{
    import std.uni, std.utf, std.stdio;
    char[4] buf;
    foreach (dchar ch; unicode.Cased_Letter.byCodepoint)
    {
        dchar upper = toUpper(ch);
        dchar lower = toLower(ch);
        auto uLen = encode(buf, upper);
        auto lLen = encode(buf, lower);
        if (uLen != lLen)
            writefln("0x%05x : 0x%05x - %d : %d", upper, lower, uLen, lLen);
    }
}

--
Dmitry Olshansky
May 24 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, May 24, 2013 09:42:10 Manu wrote:
 On 24 May 2013 09:02, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On Thursday, May 23, 2013 22:02:05 QAston wrote:
 I think that Phobos should have some support for manual memory
 management. I don't mean clearing out the gc usage there, as it's
 fairly obvious. I rather think about something like
 unique_ptr/shared_ptr in the std. I think unique_ptr can't be
 implemented without rval refs, also C++ sollutions may not fit
 here. Anyways, now it's not so straightforward how to live
 without gc so standard sollution would be really helpful.
We have std.typecons.RefCounted, which is basically a shared pointer.
I've always steered away from things like this because it creates a double-indirection. I have thought of making a similar RefCounted template, but where the refCount is stored in a hash table, and the pointer is used to index the table. This means the refCount doesn't pollute the class/structure being ref-counted, or avoids a double-indirection on general access. It will be slightly slower to inc/decrement, but that's a controlled operation. I would use a system like this for probably 80% of resources.
We use smart pointers where I work, and they're a godsend for avoiding memory problems. We almost never have any, whereas the idiots who designed the older software used manual refcounting everywhere, and they had tons of memory problems. But while we need to be performant, we don't need to be performant on quite the level that you do. So maybe it's more of a problem in your environment.
 Also, it should be visible in C++/D that D can really deal with
 
 manual memory management conveniently - when I checked out Dlang
 first time I felt very disappointed that "delete" operator is
 deprecated. "So - they advertise one can code without GC, yet
 they seem to deprecate the operator" - false claims discourage
 people from using new languages.
delete is only used for GC memory, and manual memory management should really be done with malloc and free rather than explicitly freeing GC memory. But if you really want to risk blowing your foot off, you can always use destroy to destroy an object in GC memory and core.memory.GC.free to free it. Also, once we get custom allocators, it should be easier to manually manage memory (e.g. I would assume that it would properly abstract doing malloc and then emplacing the object in that memory so that you do something like allocator!MyObject(args) rather than having to deal with emplace directly).
Custom allocators will probably be very useful, but if there's one thing STL has taught me, it's hard to use them effectively, and in practise, nobody ever uses them.
Well, as Andrei said, they're hard, which is why they aren't done yet. Another thing to think about with regards to C++ is the fact that its new and delete don't have anything to do with a GC, so it has a built-in, nice way of allocating memory which is managed manually, whereas in D we're forced to use emplace, which is a lot more of a hassle. Even simply having something like allocator.make!MyObj(args) and allocator.free(args) would really help out. There's no question, though, that they get hairier when you start having to worry about containers and internal allocations and the like. It's a tough problem.
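A sketch of what such a wrapper could look like for structs, built from the pieces that exist today (malloc plus std.conv.emplace; the make/dispose names are hypothetical, and classes would additionally need __traits(classInstanceSize) and the void[] overload of emplace):

import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

// malloc the storage, construct with emplace, destroy + free on the way out.
T* make(T, Args...)(Args args)
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null, "out of memory");
    return emplace(p, args);
}

void dispose(T)(T* p)
{
    destroy(*p);   // run the destructor
    free(p);       // hand the memory back to the C heap
}

struct Vec { float x, y, z; }

void main()
{
    auto v = make!Vec(1, 2, 3);
    scope (exit) dispose(v);
    assert(v.y == 2);
}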
 One problem is the implicit allocation functions (array concatenation,
 AA's, etc). How to force those to allocate somewhere else for the scope?
I would fully expect that they use the GC and only the GC, as they're language constructs, and custom allocators are going to be library constructs. The allocators may provide clean ways to do stuff like concatenating arrays using their API rather than ~, but if you really want to manipulate arrays with slicing and concatenation and whatnot without the GC, I think that you're pretty much going to have to create a new type to handle them, which is very doable, but it does mean not using the built-in arrays as much, which does kind of suck.

But for most programs, I would expect that simply managing the GC more intelligently for stuff that has to be GC allocated would solve the problem nicely. Kiith-Sa and others have managed quite well at getting the GC to work efficiently by managing when it's enabled and gets the chance to run and whatnot. You have extremely stringent requirements that may cause problems with that (though Kiith-Sa was doing a game of some variety IIRC), but pretty much the only way to make it so that built-in stuff that allocates doesn't use the GC is to use your own version of druntime. Kiith-Sa had a good post on how to go about dealing with the GC in performant code a while back:

http://forum.dlang.org/post/vbsajlgotanuhmmpnspf forum.dlang.org

Regardless, we're not going to get away from some language features requiring the GC, but they're also not features that exist in C++, so if you really can't use them, you still haven't lost anything over C++ (as much as it may still suck to not be able to use them), and there are still plenty of other great features that you can take advantage of.

- Jonathan M Davis
May 23 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 24 May 2013 09:57, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 [...]
/agree, except the issue I raised: when ~ is used in Phobos, that function is now off-limits. And there's no way to know which functions they are...

May 23 2013
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 24 May 2013 01:11:17 +0100, Manu <turkeyman gmail.com> wrote:
 /agree, except the issue I raised: when ~ is used in Phobos, that
 function is now off-limits. And there's no way to know which functions
 they are...
It's not the allocation caused by ~ which is the issue though, is it; it's the collection it might trigger, right?

So what you really need are 3 main things:

1. A way to prevent the GC collecting until a given point(*).
2. A way to limit the GC collection time.
3. For Phobos functions to be optimised to not allocate, or to use alloca where possible.

(*) Until the collection point the GC would ask the OS for more memory (a new pool or page) or fail and throw an Error. Much like in Leandro's concurrent GC talk/example where he talks about eager allocation.

The idea is this... Let's imagine you can mark a thread as not stopped by the pause-the-world. Let's imagine it still does allocations which we want to collect at some stage. How would this work?

1. The GC would remove the thread stack and global space from its list of roots scanned by normal collections. It would not pause the thread on normal collections.

2. (*) above would be in effect: the first allocation in the thread would cause the GC to create a thread-local pool; this pool would not be shared by other threads (no locking required, not scanned by normal GC collections). This pool could be pre-allocated by a new GC primitive "GC.allocLocalPool();" for efficiency. Allocation would come from this thread-local pool, or trigger a new pool allocation - so minimal locking should be required.

3. The thread would call a new GC primitive at the point where collection was desired, i.e. "GC.localCollect(size_t maxMicroSecs);". This collection would be special: it would not stop the thread, but would occur inline. It would only scan the thread-local pool, and would do so with an enforced upper bound on collection time.

4. There are going to be issues around 'shared' /mutable/ data, e.g.
 - the non-paused thread accessing it (esp. during collection)
 - if the thread allocated 'shared' data
I am hoping that if the thread main function is marked as notpaused (or similar), then the compiler can statically verify neither of these occurs and produce a compile-time error.

So, that's the idea. I don't know the current GC all that well, so I've probably missed something crucial. I doubt this idea is revolutionary, and it is perhaps debatable whether the complexity is worth the effort, and also whether it actually makes placing an upper bound on the collection any easier.

Thoughts?

R

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
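For comparison, point 1 above (without the bounded pause of point 2) is already expressible with the current druntime API - hold collections off during the frame and collect only at a boundary you choose:

import core.memory : GC;

void frameLoop(scope void delegate() updateFrame, scope bool delegate() running)
{
    GC.disable();              // allocations still succeed; collections are
    scope (exit) GC.enable();  // deferred (barring out-of-memory conditions)

    while (running())
    {
        updateFrame();       // may allocate; no collection mid-frame

        GC.enable();
        GC.collect();        // unbounded pause, but at a point we choose
        GC.disable();
    }
}

void main()
{
    int frames = 0;
    frameLoop({ auto tmp = new int[64]; },   // stand-in per-frame work
              () => ++frames < 3);
}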
May 24 2013
next sibling parent reply "Dicebot" <m.strashun gmail.com> writes:
On Friday, 24 May 2013 at 10:24:13 UTC, Regan Heath wrote:
 It's not the allocation caused by ~ which is the issue though 
 is it, it's the collection it might trigger, right?
Depends. When it comes to real-time software you can't say without studying the specific task requirements. Stop-the-world collection is a complete disaster but, for example, a concurrent collector like the one Leandro has shown can satisfy soft real-time requirements - but only if the heap size managed by the GC stays reasonably low, thus the need to verify that you don't allocate in unexpected ways.
 So what you really need are 3 main things:

 1. A way to prevent the GC collecting until a given point(*).
You can do that now. It does not help if the world is stopped and/or you can't limit the collection time.
 2. A way to limit the GC collection time.
Or run it concurrently at low priority. That will do for a lot of _soft_ real-time.
 3. For phobos functions to be optimised to not allocate or to 
 use alloca where possible.
A really important one, as it helps not only game dev and soft real-time servers, but also hardcore embedded.
May 24 2013
parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 24 May 2013 11:38:40 +0100, Dicebot <m.strashun gmail.com> wrote:

 On Friday, 24 May 2013 at 10:24:13 UTC, Regan Heath wrote:
 It's not the allocation caused by ~ which is the issue though is it,  
 it's the collection it might trigger, right?
Depends. When it comes to real-time software you can't say without studying the specific task requirements. Stop-the-world collection is a complete disaster but, for example, a concurrent collector like the one Leandro has shown can satisfy soft real-time requirements - but only if the heap size managed by the GC stays reasonably low, thus the need to verify that you don't allocate in unexpected ways.
 So what you really need are 3 main things:

 1. A way to prevent the GC collecting until a given point(*).
You can do that now. It does not help if the world is stopped and/or you can't limit the collection time.
If you disable collection and the GC then runs out of memory, what happens? Does it simply ask the OS for more? I assumed, from Leandro's talk, that it would block on the GC lock until a collection completed, or simply fail if collection was disabled.

Also, the key to the idea I gave was to control collection only in the real-time thread/part of the application.
 2. A way to limit the GC collection time.
Or run it concurrently at low priority. That will do for a lot of _soft_ real-time.
I don't think Manu is doing _soft_ real-time; he wants a hard guarantee it will not exceed 100us (or similar). Concurrent collection may be a possible solution as well, but if you think about it, my idea is basically a second isolated collector running in a real-time context concurrently.
 3. For phobos functions to be optimised to not allocate or to use  
 alloca where possible.
A really important one, as it helps not only game dev and soft real-time servers, but also hardcore embedded.
Sure, it's desirable to be more efficient, but it's no longer essential if the allocations no longer cost you anything in the real-time thread - that's the point.

What do you think of the idea of making marked threads exempt from normal GC processing and isolating their allocations to a single page/pool in order to control and reduce collection times?

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
May 24 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 24 May 2013 20:24, Regan Heath <regan netmail.co.nz> wrote:

 On Fri, 24 May 2013 01:11:17 +0100, Manu <turkeyman gmail.com> wrote:

 /agree, except the issue I raised, when ~ is used in phobos.
 That means that function is now off-limits. And there's no way to know
 which functions they are...
It's not the allocation caused by ~ which is the issue though, is it; it's the collection it might trigger, right?
Yes, but the unpredictability is the real concern. It's hard to control something that you don't know about. If the phobos function can avoid the allocation, then why not avoid it?

 So what you really need are 3 main things:
 1. A way to prevent the GC collecting until a given point(*).
 2. A way to limit the GC collection time.
 3. For phobos functions to be optimised to not allocate or to use alloca
 where possible.


I think we can already do this.

The incremental (+ precise) GC idea - I think this would be the silver bullet
for games!



Yes, I think effort to improve this would be universally appreciated.


(*) Until the collection point the GC would ask the OS for more memory (a
 new pool or page) or fail and throw an Error.  Much like in Leandro's
 concurrent GC talk/example where he talks about eager allocation.
Bear in mind, most embedded hardware does not have virtual memory, and often has a fairly small hard memory limit. If we are trying to manually sequence allocations and collects - scheduling collects for when you change scenes on a black screen, for instance - then you can't have random phobos functions littering small allocations all over the place.
 this is..

 Lets imagine you can mark a thread as not stopped by the pause-the-world.
  Lets imagine it still does allocations which we want to collect at some
 stage.  How would this work..

 1. The GC would remove the thread stack and global space from it's list of
 roots scanned by normal collections.  It would not pause it on normal
 collections.

 2. (*) above would be in effect, the first allocation in the thread would
 cause the GC to create a thread local pool, this pool would not be shared
 by other threads (no locking required, not scanned by normal GC
 collections).  This pool could be pre-allocated by a new GC primitive
 "GC.allocLocalPool();" for efficiency.  Allocation would come from this
 thread-local pool, or trigger a new pool allocation - so minimal locking
 should be required.

 3. The thread would call a new GC primitive at the point where collection
 was desired i.e. "GC.localCollect(size_t maxMicroSecs);".  This collection
 would be special, it would not stop the thread, but would occur inline.  It
 would only scan the thread local pool and would do so with an enforced
 upper bound collection time.

 4. There are going to be issues around 'shared' /mutable/ data, e.g.

  - The non-paused thread accessing it (esp during collection)
  - If the thread allocated 'shared' data

 I am hoping that if the thread main function is marked as  notpaused (or
 similar) then the compiler can statically verify neither of these occur and
 produce a compile time error.

 So, that's the idea.  I don't know the current GC all that well so I've
 probably missed something crucial.  I doubt this idea is revolutionary and
 it is perhaps debatable whether the complexity is worth the effort, also
 whether it actually makes placing an upper bound on the collection any
 easier.

 Thoughts?
It sounds kinda complex... but I'm not qualified to comment.
May 24 2013
parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 24 May 2013 15:50:43 +0100, Manu <turkeyman gmail.com> wrote:
 On 24 May 2013 20:24, Regan Heath <regan netmail.co.nz> wrote:

 It sounds kinda complex... but I'm not qualified to comment.
Yeah, there is complexity. It all boils down to whether it is possible, using modern GC techniques (precise, incremental, etc.), to perform a collection in the ~300us you require. If a full collection cannot be done in that time, perhaps a smaller subset can - that is where I was heading with this idea.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
May 24 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
I might just add that there are some other important targets as well in the
vein of this discussion.

DLLs *still* don't work properly. druntime/phobos still don't really work
as DLLs.
They are getting some attention, but it's been a really long-standing and
seriously major issue. Shared libraries are like, important!


May 24 2013
parent Benjamin Thaut <code benjamin-thaut.de> writes:
On 24.05.2013 17:02, Manu wrote:
 I might just add that there are some other important targets as well in
 the vein of this discussion.

 DLLs *still* don't work properly. druntime/phobos still don't really
 work as DLLs.
 They are getting some attention, but it's been a really long-standing
 and seriously major issue. Shared libraries are like, important!
Fully agree there. See http://d.puremagic.com/issues/show_bug.cgi?id=9816

Kind Regards
Benjamin Thaut
May 24 2013
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, May 24, 2013 10:11:17 Manu wrote:
 /agree, except the issue I raised, when ~ is used in phobos.
 That means that function is now off-limits. And there's no way to know
 which functions they are...
Yes, we need to look at that. I actually don't think that ~ gets used much (primarily because so much of Phobos uses ranges, which don't have ~), but it's something that we need to look out for and address.

The suggestion of an @nogc attribute which at least guarantees that new isn't used would be nice for that, since it could potentially both guarantee it and document it. But that wouldn't work with templated functions for the most part, since the types being used with them might allocate, though we could presumably add attribute inference for that, so that the functions which call them can be marked with @nogc and be able to know that the functions that they're calling obey it.

My guess is that the functions most likely to allocate are those which specifically take strings or arrays, as they _can_ use ~, so they probably need to be examined first; but in some cases, they're also the type of function which may _have_ to allocate, depending on what they're doing. Probably the right approach is to track down all of those that are allocating, make it so that any which can avoid the allocation do, and then create overloads which take an output range or somesuch for those that have to allocate, so that preallocated memory and the like can be used for them (a sketch of that overload pattern follows this post). And if we actually have any which can't possibly do anything but allocate, they should be clearly documented as such.

All around though, figuring out how to minimize GC usage in Phobos and enforce that is an open problem which is still very much up for discussion (particularly since it's quite easy to introduce inadvertent allocations with some stuff). But with everything else that we've had to worry about, optimizations like that haven't been as high a priority as they're going to need to be long term. At some point, we're probably going to need to benchmark stuff more aggressively and optimize Phobos in general more, because it's the standard library. And eliminating unnecessary memory allocations definitely goes along with that.

- Jonathan M Davis
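A minimal sketch of such an overload pair. joinWithDash is an invented example; appender, put and isOutputRange are real Phobos facilities:

    import std.array : appender;
    import std.range : isOutputRange, put;

    // Convenience overload: allocates and returns a new string.
    string joinWithDash(string a, string b)
    {
        auto buf = appender!string();
        joinWithDash(buf, a, b);
        return buf.data;
    }

    // Core overload: writes into whatever output range the caller provides,
    // so preallocated buffers can be used and no allocation is forced here.
    void joinWithDash(Output)(ref Output sink, string a, string b)
        if (isOutputRange!(Output, char))
    {
        put(sink, a);
        put(sink, '-');
        put(sink, b);
    }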
May 23 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 23 May 2013 20:21:47 -0400, "Jonathan M Davis" <jmdavisProg gmx.com> wrote:

 At some point, we're probably going to need to 
 benchmark stuff more agressively and optimize Phobos in general more, because 
 it's the standard library. And eliminating unnecessary memory allocations 
 definitely goes along with that.
 
 - Jonathan M Davis
On a related note, a while back I benchmarked the naive Phobos approach to creating a Windows API (wchar) string from a D string against using alloca to convert the string on a piece of stack memory, like this: http://dpaste.1azy.net/b60d37d4

IIRC the alloca version was 13(!) times faster for ~100 chars of English text and 5 times faster for some multi-byte characters. I think this approach is too hackish for Phobos, but it demonstrates that there is much room.

-- 
Marco
May 23 2013
parent reply Manu <turkeyman gmail.com> writes:
On 24 May 2013 14:11, Marco Leise <Marco.Leise gmx.de> wrote:

 On Thu, 23 May 2013 20:21:47 -0400, "Jonathan M Davis" <jmdavisProg gmx.com> wrote:

 At some point, we're probably going to need to
 benchmark stuff more agressively and optimize Phobos in general more,
because
 it's the standard library. And eliminating unnecessary memory allocations
 definitely goes along with that.

 - Jonathan M Davis
On a related note, a while back I benchmarked the naive Phobos approach to create a Windows API (wchar) string from a D string with using alloca to convert the string on a piece of stack memory like this: http://dpaste.1azy.net/b60d37d4 IIRC it was 13(!) times faster for ~100 chars of English text and 5 times for some multi-byte characters. I think this approach is too hackish for Phobos, but it demonstrates that there is much room.
I don't think it's hack-ish at all; that's precisely what the stack is there for. It would be awesome for people to use alloca in places where it makes sense, especially in cases where the function is a leaf or leaf-stem (ie, if there is no possibility of recursion) - there, using the stack should be encouraged. For safety, obviously phobos should do something like:

  void[] buffer = bytes < reasonable_anticipated_buffer_size ? alloca(bytes) : new void[bytes];

toStringz is a very common source of allocations. This alloca approach would be great in those cases, filenames in particular. (A self-contained version of this pattern is sketched below.)
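Filling that out as a compilable sketch - the threshold and function name are invented, and note that alloca'd memory only lives until the enclosing function returns, which is why the pattern sits at the call site rather than inside a helper:

    import core.stdc.stdlib : alloca;

    void processBytes(size_t bytes)
    {
        enum stackLimit = 4096; // assumed threshold; tune per call site

        // Small requests come from the stack, large ones from the GC heap.
        void* p = bytes <= stackLimit ? alloca(bytes) : (new void[bytes]).ptr;
        void[] buffer = p[0 .. bytes];

        // ... fill and use buffer; the alloca branch vanishes on return ...
    }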
May 23 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 24 May 2013 at 05:02:33 UTC, Manu wrote:
 On 24 May 2013 14:11, Marco Leise <Marco.Leise gmx.de> wrote:
 I don't think it's hack-ish at all, that's precisely what the 
 stack is
 there for. It would be awesome for people to use alloca in 
 places that it
 makes sense.
 Especially in cases where the function is a leaf or leaf-stem 
 (ie, if there
 is no possibility of recursion), then using the stack should be 
 encouraged.
 For safety, obviously phobos should do something like:
   void[] buffer = bytes < reasonable_anticipated_buffer_size ?
 alloca(bytes) : new void[bytes];
That is probably something that could be handled in the optimizer in many cases.
May 23 2013
next sibling parent Manu <turkeyman gmail.com> writes:
On 24 May 2013 15:29, deadalnix <deadalnix gmail.com> wrote:

 On Friday, 24 May 2013 at 05:02:33 UTC, Manu wrote:

 On 24 May 2013 14:11, Marco Leise <Marco.Leise gmx.de> wrote:
 I don't think it's hack-ish at all, that's precisely what the stack is
 there for. It would be awesome for people to use alloca in places that it
 makes sense.
 Especially in cases where the function is a leaf or leaf-stem (ie, if
 there
 is no possibility of recursion), then using the stack should be
 encouraged.
 For safety, obviously phobos should do something like:
   void[] buffer = bytes < reasonable_anticipated_buffer_size ?
 alloca(bytes) : new void[bytes];
That is probably something that could be handled in the optimizer in many cases.
The optimiser probably can't predict whether the function may recurse, and as such, the amount of memory it's reasonable to take from the stack is hard to predict... It could possibly do so for leaf functions only, but then most of the opportunities aren't in leaf functions. I'd say the majority of phobos allocations are created when passing strings through to library/system calls.
May 23 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, May 24, 2013 15:37:39 Manu wrote:
 I'd say a majority of phobos
 allocations are created when passing strings through to library/system
 calls.
That does sound probable, as toStringz will often (and unpredictably) result in allocations, and it does seem like a prime location for at least attempting to use a static array instead as you suggested. But if toStringz _wouldn't_ result in an allocation, then copying to a static array would be inadvisable, so we're probably going to need a function which does toStringz's test so that it can be used outside of toStringz. - Jonathan M Davis
May 23 2013
prev sibling parent Manu <turkeyman gmail.com> writes:
On 24 May 2013 15:44, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Friday, May 24, 2013 15:37:39 Manu wrote:
 I'd say a majority of phobos
 allocations are created when passing strings through to library/system
 calls.
That does sound probable, as toStringz will often (and unpredictably) result in allocations, and it does seem like a prime location for at least attempting to use a static array instead as you suggested. But if toStringz _wouldn't_ result in an allocation, then copying to a static array would be inadvisable, so we're probably going to need a function which does toStringz's test so that it can be used outside of toStringz.
Yeah: an alloca-based cstring helper which performs the zero-terminated check, and then, if the string is not terminated and is short enough, does alloca and copy, else, if too long, new. I'm sure that would be a handy little template, and it would improve phobos a lot. (A rough sketch follows.)
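A rough sketch of what such a helper could look like. All names are invented; it uses a fixed-size buffer inside a caller-held struct rather than alloca, since alloca'd memory dies when the function that called it returns, and it skips the already-terminated check for brevity:

    struct TempCString
    {
        @disable this(this);          // the buffer must not be copied around

        private char[256] buf = void; // stack scratch space
        private const(char)* p;

        this(const(char)[] s)
        {
            if (s.length < buf.length)   // leaves room for the terminator
            {
                buf[0 .. s.length] = s[];
                buf[s.length] = '\0';
                p = buf.ptr;
            }
            else                         // too long: fall back to the GC
            {
                auto heap = new char[s.length + 1];
                heap[0 .. s.length] = s[];
                heap[s.length] = '\0';
                p = heap.ptr;
            }
        }

        const(char)* ptr() const { return p; }
    }

    // usage: auto tmp = TempCString(filename); someCFunction(tmp.ptr);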
May 23 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-May-2013 09:02, Manu wrote:
 On 24 May 2013 14:11, Marco Leise <Marco.Leise gmx.de
 <mailto:Marco.Leise gmx.de>> wrote:

     Am Thu, 23 May 2013 20:21:47 -0400
     schrieb "Jonathan M Davis" <jmdavisProg gmx.com
     <mailto:jmdavisProg gmx.com>>:

      > At some point, we're probably going to need to
      > benchmark stuff more agressively and optimize Phobos in general
     more, because
      > it's the standard library. And eliminating unnecessary memory
     allocations
      > definitely goes along with that.
      >
      > - Jonathan M Davis

     On a related note, a while back I benchmarked the naive Phobos
     approach to create a Windows API (wchar) string from a D
     string with using alloca to convert the string on a piece of
     stack memory like this: http://dpaste.1azy.net/b60d37d4
     IIRC it was 13(!) times faster for ~100 chars of English text
     and 5 times for some multi-byte characters.
     I think this approach is too hackish for Phobos, but it
     demonstrates that there is much room.


 I don't think it's hack-ish at all, that's precisely what the stack is
 there for. It would be awesome for people to use alloca in places that
 it makes sense.
 Especially in cases where the function is a leaf or leaf-stem (ie, if
 there is no possibility of recursion), then using the stack should be
 encouraged.
 For safety, obviously phobos should do something like:
    void[] buffer = bytes < reasonable_anticipated_buffer_size ?
 alloca(bytes) : new void[bytes];

 toStringz is a very common source of allocations. This alloca approach
 would be great in those cases, filenames in particular.
Alternatively, just make a TLS buffer the scratchpad and use that everywhere (sketched below).

-- 
Dmitry Olshansky
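A minimal sketch of the idea - module-level variables in D are thread-local by default, so each thread gets its own scratch buffer with no locking (names assumed):

    private char[] scratch; // TLS: one instance per thread

    char[] scratchBuffer(size_t n)
    {
        if (scratch.length < n)
            scratch.length = n; // grows occasionally, then gets reused
        // beware: two simultaneous users in one thread would alias each other
        return scratch[0 .. n];
    }

The trade-off versus the stack, as Dmitry notes further down, is that such a buffer can be passed across function boundaries and can grow arbitrarily large.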
May 24 2013
next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 24 May 2013 at 09:40:03 UTC, Dmitry Olshansky wrote:
 Alternatively just make a TLS buffer as scratchpad and use that 
 everywhere.
I believe that's what TempAlloc is for.
May 24 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 24 May 2013 19:40, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:

 [...]
 Alternatively just make a TLS buffer as scratchpad and use that
 everywhere.

How is that any different than just using the stack in practice?
May 24 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-May-2013 18:35, Manu wrote:
 [...]

 How is that any different than just using the stack in practice?
Can pass across function boundaries up/down. Can grow arbitrarily large without blowing up.

-- 
Dmitry Olshansky
May 24 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-05-24 07:02, Manu wrote:

 I don't think it's hack-ish at all, that's precisely what the stack is
 there for. It would be awesome for people to use alloca in places that
 it makes sense.
 Especially in cases where the function is a leaf or leaf-stem (ie, if
 there is no possibility of recursion), then using the stack should be
 encouraged.
 For safety, obviously phobos should do something like:
    void[] buffer = bytes < reasonable_anticipated_buffer_size ?
 alloca(bytes) : new void[bytes];

 toStringz is a very common source of allocations. This alloca approach
 would be great in those cases, filenames in particular.
Basically every function in Tango that operates on some kind of array takes an array and an optional buffer (also an array). If the buffer is too small, it will allocate using the GC; if not, it won't allocate and builds the result in place (sketched below). That worked great with D1, where strings weren't immutable.

-- 
/Jacob Carlborg
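An invented example of that signature style in D2 terms - use the caller's buffer when it is big enough, otherwise fall back to a GC allocation:

    char[] asciiUpper(const(char)[] input, char[] buffer = null)
    {
        auto result = buffer.length >= input.length
            ? buffer[0 .. input.length]   // caller's buffer: no allocation
            : new char[input.length];     // too small (or null): GC fallback

        foreach (i, c; input)
            result[i] = (c >= 'a' && c <= 'z') ? cast(char)(c - 32) : c;

        return result;
    }

Usage: char[64] tmp; auto s = asciiUpper(name, tmp[]); - no allocation as long as name fits in tmp. As Jacob says, D2's immutable strings make this pattern less pleasant, since the result often needs a copy or a cast to become a string.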
May 24 2013
prev sibling next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
 Johannes Pfau's work in progress -vgc command line option [3] 
 would be another great tool that would help people identify GC 
 allocations.  This or something similar could also be used to 
 document throughout phobos when GC allocations can happen (and 
 help eliminate it where it makes sense to).
I have yet to look at any of these entries, but I went ahead and built phobos with Johannes' -vgc and put the output into a spreadsheet: http://goo.gl/HP78r (google spreadsheet)

I'm not exactly sure whether this catches templates or not. This wasn't a unittest build, just building phobos. I did try to build the unittests with -vgc, but it runs out of memory trying to build std/algorithm.d. There is substantially more -vgc output when building the unit tests, though.

Obviously a lot of these aren't going anywhere, but there are probably some interesting things to be found wading through this.
May 23 2013
prev sibling next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Besides my studies I'm working at Havok, and the biggest problems would most likely be (in order of importance):

- Compiler / druntime for all 9 platforms we have to support simply do not exist.

- Full Visual Studio integration is needed, including really good code completion and a very nice debugging experience for all platforms. VisualD is quite nice, and debugging using the Visual Studio debugger works quite well, but it's a real pita that you have to patch dmd and compile it from source so you can debug in x64 on windows.

- SIMD: core.simd is just not there yet. The last time I looked, really basic stuff like unaligned loads was missing.

- The GC. A no-go; a GC-free (non-leaking) version of the runtime should be provided.

- Better windows support. All of the development we do happens on windows, and most of D's community does not care about windows support. I'm curious how long it will take until D gets proper DLL support.

Kind Regards
Benjamin Thaut
May 24 2013
parent reply Manu <turkeyman gmail.com> writes:
On 25 May 2013 04:20, Benjamin Thaut <code benjamin-thaut.de> wrote:

 Besides my studies I'm working at Havok, and the biggest problems would most likely be (in order of importance):

 - Compiler / druntime for all 9 platforms we have to support simply do not exist.
Yup.
 - Full Visual Studio integration is needed, including really good code
 completion and a very nice debugging experience for all platforms. VisualD
 is quite nice, and debugging using the Visual Studio debugger works quite
 well, but it's a real pita that you have to patch dmd and compile it from
 source so you can debug in x64 on windows.
Win64 works for me out of the box... ?
 - SIMD: core.simd is just not there yet. The last time I looked, really
 basic stuff like unaligned loads was missing.
I'm working on std.simd (slowly >_<) .. It'll get there.
 - The GC. A no-go; a GC-free (non-leaking) version of the runtime should
 be provided.
See, I have spent a decade on core tech/engine code meticulously worrying about memory allocation. I don't think a GC is an outright no-go. But we certainly don't have a GC that fits the bill.
 - Better windows support. All of the development we do happens on windows,
 and most of D's community does not care about windows support. I'm curious
 how long it will take until D gets proper DLL support.
As with everyone in games! We need DLLs urgently.
May 24 2013
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 25.05.2013 03:29, Manu wrote:
 On 25 May 2013 04:20, Benjamin Thaut <code benjamin-thaut.de> wrote:
 [...]
 See, I have spent a decade on core tech/engine code meticulously
 worrying about memory allocation. I don't think a GC is an outright no-go.
 But we certainly don't have a GC that fits the bill.
Given that Android, Windows Phone 7/8 and the PS Vita have system languages with GC, it does not seem to bother those developers. Yes, I know that most AAA studios are actually bypassing them and using C and C++ directly, but already having indie developers using D would be a great win. One needs to start somewhere.
     - Better windows support. All of the development we do happens on
     windows and most of D's community does not care about windows
     support. I'm curious how long it will take until D gets proper
     DLL support.
Yeah, this is partially why I missed the train on game development. I was too focused on FOSS issues, instead of focusing on making a game.

-- 
Paulo
May 25 2013
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 25.05.2013 03:29, Manu wrote:
 Win64 works for me out of the box... ?
For me dmd produces type names like modulename.typename.subtypename, which cause internal errors within the visual studio debugger in some cases. Also, debugging of static / global variables is not possible (even when __gshared) because they are also formatted like modulename.variablename.

Kind Regards
Benjamin Thaut
May 25 2013
next sibling parent Manu <turkeyman gmail.com> writes:
On 25 May 2013 21:03, Benjamin Thaut <code benjamin-thaut.de> wrote:

 On 25.05.2013 03:29, Manu wrote:

 Win64 works for me out of the box... ?
For me dmd produces type names like modulename.typename.subtypename, which cause internal errors within the visual studio debugger in some cases. Also, debugging of static / global variables is not possible (even when __gshared) because they are also formatted like modulename.variablename.
True, sadly there are holes in the debug experience, which are pretty important to have fixed at some point.
May 25 2013
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
On 5/25/13 6:28 PM, Manu wrote:
 On 25 May 2013 21:03, Benjamin Thaut <code benjamin-thaut.de> wrote:

     On 25.05.2013 03:29, Manu wrote:



         Win64 works for me out of the box... ?


     For me dmd produces type names like
     modulename.typename.subtypename, which cause internal errors
     within the visual studio debugger in some cases. Also, debugging of
     static / global variables is not possible (even when __gshared)
     because they are also formatted like modulename.variablename.


 True, sadly there are holes in the debug experience, which are pretty
 important to have fixed at some point.
Bugzilla links?
May 25 2013
prev sibling parent "Rob T" <alanb ucora.com> writes:
On Thursday, 23 May 2013 at 18:13:17 UTC, Brad Anderson wrote:
  nogc comes to mind (I believe Andrei mentioned it during one 
 of the talks released). [1][2]
I would love to have something like @nogc to guarantee there are no hidden or misplaced allocations in a section of code, or optionally throughout the entire application. Not only is the GC a cause of concern for game devs, it is also a concern for general systems development.

For example, I have a simple virtual network device driver that I'd like to re-write from C++ to D. It does not need a GC at all; all memory is preallocated in advance of use during initialization, and it does not need anything from Phobos. If I could easily cut the GC out, even from the executable binary, that would be great, provided that I was certain no allocations were going on by mistake. Yes, I know I can get rid of the GC, but there should be an elegant solution for doing it that guarantees I am not using features of the language that require the GC.

Keep in mind that even if the GC were improved, there will still be plenty of systems applications that do not require the GC at all. So while improving the GC is a huge deal in itself, it is still not a general solution for those who do not need a GC and want to be certain they are not allocating by mistake.

--rt
May 24 2013
May 24 2013