digitalmars.D - core.traits?
- Manu (16/16) Jan 05 2019 So, druntime has core.internal.traits where a bunch of std.traits have
- kinke (2/5) Jan 05 2019 I fully agree.
- Brian (2/7) Jan 08 2019 YES, YES!! Agree!!!
- Seb (3/13) Jan 05 2019 I would go even one step further and move everything and just
- H. S. Teoh (11/23) Jan 05 2019 I concur! We've been adding ugly nasty hacks to druntime, or writing
- Manu (4/27) Jan 05 2019 Great! So, who's gonna do it?
- Nicholas Wilson (2/12) Jan 05 2019 Makes sense to me.
- Walter Bright (2/5) Jan 05 2019 Sounds good.
- Manu (16/21) Jan 07 2019 Okay, so this is a challenging effort, since phobos is such a tangled
- Steven Schveighoffer (8/34) Jan 07 2019 I was going to say core.meta should just have the basic definitions, but...
- H. S. Teoh (25/39) Jan 07 2019 It's already gotten better over the years in some ways (though not
- Steven Schveighoffer (13/24) Jan 07 2019 std.internal.traits contains pieces of std.meta -- a quick look shows it...
- H. S. Teoh (18/44) Jan 07 2019 What, wut...? `TypeTuple` still exists?! I thought we had gone through
- Steven Schveighoffer (12/59) Jan 07 2019 It's an internal definition, so it can be anything it wants it to be. It...
- Manu (20/61) Jan 07 2019 Ummm, well, yes and no. staticMap is DEFINITELY like you say, one of
- Jonathan M Davis (51/79) Jan 07 2019 Digitalmars-d wrote:
- Nick Treleaven (15/18) Jan 08 2019 That's fine internally, but std.traits still uses *Tuple for ten
- Steven Schveighoffer (10/28) Jan 08 2019 What I mean is that you reach for things like what is available in
- Manu (11/47) Jan 07 2019 Well, I think you're kinda catastrophising here... I said very clearly
- Mike Franklin (19/22) Jan 07 2019 I'm not really doing much with D anymore, so I apologize for
- Mike Franklin (75/85) Jan 07 2019 I spent some time trying to think through some of the issues with
- Jacob Carlborg (4/88) Jan 09 2019 I like this approach.
- kinke (17/27) Jan 08 2019 I also feel the need for at least 1 another base library. My
- Steven Schveighoffer (5/23) Jan 08 2019 This is self-contradictory, as AA's require TypeInfo.
- Mike Franklin (37/47) Jan 08 2019 Steven is right (as usual) here. There has to be a serious
- Neia Neutuladh (4/11) Jan 08 2019 The specific thing that he replied to was having a public symbol for
- Mike Franklin (10/13) Jan 08 2019 Perhaps; I'm not sure. The `pureMalloc` implementation is a lot
- Jacob Carlborg (6/16) Jan 09 2019 What do you mean "couldn't continue"? It's possible to implement
- Mike Franklin (16/31) Jan 09 2019 In DMD you can't use it without linking in the runtime, but in
- Patrick Schluter (17/50) Jan 09 2019 AVX512 concerns only a very small part of processors on the
- bioinfornatics (9/29) Jan 09 2019 By reading (quiclkly) these articles:
- H. S. Teoh (30/37) Jan 09 2019 EXACTLY!!!
- jmh530 (8/37) Jan 09 2019 One thing I like about libmir's sum function
- H. S. Teoh (13/19) Jan 09 2019 That's an excellent idea. Have a generic default algorithm that
- Mike Franklin (26/35) Jan 09 2019 Yes, this is one of the benefits of making `memcpy(T)(T* dest, T*
- Mike Franklin (33/50) Jan 09 2019 Yes, I agree, and even the newer chips have "Enhanced REP MOVSB
- Ethan (20/25) Jan 10 2019 AVX512 is a superset of AVX2, is a superset of AVX, is a superset
- Ethan (4/9) Jan 10 2019 Where's the edit button. The last writing stream function was
- luckoverthere (6/32) Jan 10 2019 That's disappointing to learn. Ryzen has four 128-bit AVX units,
- Ethan (8/13) Jan 11 2019 The good news though is that Ryzen's 128-bit pipeline outperforms
- bioinfornatics (3/17) Jan 11 2019 Hi ethan, could you share a piece of code to do that ?
- Ethan (5/7) Jan 11 2019 Not really.
- bioinfornatics (6/15) Jan 11 2019 OK I understand, no problem 😉
- Jacob Carlborg (6/17) Jan 09 2019 Perhaps it could be considered as a fallback when a "memcpy" isn't
- Mike Franklin (17/25) Jan 09 2019 I'm not sure what you mean. DMD currently links in libc, so
So, druntime has core.internal.traits where a bunch of std.traits have been mirrored to support internal machinery within druntime. This is clear evidence that a lot of these traits are really super-critical to doing basically anything interesting with D.

I have experience with no-phobos projects in the past where I've been frustrated that I had to mirror all the traits I needed manually.

I suggest that a fair set of std.traits (no-brainer traits that you basically can't live without) should be officially moved to core.traits, so that they are always available to all D users.

Traits are pure templates; they don't emit code, and have no impact on the size of the druntime binary. They shouldn't significantly affect build times unless they are instantiated.

...and they're already there in core.internal.traits. We should move them to core.traits, and that should be their official home. It really just makes sense. Uncontroversial low-level traits don't belong in phobos.
Jan 05 2019
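To illustrate the point about traits being pure templates, here is a minimal sketch of two trait-style templates. The names `isPointerLike` and `StripConst` are hypothetical stand-ins chosen for this example, not the actual druntime or Phobos definitions. Both consist only of an `enum` or an `alias`, so they are evaluated entirely at compile time.

```d
// Hypothetical trait: true when T is a pointer type.
enum bool isPointerLike(T) = is(T == U*, U);

// Hypothetical trait: strips a top-level `const` qualifier, if present.
template StripConst(T)
{
    static if (is(T == const U, U))
        alias StripConst = U;
    else
        alias StripConst = T;
}

// All checks run during compilation.
static assert(isPointerLike!(int*));
static assert(!isPointerLike!int);
static assert(is(StripConst!(const int) == int));
static assert(is(StripConst!int == int));

void main() {}
```

Nothing in this sketch imports Phobos, which is the situation a no-phobos project is in when it has to mirror traits by hand.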
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
> We should move them to core.traits, and that should be their official home. It really just makes sense. Uncontroversial low-level traits don't belong in phobos.

I fully agree.
Jan 05 2019
On Saturday, 5 January 2019 at 21:31:38 UTC, kinke wrote:
> On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
>> We should move them to core.traits, and that should be their official home. It really just makes sense. Uncontroversial low-level traits don't belong in phobos.
> I fully agree.

YES, YES!! Agree!!!
Jan 08 2019
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
> So, druntime has core.internal.traits where a bunch of std.traits have been mirrored to support internal machinery within druntime. This is clear evidence that a lot of these traits are really super-critical to doing basically anything interesting with D. I have experience with no-phobos projects in the past where I've been frustrated that I had to mirror all the traits I needed manually. [...]

I would go even one step further and move everything and just alias things in std.traits, s.t. no breakage happens.
Jan 05 2019
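The "move everything and alias it" idea can be sketched in a single file. The two structs below are only namespaces standing in for a hypothetical `core.traits` module and for `std.traits`; in a real layout these would be separate modules and the second would import the first. The trait body is illustrative, not the library's definition.

```d
// Stand-in for a hypothetical `core.traits` module (the new official home).
struct CoreTraits
{
    enum bool isPointer(T) = is(T == U*, U);
}

// Stand-in for `std.traits`: the old name is kept as a plain alias,
// so existing user code written against it keeps compiling.
struct StdTraits
{
    alias isPointer = CoreTraits.isPointer;
}

static assert(StdTraits.isPointer!(int*));
static assert(!StdTraits.isPointer!int);

void main() {}
```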
On Sat, Jan 05, 2019 at 09:49:58PM +0000, Seb via Digitalmars-d wrote:
> On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
>> So, druntime has core.internal.traits where a bunch of std.traits have been mirrored to support internal machinery within druntime. This is clear evidence that a lot of these traits are really super-critical to doing basically anything interesting with D. I have experience with no-phobos projects in the past where I've been frustrated that I had to mirror all the traits I needed manually. [...]
> I would go even one step further and move everything and just alias things in std.traits, s.t. no breakage happens.

I concur! We've been adding ugly nasty hacks to druntime, or writing code in circumlocutious ways, for far too long now, all because certain basic traits happen to be in std.traits and it's verboten to import Phobos from druntime. It's time to revisit that decision.

The more complex traits should remain in Phobos in order not to complicate druntime too much, but the basic ones needed also in druntime should be moved into druntime, instead of copy-pasta or roll-your-own in druntime.

T

--
It's amazing how careful choice of punctuation can leave you hanging:
Jan 05 2019
On Sat, Jan 5, 2019 at 2:04 PM H. S. Teoh via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On Sat, Jan 05, 2019 at 09:49:58PM +0000, Seb via Digitalmars-d wrote:
>> On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
>>> So, druntime has core.internal.traits where a bunch of std.traits have been mirrored to support internal machinery within druntime. This is clear evidence that a lot of these traits are really super-critical to doing basically anything interesting with D. I have experience with no-phobos projects in the past where I've been frustrated that I had to mirror all the traits I needed manually. [...]
>> I would go even one step further and move everything and just alias things in std.traits, s.t. no breakage happens.
> I concur! We've been adding ugly nasty hacks to druntime, or writing code in circumlocutious ways, for far too long now, all because certain basic traits happen to be in std.traits and it's verboten to import Phobos from druntime. It's time to revisit that decision. The more complex traits should remain in Phobos in order not to complicate druntime too much, but the basic ones needed also in druntime should be moved into druntime, instead of copy-pasta or roll-your-own in druntime.
> T -- It's amazing how careful choice of punctuation can leave you hanging:

Great! So, who's gonna do it? I'm already overloaded with these sorts of refactors >_<
Jan 05 2019
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
> So, druntime has core.internal.traits where a bunch of std.traits have been mirrored to support internal machinery within druntime. This is clear evidence that a lot of these traits are really super-critical to doing basically anything interesting with D. I have experience with no-phobos projects in the past where I've been frustrated that I had to mirror all the traits I needed manually. [...]

Makes sense to me.
Jan 05 2019
On 1/5/2019 1:12 PM, Manu wrote:
> We should move them to core.traits, and that should be their official home. It really just makes sense. Uncontroversial low-level traits don't belong in phobos.

Sounds good.
Jan 05 2019
On Sat, Jan 5, 2019 at 11:00 PM Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 1/5/2019 1:12 PM, Manu wrote:
>> We should move them to core.traits, and that should be their official home. It really just makes sense. Uncontroversial low-level traits don't belong in phobos.
> Sounds good.

Okay, so this is a challenging effort, since phobos is such a tangled rat's nest of chaos...

But attempting to move some traits immediately calls into question std.meta. I think we can all agree that Alias and AliasSeq should be in druntime along with core traits... but where should they live? Should there be core.meta as well? It's kinda like core.traits, in that it doesn't include runtime code, it doesn't increase the payload of druntime.lib for end-users..

Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.

...yes, this process will go on and on. The only way forward is to take each hurdle one at a time... and ideally, in attempting this effort, we can de-tangle a lot of cruft during the process.
Jan 07 2019
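For readers unfamiliar with why AliasSeq is treated as a basic building block, here is a minimal sketch of what such a template amounts to. It is written under the name `Seq` to make clear this is a local illustration and not the library symbol; the point is that the whole definition is one line with no runtime component.

```d
// A compile-time sequence is just the template's own argument list.
alias Seq(T...) = T;

alias Ints = Seq!(int, long);

static assert(Ints.length == 2);
static assert(is(Ints[1] == long));

// Sequences auto-expand when nested inside another sequence.
static assert(is(Seq!(Ints, char) == Seq!(int, long, char)));

void main() {}
```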
On 1/7/19 4:25 PM, Manu wrote:
> On Sat, Jan 5, 2019 at 11:00 PM Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> On 1/5/2019 1:12 PM, Manu wrote:
>>> We should move them to core.traits, and that should be their official home. It really just makes sense. Uncontroversial low-level traits don't belong in phobos.
>> Sounds good.
> Okay, so this is a challenging effort, since phobos is such a tangled rat's nest of chaos... But attempting to move some traits immediately calls into question std.meta. I think we can all agree that Alias and AliasSeq should be in druntime along with core traits... but where should it live? Should there be core.meta as well? It's kinda like core.traits, in that it doesn't include runtime code, it doesn't increase the payload of druntime.lib for end-users.. Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable. ...yes, this process will go on and on. The only way forward is to take each hurdle one at a time... and ideally, in attempting this effort, we can de-tangle a lot of cruft during the process.

I was going to say core.meta should just have the basic definitions, but why not just dump all of it in there. As you said, they are templates (and so you only pay if you use them), and really part of the core language definition.

There are already parts of std.meta inside core.internal as well, so if we want to be consistent, we should just move it all ;)

-Steve
Jan 07 2019
On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote:
[...]
> Okay, so this is a challenging effort, since phobos is such a tangled rat's nest of chaos...

It's already gotten better over the years in some ways (though not others -- unfortunately I'm afraid std.traits might be one of the places where things have probably gotten more tangled). It certainly hasn't lived up to the promise of that old Phobos Philosophy page that once existed but has since been removed, of Phobos being a collection of lightweight, self-contained, mostly-orthogonal, reusable components. It has become quite the opposite, where the dependency graph of Phobos modules is approaching pretty close to being a complete graph. (And yes, there are some pretty deep-seated cyclic dependencies that thus far nobody has been able to truly unravel in any satisfactory way.)

> But attempting to move some traits immediately calls into question std.meta. I think we can all agree that Alias and AliasSeq should be in druntime along with core traits... but where should it live? Should there be core.meta as well? It's kinda like core.traits, in that it doesn't include runtime code, it doesn't increase the payload of druntime.lib for end-users..

Shouldn't all of core.traits be like that? I'd hardly expect any runtime component to be associated with something called 'traits'.

> Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.

I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*

> ...yes, this process will go on and on. The only way forward is to take each hurdle one at a time... and ideally, in attempting this effort, we can de-tangle a lot of cruft during the process.

I'm tempted to say we should put everything in core.traits for now. And just the absolute bare minimum it takes to meet whatever druntime needs, and nothing more.

T

--
I think Debian's doing something wrong, `apt-get install pesticide', doesn't seem to remove the bugs on my system! -- Mike Dresser
Jan 07 2019
On 1/7/19 4:41 PM, H. S. Teoh wrote:
> On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote: [...]
>> Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.
> I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*

std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple), allSatisfy, anySatisfy, Filter, staticMap. We might as well just move the whole thing there.

The fear of moving other pieces is real, but only if you look superficially. traits and meta are really part of the language, I can't imagine using D without it (and neither can any of the people who put pieces of those modules into druntime).

I don't want to see anything more complex and interdependent get sucked in. I know the goal for Manu right now is emplace, which does feel like a language feature. But as has been said many times here, std.traits is a no-brainer, and std.meta is really a building block that std.traits uses.

-Steve
Jan 07 2019
On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via Digitalmars-d wrote:
> On 1/7/19 4:41 PM, H. S. Teoh wrote:
>> On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote: [...]
>>> Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.
>> I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
> std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),

What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?

> allSatisfy, anySatisfy, Filter, staticMap. We might as well just move the whole thing there.

I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel). I'm not so sure they should go into druntime.

> The fear of moving other pieces is real, but only if you look superficially. traits and meta are really part of the language, I can't imagine using D without it (and neither can any of the people who put pieces of those modules into druntime).

Actually, I find myself redefining AliasSeq locally with a shorter name all the time. Scarily enough, doing that is shorter than typing `import std.traits : AliasSeq;`.

> I don't want to see anything more complex and interdependent get sucked in. I know the goal for Manu right now is emplace, which does feel like a language feature. But as has been said many times here, std.traits is a no-brainer, and std.meta is really a building block that std.traits uses.
[...]

Wait, so you're saying we should move the *entire* std.meta and std.traits to druntime...?

T

--
ASCII stupid question, getty stupid ANSI.
Jan 07 2019
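The concern about "the sheer amount of templates that will get instantiated" can be made concrete with a deliberately naive sketch of a mapping template. `NaiveMap`, `Seq` and `PointerTo` are hypothetical names for this illustration, and the linear recursion shown is an assumption made for clarity, not a copy of the Phobos implementation. Mapping over N arguments instantiates `NaiveMap` N+1 times, each with a distinct argument list that the compiler has to create and remember.

```d
alias Seq(T...) = T;

// Applies the template F to every element of T, one recursion step per element.
template NaiveMap(alias F, T...)
{
    static if (T.length == 0)
        alias NaiveMap = Seq!();
    else
        alias NaiveMap = Seq!(F!(T[0]), NaiveMap!(F, T[1 .. $]));
}

// Example mapping function: turn a type into a pointer to that type.
alias PointerTo(T) = T*;

alias Result = NaiveMap!(PointerTo, int, char, double);
static assert(is(Result == Seq!(int*, char*, double*)));

void main() {}
```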
On 1/7/19 5:10 PM, H. S. Teoh wrote:
> On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via Digitalmars-d wrote:
>> On 1/7/19 4:41 PM, H. S. Teoh wrote:
>>> On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote: [...]
>>>> Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.
>>> I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
>> std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
> What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?

It's an internal definition, so it can be anything it wants it to be. It could be Tuple or Foobar.

>> allSatisfy, anySatisfy, Filter, staticMap. We might as well just move the whole thing there.
> I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel). I'm not so sure they should go into druntime.

Again, they are already there, and for a reason.

>> The fear of moving other pieces is real, but only if you look superficially. traits and meta are really part of the language, I can't imagine using D without it (and neither can any of the people who put pieces of those modules into druntime).
> Actually, I find myself redefining AliasSeq locally with a shorter name all the time. Scarily enough, doing that is shorter than typing `import std.traits : AliasSeq;`.

The point is to have a universally recognized construct. When your code's documentation says `TList` or whatever, then someone has to go figure that out. If it says `AliasSeq`, it's known what it is.

>> I don't want to see anything more complex and interdependent get sucked in. I know the goal for Manu right now is emplace, which does feel like a language feature. But as has been said many times here, std.traits is a no-brainer, and std.meta is really a building block that std.traits uses.
> [...] Wait, so you're saying we should move the *entire* std.meta and std.traits to druntime...?

At least std.meta. std.traits can possibly be split into critical language-enabling traits and the less important ones, but I don't know off the top of my head which ones are those. But I would say std.meta is composed only of the former variety.

-Steve
Jan 07 2019
On Mon, Jan 7, 2019 at 2:10 PM H. S. Teoh via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via Digitalmars-d wrote:
>> On 1/7/19 4:41 PM, H. S. Teoh wrote:
>>> On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote: [...]
>>>> Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.
>>> I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
>> std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
> What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?
>> allSatisfy, anySatisfy, Filter, staticMap. We might as well just move the whole thing there.
> I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel).

Ummm, well, yes and no. staticMap is DEFINITELY like you say, one of the worst, and I really think it needs a language solution. C++'s `...` expression is exactly staticMap, and I wonder if we need an expression like that in-language to take load off staticMap, because it's perhaps one of the slowest parts of the language. We can translate almost anything to CTFE, *except* code that invokes staticMap, so I really reckon it needs a language tool. This is an aside, if we wanna talk about staticMap, we should start a new thread.

> I'm not so sure they should go into druntime.

Then you're kinda making a case that NONE of it should go in druntime, because they're such common building blocks. Also, you've already lost the game, because they're already in druntime (in core.internal).

>> The fear of moving other pieces is real, but only if you look superficially. traits and meta are really part of the language, I can't imagine using D without it (and neither can any of the people who put pieces of those modules into druntime).
> Actually, I find myself redefining AliasSeq locally with a shorter name all the time. Scarily enough, doing that is shorter than typing `import std.traits : AliasSeq;`.

I've been known to do that too... I could do that for core.traits as well, but that seems pretty lame.

>> I don't want to see anything more complex and interdependent get sucked in. I know the goal for Manu right now is emplace, which does feel like a language feature. But as has been said many times here, std.traits is a no-brainer, and std.meta is really a building block that std.traits uses.
> [...] Wait, so you're saying we should move the *entire* std.meta and std.traits to druntime...?

I'd like to be more selective than that; at the very least, audit every single symbol coming across. But some part of me thinks this may actually be a very reasonable proposal...
Jan 07 2019
On Monday, January 7, 2019 3:10:02 PM MST H. S. Teoh via Digitalmars-d wrote:
> On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via Digitalmars-d wrote:
>> On 1/7/19 4:41 PM, H. S. Teoh wrote:
>>> On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote: [...]
>>>> Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.
>>> I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
>> std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
> What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?

We're talking about druntime internals here. TypeTuple was one of the things that got copied into druntime as internal. It was later renamed to AliasSeq in Phobos, but the copied stuff in druntime didn't necessarily get changed, since it was all internal and had nothing to do with the Phobos stuff it originated from.

>> allSatisfy, anySatisfy, Filter, staticMap. We might as well just move the whole thing there.
> I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel). I'm not so sure they should go into druntime.

Historically, we've tried to only put stuff in druntime that needs to be in druntime for druntime to do what it does. That does sometimes get stretched (e.g. we put all of the OS bindings in druntime rather than just the ones that druntime needs), but Phobos is the standard library, not druntime. druntime is the runtime. If something needs to be there so that the runtime can do its thing, then we put it in druntime, but not much else should be there. Certainly, anything that is there needs to be stuff that's actually core. So, in that respect, it's pretty questionable to move all of std.traits and std.meta into druntime wholesale.

What we had for a while was traits just being copied into druntime where they were needed, resulting in duplicate implementations in various places (especially for TypeTuple). Later, they were largely consolidated into core.internal.traits, but they were still completely separate from Phobos. More recently, the traits in core.internal.traits have had their std.traits implementations turned into simple wrappers that use the core.internal.traits symbols (they're not aliased, because that doesn't work well with the documentation, so you get an extra layer of template with every trait whose implementation is in druntime).

So, if Manu needs a trait for something in druntime, the normal thing to do at this point would be to just add it to core.internal.traits and then potentially make Phobos wrap the druntime symbol. We don't actually need to move anything wholesale, nor do we need something like core.traits which is intended to make the traits publicly available from druntime.

Another issue here if you actually look at std.traits and std.meta is that there are actually several templates which use other pieces of Phobos. e.g. isInputRange gets used in std.meta.aliasSeqOf as does std.array.array, and std.traits.packageName uses startsWith. So, while many of the templates from std.traits and std.meta could be moved, actually trying to move them all wouldn't actually work without reimplementing yet other pieces of Phobos.

Personally, I think that the only real benefit of having something like core.traits over having core.internal.traits is that you can move the documentation for those symbols to their druntime implementations and make the Phobos implementations actual aliases with just a link to the druntime documentation instead of needing a thin wrapper template. Anyone wanting to avoid Phobos can still use those traits from std.traits and std.meta just fine, since it wouldn't involve linking against Phobos, just importing the module (even some of the traits that we can't move would work just fine, because they'd just be pulling in other Phobos symbols that don't involve linking). But it does get pretty weird to have some of the traits in core and some in std with no real obvious distinction, since it's based on what druntime needs. So, in that respect, keeping them internal is cleaner. Either way, I think that it's quite clear that we can't move everything.

- Jonathan M Davis
Jan 07 2019
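The difference between a thin wrapper template and a true alias, as described in the post above, can be sketched as follows. `DruntimeInternal` is a hypothetical namespace standing in for an internal druntime module, and the trait body is illustrative only. The wrapper is a separate template that forwards to the internal one, so each use instantiates two templates; the alias is the same symbol under another name.

```d
// Stand-in for an internal druntime module (hypothetical).
struct DruntimeInternal
{
    enum bool isPointer(T) = is(T == U*, U);
}

// Wrapper style: a distinct template that forwards to the internal one.
enum bool isPointerWrapped(T) = DruntimeInternal.isPointer!T;

// Alias style: no new template, just another name for the same symbol.
alias isPointerAliased = DruntimeInternal.isPointer;

// Both give the same answers...
static assert(isPointerWrapped!(int*) && isPointerAliased!(int*));
static assert(!isPointerWrapped!int && !isPointerAliased!int);

// ...but only the alias is identical to the original template.
static assert(__traits(isSame, isPointerAliased, DruntimeInternal.isPointer));
static assert(!__traits(isSame, isPointerWrapped, DruntimeInternal.isPointer));

void main() {}
```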
On Monday, 7 January 2019 at 21:54:15 UTC, Steven Schveighoffer wrote:
> std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),

That's fine internally, but std.traits still uses *Tuple for ten public templates: https://github.com/dlang/phobos/pull/6227

If we move any of those to core.traits, please can we finally fix the names.

> traits and meta are really part of the language,

Some in std.traits are tightly coupled to the language, e.g. isIntegral. Some are utility templates, e.g. ConstOf, CopyConstness. I think only the former should be public in druntime. Select and select aren't even traits, they should have been in std.meta.

Except perhaps AliasSeq, (Alias, Instantiate) all of std.meta seems to be utility templates rather than language feature wrappers.
Jan 08 2019
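The distinction drawn above between language-coupled traits and utility templates might look like this in code. Both names are hypothetical and simplified: `isBuiltinIntegral` simply exposes what the compiler already knows through `__traits`, and unlike the Phobos trait of a similar name it does not filter out `bool` or character types. `CopyConst` is a convenience built on top of ordinary language features.

```d
// Language-coupled: a thin wrapper over a compiler builtin.
enum bool isBuiltinIntegral(T) = __traits(isIntegral, T);

// Utility: copies a top-level `const` from one type onto another.
template CopyConst(From, To)
{
    static if (is(From == const))
        alias CopyConst = const(To);
    else
        alias CopyConst = To;
}

static assert(isBuiltinIntegral!int);
static assert(!isBuiltinIntegral!float);
static assert(is(CopyConst!(const int, char) == const(char)));
static assert(is(CopyConst!(int, char) == char));

void main() {}
```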
On 1/8/19 1:59 PM, Nick Treleaven wrote:
> On Monday, 7 January 2019 at 21:54:15 UTC, Steven Schveighoffer wrote:
>> std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
> That's fine internally, but std.traits still uses *Tuple for ten public templates: https://github.com/dlang/phobos/pull/6227 If we move any of those to core.traits, please can we finally fix the names.

You'd have to convince Andrei, as he seemingly nixed the PR.

>> traits and meta are really part of the language,
> Some in std.traits are tightly coupled to the language, e.g. isIntegral. Some are utility templates, e.g. ConstOf, CopyConstness. I think only the former should be public in druntime. Select and select aren't even traits, they should have been in std.meta.

What I mean is that you reach for things like what is available in std.traits quite often when doing template constraints or type manipulation. I'd consider ConstOf and CopyConstness to be in that group.

> Except perhaps AliasSeq, (Alias, Instantiate) all of std.meta seems to be utility templates rather than language feature wrappers.

What I mean is that I consider them part of the D language, not like a library feature that is optional.

But in any case, much of std.traits relies on std.meta. We would at least need the parts in std.meta that are used to build std.traits.

-Steve
Jan 08 2019
On Mon, Jan 7, 2019 at 1:42 PM H. S. Teoh via Digitalmars-d <digitalmars-d puremagic.com> wrote:On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote: [...]Well, I think you're kinda catastrophising here... I said very clearly "The only way forward is to take each hurdle one at a time", and I mean that as literally as possible. One at a time... The question is: core.meta... yeah? No precedents are being set.. we're not opening flood gates, it's a singular question.Okay, so this is a challenging effort, since phobos is such a tangled rats nets of chaos...It's already gotten better over the years in some ways (though not others -- unfortunately I'm afraid std.traits might be one of the places where things have probably gotten more tangled). It certainly hasn't lived up to the promise of that old Phobos Philosophy page that once existed but has since been removed, of Phobos being a collection of lightweight, self-contained, mostly-orthogonal, reusable components. It has become quite the opposite, where the dependency graph of Phobos modules is approaching pretty close to being a complete graph. (And yes, there are some pretty deep-seated cyclic dependencies that thus far nobody has been able to truly unravel in any satisfactory way.)But attempting to move some traits immediately calls into question std.meta. I think we can all agree that Alias and AliasSeq should be in druntime along with core traits... but where should it live? Should there be core.meta as well? It's kinda like core.traits, in that it doesn't include runtime code, it doesn't increase the payload of druntime.lib for end-users..Shouldn't all of core.traits be like that? I'd hardly expect any runtime component to be associated with something called 'traits'.Perhaps AliasSeq should live somewhere different? 
I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*Are you saying we should put *everything* in core.traits? That is, put AliasSeq in core.traits? It's objectively NOT a 'traits'......yes, this process will go on and on. The only way forward is to take each hurdle one at a time... and ideally, in attempting this effort, we can de-tangle a lot of cruft during the process.I'm tempted to say we should put everything in core.traits for now. And just the absolute bare minimum it takes to meet whatever druntime needs, and nothing more.
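As a rough illustration of how small a "lean" `core.meta` could be, here is a sketch of the pieces that traits typically build on. The names `staticMapSketch` and `ConstOfSketch` are hypothetical and only mirror `std.meta.staticMap` and `std.traits.ConstOf`; nothing here is the actual druntime or Phobos source, and no imports are needed.

```d
// Assumption: a minimal, import-free sketch, not the real std.meta/core.meta code.

// A compile-time sequence of types, aliases, or values.
alias AliasSeq(T...) = T;

// Hypothetical utility template in the spirit of std.traits.ConstOf.
alias ConstOfSketch(T) = const(T);

// Hypothetical recursive map in the spirit of std.meta.staticMap:
// applies the template F to every element of Args.
template staticMapSketch(alias F, Args...)
{
    static if (Args.length == 0)
        alias staticMapSketch = AliasSeq!();
    else
        alias staticMapSketch =
            AliasSeq!(F!(Args[0]), staticMapSketch!(F, Args[1 .. $]));
}

// Everything is resolved at compile time; nothing is added to the binary.
static assert(is(staticMapSketch!(ConstOfSketch, int, char)
              == AliasSeq!(const int, const char)));
```

Because these are pure compile-time constructs, they add no payload to druntime.lib, which is the property Manu describes above.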
Jan 07 2019
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:We should move them to core.traits, and that should be their official home. It really just makes sense. Uncontroversial low-level traits don't belong in phobos.I'm not really doing much with D anymore, so I apologize for interrupting the conversation, but I had an idea I wanted to offer just in case it might help. This was something I tried to tackle about 6 months ago, called it "UtiliD" (https://forum.dlang.org/post/wgkbamnlraustaycbbya forum.dlang.org). I ultimately failed due to the tangle of Phobos, and because my life priorities changed I gave up and deleted my repository (oops!). Anyway, my suggestion is to create a new library separate from druntime and phobos that has no dependencies whatsoever (no libc, no libstdc++, no OS dependencies, no druntime dependency, etc.). I mean it; **no dependencies**. Not even object.d. The only thing it should require is a D compiler. That library can then be imported by druntime, phobos, betterC builds, or even the compiler itself. It will take strict enforcement of the "no dependency" rule and good judgment to keep the scope from ballooning, but it may be a good place for things like `traits`, `meta` and others. Hope I'm not just making noise. Mike
Jan 07 2019
On Tuesday, 8 January 2019 at 01:44:08 UTC, Mike Franklin wrote:
Anyway, my suggestion is to create a new library separate from druntime and phobos that has no dependencies whatsoever (no libc, no libstdc++, no OS dependencies, no druntime dependency, etc.). I mean it; **no dependencies**. Not even object.d. The only thing it should require is a D compiler. That library can then be imported by druntime, phobos, betterC builds, or even the compiler itself. It will take strict enforcement of the "no dependency" rule and good judgment to keep the scope from ballooning, but it may be a good place for things like `traits`, `meta` and others.

I spent some time trying to think through some of the issues with druntime, and came up with this:

Right now, druntime is somewhat of a monolith trying to be too many things.

* utilities (traits, string utilities, type conversion utilities, etc...)
* compiler lowerings
* C standard library bindings
* C++ standard library bindings
* Operating system bindings
* OS abstractions (thread, fibers, context switching, etc...)
* DWARF implementation
* TLS implementation
* GC
* (probably more)

So, I suggest something like this:
----------------------------------
* core.util - a.k.a utiliD - Just utility implementations written in D (e.g. `std.traits`, `std.meta`, etc.). No dependencies whatsoever. No operating system or platform abstractions. No high-level language features (e.g. exceptions)
   * public imports: (none)
   * private imports: (none)
* core.stdc - C standard library bindings - libc functions verbatim; no convenience or utility implementations
   * public imports: (none)
   * private imports: core.util
* core.stdcpp - C++ standard library bindings - libstdc++ data structures verbatim; no convenience or utility implementations
   * public imports: (none)
   * private imports: core.util
* sys - OS/Platform bindings - operating system implementations verbatim; no convenience or utility implementations
   * public imports: (none)
   * private imports: core.util
* core.pal - Platform/OS abstractions - threads, fibers, context switching, etc.
   * public imports: (none)
   * private imports: core.util, sys, core.stdc
* core.d - compiler support (compiler lowerings, runtime initialization, TLS implementation, DWARF implementation, GC, etc...)
   * public imports: core.util
   * private imports: core.pal
* druntime - Just a top-level package containing public imports, aliases, and compiler support. No other implementations
   * public imports: core.pal, core.d
   * private imports: core.util
* std - phobos
   * public imports: (none)
   * private imports: druntime

There are likely other suitable ways to organize it, but that's just what I could come up with after thinking through it a little. I would prefer if each of those were in their own repository and even move some of them to Deimos or dub, but that would probably irritate a lot of people. I'd also prefer to have each of those in their own packages, but D is probably too deep in technical debt for that. (See also https://issues.dlang.org/show_bug.cgi?id=11666)

So, to make it more palatable, I suggest:
-----------------------------------------
* `core.util` gets its own repository so it can be independently added to other repositories as a self-contained/freestanding dependency
* `core.stdc`, `core.stdcpp`, `sys`, `core.pal`, and `core.d` all go into the druntime monolith like it is today.
* phobos remains much like it is today.

In the context of the discussion at hand, `std.traits`, `std.meta`, and other utilities can be moved to `core.util`. `core.util` can then be added as a dependency to dmd, druntime, and phobos. The rest will probably have to wait for D3 :/

Mike
Jan 07 2019
On 2019-01-08 06:37, Mike Franklin wrote:
I spent some time trying to think through some of the issues with druntime, and came up with this: [snip]

I like this approach. -- /Jacob Carlborg
Jan 09 2019
On Tuesday, 8 January 2019 at 01:44:08 UTC, Mike Franklin wrote:
Anyway, my suggestion is to create a new library separate from druntime and phobos that has no dependencies whatsoever (no libc, no libstdc++, no OS dependencies, no druntime dependency, etc.). I mean it; **no dependencies**. Not even object.d. The only thing it should require is a D compiler. That library can then be imported by druntime, phobos, betterC builds, or even the compiler itself. It will take strict enforcement of the "no dependency" rule and good judgment to keep the scope from ballooning, but it may be a good place for things like `traits`, `meta` and others.

I also feel the need for at least one other base library. My focus is on the fundamental compiler support functions, like initializing/comparing/copying arrays and general associative array support, as they are fundamental to the language and its compilers (not talking about TypeInfos, ModuleInfos, Object etc.). I think we need such a base library in order to improve -betterC and its available language features. The important thing would be to try to reduce the external dependencies of that lib to an absolute minimum, similar to Rust's core library (just 5 symbols: mem{cpy,cmp,set} + rust_begin_panic + rust_eh_personality), although we'll probably need some primitives, e.g., malloc/realloc/free. If that's possible, using D for bare-metal targets without a C library (e.g., a future WebAssembly version with direct access to GC, or your own OS kernel/firmware) would probably become awesome, as you'd only need to implement maybe a dozen symbols.
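To make the idea of a compiler support function without `TypeInfo` more concrete, here is a minimal sketch of an array comparison written as a template. The name `arrayEqualsSketch` is hypothetical and this is not how druntime implements array equality; it only shows that the element type is known at compile time, so no runtime type information is needed.

```d
// Assumption: a hypothetical helper, not an actual druntime hook.
// The element type T is a template parameter, so the comparison is
// generated per type and attributes are inferred from T's own `==`.
bool arrayEqualsSketch(T)(scope const T[] lhs, scope const T[] rhs)
{
    if (lhs.length != rhs.length)
        return false;
    foreach (i; 0 .. lhs.length)
    {
        if (lhs[i] != rhs[i])
            return false;
    }
    return true;
}

// For simple element types the compiler infers the strict attributes.
static assert(__traits(compiles, () @safe pure nothrow @nogc {
    const(int)[] a, b;
    return arrayEqualsSketch(a, b);
}));
```

A real implementation would also have to decide how to treat floating-point NaN, structs with custom `opEquals`, and bulk comparison via something like `memcmp`; the sketch ignores all of that.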
Jan 08 2019
On 1/8/19 4:23 PM, kinke wrote:
On Tuesday, 8 January 2019 at 01:44:08 UTC, Mike Franklin wrote:
Anyway, my suggestion is to create a new library separate from druntime and phobos that has no dependencies whatsoever (no libc, no libstdc++, no OS dependencies, no druntime dependency, etc.). I mean it; **no dependencies**. Not even object.d. The only thing it should require is a D compiler. That library can then be imported by druntime, phobos, betterC builds, or even the compiler itself. It will take strict enforcement of the "no dependency" rule and good judgment to keep the scope from ballooning, but it may be a good place for things like `traits`, `meta` and others.
I also feel the need for at least one other base library. My focus is on the fundamental compiler support functions, like initializing/comparing/copying arrays and general associative array support, as they are fundamental to the language and its compilers (not talking about TypeInfos, ModuleInfos, Object etc.).

This is self-contradictory, as AAs require TypeInfo. Though I agree with the goal. It's just not a "now" goal; we first need to fix these components so they DON'T depend on such things as TypeInfo. -Steve
Jan 08 2019
On Tuesday, 8 January 2019 at 21:26:51 UTC, Steven Schveighoffer wrote:
I also feel the need for at least one other base library. My focus is on the fundamental compiler support functions, like initializing/comparing/copying arrays and general associative array support, as they are fundamental to the language and its compilers (not talking about TypeInfos, ModuleInfos, Object etc.).
This is self-contradictory, as AAs require TypeInfo. Though I agree with the goal. It's just not a "now" goal; we first need to fix these components so they DON'T depend on such things as TypeInfo.

Steven is right (as usual) here. There has to be a serious effort to remove the dependency on runtime information that is available at compile-time. I tried quite hard on that in 2017~2018, but I ran into all sorts of problems.

Exhibit A: We can set an array's length in `@safe`, `nothrow`, `pure` code. But it gets lowered to a runtime hook that is neither `@safe`, `nothrow`, nor `pure` (https://github.com/dlang/druntime/blob/e47a00bff935c3f079bb567a6ec97663ba384487/src/r /lifetime.d#L1265). In other words, the compiler-runtime interface is a lie. So, if you try to rewrite that as a template to remove the dependency on `TypeInfo`, then the template will run through the semantic phase of the compiler and now you have to be honest, and it doesn't compile. To make that work without breakage, you have to make all of the code that `_d_arraysetlengthT` calls `@safe`, `nothrow`, and `pure`, and then you'll find that none of it compiles because the "turtles at the bottom" (i.e. `memcpy`, `malloc`, etc...) aren't `pure` or whatever attribute constraint you're trying to apply.

Exhibit B: I tried to convert `_d_arraycast` to a template in https://github.com/dlang/druntime/pull/2268 and ran into similar problems. Some tried to help with a `pureMalloc` implementation in https://github.com/dlang/druntime/pull/2276, but that didn't go well either. Walter responded with "Since realloc() free's memory, it cannot ever be considered pure." Well, what the hell are we supposed to do then?

IMO, having dynamic stack allocation for arrays and strings will help (https://issues.dlang.org/show_bug.cgi?id=18788). GDC and LDC already provide this, but DMD's implementation is in druntime (https://github.com/dlang/druntime/blob/9a8edfb48e4842180c706ee26ebd8edb10be53 4/src/rt/alloca.d), so it requires linking in druntime, and now we're at a catch-22. I asked Walter for help with this, as it is beyond my current skills, but he said he didn't have time.

Here's what I think will help:
1. Get `alloca` or dynamic stack array allocation working. This will help a lot because we won't have to reach for `malloc` and friends for simple allocations like generating dynamic assert messages.
2. Convert `memcpy`, `memset`, and `memcmp` to strongly-typed D templates so they can be used in the implementations when converting runtime hooks to templates. I did some exploration on that and published my results at https://github.com/JinShil/memcpyD. Unfortunately, DMD is missing an AVX512 implementation so I couldn't continue.

Lots of obstacles here and I don't see it happening without Walter and Andrei making it a priority.

Mike
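The attribute problem in Exhibit A can be reproduced in a few lines. The following is only a sketch: `resizeSketch` is a hypothetical stand-in for a templated runtime hook (it is not `_d_arraysetlengthT`, and it deliberately leaks the old block), and it assumes the `malloc` binding in `core.stdc.stdlib` is not marked `pure`.

```d
import core.stdc.stdlib : malloc;

// Hypothetical templated "hook". Because it is a template, the compiler
// analyses its body and infers its attributes from what it calls.
T[] resizeSketch(T)(T[] arr, size_t newLength)
{
    auto p = cast(T*) malloc(T.sizeof * newLength);
    if (p is null)
        return arr[0 .. 0];
    const keep = arr.length < newLength ? arr.length : newLength;
    p[0 .. keep] = arr[0 .. keep];          // copy the surviving elements
    foreach (ref e; p[keep .. newLength])   // default-initialize the rest
        e = T.init;
    return p[0 .. newLength];
}

// The built-in lowering is accepted in pure code without question.
void builtinResize() pure nothrow
{
    int[] a;
    a.length = 4;
}

// The templated version calls impure malloc, so a pure caller is rejected...
static assert(!__traits(compiles, () pure { int[] a; a = resizeSketch(a, 4); }));
// ...while an unattributed caller compiles fine.
static assert( __traits(compiles, ()      { int[] a; a = resizeSketch(a, 4); }));
```

This is the "now you have to be honest" effect: the same operation that is accepted through the opaque runtime hook is refused once its implementation is visible to semantic analysis.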
Jan 08 2019
On Wed, 09 Jan 2019 02:32:50 +0000, Mike Franklin wrote:I tried to convert `_d_arraycast` to a template in https://github.com/dlang/druntime/pull/2268 and ran into similar problems. Some tried to help with a `pureMalloc` implementation in https://github.com/dlang/druntime/pull/2276, but that didn't go well either. Walter responded with "Since realloc() free's memory, it cannot ever be considered pure." Well, what the hell are we supposed to do then?The specific thing that he replied to was having a public symbol for realloc that was considered pure. Perhaps a private fakePureRealloc() would be more palatable?
Jan 08 2019
On Wednesday, 9 January 2019 at 03:32:17 UTC, Neia Neutuladh wrote:The specific thing that he replied to was having a public symbol for realloc that was considered pure. Perhaps a private fakePureRealloc() would be more palatable?Perhaps; I'm not sure. The `pureMalloc` implementation is a lot of clever hackery anyway, so I think it would be best to just implement stack-allocated dynamic arrays (i.e. https://issues.dlang.org/show_bug.cgi?id=18788) and avoid the games. That would have solved the immediate need I had for converting runtime hooks to templates, and would help some of that work move forward. Mike
Jan 08 2019
On 2019-01-09 03:32, Mike Franklin wrote:Here's what I think will help: 1. Get `alloca` or dynamic stack array allocation working. This will help a lot because we won't have to reach for `malloc` and friends for simple allocations like generating dynamic assert messagesWhat's the problem with "alloca"?2. Convert `memcpy`, `memset`, and `memcmp` to strongly-typed D templates so they can be used in the implementations when converting runtime hooks to templates. I did some exploration on that and published my results at https://github.com/JinShil/memcpyD. Unfortunately, DMD is missing an AVX512 implementation so I couldn't continue.What do you mean "couldn't continue"? It's possible to implement "memcpy" without AVX512. Am I missing something? -- /Jacob Carlborg
Jan 09 2019
On Wednesday, 9 January 2019 at 11:01:46 UTC, Jacob Carlborg wrote:On 2019-01-09 03:32, Mike Franklin wrote:In DMD you can't use it without linking in the runtime, but in LDC and GDC, you can. One of the goals of implementing these runtime hooks as templates is to make more features available in -betterC builds, or for pay-as-you-go runtime implementations. If you need to link in druntime to get `alloca`, you can't implement the runtime hooks as templates and have them work in -betterC.Here's what I think will help: 1. Get `alloca` or dynamic stack array allocation working. This will help a lot because we won't have to reach for `malloc` and friends for simple allocations like generating dynamic assert messagesWhat's the problem with "alloca"?Yes, it's possible, but I don't think it will ever be accepted if it doesn't perform at least as well as the optimized versions in C or assembly that use AVX512 or other SIMD features. It needs to be at least as good as what libc provides, so we need to be able to leverage these unique hardware features to get the best performance. Mike2. Convert `memcpy`, `memset`, and `memcmp` to strongly-typed D templates so they can be used in the implementations when converting runtime hooks to templates. I did some exploration on that and published my results at https://github.com/JinShil/memcpyD. Unfortunately, DMD is missing an AVX512 implementation so I couldn't continue.What do you mean "couldn't continue"? It's possible to implement "memcpy" without AVX512. Am I missing something?
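For readers unfamiliar with it, this is roughly what dynamic stack allocation for a temporary message looks like. It is a sketch under assumptions: `withJoined` is a hypothetical helper, it relies on the `alloca` declaration in `core.stdc.stdlib`, and with DMD it still needs druntime at link time, which is exactly the limitation described above.

```d
import core.stdc.stdlib : alloca;

// Hypothetical helper: joins two strings in a stack buffer and hands the
// result to a callback. The buffer dies when this function returns, so the
// callback must not keep a reference to it (hence `scope`).
size_t withJoined(scope const(char)[] a, scope const(char)[] b,
                  scope size_t delegate(scope const(char)[]) use)
{
    const len = a.length + b.length;
    // Over-allocate by one byte so a zero-length request is never made.
    auto p = cast(char*) alloca(len + 1);
    if (p is null)              // defensive: treat failure as an empty message
        return use(null);
    char[] buf = p[0 .. len];
    buf[0 .. a.length] = a[];
    buf[a.length .. $] = b[];
    return use(buf);
}
```

No heap allocation happens here, which is why this approach is attractive for building assert messages inside hooks that must remain callable from `pure` or `@nogc` code.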
Jan 09 2019
On Wednesday, 9 January 2019 at 11:49:40 UTC, Mike Franklin wrote:
[...] Yes, it's possible, but I don't think it will ever be accepted if it doesn't perform at least as well as the optimized versions in C or assembly that use AVX512 or other SIMD features. It needs to be at least as good as what libc provides, so we need to be able to leverage these unique hardware features to get the best performance.

AVX512 concerns only a very small part of processors on the market (Skylake, Cannon Lake and Cascade Lake). AMD will never implement it and the number of people upgrading to one of the lake cpus from some recent chip is also not that great. I don't see why not having it implemented yet is blocking anything. People who really need AVX512 performance will have implemented memcpy themselves already and for the others, they will have to wait a little bit. It's not as if it couldn't be added later. I really don't understand the problem.

This said, another issue with memcpy that very often gets lost is that, because of the fancy benchmarking, its system performance cost is often wrongly assessed, and a lot of heroic efforts are put in optimizing big block transfers, while in reality it's mostly called on small (postblit) to medium blocks. Linus Torvalds once had a rant on that subject on realworldtech. https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589
Jan 09 2019
On Wednesday, 9 January 2019 at 12:31:13 UTC, Patrick Schluter wrote:
On Wednesday, 9 January 2019 at 11:49:40 UTC, Mike Franklin wrote:
[...]
AVX512 concerns only a very small part of processors on the market (Skylake, Cannon Lake and Cascade Lake). AMD will never implement it and the number of people upgrading to one of the lake cpus from some recent chip is also not that great. I don't see why not having it implemented yet is blocking anything. People who really need AVX512 performance will have implemented memcpy themselves already and for the others, they will have to wait a little bit. It's not as if it couldn't be added later. I really don't understand the problem. This said, another issue with memcpy that very often gets lost is that, because of the fancy benchmarking, its system performance cost is often wrongly assessed, and a lot of heroic efforts are put in optimizing big block transfers, while in reality it's mostly called on small (postblit) to medium blocks. Linus Torvalds once had a rant on that subject on realworldtech. https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589

From a (quick) read of these articles:
- https://lemire.me/blog/2018/04/19/by-how-much-does-avx-512-slow-down-your-cpu-a-first-experiment/
- https://lemire.me/blog/2018/09/07/avx-512-when-and-how-to-use-these-new-instructions/
it seems that using AVX512 can be worthwhile if you pin a thread to a core so that it executes only AVX512 instructions.
Jan 09 2019
On Wed, Jan 09, 2019 at 12:31:13PM +0000, Patrick Schluter via Digitalmars-d wrote: [...]This said, another issue with memcpy that very often gets lost is that, because of the fancy benchmarking, its system performance cost is often wrongly assessed, and a lot of heroic efforts are put in optimizing big block transfers, while in reality it's mostly called on small (postblit) to medium blocks.EXACTLY!!! Some time ago I took an interest in implementing the equivalent of strchr in the most optimized way possible. For that, I wrote several of my own algorithms and also perused the glibc implementation. Eventually, I realized that the glibc implementation, which uses fancy 64-bit-word scanning with a lot of setup overhead and messy starting/trailing cases, is optimizing for very large scans, i.e., when the byte being sought occurs only rarely in a very large haystack. In those cases it's at the top of benchmarks. However, in the arguably more common case where the byte being sought occurs relatively frequently in small- to medium-sized haystacks, repeatedly searching the haystack incurs a ton of overhead setting up all that fancy machinery, branch hazards, and what-not, where a plain ole `while (*ptr++ != needle) {}` works much better. I suspect many of the C library functions of this sort (incl. memcpy + friends) have a tendency to suffer from this sort of premature optimization. Not to mention that often overly-specialized benchmarks of this sort fail to account for bias caused by the CPU's branch predictor learning the benchmark and the cache hierarchy amortizing the cost of repeatedly searching the same haystack -- things you rarely do in real-life applications. There's a big risk of your "super-optimized" algorithm ending up optimizing for an unrealistic use-case, but having only mediocre or sometimes even poor performance in real-world computations.Linus Torvalds had once a rant on that subject on realworldtech. 
https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589Nice. T -- If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher
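For reference, the "plain ole" scan described above looks like this when written in bounds-checked D rather than as a raw pointer loop. `indexOfCharSketch` is a hypothetical name; this is not the glibc algorithm and no performance claim is attached to it. It only shows how little setup the simple approach has.

```d
// Assumption: an illustrative byte-at-a-time search, nothing more.
// Returns the index of the first occurrence of `needle`, or -1 if absent.
ptrdiff_t indexOfCharSketch(scope const(char)[] haystack, char needle)
    @safe pure nothrow @nogc
{
    foreach (i, c; haystack)
    {
        if (c == needle)
            return cast(ptrdiff_t) i;
    }
    return -1;
}
```

There is no alignment prologue, no word-sized masking and no tail handling, so for short haystacks with frequent matches the cost is dominated by the loop itself rather than by preparation work.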
Jan 09 2019
On Wednesday, 9 January 2019 at 17:40:38 UTC, H. S. Teoh wrote:[snip] EXACTLY!!! Some time ago I took an interest in implementing the equivalent of strchr in the most optimized way possible. For that, I wrote several of my own algorithms and also perused the glibc implementation. Eventually, I realized that the glibc implementation, which uses fancy 64-bit-word scanning with a lot of setup overhead and messy starting/trailing cases, is optimizing for very large scans, i.e., when the byte being sought occurs only rarely in a very large haystack. In those cases it's at the top of benchmarks. However, in the arguably more common case where the byte being sought occurs relatively frequently in small- to medium-sized haystacks, repeatedly searching the haystack incurs a ton of overhead setting up all that fancy machinery, branch hazards, and what-not, where a plain ole `while (*ptr++ != needle) {}` works much better. I suspect many of the C library functions of this sort (incl. memcpy + friends) have a tendency to suffer from this sort of premature optimization. Not to mention that often overly-specialized benchmarks of this sort fail to account for bias caused by the CPU's branch predictor learning the benchmark and the cache hierarchy amortizing the cost of repeatedly searching the same haystack -- things you rarely do in real-life applications. There's a big risk of your "super-optimized" algorithm ending up optimizing for an unrealistic use-case, but having only mediocre or sometimes even poor performance in real-world computations.One thing I like about libmir's sum function http://docs.algorithm.dlang.io/latest/mir_math_sum.html was that the algorithm you use to return the sum can be chosen with an enum on the template. So it's really a collection of different sum algorithms all in one. Set the default as something reasonable and then let the user decide if they want something else.
Jan 09 2019
On Wed, Jan 09, 2019 at 06:55:30PM +0000, jmh530 via Digitalmars-d wrote: [...]One thing I like about libmir's sum function http://docs.algorithm.dlang.io/latest/mir_math_sum.html was that the algorithm you use to return the sum can be chosen with an enum on the template. So it's really a collection of different sum algorithms all in one. Set the default as something reasonable and then let the user decide if they want something else.That's an excellent idea. Have a generic default algorithm that performs reasonably well in typical use cases, but also give the user the power to choose a different algorithm if he knows that it would work better with his particular use case. Empowering the user -- over time I've come to learn that this is always the best approach to API design. It's one that has the best chance of standing the test of time. Fancy APIs that don't pay enough attention to this principle tend to eventually fade into obscurity. T -- I am Ohm of Borg. Resistance is voltage over current.
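The pattern being praised, a sensible default with a compile-time switch for the algorithm, can be sketched as follows. This is not libmir's code or API: `SumAlgo` and `sumSketch` are hypothetical names, and only two algorithms (naive and Kahan compensated summation) are shown.

```d
// Assumption: hypothetical names; mir's real implementation differs.
enum SumAlgo { naive, kahan }

// The algorithm is a template parameter with a default, so callers who
// do not care write sumSketch(xs) and others write sumSketch!(SumAlgo.kahan)(xs).
double sumSketch(SumAlgo algo = SumAlgo.naive)(scope const double[] xs)
    @safe pure nothrow @nogc
{
    static if (algo == SumAlgo.naive)
    {
        double total = 0;
        foreach (x; xs)
            total += x;
        return total;
    }
    else
    {
        // Kahan summation: carry the rounding error along in `comp`.
        double total = 0, comp = 0;
        foreach (x; xs)
        {
            const y = x - comp;
            const t = total + y;
            comp = (t - total) - y;
            total = t;
        }
        return total;
    }
}
```

Since the choice is resolved with `static if`, only the selected branch is compiled into each instantiation, so the flexibility costs nothing at run time.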
Jan 09 2019
On Wednesday, 9 January 2019 at 19:25:35 UTC, H. S. Teoh wrote:
That's an excellent idea. Have a generic default algorithm that performs reasonably well in typical use cases, but also give the user the power to choose a different algorithm if he knows that it would work better with his particular use case. Empowering the user -- over time I've come to learn that this is always the best approach to API design. It's one that has the best chance of standing the test of time. Fancy APIs that don't pay enough attention to this principle tend to eventually fade into obscurity.

Yes, this is one of the benefits of making `memcpy(T)(T* dest, T* src)` instead of `memcpy(void* dest, void* src, size_t num)`. One can generate a `memcpy` at compile-time that is optimized to the machine that the program is being compiled on (or for). druntime could expose "memcpy configuration settings" for users to tune at compile-time. But then you have to deal with distribution of binaries. If you are compiling a binary that you want to be able to run on all Intel 64-bit PCs, for example, you can't do that tuning at compile-time; it has to be done at runtime. Assuming my understanding is correct, Agner Fog's implementation sets a function pointer to the most optimized implementation for the machine the program is running on, based on an inspection of the CPU's capabilities at the first invocation of `memcpy`.

There are a lot of things like this to consider in order to create a professional `memcpy` implementation. Personally, I'd just like to put the infrastructure in place so those more talented than I can tune it. But as I said before, that first PR that puts said infrastructure in place needs to be justified, and I predict it will be difficult to overcome bias and perception. Reading the comments in this thread fills me with a little more optimism that I'm not the only one who thinks it's a good idea. But we still need dynamic stack allocation first before any of this can happen.

Mike
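A strongly-typed copy of the kind described might be sketched like this. `memcpySketch` is a hypothetical name and the code is not taken from the memcpyD repository; it only illustrates that the size and alignment of `T` are compile-time facts, so the strategy can be picked with `static if` instead of run-time branching on a byte count.

```d
// Assumption: an illustrative typed copy, with no SIMD and no tuning.
void memcpySketch(T)(T* dest, const(T)* src) pure nothrow @nogc
{
    static if (T.sizeof == 4 && T.alignof >= 4)
    {
        // One aligned 32-bit move.
        *cast(uint*) dest = *cast(const(uint)*) src;
    }
    else static if (T.sizeof == 8 && T.alignof >= 8)
    {
        // One aligned 64-bit move.
        *cast(ulong*) dest = *cast(const(ulong)*) src;
    }
    else
    {
        // Generic fallback: copy byte by byte. The trip count is a
        // compile-time constant, so an optimizer is free to unroll it.
        auto d = cast(ubyte*) dest;
        auto s = cast(const(ubyte)*) src;
        foreach (i; 0 .. T.sizeof)
            d[i] = s[i];
    }
}
```

The signature is also `pure nothrow @nogc` by construction, which addresses the "turtles at the bottom" problem raised earlier in the thread. Run-time CPU dispatch through a function pointer, as in the implementation attributed to Agner Fog above, would be a separate layer on top of this.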
Jan 09 2019
On Wednesday, 9 January 2019 at 12:31:13 UTC, Patrick Schluter wrote:
AVX512 concerns only a very small part of processors on the market (Skylake, Cannon Lake and Cascade Lake). AMD will never implement it and the number of people upgrading to one of the lake cpus from some recent chip is also not that great.

Yes, I agree, and even the newer chips have "Enhanced REP MOVSB and STOSB operation (ERMSB)" which can compensate. See https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-opt mization-manual.pdf 3.7.6.

I don't see why not having it implemented yet is blocking anything. People who really need AVX512 performance will have implemented memcpy themselves already and for the others, they will have to wait a little bit. It's not as if it couldn't be added later. I really don't understand the problem.

I remember analyzing other implementations of `memcpy` and they were all using AVX512. I had faith in the authors of those implementations (e.g. Agner Fog) that they knew more than me, so that was what I should be using. Perhaps I should revisit it and just do the best that DMD can do. But also keep in mind that there's a strategy to getting things accepted in DMD and elsewhere. You are often battling perception. The single most challenging aspect of implementing `memcpy` in D is overcoming bias and justifying it to the obstructionists that see it as a complete waste of time. If I can't implement it in AVX512 simply for the purpose of measurement and comparison, it will be more difficult to justify.

This said, another issue with memcpy that very often gets lost is that, because of the fancy benchmarking, its system performance cost is often wrongly assessed, and a lot of heroic efforts are put in optimizing big block transfers, while in reality it's mostly called on small (postblit) to medium blocks. Linus Torvalds once had a rant on that subject on realworldtech. https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589

I understand. I also encountered a lot of difficulty getting consistent measurements in my exploration. Doing proper measurement and analysis for this kind of thing is a skill in and of itself. You're right about the small copies being the norm. As part of my exploration, I wrote a logging `memcpy` wrapper to see what kind of copies DMD was doing when it compiled itself, and it was as you describe. Perhaps I'll give it another go at a later time, but we need to get dynamic stack allocation working first because many of the runtime hook implementations that will utilize `memcpy` do some error checking and assertions, and we need to be able to generate dynamic error messages for those assertions when the caller is `pure`. We need a solution to this (https://issues.dlang.org/show_bug.cgi?id=18788) first.

Mike
Jan 09 2019
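For readers wondering what such a logging wrapper might look like, here is a minimal sketch. It is not Mike's actual code: the names `loggingMemcpy`, `classify`, `SizeClass` and `copyCounts`, and the size-class boundaries, are all assumptions made for illustration. It simply counts calls per size class and forwards to libc's `memcpy`.

```d
import core.stdc.string : memcpy;

// Hypothetical size classes; the boundaries are arbitrary choices for illustration.
enum SizeClass { tiny, small, medium, large }

// Number of calls seen per size class. A plain __gshared global keeps the
// sketch simple; it is not thread-safe.
__gshared size_t[4] copyCounts;

// Maps a byte count to one of the hypothetical size classes.
SizeClass classify(size_t n)
{
    if (n <= 16)   return SizeClass.tiny;
    if (n <= 256)  return SizeClass.small;
    if (n <= 4096) return SizeClass.medium;
    return SizeClass.large;
}

// Records the size class of the copy, then forwards to libc's memcpy.
void* loggingMemcpy(void* destination, const(void)* source, size_t num)
{
    ++copyCounts[classify(num)];
    return memcpy(destination, source, num);
}
```

Dumping `copyCounts` at program exit would give a rough histogram of copy sizes, which is the kind of evidence behind the "small copies are the norm" observation.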
On Thursday, 10 January 2019 at 00:10:18 UTC, Mike Franklin wrote:
> I remember analyzing other implementations of `memcpy` and they were all using AVX512. I had faith in the authors of those implementations (e.g. Agner Fog) that they knew more than me, so that was what I should be using. Perhaps I should revisit it and just do the best that DMD can do.

AVX512 is a superset of AVX2, which is a superset of AVX, which is a superset of SSE. I expect the implementations you were looking at are actually implemented in SSE, where SSE2 is a baseline expectation for x64 processors.

I've done some AVX2 code recently with 256-bit values. The performance is significantly slower on AMD processors. I assume their pipeline internally is still 128 bit as a result, and while my 256-bit code can run faster on Intel it needs to run on AMD, so I've dropped to 128-bit instructions at most, effectively keeping my code SSE4.1 compatible.

I've done a memset_pattern4[1] implementation in SSE previously. The important instruction group is _mm_stream. Which, you will note, was an instruction group first introduced in SSE1 and hasn't had additional writing stream functions added since SSE 4.1[2].

[1] https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/memset_pattern4.3.html
[2] https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=5119,5452,5443,5910,5288,5119,5249,5231&text=_mm_stream
Jan 10 2019
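The "drop to 128-bit unless wider is known to pay off" decision can also be made at run time. The sketch below is only an illustration under assumptions: `vectorBytes` and its `allow256` parameter are hypothetical names, and the only library calls used are the `sse2` and `avx2` feature queries from druntime's `core.cpuid`.

```d
import core.cpuid : avx2, sse2;

// Returns the vector width in bytes that this sketch would pick,
// or 0 to signal a scalar fallback.
// `allow256` is a hypothetical knob: a caller targeting hardware where
// 256-bit operations are slow would pass false and stay at 128 bits.
size_t vectorBytes(bool allow256)
{
    if (allow256 && avx2) return 32; // 256-bit AVX2 path
    if (sse2) return 16;             // 128-bit SSE2 baseline on x64
    return 0;                        // no SIMD: scalar loop
}
```

Because each instruction set extends the previous one, checking from widest to narrowest is enough; a CPU that reports AVX2 can also run the SSE2 path.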
On Thursday, 10 January 2019 at 10:13:57 UTC, Ethan wrote:
> I've done a memset_pattern4[1] implementation in SSE previously. The important instruction group is _mm_stream. Which, you will note, was an instruction group first introduced in SSE1 and hasn't had additional writing stream functions added since SSE 4.1[2].

Where's the edit button? The last writing stream function was added in SSE2. A streaming load was added in SSE 4.1. I believe I used that load when optimising string compares.
Jan 10 2019
On Thursday, 10 January 2019 at 10:13:57 UTC, Ethan wrote:
> On Thursday, 10 January 2019 at 00:10:18 UTC, Mike Franklin wrote:
>> I remember analyzing other implementations of `memcpy` and they were all using AVX512. I had faith in the authors of those implementations (e.g. Agner Fog) that they knew more than me, so that was what I should be using. Perhaps I should revisit it and just do the best that DMD can do.
>
> AVX512 is a superset of AVX2, which is a superset of AVX, which is a superset of SSE. I expect the implementations you were looking at are actually implemented in SSE, where SSE2 is a baseline expectation for x64 processors.
>
> I've done some AVX2 code recently with 256-bit values. The performance is significantly slower on AMD processors. I assume their pipeline internally is still 128 bit as a result, and while my 256-bit code can run faster on Intel it needs to run on AMD, so I've dropped to 128-bit instructions at most, effectively keeping my code SSE4.1 compatible.
>
> I've done a memset_pattern4[1] implementation in SSE previously. The important instruction group is _mm_stream. Which, you will note, was an instruction group first introduced in SSE1 and hasn't had additional writing stream functions added since SSE 4.1[2].
>
> [1] https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/memset_pattern4.3.html
> [2] https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=5119,5452,5443,5910,5288,5119,5249,5231&text=_mm_stream

That's disappointing to learn. Ryzen has four 128-bit AVX units; 2 of them can only do addition and the other 2 can only do multiplication. Not sure how the memory is shared between units, but if it isn't then it'd need to copy to be able to do an addition then a multiplication.
Jan 10 2019
On Thursday, 10 January 2019 at 21:01:09 UTC, luckoverthere wrote:
> That's disappointing to learn. Ryzen has four 128-bit AVX units; 2 of them can only do addition and the other 2 can only do multiplication. Not sure how the memory is shared between units, but if it isn't then it'd need to copy to be able to do an addition then a multiplication.

The good news though is that Ryzen's 128-bit pipeline outperforms my Skylake i7 with this code. So you could say they've optimised for the majority use case.

It's reaaaaaally beneficial to do 256-bit logic for my particular use case here since I'm sampling and operating on 8 32-bit values at a time to produce a 32-bit output. But eh, I've gotta write for the build farm hardware.
Jan 11 2019
On Friday, 11 January 2019 at 09:36:09 UTC, Ethan wrote:
> On Thursday, 10 January 2019 at 21:01:09 UTC, luckoverthere wrote:
>> That's disappointing to learn. Ryzen has four 128-bit AVX units; 2 of them can only do addition and the other 2 can only do multiplication. Not sure how the memory is shared between units, but if it isn't then it'd need to copy to be able to do an addition then a multiplication.
>
> The good news though is that Ryzen's 128-bit pipeline outperforms my Skylake i7 with this code. So you could say they've optimised for the majority use case.
>
> It's reaaaaaally beneficial to do 256-bit logic for my particular use case here since I'm sampling and operating on 8 32-bit values at a time to produce a 32-bit output. But eh, I've gotta write for the build farm hardware.

Hi Ethan, could you share a piece of code that does that? Thank you.
Jan 11 2019
On Friday, 11 January 2019 at 11:10:10 UTC, bioinfornatics wrote:
> Hi Ethan, could you share a piece of code that does that? Thank you.

Not really.
1) It's very context specific
2) It's for my current employer and is subject to the usual code disclosure NDAs
Jan 11 2019
On Friday, 11 January 2019 at 11:47:20 UTC, Ethan wrote:
> On Friday, 11 January 2019 at 11:10:10 UTC, bioinfornatics wrote:
>> Hi Ethan, could you share a piece of code that does that? Thank you.
>
> Not really.
> 1) It's very context specific
> 2) It's for my current employer and is subject to the usual code disclosure NDAs

OK, I understand, no problem 😉

So I could try to use this idea as a training exercise. For example, take 8 values of 32 bits and return the sum, or some other operation... But I thought AMD had 2 units for sums and 2 units for multiplies. I need to get a better understanding of this topic 🤔
Jan 11 2019
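As a starting point for that exercise, here is a minimal sketch of "take 8 values of 32 bits and return the sum". It is not Ethan's code, and `sum8` is a hypothetical name. It uses plain D array operations rather than explicit SIMD intrinsics, mirroring the 128-bit approach discussed above by treating the input as two groups of four lanes; whether the compiler actually emits SSE instructions for it depends on the compiler and flags.

```d
// Sums eight 32-bit values by treating them as two halves of four lanes each,
// the way a 128-bit SIMD implementation would.
int sum8(const int[8] values)
{
    int[4] lanes;
    // Lane-wise add: lanes[i] = values[i] + values[i + 4].
    lanes[] = values[0 .. 4] + values[4 .. 8];
    // Horizontal reduction of the four partial sums.
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

For example, `sum8([1, 2, 3, 4, 5, 6, 7, 8])` adds the halves to get `[6, 8, 10, 12]` and reduces that to 36.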
On 2019-01-09 12:49, Mike Franklin wrote:
> In DMD you can't use it without linking in the runtime, but in LDC and GDC, you can. One of the goals of implementing these runtime hooks as templates is to make more features available in -betterC builds, or for pay-as-you-go runtime implementations. If you need to link in druntime to get `alloca`, you can't implement the runtime hooks as templates and have them work in -betterC.

Ah, I see.

> Yes, it's possible, but I don't think it will ever be accepted if it doesn't perform at least as well as the optimized versions in C or assembly that use AVX512 or other SIMD features. It needs to be at least as good as what libc provides, so we need to be able to leverage these unique hardware features to get the best performance.

Perhaps it could be considered as a fallback when a "memcpy" isn't available.

--
/Jacob Carlborg
Jan 09 2019
On Wednesday, 9 January 2019 at 19:24:28 UTC, Jacob Carlborg wrote:
>> Yes, it's possible, but I don't think it will ever be accepted if it doesn't perform at least as well as the optimized versions in C or assembly that use AVX512 or other SIMD features. It needs to be at least as good as what libc provides, so we need to be able to leverage these unique hardware features to get the best performance.
>
> Perhaps it could be considered as a fallback when a "memcpy" isn't available.

I'm not sure what you mean. DMD currently links in libc, so `memcpy` is always available.

Also, it's difficult for me to articulate, but we don't want `void* memcpy(void* destination, const void* source, size_t num)` rewritten in D. We need `void memcpy(T)(T* destination, const T* source)` or some other strongly typed template like that. And as an aside, thanks to https://github.com/dlang/dmd/pull/8504 we now have to be careful about the order of arguments.

Anyway, I'm not sure there's much point in hashing this out right now. We need dynamic stack allocation first before any of this can happen because the runtime hooks need to be able to generate dynamic assertion messages in -betterC, and there's only one person I know of that can do that (Walter), and I don't think it's a priority for him right now.

Mike
Jan 09 2019
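To make the "strongly typed template" idea concrete, here is a minimal sketch of what such a function could look like. It is an illustration only, not the proposed druntime hook: the name `copyTyped` is hypothetical (chosen to avoid clashing with libc's `memcpy`), and the word-at-a-time strategy is just one simple choice. The point is that `T.sizeof` and `T.alignof` are known at compile time, so each instantiation can pick its copy strategy statically instead of branching on a runtime `num`.

```d
// Copies one T from source to destination without calling libc.
// Size and alignment are compile-time constants, so the strategy is
// selected per instantiation via static if.
void copyTyped(T)(T* destination, const T* source)
{
    auto d = cast(ubyte*) destination;
    auto s = cast(const(ubyte)*) source;

    static if (T.sizeof % size_t.sizeof == 0 && T.alignof >= size_t.alignof)
    {
        // Size is a whole number of machine words and the data is
        // word-aligned: copy one word at a time.
        foreach (i; 0 .. T.sizeof / size_t.sizeof)
            (cast(size_t*) d)[i] = (cast(const(size_t)*) s)[i];
    }
    else
    {
        // Fallback for small or oddly sized types: copy byte by byte.
        foreach (i; 0 .. T.sizeof)
            d[i] = s[i];
    }
}
```

A tuned implementation would add SIMD paths for large types, but even this form shows why the typed signature helps: small copies, which the thread notes are the common case, never pay for a runtime size dispatch.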