
digitalmars.D - core.traits?

reply Manu <turkeyman gmail.com> writes:
So, druntime has core.internal.traits where a bunch of std.traits have
been mirrored to support internal machinery within druntime.
This is clear evidence that a lot of these traits are really
super-critical to doing basically anything interesting with D.
I have experience with no-phobos projects in the past where I've been
frustrated that I had to mirror all the traits I needed manually.

I suggest that a fair set of std.traits (the no-brainer traits that you
basically can't live without) should be officially moved to
core.traits, so that they are always available to all D users.
Traits are pure templates: they don't emit code and have no impact on
the size of the druntime binary, and they shouldn't significantly
affect build times unless they are instantiated.
...and they're already there in core.internal.traits.

We should move them to core.traits, and that should be their official
home. It really just makes sense. Uncontroversial low-level traits
don't belong in phobos.
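To make that concrete, here are a couple of the no-brainer traits in question, sketched in simplified form (these are illustrations, not the actual Phobos/druntime implementations):

```d
/// true for the built-in integral types
enum isIntegral(T) = is(T == byte)  || is(T == ubyte)  ||
                     is(T == short) || is(T == ushort) ||
                     is(T == int)   || is(T == uint)   ||
                     is(T == long)  || is(T == ulong);

/// strip one level of type qualifiers (simplified sketch of Unqual)
template Unqual(T)
{
    static if (is(T U == const U))          alias Unqual = U;
    else static if (is(T U == immutable U)) alias Unqual = U;
    else static if (is(T U == shared U))    alias Unqual = U;
    else                                    alias Unqual = T;
}

static assert(isIntegral!int && !isIntegral!float);
static assert(is(Unqual!(const int) == int));
```

Both are evaluated entirely at compile time, which is the point: declaring them in druntime costs nothing in the shipped binary.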
Jan 05
next sibling parent reply kinke <noone nowhere.com> writes:
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
 We should move them to core.traits, and that should be their 
 official home. It really just makes sense. Uncontroversial 
 low-level traits don't belong in phobos.
I fully agree.
Jan 05
parent Brian <zoujiaqing gmail.com> writes:
On Saturday, 5 January 2019 at 21:31:38 UTC, kinke wrote:
 On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
 We should move them to core.traits, and that should be their 
 official home. It really just makes sense. Uncontroversial 
 low-level traits don't belong in phobos.
I fully agree.
YES, YES!! Agree!!!
Jan 08
prev sibling next sibling parent reply Seb <seb wilzba.ch> writes:
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
 So, druntime has core.internal.traits where a bunch of 
 std.traits have
 been mirrored to support internal machinery within druntime.
 This is clear evidence that a lot of these traits are really
 super-critical to doing basically anything interesting with D.
 I have experience with no-phobos projects in the past where 
 I've been
 frustrated that I had to mirror all the traits I needed 
 manually.

 [...]
I would go even one step further: move everything, and just alias things in std.traits so that no breakage happens.
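The aliasing scheme Seb describes could look something like this sketch (the module layout is hypothetical; `core.traits` does not exist yet):

```d
// Hypothetical post-move layout, simulated in one file.
// The implementation would live in druntime (the proposed core.traits):
enum isPointer(T) = is(T == U*, U);

// ...and std.traits would keep the old name alive with a plain alias,
// so existing user code keeps compiling unchanged:
// alias isPointer = core.traits.isPointer;

static assert(isPointer!(int*));
static assert(!isPointer!int);
```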
Jan 05
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Jan 05, 2019 at 09:49:58PM +0000, Seb via Digitalmars-d wrote:
 On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
 So, druntime has core.internal.traits where a bunch of std.traits have
 been mirrored to support internal machinery within druntime.
 This is clear evidence that a lot of these traits are really
 super-critical to doing basically anything interesting with D.
 I have experience with no-phobos projects in the past where I've been
 frustrated that I had to mirror all the traits I needed manually.
 
 [...]
I would go even one step further and move everything and just alias things in std.traits, s.t. no breakage happens.
I concur! We've been adding ugly nasty hacks to druntime, or writing code in circumlocutory ways, for far too long now, all because certain basic traits happen to be in std.traits and it's verboten to import Phobos from druntime. It's time to revisit that decision.

The more complex traits should remain in Phobos in order not to complicate druntime too much, but the basic ones also needed in druntime should be moved into druntime, instead of copy-pasta or roll-your-own in druntime.

T

--
It's amazing how careful choice of punctuation can leave you hanging:
Jan 05
prev sibling parent Manu <turkeyman gmail.com> writes:
On Sat, Jan 5, 2019 at 2:04 PM H. S. Teoh via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Sat, Jan 05, 2019 at 09:49:58PM +0000, Seb via Digitalmars-d wrote:
 On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
 So, druntime has core.internal.traits where a bunch of std.traits have
 been mirrored to support internal machinery within druntime.
 This is clear evidence that a lot of these traits are really
 super-critical to doing basically anything interesting with D.
 I have experience with no-phobos projects in the past where I've been
 frustrated that I had to mirror all the traits I needed manually.

 [...]
I would go even one step further and move everything and just alias things in std.traits, s.t. no breakage happens.
I concur! We've been adding ugly nasty hacks to druntime, or writing code in circumlocutory ways, for far too long now, all because certain basic traits happen to be in std.traits and it's verboten to import Phobos from druntime. It's time to revisit that decision.

The more complex traits should remain in Phobos in order not to complicate druntime too much, but the basic ones also needed in druntime should be moved into druntime, instead of copy-pasta or roll-your-own in druntime.

T

--
It's amazing how careful choice of punctuation can leave you hanging:
Great! So, who's gonna do it? I'm already overloaded with these sorts of refactors >_<
Jan 05
prev sibling next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:
 So, druntime has core.internal.traits where a bunch of 
 std.traits have
 been mirrored to support internal machinery within druntime.
 This is clear evidence that a lot of these traits are really
 super-critical to doing basically anything interesting with D.
 I have experience with no-phobos projects in the past where 
 I've been
 frustrated that I had to mirror all the traits I needed 
 manually.

 [...]
Makes sense to me.
Jan 05
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/5/2019 1:12 PM, Manu wrote:
 We should move them to core.traits, and that should be their official
 home. It really just makes sense. Uncontroversial low-level traits
 don't belong in phobos.
Sounds good.
Jan 05
next sibling parent reply Manu <turkeyman gmail.com> writes:
On Sat, Jan 5, 2019 at 11:00 PM Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 1/5/2019 1:12 PM, Manu wrote:
 We should move them to core.traits, and that should be their official
 home. It really just makes sense. Uncontroversial low-level traits
 don't belong in phobos.
Sounds good.
Okay, so this is a challenging effort, since phobos is such a tangled rat's nest of chaos...

But attempting to move some traits immediately calls into question std.meta. I think we can all agree that Alias and AliasSeq should be in druntime along with core traits... but where should they live? Should there be core.meta as well? It's kinda like core.traits, in that it doesn't include runtime code and doesn't increase the payload of druntime.lib for end-users. Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.

...yes, this process will go on and on. The only way forward is to take each hurdle one at a time... and ideally, in attempting this effort, we can de-tangle a lot of cruft during the process.
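For reference, the two std.meta primitives Manu singles out are tiny; they are essentially this (simplified from std.meta):

```d
// AliasSeq is just a named template parameter sequence...
alias AliasSeq(TList...) = TList;

// ...and Alias forces a symbol or type into a form that can be aliased.
alias Alias(alias a) = a;
alias Alias(T) = T;

alias Types = AliasSeq!(int, double, string);
static assert(Types.length == 3);
static assert(is(Types[1] == double));
```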
Jan 07
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/7/19 4:25 PM, Manu wrote:
 On Sat, Jan 5, 2019 at 11:00 PM Walter Bright via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 1/5/2019 1:12 PM, Manu wrote:
 We should move them to core.traits, and that should be their official
 home. It really just makes sense. Uncontroversial low-level traits
 don't belong in phobos.
Sounds good.
Okay, so this is a challenging effort, since phobos is such a tangled rat's nest of chaos...

But attempting to move some traits immediately calls into question std.meta. I think we can all agree that Alias and AliasSeq should be in druntime along with core traits... but where should they live? Should there be core.meta as well? It's kinda like core.traits, in that it doesn't include runtime code and doesn't increase the payload of druntime.lib for end-users. Perhaps AliasSeq should live somewhere different? I'm feeling like a lean/trimmed-down core.meta might want to exist next to core.traits though; it seems reasonable.

...yes, this process will go on and on. The only way forward is to take each hurdle one at a time... and ideally, in attempting this effort, we can de-tangle a lot of cruft during the process.
I was going to say core.meta should just have the basic definitions, but why not just dump all of it in there. As you said, they are templates (and so you only pay if you use them), and really part of the core language definition.

There are already parts of std.meta inside core.internal as well, so if we want to be consistent, we should just move it all ;)

-Steve
Jan 07
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote:
[...]
 Okay, so this is a challenging effort, since phobos is such a tangled
 rats nets of chaos...
It's already gotten better over the years in some ways (though not others -- unfortunately I'm afraid std.traits might be one of the places where things have probably gotten more tangled).

It certainly hasn't lived up to the promise of that old Phobos Philosophy page that once existed but has since been removed, of Phobos being a collection of lightweight, self-contained, mostly-orthogonal, reusable components. It has become quite the opposite, where the dependency graph of Phobos modules is approaching pretty close to being a complete graph. (And yes, there are some pretty deep-seated cyclic dependencies that thus far nobody has been able to truly unravel in any satisfactory way.)
 But attempting to move some traits immediately calls into question
 std.meta.  I think we can all agree that Alias and AliasSeq should be
 in druntime along with core traits... but where should it live?
 Should there be core.meta as well? It's kinda like core.traits, in
 that it doesn't include runtime code, it doesn't increase the payload
 of druntime.lib for end-users..
Shouldn't all of core.traits be like that? I'd hardly expect any runtime component to be associated with something called 'traits'.
 Perhaps AliasSeq should live somewhere different?
 I'm feeling like a lean/trimmed-down core.meta might want to exist
 next to core.traits though; it seems reasonable.
I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
 ...yes, this process will go on and on. The only way forward is to
 take each hurdle one at a time... and ideally, in attempting this
 effort, we can de-tangle a lot of cruft during the process.
I'm tempted to say we should put everything in core.traits for now. And just the absolute bare minimum it takes to meet whatever druntime needs, and nothing more.

T

--
I think Debian's doing something wrong, `apt-get install pesticide', doesn't seem to remove the bugs on my system! -- Mike Dresser
Jan 07
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/7/19 4:41 PM, H. S. Teoh wrote:
 On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote:
 [...]
 Perhaps AliasSeq should live somewhere different?
 I'm feeling like a lean/trimmed-down core.meta might want to exist
 next to core.traits though; it seems reasonable.
I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple), allSatisfy, anySatisfy, Filter, staticMap. We might as well just move the whole thing there.

The fear of moving other pieces is real, but only if you look superficially. traits and meta are really part of the language; I can't imagine using D without them (and neither can any of the people who put pieces of those modules into druntime). I don't want to see anything more complex and interdependent get sucked in.

I know the goal for Manu right now is emplace, which does feel like a language feature. But as has been said many times here, std.traits is a no-brainer, and std.meta is really a building block that std.traits uses.

-Steve
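For readers who haven't used them, here is roughly what those four building blocks do, shown via the std.meta names (`isSmall` and `PtrTo` are made-up helpers for the example):

```d
import std.meta : AliasSeq, allSatisfy, anySatisfy, Filter, staticMap;

enum isSmall(T) = T.sizeof <= 4;  // example predicate
alias PtrTo(T) = T*;              // example mapping

alias Ts = AliasSeq!(byte, int, long, double);

static assert(!allSatisfy!(isSmall, Ts)); // long and double are 8 bytes
static assert(anySatisfy!(isSmall, Ts));  // byte and int qualify
static assert(Filter!(isSmall, Ts).length == 2);
static assert(is(staticMap!(PtrTo, Ts)[0] == byte*));
```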
Jan 07
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 1/7/19 4:41 PM, H. S. Teoh wrote:
 On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote:
 [...]
 Perhaps AliasSeq should live somewhere different?  I'm feeling
 like a lean/trimmed-down core.meta might want to exist next to
 core.traits though; it seems reasonable.
I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?
 allSatisfy, anySatisfy, Filter, staticMap. We might as well just move
 the whole thing there.
I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel). I'm not so sure they should go into druntime.
 The fear of moving other pieces is real, but only if you look
 superficially.  traits and meta are really part of the language, I
 can't imagine using D without it (and neither can any of the people
 who put pieces of those modules into druntime).
Actually, I find myself redefining AliasSeq locally with a shorter name all the time. Scarily enough, doing that is shorter than typing `import std.meta : AliasSeq;`.
 I don't want to see anything more complex and interdependent get
 sucked in.  I know the goal for Manu right now is emplace, which does
 feel like a language feature. But as has been said many times here,
 std.traits is a no-brainer, and std.meta is really a building block
 that std.traits uses.
[...]

Wait, so you're saying we should move the *entire* std.meta and std.traits to druntime...?

T

--
ASCII stupid question, getty stupid ANSI.
Jan 07
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/7/19 5:10 PM, H. S. Teoh wrote:
 On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 1/7/19 4:41 PM, H. S. Teoh wrote:
 On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote:
 [...]
 Perhaps AliasSeq should live somewhere different?  I'm feeling
 like a lean/trimmed-down core.meta might want to exist next to
 core.traits though; it seems reasonable.
I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?
It's an internal definition, so it can be anything it wants it to be. It could be Tuple or Foobar.
 
 
 allSatisfy, anySatisfy, Filter, staticMap. We might as well just move
 the whole thing there.
I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel). I'm not so sure they should go into druntime.
Again, they are already there, and for a reason.
 The fear of moving other pieces is real, but only if you look
 superficially.  traits and meta are really part of the language, I
 can't imagine using D without it (and neither can any of the people
 who put pieces of those modules into druntime).
Actually, I find myself redefining AliasSeq locally with a shorter name all the time. Scarily enough, doing that is shorter than typing `import std.traits : AliasSeq;`.
The point is to have a universally recognized construct. When your code's documentation says `TList` or whatever, someone has to go figure that out. If it says `AliasSeq`, it's known what it is.
 I don't want to see anything more complex and interdependent get
 sucked in.  I know the goal for Manu right now is emplace, which does
 feel like a language feature. But as has been said many times here,
 std.traits is a no-brainer, and std.meta is really a building block
 that std.traits uses.
[...] Wait, so you're saying we should move the *entire* std.meta and std.traits to druntime...?
At least std.meta. std.traits can possibly be split into critical language-enabling traits and the less important ones, but I don't know off the top of my head which ones those are. But I would say std.meta is composed only of the former variety.

-Steve
Jan 07
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On Mon, Jan 7, 2019 at 2:10 PM H. S. Teoh via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 1/7/19 4:41 PM, H. S. Teoh wrote:
 On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote:
 [...]
 Perhaps AliasSeq should live somewhere different?  I'm feeling
 like a lean/trimmed-down core.meta might want to exist next to
 core.traits though; it seems reasonable.
I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?
 allSatisfy, anySatisfy, Filter, staticMap. We might as well just move
 the whole thing there.
I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel).
Ummm, well, yes and no. staticMap is DEFINITELY, like you say, one of the worst, and I really think it needs a language solution. C++'s `...` expansion is exactly staticMap, and I wonder if we need an expression like that in-language to take the load off staticMap, because it's perhaps one of the slowest parts of the language. We can translate almost anything to CTFE, *except* code that invokes staticMap, so I really reckon it needs a language tool.

This is an aside; if we wanna talk about staticMap, we should start a new thread.
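A sketch of why staticMap is expensive: the textbook implementation is linearly recursive, so mapping N items costs on the order of N template instantiations (simplified; newer Phobos versions split the list to reduce recursion depth, but the instantiation count stays proportional to N):

```d
alias AliasSeq(T...) = T;

template staticMap(alias F, T...)
{
    static if (T.length == 0)
        alias staticMap = AliasSeq!();
    else
        // one fresh staticMap instantiation per element
        alias staticMap = AliasSeq!(F!(T[0]), staticMap!(F, T[1 .. $]));
}

alias PtrTo(T) = T*; // example mapping
static assert(is(staticMap!(PtrTo, int, char)[1] == char*));
```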
 I'm not so sure they should go into druntime.
Then you're kinda making a case that NONE of it should go in druntime, because they're such common building blocks. Also, you've already lost the game, because they're already in druntime (in core.internal).
 The fear of moving other pieces is real, but only if you look
 superficially.  traits and meta are really part of the language, I
 can't imagine using D without it (and neither can any of the people
 who put pieces of those modules into druntime).
Actually, I find myself redefining AliasSeq locally with a shorter name all the time. Scarily enough, doing that is shorter than typing `import std.traits : AliasSeq;`.
I've been known to do that too... I could do that for core.traits as well, but that seems pretty lame.
 I don't want to see anything more complex and interdependent get
 sucked in.  I know the goal for Manu right now is emplace, which does
 feel like a language feature. But as has been said many times here,
 std.traits is a no-brainer, and std.meta is really a building block
 that std.traits uses.
[...] Wait, so you're saying we should move the *entire* std.meta and std.traits to druntime...?
I'd like to be more selective than that; at very least, audit every single symbol coming across. But some part of me thinks this may actually be a very reasonable proposal...
Jan 07
prev sibling next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, January 7, 2019 3:10:02 PM MST H. S. Teoh via Digitalmars-d 
wrote:
 On Mon, Jan 07, 2019 at 04:54:15PM -0500, Steven Schveighoffer via 
Digitalmars-d wrote:
 On 1/7/19 4:41 PM, H. S. Teoh wrote:
 On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d
 wrote:
 [...]

 Perhaps AliasSeq should live somewhere different?  I'm feeling
 like a lean/trimmed-down core.meta might want to exist next to
 core.traits though; it seems reasonable.
I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
std.internal.traits contains pieces of std.meta -- a quick look shows it has AliasSeq (but under the name TypeTuple),
What, wut...? `TypeTuple` still exists?! I thought we had gone through a somewhat painful deprecation cycle just to kill it off. Or is that cycle not done yet...?
We're talking about druntime internals here. TypeTuple was one of the things that got copied into druntime as internal. It was later renamed to AliasSeq in Phobos, but the copied stuff in druntime didn't necessarily get changed, since it was all internal and had nothing to do with the Phobos stuff it originated from.
 allSatisfy, anySatisfy, Filter, staticMap. We might as well just move
 the whole thing there.
I dunno, IMO allSatisfy, anySatisfy, and esp. Filter and staticMap are all heavy-weight templates (not in terms of code complexity, but in the sheer amount of templates that will get instantiated when you use them, AKA compiler slowdown fuel). I'm not so sure they should go into druntime.
Historically, we've tried to only put stuff in druntime that needs to be in druntime for druntime to do what it does. That does sometimes get stretched (e.g. we put all of the OS bindings in druntime rather than just the ones that druntime needs), but Phobos is the standard library, not druntime. druntime is the runtime. If something needs to be there so that the runtime can do its thing, then we put it in druntime, but not much else should be there. Certainly, anything that is there needs to be stuff that's actually core. So, in that respect, it's pretty questionable to move all of std.traits and std.meta into druntime wholesale.

What we had for a while was traits just being copied into druntime where they were needed, resulting in duplicate implementations in various places (especially for TypeTuple). Later, they were largely consolidated into core.internal.traits, but they were still completely separate from Phobos. More recently, the traits in core.internal.traits have had their std.traits implementations turned into simple wrappers that use the core.internal.traits symbols (they're not aliased, because that doesn't work well with the documentation, so you get an extra layer of template with every trait whose implementation is in druntime).

So, if Manu needs a trait for something in druntime, the normal thing to do at this point would be to just add it to core.internal.traits and then potentially make Phobos wrap the druntime symbol. We don't actually need to move anything wholesale, nor do we need something like core.traits which is intended to make the traits publicly available from druntime.

Another issue, if you actually look at std.traits and std.meta, is that there are actually several templates which use other pieces of Phobos. e.g. isInputRange gets used in std.meta.aliasSeqOf, as does std.array.array, and std.traits.packageName uses startsWith.
So, while many of the templates from std.traits and std.meta could be moved, actually trying to move them all wouldn't work without reimplementing yet other pieces of Phobos.

Personally, I think that the only real benefit of having something like core.traits over having core.internal.traits is that you can move the documentation for those symbols to their druntime implementations and make the Phobos implementations actual aliases with just a link to the druntime documentation, instead of needing a thin wrapper template. Anyone wanting to avoid Phobos can still use those traits from std.traits and std.meta just fine, since it wouldn't involve linking against Phobos, just importing the module (even some of the traits that we can't move would work just fine, because they'd just be pulling in other Phobos symbols that don't involve linking).

But it does get pretty weird to have some of the traits in core and some in std with no real obvious distinction, since it's based on what druntime needs. So, in that respect, keeping them internal is cleaner. Either way, I think that it's quite clear that we can't move everything.

- Jonathan M Davis
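The wrapper pattern Jonathan describes, simulated in a single file (names are illustrative; `CoreUnqual` stands in for the druntime-side implementation):

```d
// stands in for the implementation in core.internal.traits
template CoreUnqual(T)
{
    static if (is(T U == const U))          alias CoreUnqual = U;
    else static if (is(T U == immutable U)) alias CoreUnqual = U;
    else                                    alias CoreUnqual = T;
}

/// The public std.traits symbol: a thin wrapper rather than an alias,
/// so it can carry its own documentation -- at the cost of one extra
/// template instantiation per use.
template Unqual(T)
{
    alias Unqual = CoreUnqual!T;
}

static assert(is(Unqual!(const int) == int));
```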
Jan 07
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Monday, 7 January 2019 at 21:54:15 UTC, Steven Schveighoffer 
wrote:
 std.internal.traits contains pieces of std.meta -- a quick look 
 shows it has AliasSeq (but under the name TypeTuple),
That's fine internally, but std.traits still uses *Tuple for ten public templates: https://github.com/dlang/phobos/pull/6227 If we move any of those to core.traits, please can we finally fix the names.
 traits and meta are really part of the language,
Some in std.traits are tightly coupled to the language, e.g. isInteger. Some are utility templates, e.g. ConstOf, CopyConstness. I think only the former should be public in druntime. Select and select aren't even traits; they should have been in std.meta.

Except perhaps AliasSeq (and Alias, Instantiate), all of std.meta seems to be utility templates rather than language-feature wrappers.
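The distinction Nick draws, sketched with simplified stand-ins (not the Phobos implementations): one kind of trait answers a question only the compiler knows, the other is mechanical type surgery anyone could write:

```d
// tightly coupled to the language: queries a compiler-known property
enum isStaticArray(T) = is(T : U[n], U, size_t n);

// pure utility: copies const/immutable from one type onto another
// (simplified stand-in for std.traits.CopyConstness)
template CopyConstness(From, To)
{
    static if (is(From == const U, U))          alias CopyConstness = const(To);
    else static if (is(From == immutable U, U)) alias CopyConstness = immutable(To);
    else                                        alias CopyConstness = To;
}

static assert(isStaticArray!(int[3]) && !isStaticArray!(int[]));
static assert(is(CopyConstness!(const int, float) == const(float)));
```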
Jan 08
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/8/19 1:59 PM, Nick Treleaven wrote:
 On Monday, 7 January 2019 at 21:54:15 UTC, Steven Schveighoffer wrote:
 std.internal.traits contains pieces of std.meta -- a quick look shows 
 it has AliasSeq (but under the name TypeTuple),
That's fine internally, but std.traits still uses *Tuple for ten public templates: https://github.com/dlang/phobos/pull/6227 If we move any of those to core.traits, please can we finally fix the names.
You'd have to convince Andrei, as he seemingly nixed the PR.
 traits and meta are really part of the language,
Some in std.traits are tightly coupled to the language, e.g. isInteger. Some are utility templates, e.g. ConstOf, CopyConstness. I think only the former should be public in druntime. Select and select aren't even traits, they should have been in std.meta.
What I mean is that you reach for things like what is available in std.traits quite often when doing template constraints or type manipulation. I'd consider ConstOf and CopyConstness to be in that group.
 Except perhaps AliasSeq, (Alias, Instantiate) all of std.meta seems to 
 be utility templates rather than language feature wrappers.
What I mean is that I consider them part of the D language, not like an optional library feature. But in any case, much of std.traits relies on std.meta. We would at least need the parts of std.meta that are used to build std.traits.

-Steve
Jan 08
prev sibling parent Manu <turkeyman gmail.com> writes:
On Mon, Jan 7, 2019 at 1:42 PM H. S. Teoh via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Mon, Jan 07, 2019 at 01:25:17PM -0800, Manu via Digitalmars-d wrote:
 [...]
 Okay, so this is a challenging effort, since phobos is such a tangled
 rats nets of chaos...
It's already gotten better over the years in some ways (though not others -- unfortunately I'm afraid std.traits might be one of the places where things have probably gotten more tangled).

It certainly hasn't lived up to the promise of that old Phobos Philosophy page that once existed but has since been removed, of Phobos being a collection of lightweight, self-contained, mostly-orthogonal, reusable components. It has become quite the opposite, where the dependency graph of Phobos modules is approaching pretty close to being a complete graph. (And yes, there are some pretty deep-seated cyclic dependencies that thus far nobody has been able to truly unravel in any satisfactory way.)
 But attempting to move some traits immediately calls into question
 std.meta.  I think we can all agree that Alias and AliasSeq should be
 in druntime along with core traits... but where should it live?
 Should there be core.meta as well? It's kinda like core.traits, in
 that it doesn't include runtime code, it doesn't increase the payload
 of druntime.lib for end-users..
Shouldn't all of core.traits be like that? I'd hardly expect any runtime component to be associated with something called 'traits'.
 Perhaps AliasSeq should live somewhere different?
 I'm feeling like a lean/trimmed-down core.meta might want to exist
 next to core.traits though; it seems reasonable.
I'm afraid this would set the wrong precedent -- since there's core.traits for std.traits and core.meta for std.meta, why not also have core.typecons, core.range, and then it's all gonna go downhill from there, and before you know it there's gonna be core.stdio and core.format... *shudder*
Well, I think you're kinda catastrophising here... I said very clearly "The only way forward is to take each hurdle one at a time", and I mean that as literally as possible. One at a time... The question is: core.meta... yeah? No precedents are being set.. we're not opening flood gates, it's a singular question.
 ...yes, this process will go on and on. The only way forward is to
 take each hurdle one at a time... and ideally, in attempting this
 effort, we can de-tangle a lot of cruft during the process.
I'm tempted to say we should put everything in core.traits for now. And just the absolute bare minimum it takes to meet whatever druntime needs, and nothing more.
Are you saying we should put *everything* in core.traits? That is, put AliasSeq in core.traits? It's objectively NOT a 'traits'...
Jan 07
prev sibling parent reply Mike Franklin <slavo5150 yahoo.com> writes:
On Saturday, 5 January 2019 at 21:12:54 UTC, Manu wrote:

 We should move them to core.traits, and that should be their 
 official home. It really just makes sense. Uncontroversial 
 low-level traits don't belong in phobos.
I'm not really doing much with D anymore, so I apologize for interrupting the conversation, but I had an idea I wanted to offer just in case it might help.

This was something I tried to tackle about 6 months ago; I called it "UtiliD" (https://forum.dlang.org/post/wgkbamnlraustaycbbya forum.dlang.org). I ultimately failed due to the tangle of Phobos, and because my life priorities changed I gave up and deleted my repository (oops!).

Anyway, my suggestion is to create a new library separate from druntime and phobos that has no dependencies whatsoever (no libc, no libstdc++, no OS dependencies, no druntime dependency, etc.). I mean it; **no dependencies**. Not even object.d. The only thing it should require is a D compiler.

That library can then be imported by druntime, phobos, betterC builds, or even the compiler itself. It will take strict enforcement of the "no dependency" rule and good judgment to keep the scope from ballooning, but it may be a good place for things like `traits`, `meta` and others.

Hope I'm not just making noise.

Mike
Jan 07
next sibling parent reply Mike Franklin <slavo5150 yahoo.com> writes:
On Tuesday, 8 January 2019 at 01:44:08 UTC, Mike Franklin wrote:

 Anyway, my suggestion is to create a new library separate from 
 druntime and phobos that has no dependencies whatsoever (no 
 libc, no libstdc++, no OS dependencies, no druntime dependency, 
 etc.).  I mean it; **no dependencies**.  Not even object.d.  
 The only thing it should require is a D compiler.

 That library can then be imported by druntime, phobos, betterC 
 builds, or even the compiler itself. It will take strict 
 enforcement of the "no dependency" rule and good judgment to 
 keep the scope from ballooning, but it may be a good place for 
 things like `traits`, `meta` and others.
I spent some time trying to think through some of the issues with druntime, and came up with this:

Right now, druntime is somewhat of a monolith trying to be too many things:
   * utilities (traits, string utilities, type conversion utilities, etc...)
   * compiler lowerings
   * C standard library bindings
   * C++ standard library bindings
   * Operating system bindings
   * OS abstractions (threads, fibers, context switching, etc...)
   * DWARF implementation
   * TLS implementation
   * GC
   * (probably more)

So, I suggest something like this:
----------------------------------
* core.util - a.k.a. UtiliD - Just utility implementations written in D (e.g. `std.traits`, `std.meta`, etc.). No dependencies whatsoever. No operating system or platform abstractions. No high-level language features (e.g. exceptions).
     * public imports: (none)
     * private imports: (none)

* core.stdc - C standard library bindings - libc functions verbatim; no convenience or utility implementations
     * public imports: (none)
     * private imports: core.util

* core.stdcpp - C++ standard library bindings - libstdc++ data structures verbatim; no convenience or utility implementations
     * public imports: (none)
     * private imports: core.util

* sys - OS/platform bindings - operating system implementations verbatim; no convenience or utility implementations
     * public imports: (none)
     * private imports: core.util

* core.pal - platform/OS abstractions - threads, fibers, context switching, etc.
     * public imports: (none)
     * private imports: core.util, sys, core.stdc

* core.d - compiler support (compiler lowerings, runtime initialization, TLS implementation, DWARF implementation, GC, etc...)
     * public imports: core.util
     * private imports: core.pal

* druntime - just a top-level package containing public imports, aliases, and compiler support. No other implementations.
     * public imports: core.pal, core.d
     * private imports: core.util

* std - Phobos
     * public imports: (none)
     * private imports: druntime

There are likely other suitable ways to organize it, but that's just what I could come up with after thinking through it a little.

I would prefer if each of those were in their own repository and even move some of them to Deimos or dub, but that would probably irritate a lot of people. I'd also prefer to have each of those in their own packages, but D is probably too deep in technical debt for that. (See also https://issues.dlang.org/show_bug.cgi?id=11666)

So, to make it more palatable, I suggest:
-----------------------------------------
   * `core.util` gets its own repository so it can be independently added to other repositories as a self-contained/freestanding dependency

   * `core.stdc`, `core.stdcpp`, `sys`, `core.pal`, and `core.d` all go into the druntime monolith like it is today.

   * Phobos remains much like it is today.

In the context of the discussion at hand, `std.traits`, `std.meta`, and other utilities can be moved to `core.util`. `core.util` can then be added as a dependency to dmd, druntime, and phobos. The rest will probably have to wait for D3 :/

Mike
Jan 07
parent Jacob Carlborg <doob me.com> writes:
On 2019-01-08 06:37, Mike Franklin wrote:

 I spent some time trying to think through some of the issues with 
 druntime, and came up with this:
 
 Right now, druntime is somewhat of a monolith trying to be too many things.
    * utilities (traits, string utilities, type conversion utilities, 
 etc...)
    * compiler lowerings
    * C standard library bindings
    * C++ standard library bindings
    * C standard library bindings
    * Operating system bindings
    * OS abstractions (thread, fibers, context switching, etc...)
    * Compiler lowerings
    * DWARF implementation
    * TLS implementation
    * GC
    * (probably more)
 
 So, I suggest something like this:
 ----------------------------------
 * core.util - a.k.a utiliD - Just utility implementations written in D 
 (e.g `std.traits`, `std.meta`, etc. No dependencies whatsoever. No 
 operating system or platform abstractions. No high-level language 
 features(e.g. exceptions)
      * public imports: (none)
      * private imports: (none)
 
 * core.stdc - C standard library bindings - libc functions verbatim; no 
 convenience or utility implementations
      * public imports: (none)
      * private imports: core.util
 
 * core.stdcpp - C++ standard library bindings - libstdc++ data 
 structures verbatim; no convenience or utility implementations
      * public imports: (none)
      * private imports: core.util
 
 * sys - OS/Platform bindings - operating system implementations 
 verbatim; no convenience or utility implementations
      * public imports: (none)
      * private imports: core.util
 
 * core.pal - Platform/OS abstractions - threads, fibers, context 
 switching, etc.
      * public imports: (none)
      * private imports: core.util, sys, core.libc
 
 * core.d - compiler support (compiler lowerings, runtime initialization, 
 TLS implementation, DWARF implementation, GC, etc...)
      * public imports : core.util
      * private imports : core.pal
 
 * druntime - Just a top-level package containing public imports, 
 aliases, and compiler support. No other implementations
      * public imports: core.pal, core.d
      * private imports: core.util
 
 * std - phobos
      * public imports: (none)
      * private imports: druntime
 
 There are likely other suitable ways to organize it, but that's just 
 what I could come up with after thinking through it a little.
 
 I would prefer if each of those were in their own repository and even 
 move some of them to Deimos or dub, but that would probably irritate a 
 lot of people.  I'd also prefer to have each of those in their own 
 packages, but D is probably too deep in technical debt for that.  (See 
 also https://issues.dlang.org/show_bug.cgi?id=11666)
 
 So, to make it more palatable, I suggest:
 -----------------------------------------
    * `core.util` gets own repository so it can be independently added to 
 other repositories as a self-contained/freestanding dependency
 
    * `core.stdc`, `core.stdcpp`, `sys`, `core.pal`, and `core.d` all go 
 into the druntime monolith like it is today.
 
    * phobos remains much like it is today.
 
 In the context of the discussion at hand, `std.traits`, `std.meta`, and 
 other utilities can be moved to `core.util`. `core.util` can then be 
 added as a dependency to dmd, druntime, and phobos.  The rest will 
 probably have to wait for D3 :/
I like this approach. -- /Jacob Carlborg
Jan 09
prev sibling parent reply kinke <noone nowhere.com> writes:
On Tuesday, 8 January 2019 at 01:44:08 UTC, Mike Franklin wrote:
 Anyway, my suggestion is to create a new library separate from 
 druntime and phobos that has no dependencies whatsoever (no 
 libc, no libstdc++, no OS dependencies, no druntime dependency, 
 etc.).  I mean it; **no dependencies**.  Not even object.d.  
 The only thing it should require is a D compiler.

 That library can then be imported by druntime, phobos, betterC 
 builds, or even the compiler itself. It will take strict 
 enforcement of the "no dependency" rule and good judgment to 
 keep the scope from ballooning, but it may be a good place for 
 things like `traits`, `meta` and others.
I also feel the need for at least one other base library. My focus is on the fundamental compiler support functions, like initializing/comparing/copying arrays and general associative array support, as they are fundamental to the language and its compilers (not talking about TypeInfos, ModuleInfos, Object etc.). I think we need such a base library in order to improve -betterC and its available language features.

The important thing would be to try to reduce the external dependencies of that lib to an absolute minimum, similar to Rust's core library (just 5 symbols: mem{cpy,cmp,set} + rust_begin_panic + rust_eh_personality), although we'll probably need some primitives, e.g., malloc/realloc/free.

If that's possible, using D for bare-metal targets without a C library (e.g., a future WebAssembly version with direct access to GC, or your own OS kernel/firmware) would probably become awesome, as you'd only need to implement maybe a dozen symbols.
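As a rough illustration of how small that symbol surface could be, here is a hedged sketch of the three memory primitives in plain D. On a freestanding target these would be the real `extern(C)` `memcpy`/`memset`/`memcmp`; they carry a hypothetical `_d_` prefix here only so the sketch can link and run next to libc on a hosted system.

```d
// Byte-by-byte reference implementations of the "handful of symbols"
// a freestanding base library would need a target to provide.
extern(C) void* _d_memcpy(void* dst, const void* src, size_t n)
{
    auto d = cast(ubyte*) dst;
    auto s = cast(const ubyte*) src;
    foreach (i; 0 .. n) d[i] = s[i];  // simple forward copy
    return dst;
}

extern(C) void* _d_memset(void* dst, int c, size_t n)
{
    auto d = cast(ubyte*) dst;
    foreach (i; 0 .. n) d[i] = cast(ubyte) c;  // fill with low byte of c
    return dst;
}

extern(C) int _d_memcmp(const void* a, const void* b, size_t n)
{
    auto pa = cast(const ubyte*) a, pb = cast(const ubyte*) b;
    foreach (i; 0 .. n)
        if (pa[i] != pb[i]) return pa[i] - pb[i];  // first differing byte decides
    return 0;
}
```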
Jan 08
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/8/19 4:23 PM, kinke wrote:
 On Tuesday, 8 January 2019 at 01:44:08 UTC, Mike Franklin wrote:
 Anyway, my suggestion is to create a new library separate from 
 druntime and phobos that has no dependencies whatsoever (no libc, no 
 libstdc++, no OS dependencies, no druntime dependency, etc.).  I mean 
 it; **no dependencies**.  Not even object.d. The only thing it should 
 require is a D compiler.

 That library can then be imported by druntime, phobos, betterC builds, 
 or even the compiler itself. It will take strict enforcement of the 
 "no dependency" rule and good judgment to keep the scope from 
 ballooning, but it may be a good place for things like `traits`, 
 `meta` and others.
I also feel the need for at least 1 another base library. My focus is on the fundamental compiler support functions, like initializing/comparing/copying arrays and general associative arrays support, as they are fundamental to the language and their compilers (not talking about TypeInfos, ModuleInfos, Object etc.).
This is self-contradictory, as AA's require TypeInfo. Though I agree with the goal. It's just not a "now" goal; we first need to fix these components so they DON'T depend on such things as TypeInfo. -Steve
Jan 08
parent reply Mike Franklin <slavo5150 yahoo.com> writes:
On Tuesday, 8 January 2019 at 21:26:51 UTC, Steven Schveighoffer 
wrote:

 I also feel the need for at least 1 another base library. My 
 focus is on the fundamental compiler support functions, like 
 initializing/comparing/copying arrays and general associative 
 arrays support, as they are fundamental to the language and 
 their compilers (not talking about TypeInfos, ModuleInfos, 
 Object etc.).
This is self-contradictory, as AA's require TypeInfo. Though I agree with the goal. It's just not a "now" goal, we first need to fix these components so they DON'T depend on such things as TypeInfo.
Steven is right (as usual) here. There has to be a serious effort to remove the dependency on runtime information that is available at compile-time. I tried quite hard on that in 2017~2018, but I ran into all sorts of problems.

Exhibit A: We can set an array's length in `@safe`, `nothrow`, `pure` code. But, it gets lowered to a runtime hook that is neither `@safe`, `nothrow`, nor `pure` (https://github.com/dlang/druntime/blob/e47a00bff935c3f079bb567a6ec97663ba384487/src/rt/lifetime.d#L1265). In other words, the compiler-runtime interface is a lie. So, if you try to rewrite that as a template to remove the dependency on `TypeInfo`, the template will run through the semantic phase of the compiler, and now you have to be honest, and it doesn't compile. If you then try to make all of the code that `_d_arraysetlengthT` calls `@safe`, `nothrow`, and `pure` to prevent breakage, you'll find that none of it compiles, because the "turtles at the bottom" (i.e. `memcpy`, `malloc`, etc...) aren't `pure` or whatever attribute constraint you're trying to apply.

Exhibit B: I tried to convert `_d_arraycast` to a template in https://github.com/dlang/druntime/pull/2268 and ran into similar problems. Some tried to help with a `pureMalloc` implementation in https://github.com/dlang/druntime/pull/2276, but that didn't go well either. Walter responded with "Since realloc() free's memory, it cannot ever be considered pure." Well, what the hell are we supposed to do then?

IMO, having dynamic stack allocation for arrays and strings will help (https://issues.dlang.org/show_bug.cgi?id=18788). GDC and LDC already provide this, but DMD's implementation is in druntime (https://github.com/dlang/druntime/blob/9a8edfb48e4842180c706ee26ebd8edb10be534/src/rt/alloca.d), so it requires linking in druntime, and now we're at a catch-22. I asked Walter for help with this, as it is beyond my current skills, but he said he didn't have time.

Here's what I think will help:
1. Get `alloca` or dynamic stack array allocation working. This will help a lot because we won't have to reach for `malloc` and friends for simple allocations like generating dynamic assert messages.
2. Convert `memcpy`, `memset`, and `memcmp` to strongly-typed D templates so they can be used in the implementations when converting runtime hooks to templates. I did some exploration on that and published my results at https://github.com/JinShil/memcpyD. Unfortunately, DMD is missing an AVX512 implementation, so I couldn't continue.

Lots of obstacles here, and I don't see it happening without Walter and Andrei making it a priority.

Mike
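The strongly-typed variant described in point 2 can be sketched roughly as follows. This is an illustration of the idea, not the actual memcpyD code, and it only handles plain-old-data types:

```d
// Sketch: the element type is a template parameter, so the copy can
// be specialized per size at compile time instead of dispatching on
// a runtime byte count (and no TypeInfo is needed).
void memcpyD(T)(T* dst, const T* src) @trusted pure nothrow @nogc
{
    static if (T.sizeof <= size_t.sizeof)
    {
        // Small POD types collapse to a single typed assignment.
        *dst = *src;
    }
    else
    {
        // Fallback byte loop; a tuned version would branch to
        // SIMD paths here based on T.sizeof.
        auto d = cast(ubyte*) dst;
        auto s = cast(const ubyte*) src;
        foreach (i; 0 .. T.sizeof) d[i] = s[i];
    }
}
```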
Jan 08
next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Wed, 09 Jan 2019 02:32:50 +0000, Mike Franklin wrote:
 I tried to convert `_d_arraycast` to a template in
 https://github.com/dlang/druntime/pull/2268 and ran into similar
 problems.  Some tried to help with a `pureMalloc` implementation in
 https://github.com/dlang/druntime/pull/2276, but that didn't go well
 either.  Walter responded with "Since realloc() free's memory, it cannot
 ever be considered pure."  Well, what the hell are we supposed to do
 then?
The specific thing that he replied to was having a public symbol for realloc that was considered pure. Perhaps a private fakePureRealloc() would be more palatable?
Jan 08
parent Mike Franklin <slavo5150 yahoo.com> writes:
On Wednesday, 9 January 2019 at 03:32:17 UTC, Neia Neutuladh 
wrote:

 The specific thing that he replied to was having a public 
 symbol for realloc that was considered pure. Perhaps a private 
 fakePureRealloc() would be more palatable?
Perhaps; I'm not sure. The `pureMalloc` implementation is a lot of clever hackery anyway, so I think it would be best to just implement stack-allocated dynamic arrays (i.e. https://issues.dlang.org/show_bug.cgi?id=18788) and avoid the games. That would have solved the immediate need I had for converting runtime hooks to templates, and would help some of that work move forward. Mike
Jan 08
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2019-01-09 03:32, Mike Franklin wrote:

 Here's what I think will help:
 1.  Get `alloca` or dynamic stack array allocation working.  This will 
 help a lot because we won't have to reach for `malloc` and friends for 
 simple allocations like generating dynamic assert messages
What's the problem with "alloca"?
 2.  Convert `memcpy`, `memset`, and `memcmp` to strongly-typed D 
 templates so they can be used in the implementations when converting 
 runtime hooks to templates.  I did some exploration on that and 
 published my results at https://github.com/JinShil/memcpyD.  
 Unfortunately, DMD is missing an AVX512 implementation so I couldn't 
 continue.
What do you mean "couldn't continue"? It's possible to implement "memcpy" without AVX512. Am I missing something? -- /Jacob Carlborg
Jan 09
parent reply Mike Franklin <slavo5150 yahoo.com> writes:
On Wednesday, 9 January 2019 at 11:01:46 UTC, Jacob Carlborg 
wrote:
 On 2019-01-09 03:32, Mike Franklin wrote:

 Here's what I think will help:
 1.  Get `alloca` or dynamic stack array allocation working.  
 This will help a lot because we won't have to reach for 
 `malloc` and friends for simple allocations like generating 
 dynamic assert messages
What's the problem with "alloca"?
In DMD you can't use it without linking in the runtime, but in LDC and GDC, you can. One of the goals of implementing these runtime hooks as templates is to make more features available in -betterC builds, or for pay-as-you-go runtime implementations. If you need to link in druntime to get `alloca`, you can't implement the runtime hooks as templates and have them work in -betterC.
 2.  Convert `memcpy`, `memset`, and `memcmp` to strongly-typed 
 D templates so they can be used in the implementations when 
 converting runtime hooks to templates.  I did some exploration 
 on that and published my results at 
 https://github.com/JinShil/memcpyD.  Unfortunately, DMD is 
 missing an AVX512 implementation so I couldn't continue.
What do you mean "couldn't continue"? It's possible to implement "memcpy" without AVX512. Am I missing something?
Yes, it's possible, but I don't think it will ever be accepted if it doesn't perform at least as well as the optimized versions in C or assembly that use AVX512 or other SIMD features. It needs to be at least as good as what libc provides, so we need to be able to leverage these unique hardware features to get the best performance. Mike
Jan 09
next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 9 January 2019 at 11:49:40 UTC, Mike Franklin wrote:
 On Wednesday, 9 January 2019 at 11:01:46 UTC, Jacob Carlborg 
 wrote:
 On 2019-01-09 03:32, Mike Franklin wrote:

 Here's what I think will help:
 1.  Get `alloca` or dynamic stack array allocation working.  
 This will help a lot because we won't have to reach for 
 `malloc` and friends for simple allocations like generating 
 dynamic assert messages
What's the problem with "alloca"?
In DMD you can't use it without linking in the runtime, but in LDC and GDC, you can. One of the goals of implementing these runtime hooks as templates is to make more features available in -betterC builds, or for pay-as-you-go runtime implementations. If you need to link in druntime to get `alloca`, you can't implement the runtime hooks as templates and have them work in -betterC.
 2.  Convert `memcpy`, `memset`, and `memcmp` to 
 strongly-typed D templates so they can be used in the 
 implementations when converting runtime hooks to templates.  
 I did some exploration on that and published my results at 
 https://github.com/JinShil/memcpyD.  Unfortunately, DMD is 
 missing an AVX512 implementation so I couldn't continue.
What do you mean "couldn't continue"? It's possible to implement "memcpy" without AVX512. Am I missing something?
Yes, it's possible, but I don't think it will ever be accepted if it doesn't perform at least as well as the optimized versions in C or assembly that use AVX512 or other SIMD features. It needs to be at least as good as what libc provides, so we need to be able to leverage these unique hardware features to get the best performance.
AVX512 concerns only a very small part of the processors on the market (Skylake, Cannon Lake and Cascade Lake). AMD will never implement it, and the number of people upgrading to one of the Lake CPUs from some recent chip is also not that great. I don't see why not having it implemented yet is blocking anything. People who really need AVX512 performance will have implemented memcpy themselves already, and the others will have to wait a little bit. It's not as if it couldn't be added later. I really don't understand the problem.

This said, another issue with memcpy that very often gets lost is that, because of the fancy benchmarking, its system performance cost is often wrongly assessed, and a lot of heroic effort is put into optimizing big block transfers, while in reality it's mostly called on small (postblit) to medium blocks. Linus Torvalds once had a rant on that subject on realworldtech: https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589
Jan 09
next sibling parent bioinfornatics <bioinfornatics fedoraproject.org> writes:
On Wednesday, 9 January 2019 at 12:31:13 UTC, Patrick Schluter 
wrote:
 On Wednesday, 9 January 2019 at 11:49:40 UTC, Mike Franklin 
 wrote:
 [...]
AVX512 concerns only a very small part of processors on the market (Skylake, Canon Lake and Cascade Lake). AMD will never implement it and the number of people upgrading to one of the lake cpus from some recent chip is also not that great. I don't see why not having it implemented yet is blocking anything. People who really need AVX512 performance will have implemented memcpy themselves already and for the others, they will have to wait a little bit. It's not as if it couldn't be added later. I really don't understand the problem. This said, another issue with memcpy that very often gets lost is that, because of the fancy benchmarking, its system performance cost is often wrongly assessed, and a lot of heroic efforts are put in optimizing big block transfers, while in reality it's mostly called on small (postblit) to medium blocks. Linus Torvalds had once a rant on that subject on realworldtech. https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589
By reading (quickly) these articles:

- https://lemire.me/blog/2018/04/19/by-how-much-does-avx-512-slow-down-your-cpu-a-first-experiment/
- https://lemire.me/blog/2018/09/07/avx-512-when-and-how-to-use-these-new-instructions/

it seems that using AVX512 can be good if you pin a thread to a core so that it processes only AVX512 instructions.
Jan 09
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2019 at 12:31:13PM +0000, Patrick Schluter via Digitalmars-d
wrote:
[...]
 This said, another issue with memcpy that very often gets lost is
 that, because of the fancy benchmarking, its system performance cost
 is often wrongly assessed, and a lot of heroic efforts are put in
 optimizing big block transfers, while in reality it's mostly called on
 small (postblit) to medium blocks.
EXACTLY!!!

Some time ago I took an interest in implementing the equivalent of strchr in the most optimized way possible. For that, I wrote several of my own algorithms and also perused the glibc implementation.

Eventually, I realized that the glibc implementation, which uses fancy 64-bit-word scanning with a lot of setup overhead and messy starting/trailing cases, is optimizing for very large scans, i.e., when the byte being sought occurs only rarely in a very large haystack. In those cases it's at the top of benchmarks. However, in the arguably more common case where the byte being sought occurs relatively frequently in small- to medium-sized haystacks, repeatedly searching the haystack incurs a ton of overhead setting up all that fancy machinery, branch hazards, and what-not, where a plain ole `while (*ptr++ != needle) {}` works much better.

I suspect many of the C library functions of this sort (incl. memcpy + friends) have a tendency to suffer from this sort of premature optimization.

Not to mention that often overly-specialized benchmarks of this sort fail to account for bias caused by the CPU's branch predictor learning the benchmark and the cache hierarchy amortizing the cost of repeatedly searching the same haystack -- things you rarely do in real-life applications. There's a big risk of your "super-optimized" algorithm ending up optimizing for an unrealistic use-case, but having only mediocre or sometimes even poor performance in real-world computations.
 Linus Torvalds had once a rant on that subject on realworldtech.
 https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589
Nice. T -- If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher
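For reference, the zero-setup byte loop being contrasted with glibc's word-scanning is only a few lines of D (a sketch; `simpleStrchr` is a made-up name):

```d
// The "plain ole" loop: no alignment setup, no word-at-a-time
// machinery -- often faster on short haystacks with frequent hits.
inout(char)* simpleStrchr(inout(char)* s, char needle)
{
    while (*s != needle)
    {
        if (*s == '\0') return null;  // hit the terminator: not found
        ++s;
    }
    return s;  // points at the first occurrence (or the terminator if needle is '\0')
}
```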
Jan 09
parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 9 January 2019 at 17:40:38 UTC, H. S. Teoh wrote:
 [snip]

 EXACTLY!!!

 Some time ago I took an interest in implementing the equivalent 
 of strchr in the most optimized way possible. For that, I wrote 
 several of my own algorithms and also perused the glibc 
 implementation.

 Eventually, I realized that the glibc implementation, which 
 uses fancy 64-bit-word scanning with a lot of setup overhead 
 and messy starting/trailing cases, is optimizing for very large 
 scans, i.e., when the byte being sought occurs only rarely in a 
 very large haystack.  In those cases it's at the top of 
 benchmarks.  However, in the arguably more common case where 
 the byte being sought occurs relatively frequently in small- to 
 medium-sized haystacks, repeatedly searching the haystack 
 incurs a ton of overhead setting up all that fancy machinery, 
 branch hazards, and what-not, where a plain ole `while (*ptr++ 
 != needle) {}` works much better.

 I suspect many of the C library functions of this sort (incl. 
 memcpy + friends) have a tendency to suffer from this sort of 
 premature optimization.

 Not to mention that often overly-specialized benchmarks of this 
 sort fail to account for bias caused by the CPU's branch 
 predictor learning the benchmark and the cache hierarchy 
 amortizing the cost of repeatedly searching the same haystack 
 -- things you rarely do in real-life applications.  There's a 
 big risk of your "super-optimized" algorithm ending up 
 optimizing for an unrealistic use-case, but having only 
 mediocre or sometimes even poor performance in real-world 
 computations.
One thing I like about libmir's sum function http://docs.algorithm.dlang.io/latest/mir_math_sum.html was that the algorithm you use to return the sum can be chosen with an enum on the template. So it's really a collection of different sum algorithms all in one. Set the default as something reasonable and then let the user decide if they want something else.
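That design can be sketched in a few lines of D. The `Summation` enum and both variants below are illustrative stand-ins, not mir's actual implementation:

```d
// One template, several algorithms, selected by an enum parameter
// with a sensible default -- the caller only opts in when needed.
enum Summation { naive, pairwise }

double sum(Summation algo = Summation.naive)(const double[] xs)
{
    static if (algo == Summation.naive)
    {
        double acc = 0;
        foreach (x; xs) acc += x;  // straightforward left-to-right sum
        return acc;
    }
    else
    {
        // Pairwise: recursively split to reduce rounding error on
        // long inputs.
        if (xs.length <= 2)
            return sum!(Summation.naive)(xs);
        const mid = xs.length / 2;
        return sum!algo(xs[0 .. mid]) + sum!algo(xs[mid .. $]);
    }
}
```

Callers get the default with `sum(data)` and can opt in with `sum!(Summation.pairwise)(data)`.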
Jan 09
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2019 at 06:55:30PM +0000, jmh530 via Digitalmars-d wrote:
[...]
 One thing I like about libmir's sum function
 http://docs.algorithm.dlang.io/latest/mir_math_sum.html
 was that the algorithm you use to return the sum can be chosen with an
 enum on the template. So it's really a collection of different sum
 algorithms all in one. Set the default as something reasonable and
 then let the user decide if they want something else.
That's an excellent idea. Have a generic default algorithm that performs reasonably well in typical use cases, but also give the user the power to choose a different algorithm if he knows that it would work better with his particular use case.

Empowering the user -- over time I've come to learn that this is always the best approach to API design. It's one that has the best chance of standing the test of time. Fancy APIs that don't pay enough attention to this principle tend to eventually fade into obscurity.

T -- I am Ohm of Borg. Resistance is voltage over current.
Jan 09
parent Mike Franklin <slavo5150 yahoo.com> writes:
On Wednesday, 9 January 2019 at 19:25:35 UTC, H. S. Teoh wrote:

 That's an excellent idea.  Have a generic default algorithm 
 that performs reasonably well in typical use cases, but also 
 give the user the power to choose a different algorithm if he 
 knows that it would work better with his particular use case.

 Empowering the user -- over time I've come to learn that this 
 is always the best approach to API design.  It's one that has 
 the best chance of standing the test of time.  Fancy APIs that 
 don't pay enough attention to this principle tend to eventually 
 fade into obscurity.
Yes, this is one of the benefits of making `memcpy(T)(T* dest, T* src)` instead of `memcpy(void* dest, void* src, size_t num)`. One can generate a `memcpy` at compile-time that is optimized for the machine that the program is being compiled on (or for). druntime could expose "memcpy configuration settings" for users to tune at compile-time.

But then you have to deal with distribution of binaries. If you are compiling a binary that you want to be able to run on all Intel 64-bit PCs, for example, you can't do that tuning at compile-time; it has to be done at runtime. Assuming my understanding is correct, Agner Fog's implementation sets a function pointer to the most optimized implementation for the machine the program is running on, based on an inspection of the CPU's capabilities at the first invocation of `memcpy`.

There's a lot of things like this to consider in order to create a professional `memcpy` implementation. Personally, I'd just like to put the infrastructure in place so those more talented than I can tune it. But as I said before, the first PR that puts said infrastructure in place needs to be justified, and I predict it will be difficult to overcome bias and perception. Reading the comments in this thread fills me with a little more optimism that I'm not the only one who thinks it's a good idea.

But, we still need dynamic stack allocation first before any of this can happen.

Mike
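The dispatch-on-first-call pattern attributed to Agner Fog's library can be sketched in D roughly like this. Everything here is illustrative: `hasFancySimd` stands in for a real feature probe (e.g. `core.cpuid`), and both implementations are the same plain loop:

```d
alias CopyFn = void function(void* dst, const void* src, size_t n);

// Baseline implementation that works everywhere.
void genericCopy(void* dst, const void* src, size_t n)
{
    auto d = cast(ubyte*) dst;
    auto s = cast(const ubyte*) src;
    foreach (i; 0 .. n) d[i] = s[i];
}

// Stand-in for an AVX/AVX512 path; same behavior in this sketch.
void simdCopy(void* dst, const void* src, size_t n)
{
    genericCopy(dst, src, n);
}

bool hasFancySimd() { return false; }  // placeholder CPU-feature probe

// First call lands here: pick the best implementation once, rebind
// the pointer, then delegate. Later calls go straight to the target.
void resolveAndCopy(void* dst, const void* src, size_t n)
{
    copyImpl = hasFancySimd() ? &simdCopy : &genericCopy;
    copyImpl(dst, src, n);
}

__gshared CopyFn copyImpl = &resolveAndCopy;
```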
Jan 09
prev sibling parent reply Mike Franklin <slavo5150 yahoo.com> writes:
On Wednesday, 9 January 2019 at 12:31:13 UTC, Patrick Schluter 
wrote:

 AVX512 concerns only a very small part of processors on the 
 market (Skylake, Canon Lake and Cascade Lake). AMD will never 
 implement it and the number of people upgrading to one of the 
 lake cpus from some recent chip is also not that great.
Yes, I agree, and even the newer chips have "Enhanced REP MOVSB and STOSB operation (ERMSB)" which can compensate. See https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf, section 3.7.6.
 I don't see why not having it implemented yet is blocking 
 anything. People who really need AVX512 performance will have 
 implemented memcpy themselves already and for the others, they 
 will have to wait a little bit. It's not as if it couldn't be 
 added later. I really don't understand the problem.
I remember analyzing other implementations of `memcpy` and they were all using AVX512. I had faith that the authors of those implementations (e.g. Agner Fog) knew more than me, so that was what I should be using. Perhaps I should revisit it and just do the best that DMD can do.

But also keep in mind that there's a strategy to getting things accepted in DMD and elsewhere. You are often battling perception. The single most challenging aspect of implementing `memcpy` in D is overcoming bias and justifying it to the obstructionists that see it as a complete waste of time. If I can't implement it with AVX512, simply for the purpose of measurement and comparison, it will be more difficult to justify.
 This said, another issue with memcpy that very often gets lost 
 is that, because of the fancy benchmarking, its system 
 performance cost is often wrongly assessed, and a lot of heroic 
 efforts are put in optimizing big block transfers, while in 
 reality it's mostly called on small (postblit) to medium 
 blocks. Linus Torvalds had once a rant on that subject on 
 realworldtech.
 https://www.realworldtech.com/forum/?threadid=168200&curpostid=168589
I understand. I also encountered a lot of difficulty getting consistent measurements in my exploration. Doing proper measurement and analysis for this kind of thing is a skill in and of itself.

You're right about the small copies being the norm. As part of my exploration, I wrote a logging `memcpy` wrapper to see what kind of copies DMD was doing when it compiled itself, and it was as you describe.

Perhaps I'll give it another go at a later time, but we need to get dynamic stack allocation working first, because many of the runtime hook implementations that will utilize `memcpy` do some error checking and assertions, and we need to be able to generate dynamic error messages for those assertions when the caller is `pure`. We need a solution to this (https://issues.dlang.org/show_bug.cgi?id=18788) first.

Mike
Jan 09
parent reply Ethan <gooberman gmail.com> writes:
On Thursday, 10 January 2019 at 00:10:18 UTC, Mike Franklin wrote:
 I remember analyzing other implementations of `memcpy` and they 
 were all using AVX512.  I had faith in the authors of those 
 implementations (e.g. Agner Fog) that they knew more than me, 
 so that was what I should be using. Perhaps I should revisit it 
 and just do the best that DMD can do.
AVX512 is a superset of AVX2, is a superset of AVX, is a superset of SSE. I expect the implementations you were looking at are actually implemented in SSE, where SSE2 is a baseline expectation for x64 processors. I've done some AVX2 code recently with 256-bit values. The performance is significantly slower on AMD processors. I assume their pipeline internally is still 128 bit as a result, and while my 256-bit code can run faster on Intel it needs to run on AMD so I've dropped to 128-bit instructions at most - effectively keeping my code SSE4.1 compatible. I've done a memset_pattern4[1] implementation in SSE previously. The important instruction group is _mm_stream. Which, you will note, was an instruction group first introduced in SSE1 and hasn't had additional writing stream functions added since SSE 4.1[2]. [1] https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/memset_pattern4.3.html [2] https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=5119,5452,5443,5910,5288,5119,5249,5231&text=_mm_stream
Jan 10
next sibling parent Ethan <gooberman gmail.com> writes:
On Thursday, 10 January 2019 at 10:13:57 UTC, Ethan wrote:
 I've done a memset_pattern4[1] implementation in SSE 
 previously. The important instruction group is _mm_stream. 
 Which, you will note, was an instruction group first introduced 
 in SSE1 and hasn't had additional writing stream functions 
 added since SSE 4.1[2].
Where's the edit button? The last writing stream function was added in SSE2; a streaming load was added in SSE 4.1. I believe I used that load when optimising string compares.
Jan 10
prev sibling parent reply luckoverthere <luckoverthere gmail.cm> writes:
On Thursday, 10 January 2019 at 10:13:57 UTC, Ethan wrote:
 On Thursday, 10 January 2019 at 00:10:18 UTC, Mike Franklin 
 wrote:
 I remember analyzing other implementations of `memcpy` and 
 they were all using AVX512.  I had faith in the authors of 
 those implementations (e.g. Agner Fog) that they knew more 
 than me, so that was what I should be using. Perhaps I should 
 revisit it and just do the best that DMD can do.
AVX512 is a superset of AVX2, is a superset of AVX, is a superset of SSE. I expect the implementations you were looking at are actually implemented in SSE, where SSE2 is a baseline expectation for x64 processors. I've done some AVX2 code recently with 256-bit values. The performance is significantly slower on AMD processors. I assume their pipeline internally is still 128 bit as a result, and while my 256-bit code can run faster on Intel it needs to run on AMD so I've dropped to 128-bit instructions at most - effectively keeping my code SSE4.1 compatible. I've done a memset_pattern4[1] implementation in SSE previously. The important instruction group is _mm_stream. Which, you will note, was an instruction group first introduced in SSE1 and hasn't had additional writing stream functions added since SSE 4.1[2]. [1] https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/memset_pattern4.3.html [2] https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=5119,5452,5443,5910,5288,5119,5249,5231&text=_mm_stream
That's disappointing to learn. Ryzen has four 128-bit AVX units: 2 of them can only do addition and the other 2 can only do multiplication. Not sure how memory is shared between the units, but if it isn't, it'd need a copy to do an addition followed by a multiplication.
Jan 10
parent reply Ethan <gooberman gmail.com> writes:
On Thursday, 10 January 2019 at 21:01:09 UTC, luckoverthere wrote:
 That's disappointing to learn. Ryzen has four 128-bit AVX 
 units, 2 of them can only do addition and the other 2 can only 
 do multiplication. Not sure how the memory is shared between 
 units but if it isn't then it'd need to copy to be able to do 
 an addition then a multiplication.
The good news though is that Ryzen's 128-bit pipeline outperforms my Skylake i7 with this code. So you could say they've optimised for the majority use case. It's reaaaaaally beneficial to do 256-bit logic for my particular use case here since I'm sampling and operating on 8 32-bit values at a time to produce a 32-bit output. But eh, I've gotta write for the build farm hardware.
Jan 11
parent reply bioinfornatics <bioinfornatics fedoraproject.org> writes:
On Friday, 11 January 2019 at 09:36:09 UTC, Ethan wrote:
 On Thursday, 10 January 2019 at 21:01:09 UTC, luckoverthere 
 wrote:
 That's disappointing to learn. Ryzen has four 128-bit AVX 
 units, 2 of them can only do addition and the other 2 can only 
 do multiplication. Not sure how the memory is shared between 
 units but if it isn't then it'd need to copy to be able to do 
 an addition then a multiplication.
The good news though is that Ryzen's 128-bit pipeline outperforms my Skylake i7 with this code. So you could say they've optimised for the majority usecase. It's reaaaaaally beneficial to do 256-bit logic for my particular use case here since I'm sampling and operating on 8 32-bit values at a time to produce a 32-bit output. But eh, I've gotta write for the build farm hardware.
Hi Ethan, could you share a piece of code to do that? Thank you
Jan 11
parent reply Ethan <gooberman gmail.com> writes:
On Friday, 11 January 2019 at 11:10:10 UTC, bioinfornatics wrote:
 Hi Ethan, could you share a piece of code to do that?

 Thank you
Not really. 1) It's very context specific 2) It's for my current employer and is subject to the usual code disclosure NDAs
Jan 11
parent bioinfornatics <bioinfornatics fedoraproject.org> writes:
On Friday, 11 January 2019 at 11:47:20 UTC, Ethan wrote:
 On Friday, 11 January 2019 at 11:10:10 UTC, bioinfornatics 
 wrote:
 Hi Ethan, could you share a piece of code to do that?

 Thank you
Not really. 1) It's very context specific 2) It's for my current employer and is subject to the usual code disclosure NDAs
OK I understand, no problem 😉 So I could try to use this idea for training. As an example: take 8 values of 32 bits and return the sum, or something similar. But I thought AMD had 2 units for addition and 2 for multiplication. I need to get a better understanding of this topic 🤔
Jan 11
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2019-01-09 12:49, Mike Franklin wrote:

 In DMD you can't use it without linking in the runtime, but in LDC and 
 GDC, you can.  One of the goals of implementing these runtime hooks as 
 templates is to make more features available in -betterC builds, or for 
 pay-as-you-go runtime implementations. If you need to link in druntime 
 to get `alloca`, you can't implement the runtime hooks as templates and 
 have them work in -betterC.
Ah, I see.
 Yes, it's possible, but I don't think it will ever be accepted if it 
 doesn't perform at least as well as the optimized versions in C or 
 assembly that use AVX512 or other SIMD features.  It needs to be at 
 least as good as what libc provides, so we need to be able to leverage 
 these unique hardware features to get the best performance.
Perhaps it could be considered as a fallback when a "memcpy" isn't available. -- /Jacob Carlborg
Jan 09
parent Mike Franklin <slavo5150 yahoo.com> writes:
On Wednesday, 9 January 2019 at 19:24:28 UTC, Jacob Carlborg 
wrote:

 Yes, it's possible, but I don't think it will ever be accepted 
 if it doesn't perform at least as well as the optimized 
 versions in C or assembly that use AVX512 or other SIMD 
 features.  It needs to be at least as good as what libc 
 provides, so we need to be able to leverage these unique 
 hardware features to get the best performance.
Perhaps it could be considered as a fallback when a "memcpy" isn't available.
I'm not sure what you mean. DMD currently links in libc, so `memcpy` is always available. Also, it's difficult for me to articulate, but we don't want `void* memcpy(void* destination, const void* source, size_t num)` rewritten in D. We need `void memcpy(T)(T* destination, const T* source)` or some other strongly typed template like that. And as an aside, thanks to https://github.com/dlang/dmd/pull/8504 we now have to be careful about the order of arguments. Anyway, I'm not sure there's much point in hashing this out right now. We need dynamic stack allocation first before any of this can happen because the runtime hooks need to be able to generate dynamic assertion messages in -betterC, and there's only one person I know of that can do that (Walter), and I don't think it's a priority for him right now. Mike
Jan 09