digitalmars.D - Use SIMD to accelerate comment lexing
- Walter Bright (2/2) Jun 01 2015 https://issues.dlang.org/show_bug.cgi?id=14641
- Brian Schott (3/6) Jun 01 2015 libdparse does this already. I added some information to that bug
- Walter Bright (2/4) Jun 01 2015 Thank you!
- Jonathan M Davis (6/13) Jun 01 2015 Honestly, I'm quite impressed with what I've heard you say about
- luminousone (5/8) Jun 01 2015 Looking at that code, I would think that some well placed
- Manu via Digitalmars-d (7/9) Jun 02 2015 How do you measure this? Is there a convenient setup that will produce
- Walter Bright (5/16) Jun 02 2015 Compile Phobos.
- Jonathan M Davis (4/6) Jun 02 2015 Plus, within a few months, we may have switched over to ddmd
- Iain Buclaw via Digitalmars-d (3/9) Jun 02 2015 And what would the D way be?
- Jonathan M Davis (4/19) Jun 02 2015 We have simd in the language, so we can use that, whereas Manu
- Iain Buclaw via Digitalmars-d (5/24) Jun 02 2015 I was being deliberately quizzical because there are different takes on
- Jonathan M Davis (7/13) Jun 02 2015 Well, Manu would know that a lot better than I would. I hadn't
- weaselcat (4/17) Jun 02 2015 D's simd library is difficult to use in comparison with gcc or
- Manu via Digitalmars-d (26/44) Jun 02 2015 I'll wear responsibility for this, but std.simd is proving really hard
- Johannes Pfau (8/14) Jun 02 2015 There are qemu images for most architectures supported by debian:
- Jacob Carlborg (10/21) Jun 03 2015 Isn't it possible to proceed without forceinline, to be able to finish
- Manu via Digitalmars-d (9/26) Jun 03 2015 The codegen is everything. Functionality is the easy part here ;)
- Iain Buclaw via Digitalmars-d (3/25) Jun 03 2015 Well, the compiler knows whether the types are supported natively at lea...
- Walter Bright (5/29) Jun 03 2015 I suggest not worrying about forceinline for the moment, and just write ...
- Andrei Alexandrescu (4/11) Jun 02 2015 Probably building Phobos is a good baseline. If simple experiments show
- deadalnix (13/26) Jun 02 2015 Well, I discussed that with clang people a while ago and here are
- Walter Bright (2/11) Jun 03 2015 Line numbers have to be kept track of as well.
- deadalnix (3/24) Jun 03 2015 They retrieve line number lazily when needed, with various
- Walter Bright (3/12) Jun 04 2015 Hmm. There's no way to get the line number without counting LFs, and tha...
- Brian Schott (9/11) Jun 04 2015 It would be nice if it was that simple.
- Walter Bright (2/10) Jun 04 2015 Yeah, you're right
- deadalnix (22/39) Jun 04 2015 Yes, the first time you query file number, clang build metadata
- Walter Bright (4/37) Jun 04 2015 It's an interesting approach. I generally shoot for making the debug bui...
- deadalnix (4/8) Jun 04 2015 In C++, you would not need line number for header, even in debug
https://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?
Jun 01 2015
On Monday, 1 June 2015 at 19:38:59 UTC, Walter Bright wrote:https://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?libdparse does this already. I added some information to that bug report that may be useful.
Jun 01 2015
On 6/1/2015 1:18 PM, Brian Schott wrote:libdparse does this already. I added some information to that bug report that may be useful.Thank you!
Jun 01 2015
On Monday, 1 June 2015 at 20:18:19 UTC, Brian Schott wrote:On Monday, 1 June 2015 at 19:38:59 UTC, Walter Bright wrote:Honestly, I'm quite impressed with what I've heard you say about the lengths that you've gone to in order to optimize your lexer and parser. _Very_ cool. And all the better if dmd gets some of the same optimizations. - Jonathan M Davishttps://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?libdparse does this already. I added some information to that bug report that may be useful.
Jun 01 2015
On Monday, 1 June 2015 at 19:38:59 UTC, Walter Bright wrote:https://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?Looking at that code, I would think that some well placed prefetch and Non Temporal move intrinsic's, would do more good then anything else. CPU speculative read ahead is not so smart as one would think it otta be.
Jun 01 2015
On 2 June 2015 at 05:39, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:https://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?How do you measure this? Is there a convenient setup that will produce a realistic test environment? This is more awkward in C than in D, it needs a different implementation for each compiler... will DMD's CI build with all common C compilers to prove that the implementations are correct?
Jun 02 2015
On 6/2/2015 5:27 AM, Manu via Digitalmars-d wrote:On 2 June 2015 at 05:39, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:Time how long the compiler takes when running a non-optimized build.https://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?How do you measure this?Is there a convenient setup that will produce a realistic test environment?Compile Phobos.This is more awkward in C than in D, it needs a different implementation for each compiler... will DMD's CI build with all common C compilers to prove that the implementations are correct?Just make it work with one compiler on one platform, the most convenient one. We can extend it to others later.
Jun 02 2015
On Tuesday, 2 June 2015 at 15:08:07 UTC, Walter Bright wrote:Just make it work with one compiler on one platform, the most convenient one. We can extend it to others later.Plus, within a few months, we may have switched over to ddmd (hopefully), in which case, you can just do it the D way. - Jonathan M Davis
Jun 02 2015
On 2 June 2015 at 19:11, Jonathan M Davis via Digitalmars-d < digitalmars-d puremagic.com> wrote:On Tuesday, 2 June 2015 at 15:08:07 UTC, Walter Bright wrote:And what would the D way be?Just make it work with one compiler on one platform, the most convenient one. We can extend it to others later.Plus, within a few months, we may have switched over to ddmd (hopefully), in which case, you can just do it the D way.
Jun 02 2015
On Tuesday, 2 June 2015 at 17:24:09 UTC, Iain Buclaw wrote:On 2 June 2015 at 19:11, Jonathan M Davis via Digitalmars-d < digitalmars-d puremagic.com> wrote:We have simd in the language, so we can use that, whereas Manu was talking about how he would have to do it differently in C/C++. - Jonathan M DavisOn Tuesday, 2 June 2015 at 15:08:07 UTC, Walter Bright wrote:And what would the D way be?Just make it work with one compiler on one platform, the most convenient one. We can extend it to others later.Plus, within a few months, we may have switched over to ddmd (hopefully), in which case, you can just do it the D way.
Jun 02 2015
On 2 June 2015 at 19:25, Jonathan M Davis via Digitalmars-d < digitalmars-d puremagic.com> wrote:On Tuesday, 2 June 2015 at 17:24:09 UTC, Iain Buclaw wrote:I was being deliberately quizzical because there are different takes on what you would call simd in the language, what set of types are available to you, what intrinsics are exposed (and how they are exposed), etc.On 2 June 2015 at 19:11, Jonathan M Davis via Digitalmars-d < digitalmars-d puremagic.com> wrote: On Tuesday, 2 June 2015 at 15:08:07 UTC, Walter Bright wrote:We have simd in the language, so we can use that, whereas Manu was talking about how he would have to do it differently in C/C++. - Jonathan M DavisJust make it work with one compiler on one platform, the most convenientone. We can extend it to others later.Plus, within a few months, we may have switched over to ddmd (hopefully), in which case, you can just do it the D way. And what would the D way be?
Jun 02 2015
On Tuesday, 2 June 2015 at 17:54:38 UTC, Iain Buclaw wrote:I was being deliberately quizzical because there are different takes on what you would call simd in the language, what set of types are available to you, what intrinsics are exposed (and how they are exposed), etc.Well, Manu would know that a lot better than I would. I hadn't even heard of SIMD before he got Walter to put it into the language, and I have yet to use it, though I do think that I need to look into it at some point to see where I could take advantage of it. - Jonathan M Davis
Jun 02 2015
On Tuesday, 2 June 2015 at 18:20:51 UTC, Jonathan M Davis wrote:On Tuesday, 2 June 2015 at 17:54:38 UTC, Iain Buclaw wrote:D's simd library is difficult to use in comparison with gcc or clang's extensions to C/C++. bye,I was being deliberately quizzical because there are different takes on what you would call simd in the language, what set of types are available to you, what intrinsics are exposed (and how they are exposed), etc.Well, Manu would know that a lot better than I would. I hadn't even heard of SIMD before he got Walter to put it into the language, and I have yet to use it, though I do think that I need to look into it at some point to see where I could take advantage of it. - Jonathan M Davis
Jun 02 2015
On 3 June 2015 at 07:18, weaselcat via Digitalmars-d <digitalmars-d puremagic.com> wrote:On Tuesday, 2 June 2015 at 18:20:51 UTC, Jonathan M Davis wrote:I'll wear responsibility for this, but std.simd is proving really hard for me to finish. I think in order to get something in there to start with, I need to reduce the scope to the simplest bits, get them in, then build outwards. It's fairly large to cover everything I think is important, and there's a few tools missing still; I can't finish without some way to know the SIMD flags fed to the compiler from the command line (some standard versions?), and it's also difficult to resolve without forceinline of some sort. I've reached situations where the compiler(/s) just don't do what I want it to. Also, I think the stack of simd function influence the compilers inline heuristics, and even thought the compiler decides to inline simd functions, presence of many of them in an outer function seems to reduce the probability that the outer function will be inlined as it should. forceinline needs to be a hard statement to the compiler, and ideally, a forceinline call tree shouldn't improperly influence the compilers inline decisions for outer functions. As an aside, I need a test environment for each compiler, targetting x86, x64 and arm at least, where I can submit some code, and have it run the unittests on a matrix of appropriate targets. (ideally PPC and MIPS would also be included, so they can influence design decisions.) Does any such test system exist? A web service to provide this would be invaluable... I don't have all those systems available to me.On Tuesday, 2 June 2015 at 17:54:38 UTC, Iain Buclaw wrote:D's simd library is difficult to use in comparison with gcc or clang's extensions to C/C++. bye,I was being deliberately quizzical because there are different takes on what you would call simd in the language, what set of types are available to you, what intrinsics are exposed (and how they are exposed), etc.Well, Manu would know that a lot better than I would. I hadn't even heard of SIMD before he got Walter to put it into the language, and I have yet to use it, though I do think that I need to look into it at some point to see where I could take advantage of it. - Jonathan M Davis
Jun 02 2015
Am Wed, 3 Jun 2015 09:08:52 +1000 schrieb Manu via Digitalmars-d <digitalmars-d puremagic.com>:As an aside, I need a test environment for each compiler, targetting x86, x64 and arm at least, where I can submit some code, and have it run the unittests on a matrix of appropriate targets. (ideally PPC and MIPS would also be included, so they can influence design decisions.) Does any such test system exist? A web service to provide this would be invaluable... I don't have all those systems available to me.There are qemu images for most architectures supported by debian: https://people.debian.org/~aurel32/qemu/ You'll need a very recent qemu but it works better than expected. (I hope to utilize these for cross compiler testing in the future. But someone first needs to implement cross compiler testing in the dmd test suite...)
Jun 02 2015
On 2015-06-03 01:08, Manu via Digitalmars-d wrote:It's fairly large to cover everything I think is important, and there's a few tools missing still; I can't finish without some way to know the SIMD flags fed to the compiler from the command line (some standard versions?), and it's also difficult to resolve without forceinline of some sort.Isn't it possible to proceed without forceinline, to be able to finish the functionality. I understand that you think it's useless for performance reasons, but is it enough to get the functionality correct?As an aside, I need a test environment for each compiler, targetting x86, x64 and arm at least, where I can submit some code, and have it run the unittests on a matrix of appropriate targets. (ideally PPC and MIPS would also be included, so they can influence design decisions.) Does any such test system exist? A web service to provide this would be invaluable... I don't have all those systems available to me.Travis CI [1] can be used for x86-64, Linux and OS X. There's also a service that uses Windows for their hosts, but I can't remember the name right now. [1] https://travis-ci.org -- /Jacob Carlborg
Jun 03 2015
On 3 June 2015 at 17:50, Jacob Carlborg via Digitalmars-d <digitalmars-d puremagic.com> wrote:On 2015-06-03 01:08, Manu via Digitalmars-d wrote:The codegen is everything. Functionality is the easy part here ;) Most things are already correct, but I can't confidently proof out the codegen. The main blocker though is that I don't know what simd level the user requested on the command line. The library has no idea what hardware features to target without explicit statement by the user.It's fairly large to cover everything I think is important, and there's a few tools missing still; I can't finish without some way to know the SIMD flags fed to the compiler from the command line (some standard versions?), and it's also difficult to resolve without forceinline of some sort.Isn't it possible to proceed without forceinline, to be able to finish the functionality. I understand that you think it's useless for performance reasons, but is it enough to get the functionality correct?I use travis. I was thinking smething more d-specific, along the lines of DPaste...As an aside, I need a test environment for each compiler, targetting x86, x64 and arm at least, where I can submit some code, and have it run the unittests on a matrix of appropriate targets. (ideally PPC and MIPS would also be included, so they can influence design decisions.) Does any such test system exist? A web service to provide this would be invaluable... I don't have all those systems available to me.Travis CI [1] can be used for x86-64, Linux and OS X. There's also a service that uses Windows for their hosts, but I can't remember the name right now.
Jun 03 2015
On 3 June 2015 at 11:28, Manu via Digitalmars-d <digitalmars-d puremagic.comwrote:On 3 June 2015 at 17:50, Jacob Carlborg via Digitalmars-d <digitalmars-d puremagic.com> wrote:Well, the compiler knows whether the types are supported natively at least, and you can probe this information using CTFE.On 2015-06-03 01:08, Manu via Digitalmars-d wrote:theIt's fairly large to cover everything I think is important, and there's a few tools missing still; I can't finish without some way to know the SIMD flags fed to the compiler from the command line (some standard versions?), and it's also difficult to resolve without forceinline of some sort.Isn't it possible to proceed without forceinline, to be able to finishfunctionality. I understand that you think it's useless for performance reasons, but is it enough to get the functionality correct?The codegen is everything. Functionality is the easy part here ;) Most things are already correct, but I can't confidently proof out the codegen. The main blocker though is that I don't know what simd level the user requested on the command line. The library has no idea what hardware features to target without explicit statement by the user.
Jun 03 2015
On 6/2/2015 4:08 PM, Manu via Digitalmars-d wrote:I'll wear responsibility for this, but std.simd is proving really hard for me to finish. I think in order to get something in there to start with, I need to reduce the scope to the simplest bits, get them in, then build outwards. It's fairly large to cover everything I think is important, and there's a few tools missing still; I can't finish without some way to know the SIMD flags fed to the compiler from the command line (some standard versions?), and it's also difficult to resolve without forceinline of some sort. I've reached situations where the compiler(/s) just don't do what I want it to. Also, I think the stack of simd function influence the compilers inline heuristics, and even thought the compiler decides to inline simd functions, presence of many of them in an outer function seems to reduce the probability that the outer function will be inlined as it should. forceinline needs to be a hard statement to the compiler, and ideally, a forceinline call tree shouldn't improperly influence the compilers inline decisions for outer functions.I suggest not worrying about forceinline for the moment, and just write the code as if it existed.As an aside, I need a test environment for each compiler, targetting x86, x64 and arm at least, where I can submit some code, and have it run the unittests on a matrix of appropriate targets. (ideally PPC and MIPS would also be included, so they can influence design decisions.) Does any such test system exist? A web service to provide this would be invaluable... I don't have all those systems available to me.Just make it work on the machine you have, and prove it out on that machine. Then worry about porting it.
Jun 03 2015
On 6/2/15 5:27 AM, Manu via Digitalmars-d wrote:On 2 June 2015 at 05:39, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:Probably building Phobos is a good baseline. If simple experiments show measurable speedup with Phobos, it's likely to be worth the effort. -- Andreihttps://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?How do you measure this? Is there a convenient setup that will produce a realistic test environment?
Jun 02 2015
On Tuesday, 2 June 2015 at 12:27:38 UTC, Manu wrote:On 2 June 2015 at 05:39, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:Well, I discussed that with clang people a while ago and here are how they do it and their measurement : You go though character and look for a '/'. When you hit one, you check if the character before it is a *, and if so, you have the end of the comment. There is obviously various edges cases to take into account, but that is the general idea. You can find the code in Lexer::SkipBlockComment in clang/lib/Lex/Lexer.cpp Various benchmark on their side have shown that alignment is desirable before having vector operations to kick in. They used to have an AVX implementation, but it seems to be gone now, I'm not sure why at this stage.https://issues.dlang.org/show_bug.cgi?id=14641 Manu, our resident god of vector instructions, do you want to take this on?How do you measure this? Is there a convenient setup that will produce a realistic test environment? This is more awkward in C than in D, it needs a different implementation for each compiler... will DMD's CI build with all common C compilers to prove that the implementations are correct?
Jun 02 2015
On 6/2/2015 5:45 PM, deadalnix wrote:Well, I discussed that with clang people a while ago and here are how they do it and their measurement : You go though character and look for a '/'. When you hit one, you check if the character before it is a *, and if so, you have the end of the comment. There is obviously various edges cases to take into account, but that is the general idea. You can find the code in Lexer::SkipBlockComment in clang/lib/Lex/Lexer.cpp Various benchmark on their side have shown that alignment is desirable before having vector operations to kick in. They used to have an AVX implementation, but it seems to be gone now, I'm not sure why at this stage.Line numbers have to be kept track of as well.
Jun 03 2015
On Wednesday, 3 June 2015 at 22:50:52 UTC, Walter Bright wrote:On 6/2/2015 5:45 PM, deadalnix wrote:They retrieve line number lazily when needed, with various mechanism to speedup the lookup.Well, I discussed that with clang people a while ago and here are how they do it and their measurement : You go though character and look for a '/'. When you hit one, you check if the character before it is a *, and if so, you have the end of the comment. There is obviously various edges cases to take into account, but that is the general idea. You can find the code in Lexer::SkipBlockComment in clang/lib/Lex/Lexer.cpp Various benchmark on their side have shown that alignment is desirable before having vector operations to kick in. They used to have an AVX implementation, but it seems to be gone now, I'm not sure why at this stage.Line numbers have to be kept track of as well.
Jun 03 2015
On 6/3/2015 7:05 PM, deadalnix wrote:On Wednesday, 3 June 2015 at 22:50:52 UTC, Walter Bright wrote:Hmm. There's no way to get the line number without counting LFs, and that means searching for them.On 6/2/2015 5:45 PM, deadalnix wrote:They retrieve line number lazily when needed, with various mechanism to speedup the lookup.You go though character and look for a '/'. When you hit one, you check if the character before it is a *, and if so, you have the end of the comment. There is obviously various edges cases to take into account, but that is the general idea.Line numbers have to be kept track of as well.
Jun 04 2015
On Thursday, 4 June 2015 at 18:39:02 UTC, Walter Bright wrote:Hmm. There's no way to get the line number without counting LFs, and that means searching for them.It would be nice if it was that simple. EndOfLine: \u000D \u000A \u000D \u000A \u2028 \u2029 EndOfFile
Jun 04 2015
On 6/4/2015 1:44 PM, Brian Schott wrote:It would be nice if it was that simple. EndOfLine: \u000D \u000A \u000D \u000A \u2028 \u2029 EndOfFileYeah, you're right
Jun 04 2015
On Thursday, 4 June 2015 at 18:39:02 UTC, Walter Bright wrote:On 6/3/2015 7:05 PM, deadalnix wrote:Yes, the first time you query file number, clang build metadata about new line by going through the file's content and finding position of new lines. The process uses vector operation as well. Apparently, they think it is better to do that way for various reasons: - Position tracking is more compact (and position is embedded in all expression, declaration, and more) which reduce memory footprint bu quite a lot. - It makes the lexer simpler and faster. - You don't need to track new lines if you don't use them. If you don't emit debug infos in C++, and have no error, most line number are not used (not sure in D, because various language facilities like bound checking uses line number, but that is a win in C++). - Debug emission have some predictable access pattern, and algorithm to find line number from an offset in the file are special cased to handle it. - Finding new line can be vectorized on the whole file. t cannot be vectorized when done in // with lexing. Once again, I'm not sure this is a win in D, because we need line number more than in C++, but it seems to be a win in C++.On Wednesday, 3 June 2015 at 22:50:52 UTC, Walter Bright wrote:Hmm. There's no way to get the line number without counting LFs, and that means searching for them.On 6/2/2015 5:45 PM, deadalnix wrote:They retrieve line number lazily when needed, with various mechanism to speedup the lookup.You go though character and look for a '/'. When you hit one, you check if the character before it is a *, and if so, you have the end of the comment. There is obviously various edges cases to take into account, but that is the general idea.Line numbers have to be kept track of as well.
Jun 04 2015
On 6/4/2015 2:44 PM, deadalnix wrote:On Thursday, 4 June 2015 at 18:39:02 UTC, Walter Bright wrote:It's an interesting approach. I generally shoot for making the debug builds the fastest, because that's when people are in the edit-compile-debug loop. And the debug output needs line numbers :-)On 6/3/2015 7:05 PM, deadalnix wrote:Yes, the first time you query file number, clang build metadata about new line by going through the file's content and finding position of new lines. The process uses vector operation as well. Apparently, they think it is better to do that way for various reasons: - Position tracking is more compact (and position is embedded in all expression, declaration, and more) which reduce memory footprint bu quite a lot. - It makes the lexer simpler and faster. - You don't need to track new lines if you don't use them. If you don't emit debug infos in C++, and have no error, most line number are not used (not sure in D, because various language facilities like bound checking uses line number, but that is a win in C++). - Debug emission have some predictable access pattern, and algorithm to find line number from an offset in the file are special cased to handle it. - Finding new line can be vectorized on the whole file. t cannot be vectorized when done in // with lexing. Once again, I'm not sure this is a win in D, because we need line number more than in C++, but it seems to be a win in C++.On Wednesday, 3 June 2015 at 22:50:52 UTC, Walter Bright wrote:Hmm. There's no way to get the line number without counting LFs, and that means searching for them.On 6/2/2015 5:45 PM, deadalnix wrote:They retrieve line number lazily when needed, with various mechanism to speedup the lookup.You go though character and look for a '/'. When you hit one, you check if the character before it is a *, and if so, you have the end of the comment. There is obviously various edges cases to take into account, but that is the general idea.Line numbers have to be kept track of as well.
Jun 04 2015
On Friday, 5 June 2015 at 00:30:44 UTC, Walter Bright wrote:It's an interesting approach. I generally shoot for making the debug builds the fastest, because that's when people are in the edit-compile-debug loop. And the debug output needs line numbers :-)In C++, you would not need line number for header, even in debug mode, unless they contains various template and/or implementation, which I assume would be false in many cases.
Jun 04 2015