digitalmars.D.ldc - Win64, merge-2.067, LLVM master, VS 2015 - current status
- kinke (41/41) May 10 2015 Hey guys,
- Dan Olson (2/4) May 10 2015 Where does core.thread fail its test?
- kinke (14/15) May 12 2015 Apparently an access violation in fiber_switchContext().
- Dan Olson (9/24) May 12 2015 I wonder if it issue #666 [1]? It looks like the same failure I
- Kai Nacke (6/11) May 10 2015 The failure may be caused by the assembler code because of
- kinke (37/37) May 14 2015 Most interesting: many of the failing unittests are caused by a
- Dan Olson (5/17) May 14 2015 Very cool. The math unittest failures could just be the ones that
- kinke (19/19) May 16 2015 After intensive debugging, the strange issue responsible for most
- Dan Olson (9/13) May 17 2015 Could sections_ldc.initSections() be neglecting the BSS section? I
- Dan Olson (5/16) May 17 2015 I don't have a Windows host available right now, but am assuming that
- kinke (12/12) May 17 2015 Thx for the hint, Dan!!!
- Temtaime (5/5) May 20 2015 Hi all !
- Temtaime (3/3) May 20 2015 For example that code is ok: http://goo.gl/QMfpzg
- John Colvin (4/9) May 21 2015 Rule of thumb: don't use array ops for short arrays. They are
- David Nadlinger (4/7) May 21 2015 We should be able to do better, though, especially for cases like
- kinke (9/11) May 21 2015 Definitely, if GDC manages to do so, we should too.
- Kai Nacke (5/14) May 23 2015 Now ldc should inline the arrayops. If you have some benchmarks
- Kai Nacke (10/15) May 23 2015 Should be fixed. Now ldc generates:
- kinke (17/17) May 23 2015 Update:
- kinke (3/3) May 24 2015 With VS 2013, 2 std.conv unittests fail, due to
- Elie Morisse (2/20) May 24 2015 Wonderful!
Hey guys,

as a teaser for the curious, here's the current status with a bleeding edge Win64 environment:

* Visual Studio 2015 CTP
* LLVM master (5de9960)
* LDC: merge-2.067
** *.conf.in files manually modified to include "-L/LARGEADDRESSAWARE:NO" as default option
** druntime:
   + https://github.com/kinke/druntime/commit/1add4f0d401717acc42d 2600f9e85ca7d0efe11
   + https://github.com/ldc-developers/druntime/pull/17
** phobos:
   + https://github.com/ldc-developers/phobos/pull/17

druntime + phobos unittests, debug and release:
92% tests passed, 46 tests failed out of 555

failures:
core.thread (segfaults in release only)
std.csv
std.datetime
std.encoding
std.math
std.parallelism
std.path
std.process
std.socket
std.stream
std.string (fails to compile in debug, fails in release)
std.traits
std.uni
std.uri
std.zip
std.zlib
std.algorithm.sorting (fails in debug only)
std.digest.crc
std.digest.md (fails in debug only)
std.digest.ripemd (fails in debug only)
std.digest.sha
std.net.isemail
std.regex.internal.parser
std.regex.internal.tests
std.regex
std.internal.math.gammafunction
May 10 2015
"kinke" <noone nowhere.com> writes:
> failures: core.thread (segfaults in release only)

Where does core.thread fail its test?
May 10 2015
On Sunday, 10 May 2015 at 22:25:14 UTC, Dan Olson wrote:
> Where does core.thread fail its test?

Apparently an access violation in fiber_switchContext(). asm:

    push rbx
    xor rax,rax
    push qword ptr gs:[rax]
    push qword ptr gs:[rax+8]
    push qword ptr gs:[rax+10h]
    mov qword ptr [rcx],rsp
    mov rsp,rdx
    pop qword ptr gs:[rax+10h]    --> access violation with rax=0
    pop qword ptr gs:[rax+8]
    pop qword ptr gs:[rax]
    pop rbx
May 12 2015
On Tuesday, 12 May 2015 at 22:46:26 UTC, kinke wrote:
> On Sunday, 10 May 2015 at 22:25:14 UTC, Dan Olson wrote:
>> Where does core.thread fail its test?
>
> Apparently an access violation in fiber_switchContext(). asm:
>
>     push rbx
>     xor rax,rax
>     push qword ptr gs:[rax]
>     push qword ptr gs:[rax+8]
>     push qword ptr gs:[rax+10h]
>     mov qword ptr [rcx],rsp
>     mov rsp,rdx
>     pop qword ptr gs:[rax+10h]    --> access violation with rax=0
>     pop qword ptr gs:[rax+8]
>     pop qword ptr gs:[rax]
>     pop rbx

I wonder if it is issue #666 [1]? It looks like the same failure I see on OS X/iOS and only in release builds. Can you see if it happens in the runShared test? If so, it may pass if you enable the version(Posix) code for sm_this that uses pthread_get/setspecific [2].

[1] https://github.com/ldc-developers/ldc/issues/666
[2] https://github.com/ldc-developers/druntime/blob/ldc/src/core/thread.d#L1135
May 12 2015
On Wednesday, 13 May 2015 at 05:14:57 UTC, Dan Olson wrote:
> I wonder if it is issue #666 [1]? It looks like the same failure I see on OS X/iOS and only in release builds. Can you see if it happens in the runShared test? If so, it may pass if you enable the version(Posix) code for sm_this that uses pthread_get/setspecific [2].
>
> [1] https://github.com/ldc-developers/ldc/issues/666
> [2] https://github.com/ldc-developers/druntime/blob/ldc/src/core/thread.d#L1135

Thx - it crashes during the runShared test. pthread isn't supported on Windows. The access violation occurs in https://github.com/ldc-developers/druntime/blob/ldc/src/core/thread.d#L3597.
May 13 2015
"kinke" <noone nowhere.com> writes:
> Thx - it crashes during the runShared test. pthread isn't supported on Windows. The access violation occurs in https://github.com/ldc-developers/druntime/blob/ldc/src/core/thread.d#L3597.

It does seem to be the same problem because the stack to resume on is wrong. That is what I see on OS X and iOS.

You could try the Windows TlsGetValue API for sm_this and mimic the pthread_getspecific code. If it works, I don't think it is a real fix, but it does allow the rest of the thread unittest to run. Maybe just disabling the runShared test and documenting it is best.

I found this older page where boost decided coroutine migration between threads was unsafe because of TLS: http://www.crystalclearsoftware.com/soc/coroutine/coroutine/coroutine_thread.html

Thinking out loud: If LDC could provide a switch to disable TLS address caching, would folks think to use it? Should it be on by default to support the ability to migrate Fibers across threads? Is it worth the performance loss in other TLS cases? Maybe it is better to have some targets with expensive TLS lookup just disable Fiber migration. Is Fiber migration that common?
May 14 2015
On Thursday, 14 May 2015 at 07:51:16 UTC, Dan Olson wrote:
> You could try Windows TlsGetValue API for sm_this and mimic the pthread_getspecific code.

Yep, that makes the runShared test pass. core.thread still doesn't pass all tests in release though (e.g., non-volatile GP registers are apparently not restored correctly).
May 14 2015
On Sunday, 10 May 2015 at 14:13:35 UTC, kinke wrote:
> Hey guys, as a teaser for the curious, here's the current status with a bleeding edge Win64 environment:

Really cool!

> failures: std.digest.sha

The failure may be caused by the assembler code because of non-matching ABI conventions.

Regards,
Kai
May 10 2015
Most interesting: many of the failing unittests are caused by a single issue, namely, std.concurrency.unregisterMe() invoked by the static destructor of the std.concurrency module.

The following unittests all pass when isolated, i.e., by compiling via 'ldc2 -g -main -unittest <foo>.d' (debug, '-release' added for release) and then running the resulting <foo>.exe:

std.datetime
std.parallelism
std.path
std.process
std.string
std.uni
std.uri
std.zip
std.zlib
std.algorithm.sorting
std.digest.crc
std.digest.md
std.digest.ripemd
std.digest.sha
std.net.isemail
std.regex.internal.parser
std.regex.internal.tests
std.regex

The following failures are NOT caused by std.concurrency.~this():

core.thread
std.csv *
std.encoding
std.math
std.socket
std.stream
std.traits *
std.internal.math.gammafunction *

[*] fixed or worked around on my system, patches being prepared

So we're not far from LDC on Win64 passing all druntime + phobos unit tests! :)
May 14 2015
"kinke" <noone nowhere.com> writes:
> The following failures are NOT caused by std.concurrency.~this():
>
> core.thread
> std.csv *
> std.encoding
> std.math
> std.socket
> std.stream
> std.traits *
> std.internal.math.gammafunction *
>
> [*] fixed or worked around on my system, patches being prepared
>
> So we're not far from LDC on Win64 passing all druntime + phobos unit tests! :)

Very cool. The math unittest failures could just be the ones that aren't written for 64-bit real. There was some work on it by Kevin and Johan in this pull [1], but I have not been following lately.

[1] https://github.com/ldc-developers/phobos/pull/7
May 14 2015
After intensive debugging, the strange issue responsible for most failures seems to be reducible to the following: some unit tests allocate GC memory, which in turn triggers a GC collection pass, which then destroys the `__gshared core.thread.Thread core.thread.Thread.sm_tbeg` object representing the main thread (it's actually the start of a linked list of threads).

The object should be kept alive by the sm_tbeg reference (and the main Thread additionally by sm_main). After the reference is initialized with the main Thread at program startup, it isn't touched (i.e., not reset to null - verified via data breakpoint), but the object is finalized anyway. So it looks as if the GC doesn't know about these __gshared references. When the linked list is accessed later, while terminating all threads right before exiting the program, funny things happen due to garbage in the .next and .prev references.

I've just tried storing another reference to the main Thread as `static Thread core.thread.Thread.sm_mainDummy`, and the object isn't destroyed anymore. So `__gshared` seems to be the problem. And most likely all other targets are affected by this bug too.
May 16 2015
"kinke" <noone nowhere.com> writes:
> I've just tried storing another reference to the main Thread as `static Thread core.thread.Thread.sm_mainDummy`, and the object isn't destroyed anymore. So `__gshared` seems to be the problem. And most likely all other targets are affected too by this bug.

Could sections_ldc.initSections() be neglecting the BSS section? I noticed the version(Win64) code has:

    if (_bss_start__ != null)
    {
        pushRange(&_bss_start__, &_bss_end__);
    }

-- Dan
May 17 2015
Dan Olson <zans.is.for.cans yahoo.com> writes:
> "kinke" <noone nowhere.com> writes:
>> I've just tried storing another reference to the main Thread as `static Thread core.thread.Thread.sm_mainDummy`, and the object isn't destroyed anymore. So `__gshared` seems to be the problem. And most likely all other targets are affected too by this bug.
>
> Could sections_ldc.initSections() be neglecting the BSS section? I noticed the version(Win64) code has:
>
>     if (_bss_start__ != null)
>     {
>         pushRange(&_bss_start__, &_bss_end__);
>     }

I don't have a Windows host available right now, but I am assuming that _bss_start__ is a symbol created by the linker that overlays the first variable in BSS, which very likely is 0 because it is in BSS. But again, just guessing.
May 17 2015
Thx for the hint, Dan!!!

99% tests passed, 4 tests failed out of 556

The following tests FAILED:
    175 - std.math (Failed)
    191 - std.stream (Failed)
    460 - std.socket-debug (Failed)
    465 - std.stream-debug (Failed)

!!! :))

The problem was that instead of using the range [_data_start__, _data_end__) (computed in rt/msvc.c), we've used [&_data_start__, &_data_end__). We also did so for the BSS section, which isn't present in my druntime-test-runner-debug.exe though.
May 17 2015
Hi all! http://goo.gl/0JI4qJ

Why can't ldc optimize it? Should I create an issue? For example, gdc uses only one mulps.
May 20 2015
For example, this code is OK: http://goo.gl/QMfpzg

Can we rewrite vector operations with foreach rather than calling vector functions?
May 20 2015
On Wednesday, 20 May 2015 at 13:34:50 UTC, Temtaime wrote:
> Hi all! http://goo.gl/0JI4qJ
>
> Why can't ldc optimize it? Should I create an issue? For example, gdc uses only one mulps.

Rule of thumb: don't use array ops for short arrays. They are quite well optimised for large arrays, but aren't great in cases like your example.
May 21 2015
On Thursday, 21 May 2015 at 19:05:33 UTC, John Colvin wrote:
> Rule of thumb: don't use array ops for short arrays. They are quite well optimised for large arrays, but aren't great in cases like your example.

We should be able to do better, though, especially for cases like this where the length is known statically.

-- David
May 21 2015
On Thursday, 21 May 2015 at 19:07:20 UTC, David Nadlinger wrote:
> We should be able to do better, though, especially for cases like this where the length is known statically.

Definitely, if GDC manages to do so, we should too.

Some years ago, I ported some math-intensive C++ code to D and rewrote all simple loops as array ops, for better readability but primarily because I assumed that'd help with SSE vectorization. As it turned out, the runtime was about an order of magnitude higher than that of the corresponding C++ version. The disappointment was rather high and lowered my interest in D, so yes, please create a GitHub issue about this.
May 21 2015
Hi guys! I recently found that ldc doesn't compile with llvm master anymore. They removed the CreateCall[2-3] functions and some of the overloads of CreateCall too. Now all the parameters should be passed to that function as a vector<llvm::Value *>. Anyone to fix it?
May 22 2015
Kai has just fixed it. Please don't expect the few of us to always react so quickly to every LLVM API change (and there's a whole lot of them) on its master branch though. ;)
May 22 2015
On Thursday, 21 May 2015 at 19:05:33 UTC, John Colvin wrote:
> On Wednesday, 20 May 2015 at 13:34:50 UTC, Temtaime wrote:
>> Hi all! http://goo.gl/0JI4qJ
>>
>> Why can't ldc optimize it? Should I create an issue? For example, gdc uses only one mulps.
>
> Rule of thumb: don't use array ops for short arrays. They are quite well optimised for large arrays, but aren't great in cases like your example.

Now ldc should inline the arrayops. If you have some benchmarks you could re-run them.

Regards,
Kai
May 23 2015
On Wednesday, 20 May 2015 at 13:34:50 UTC, Temtaime wrote:
> Hi all! http://goo.gl/0JI4qJ
>
> Why can't ldc optimize it? Should I create an issue? For example, gdc uses only one mulps.

Should be fixed. Now ldc generates:

    movups (%r8), %xmm0
    shufps $0, %xmm1, %xmm1
    mulps %xmm0, %xmm1
    movups %xmm1, (%rcx)
    movq %rcx, %rax
    retq

Regards,
Kai
May 23 2015
Update:

* LLVM master (1fd101c)
* LDC: branch merge-2.067
** *.conf.in files hacked to include "-L/LARGEADDRESSAWARE:NO" as default option
** druntime: branch ldc-merge-2.067
   + https://github.com/ldc-developers/druntime/pull/29 (VS 2015 only)
** phobos: branch ldc-merge-2.067
   + https://github.com/JohanEngelen/phobos/commit/2ac2581fe49da475bf6f687cfb7bcb9c9ddf8b71
   https://github.com/kinke/phobos/commit/86511b3ca9f4a6b5358b7983ee64d0a688c63216

Due to https://github.com/ldc-developers/ldc/issues/930, you'll need to hack your <buildDir>\build.ninja file and exclude the -g switch when building runtime\std\string-unittest-debug.obj.

Except for a Win64-specific core.thread unittest (testNonvolatileRegister), which fails in the release build, all druntime & phobos unittests pass, at least with VS 2015. :)
May 23 2015
With VS 2013, two std.conv unittests fail because strtod()/strtold() cannot parse hex strings; otherwise the results are the same as for VS 2015.
May 24 2015
On Sunday, 24 May 2015 at 01:49:14 UTC, kinke wrote:
> Update:
>
> * LLVM master (1fd101c)
> * LDC: branch merge-2.067
> ** *.conf.in files hacked to include "-L/LARGEADDRESSAWARE:NO" as default option
> ** druntime: branch ldc-merge-2.067
>    + https://github.com/ldc-developers/druntime/pull/29 (VS 2015 only)
> ** phobos: branch ldc-merge-2.067
>    + https://github.com/JohanEngelen/phobos/commit/2ac2581fe49da475bf6f687cfb7bcb9c9ddf8b71
>    https://github.com/kinke/phobos/commit/86511b3ca9f4a6b5358b7983ee64d0a688c63216
>
> Due to https://github.com/ldc-developers/ldc/issues/930, you'll need to hack your <buildDir>\build.ninja file and exclude the -g switch when building runtime\std\string-unittest-debug.obj.
>
> Except for a Win64-specific core.thread unittest (testNonvolatileRegister), which fails in the release build, all druntime & phobos unittests pass, at least with VS 2015. :)

Wonderful!
May 24 2015