digitalmars.D.learn - Question on SSE intrinsics
- piotrekg2 (9/9) Jul 29 2017 Hi,
- Eugene Wissner (3/12) Jul 29 2017 https://stackoverflow.com/questions/14002946/explicit-simd-code-in-d
- Johan Engelen (9/15) Jul 29 2017 Yes, with LDC (probably GDC too).
- piotrekg2 (2/19) Jul 29 2017 What about __builtin_ctz?
- Nicholas Wilson (2/25) Jul 29 2017 https://github.com/ldc-developers/druntime/blob/ldc/src/ldc/intrinsics.d...
- Nicholas Wilson (3/6) Jul 29 2017 you can also make it out of bsf or bsr (i can't remember which)
Hi,
I'm trying to port some of my c++ code which uses sse2 instructions into D. The code calls the following intrinsics:
- _mm256_loadu_si256
- _mm256_movemask_epi8
Do they have any equivalent intrinsics in D? I'm compiling my c++ code using gcc.
Thanks,
Piotr
Jul 29 2017
On Saturday, 29 July 2017 at 16:01:07 UTC, piotrekg2 wrote:
> Hi, I'm trying to port some of my c++ code which uses sse2 instructions into D. The code calls the following intrinsics:
> - _mm256_loadu_si256
> - _mm256_movemask_epi8
> Do they have any equivalent intrinsics in D? I'm compiling my c++ code using gcc.
> Thanks, Piotr
https://stackoverflow.com/questions/14002946/explicit-simd-code-in-d
I don't think anything has changed since then.
Jul 29 2017
On Saturday, 29 July 2017 at 16:01:07 UTC, piotrekg2 wrote:
> Hi, I'm trying to port some of my c++ code which uses sse2 instructions into D. The code calls the following intrinsics:
> - _mm256_loadu_si256
> - _mm256_movemask_epi8
> Do they have any equivalent intrinsics in D?
Yes, with LDC (probably GDC too). But unfortunately we don't have the "_mm256" functions (yet?), instead we have GCC's "__builtin_ia32..." functions.
The first one you mention I think is just an unaligned load? That can be done with the template `loadUnaligned` from module ldc.simd.
The second one has a synonym, "__builtin_ia32_pmovmskb256".
-Johan
Jul 29 2017
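Johan's mapping could look roughly like the following. This is an untested sketch that assumes LDC on x86-64 with AVX2 available; `highBitMask32` is a hypothetical helper name, and the cast from `ubyte*` to `byte*` is only there because `byte32` has `byte` elements. The `@target("avx2")` attribute (from `ldc.attributes`) is used so the file does not need `-mattr=+avx2` globally; either approach should work.

```d
// LDC-only sketch; build with e.g.: ldc2 -O movemask.d
import core.cpuid : avx2;
import core.simd : byte32;
import ldc.attributes : target;
import ldc.gccbuiltins_x86 : __builtin_ia32_pmovmskb256;
import ldc.simd : loadUnaligned;

// Hypothetical helper: one result bit per byte of the 32-byte block at p,
// set when that byte's top bit is set.
@target("avx2")
uint highBitMask32(const(ubyte)* p)
{
    // Stands in for _mm256_loadu_si256: p needs no particular alignment.
    byte32 v = loadUnaligned!byte32(cast(const(byte)*) p);
    // Stands in for _mm256_movemask_epi8.
    return cast(uint) __builtin_ia32_pmovmskb256(v);
}

void main()
{
    import std.stdio : writefln, writeln;

    if (!avx2)
    {
        writeln("This CPU does not report AVX2; skipping.");
        return;
    }

    ubyte[33] buf = 0;
    const(ubyte)* p = buf.ptr + 1; // deliberately offset by one byte
    buf[1 + 3] = 0x80;             // byte 3 of the block: top bit set
    buf[1 + 31] = 0xFF;            // byte 31 of the block: top bit set

    uint mask = highBitMask32(p);
    assert(mask == 0x8000_0008);   // bits 3 and 31
    writefln("%08x", mask);
}
```

The runtime `core.cpuid.avx2` check only guards the demo; real code would pick a fallback path instead of returning.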
On Saturday, 29 July 2017 at 18:19:47 UTC, Johan Engelen wrote:
> On Saturday, 29 July 2017 at 16:01:07 UTC, piotrekg2 wrote:
>> Hi, I'm trying to port some of my c++ code which uses sse2 instructions into D. The code calls the following intrinsics:
>> - _mm256_loadu_si256
>> - _mm256_movemask_epi8
>> Do they have any equivalent intrinsics in D?
> Yes, with LDC (probably GDC too). But unfortunately we don't have the "_mm256" functions (yet?), instead we have GCC's "__builtin_ia32..." functions.
> The first one you mention I think is just an unaligned load? That can be done with the template `loadUnaligned` from module ldc.simd.
> The second one has a synonym, "__builtin_ia32_pmovmskb256".
> -Johan
What about __builtin_ctz?
Jul 29 2017
On Saturday, 29 July 2017 at 22:45:12 UTC, piotrekg2 wrote:
> On Saturday, 29 July 2017 at 18:19:47 UTC, Johan Engelen wrote:
>> On Saturday, 29 July 2017 at 16:01:07 UTC, piotrekg2 wrote:
>>> Hi, I'm trying to port some of my c++ code which uses sse2 instructions into D. The code calls the following intrinsics:
>>> - _mm256_loadu_si256
>>> - _mm256_movemask_epi8
>>> Do they have any equivalent intrinsics in D?
>> Yes, with LDC (probably GDC too). But unfortunately we don't have the "_mm256" functions (yet?), instead we have GCC's "__builtin_ia32..." functions.
>> The first one you mention I think is just an unaligned load? That can be done with the template `loadUnaligned` from module ldc.simd.
>> The second one has a synonym, "__builtin_ia32_pmovmskb256".
>> -Johan
> What about __builtin_ctz?
https://github.com/ldc-developers/druntime/blob/ldc/src/ldc/intrinsics.di#L325
Jul 29 2017
On Sunday, 30 July 2017 at 02:05:32 UTC, Nicholas Wilson wrote:
> On Saturday, 29 July 2017 at 22:45:12 UTC, piotrekg2 wrote:
>> What about __builtin_ctz?
> https://github.com/ldc-developers/druntime/blob/ldc/src/ldc/intrinsics.di#L325
You can also make it out of bsf or bsr (I can't remember which) from `core.bitop`.
Jul 29 2017
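On the bsf/bsr question: `bsf` (bit scan forward) returns the index of the lowest set bit, which for a nonzero argument is the number of trailing zeros, so it is the one that lines up with `__builtin_ctz`. `bsr` returns the index of the highest set bit instead. A minimal sketch, where `ctz` is a hypothetical wrapper name and, as with the GCC builtin, a zero argument is treated as invalid:

```d
import core.bitop : bsf, bsr;

// Hypothetical stand-in for GCC's __builtin_ctz.
// The result is undefined for x == 0, so that case is rejected up front.
int ctz(uint x)
{
    assert(x != 0, "ctz is undefined for 0");
    return bsf(x); // index of the lowest set bit == trailing zero count
}

void main()
{
    assert(ctz(1) == 0);
    assert(ctz(0b1000) == 3);
    assert(ctz(0x8000_0000) == 31);

    // bsr scans from the top, so it differs as soon as more than one bit is set.
    assert(bsf(0b1010u) == 1);
    assert(bsr(0b1010u) == 3);
}
```

This uses only druntime, so it should behave the same under DMD, LDC and GDC.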