digitalmars.D - Some nice new DMD slicing optimizations
- Walter Bright (29/29) Oct 06 2016 https://github.com/dlang/dmd/pull/6176
- Ilya Yaroshenko (2/8) Oct 06 2016 Awesome!
- rikki cattermole (4/33) Oct 06 2016 If there is bound checking shouldn't there be a check to guarantee b and...
- Ilya Yaroshenko (2/7) Oct 07 2016 The function is not safe. So there are no checks in release mode.
- Walter Bright (2/4) Oct 07 2016 I set -noboundscheck
https://github.com/dlang/dmd/pull/6176

I'm happy to report that DMD has (finally!) gotten some significant new optimizations! Specifically, 'slicing' a two register wide aggregate into two register-sized variables, enabling much better enregistering. Given the code:

    void foo(int[] a, int[] b, int[] c) {
        foreach (i; 0 .. a.length)
            a[i] = b[i] + c[i];
    }

the inner loop formerly compiled to:

    LA:     mov     EAX,018h[ESP]
            mov     EDX,010h[ESP]
            mov     ECX,[EBX*4][EAX]
            add     ECX,[EBX*4][EDX]
            mov     ESI,020h[ESP]
            mov     [EBX*4][ESI],ECX
            inc     EBX
            cmp     EBX,01Ch[ESP]
            jb      LA

and now:

    L1A:    mov     ECX,[EBX*4][EDI]
            add     ECX,[EBX*4][ESI]
            mov     0[EBX*4][EBP],ECX
            inc     EBX
            cmp     EBX,EDX
            jb      L1A

I've been wanting to do this for years, and finally got around to it. (I also thought of a simpler way to implement it, which helped a lot.)

Further work will be in widening what this applies to.
Oct 06 2016
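For readers unfamiliar with the term, here is a rough source-level picture of what 'slicing' an aggregate means. A D slice such as `int[]` is a (length, pointer) pair, so it is two registers wide; the sketch below splits each slice by hand into register-sized scalars. This is an illustration only: the function name `fooSliced` and the local variable names are made up, and the compiler performs the transformation internally rather than by rewriting source. In the "now" listing, the compare against EDX and the three base registers EDI, ESI and EBP look consistent with this picture.

```d
// Hand-written illustration (hypothetical names); not the compiler's
// actual implementation, which works on its internal representation.
void fooSliced(int[] a, int[] b, int[] c)
{
    // Split each two-word slice into scalars that fit one register each.
    int* aPtr = a.ptr;
    const size_t aLen = a.length;
    const(int)* bPtr = b.ptr;
    const(int)* cPtr = c.ptr;

    // The loop now touches only scalars, so each can stay in a register.
    // Raw pointer indexing is never bounds checked.
    foreach (i; 0 .. aLen)
        aPtr[i] = bPtr[i] + cPtr[i];
}

void main()
{
    auto a = new int[3];
    fooSliced(a, [1, 2, 3], [10, 20, 30]);
    assert(a == [11, 22, 33]);
}
```

The lengths of `b` and `c` are never read here, which mirrors the unchecked loop in the listings above.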
On Friday, 7 October 2016 at 06:07:47 UTC, Walter Bright wrote:
> https://github.com/dlang/dmd/pull/6176
>
> I'm happy to report that DMD has (finally!) gotten some significant new optimizations! Specifically, 'slicing' a two register wide aggregate into two register-sized variables, enabling much better enregistering. [...]

Awesome!
Oct 06 2016
On 07/10/2016 7:07 PM, Walter Bright wrote:
> https://github.com/dlang/dmd/pull/6176
>
> I'm happy to report that DMD has (finally!) gotten some significant new optimizations! Specifically, 'slicing' a two register wide aggregate into two register-sized variables, enabling much better enregistering. [...]

If there is bound checking shouldn't there be a check to guarantee b and c are >= a.length?

Otherwise, awesome!
Oct 06 2016
On Friday, 7 October 2016 at 06:30:32 UTC, rikki cattermole wrote:
> On 07/10/2016 7:07 PM, Walter Bright wrote:
>> [...]
>
> If there is bound checking shouldn't there be a check to guarantee b and c are >= a.length? Otherwise, awesome!

The function is not @safe, so there are no bounds checks in release mode.
Oct 07 2016
On 10/6/2016 11:30 PM, rikki cattermole wrote:
> If there is bound checking shouldn't there be a check to guarantee b and c are >= a.length?

I set -noboundscheck
Oct 07 2016
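With bounds checking disabled, as in the replies above, nothing in `foo` verifies that `b` and `c` are at least as long as `a`. One possible way to keep that guarantee without per-element checks is a single explicit test before the loop. The sketch below assumes a hypothetical variant named `fooChecked`, which is not from the thread, and uses `enforce` from `std.exception`, which throws an `Exception` when its condition is false.

```d
import std.exception : enforce;

// Hypothetical variant of foo; the name and the up-front check are
// assumptions for illustration, not part of the pull request.
void fooChecked(int[] a, const(int)[] b, const(int)[] c)
{
    // One length test up front instead of a check on every element access.
    enforce(b.length >= a.length && c.length >= a.length,
            "b and c must be at least as long as a");

    foreach (i; 0 .. a.length)
        a[i] = b[i] + c[i];
}

void main()
{
    import std.exception : assertThrown;

    auto a = new int[3];
    fooChecked(a, [1, 2, 3], [10, 20, 30]);
    assert(a == [11, 22, 33]);

    // b is shorter than a, so the up-front check rejects the call.
    assertThrown(fooChecked(a, [1, 2], [10, 20, 30]));
}
```

`enforce` is used rather than `assert` because `assert` checks are normally compiled out in release builds.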