
D.gnu - GDC on 4.2.3 with autovectorization

downs <default_357-line yahoo.de> writes:
Here's a patch that gets GDC SVN from 04.05.08 (roughly) to support GCC 4.2.3, as
well as automatic translation of loop statements into SSE-optimized assembly.

The basic procedure is as follows: download GDC from SVN, copy it into the GCC
folder as per the installation procedure, then edit setup-gcc.sh and replace the
following line:

 elif grep -q '^4\.1\.' gcc/BASE-VER; then
with
 elif grep -q '^4\.[12]\.' gcc/BASE-VER; then
then try to run it. It will, predictably, fail. _Now_ apply the attached patch to the so-prepared GDC directory. Configure and build as normal. (--disable-bootstrap if you don't like waiting for hours)

BEWARE! Not really being a GCC dev, I've had to make some _very_ weird _pure guesses_ during the change to 4.2.3, so the resulting code, while it appears to work correctly (http://demented.no-ip.org/~root/results.html ; dstress backs me up .. the numbers are in line with official GDC results), might in fact, break C compatibility, break D compatibility, or eat and/or abuse (and/or sexually), without discrimination, small objects, household pets and family members, INCLUDING YOU. Don't say I didn't warn you.

That being said, have fun with it!

 --downs

PS: here's a demo of the autovectorizer at work:

 gentoo-pc ~ $ cat test.d && gdc test.d -o test -O2 -msse -ftree-vectorize -ftree-vectorizer-verbose=5 -g && ./test && objdump -d test |grep addps -C10
 module test; import std.stdio;
 void main() {
   float[4] a = [1f, 2, 3, 4];
   float[4] b = [4f, 3, 2, 1];
   float[4] c;
   for (int i = 0; i < 4; ++i) c[i] = a[i] + b[i];
   writefln(c);
 }

 test.d:5: note: not vectorized: too many BBs in loop.
 test.d:6: note: LOOP VECTORIZED.
 test.d:2: note: vectorized 1 loops in function.
 [5,5,5,5]
  804a228: 89 74 24 14           mov    %esi,0x14(%esp)
  804a22c: 89 44 24 08           mov    %eax,0x8(%esp)
  804a230: 89 54 24 0c           mov    %edx,0xc(%esp)
  804a234: 89 3c 24              mov    %edi,(%esp)
  804a237: c7 44 24 04 04 00 00  movl   $0x4,0x4(%esp)
  804a23e: 00
  804a23f: e8 fc 26 00 00        call   804c940 <_d_arraycopy>
  804a244: b8 00 00 c0 7f        mov    $0x7fc00000,%eax
  804a249: 0f 28 45 c8           movaps -0x38(%ebp),%xmm0
  804a24d: 89 45 bc              mov    %eax,-0x44(%ebp)
  804a250: 0f 58 45 d8           addps  -0x28(%ebp),%xmm0
  804a254: 89 45 c0              mov    %eax,-0x40(%ebp)
  804a257: 89 45 c4              mov    %eax,-0x3c(%ebp)
  804a25a: 8d 45 b8              lea    -0x48(%ebp),%eax
  804a25d: 83 ec 04              sub    $0x4,%esp
  804a260: 0f 29 45 b8           movaps %xmm0,-0x48(%ebp)
  804a264: 89 44 24 08           mov    %eax,0x8(%esp)
  804a268: c7 44 24 04 04 00 00  movl   $0x4,0x4(%esp)
  804a26f: 00
  804a270: c7 04 24 34 a2 06 08  movl   $0x806a234,(%esp)
  804a277: e8 84 9b 00 00        call   8053e00 <_D3std5stdio8writeflnFYv>
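
For anyone who would rather script the setup-gcc.sh edit described at the top of this post than do it by hand, here is a minimal sketch. It assumes a POSIX shell and sed, and that setup-gcc.sh contains the version check exactly as quoted; the function name widen_version_check is made up for illustration and is not part of GDC.

```shell
#!/bin/sh
# Sketch only: widen the GCC version check in setup-gcc.sh so that it
# accepts 4.2.x as well as 4.1.x. The function name is hypothetical.
widen_version_check() {
    script="$1"
    # Keep a backup so the edit can be inspected or reverted.
    cp "$script" "$script.orig" || return 1
    # Rewrite the literal text ^4\.1\. as ^4\.[12]\.
    # Writing back into the existing file keeps its permissions.
    sed 's/\^4\\\.1\\\./^4\\.[12]\\./' "$script.orig" > "$script"
}

# Apply it only if the script is present in the current directory.
if [ -f setup-gcc.sh ]; then
    widen_version_check setup-gcc.sh
    # Show the version-check lines after the edit.
    grep -n 'BASE-VER' setup-gcc.sh
fi
```

A backup is left in setup-gcc.sh.orig, so the change can be checked with diff or undone by copying the backup over the edited file.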
Apr 06 2008
downs <default_357-line yahoo.de> writes:
Here it is. Sorry.
Apr 06 2008
"Jérôme M. Berger" <jeberger free.fr> writes:

downs wrote:
> Here's a patch to get GDC SVN from 04.05.08 (roughly) to support 4.2.3, as
> well as automatic translation of loop statements into SSE optimized assembly.
 
Thank you. I don't know what platform you developed it on, but it appears to work fine here on 64-bit Linux.

I only encountered one minor issue: once your patch is applied to a GCC source tree, it becomes impossible to build compilers for languages other than C and D from that tree. That's not important for me, since I use the system compiler for everything else, but I thought I'd point it out since it took me a bit of time to figure out what was wrong.

 Jerome
-- 
+------------------------- Jerome M. BERGER ---------------------+
| mailto:jeberger free.fr         | ICQ: 238062172               |
| http://jeberger.free.fr/        | Jabber: jeberger jabber.fr   |
+---------------------------------+------------------------------+
Apr 07 2008