digitalmars.D - Dual CPU code
- bearophile (8/8) Feb 02 2009 This comes after a small discussion I've had in the #D IRC channel.
- Don (2/15) Feb 02 2009 Is this mostly integer, or floating point code?
- bearophile (4/5) Feb 02 2009 In that specific cases, it's mostly FP. If I compile it with LDC with -s...
- Walter Bright (18/26) Feb 02 2009 This is a very old problem, it even cropped up in the bad old DOS days
- bearophile (7/15) Feb 02 2009 I think that solves my problem, thank you. It's a simple solution (maybe...
- Walter Bright (2/15) Feb 02 2009 That's one way to do it.
- grauzone (2/18) Feb 02 2009 The glorious return of include files!
- Andrei Alexandrescu (5/22) Feb 02 2009 I must be missing something - why isn't
- Walter Bright (4/28) Feb 02 2009 Because importing something does not change how it was compiled. If you
- BCS (2/15) Feb 02 2009 the code generator needs to be run on the code more than once.
- Christopher Wright (4/31) Feb 02 2009 The shared code has to be compiled with two sets of compiler switches,
- BCS (10/32) Feb 02 2009 my first thought would be to play games with the linker:
- Tim M (4/27) Feb 02 2009 Is this the sort thing you are looking for:
This comes after a small discussion I've had in the #D IRC channel.

I have seen that the LDC compiler is much more efficient if you use SSE(2) extensions, while it's not very efficient if you don't use them (GCC/GDC don't seem as sensitive to the presence of the SSE extensions). I often have to switch between an old and a new CPU, so if I compile with SSE2 extensions the program doesn't run on the old CPU, while if I don't use them, I sometimes have a program that goes much slower on the newer CPU.

So, it may be useful to have a way to build executables able to run well on both CPUs (Apple has done something like this two or more times in the past). There are several ways to do this, a solution is to compile just critical functions for different CPUs, but that may require compiler support. My executables are generally small, so doubling their size isn't a problem. So a simple solution is to bundle two whole executables into an executable and add a small header that looks for the current CPU, and runs the right executable.

Notice that the problem I have shown isn't limited to SSE2, it's more general: for example, in the near future you may want code compiled for the GPU and/or CPU, etc.

Bye,
bearophile
Feb 02 2009
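A minimal sketch of the launcher idea from the post above, under a simplifying assumption: the two builds are shipped as separate files next to a small launcher instead of being packed into one file. The file names `app_sse2` and `app_generic` and the helper `pickBinary` are hypothetical, and `core.cpuid` is the current druntime module rather than the `std.cpuid` of 2009.

```d
import core.cpuid : sse2;              // runtime CPU feature query (druntime)
import std.file : thisExePath;
import std.path : buildPath, dirName;
import std.process : spawnProcess, wait;

/// Hypothetical file names for the two builds of the same program.
string pickBinary(bool hasSse2)
{
    return hasSse2 ? "app_sse2" : "app_generic";
}

int main(string[] args)
{
    // Look for the real program in the directory that holds the launcher.
    string target = buildPath(dirName(thisExePath()), pickBinary(sse2));

    // Forward the command-line arguments and return the child's exit code.
    auto pid = spawnProcess([target] ~ args[1 .. $]);
    return wait(pid);
}
```

The cost of this variant is one extra process start per run and two full copies of the program on disk, which matches the "doubling their size isn't a problem" assumption in the post.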
bearophile wrote:
> I have seen that the LDC compiler is much more efficient if you use SSE(2) extensions [...]

Is this mostly integer, or floating point code?
Feb 02 2009
Don:
> Is this mostly integer, or floating point code?

In that specific case, it's mostly FP. If I compile it with LDC with -sse3 flags the resulting asm is a jungle of the new registers :-)

Bye,
bearophile
Feb 02 2009
bearophile wrote:
> So, it may be useful to have a way to build executables able to run well on both CPUs (Apple has done something like this two or more times in the past). There are several ways to do this, a solution is to compile just critical functions for different CPUs, but that may require compiler support. My executables are generally small, so doubling their size isn't a problem. So a simple solution is to bundle two whole executables into an executable and add a small header that looks for the current CPU, and runs the right executable.

This is a very old problem; it even cropped up in the bad old DOS days, where you had the choice of emulator or FPU. The solution is fairly simple: you don't need to bind together two executables. Simply put a runtime switch in:

    import std.cpuid;
    import sse;
    import nosse;
    ...
    if (std.cpuid.sse2())
        sse.foo();
    else
        nosse.foo();

and then compile sse.d and nosse.d with different compiler switches. The std.cpuid module will tell you what you've got at runtime.

To see a real example of this, look at the array op implementation code in the standard library, such as internal/arrayfloat.d; it does a runtime switch for several different FPU flavors.
Feb 02 2009
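The runtime switch described above can be sketched in a single file as follows. This is only an illustration under stated assumptions: `fooSse` and `fooPlain` are hypothetical stand-ins for `sse.foo` and `nosse.foo`, both are compiled with the same switches here, and `core.cpuid` is the current druntime module (the post refers to the older `std.cpuid`). Picking a function pointer once is a small variation on the `if`/`else` in the post that keeps the feature test out of hot loops.

```d
import core.cpuid : sse2;   // modern druntime; the 2009 post used std.cpuid
import std.stdio : writeln;

// Hypothetical stand-ins for sse.foo and nosse.foo. In the setup described
// in the thread they would have identical source but live in two modules
// compiled with different switches; here both are built the same way.
double fooSse(const(double)[] xs)
{
    double sum = 0;
    foreach (x; xs)
        sum += x * x;
    return sum;
}

double fooPlain(const(double)[] xs)
{
    double sum = 0;
    foreach (x; xs)
        sum += x * x;
    return sum;
}

alias FooFn = double function(const(double)[]);

/// Choose the implementation once, so callers do not repeat the CPU test.
FooFn pickFoo(bool hasSse2)
{
    return hasSse2 ? &fooSse : &fooPlain;
}

void main()
{
    FooFn foo = pickFoo(sse2);      // query the CPU once at startup
    writeln(foo([1.0, 2.0, 3.0]));  // sum of squares via the chosen version
}
```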
Walter Bright:
> import std.cpuid;
> import sse;
> import nosse;
> ...
> if (std.cpuid.sse2()) sse.foo(); else nosse.foo();

I think that solves my problem, thank you. It's a simple solution (maybe I didn't think of it because I use bud, which compiles the whole program in one go).

I presume that usually the D code in the sse and nosse modules is the same, it's just compiled in two different ways, so the two modules may just contain two lines of code, such as:

    module sse;
    mixin(import("shared_module_code.dd"));

Bye,
bearophile
Feb 02 2009
bearophile wrote:
> I presume that usually the D code in the sse and nosse modules is the same, it's just compiled in two different ways [...]
>
> module sse;
> mixin(import("shared_module_code.dd"));

That's one way to do it.
Feb 02 2009
Walter Bright wrote:
> bearophile wrote:
>> module sse;
>> mixin(import("shared_module_code.dd"));
>
> That's one way to do it.

The glorious return of include files!
Feb 02 2009
Walter Bright wrote:
> bearophile wrote:
>> module sse;
>> mixin(import("shared_module_code.dd"));
>
> That's one way to do it.

I must be missing something - why isn't

    import shared_module_code;

good?

Andrei
Feb 02 2009
Andrei Alexandrescu wrote:
> I must be missing something - why isn't
>
> import shared_module_code;
>
> good?

Because importing something does not change how it was compiled. If you have one module that you want two separate instances of, compiled with different switches, they have to be somehow given different names.
Feb 02 2009
Hello Andrei,

> bearophile wrote:
>> module sse;
>> mixin(import("shared_module_code.dd"));
>
> I must be missing something - why isn't import shared_module_code; good?
>
> Andrei

The code generator needs to be run on the code more than once.
Feb 02 2009
Andrei Alexandrescu wrote:
> I must be missing something - why isn't
>
> import shared_module_code;
>
> good?

The shared code has to be compiled with two sets of compiler switches, resulting in two distinct modules with different ModuleInfo, TypeInfo, and so forth. You can't do that with import.
Feb 02 2009
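The naming point made in the replies above can be illustrated in one file. This is a rough sketch, not the setup from the thread: `sharedSource` is a hypothetical stand-in for the text of shared_module_code.dd, and the two structs play the role of the sse and nosse wrapper modules. Because everything sits in one file, both copies are compiled with the same switches, so only the "two separately named instances" part is shown. In the real multi-file setup each wrapper module would be compiled on its own, and `import("...")` expressions additionally need the compiler's `-J` switch to name the directory holding the shared file.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the text of shared_module_code.dd.
enum sharedSource = q{
    static double foo(double x) { return x * x + 1.0; }
};

// Mixing the same text into two scopes creates two separate declarations,
// much as the sse and nosse wrapper modules would.
struct Sse   { mixin(sharedSource); }
struct NoSse { mixin(sharedSource); }

// Different mangled names mean the linker sees two different functions.
static assert(Sse.foo.mangleof != NoSse.foo.mangleof);

void main()
{
    writeln(Sse.foo.mangleof);
    writeln(NoSse.foo.mangleof);
    writeln(Sse.foo(2.0) == NoSse.foo(2.0)); // same source, same result
}
```

A plain `import shared_module_code;` would instead refer to a single module with a single set of symbols, however many times it is imported.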
Reply to bearophile,

> I presume that usually the D code in the sse and nosse modules is the same, it's just compiled in two different ways [...]
>
> module sse;
> mixin(import("shared_module_code.dd"));

My first thought would be to play games with the linker:

define a function EnterA() that calls code
define a function EnterB() that calls code
compile needed code for CPU A to A.obj
compile needed code for CPU B to B.obj
make a lib with EnterA and A.obj forcing internal linking
make a lib with EnterB and B.obj forcing internal linking
link common code and both libs

Making the libs becomes the fun part.
Feb 02 2009
On Tue, 03 Feb 2009 00:31:17 +1300, bearophile <bearophileHUGS lycos.com> wrote:
> I have seen that the LDC compiler is much more efficient if you use SSE(2) extensions [...]

Is this the sort of thing you are looking for:
http://www.songho.ca/misc/sse/sse.html
Feb 02 2009