D.gnu - abi specs, multiple linkages, binary symbol information
- Jakob Praher (58/58) Oct 18 2004 hi David,
- David Friedman (36/114) Oct 19 2004 There is no formal spec, so your best bet is to check out mangle.c and
- Jakob Praher (77/155) Oct 19 2004 ok. will look into that.
- David Friedman (5/179) Oct 20 2004 Have to use '~' for concat on a D forum ;)
hi David, hi all,

I like the D language. Since I also play with gcj (the static gcc java compiler), which has a new ABI (in addition to the C++ linkage), I was wondering about the default D ABI:

* how classes/modules/functions/methods are mangled
* which type codes exist
* is there a way to describe any type using a type code (which is probably needed for method overloading)
* is there a way to specify versioning in the ABI
* since D has its own linkage (as opposed to C++ linkage), I would appreciate a less-is-more approach and a more stable ABI like that of C++

Yes, I looked at DMD, but I thought it would be nice to know whether there is a written spec (the language reference is quite quiet about that).

What would be interesting is to support many different types of ABIs/linkages. This could be done by "helping" the compiler to understand the ABI that one is using.

And: as the language is specified today, is there a way to do load-time linking?

I would be interested in linking GCJ shared objects against D in a very native form, so that one could use, for instance, the many java libs already developed.

for instance

    gcj import org.apache.xalan...TransformerImpl;
    gcj import java.lang.String;

    int main( char[][] args )
    {
        TransformerImpl impl = new TransformerImpl( );
        ....
    }
    ....

Plus: in order to export a D class so that it can be linked with GCJ, one clearly needs more meta information, either exposed in the object file or distilled from the D sources. I'd favor the first approach, which would be interesting, since one could link against a D object file without needing the corresponding D source code.

The metadata approach used by GCJ is very straightforward:

* There is a UTF8 table
* There is a Method table for each method of a class
* There are some other tables for Class Descriptors ...
* The method table also contains all the referenced methods (not only the ones defined)
* There is a Class table for each class (which contains links to the other tables)
  - vtable (the class's methods)

  these tables are used for the java binary compatibility stuff:
  - otable (offset table for objects referenced by an offset)
  - atable (address table for objects referenced via address)
  - itable (interface table)

Surely the simplicity of the java type system allows for a simple implementation of that. D would need some more meta information (modules, functions, custom types, ...). But it would be an interesting task, since binary compatibility in D would then be more stable. And the interoperability between the gcj project and D could also be interesting.

looking forward to some discussions

-- Jakob
Oct 18 2004
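To make the otable idea from the post above a bit more concrete, here is a minimal sketch in D. It is not GCJ's actual data layout: `SymbolRef`, `ClassLayout` and `resolveOtable` are hypothetical names, and the offsets are made up. The only point is that the client stores symbolic references and the loader fills in offsets, so a changed class layout does not require recompiling the client.

```d
import std.stdio : writeln;

/// Hypothetical symbolic reference, as emitted for a field the client uses.
struct SymbolRef
{
    string className;
    string memberName;
}

/// Hypothetical layout description that is only known once a library is loaded.
struct ClassLayout
{
    size_t[string] fieldOffsets;
}

/// Fill an offset table: one resolved offset per symbolic reference.
size_t[] resolveOtable(SymbolRef[] refs, ClassLayout[string] loaded)
{
    size_t[] otable;
    foreach (r; refs)
        otable ~= loaded[r.className].fieldOffsets[r.memberName];
    return otable;
}

void main()
{
    // The client was compiled against names only, no offsets are baked in.
    auto refs = [SymbolRef("Point", "x"), SymbolRef("Point", "y")];

    // Library version 1 (made-up offsets).
    ClassLayout v1;
    v1.fieldOffsets["x"] = 8;
    v1.fieldOffsets["y"] = 12;

    // Library version 2 adds a field in front, so x and y move.
    ClassLayout v2;
    v2.fieldOffsets["x"] = 16;
    v2.fieldOffsets["y"] = 20;

    writeln(resolveOtable(refs, ["Point": v1])); // [8, 12]
    writeln(resolveOtable(refs, ["Point": v2])); // [16, 20]
}
```

The cost is one extra indirection per access, which is why making the tables optional matters for performance-sensitive code.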
Jakob Praher wrote:
> hi David, hi all,
>
> I like the D language. Since I also play with gcj (the static gcc java compiler), which has a new ABI (in addition to the C++ linkage), I was wondering about the default D ABI:
>
> * how classes/modules/functions/methods are mangled
> * which type codes exist
> * is there a way to describe any type using a type code (which is probably needed for method overloading)

There is no formal spec, so your best bet is to check out mangle.c and mtype.c in the front-end source. Basically, functions and variables have the form:

    "_D" ~ <namespace mangle> ~ <type mangle>

<namespace> is formed from the package, module, aggregates, etc. down to the declaration's identifier. The following would have the same namespace mangle:

    module a;
    class b { class c { int i; } }

    module a.b;
    class c { idouble i; }

<type> encodes the type of the declaration and may contain more namespace mangling if it involves classes, etc.

> * is there a way to specify versioning in the ABI

I don't think there is a way to do this now.

> * since D has its own linkage (as opposed to C++ linkage), I would appreciate a less-is-more approach and a more stable ABI like that of C++
>
> Yes, I looked at DMD, but I thought it would be nice to know whether there is a written spec (the language reference is quite quiet about that).
>
> What would be interesting is to support many different types of ABIs/linkages. This could be done by "helping" the compiler to understand the ABI that one is using.
>
> And: as the language is specified today, is there a way to do load-time linking?

Unix-style shared libraries are somewhat working now. There are still some initialization and linking issues. I can't really speak to the Windows platform.

> I would be interested in linking GCJ shared objects against D in a very native form, so that one could use, for instance, the many java libs already developed.
>
> for instance
>
>     gcj import org.apache.xalan...TransformerImpl;
>     gcj import java.lang.String;
>
>     int main( char[][] args )
>     {
>         TransformerImpl impl = new TransformerImpl( );
>         ....
>     }
>     ....
>
> Plus: in order to export a D class so that it can be linked with GCJ, one clearly needs more meta information, either exposed in the object file or distilled from the D sources. I'd favor the first approach, which would be interesting, since one could link against a D object file without needing the corresponding D source code.

I have been thinking of doing something along these lines (with Objective C!)

In order to directly use another object ABI, it would be necessary to introduce a new basic type into D. The capabilities of D and Java objects are similar, but they are not binary compatible. Consider:

    Object o = someJavaObject;
    o.toString(); // Java String (another Object) or D char[] (a two-element struct) ?

If it was important to have Java objects pose as D objects (and vice versa), it would be necessary to use wrappers and/or glue code. I think this could be mostly transparent. There are still more issues, like synchronization and garbage collection, that would need to be worked out.

An alternative method would be to implement D completely with the GCJ ABI. In this case, however, the ABI would have to be extended to D types like dynamic arrays and structs.

> The metadata approach used by GCJ is very straightforward:
>
> * There is a UTF8 table
> * There is a Method table for each method of a class
> * There are some other tables for Class Descriptors ...
> * The method table also contains all the referenced methods (not only the ones defined)
> * There is a Class table for each class (which contains links to the other tables)
>   - vtable (the class's methods)
>
>   these tables are used for the java binary compatibility stuff:
>   - otable (offset table for objects referenced by an offset)
>   - atable (address table for objects referenced via address)
>   - itable (interface table)
>
> Surely the simplicity of the java type system allows for a simple implementation of that. D would need some more meta information (modules, functions, custom types, ...). But it would be an interesting task, since binary compatibility in D would then be more stable. And the interoperability between the gcj project and D could also be interesting.

It really would be nice to have this kind of binary compatibility as D DLLs/shared libraries become more widespread. The nice thing about it is that using the tables could be optional if you wanted to maximize performance.

David

> looking forward to some discussions
>
> -- Jakob
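The "_D" ~ namespace ~ type form described in this reply can be sketched in a few lines of D. This is only an illustration, not the code in mangle.c: `namespaceMangle` and `mangleDecl` are hypothetical helpers, the length-prefixed identifiers follow the scheme in today's D ABI documentation, and the type codes `i` (int) and `p` (idouble) are taken from that documentation as well, so treat all of it as an assumption about the 2004 front end.

```d
import std.array : split;
import std.conv : to;
import std.stdio : writeln;

/// Hypothetical helper: prefix each identifier of a dotted path with its length.
string namespaceMangle(string qualifiedName)
{
    string result;
    foreach (part; qualifiedName.split("."))
        result ~= part.length.to!string ~ part;
    return result;
}

/// Hypothetical helper: "_D" ~ <namespace mangle> ~ <type mangle>.
string mangleDecl(string qualifiedName, string typeCode)
{
    return "_D" ~ namespaceMangle(qualifiedName) ~ typeCode;
}

void main()
{
    // Both declarations from the example are reachable as a.b.c.i,
    // so the namespace part is identical and only the type code differs.
    writeln(mangleDecl("a.b.c.i", "i")); // _D1a1b1c1ii
    writeln(mangleDecl("a.b.c.i", "p")); // _D1a1b1c1ip
}
```

Note that the namespace part alone cannot tell a nested class in module a apart from a class in module a.b.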
Oct 19 2004
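The toString() question in the reply above comes down to representation: a D char[] is a (length, pointer) pair, while a Java String is a reference to an object holding UTF-16 code units. Below is a rough sketch of the kind of glue code that would be needed. `JavaStringStub` and `toDChars` are hypothetical stand-ins, nothing here talks to a real GCJ runtime.

```d
import std.stdio : writeln;
import std.utf : toUTF8;

// A D dynamic array is a two-word value (length and pointer), not an object reference.
static assert((char[]).sizeof == 2 * size_t.sizeof);

/// Hypothetical stand-in for a Java-side string object (UTF-16 code units).
class JavaStringStub
{
    private wstring units;

    this(wstring units) { this.units = units; }

    wstring codeUnits() { return units; }
}

/// Hypothetical glue: copy the foreign object's contents into a native D char[].
char[] toDChars(JavaStringStub s)
{
    // Transcode UTF-16 to UTF-8, then take a mutable copy.
    return toUTF8(s.codeUnits()).dup;
}

void main()
{
    auto foreign = new JavaStringStub("hello"w);
    char[] native = toDChars(foreign);
    writeln(native);        // hello
    writeln(native.length); // 5
}
```

A transparent wrapper would have to insert a conversion like this at every call boundary, which is the cost being weighed against extending the GCJ ABI to cover D arrays and structs.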
David Friedman wrote:
> Jakob Praher wrote:
>> hi David, hi all,
>>
>> I like the D language. Since I also play with gcj (the static gcc java compiler), which has a new ABI (in addition to the C++ linkage), I was wondering about the default D ABI:
>>
>> * how classes/modules/functions/methods are mangled
>> * which type codes exist
>> * is there a way to describe any type using a type code (which is probably needed for method overloading)
>
> There is no formal spec, so your best bet is to check out mangle.c and mtype.c in the front-end source. Basically, functions and variables have the form:
>
>     "_D" ~ <namespace mangle> ~ <type mangle>

~ is a concatenation, right?

> <namespace> is formed from the package, module, aggregates, etc. down to the declaration's identifier. The following would have the same namespace mangle:
>
>     module a;
>     class b { class c { int i; } }
>
>     module a.b;
>     class c { idouble i; }
>
> <type> encodes the type of the declaration and may contain more namespace mangling if it involves classes, etc.

ok. will look into that. so you have

* packages
* modules
* classes

what is the difference between a package and a module? I have heard that modules can have initializers? Are they somewhat like static classes? are packages ever used now?

>> * is there a way to specify versioning in the ABI
>
> I don't think there is a way to do this now.

hmm. this is probably not that easy. but on the other hand one could do

>> * since D has its own linkage (as opposed to C++ linkage), I would appreciate a less-is-more approach and a more stable ABI like that of C++
>>
>> Yes, I looked at DMD, but I thought it would be nice to know whether there is a written spec (the language reference is quite quiet about that).
>>
>> What would be interesting is to support many different types of ABIs/linkages. This could be done by "helping" the compiler to understand the ABI that one is using.
>>
>> And: as the language is specified today, is there a way to do load-time linking?
>
> Unix-style shared libraries are somewhat working now. There are still some initialization and linking issues. I can't really speak to the Windows platform.

will look at the compiler for that. but great work done so far!

> I have been thinking of doing something along these lines (with Objective C!)
>
> In order to directly use another object ABI, it would be necessary to introduce a new basic type into D. The capabilities of D and Java objects are similar, but they are not binary compatible. Consider:
>
>     Object o = someJavaObject;
>     o.toString(); // Java String (another Object) or D char[] (a two-element struct) ?
>
> If it was important to have Java objects pose as D objects (and vice versa), it would be necessary to use wrappers and/or glue code. I think this could be mostly transparent. There are still more issues, like synchronization and garbage collection, that would need to be worked out.
>
> An alternative method would be to implement D completely with the GCJ ABI. In this case, however, the ABI would have to be extended to D types like dynamic arrays and structs.

What I'll probably do in my (spare) time over the next while is to define a concrete spec for the requirements of these table linkages, and then perhaps we can settle on a mangling structure that is java-compatible but also satisfies the D requirements.

Objective C also has a type of mangling, and structures are mangled using {<members>}, which could be a way to define them:

    ...
    V                // void
    I                // int
    ...
    L...;            // java class
    [<type>;         // java array
    +----------------+
    {<type><type>;   // structure
    *<type>;         // pointer

or something like that ...

value types have to be mangled too, perhaps using the structure information. this would make {I} a value type which is just an int32, and {*I;} a struct of a pointer to int32, etc.

we would of course need a way to introduce new built-in types, which could be done using the _; stuff, for instance _uint16; would mean unsigned int 16 or something like that.

I have also looked into the .net stuff and found out that they use non-symbolic type information, but whether that's better is a question of taste ...

What would also be great is a pointer-free representation of all the exported meta data of a d compilation unit, for instance like a constant pool of items:

    Item = { byte type, int size }

    DCompilationUnit = {
        byte type;
        int size;
        int majorVersion
        int minorVersion
        int PackageRef
    }

    PackageDesc = {
        byte type;
        int size;
        int moduleCount
        int ModuleRefId
        ..
    }

    ModuleDesc = {
        byte type;
        int size;
        int classCount
        int varCount
    }

    ClassDesc = {
        byte type;
        int size;
    }
    ...
    ...

this could be placed in a special section of the relocatable object file, for instance a .metadata section or something like that, which can be loaded read-only (in the .text section and can be mmapped directly).

with that information we could have an extraction tool that reads this information and passes it to the compiler ...

Although I have a bit of ELF and linking knowledge, I am by no means an expert. So if anyone has better ideas for laying out this stuff in object files, please let me know.

Jakob
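Here is a small D sketch of an encoder for the type codes proposed above, just to see how the pieces compose. It assumes one reading of the notation: classes are `L<name>;`, arrays `[<type>;`, pointers `*<type>;`, and structures are closed with `}` as in the `{I}` and `{*I;}` examples. `Kind`, `Type` and `encode` are hypothetical names, and the class name is passed through unchanged.

```d
import std.stdio : writeln;

enum Kind { void_, int_, class_, array, pointer, struct_ }

/// Hypothetical type tree: just enough structure to drive the encoder.
struct Type
{
    Kind kind;
    string name;     // class name, used only for Kind.class_
    Type[] children; // element or pointee type, or the struct members
}

/// Encode a type with the proposed codes (assumed reading of the notation).
string encode(Type t)
{
    final switch (t.kind)
    {
        case Kind.void_:   return "V";
        case Kind.int_:    return "I";
        case Kind.class_:  return "L" ~ t.name ~ ";";
        case Kind.array:   return "[" ~ encode(t.children[0]) ~ ";";
        case Kind.pointer: return "*" ~ encode(t.children[0]) ~ ";";
        case Kind.struct_:
            string s = "{";
            foreach (member; t.children)
                s ~= encode(member);
            return s ~ "}";
    }
}

void main()
{
    auto intType = Type(Kind.int_);
    auto intPtr = Type(Kind.pointer, null, [intType]);

    writeln(encode(Type(Kind.struct_, null, [intType])));   // {I}
    writeln(encode(Type(Kind.struct_, null, [intPtr])));    // {*I;}
    writeln(encode(Type(Kind.array, null, [intType])));     // [I;
    writeln(encode(Type(Kind.class_, "java/lang/String"))); // Ljava/lang/String;
}
```

Because every composite code is delimited, a decoder can parse the string recursively without a separate length field.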
Oct 19 2004
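The pointer-free constant pool from the post above could look roughly like this in D. This is a sketch under assumptions: the numeric `ItemType` values are invented, `size` is taken to mean the payload size in bytes excluding the 5-byte header, and integers are stored little-endian. `appendItem` and `listItems` are hypothetical helpers.

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;
import std.stdio : writeln;

/// Hypothetical item tags, the numeric values are made up.
enum ItemType : ubyte { compilationUnit = 1, packageDesc = 2, moduleDesc = 3, classDesc = 4 }

/// Append one item: a type byte, a 32-bit payload size, then the int fields.
void appendItem(ref ubyte[] pool, ItemType type, int[] fields)
{
    pool ~= type;
    ubyte[4] size = nativeToLittleEndian(cast(int)(fields.length * int.sizeof));
    pool ~= size[];
    foreach (f; fields)
    {
        ubyte[4] raw = nativeToLittleEndian(f);
        pool ~= raw[];
    }
}

/// Walk the pool using only the headers: each size says how far to skip.
ItemType[] listItems(ubyte[] pool)
{
    ItemType[] found;
    size_t pos = 0;
    while (pos < pool.length)
    {
        found ~= cast(ItemType) pool[pos];
        ubyte[4] raw = pool[pos + 1 .. pos + 5];
        pos += 5 + littleEndianToNative!int(raw);
    }
    return found;
}

void main()
{
    ubyte[] pool;
    appendItem(pool, ItemType.compilationUnit, [1, 0, 0]); // majorVersion, minorVersion, PackageRef
    appendItem(pool, ItemType.moduleDesc, [2, 5]);         // classCount, varCount

    writeln(pool.length);     // 30
    writeln(listItems(pool)); // [compilationUnit, moduleDesc]
}
```

Since items refer to each other by index rather than by address, a blob like this needs no relocations and could be mapped read-only as the post suggests.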
Jakob Praher wrote:
> David Friedman wrote:
>> Jakob Praher wrote:
[snip]
>> There is no formal spec, so your best bet is to check out mangle.c and mtype.c in the front-end source. Basically, functions and variables have the form:
>>
>>     "_D" ~ <namespace mangle> ~ <type mangle>
>
> ~ is a concatenation, right?

Have to use '~' for concat on a D forum ;)

[snip]
> ok. will look into that. so you have
>
> * packages
> * modules
> * classes
>
> what is the difference between a package and a module? I have heard that modules can have initializers? Are they somewhat like static classes? are packages ever used now?

Packages are just the names/directories containing modules (e.g. std.c)

[snip]
Oct 20 2004
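On the package/module question and the module initializers asked about earlier in the thread, here is a minimal sketch assuming a current D compiler. The package name `demo` and module name `settings` are hypothetical. A module normally corresponds to one source file and a package to the directory that contains it; `static this()` is D's module constructor.

```d
module demo.settings; // hypothetical: package "demo", module "settings"

import std.stdio : writeln;

int[string] defaults;

// Module constructor: runs once at program start, before main().
static this()
{
    defaults["verbosity"] = 1;
}

void main()
{
    writeln(defaults["verbosity"]); // 1
}
```

So a module behaves a little like a static class with an initializer, while the package itself holds no code and only contributes the leading part of the qualified name.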