D.gnu - Object file questions
- Timo Sintonen (12/12) Aug 14 2014 I have been looking at object files to see if I can reduce the
- Johannes Pfau (7/23) Aug 14 2014 Strange, could you post a testcase?
- Timo Sintonen (8/34) Aug 14 2014 It seems this comes from libdruntime and it exists in object.o
- Artur Skawina via D.gnu (15/27) Aug 14 2014 diff --git a/libphobos/libdruntime/gcc/atomics.d b/libphobos/libdruntime...
- Johannes Pfau (22/64) Aug 14 2014 If you're referring to this:
- Timo Sintonen (11/80) Aug 16 2014 Looks good. Template code is gone and init blocks have moved to
- ketmar via D.gnu (8/9) Aug 16 2014 maybe this will work:
- Johannes Pfau (11/102) Aug 16 2014 Iain recently pushed a commit to put zero initializers into bss, so
- Timo Sintonen (35/53) Aug 16 2014 It is true that bss does not take place in the executable. But in
- Johannes Pfau (4/29) Aug 16 2014 I just had a look at this and ClassInfo has a mutable 'monitor' field,
- Mike (7/10) Aug 16 2014 This was discussed at DConf 2014.
- Johannes Pfau (9/24) Aug 17 2014 Great! But I think this pull request addresses a different monitor
- Mike (5/17) Aug 17 2014 I looked through the source code, and couldn't find any such
- Johannes Pfau (14/35) Aug 17 2014 In gcc/d/d-objfile.cc: Search for
- Artur Skawina via D.gnu (29/32) Aug 16 2014 [Only noticed this accidentally; using a mailing list
- Mike (10/11) Aug 16 2014 It may be allowed, but it probably shouldn't be. Always-inlining
- Artur Skawina via D.gnu (13/20) Aug 16 2014 Address-of should work -- disallowing it wouldn't help much, but would
- Johannes Pfau (5/34) Aug 17 2014 We can make this explicit. I don't care enough to argue about that.
- Artur Skawina via D.gnu (4/12) Aug 17 2014 *I* haven't encountered any problems and have been using functions+
- Johannes Pfau (5/19) Aug 17 2014 Then I don't understand your statement at all. You said 'instead of
- Artur Skawina via D.gnu (17/38) Aug 17 2014 I don't know - it wasn't me who proposed:
- Mike (4/16) Aug 17 2014 Do you mean the problems with --gc-sections breaking code?
- Timo Sintonen (7/43) Aug 16 2014 This seems to work.
- Artur Skawina via D.gnu (41/45) Aug 16 2014 version (GNU) {
- Artur Skawina via D.gnu (50/53) Aug 16 2014 I did not like that required dereference in the previous version,
- Timo Sintonen (18/73) Aug 17 2014 This seems to work. With inlining the code is quite compact.
- Johannes Pfau (12/100) Aug 17 2014 You mean __builtin_volatile_load/store? I'm not sure if compiler
- Timo Sintonen (26/27) Aug 17 2014 This does not work with member functions
- Artur Skawina via D.gnu (47/83) Aug 17 2014 It works for me:
- Timo Sintonen (11/62) Aug 17 2014 I am compiling for arm and I am sorry I misinterpreted the
- Artur Skawina via D.gnu (6/8) Aug 17 2014 Does declaring it as:
- Timo Sintonen (11/26) Aug 17 2014 Yes, now it works.
- Johannes Pfau (5/38) Aug 17 2014 r3 is an argument/scratch register, the callee can't rely on its
- Johannes Pfau (3/4) Aug 17 2014 caller of course ;-)
- Timo Sintonen (2/42) Aug 17 2014 So is this a bug or just undefined behavior?
- Timo Sintonen (28/45) Aug 18 2014 I have had some weird bugs lately and then I looked my other
- Artur Skawina via D.gnu (11/17) Aug 17 2014 Ensuring ordering, w/o it the compiler could reorder operations
- Johannes Pfau (8/44) Aug 17 2014 That's a good start. Can you also get unary operators working?
- Artur Skawina via D.gnu (93/99) Aug 17 2014 Unary ops are easy. If you mean post-inc and post-dec -- that's a
- Johannes Pfau (10/12) Aug 17 2014 It's perfect for structs, but when simply declaring a Volatile!uint the
- Artur Skawina via D.gnu (29/46) Aug 17 2014 Another D-problem - the language doesn't have /real/ refs. But...
I have been looking at object files to see if I can reduce the memory usage for minimum systems. There are two things I have noticed:

1. In the data segment there is some source code as ASCII text from a template in gcc/atomics.d. This is in the actual data segment and not in debug info segments, and it goes into the data segment of the executable. I do not see any code using this data. Why is this in the executable and is it possible to remove it?

2. In the data segment there is also __init for all types. I assume that they contain the initial values that are copied when a new object of this type is created. Is this data mutable and should it really be in the data segment and not in rodata?
Aug 14 2014
On Thu, 14 Aug 2014 10:07:04 +0000, "Timo Sintonen" <t.sintonen luukku.com> wrote:
> 1. In the data segment there is some source code as ascii text from a template in gcc/atomics.d . [...] Why is this in the executable and is it possible to remove it?

Strange, could you post a testcase?

> 2. In the data segment there is also __init for all types. I assume that they contain the initial values that are copied when a new object of this type is created.

Correct, it's for '.init' (in particular there is __..._TypeInfo_init, which is the initializer for typeinfo. I've implemented -fno-rtti in a private git branch to get rid of typeinfo).

> Is this data mutable and should it really be in data segment and not in rodata?

I think it should be in rodata.
Aug 14 2014
On Thursday, 14 August 2014 at 17:13:23 UTC, Johannes Pfau wrote:
> Strange, could you post a testcase?

It seems this comes from libdruntime, and it exists in object.o and core/atomic.o. The testcase is to compile the minlibd library as it currently is in the repo, using the makefile as is. But I think it will be in any object file that imports gcc.atomics and uses the template in there.

> I think it should be in rodata.

So it is not a bug and not a feature. It is just because it does not matter? Maybe a feature request?
Aug 14 2014
On 08/14/14 19:53, Timo Sintonen via D.gnu wrote:
> It seems this comes from libdruntime and it exists in object.o and core/atomic.o [...] But I think it will be in any object file that imports gcc.atomics and uses the template in there.

diff --git a/libphobos/libdruntime/gcc/atomics.d b/libphobos/libdruntime/gcc/atomics.d
index 78e644191e8f..ee1a146b680e 100644
--- a/libphobos/libdruntime/gcc/atomics.d
+++ b/libphobos/libdruntime/gcc/atomics.d
@@ -28,7 +28,7 @@ import gcc.builtins;
  */
 private template __sync_op_and(string op1, string op2)
 {
-    const __sync_op_and = `
+    enum __sync_op_and = `
 T __sync_` ~ op1 ~ `_and_` ~ op2 ~ `(T)(const ref shared T ptr, T value)
 {
     static if (T.sizeof == byte.sizeof)

artur
Aug 14 2014
On Thu, 14 Aug 2014 17:53:32 +0000, "Timo Sintonen" <t.sintonen luukku.com> wrote:
> It seems this comes from libdruntime and it exists in object.o and core/atomic.o [...] But I think it will be in any object file that imports gcc.atomics and uses the template in there.

If you're referring to this: http://dpaste.dzfl.pl/fe75e8c7dfca

This seems to be the const variable in __sync_op_and. Try to change the code to "immutable __sync_op_and = " or "enum __sync_op_and = " and file a bug report.

> So it is not a bug and not a feature. It is just because it does not matter? Maybe a feature request?

Seems to happen only for the TypeInfo init symbols. I can't run the testsuite right now, but try this:

diff --git a/gcc/d/d-decls.cc b/gcc/d/d-decls.cc
index bd6f5f9..45d433a 100644
--- a/gcc/d/d-decls.cc
+++ b/gcc/d/d-decls.cc
@@ -274,6 +274,8 @@ TypeInfoDeclaration::toSymbol (void)
       // given TypeInfo.  It is the actual data, not a reference
       gcc_assert (TREE_CODE (TREE_TYPE (csym->Stree)) == REFERENCE_TYPE);
       TREE_TYPE (csym->Stree) = TREE_TYPE (TREE_TYPE (csym->Stree));
+      TREE_CONSTANT (csym->Stree) = true;
+      TREE_READONLY (csym->Stree) = true;
       relayout_decl (csym->Stree);
       TREE_USED (csym->Stree) = 1;
Aug 14 2014
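The const-versus-enum suggestion can be illustrated with a small sketch. The names makeAdder and add2 are hypothetical and only loosely modelled on __sync_op_and. The assumption is that an enum string is a manifest constant that exists only at compile time and has no address, so the compiler has no reason to emit the source text as data, whereas the const declaration in the thread apparently did end up in the data segment.

```d
// Hypothetical helper, loosely modelled on the __sync_op_and template
// discussed in the thread; the names are illustrative only.
private template makeAdder(string name)
{
    // `enum` declares a manifest constant: it is a compile-time value
    // with no storage of its own, unlike a `const` variable.
    enum makeAdder = `int ` ~ name ~ `(int a, int b) { return a + b; }`;
}

// The generated source text is compiled in through a string mixin,
// which declares the function add2 at module scope.
mixin(makeAdder!"add2");
```

Whether a particular compiler version still emits the string for a const declaration is something to check in the object file itself, for example with objdump or nm.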
On Thursday, 14 August 2014 at 19:05:46 UTC, Johannes Pfau wrote:
> Seems to happen only for the TypeInfo init symbols. I can't run the testsuite right now, but try this: diff --git a/gcc/d/d-decls.cc b/gcc/d/d-decls.cc [...]

Looks good. Template code is gone and init blocks have moved to rodata. My simple test program compiles and runs.

There is still some __Class in the data segment and init values for structs and arrays in the bss segment. Is it possible to move these to rodata too?

In my application there will be several large structs. I never create anything of these types. Instead I use them to point to hardware registers and maybe on top of existing byte arrays like message buffers. There will still be initial values for these structs wasting memory. Is there any way to omit them?
Aug 16 2014
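The usage pattern described in this post (a struct type that is never instantiated and only used as a view onto existing memory) might look roughly like the sketch below. The register layout UartRegs and the helper overlay are hypothetical; a real target would cast a fixed hardware address instead of the address of a buffer.

```d
// Hypothetical register layout; field names and sizes are illustrative.
struct UartRegs
{
    uint data;
    uint status;
    uint control;
}

// View an existing block of memory through the struct without ever
// constructing a UartRegs value, so its .init data is never needed.
UartRegs* overlay(void* base)
{
    return cast(UartRegs*) base;
}
```

Because no UartRegs value is ever created, the initializer emitted for the type is dead weight in such a program, which is the motivation for the question.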
On Sat, 16 Aug 2014 07:06:34 +0000 "Timo Sintonen via D.gnu" <d.gnu puremagic.com> wrote:
> structs wasting memory. Is there any way to omit them?

maybe this will work:

   struct A {
     int n = void;
     uint[2] a = void;
     ...and so on for all fields
   }
Aug 16 2014
On Sat, 16 Aug 2014 07:06:34 +0000, "Timo Sintonen" <t.sintonen luukku.com> wrote:
> There is still some __Class in data segment and init values for structs and arrays in bss segment. Is it possible to move these to rodata too?

Iain recently pushed a commit to put zero initializers into bss, so that's intentional: http://bugzilla.gdcproject.org/show_bug.cgi?id=139

But I understand your point that it should be in rodata instead; you'll have to discuss this with Iain.

Regarding __Class: Can you post a short example?

> In my application there will be several large structs. I never create anything of these types. [...] There will still be initial values for these structs wasting memory. Is there any way to omit them?

See https://github.com/D-Programming-GDC/GDC/pull/82 @attribute("noinit")
Aug 16 2014
On Saturday, 16 August 2014 at 07:36:07 UTC, Johannes Pfau wrote:
> Iain recently pushed a commit to put zero initializers into bss, so that's intentional: http://bugzilla.gdcproject.org/show_bug.cgi?id=139 But I understand your point that it should be in rodata instead, you'll have to discuss this with Iain.

It is true that bss does not take up space in the executable. But in small processors, even though there is nowadays plenty of ROM, there is not enough RAM. It is also a question of safety: in the long run, the data area may be corrupted by a buggy program or electrical disturbance, while rodata in ROM cannot be changed. At least in my setup, gold maps bss to the executable anyway while ld does not.

I noticed your comment in the bug report. I was just thinking the same: one big block of zeros that is used by all. Another thing I was thinking is that memset might be used for these types. Then there would be no block of zeros at all. But that would require an extra flag in typeinfo to separate these types from others...

> Regarding __Class: Can you post a short example?

Some lines from the mapfile. There seems to be one for every type in the program:

 .data          0x0000000020001074      0x720 minlibd/libdruntime/libdruntime.a(object_.o)
                0x0000000020001074                _D9Exception7__ClassZ
                0x00000000200010c0                _D8TypeInfo7__ClassZ
                0x000000002000110c                _D17TypeInfo_Function7__ClassZ
                0x0000000020001158                _D17TypeInfo_Delegate7__ClassZ
                0x00000000200011a4                _D14TypeInfo_Class7__ClassZ
                0x00000000200011f0                _D18TypeInfo_Interface7__ClassZ
                0x000000002000123c                _D15TypeInfo_Struct7__ClassZ
                0x0000000020001288                _D16TypeInfo_Typedef7__ClassZ

> See https://github.com/D-Programming-GDC/GDC/pull/82 @attribute("noinit")

Yes, this will solve the problem.
Aug 16 2014
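On the idea of an extra flag to tell all-zero types apart: recent druntime versions appear to expose something close to this already. TypeInfo.initializer() is documented to return an array with a null ptr and a length equal to the type size when the type should be initialized to all zeros. The sketch below assumes a druntime that provides initializer(); the struct names and the helper isZeroInitialized are hypothetical. It does not show whether a given compiler still emits a block of zeros for such types.

```d
// Hypothetical example types: one all-zero, one with a non-zero default.
struct AllZero { int a; uint[4] b; }
struct NonZero { int a = 5; }

// Returns true when the runtime type info reports an all-zero initializer,
// which is signalled by a null ptr in the returned array.
bool isZeroInitialized(T)()
{
    const(void)[] init = typeid(T).initializer();
    return init.ptr is null;
}
```

A runtime that sees a null ptr can clear the object with memset instead of copying from an initializer image, which is the approach suggested in the post.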
On Sat, 16 Aug 2014 08:39:04 +0000, "Timo Sintonen" <t.sintonen luukku.com> wrote:
> Some lines from mapfile. Seems to be one for every type in the program: .data 0x0000000020001074 0x720 minlibd/libdruntime/libdruntime.a(object_.o) 0x0000000020001074 _D9Exception7__ClassZ [...]

I just had a look at this and ClassInfo has a mutable 'monitor' field, so it can't be placed into read-only data.
Aug 16 2014
On Saturday, 16 August 2014 at 09:29:14 UTC, Johannes Pfau wrote:
> I just had a look at this and ClassInfo has a mutable 'monitor' field, so it can't be placed into read-only data.

This was discussed at DConf 2014: https://www.youtube.com/watch?v=TNvUIWFy02I#t=1008

There is currently a pull request to remove the monitor field from Object, and therefore from all classes: https://github.com/D-Programming-Language/druntime/pull/789.

Mike
Aug 16 2014
On Sat, 16 Aug 2014 10:36:19 +0000, "Mike" <none none.com> wrote:
> There is currently a pull request to remove the monitor from object field from object and therefore all classes: https://github.com/D-Programming-Language/druntime/pull/789.

Great! But I think this pull request addresses a different monitor problem: there's an implicit __monitor field in every class right now, which makes every class _instance_ bigger. The monitor in TypeInfo/ClassInfo is different: ClassInfo exists only once per class, no matter how many class instances you've got. AFAIR this monitor is there to support synchronized(ClassType), which synchronizes on the class type, not on an instance.
Aug 17 2014
On Sunday, 17 August 2014 at 08:26:40 UTC, Johannes Pfau wrote:
> But the monitor in TypeInfo/ClassInfo is different: ClassInfo exists only once per class, it doesn't matter how many class instances you've got. AFAIR this monitor is to support synchronize(ClassType) which synchronizes on the class type, not on an instance.

I looked through the source code, and couldn't find any such monitor. Can you please point it out for me?

Thanks,
Mike
Aug 17 2014
On Sun, 17 Aug 2014 10:44:34 +0000, "Mike" <none none.com> wrote:
> I looked through the source code, and couldn't find any such monitor. Can you please point it out for me?

In gcc/d/d-objfile.cc, search for:

   /* Put out the ClassInfo.
    * The layout is:
    *  void **vptr;
    *  monitor_t monitor;
    *  byte[] initializer;  // static initialisation data

Actually I just realized that this is also true for all TypeInfo, so I'll have to revert the commit which placed TypeInfo into .rodata.

(Thinking more about it, it's more or less the same monitor as the one referred to in the pull request: TypeInfo are classes and for every type there is one instance, so these instances have __monitor fields. But the implementation in the compiler is slightly different.)
Aug 17 2014
On 08/16/14 09:33, Johannes Pfau via D.gnu wrote:
> https://github.com/D-Programming-GDC/GDC/pull/82

[Only noticed this accidentally; using a mailing list instead of some web forum would increase visibility...]

> enum var = Volatile!(T,addr)(): doesn't allow |= on enum literals, even if the type implements opAssign as there's no this pointer

   T volatile_load(T)(ref T v) nothrow {
      asm { "" : "+m" v; }
      T res = v;
      asm { "" : "+g" res; }
      return res;
   }

   void volatile_store(T)(ref T v, const T a) nothrow {
      asm { "" : : "m" v; }
      v = a;
      asm { "" : "+m" v; }
   }

   struct Volatile(T, alias /* T* */ A) {
      void opOpAssign(string OP)(const T rhs) nothrow {
         auto v = volatile_load(*A);
         mixin("v " ~ OP ~ "= rhs;");
         volatile_store(*A, v);
      }
   }

   enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)();

   void main() {
      TimerB |= 0b1;
      TimerB += 1;
   }

> not emitting force-inlined functions is a logical optimization for forceinline (if a function is always inlined, there's no way to call it, so there's no need to output it).

Taking the address of an always_inline function is allowed.

artur
Aug 16 2014
On Saturday, 16 August 2014 at 09:59:03 UTC, Artur Skawina via D.gnu wrote:
> Taking the address of an always_inline function is allowed.

It may be allowed, but it probably shouldn't be. Always-inlining a function and taking the address of that function is contradictory.

But this situation demonstrates why having an intelligent linker is a better solution than decorating with attributes. The linker should know if you took an address of an always-inlined function or not and decide whether or not to remove it from the binary.

Mike
Aug 16 2014
On 08/16/14 12:41, Mike via D.gnu wrote:
> It may be allowed, but it probably shouldn't be. Always-inlining a function and taking the address of that function is contradictory.

Address-of should work -- disallowing it wouldn't help much, but would create problems for code that needs to call the function both directly and indirectly. This is actually a larger problem for D than for C (where it's allowed) because of generic code, templates and delegates. The alternative would be requiring trivial not-inline wrappers and compile failures if one is accidentally forgotten.

A `@nocode` attribute would be a good idea, yes, but there's no need to make it implicit for `@inline`.

> But this situation demonstrates why having an intelligent linker is a better solution than decorating with attributes. The linker should know if you took an address of an always-inlined function or not and decide whether or not to remove it from the binary.

It already does. Apparently there are some kind of problems with certain setups, but, instead of addressing those problems, more and more /language/ hacks are proposed...

artur
Aug 16 2014
On Sat, 16 Aug 2014 13:15:57 +0200, "Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:
> A `@nocode` attribute would be a good idea, yes, but there's no need to make it implicit for `@inline`.

We can make this explicit. I don't care enough to argue about that.

> It already does. Apparently there are some kind of problems with certain setups, but, instead of addressing those problems, more and more /language/ hacks are proposed...

So as you know all these problems and you know exactly how to fix them, where's your contribution?
Aug 17 2014
On 08/17/14 10:31, Johannes Pfau via D.gnu wrote:
> So as you know all these problems and you know exactly how to fix them, where's your contribution?

*I* haven't encountered any problems and have been using functions+data+gc-sections for years...

artur
Aug 17 2014
On Sun, 17 Aug 2014 13:38:36 +0200, "Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:
> *I* haven't encountered any problems and have been using functions+data+gc-sections for years...

Then I don't understand your statement at all. You said 'instead of addressing those problems' but there are no problems?

Also what exactly are 'more /language/ hacks'?
Aug 17 2014
On 08/17/14 13:57, Johannes Pfau via D.gnu wrote:
> Then I don't understand your statement at all. You said 'instead of addressing those problems' but there are no problems?

I don't know - it wasn't me who proposed:

- @attribute("noinit")
- @attribute("notypeinfo")
- @attribute("nocode")
- pragma(GNU_nomoduleinfo)

etc.

> Also what exactly are 'more /language/ hacks'?

The above, the volatile attribute, etc. Note that I agree (some of) those are necessary -- it's just that they are all useful for certain very specific cases -- they are not a general solution to the codegen bloat problem. A situation where practically every declaration and almost every scope in a D program needs to be annotated with compiler-specific non-portable annotations is not a good one. And not even a practical one -- it is not reasonable to expect everyone to modify the source of every used library (!) to match the requirements of every project (some may need RTTI, others may not want it at all, etc).

artur
Aug 17 2014
On Saturday, 16 August 2014 at 11:16:09 UTC, Artur Skawina via D.gnu wrote:
> It already does. Apparently there are some kind of problems with certain setups, but, instead of addressing those problems, more and more /language/ hacks are proposed...

Do you mean the problems with --gc-sections breaking code?

Mike
Aug 17 2014
On Saturday, 16 August 2014 at 09:59:03 UTC, Artur Skawina via D.gnu wrote:
> struct Volatile(T, alias /* T* */ A) { void opOpAssign(string OP)(const T rhs) nothrow { [...] } } enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)(); void main() { TimerB |= 0b1; TimerB += 1; }

This seems to work.

I am not so familiar with these opAssign things, so how can I do a basic assignment: TimerB = 0x1234 ?

How can I use this with struct members?

Is it possible to inline volatile_load and volatile_store?
Aug 16 2014
On 08/16/14 18:46, Timo Sintonen via D.gnu wrote:
> I am not so familiar with these opAssign things, so how I can do basic assignment: TimerB = 0x1234 ?
>
> Is it possible to inline volatile_load and volatile_store ?

    version (GNU) {
        static import gcc.attribute;
        enum inline = gcc.attribute.attribute("forceinline");
    }

    extern int volatile_dummy;

    @inline T volatile_load(T)(ref T v) nothrow {
        asm { "" : "+m" v, "+m" volatile_dummy; }
        T res = v;
        asm { "" : "+g" res, "+m" volatile_dummy; }
        return res;
    }

    @inline void volatile_store(T, A)(ref T v, A a) nothrow {
        asm { "" : "+m" volatile_dummy : "m" v; }
        v = a;
        asm { "" : "+m" v, "+m" volatile_dummy; }
    }

    static struct Volatile(T, alias PTR) {
        static: nothrow: @inline:
        void opOpAssign(string OP)(const T rhs) {
            auto v = volatile_load(*PTR);
            mixin("v " ~ OP ~ "= rhs;");
            volatile_store(*PTR, v);
        }
        void opAssign()(const T rhs) { volatile_store(*PTR, rhs); }
        T opUnary(string OP:"*")() { return volatile_load(*PTR); }
    }

    enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)();

    int main() {
        TimerB |= 0b1;
        TimerB += 1;
        TimerB = 42;
        return *TimerB;
    }

> How can I use this with struct members ?

One possibility would be to declare all members as `Volatile!...`, or even create such a struct at CT. Another solution would be something like http://forum.dlang.org/post/mailman.4237.1405540813.2907.digital ars-d puremagic.com .

artur
Aug 16 2014
On 08/16/14 20:40, Artur Skawina wrote:
>> How can I use this with struct members ?
>
> One possibility would be to declare all members as `Volatile!...`, or

I did not like that required dereference in the previous version, and tried a different approach:

    struct Timer
    {
        Volatile!uint control;
        Volatile!uint data;
    }

    enum timerA = cast(Timer*)0xDEADBEAF;

    int main() {
        timerA.control |= 0b1;
        timerA.control += 1;
        timerA.control = 42;
        int a = timerA.data - timerA.data;
        int b = timerA.control;
        return timerA.control;
    }

    version (GNU) {
        static import gcc.attribute;
        enum inline = gcc.attribute.attribute("forceinline");
    }

    extern int volatile_dummy;

    @inline T volatile_load(T)(ref T v) nothrow {
        asm { "" : "+m" v, "+m" volatile_dummy; }
        T res = v;
        asm { "" : "+g" res, "+m" v, "+m" volatile_dummy; }
        return res;
    }

    @inline void volatile_store(T, A)(ref T v, A a) nothrow {
        asm { "" : "+m" volatile_dummy : "m" v; }
        v = a;
        asm { "" : "+m" v, "+m" volatile_dummy; }
    }

    struct Volatile(T) {
        T raw;
        nothrow: @inline:
        @disable this(this);
        void opAssign(A)(A a) { volatile_store(raw, a); }
        T load() @property { return volatile_load(raw); }
        alias load this;
        void opOpAssign(string OP)(const T rhs) {
            auto v = volatile_load(raw);
            mixin("v " ~ OP ~ "= rhs;");
            volatile_store(raw, v);
        }
    }

artur
Aug 16 2014
On Saturday, 16 August 2014 at 20:01:06 UTC, Artur Skawina via D.gnu wrote:
> I did not like that required dereference in the previous version, and tried a different approach:
> [...]

This seems to work. With inlining the code is quite compact.

Not tested yet, but the code for these constructs looks correct:

    for (f=0;f<50;f++) { regs.txreg = somebuf[f]; }
    while (regs.status == 0) {}

What is the purpose of volatile_dummy? Even if it is not used, the address for it is calculated in several places.

The struct members are defined separately. This means the address of every member is stored and fetched separately. The compiler seems to remove some of these and use the pointer, but I am not sure what happens when the structs are bigger.

It seems all loads and stores access the real memory, like volatile should do. It is hard to follow the optimized code, so I am not yet sure that they have not been reordered in any way.

Anyway, this seems an acceptable solution to me.

Johannes, is this a good starting point for you, or is your work with compiler builtins giving us some more?
Aug 17 2014
On Sun, 17 Aug 2014 07:57:15 +0000, "Timo Sintonen" <t.sintonen luukku.com> wrote:
> Anyway, this seems acceptable solution to me.
>
> Johannes, is this good starting point to you or is your work with compiler builtins giving us some more?

You mean __builtin_volatile_load/store? I'm not sure if compiler barriers and these builtins are 100% equal; I think I managed to produce example code where the barriers didn't work 100% as expected. But these builtins will need to be introduced anyway as core.bitop.volatileLoad or whatever final name the DMD devs decide on.

Regarding nocode/typeinfo/noinit/GNU_nomoduleinfo: I think these are useful anyway. The linker can strip these out, but I don't want to rely on the linker and on the user to know all special linker flags only to avoid some binary bloat which can be avoided in the compiler.

But overall this approach looks fine.
Aug 17 2014
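The intrinsic-based route Johannes mentions can be sketched with the volatile load/store functions that druntime now ships. This is only an illustration under stated assumptions: it assumes a compiler whose druntime provides `core.volatile` (older releases exposed the same functions from `core.bitop`), it is not the `__builtin_volatile_load/store` work discussed above, and an ordinary variable stands in for a memory-mapped register so that the example can run on a desktop.

```d
import core.volatile : volatileLoad, volatileStore;

// Same shape as the Volatile(T) wrapper in this thread, but built on the
// druntime intrinsics instead of inline-asm barriers (assumption: T is one
// of ubyte, ushort, uint, ulong, the types the intrinsics accept).
struct Volatile(T)
{
    private T raw;
    @disable this(this);            // registers must not be copied

    T load() @property { return volatileLoad(&raw); }
    alias load this;                // reading the wrapper performs a volatile load

    void opAssign(T value) { volatileStore(&raw, value); }

    void opOpAssign(string op)(const T rhs)
    {
        T v = volatileLoad(&raw);   // read
        mixin("v " ~ op ~ "= rhs;"); // modify
        volatileStore(&raw, v);     // write back
    }
}

void main()
{
    Volatile!uint control;          // stand-in for a hardware register
    control = 42;
    control |= 0b1;                 // 42 | 1 == 43
    control += 1;                   // 44
    uint seen = control;            // goes through load()
    assert(seen == 44);
}
```

On real hardware the wrapper would be reached through a pointer cast from the register address, as in the `Timer` example earlier in the thread.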
On Sunday, 17 August 2014 at 07:57:15 UTC, Timo Sintonen wrote:
> This seems to work.

This does not work with member functions:

    struct uartreg {
        Volatile!int sr;
        Volatile!int dr;
        Volatile!int brr;
        Volatile!int cr1;
        Volatile!int cr2;
        Volatile!int cr3;
        Volatile!int gtpr;

        // send a byte to the uart
        void send(int t) {
            while ((sr&0x80)==0)
            { }
            dr=t;
        }
    }

In this function the fetch of sr is omitted, but the compare is still made against an invalid register value. Then the address of dr is omitted and the store is made from the wrong register to an invalid address. So the generated code is totally invalid.

If I move this function out of the struct then it is ok.

I use -O2; I have not tested what it would do without optimization.

Also if I have:

    cr1=cr2=0;

I get: expression this.cr2.opAssign(0) is void and has no value
Aug 17 2014
On 08/17/14 11:24, Timo Sintonen via D.gnu wrote:
> This does not work with member functions
> [...]
> In this function the fetch of sr is omitted but compare is still made against an invalid register value. Then address of dr is omitted and store is made from wrong register to invalid address. So the generated code is totally invalid.
> If I move this function out of the struct then it is ok.
> I use -O2, not tested what it woud do without optimization.

It works for me:

    import volat; // module w/ the last Volatile(T) implementation.

    struct uartreg {
        Volatile!int sr;
        Volatile!int dr;
        Volatile!int brr;
        Volatile!int cr1;
        Volatile!int cr2;
        Volatile!int cr3;
        Volatile!int gtpr;

        // send a byte to the uart
        void send(int t) {
            while ((sr&0x80)==0)
            { }
            dr=t;
        }
    }

    enum uart = cast(uartreg*)0xDEADBEAF;

    void main() {
        uart.send(42);
    }

=>

    0000000000403620 <_Dmain>:
      403620:  b8 af be ad de        mov    $0xdeadbeaf,%eax
      403625:  0f 1f 00              nopl   (%rax)
      403628:  b9 af be ad de        mov    $0xdeadbeaf,%ecx
      40362d:  8b 11                 mov    (%rcx),%edx
      40362f:  81 e2 80 00 00 00     and    $0x80,%edx
      403635:  74 f1                 je     403628 <_Dmain+0x8>
      403637:  bf b3 be ad de        mov    $0xdeadbeb3,%edi
      40363c:  31 c0                 xor    %eax,%eax
      40363e:  c7 07 2a 00 00 00     movl   $0x2a,(%rdi)
      403644:  c3                    retq

Except for some obviously missed optimizations (dead eax load, unnecessary ecx reload), the code seems fine. What platform are you using and what does the emitted code look like?

> Also if I have:
> cr1=cr2=0;
> I get: expression this.cr2.opAssign(0) is void and has no value

That's because the opAssign returns void, which prevents this kind of chaining. This was a deliberate choice, as I /wanted/ to disallow that; it's already a bad idea for normal assignments; for volatile ones, which can require a specific order, it's an even worse one. But it's trivial to "fix", just change

    void opAssign(A)(A a) { volatile_store(raw, a); }

to

    T opAssign(A)(A a) { volatile_store(raw, a); return a; }

artur
Aug 17 2014
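The chaining behaviour Artur describes does not depend on the volatile machinery, so it can be tried in isolation. The sketch below uses a hypothetical `Reg` wrapper with a plain field store in place of `volatile_store`; only the return type of `opAssign` matters here.

```d
// Hypothetical stand-in for Volatile!T: a plain store replaces volatile_store.
struct Reg(T)
{
    T raw;

    // Returning the stored value (rather than void) is what lets the result
    // of one assignment feed the next one in `a = b = 0`.
    T opAssign(T a) { raw = a; return a; }
}

void main()
{
    Reg!int cr1, cr2;

    // Evaluated right to left: cr2.opAssign(0) runs first and yields 0,
    // which is then passed to cr1.opAssign.
    cr1 = cr2 = 0;
    assert(cr1.raw == 0 && cr2.raw == 0);

    cr1 = cr2 = 5;
    assert(cr1.raw == 5 && cr2.raw == 5);
}
```

With a `void` return type the same `main` is rejected with the "is void and has no value" error quoted above, which is the effect Artur chose deliberately: the order of the two register writes is then always spelled out by the programmer.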
On Sunday, 17 August 2014 at 11:35:33 UTC, Artur Skawina via D.gnu wrote:
> It works for me:
> [...]
> Except for some obviously missed optimizations (dead eax load, unnecessary ecx reload), the code seems fine. What platform are you using and what does the emitted code look like?

I am compiling for arm, and I am sorry: I misinterpreted the optimized code. Actually the code is correct, but it still does not work.

The problem is that the call to get the tls pointer for volatile_dummy seems to corrupt the register (r3) where the this pointer is. The call is inside the while loop. After removing that call by hand in the assembly, everything works. R3 is usually pushed onto the stack when it is used in a function. I have to check what is wrong in this case.
Aug 17 2014
On 08/17/14 15:44, Timo Sintonen via D.gnu wrote:
> I am compiling for arm and I am sorry I misinterpreted the optimized code. Actually the code is correct but it still does not work.
> The problem is that the call to get the tls pointer for volatile_dummy seems to corrupt the register (r3) where the this pointer is. The call is inside the while loop. After removing tha call by hand in the assembly everything works. R3 is usually pushed into stack when it is used in a function. I have to check what is wrong in this case.

Does declaring it as:

    extern __gshared int volatile_dummy;

help?

artur
Aug 17 2014
On Sunday, 17 August 2014 at 13:59:03 UTC, Artur Skawina via D.gnu wrote:
> Does declaring it as:
>
>     extern __gshared int volatile_dummy;
>
> help?

Yes, now it works.

But the register corruption is still an issue. My tls function clearly uses r3 and does not save it.

Johannes, do you know the arm calling system? Is it caller or callee that should save r3?

In this case it is my function that has one function inlined that has another function inlined that contains a compiler generated function call. Could this be a bug in the compiler that it does not recognize the innermost call and does not save registers?
Aug 17 2014
On Sun, 17 Aug 2014 14:36:53 +0000, "Timo Sintonen" <t.sintonen luukku.com> wrote:
> But the register corruption is still an issue. My tls function clearly uses r3 and does not save it.
>
> Johannes, do you know the arm calling system? Is it caller or callee that should save r3?
>
> In this case it is my function that has one function inlined that has another function inlined that contains a compiler generated function call. Could this be a bug in the compiler that it does not recognize the innermost call and does not save registers?

r3 is an argument/scratch register, the callee can't rely on its contents after a function call. This could also be caused by the inline ASM.
Aug 17 2014
On Sun, 17 Aug 2014 16:45:15 +0200, Johannes Pfau <nospam example.com> wrote:
> the callee can't rely on its

caller of course ;-)
Aug 17 2014
On Sunday, 17 August 2014 at 14:47:57 UTC, Johannes Pfau wrote:
> r3 is an argument/scratch register, the callee can't rely on its contents after a function call. This could also be caused by the inline ASM.

So is this a bug or just undefined behavior?
Aug 17 2014
On Sunday, 17 August 2014 at 14:47:57 UTC, Johannes Pfau wrote:
> r3 is an argument/scratch register, the caller can't rely on its contents after a function call. This could also be caused by the inline ASM.

I have had some weird bugs lately, and then I looked at my other object files. I think there is a bug, because I found more like this.

This is a class function (actually a constructor) that writes constant values into two variables; one is a static class variable in tls and the other is an instance variable:

    27 0000 10B5      push {r4, lr}
    28 0002 0346      mov r3, r0
    29 0004 FFF7FEFF  bl __aeabi_read_tp @ load_tp_soft
    30 0008 034A      ldr r2, .L3
    33 0010 1150      str r1, [r2, r0]
    35 0014 1846      mov r0, r3
    36 0016 10BD      pop {r4, pc}
    37           .L4:
    38                .align 2
    39           .L3:
    40 0018 00000000  .word .LANCHOR0(tpoff)
    41

In line 28 the this pointer is saved to r3, then the call in line 29 returns the tls start address in r0. __aeabi_read_tp uses r3 to fetch the address, so r3 is corrupted. R3 is used in 34 to store to address this+8, and then r3 is moved back to r0, returning an incorrect value for this.

Is this a gdc or gcc bug?
Aug 18 2014
On 08/17/14 09:57, Timo Sintonen via D.gnu wrote:
> What is the purpose of volatile_dummy? Even if it is not used,

Ensuring ordering; w/o it the compiler could reorder operations on different volatile objects. (Which isn't necessarily a bad thing, but people expect certain semantics of 'volatile', so it would be a bad and dangerous default.)

> the address for it is calculated in several places.

It's completely optimized away for me (I'm testing on x86). Can you show the emitted code?

> The struct members are defined saparately. This means the address of every member is stored and fetched separately. The compiler seems to remove some of these and use the pointer, but I am not sure what happens when the structs are bigger.

Yes, the compiler does not generate optimal code, but so far I've only seen dead immediate-constant->register loads; so it's not a huge problem.

artur
Aug 17 2014
On Sat, 16 Aug 2014 11:58:49 +0200, "Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:
> [...]
>     enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)();
>
>     void main() {
>         TimerB |= 0b1;
>         TimerB += 1;
>     }

That's a good start. Can you also get unary operators working? e.g. TimerB++;

Do you think it's possible to combine this with the other solution you posted for struct fields? Or do we need separate Volatile!T and VolatileField!T types?
Aug 17 2014
On 08/17/14 10:49, Johannes Pfau via D.gnu wrote:
> That's a good start. Can you also get unary operators working? e.g TimerB++;

Unary ops are easy. If you mean post-inc and post-dec -- that's a language problem. At least for volatile, they will cause a compile error; for atomic ops the naive `post-op->tmp-load+op+tmp` rewrite can introduce bugs... D would need to make the post-ops overloadable to get rid of these issues.

> Do you think it's possible to combine this with the other solution you posted for struct fields? Or do we need separate Volatile!T and VolatileField!T types?

Right now, I'd prefer this approach:

--------------------------------------------------------------
    module volat;

    version (GNU) {
        static import gcc.attribute;
        enum inline = gcc.attribute.attribute("forceinline");
    }

    extern int volatile_dummy;

    @inline T volatile_load(T)(ref T v) nothrow {
        asm { "" : "+m" v, "+m" volatile_dummy; }
        T res = v;
        asm { "" : "+g" res, "+m" v, "+m" volatile_dummy; }
        return res;
    }

    @inline void volatile_store(T, A)(ref T v, A a) nothrow {
        asm { "" : "+m" volatile_dummy : "m" v; }
        v = a;
        asm { "" : "+m" v, "+m" volatile_dummy; }
    }

    @inline void volatile_barrier(T)(ref T v) nothrow {
        asm { "" : "+m" v, "+m" volatile_dummy; }
    }

    struct Volatile(T) {
        T raw;
        nothrow: @inline:
        @disable this(this);
        void opAssign(A)(A a) { volatile_store(raw, a); }
        T load() @property { return volatile_load(raw); }
        alias load this;
        void opOpAssign(string OP)(const T b) {
            volatile_barrier(raw);
            mixin("raw " ~ OP ~ "= b;");
            volatile_barrier(raw);
        }
        T opUnary(string OP)() {
            volatile_barrier(raw);
            auto result = mixin(OP ~ "raw");
            volatile_barrier(raw);
            return result;
        }
    }
--------------------------------------------------------------
    import volat;

    struct Timer
    {
        Volatile!uint control;
        Volatile!uint data;
    }

    enum timerA = cast(Timer*)0xDEADBEAF;

    int main() {
        timerA.control |= 0b1;
        timerA.control += 1;
        timerA.control = 42;
        int a = timerA.data - timerA.data;
        int b = ++timerA.control;
        --timerA.data;
        timerA.control /= 2;
        return b;
    }
--------------------------------------------------------------

compiles to:

--------------------------------------------------------------
    0000000000403620 <_Dmain>:
      403620:  ba af be ad de        mov    $0xdeadbeaf,%edx
      403625:  b9 b3 be ad de        mov    $0xdeadbeb3,%ecx
      40362a:  83 0a 01              orl    $0x1,(%rdx)
      40362d:  83 02 01              addl   $0x1,(%rdx)
      403630:  c7 02 2a 00 00 00     movl   $0x2a,(%rdx)
      403636:  8b 42 04              mov    0x4(%rdx),%eax
      403639:  8b 72 04              mov    0x4(%rdx),%esi
      40363c:  8b 02                 mov    (%rdx),%eax
      40363e:  83 c0 01              add    $0x1,%eax
      403641:  89 02                 mov    %eax,(%rdx)
      403643:  83 6a 04 01           subl   $0x1,0x4(%rdx)
      403647:  d1 2a                 shrl   (%rdx)
      403649:  c3                    retq
--------------------------------------------------------------

Do you see any problems with it? (Other than gcc not removing that dead constant load)

[The struct-with-volatile-fields can be built from a "normal" struct at CT. But that's just syntax sugar.]

artur
Aug 17 2014
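The post-increment limitation Artur mentions can be seen without any volatile code. A minimal sketch, assuming a hypothetical `Counter` type that, like `Volatile!T`, disables copying: D lowers `c++` on a user-defined type to "copy the old value, then `++c`", and the copy is what gets rejected.

```d
// Hypothetical non-copyable type; only pre-increment/decrement is defined,
// which is all that D's opUnary can overload.
struct Counter
{
    int raw;
    @disable this(this);            // no copies, like Volatile!T in the thread

    int opUnary(string op)() if (op == "++" || op == "--")
    {
        mixin(op ~ "raw;");         // expands to ++raw; or --raw;
        return raw;
    }
}

void main()
{
    Counter c;
    ++c;                            // calls opUnary!"++"
    assert(c.raw == 1);
    --c;                            // calls opUnary!"--"
    assert(c.raw == 0);

    // Post-increment needs a temporary copy of c, which the disabled
    // postblit forbids, so the expression does not compile.
    static assert(!__traits(compiles, c++));
}
```

For a register wrapper this is arguably the safer outcome: a silently copied "old value" would mean an extra, unrequested read of the hardware register.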
On Sun, 17 Aug 2014 15:15:12 +0200, "Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:
> Do you see any problems with it? (Other than gcc not removing that dead constant load)

It's perfect for structs, but when simply declaring a Volatile!uint the pointer dereference must be done manually, right?

----
    enum TimerB = cast(Volatile!(uint)*)0xDEADBEEF;
    *TimerB |= 0b1;
----

I don't think that's a huge problem though, just a little bit inconvenient.
Aug 17 2014
On 08/17/14 16:16, Johannes Pfau via D.gnu wrote:
> It's perfect for structs, but when simply declaring a Volatile!uint the pointer dereference must be done manually, right?
>
> ----
> enum TimerB = cast(Volatile!(uint)*)0xDEADBEEF;
> *TimerB |= 0b1;
> ----
>
> I don't think that a huge problem though, just a little bit inconvenient.

Another D-problem - the language doesn't have /real/ refs. But...

    import volat;

    @inline ref @property timerA() { return *cast(Volatile!uint*)0xDEADBEAF; }

    int main() {
        timerA |= 0b1;
        timerA += 1;
        timerA = 42;
        int a = timerA - timerA;
        int b = ++timerA;
        --timerA;
        timerA /= 2;
        return b;
    }

=>

    0000000000403620 <_Dmain>:
      403620:  ba af be ad de        mov    $0xdeadbeaf,%edx
      403625:  83 0a 01              orl    $0x1,(%rdx)
      403628:  83 02 01              addl   $0x1,(%rdx)
      40362b:  c7 02 2a 00 00 00     movl   $0x2a,(%rdx)
      403631:  8b 02                 mov    (%rdx),%eax
      403633:  8b 0a                 mov    (%rdx),%ecx
      403635:  8b 02                 mov    (%rdx),%eax
      403637:  83 c0 01              add    $0x1,%eax
      40363a:  89 02                 mov    %eax,(%rdx)
      40363c:  83 2a 01              subl   $0x1,(%rdx)
      40363f:  d1 2a                 shrl   (%rdx)
      403641:  c3                    retq

artur
Aug 17 2014