www.digitalmars.com         C & C++   DMDScript  

D.gnu - Found one (the?) ARM bug: (Pointer) aliasing issues

reply Johannes Pfau <nospam example.com> writes:
I think I finally found the root cause for one of the remaining ARM
bugs (the codegen bug which only appears on -O2 or higher).

What happened in the test case was that the gcc backend illegally moved
a read to a memory location before the write. It turns out that GCC
assumed that the read and write locations were in different alias sets
and therefore could not possibly reference the same memory location. One
alias set was for 'ubyte[]' and one for 'char[]'.

The code that triggers this issue is in std.algorithm.find:
--------
R1 find(...)(R1 haystack, R2 needle) if (/* are strings*/)
{
    return cast(char[]) .find!(ubyte[], ubyte[])
        (cast(ubyte[]) haystack, cast(ubyte[])needle);
}
--------
(Real code:
https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/src/std/algorithm.d#L3555
)

The generic emitted by gdc:
--------
return <retval> = *(struct  *) &find (*(struct  *) &haystack, *(struct
*) &needle);
--------

Or in the raw form:
 44 = ubyte[]
 11 = string
--------
 11     record_type      name:  17      size:  18      algn: 32      
                         tag : struct   flds:  19   
 31     pointer_type     size:  15      algn: 32       ptd :  11    
 40     pointer_type     size:  15      algn: 32       ptd :  44   
 41     call_expr        type:  44      fn  :  45      0   :  46     
                         1   :  47
 44     record_type      name:  49      size:  18      algn:  
                         tag : struct   flds:  50    
 46     indirect_ref     type:  44      op 0:  53     
 47     indirect_ref     type:  44      op 0:  54    
 53     nop_expr         type:  40      op 0:  58     
 54     nop_expr         type:  40      op 0:  59    
 58     addr_expr        type:  31      op 0:  30     
 59     addr_expr        type:  31      op 0:  61 
--------

Here it's easy to see that we essentially generate this code:
*(cast(ubyte[]*)(&haystack))
and this is AFAIK a violation of the aliasing rules.

This problem is not observed at -O1 as the function call generates a
new stackframe and we therefore have two distinct memory locations. But
with -O2 inlining removes this copy and we now have variables
referencing the same memory location with type ubyte[] and char[].

I can not provide a reduced testcase as the smallest changes in
compiler codegen (gcc version, gdc commit) or the test case can hide
this issue. It's always reproducible with test15.d in the test suite.
(In older gdc versions the test case does not segfault but it produces
wrong output at -O2. This can be a very subtle difference)

But here's a working code snippet which illustrates the generic
generated by GDC:

-----------------
void main()
{
	char[] in1 = "Test".dup;
        char[] in2 = "Test2".dup;

        char[] result = cast(char[])find(cast(ubyte[])in1,
            cast(ubyte[])in2);
}

ubyte[] find(ubyte[] a, ubyte[] b)
{
    return a;
}
-----------------


So the important question here: Is this a bug in GDC codegen or is the
code in std.algorithm invalid? According to
http://dlang.org/expression.html
"The cast is done as a type paint" so this could indeed be interpreted
as a user mistake. But OTOH that page also talks about a runtime check
of the array .lengths which is clearly missing here.

I'm also wondering if that runtime check can actually fix this
aliasing issue or if it can come up again if the runtime check itself
is inlined? 
Nov 02 2013
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 2 November 2013 20:07, Johannes Pfau <nospam example.com> wrote:

 I think I finally found the root cause for one of the remaining ARM
 bugs (the codegen bug which only appears on -O2 or higher).

 What happened in the test case was that the gcc backend illegally moved
 a read to a memory location before the write. It turns out that GCC
 assumed that the read and write locations were in different alias sets
 and therefore could not possibly reference the same memory location. One
 alias set was for 'ubyte[]' and one for 'char[]'.

 The code that triggers this issue is in std.algorithm.find:
 --------
 R1 find(...)(R1 haystack, R2 needle) if (/* are strings*/)
 {
     return cast(char[]) .find!(ubyte[], ubyte[])
         (cast(ubyte[]) haystack, cast(ubyte[])needle);
 }
 --------
 (Real code:

 https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/src/std/algorithm.d#L3555
 )

 The generic emitted by gdc:
 --------
 return <retval> = *(struct  *) &find (*(struct  *) &haystack, *(struct
 *) &needle);
 --------

 Or in the raw form:
  44 = ubyte[]
  11 = string
 --------
  11     record_type      name:  17      size:  18      algn: 32
                          tag : struct   flds:  19
  31     pointer_type     size:  15      algn: 32       ptd :  11
  40     pointer_type     size:  15      algn: 32       ptd :  44
  41     call_expr        type:  44      fn  :  45      0   :  46
                          1   :  47
  44     record_type      name:  49      size:  18      algn:
                          tag : struct   flds:  50
  46     indirect_ref     type:  44      op 0:  53
  47     indirect_ref     type:  44      op 0:  54
  53     nop_expr         type:  40      op 0:  58
  54     nop_expr         type:  40      op 0:  59
  58     addr_expr        type:  31      op 0:  30
  59     addr_expr        type:  31      op 0:  61
 --------

 Here it's easy to see that we essentially generate this code:
 *(cast(ubyte[]*)(&haystack))
 and this is AFAIK a violation of the aliasing rules.

 This problem is not observed at -O1 as the function call generates a
 new stackframe and we therefore have two distinct memory locations. But
 with -O2 inlining removes this copy and we now have variables
 referencing the same memory location with type ubyte[] and char[].

 I can not provide a reduced testcase as the smallest changes in
 compiler codegen (gcc version, gdc commit) or the test case can hide
 this issue. It's always reproducible with test15.d in the test suite.
 (In older gdc versions the test case does not segfault but it produces
 wrong output at -O2. This can be a very subtle difference)

 But here's a working code snippet which illustrates the generic
 generated by GDC:

 -----------------
 void main()
 {
         char[] in1 = "Test".dup;
         char[] in2 = "Test2".dup;

         char[] result = cast(char[])find(cast(ubyte[])in1,
             cast(ubyte[])in2);
 }

 ubyte[] find(ubyte[] a, ubyte[] b)
 {
     return a;
 }
 -----------------


 So the important question here: Is this a bug in GDC codegen or is the
 code in std.algorithm invalid? According to
 http://dlang.org/expression.html
 "The cast is done as a type paint" so this could indeed be interpreted
 as a user mistake. But OTOH that page also talks about a runtime check
 of the array .lengths which is clearly missing here.

 I'm also wondering if that runtime check can actually fix this
 aliasing issue or if it can come up again if the runtime check itself
 is inlined?
I guess D does not have any aliasing rules. Question: does this occur when you pass -fno-strict-aliasing? By default, GDC defines -fstrict-aliasing (which is also enabled by default in -O2 in GCC), which is nice to have, but if it's proving to be non-trivial, we can always disallow the optimisation being done in the middle-end. See the LANG_HOOKS_GET_ALIAS_SET langhook - last time I checked, returning 0 disables aliasing rules from taking effect. Regards. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 02 2013
next sibling parent Johannes Pfau <nospam example.com> writes:
Am Sun, 3 Nov 2013 02:10:20 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 2 November 2013 20:07, Johannes Pfau <nospam example.com> wrote:
 
 I think I finally found the root cause for one of the remaining ARM
 bugs (the codegen bug which only appears on -O2 or higher).

 What happened in the test case was that the gcc backend illegally
 moved a read to a memory location before the write. It turns out
 that GCC assumed that the read and write locations were in
 different alias sets and therefore could not possibly reference the
 same memory location. One alias set was for 'ubyte[]' and one for
 'char[]'.

 The code that triggers this issue is in std.algorithm.find:
 --------
 R1 find(...)(R1 haystack, R2 needle) if (/* are strings*/)
 {
     return cast(char[]) .find!(ubyte[], ubyte[])
         (cast(ubyte[]) haystack, cast(ubyte[])needle);
 }
 --------
 (Real code:

 https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/src/std/algorithm.d#L3555
 )

 The generic emitted by gdc:
 --------
 return <retval> = *(struct  *) &find (*(struct  *) &haystack,
 *(struct *) &needle);
 --------

 Or in the raw form:
  44 = ubyte[]
  11 = string
 --------
  11     record_type      name:  17      size:  18      algn: 32
                          tag : struct   flds:  19
  31     pointer_type     size:  15      algn: 32       ptd :  11
  40     pointer_type     size:  15      algn: 32       ptd :  44
  41     call_expr        type:  44      fn  :  45      0   :  46
                          1   :  47
  44     record_type      name:  49      size:  18      algn:
                          tag : struct   flds:  50
  46     indirect_ref     type:  44      op 0:  53
  47     indirect_ref     type:  44      op 0:  54
  53     nop_expr         type:  40      op 0:  58
  54     nop_expr         type:  40      op 0:  59
  58     addr_expr        type:  31      op 0:  30
  59     addr_expr        type:  31      op 0:  61
 --------

 Here it's easy to see that we essentially generate this code:
 *(cast(ubyte[]*)(&haystack))
 and this is AFAIK a violation of the aliasing rules.

 This problem is not observed at -O1 as the function call generates a
 new stackframe and we therefore have two distinct memory locations.
 But with -O2 inlining removes this copy and we now have variables
 referencing the same memory location with type ubyte[] and char[].

 I can not provide a reduced testcase as the smallest changes in
 compiler codegen (gcc version, gdc commit) or the test case can hide
 this issue. It's always reproducible with test15.d in the test
 suite. (In older gdc versions the test case does not segfault but
 it produces wrong output at -O2. This can be a very subtle
 difference)

 But here's a working code snippet which illustrates the generic
 generated by GDC:

 -----------------
 void main()
 {
         char[] in1 = "Test".dup;
         char[] in2 = "Test2".dup;

         char[] result = cast(char[])find(cast(ubyte[])in1,
             cast(ubyte[])in2);
 }

 ubyte[] find(ubyte[] a, ubyte[] b)
 {
     return a;
 }
 -----------------


 So the important question here: Is this a bug in GDC codegen or is
 the code in std.algorithm invalid? According to
 http://dlang.org/expression.html
 "The cast is done as a type paint" so this could indeed be
 interpreted as a user mistake. But OTOH that page also talks about
 a runtime check of the array .lengths which is clearly missing here.

 I'm also wondering if that runtime check can actually fix this
 aliasing issue or if it can come up again if the runtime check
 itself is inlined?
I guess D does not have any aliasing rules. Question: does this occur when you pass -fno-strict-aliasing? By default, GDC defines -fstrict-aliasing (which is also enabled by default in -O2 in GCC), which is nice to have, but if it's proving to be non-trivial, we can always disallow the optimisation being done in the middle-end. See the LANG_HOOKS_GET_ALIAS_SET langhook - last time I checked, returning 0 disables aliasing rules from taking effect. Regards.
Yes, -fno-strict-aliasing also fixes this problem. I'm starting a fresh build on ARM right now to see if this fixes the remaining problems in the test suite. I think we should really disable strict aliasing. While it may provide some nice optimization it's almost impossible for the user to do type punning / casting with strict aliasing rules. I looked into the gcc aliasing implementation (alias.c) and the only safe way is (as the documentation says) to use unions. But then all access has to go through the union. The often proposed helper functions which just cast a value through the union are therefore illegal and will fail under certain circumstances (e.g. if they're inlined). As optimization (inlining, dead code / dead store elimination and target architecture) also heavily changes the effects of pointer aliasing it's almost impossible for a developer to write a safe type cast. (See also http://d.puremagic.com/issues/show_bug.cgi?id=10750)
Nov 03 2013
prev sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Sun, 3 Nov 2013 02:10:20 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.
That should work. Alias set 0 is a special alias set which conflicts with everything. I'll check if it works as expected.
Nov 03 2013
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 3 November 2013 10:20, Johannes Pfau <nospam example.com> wrote:

 Am Sun, 3 Nov 2013 02:10:20 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.
That should work. Alias set 0 is a special alias set which conflicts with everything. I'll check if it works as expected.
This is taken from hunks in the 2.064 merge I'm testing: Pastebin link here: http://pastebin.com/jxQQL68N diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc index 8cba369..a19a59b 100644 --- a/gcc/d/d-lang.cc +++ b/gcc/d/d-lang.cc -73,6 +73,7 const attribute_spec d_attribute_table[] = #undef LANG_HOOKS_COMMON_ATTRIBUTE_TABLE #undef LANG_HOOKS_ATTRIBUTE_TABLE #undef LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE +#undef LANG_HOOKS_GET_ALIAS_SET #undef LANG_HOOKS_TYPES_COMPATIBLE_P #undef LANG_HOOKS_BUILTIN_FUNCTION #undef LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE -95,6 +96,7 const attribute_spec d_attribute_table[] = #define LANG_HOOKS_COMMON_ATTRIBUTE_TABLE d_builtins_attribute_table #define LANG_HOOKS_ATTRIBUTE_TABLE d_attribute_table #define LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE d_format_attribute_table +#define LANG_HOOKS_GET_ALIAS_SET d_get_alias_set #define LANG_HOOKS_TYPES_COMPATIBLE_P d_types_compatible_p #define LANG_HOOKS_BUILTIN_FUNCTION d_builtin_function #define LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE d_builtin_function -1505,6 +1523,33 d_getdecls (void) } +// Get the alias set corresponding to a type or expression. +// Return -1 if we don't do anything special. + +static alias_set_type +d_get_alias_set (tree t) +{ + if (!TYPE_P (t)) + return get_alias_set (TREE_TYPE (t)); + + Type *dtype = build_dtype (t); + + // If the type is a dynamic array, use the alias set of the basetype. + if (dtype && dtype->ty == Tarray) + return get_alias_set (dtype->nextOf()->toCtype()); + + // Permit type-punning when accessing a union, provided the access + // is directly through the union. + for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0)) + { + if (TREE_CODE (u) == COMPONENT_REF + && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE) + return 0; + } + + return -1; +} + static int d_types_compatible_p (tree t1, tree t2) { -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 07 2013
parent reply Johannes Pfau <nospam example.com> writes:
Am Thu, 7 Nov 2013 16:14:59 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 3 November 2013 10:20, Johannes Pfau <nospam example.com> wrote:
 
 Am Sun, 3 Nov 2013 02:10:20 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.
That should work. Alias set 0 is a special alias set which conflicts with everything. I'll check if it works as expected.
This is taken from hunks in the 2.064 merge I'm testing: Pastebin link here: http://pastebin.com/jxQQL68N
Some probably stupid questions about this patch: // If the type is a dynamic array, use the alias set of the basetype. What exactly does happen in that case? The Tarray type is the two-field type consisting of length and ptr, right? Currently TypeDArray->toCtype constructs a two_field_type with size_t and typeof(Element)*. So according to the C aliasing rules, the TypeDArray alias set does already conflict with size_t and Element*. It does not conflict with Element. But I don't know why it should conflict with Element if we're talking about the slice type here. It would allow code like this to work: "char[] a; char* b = (cast(char*)&a)" but I don't see why this should work, it's illegal anyway? Also, AFAICS it does not help with the problem in std.algorithm: char[] a; //cast(ubyte[])a generates: *cast(ubyte[]*)&a; Do you think this cast should be illegal in D? I think if we want to support strict aliasing for the code above we'll have to do what gcc does for pointers: Put all array slices - regardless of element type - into the same alias set and make size_t and void* subsets of this alias set. // Permit type-punning when accessing a union Isn't that already guaranteed by GCC? See: Unions have all their member types added as subsets. So as long as the reference is through the union GCC knows the union type and it'll conflict with all member types. But even if we make those changes to aliasing rules, we'll have to fix many places in phobos. For example: https://github.com/D-Programming-Language/phobos/blob/master/std/math.d#L1965 real value; ushort* vu = cast(ushort*)&value; AFAICS this will always be invalid with strict aliasing. https://github.com/D-Programming-Language/phobos/blob/master/std/uuid.d#L468 casts ubyte[16]* to size_t* also illegal, AFAICS. Are there any statistics about the performance improvements with strict aliasing? I'm not really sold on the idea of strict aliasing, right now it looks to me as if it's mainly a way to introduce subtle, hard to debug and often latent bugs (As whether you really see a problem depends on optimization) http://stackoverflow.com/questions/1225741/performance-impact-of-fno-strict-aliasing
Nov 07 2013
next sibling parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 7 November 2013 18:05, Johannes Pfau <nospam example.com> wrote:

 Am Thu, 7 Nov 2013 16:14:59 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 3 November 2013 10:20, Johannes Pfau <nospam example.com> wrote:

 Am Sun, 3 Nov 2013 02:10:20 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 last time I
 checked, returning 0 disables aliasing rules from taking effect.
That should work. Alias set 0 is a special alias set which conflicts with everything. I'll check if it works as expected.
This is taken from hunks in the 2.064 merge I'm testing: Pastebin link here: http://pastebin.com/jxQQL68N
Some probably stupid questions about this patch: // If the type is a dynamic array, use the alias set of the basetype. What exactly does happen in that case? The Tarray type is the two-field type consisting of length and ptr, right? Currently TypeDArray->toCtype constructs a two_field_type with size_t and typeof(Element)*. So according to the C aliasing rules, the TypeDArray alias set does already conflict with size_t and Element*. It does not conflict with Element. But I don't know why it should conflict with Element if we're talking about the slice type here. It would allow code like this to work: "char[] a; char* b = (cast(char*)&a)" but I don't see why this should work, it's illegal anyway?
That would be seen as two distinct alias sets that would break strict aliasing in that example. Though, will have to implement -Wstrict-aliasing in the front-end to get any feel for what could potentially be utterly wrong. But the idea is that for dynamic arrays, telling gcc to not rely on structural equality to determine whether or not two dynamic arrays are part of the same alias set. eg: byte[] a, long[] b = *(cast (long[]*)&a) should be seen as being different alias sets, and so *will* be about breaking strict aliasing. In contrast, string[] a, char[] b = *cast(string[]*)&a) should be seen as being part of the same alias set, and so the compiler must know that the two types (which are distinct structures to the backend) could potentially be referencing the same slice of memory, as to not cause any problems. For people trying to work around the cast system for dynamic arrays, IMO they should be punished for it, and told to do it in the correct way that invokes _d_arraycopy, or do their unsafe work through unions.
 Also, AFAICS it does not help with the problem in std.algorithm:
 char[] a;
 //cast(ubyte[])a generates:
 *cast(ubyte[]*)&a;

 Do you think this cast should be illegal in D?
 I think if we want to support strict aliasing for the code above we'll
 have to do what gcc does for pointers:

 Put all array slices - regardless of element type - into the same alias
 set and make size_t and void* subsets of this alias set.


 // Permit type-punning when accessing a union

 Isn't that already guaranteed by GCC? See:

 Unions have all their member types added as subsets. So as long as the
 reference is through the union GCC knows the union type and it'll
 conflict with all member types.
There is no harm enforcing it in the front-end as well, even if it is just there to speed up the process of returning what the backend will no doubt return too. There's also the (extremely) unlikely event that the guarantee by GCC might be removed in a later version.
 But even if we make those changes to aliasing rules, we'll have to fix
 many places in phobos. For example:

 https://github.com/D-Programming-Language/phobos/blob/master/std/math.d#L1965
 real value;
 ushort* vu = cast(ushort*)&value;
 AFAICS this will always be invalid with strict aliasing.
Yep, as it should be. std.math is a danger point for type punning between pointers and reals, ensuring that type-punning/casting does not get DCE'd, etc... This needs to be fixed. https://github.com/D-Programming-Language/phobos/blob/master/std/uuid.d#L468
 casts ubyte[16]* to size_t* also illegal, AFAICS.

 Are there any statistics about the performance improvements with strict
 aliasing? I'm not really sold on the idea of strict aliasing, right now
 it looks to me as if it's mainly a way to introduce subtle, hard to
 debug and often latent bugs (As whether you really see a problem
 depends on optimization)


 http://stackoverflow.com/questions/1225741/performance-impact-of-fno-strict-aliasing
Not that I'm aware of (other than Ada boasting it's use). But I'd like to push the opinion of - although it isn't in the spec, D should be strict aliasing. And people should be aware of the problems breaking strict aliasing (see: http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html) But first... plan of attack: - We first disable strict aliasing entirely (lang_hook.get_alias_set => 0) - Implement -Wstrict-aliasing - Start turning on strict aliasing for types guaranteed not to be referencing the same memory location as other types (eg: TypeBasic, TypeDelegate, TypeSArray, TypeVector). - Identify gdc problems with implicit code generation that could break strict aliasing (these are our bugs). - Identify frontend/library problems that could break strict aliasing (these are the dmd/phobos developer's bugs). - Turn on strict aliasing for the remaining types. For those that cause problems, we can define a TYPE_LANG_FLAG macro to allow us to tell the backend if the type can alias any other types. I still stand by what I say on aliasing rules of D: - Permit type-punning when accessing through a union - Dynamic arrays of the same basetype (regardless of qualifiers) may alias each other/occupy the same slice of memory. Other possible considerations: - Most D code pretty much assumes that any object may be accessed via a void[] or void*. - C standard allows aliasing between signed and unsigned variants. It is therefore likely not unreasonable to do the same for convenience. - Infact, for the consideration of std.math. It we could go one step further and simply build up an alias set list based on the type size over type distinction. In this model double/long/byte[8]/short[4]/int[2] would all be considered as types that could be referencing each other. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 08 2013
parent reply Johannes Pfau <nospam example.com> writes:
Am Fri, 8 Nov 2013 13:12:23 +0000
schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 7 November 2013 18:05, Johannes Pfau <nospam example.com> wrote:
 
 Am Thu, 7 Nov 2013 16:14:59 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:
 type here. It would allow code like this to work: "char[] a; char*
 b = (cast(char*)&a)" but I don't see why this should work, it's
 illegal anyway?
That would be seen as two distinct alias sets that would break strict aliasing in that example. Though, will have to implement -Wstrict-aliasing in the front-end to get any feel for what could potentially be utterly wrong. But the idea is that for dynamic arrays, telling gcc to not rely on structural equality to determine whether or not two dynamic arrays are part of the same alias set. eg: byte[] a, long[] b = *(cast (long[]*)&a) should be seen as being different alias sets, and so *will* be about breaking strict aliasing. In contrast, string[] a, char[] b = *cast(string[]*)&a) should be seen as being part of the same alias set, and so the compiler must know that the two types (which are distinct structures to the backend) could potentially be referencing the same slice of memory, as to not cause any problems.
Now I get what you mean. AFAIK the backend already strips type qualifiers when building alias sets, so char[] and string[] should already have the same alias set. But if you want to explicitly enforce this in GDC this makes sense.
 
 For people trying to work around the cast system for dynamic arrays,
 IMO they should be punished for it, and told to do it in the correct
 way that invokes _d_arraycopy, or do their unsafe work through unions.
 
OK. Although then we have to be very careful when implementing _d_arraycast (but maybe we can just check if we're in _d_arraycast and disable aliasing? Is this possible?).
 // Permit type-punning when accessing a union

 Isn't that already guaranteed by GCC? See:

 [...]
There is no harm enforcing it in the front-end as well, [...]
OK
 
 Not that I'm aware of (other than Ada boasting it's use).  But I'd
 like to push the opinion of - although it isn't in the spec, D should
 be strict aliasing.  And people should be aware of the problems
 breaking strict aliasing (see:
 http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html)
I only found this: http://docs.lib.purdue.edu/ecetr/123/ They list performance improvements in the range of 0-6% depending on architecture. It's testing GCC from 2004 so some things may have changed, but 0-6% is not bad.
 But first... plan of attack:
 
 - We first disable strict aliasing entirely (lang_hook.get_alias_set
 => 0)
 - Implement -Wstrict-aliasing
 - Start turning on strict aliasing for types guaranteed not to be
 referencing the same memory location as other types (eg: TypeBasic,
 TypeDelegate, TypeSArray, TypeVector).
 - Identify gdc problems with implicit code generation that could break
 strict aliasing (these are our bugs).
 - Identify frontend/library problems that could break strict aliasing
 (these are the dmd/phobos developer's bugs).
 - Turn on strict aliasing for the remaining types.  For those that
 cause problems, we can define a TYPE_LANG_FLAG macro to allow us to
 tell the backend if the type can alias any other types.
Sounds good. If we can convince Walter to do this step-by-step (especially -Wstrict-aliasing needs to be in all compilers) this should work. But if GDC started enforcing strict aliasing and DMD didn't people probably wouldn't fix their code. Maybe the ARM port will be finished when we start working on this list so we have one more architecture to verify. ~5 phobos unit tests remaining ;-) (test suite already passes)
 
 Other possible considerations:
 
 - Most D code pretty much assumes that any object may be accessed via
 a void[] or void*.
Probably ubyte*, ubyte[] as well.
 
 - Infact, for the consideration of std.math.  It we could go one step
 further and simply build up an alias set list based on the type size
 over type distinction.  In this model
 double/long/byte[8]/short[4]/int[2] would all be considered as types
 that could be referencing each other.
 
I don't think that really helps much. Think of real with 16bytes. This may also negatively impact possible optimizations (I'd guess many types are word-sized. Another place where statistics would be nice)
Nov 08 2013
next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On Nov 8, 2013 7:20 PM, "Johannes Pfau" <nospam example.com> wrote:
 Am Fri, 8 Nov 2013 13:12:23 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:

 On 7 November 2013 18:05, Johannes Pfau <nospam example.com> wrote:

 Am Thu, 7 Nov 2013 16:14:59 +0000
 schrieb Iain Buclaw <ibuclaw ubuntu.com>:
 type here. It would allow code like this to work: "char[] a; char*
 b = (cast(char*)&a)" but I don't see why this should work, it's
 illegal anyway?
That would be seen as two distinct alias sets that would break strict aliasing in that example. Though, will have to implement -Wstrict-aliasing in the front-end to get any feel for what could potentially be utterly wrong. But the idea is that for dynamic arrays, telling gcc to not rely on structural equality to determine whether or not two dynamic arrays are part of the same alias set. eg: byte[] a, long[] b = *(cast (long[]*)&a) should be seen as being different alias sets, and so *will* be about breaking strict aliasing. In contrast, string[] a, char[] b = *cast(string[]*)&a) should be seen as being part of the same alias set, and so the compiler must know that the two types (which are distinct structures to the backend) could potentially be referencing the same slice of memory, as to not cause any problems.
Now I get what you mean. AFAIK the backend already strips type qualifiers when building alias sets, so char[] and string[] should already have the same alias set. But if you want to explicitly enforce this in GDC this makes sense.
I believe the difference is char* vs immutable(char)* - which a) is two distinct struct types in the back end and b) the strip mod stuff would see both as 'nothing to do' as the pointer has no qualifiers directly attached to it. Now I stop and think about dynamic arrays, I believe there's some work I need to do on improving debugging of them (both in gdb and when debugging gcc) - for instance when printing the type of an &array, is it still shown as an unnamed 'struct*' ? :)
 For people trying to work around the cast system for dynamic arrays,
 IMO they should be punished for it, and told to do it in the correct
 way that invokes _d_arraycopy, or do their unsafe work through unions.
OK. Although then we have to be very careful when implementing _d_arraycast (but maybe we can just check if we're in _d_arraycast and disable aliasing? Is this possible?).
Should be ok if we give special treatment for void[]. I can't remember the exact reason why there is a dereferenced cast in that function, might just be a pointer copy... (when I think array cast, I only see array.length = (length * F.sizeof) / T.sizeof;
 // Permit type-punning when accessing a union

 Isn't that already guaranteed by GCC? See:

 [...]
There is no harm enforcing it in the front-end as well, [...]
OK
 Not that I'm aware of (other than Ada boasting it's use).  But I'd
 like to push the opinion of - although it isn't in the spec, D should
 be strict aliasing.  And people should be aware of the problems
 breaking strict aliasing (see:
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html )
 I only found this:
 http://docs.lib.purdue.edu/ecetr/123/

 They list performance improvements in the range of 0-6% depending on
 architecture. It's testing GCC from 2004 so some things may have
 changed, but 0-6% is not bad.

 But first... plan of attack:

 - We first disable strict aliasing entirely (lang_hook.get_alias_set
 => 0)
 - Implement -Wstrict-aliasing
 - Start turning on strict aliasing for types guaranteed not to be
 referencing the same memory location as other types (eg: TypeBasic,
 TypeDelegate, TypeSArray, TypeVector).
 - Identify gdc problems with implicit code generation that could break
 strict aliasing (these are our bugs).
 - Identify frontend/library problems that could break strict aliasing
 (these are the dmd/phobos developer's bugs).
 - Turn on strict aliasing for the remaining types.  For those that
 cause problems, we can define a TYPE_LANG_FLAG macro to allow us to
 tell the backend if the type can alias any other types.
Sounds good. If we can convince Walter to do this step-by-step (especially -Wstrict-aliasing needs to be in all compilers) this should work. But if GDC started enforcing strict aliasing and DMD didn't people probably wouldn't fix their code. Maybe the ARM port will be finished when we start working on this list so we have one more architecture to verify. ~5 phobos unit tests remaining ;-) (test suite already passes)
 Other possible considerations:

 - Most D code pretty much assumes that any object may be accessed via
 a void[] or void*.
Probably ubyte*, ubyte[] as well.
 - Infact, for the consideration of std.math.  It we could go one step
 further and simply build up an alias set list based on the type size
 over type distinction.  In this model
 double/long/byte[8]/short[4]/int[2] would all be considered as types
 that could be referencing each other.
I don't think that really helps much. Think of real with 16bytes. This may also negatively impact possible optimizations (I'd guess many types are word-sized. Another place where statistics would be nice)
reals are special in that they are flexible in size. std.math is able to cope with this, but assumes that the target uses IEEE format. See my implementation of floor() or lrint() for instance which uses different code paths for each support float. Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 09 2013
prev sibling next sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
By the way - I've gotten round to upgrading my trimslice to precise - *now*
I should be able to start doing regular gdc builds on it. :o)

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 10 2013
prev sibling next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On 10 November 2013 19:08, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 By the way - I've gotten round to upgrading my trimslice to precise - *now*
 I should be able to start doing regular gdc builds on it. :o)
We now just assume everything aliases everything else: https://github.com/D-Programming-GDC/GDC/commit/926baf64a57895142f8cabf3cee97232b13777d3
Dec 05 2013
prev sibling next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On 5 December 2013 16:48, Iain Buclaw <ibuclaw gdcproject.org> wrote:
 On 10 November 2013 19:08, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 By the way - I've gotten round to upgrading my trimslice to precise - *now*
 I should be able to start doing regular gdc builds on it. :o)
We now just assume everything aliases everything else: https://github.com/D-Programming-GDC/GDC/commit/926baf64a57895142f8cabf3cee97232b13777d3
Fixed that unexpected sign-extend conversion bug: https://github.com/D-Programming-GDC/GDC/commit/a27f600facf7ae87abad0c5618d712b32e2e3882
Dec 05 2013
prev sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On 5 December 2013 17:46, Iain Buclaw <ibuclaw gdcproject.org> wrote:
 On 5 December 2013 16:48, Iain Buclaw <ibuclaw gdcproject.org> wrote:
 On 10 November 2013 19:08, Iain Buclaw <ibuclaw ubuntu.com> wrote:
 By the way - I've gotten round to upgrading my trimslice to precise - *now*
 I should be able to start doing regular gdc builds on it. :o)
We now just assume everything aliases everything else: https://github.com/D-Programming-GDC/GDC/commit/926baf64a57895142f8cabf3cee97232b13777d3
Fixed that unexpected sign-extend conversion bug: https://github.com/D-Programming-GDC/GDC/commit/a27f600facf7ae87abad0c5618d712b32e2e3882
Actually fixed that duplicate definition problem last week. Just got round to pushing it: https://github.com/D-Programming-GDC/GDC/commit/15d7b5ce334e851893d39efc148699dab4967cab
Dec 16 2013
prev sibling parent Iain Buclaw <ibuclaw ubuntu.com> writes:
On 8 November 2013 13:12, Iain Buclaw <ibuclaw ubuntu.com> wrote:

 For people trying to work around the cast system for dynamic arrays, IMO
 they should be punished for it, and told to do it in the correct way that
 invokes _d_arraycopy, or do their unsafe work through unions.
s/_d_arraycopy/_d_arraycast/ -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 08 2013