digitalmars.D - Undefined behaviours in D and C

bearophile <bearophileHUGS lycos.com> writes:

This recent blog post says nothing new for people who know C; it contains just
a few notes about some undefined C behaviours, but it's a starting point for
what I want to say in this post:

http://james-iry.blogspot.com/2010/04/c-is-not-assembly.html

Undefined behaviours help adapt the language to different CPUs, but today PC
CPUs are more similar to each other compared to the CPUs used when C was
defined (because in an evolutionary tree most diversity is located near the

you can have an efficient enough C-family language even if you remove many/most
undefined behaviours from it (a JIT compiler can be better than a static
compiler in this).

D semantics are largely based on C, but of course there is no written formal
language spec yet, as there is for C. Undefined behaviours are a rich source
of bugs in programs (to avoid some of them you can try to put warnings in your
compiler/lint for each undefined behaviour of your language).

D already defines some behaviours that are left undefined in C: for example, I
think operations like 5%(-2) and 5/(-2) are defined in D, as are the shifts <<
and >> when the number of bits shifted is larger than the number of bits of the
value. The removal of some other undefined C behaviours, like the evaluation
order of function arguments, is planned for D.
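
A minimal sketch of the division and remainder case, assuming a current D compiler: integer division rounds toward zero and the remainder takes the sign of the dividend. The names `quotient` and `remainder` are made up for illustration, and the shift case is deliberately not shown.

```d
// Hypothetical helpers, only here to give the two operators a name.
int quotient(int a, int b) { return a / b; }
int remainder(int a, int b) { return a % b; }

unittest
{
    // Division truncates toward zero.
    assert(quotient(5, -2) == -2);
    assert(quotient(-5, 2) == -2);

    // The remainder has the same sign as the dividend.
    assert(remainder(5, -2) == 1);
    assert(remainder(-5, 2) == -1);

    // The usual identity (a / b) * b + a % b == a holds.
    assert(quotient(5, -2) * -2 + remainder(5, -2) == 5);
}
```

Compiling with `dmd -unittest -main` runs the checks.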

But I think some other undefined holes coming from C remain in D, for example
regarding:
- Static casts between size_t/ptrdiff_t and pointers;
- Pointer aliasing;
- Read of an enum field different from the last field written;
- etc.

It can be positive to write down a complete list of such undefined C behaviours
and decide if it's good to leave them undefined in D too, and where the answer

suggestions.

D Bugzilla shows that there are a few 'undefined behaviours' in some D
constructs too, but starting from the C ones is good because there's already a
lot of experience with using C to write programs.

Bye,
bearophile

Apr 14 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

bearophile wrote:
 This recent blog post says nothing new for people that know C, it contains
just few notes about some undefined C behaviours, but it's a starting point for
what I want to say in this post:
 
 http://james-iry.blogspot.com/2010/04/c-is-not-assembly.html
 
 Undefined behaviours help adapt the language to different CPUs, but today PC
CPUs are more similar to each other compared to the CPUs used when C was
defined (because in an evolutionary tree most diversity is located near the

you can have an efficient enough C-family language even if you remove many/most
undefined behaviours from it (a JIT compiler can be better than a static
compiler in this).
 
 D semantics is quite based on C, but of course there are no written formal
language specs yet, as you can find for C. Undefined behaviours are a really
good source of bugs in programs (to avoid some of them you can try to put
warnings in your compiler/lint for each undefined behaviour of your language).

Some time ago, I believe Walter decided to let @safe mean "no undefined
behaviour".  Hopefully, this will reduce the number of
undefined-behaviour related bugs.  After all, most D code should be
marked @safe.

Here it is:
http://www.digitalmars.com/d/archives/digitalmars/D/Safety_undefined_behavior_safe_trusted_100138.html

-Lars

Apr 14 2010

bearophile <bearophileHUGS lycos.com> writes:

Lars T. Kyllingstad:

Thank you for your answer & thread link.

>Some time ago, I believe Walter decided to let @safe mean "no undefined
behaviour".<

I find it hard to believe that safe modules can define, for example, the
semantics of static casts between size_t and a pointer, while unsafe modules
can leave it undefined as in C :-) To me this will lead to a mess even worse
than the C situation.

So a better solution is to define such behaviours in both kinds of modules, or
leave them undefined in both. I prefer the first possibility. And to make this
happen a starting point is to list everything the C standard leaves undefined.

Bye,
bearophile

Apr 15 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

bearophile wrote:
 Lars T. Kyllingstad:
 
 Thank you for your answer & thread link.
 
 Some time ago, I believe Walter decided to let @safe mean "no undefined
behaviour".

 
 I find it hard to believe that safe modules can define for example the
semantic of static casts between size_t and a pointer, while unsafe modules can
leave it undefined as in C :-) To me this will lead to a mess even worse than
the C situation.
 
 So a better solution is to define such behaviours in both kinds of modules, or
leave them undefined in both. I prefer the first possibility. And to make this
happen a starting point is to list all things C standard leaves undefined.

The effect of @safe would be to forbid code that leads to undefined
behaviour, not make it well-defined.

-Lars

Apr 15 2010

bearophile <bearophileHUGS lycos.com> writes:

Lars T. Kyllingstad:
 The effect of @safe would be to forbid code that leads to undefined
 behaviour, not make it well-defined.

Right, but that's not the solution I was looking for, and it's not going to
solve the problems inherited from C. Because if people that use D want to use

idea, but safe modules can't be a replacement for efforts to make low-level
code safer too.

Bye,
bearophile

Apr 15 2010

BCS <none anon.com> writes:

Hello bearophile,

 [...] people that


Wrong! >90% of the time, when I want to use D over some other language, it 

I have never wanted to use them for any language related reasons. The only 

liked Java, but that's a personal preference thing).

 Bye,
 bearophile

-- 
... <IXOYE><

Apr 15 2010

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 Lars T. Kyllingstad:
 The effect of @safe would be to forbid code that leads to undefined
  behaviour, not make it well-defined.

 
 Right, but that's not the solution I was looking for, and it's not
 going to solve the problems inherited from C. Because if people that

 safe modules in D is a good idea, but safe modules can't be a
 replacement for efforts to make safer the low level code too.

I'm confused. It appears you want to write unsafe code and yet have it 
be guaranteed safe.

Functions tagged with @system are where you should put unsafe code.

Apr 17 2010

BCS <none anon.com> writes:

Hello Walter,

 bearophile wrote:
 
 I'm confused. It appears you want to write unsafe code and yet have it
 be guaranteed safe.

Currently the described code is legal, unsafe (it can result in invalid
pointers), and has undefined semantics (it can result in unpredictable,
implementation-defined results). What I think bearophile wants is for only the
last to be changed; that is, you can still do things that result in invalid
pointers, but they do so in a well-defined way (at least with regard to the bit
pattern the pointer ends up as).



-- 
... <IXOYE><

Apr 17 2010

Walter Bright <newshound1 digitalmars.com> writes:

BCS wrote:
 Currently the described code is legal, unsafe (it can result in invalid 
 pointers) and has undefined semantics (it can result in unpredictable, 
 implementation defined results). What I think bearophile wants is for 
 only the last to be changed, that is; you can still do things that 
 result in invalid pointers, but it does so in a well defined way (at 
 least with regards to the bit pattern the pointer ends up as)

I don't think that's a useful thing to specify - where's the advantage, 
and if D is on a machine that does pointers differently, why make it 
impossible to port standard D to it?

Apr 17 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-04-17 13:35:16 -0400, Walter Bright <newshound1 digitalmars.com> said:

 BCS wrote:
 Currently the described code is legal, unsafe (it can result in invalid 
 pointers) and has undefined semantics (it can result in unpredictable, 
 implementation defined results). What I think bearophile wants is for 
 only the last to be changed, that is; you can still do things that 
 result in invalid pointers, but it does so in a well defined way (at 
 least with regards to the bit pattern the pointer ends up as)

 
 I don't think that's a useful thing to specify - where's the advantage, 
 and if D is on a machine that does pointers differently, why make it 
 impossible to port standard D to it?

Methinks this is not a very good argument. Supporting obscure
platforms isn't very useful; that's why D only supports two's complement
arithmetic (you said it yourself).

There is a very good reason to disallow manipulating the bit pattern in 
safe D however: memory safety. If you can dereference a pointer made 
from an arbitrary bit pattern, you may have an exploitable flaw similar 
to a buffer overrun. Dereferencing an arbitrary value is definitely 
*not* memory-safe and should *not* be allowed in safe D.

So you shouldn't be able to cast a value to a pointer. The reverse, 
casting a pointer to a value, makes sense in my opinion: you may want 
to print the pointer value in a debug output of some sort. There's 
nothing unsafe with that so it should be allowed.
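
A minimal sketch of the two directions discussed here, assuming current Phobos (`std.format.format`). The helper name `addressText` is hypothetical. Turning a pointer into an integer only reads the address; the opposite cast is the direction this thread argues safe code must reject, so it appears only in ordinary (system) code.

```d
import std.format : format;

// Render a pointer's address as hexadecimal text, e.g. for debug output.
string addressText(const(void)* p)
{
    return format("0x%x", cast(size_t) p);
}

unittest
{
    int x = 14;
    int* p = &x;

    assert(addressText(null) == "0x0");

    // Pointer -> integer: harmless, nothing is dereferenced.
    size_t bits = cast(size_t) p;

    // Integer -> pointer: only meaningful because `bits` came from a
    // valid pointer a moment ago. An arbitrary value could point anywhere.
    int* q = cast(int*) bits;
    assert(q is p);
    assert(*q == 14);
}
```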

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Apr 17 2010

Walter Bright <newshound1 digitalmars.com> writes:

Michel Fortin wrote:
 There is a very good reason to disallow manipulating the bit pattern in 
 safe D however: memory safety. If you can dereference a pointer made 
 from an arbitrary bit pattern, you may have an exploitable flaw similar 
 to a buffer overrun. Dereferencing an arbitrary value is definitely 
 *not* memory-safe and should *not* be allowed in safe D.

And it is not allowed in safe functions.

 So you shouldn't be able to cast a value to a pointer. The reverse, 
 casting a pointer to a value, makes sense in my opinion: you may want to 
 print the pointer value in a debug output of some sort. There's nothing 
 unsafe with that so it should be allowed.

These are allowed in safe functions.

Apr 18 2010

Pelle <pelle.mansson gmail.com> writes:

On 04/18/2010 02:46 PM, Walter Bright wrote:
 Michel Fortin wrote:
 So you shouldn't be able to cast a value to a pointer. The reverse,
 casting a pointer to a value, makes sense in my opinion: you may want
 to print the pointer value in a debug output of some sort. There's
 nothing unsafe with that so it should be allowed.

 These are allowed in safe functions.

Just checking, this is allowed:

@safe void crash_maybe() {
     int* p = cast(int*)uniform(size_t.min, size_t.max);
     *p = 14;
}

right?

Apr 18 2010

Fawzi Mohamed <fawzi gmx.ch> writes:

On 19-apr-10, at 08:23, Pelle wrote:

 On 04/18/2010 02:46 PM, Walter Bright wrote:
 Michel Fortin wrote:
 So you shouldn't be able to cast a value to a pointer. The reverse,
 casting a pointer to a value, makes sense in my opinion: you may  
 want
 to print the pointer value in a debug output of some sort. There's
 nothing unsafe with that so it should be allowed.

 These are allowed in safe functions.

 Just checking, this is allowed:

  @safe void crash_maybe() {
    int* p = cast(int*)uniform(size_t.min, size_t.max);
    *p = 14;
 }

 right?

No, the opposite is safe (pointer -> size_t), but there is no way
size_t -> pointer can be safe...

Apr 19 2010

bearophile <bearophileHUGS lycos.com> writes:

Fawzi Mohamed:
 no the opposite is safe (pointer -> size_t) but there is no way
 size_t -> pointer can be safe...

In the stdint.h of C99 there is (optionally) uintptr_t, an unsigned integer
type large enough to hold a pointer (there is a signed intptr_t too). In C99
you use that to convert a pointer to an integer.

I don't know whether the D spec guarantees that D's size_t is wide enough to
represent a pointer.

Bye,
bearophile

Apr 19 2010

bearophile <bearophileHUGS lycos.com> writes:

 In the stdint.h of C99 there is (optionally) uintptr_t that's is an unsigned
int that is large enough to contain a pointer (there is a intptr_t too,
signed). In C99 you use that to convert a pointer to an integral.
 
 I don't know if D specs assert that D size_t is wide enough to represent a
pointer.

There's uintptr_t in D std lib too, I have to start using it:
http://www.digitalmars.com/d/2.0/phobos/std_stdint.html

Bye,
bearophile

Apr 19 2010

Fawzi Mohamed <fawzi gmx.ch> writes:

On 19-apr-10, at 12:32, bearophile wrote:

 In the stdint.h of C99 there is (optionally) uintptr_t that's is an  
 unsigned int that is large enough to contain a pointer (there is a  
 intptr_t too, signed). In C99 you use that to convert a pointer to  
 an integral.

 I don't know if D specs assert that D size_t is wide enough to  
 represent a pointer.

 There's uintptr_t in D std lib too, I have to start using it:
 http://www.digitalmars.com/d/2.0/phobos/std_stdint.html

That is for C compatibility; D has always defined size_t and ptrdiff_t
(without needing to import anything) exactly like that.
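
A small sketch of what "exactly like that" amounts to in practice, assuming the D targets I know of, where both built-in aliases are pointer-sized. The helper `elementsBetween` is a hypothetical name used only for this illustration.

```d
// Both aliases are available without any import.
static assert(size_t.sizeof == (void*).sizeof);    // wide enough for an address
static assert(ptrdiff_t.sizeof == (void*).sizeof); // wide enough for a difference
static assert(size_t.min == 0);                    // unsigned
static assert(ptrdiff_t.min < 0);                  // signed

// Subtracting two pointers into the same array yields a ptrdiff_t
// counted in elements, not bytes.
ptrdiff_t elementsBetween(const(int)* from, const(int)* to)
{
    return to - from;
}

unittest
{
    int[4] arr;
    assert(elementsBetween(&arr[0], &arr[3]) == 3);
    assert(elementsBetween(&arr[3], &arr[0]) == -3);
}
```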

Apr 19 2010

Pelle <pelle.mansson gmail.com> writes:

On 04/19/2010 11:47 AM, Fawzi Mohamed wrote:
 no the opposite is safe (pointer -> size_t) but there is no way
 size_t->pointer can be safe...

Michel Fortin wrote:
 So you shouldn't be able to  *cast a value to a pointer*.  The reverse,
 casting a pointer to a value, makes sense in my opinion:

On 04/18/2010 02:46 PM, Walter Bright wrote:
  *These*  are allowed in safe functions.

(emphasis mine)

I was trying to visualize a point.

Apr 19 2010

BCS <none anon.com> writes:

Hello Walter,

 BCS wrote:
 
 Currently the described code is legal, unsafe (it can result in
 invalid pointers) and has undefined semantics (it can result in
 unpredictable, implementation defined results). What I think
 bearophile wants is for only the last to be changed, that is; you can
 still do things that result in invalid pointers, but it does so in a
 well defined way (at least with regards to the bit pattern the
 pointer ends up as)
 

 I don't think that's a useful thing to specify - where's the
 advantage, and if D is on a machine that does pointers differently,
 why make it impossible to port standard D to it?
 


#1 point to a machine in use now (keeping in mind D already dumped near/far
pointers) that "does pointers differently"?


but on another, it compiles, runs without error and does NOT work?

I'll grant Michel's point about pointer->int for debugging, etc. but even 
then I'd consider requiring an explicit cast.

In the end, while I see the point and see some merit, I'm almost neutral
on the subject.

-- 
... <IXOYE><

Apr 17 2010

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 I find it hard to believe that safe modules can define for example
 the semantic of static casts between size_t and a pointer, while
 unsafe modules can leave it undefined as in C :-) To me this will
 lead to a mess even worse than the C situation.

You won't be able to cast pointers from integral types in safe functions.

Apr 15 2010

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:
 You won't be able to cast pointers from integral types in safe functions.

That doesn't solve the problem, because I will surely want to use unsafe code
in D, and unsafe modules will keep having the same undefined-derived bugs
inherited from C. What I was asking for in this thread is to fix some of the C
holes, not to just forbid the things I was looking for in D in the first place.
If I use D instead of, for example, Python, it is because D has unions and
pointers, which allow me to create tight data structures with good performance.
I am not interested in using D just as a Java.

This can be an irreducible difference between my ideal language and D. Maybe my
purpose is hopeless, who knows. My ideal system language is like a C that
helps me avoid a large percentage of possible bugs: a language with lower-level
features whose behaviour the programmer can predict. Maybe someday I'll try to
create this language :-)

Bye,
bearophile

Apr 15 2010

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 Walter Bright:
 You won't be able to cast pointers from integral types in safe
 functions.

 
 That doesn't solve the problem, because I will surely want to use
 unsafe code in D, and unsafe modules will keep having the same
 undefined-derived bugs inherited from C. What I was asking for in
 this thread is to fix some of the C holes, not to just forbid the
 things I was looking for in D in the first place. If I use D instead
 of for example Python is because D has unions and pointers, that
 allow me to create the tight data structures that have a good
 performance. I am not interested in using D just as a Java.
 
 This can be an irreducible difference between my ideal language and
 D. Maybe my purpose is  hopeless, who knows. My ideal system language
 is like a C that helps me avoid a large percentage of possible bugs.
 A language that the programmer can predict what it will do, with
 lower level features. Maybe someday I'll try to create this language
 :-)

I don't see any way to make conversions between pointers and ints 
implementation defined, and make dereferencing a pointer coming from 
some int anything but undefined behavior.

Apr 16 2010

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

Sorry for the delay, I was away.
In this post I try to write in a quite explicit way.


>I don't see any way to make conversions between pointers and ints
implementation defined,<

I see. Thank you for the explanation, I'm often ignorant enough.


In my original post I was talking about all the places where the C standard
leaves things undefined. I'm not a C language lawyer, so I don't know all the
things the C standard leaves undefined, but I know there are other undefined
things in C besides the pointer <-> int conversion. That's why I was saying
that it can be quite positive to write down a list of such things. So even if
there is no hope to fix this pointer <-> int hole, maybe there are other C
holes that can be fixed. I will not be able to write down a complete list, but
I think having a complete list can be a good starting point.

In my original post I have listed two more things that I think the C standard
leaves undefined:
- Pointer aliasing;
- Read of an enum field different from the last field written;

The first of them is fixed in C99 with the 'restrict' keyword. I guess the D
compiler has to assume all pointers can alias each other (but I don't remember
if the D docs say this explicitly somewhere), because I think D prefers not to
offer keywords whose correct use the compiler itself can't verify.

The second of them concerns code like:

enum SI { short s; int i; }
void main() {
  SI e;
  e.i = 1_000_000;
  int foo = e.s;
}
    
I think that according to the C standard the result of this code (the contents
of foo) is undefined. Is D going to define this, or is it going to leave it
undefined as in C? (Leaving it undefined can speed up D code a little, but
making it defined can make D more flexible: for example, you could use an enum
to split an int into two shorts reliably.) Note: here I am talking about unsafe
D modules, because I think safe D modules can't use enums. So I am talking
about the possibility of removing some undefined behaviours from unsafe D
modules.

Probably the C standard leaves other things undefined. Some of them can cause
bugs in unsafe D code.

Bye,
bearophile

Apr 18 2010

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 The first of them is fixed in C99 with the 'restrict' keyword. I
 guess the D compiler has to assume all pointers can be an alias to
 each other (but I don't remember if the D docs say this explicitely
 somewhere) because I think D prefers to not give keywords that the
 compiler itself can't then test and make sure they are correct.

'restrict' is not at all about eliminating undefined behavior. It is 
about providing more information to the optimizer so better code can be 
generated. If restrict is used incorrectly, however, undefined behavior 
can result. D doesn't have this problem because D doesn't have the 
restrict qualifier.

 The second of them is relative to code like:
 
 enum SI { short s; int i; }
 void main() {
   SI e;
   e.i = 1_000_000;
   int foo = e.s;
 }
 
 I think that according the C standard this code (the contents of foo)
 is undefined. Is D going to define this, or is it going to leave this
 undefined as in C? (Leaving it undefined can speed up a little the D
 code, but making it defined can make D more flexible, for example you
 can use an enum to split an int in two shorts in a reliable way).
 Note: here I am talking about D unsafe modules, because I think safe
 D modules can't use enums. So I am talking about the possibility of
 removing some undefined behaviours from unsafe D modules.

D leaves byte ordering (endianness) implementation defined. I see no way 
to do otherwise without incurring severe performance penalties.
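
A sketch of why the union read depends on byte order, and of a shift/mask alternative that does not. It assumes the predefined `LittleEndian` version identifier; the union and function names are hypothetical. Whether the cross-field union read is guaranteed by the spec is exactly what is being debated in this thread, so that half should be read as "what common compilers do", not as a guarantee.

```d
// An int overlaid with its two 16-bit halves, as stored in memory.
union IntHalves
{
    int whole;
    short[2] halves;
}

// Low 16 bits via the union: which index holds them depends on byte order.
short lowHalfViaUnion(int v)
{
    IntHalves u;
    u.whole = v;
    version (LittleEndian)
        return u.halves[0];
    else
        return u.halves[1];
}

// Low 16 bits via masking: works on values, so byte order does not matter.
short lowHalfViaMask(int v)
{
    return cast(short)(v & 0xFFFF);
}

unittest
{
    // 1_000_000 == 15 * 65_536 + 16_960
    assert(lowHalfViaMask(1_000_000) == 16_960);
    assert(lowHalfViaUnion(1_000_000) == lowHalfViaMask(1_000_000));
}
```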

 Probably the C standard leaves other things undefined. Some of them
 can cause bugs in unsafe D code.

Yes, endianness issues can cause bugs.

Apr 18 2010

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

>D doesn't have this problem because D doesn't have the restrict qualifier.<

So the D2 specs have to explicitly state that all D pointers can be an alias of
each other (and this will make D code slower than Fortran77 code).


>If restrict is used incorrectly, however, undefined behavior can result.<

And one of the few ways out of this, while keeping the language safe, is the
ownership/lent/etc. extensions to the type system. They are cute, but they are
not so easy to learn and can become a small burden for the D programmer.

Another solution is the restrict keyword, as in C. In a D program the restrict
keyword is useful only in a few numeric kernels, often less than 30 lines of
code, that perform tons of computation in a few loops. In such loops the
knowledge that pointers are distinct can significantly improve the generated
code. In all other parts of the program such a keyword is useless or not
essential (such loops can even enjoy a harder form of compilation, almost a
supercompilation. The programmer can even give an attribute like @hot to this
loop/function. GCC too has a 'hot' function attribute, but I think in GCC it's
not very useful).

I don't know what to think about this. Since D is a systems language, it is
expected to offer unsafe features too, like this one. So maybe offering
restrict, to be used in very limited situations, can be acceptable in D too.

In many situations the numerical kernels work over arrays, and D arrays have
both a pointer and a length, so it's easy to test whether a pointer is inside
such an interval and whether two intervals are fully distinct. Such tests can
be done in non-release mode to give a little more safety to the restrict
keyword. Some of them can even be kept in release mode if they are outside the
heavy loops.
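
A rough sketch of such an overlap test, under the assumption that comparing the slices' address ranges as integers is acceptable. The names `disjoint` and `addInto` are hypothetical. The check is an `assert`, so it disappears when compiling with `-release`, and it sits outside the loop.

```d
// True when the two memory ranges share no byte.
// Any slice converts implicitly to const(void)[], whose length is in bytes.
bool disjoint(const(void)[] a, const(void)[] b)
{
    const aStart = cast(size_t) a.ptr;
    const aEnd   = aStart + a.length;
    const bStart = cast(size_t) b.ptr;
    const bEnd   = bStart + b.length;

    // Half-open ranges [start, end) are disjoint when one ends
    // at or before the point where the other begins.
    return aEnd <= bStart || bEnd <= aStart;
}

// A tiny numeric kernel that checks its "restrict"-like precondition once.
void addInto(double[] dst, const(double)[] src)
{
    assert(dst.length == src.length, "length mismatch");
    assert(disjoint(dst, src), "dst and src overlap");

    foreach (i; 0 .. dst.length)
        dst[i] += src[i];
}

unittest
{
    auto buf = new double[8];
    buf[] = 1.0;

    assert(disjoint(buf[0 .. 4], buf[4 .. 8]));
    assert(!disjoint(buf[0 .. 5], buf[4 .. 8]));

    addInto(buf[0 .. 4], buf[4 .. 8]);
    assert(buf[0] == 2.0 && buf[4] == 1.0);
}
```

This gives the programmer a checked promise only; it does not by itself tell the optimizer anything about aliasing.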

Maybe something like restrict, but more limited, could be invented that works
on D arrays only: an extension of the D type system that's useful for numerical
kernels that work on arrays. Something like:

 enforce_restrict(array1, array2, ...) {
    // numerical kernel that uses the arrays
}

Inside that block the D type system knows the arrays are distinct; it's like a
restrict applied to their pointers. I don't know if this can work in practical
situations. Maybe there's an acceptable solution to this problem of D2.

---------------

I think in C you can't reliably cast a pointer from one type to a different
type, because the C compiler (and the D compiler, I presume) can optimize away
some things, making this unsafe/undefined.

This conversion is sometimes done using a union, which is a bit safer than the
reinterpret cast:

union Foo2Bar {
   int* iptr;
   double* dptr;
}

But I think the C standard says that from a union you can't read a field
different from the last field you have written, so that too is unsafe:

import std.stdio;
union U { int i; float f; }
void main() {
  U u;
  u.i = 10;
  writeln(u.i); // defined: i was the last field written
  u.f = 10;
  writeln(u.f); // defined: f was the last field written
  writeln(u.i); // undefined: reads i after writing f
}


I think this is not because of endianness problems, but because the compiler
can keep values in registers and optimize away the reads/writes inside the
union. The D language can state this is defined, making unions a safer way to
statically convert ints to floats, or it can follow the C way to make code a
little faster.
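
A sketch of the int/float conversion in question, assuming D's `float` is a 32-bit IEEE 754 value and that the compiler honours the cross-field read, which is what the common D compilers do as far as I know but is the very point left open here. The names `FloatBits` and `floatBits` are hypothetical.

```d
// Overlay a float with an unsigned integer of the same size.
union FloatBits
{
    uint bits;
    float value;
}

static assert(uint.sizeof == float.sizeof);

// Return the raw bit pattern of f by writing one field and reading the other.
uint floatBits(float f)
{
    FloatBits u;
    u.value = f;
    return u.bits;
}

unittest
{
    assert(floatBits(0.0f) == 0);
    assert(floatBits(1.0f) == 0x3F80_0000);  // sign 0, exponent 127, mantissa 0
    assert(floatBits(-2.0f) == 0xC000_0000); // sign 1, exponent 128, mantissa 0
}
```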

Strict aliasing means that two objects of different types cannot refer to the
same location in memory.

See also the -fno-strict-aliasing GCC compiler switch, and related matters:
In C99, it is illegal to create an alias of a different type than the original.
This is often referred to as the strict aliasing rule.<<


I don't know if D here follows C99 or not.
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html

Bye and thank you,
bearophile

Apr 19 2010

Don <nospam nospam.com> writes:

bearophile wrote:
 Walter Bright:
 
 D doesn't have this problem because D doesn't have the restrict qualifier.<

 
 So the D2 specs have to explicitly state that all D pointers can be an alias
of each other (and this will make D code slower than Fortran77 code).

Array operations address the same issue as restrict, but are much
easier for the compiler. (They don't completely overlap in 
functionality, but the most important cases are covered by both).
AFAIK 'restrict' hasn't been a terribly successful feature in the C world.
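
For readers who have not used them, a minimal sketch of a D array operation doing the work of a small numeric kernel. The function name `scaledAdd` is hypothetical. As I understand the language rules, the destination slice must not overlap the source slices, which is the information `restrict` would otherwise have to supply.

```d
// y[i] += x[i] * alpha for every i, written as one array operation.
void scaledAdd(double[] y, const(double)[] x, double alpha)
{
    assert(y.length == x.length, "length mismatch");
    y[] += x[] * alpha;
}

unittest
{
    double[] y = [1.0, 2.0, 3.0];
    double[] x = [1.0, 1.0, 1.0];

    scaledAdd(y, x, 2.0);
    assert(y == [3.0, 4.0, 5.0]);
}
```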

Apr 19 2010

bearophile <bearophileHUGS lycos.com> writes:

Don:
 (They don't completely overlap in 
 functionality, but the most important cases are covered by both).

I will need to use array ops more, if you are right.


 AFAIK 'restrict' hasn't been a terribly successful feature in the C world.

I agree. (And I am not sure compilers use it well).

Bye,
bearophile

Apr 19 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

bearophile wrote:
 Walter Bright:
 
 Sorry for the delay, I was away.
 In this post I try to write in a quite explicit way.
 
 
 I don't see any way to make conversions between pointers and ints
implementation defined,<

 
 I see. Thank you for the explanation, I'm often ignorant enough.
 
 
 In my original post I was talking about all places where C standard leaves
things undefined. I'm not a C language lawyer, so I don't know all the things
the C standard leaves undefined, but I know there are other undefined things in
C beside the pointer <-> int conversion. That's why I was saying that it can be
quite positive to write down a list of such things. So even if there is no hope
to fix this pointer <-> int hole, maybe there are other C holes that can be
fixed. I will not be able to write down a complete list, but I think having a
complete list can be a good starting point.
 
 In my original post I have listed two more things that I think the C standard
leaves undefined:
 - Pointer aliasing;
 - Read of an enum field different from the last field written;
 
 The first of them is fixed in C99 with the 'restrict' keyword. I guess the D
compiler has to assume all pointers can be an alias to each other (but I don't
remember if the D docs say this explicitely somewhere) because I think D
prefers to not give keywords that the compiler itself can't then test and make
sure they are correct.
 
 The second of them is relative to code like:
 
 enum SI { short s; int i; }
 void main() {
   SI e;
   e.i = 1_000_000;
   int foo = e.s;
 }


Don't you mean 'union' here, not 'enum'?

-Lars

Apr 19 2010

bearophile <bearophileHUGS lycos.com> writes:

Lars T. Kyllingstad:
 enum SI { short s; int i; }
 void main() {
   SI e;
   e.i = 1_000_000;
   int foo = e.s;
 }

 
 Don't you mean 'union' here, not 'enum'?

Yes, sorry -.- In Python newsgroups most code snippets shown by people are
run before posting. It's a habit that I must keep in D newsgroups too.
This whole thread is mostly showing how smart I am not.

Bye and thank you,
bearophile

Apr 19 2010

Jesse Phillips <jessekphillips+D gmail.com> writes:

Part of the reason D leaves undefined behavior is because you are
breaking compiler guarantees. Such as:

    char[] s = ...;
    immutable(char)[] p = cast(immutable)s;     // undefined behavior

I think what would be more helpful is instead propose what undefined
behavior should be defined as and push that. Walter doesn't like
undefined behavior, so I'm sure either he doesn't know what it should
be defined as or has a good reason to leave it.

Apr 15 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 15 Apr 2010 10:24:07 -0400, Jesse Phillips  
<jessekphillips+D gmail.com> wrote:

 Part of the reason D leaves undefined behavior is because you are
 breaking compiler guarantees. Such as:

     char[] s = ...;
     immutable(char)[] p = cast(immutable)s;     // undefined behavior

This is not undefined behavior.  Continuing to use s would be.

I just wanted to make that clear.  Except for strings, there is currently  
no way to generate immutable data except via casting.  Don't use idup  
except on pure value types, that is currently unsafe, see  
http://d.puremagic.com/issues/show_bug.cgi?id=3550
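
A sketch of the "cast, then stop using s" discipline, assuming current Phobos, where `std.exception.assumeUnique` performs the cast and also sets the mutable reference to null so it cannot be used afterwards. The function name `buildGreeting` is hypothetical.

```d
import std.exception : assumeUnique;

// Build a string in a mutable buffer, then hand it out as immutable.
string buildGreeting()
{
    char[] buf = "hello".dup;
    buf[0] = 'H';

    // Casts buf to immutable(char)[] and nulls buf, so no mutable
    // alias to the now-immutable data remains in this scope.
    return assumeUnique(buf);
}

unittest
{
    assert(buildGreeting() == "Hello");

    char[] s = "abc".dup;
    immutable(char)[] p = assumeUnique(s);
    assert(p == "abc");
    assert(s is null); // the mutable reference is gone
}
```

The promise that no other mutable reference exists is still the programmer's; the helper only removes the one it was given.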

-Steve

Apr 15 2010

Sean Kelly <sean invisibleduck.org> writes:

bearophile Wrote:

 In the stdint.h of C99 there is (optionally) uintptr_t that's is an unsigned
int that is large enough to contain a pointer (there is a intptr_t too,
signed). In C99 you use that to convert a pointer to an integral.
 
 I don't know if D specs assert that D size_t is wide enough to represent a
pointer.

 
 There's uintptr_t in D std lib too, I have to start using it:
 http://www.digitalmars.com/d/2.0/phobos/std_stdint.html

I suggest using core.stdc if you're using D2:

http://dsource.org/projects/druntime/browser/trunk/import/core/stdc/stdint.d

Apr 20 2010
