digitalmars.D - Portability bug in integral conversion
- Andrei Alexandrescu (28/28) Jan 15 2011 We've spent a lot of time trying to improve the behavior of integral
- Graham St Jack (7/37) Jan 16 2011 It seems to me that the real problem here is that it isn't meaningful to...
- bearophile (4/8) Jan 16 2011 I'm asking for signed and unsigned overflows for years :-)
- Andrei Alexandrescu (3/11) Jan 16 2011 Nagonna happen.
- Andrei Alexandrescu (3/46) Jan 16 2011 That's too inefficient.
- Graham St Jack (15/61) Jan 16 2011 If that is the case, then a static check like you are suggesting seems
- Andrei Alexandrescu (4/65) Jan 16 2011 That would require flow analysis. I'm not sure we want to embark on that...
- Graham St Jack (18/92) Jan 16 2011 My fear is that if a cast is always required, people will just put one
- Andrei Alexandrescu (9/103) Jan 16 2011 I don't think it's the same. A cast's target will document the behavior....
- Jonathan M Davis (10/14) Jan 16 2011 Well, since it would mean checking a condition every time that you did
- Graham St Jack (8/21) Jan 16 2011 Yes, I agree that checking all the time would be too expensive. What I
- bearophile (8/11) Jan 17 2011 I agree that other solutions have to be adopted first, runtime tests are...
- Walter Bright (6/15) Jan 17 2011 Look at the asm dump of a function. It's full of add's - not only ADD
- Jonathan M Davis (3/20) Jan 17 2011 I think that you'd fare better with the cat. :)
- bearophile (4/7) Jan 17 2011 This answer is a bit relevant only if the programmer is using inline asm...
- Walter Bright (3/14) Jan 17 2011 A lot of the addition is also carried out at link time, and even by the ...
- bearophile (4/19) Jan 17 2011 The back-end carries out my D operations using unsigned differences on C...
- so (5/13) Jan 17 2011 Wouldn't this be just pushing a design error one step further?
- so (1/6) Jan 17 2011 Oh didn't see Don's reply.
- Bruno Medeiros (5/51) Feb 04 2011 Really? :/ Even if the runtime error can be optionally disabled on
- Walter Bright (3/7) Jan 17 2011 1. Yes it is meaningful - depending on what you're doing.
- Don (9/45) Jan 17 2011 This is a new example of an old issue; it is in no way specific to 64 bi...
- Andrei Alexandrescu (9/18) Jan 17 2011 That doesn't compile. This does:
- Don (33/59) Jan 17 2011 Aargh, that should have been:
We've spent a lot of time trying to improve the behavior of integral types in D. For the most part we succeeded, but the success was partial. There was some hope with the polysemy notion, but it was ultimately abandoned because it was deemed too difficult to implement for its benefits, which amounted to fixing a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come.

Some of the 32-64 portability bugs have come in the following form:

char * p;
uint a, b;
...
p += a - b;

On 32 bits, the code works even if a < b: the difference becomes a large unsigned number, which is then converted to a size_t (a no-op, since size_t is uint) and added to p. The pointer itself is a 32-bit quantity, and due to two's complement properties the addition has the same result regardless of the signedness of its operands.

On 64 bits, the same code behaves differently. The difference a - b becomes a large unsigned number (e.g. about 4 billion), which is then converted to a 64-bit size_t. The conversion does not sign-extend, so we end up with the number 4 billion on 64 bits. That is added to a 64-bit pointer, yielding an incorrect value. For the wraparound to work, the 32-bit uint would have had to be sign-extended to 64 bits.

To fix this problem, one possibility is to statically mark every result of uint-uint, uint+int, or uint-int as "non-extensible", i.e. impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? Andrei
Jan 15 2011
On 16/01/11 08:52, Andrei Alexandrescu wrote:We've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0? -- Graham St Jack
Jan 16 2011
Graham St Jack:It seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?I've been asking for signed and unsigned overflow checks for years :-) Bye, bearophile
Jan 16 2011
On 1/16/11 5:53 PM, bearophile wrote:Graham St Jack:Nagonna happen. AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?I'm asking for signed and unsigned overflows for years :-) Bye, bearophile
Jan 16 2011
On 1/16/11 5:24 PM, Graham St Jack wrote:On 16/01/11 08:52, Andrei Alexandrescu wrote:That's too inefficient. AndreiWe've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?
Jan 16 2011
On 17/01/11 10:39, Andrei Alexandrescu wrote:On 1/16/11 5:24 PM, Graham St Jack wrote:If that is the case, then a static check like you are suggesting seems like a good way to go. Sure it will be annoying, but it will pick up a lot of bugs. This particular problem is one that bites me from time to time because I tend to use uints wherever it isn't meaningful to have negative values. It is great until I need to do a subtraction, when I sometimes forget to check which is greater. Would the check you have in mind statically check the following as ok? where a and b are uints and ptr is a pointer: if (a > b) { ptr += (a-b); } -- Graham St JackOn 16/01/11 08:52, Andrei Alexandrescu wrote:That's too inefficient. AndreiWe've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. 
To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?
Jan 16 2011
On 1/16/11 7:51 PM, Graham St Jack wrote:On 17/01/11 10:39, Andrei Alexandrescu wrote:That would require flow analysis. I'm not sure we want to embark on that ship. In certain situations value range propagation could take care of it. AndreiOn 1/16/11 5:24 PM, Graham St Jack wrote:If that is the case, then a static check like you are suggesting seems like a good way to go. Sure it will be annoying, but it will pick up a lot of bugs. This particular problem is one that bights me from time to time because I tend to use uints wherever it isn't meaningful to have negative values. It is great until I need to do a subtraction, when I sometimes forget to check which is greater. Would the check you have in mind statically check the following as ok? where a and b are uints and ptr is a pointer: if (a > b) { ptr += (a-b); }On 16/01/11 08:52, Andrei Alexandrescu wrote:That's too inefficient. AndreiWe've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. 
After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?
Jan 16 2011
On 17/01/11 13:30, Andrei Alexandrescu wrote:On 1/16/11 7:51 PM, Graham St Jack wrote:My fear is that if a cast is always required, people will just put one in out of habit and we are no better off (just like exception-swallowing). Is the cost of run-time checking really prohibitive? Correct code should have some checking anyway. Maybe providing phobos functions to perform various correct-usage operations with run-time checks like in my code fragment above would be useful. They could do the cast, and most of the annoyance factor would be dealt with. A trivial example: int difference(uint a, uint b) { if (a >= b) { return cast(int) a-b; } else { return -(cast(int) b-a); } } -- Graham St JackOn 17/01/11 10:39, Andrei Alexandrescu wrote:That would require flow analysis. I'm not sure we want to embark on that ship. In certain situations value range propagation could take care of it. AndreiOn 1/16/11 5:24 PM, Graham St Jack wrote:If that is the case, then a static check like you are suggesting seems like a good way to go. Sure it will be annoying, but it will pick up a lot of bugs. This particular problem is one that bites me from time to time because I tend to use uints wherever it isn't meaningful to have negative values. It is great until I need to do a subtraction, when I sometimes forget to check which is greater. Would the check you have in mind statically check the following as ok? where a and b are uints and ptr is a pointer: if (a > b) { ptr += (a-b); }On 16/01/11 08:52, Andrei Alexandrescu wrote:That's too inefficient. AndreiWe've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. 
Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?
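Translated to C for illustration (a hypothetical sketch of the `difference` helper above, not a Phobos function), with the result widened to 64 bits so that every possible difference of two 32-bit values is representable:

```c
#include <stdint.h>

/* Signed difference of two 32-bit unsigned values. Returning a 64-bit
   signed type makes the full range [-(2^32 - 1), 2^32 - 1] representable;
   the int-returning D snippet above would still wrap for differences
   larger than int.max. */
int64_t difference(uint32_t a, uint32_t b) {
    if (a >= b)
        return (int64_t)(a - b);   /* in [0, 2^32 - 1] */
    else
        return -(int64_t)(b - a);  /* in [-(2^32 - 1), 0] */
}
```

For example, `difference(3, 8)` yields -5. The widened return type is a design choice worth noting: a helper that merely casts to int trades one silent wraparound for another.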
Jan 16 2011
On 1/16/11 9:32 PM, Graham St Jack wrote:On 17/01/11 13:30, Andrei Alexandrescu wrote:I don't think it's the same. A cast's target will document the behavior. Right now we're simply doing silently the patently wrong thing. Walter stared at that code for hours. A cast would definitely be a good clue even if wrong.On 1/16/11 7:51 PM, Graham St Jack wrote:My fear is that if a cast is always required, people will just put one in out of habit and we are no better off (just like exception-swallowing).On 17/01/11 10:39, Andrei Alexandrescu wrote:That would require flow analysis. I'm not sure we want to embark on that ship. In certain situations value range propagation could take care of it. AndreiOn 1/16/11 5:24 PM, Graham St Jack wrote:If that is the case, then a static check like you are suggesting seems like a good way to go. Sure it will be annoying, but it will pick up a lot of bugs. This particular problem is one that bights me from time to time because I tend to use uints wherever it isn't meaningful to have negative values. It is great until I need to do a subtraction, when I sometimes forget to check which is greater. Would the check you have in mind statically check the following as ok? where a and b are uints and ptr is a pointer: if (a > b) { ptr += (a-b); }On 16/01/11 08:52, Andrei Alexandrescu wrote:That's too inefficient. AndreiWe've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... 
p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?Is the cost of run-time checking really prohibitive?Yes. There is no question about that. This is not negotiable.Correct code should have some checking anyway. Maybe providing phobos functions to perform various correct-usage operations with run-time checks like in my code fragment above would by useful. They could do the cast, and most of the annoyance factor would be dealt with. A trivial example: int difference(uint a, uint b) { if (a >= b) { return cast(int) a-b; } else { return -(cast(int) b-a); } }The general approach is to define properly bounded types with policy-based checking. Andrei
Jan 16 2011
On Sunday 16 January 2011 19:38:55 Andrei Alexandrescu wrote:On 1/16/11 9:32 PM, Graham St Jack wrote:Well, since it would mean checking a condition every time that you did arithmetic, that would likely _at least_ double the cost of doing any arithmetic. And particularly since arithmetic is such a basic operation that _everything else_ relies on, that could get really expensive, really fast. Yeah. I don't think that that's negotiable. Absolutely best case, I could see adding a compiler flag to enable it for debugging purposes, but it would definitely be expensive to do such checks and would be totally unacceptable in the release build of a systems programming language. - Jonathan M DavisIs the cost of run-time checking really prohibitive?Yes. There is no question about that. This is not negotiable.
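For a sense of what per-operation checking would look like, here is a sketch in C using the GCC/Clang overflow builtins (the helper name and the abort-on-wrap policy are invented for illustration, and the builtins are compiler extensions, not standard C):

```c
#include <stdint.h>
#include <stdlib.h>

/* Each checked subtraction adds a compare-and-branch on top of the
   plain SUB instruction - cheap individually, but it lands on every
   arithmetic operation in the program, which is the cost being
   objected to. */
uint32_t checked_sub(uint32_t a, uint32_t b) {
    uint32_t result;
    if (__builtin_sub_overflow(a, b, &result)) /* true when a < b */
        abort(); /* a D version might throw a range error instead */
    return result;
}
```

A compiler flag that enables such checks only in debug builds, as suggested above, would confine the overhead to testing.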
Jan 16 2011
On 17/01/11 14:16, Jonathan M Davis wrote:On Sunday 16 January 2011 19:38:55 Andrei Alexandrescu wrote:Yes, I agree that checking all the time would be too expensive. What I meant was that we could provide functions that could do appropriate checking when it is needed. Andrei didn't like the functions idea, suggesting types that do policy-based checking, which I am happy with. -- Graham St JackOn 1/16/11 9:32 PM, Graham St Jack wrote:Well, since it would mean checking a condition every time that you did arithmetic, that would likely _at least_ double the cost of doing any arithmetic. And particularly since arithmetic is such a basic operation that _everything else_ relies on, that could get really expensive, really fast. Yeah. I don't think that that's negotiable. Absolutely best case, I could see adding a compiler flag to enable it for debugging purposes, but it would definitely be expensive to do such checks and would be totally unacceptable in the release build of a systems programming language. - Jonathan M DavisIs the cost of run-time checking really prohibitive?Yes. There is no question about that. This is not negotiable.
Jan 16 2011
Graham St Jack:Yes, I agree that checking all the time would be too expensive.I agree that other solutions have to be adopted first, runtime tests are the last thing to try. But I think Andrei doesn't know how expensive that would be. --------------------- Walter:1. Yes it is meaningful - depending on what you're doing.I am not sure.2. Such a runtime test is expensive in terms of performance and code bloat.I have not seen even synthetic benchmarks about this. Bye, bearophile
Jan 17 2011
bearophile wrote:Look at the asm dump of a function. It's full of add's - not only ADD instructions, but addressing mode multiplies and add's. Subtraction is often expressed in terms of addition, relying on twos-complement wraparound. Trying to remove twos-complement arithmetic from a systems language is like trying to teach your cat to fetch.1. Yes it is meaningful - depending on what you're doing.I am not sure.2. Such a runtime test is expensive in terms of performance and code bloat.I have not seen even synthetic benchmarks about this.
Jan 17 2011
On Monday 17 January 2011 01:32:39 Walter Bright wrote:bearophile wrote:I think that you'd fare better with the cat. :) - Jonathan M DavisLook at the asm dump of a function. It's full of add's - not only ADD instructions, but addressing mode multiplies and add's. Subtraction is often expressed in terms of addition, relying on twos-complement wraparound. Trying to remove twos-complement arithmetic from a systems language is like trying to teach your cat to fetch.1. Yes it is meaningful - depending on what you're doing.I am not sure.2. Such a runtime test is expensive in terms of performance and code bloat.I have not seen even synthetic benchmarks about this.
Jan 17 2011
Walter:Look at the asm dump of a function. It's full of add's - not only ADD instructions, but addressing mode multiplies and add's. Subtraction is often expressed in terms of addition, relying on twos-complement wraparound.This answer is relevant only if the programmer is using inline asm, while the discussion was about unsigned differences in D code, which are uncommon in my D code. Sometimes I even assign lengths to signed-word variables, to avoid some signed/unsigned comparison bugs. Bye, bearophile
Jan 17 2011
bearophile wrote:Walter:A lot of the addition is also carried out at link time, and even by the loader. Subtraction is done by relying on overflow.Look at the asm dump of a function. It's full of add's - not only ADD instructions, but addressing mode multiplies and add's. Subtraction is often expressed in terms of addition, relying on twos-complement wraparound.This answer is a bit relevant only if the programmer is using inline asm, while the discussion was about unsigned differences in D code, that are uncommon in my D code. Sometimes I even assign lengths to signed-word variables, to avoid some signed/unsigned comparison bugs.
Jan 17 2011
Walter:bearophile wrote:The back-end carries out my D operations using unsigned differences on CPU registers, the linker has to use them, etc. But the discussion was about explicit operations done by the D code written by the programmer. Modular arithmetic done by unsigned fixed bitfields is mathematically sound, but it's a bit too much bug-prone for normal Safe D modules :-) Bye, bearophileWalter:A lot of the addition is also carried out at link time, and even by the loader. Subtraction is done by relying on overflow.Look at the asm dump of a function. It's full of add's - not only ADD instructions, but addressing mode multiplies and add's. Subtraction is often expressed in terms of addition, relying on twos-complement wraparound.This answer is a bit relevant only if the programmer is using inline asm, while the discussion was about unsigned differences in D code, that are uncommon in my D code. Sometimes I even assign lengths to signed-word variables, to avoid some signed/unsigned comparison bugs.
Jan 17 2011
int difference(uint a, uint b) { if (a >= b) { return cast(int) a-b; } else { return -(cast(int) b-a); } }Wouldn't this be just pushing a design error one step further? uint has no mathematical basis whatsoever; it is there because we "can" have it. I have another solution: remove "uint-uint" from the language and provide explicit functions.
Jan 17 2011
Wouldn't this be just pushing a design error one step further? uint has no mathematical basis whatsoever, it is there because we "can" have it. I have another solution, remove "uint-uint" from the language and provide explicit functions.Oh didn't see Don's reply.
Jan 17 2011
On 17/01/2011 00:09, Andrei Alexandrescu wrote:On 1/16/11 5:24 PM, Graham St Jack wrote:Really? :/ Even if the runtime error can be optionally disabled on compilation, like arrays bound checking? -- Bruno Medeiros - Software EngineerOn 16/01/11 08:52, Andrei Alexandrescu wrote:That's too inefficient. AndreiWe've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiIt seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). 
Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?
Feb 04 2011
Graham St Jack wrote:It seems to me that the real problem here is that it isn't meaningful to perform (a-b) on unsigned integers when (a<b). Attempting to clean up the resultant mess is really papering over the problem. How about a runtime error instead, much like dividing by 0?1. Yes it is meaningful - depending on what you're doing. 2. Such a runtime test is expensive in terms of performance and code bloat.
Jan 17 2011
Andrei Alexandrescu wrote:We've spent a lot of time trying to improve the behavior of integral types in D. For the most part, we succeeded, but the success was partial. There was some hope with the polysemy notion, but it ultimately was abandoned because it was deemed too difficult to implement for its benefits, which were considered solving a minor annoyance. I was sorry to see it go, and I'm glad that now its day of reckoning has come. Some of the 32-64 portability bugs have come in the following form: char * p; uint a, b; ... p += a - b; On 32 bits, the code works even if a < b: the difference will become a large unsigned number, which is then converted to a size_t (which is a no-op since size_t is uint) and added to p. The pointer itself is a 32-bit quantity. Due to two's complement properties, the addition has the same result regardless of the signedness of its operands. On 64-bits, the same code has different behavior. The difference a - b becomes a large unsigned number (say e.g. 4 billion), which is then converted to a 64-bit size_t. After conversion the sign is not extended - so we end up with the number 4 billion on 64-bit. That is added to a 64-bit pointer yielding an incorrect value. For the wraparound to work, the 32-bit uint should have been sign-extended to 64 bit. To fix this problem, one possibility is to mark statically every result of one of uint-uint, uint+int, uint-int as "non-extensible", i.e. as impossible to implicitly extend to a 64-bit value. That would force the user to insert a cast appropriately. Thoughts? Ideas? AndreiThis is a new example of an old issue; it is in no way specific to 64 bits. Any expression which contains a size-extension AND a signed<->unsigned implicit conversion is almost always a bug. (unsigned - unsigned leaves the carry flag unknown, so sign extension is impossible). It happens a lot with ushort, ubyte. There are several examples of it in bugzilla. short a=-1; a = a>>>1; is a particularly horrific example. 
I think it should be forbidden in all cases. I think it can be done with a flag in the range propagation.
Jan 17 2011
On 1/17/11 2:47 AM, Don wrote:Andrei Alexandrescu wrote:[snip]This is a new example of an old issue; it is in no way specific to 64 bits. Any expression which contains a size-extension AND a signed<->unsigned implicit conversion is almost always a bug. (unsigned - unsigned leaves the carry flag unknown, so sign extension is impossible). It happens a lot with ushort, ubyte. There are several examples of it in bugzilla. short a=-1; a = a>>>1; is a particularly horrific example.That doesn't compile. This does: short a = -1; a >>>= 1; a becomes 32767, which didn't surprise me. Replacing >>>= with >>= keeps a unchanged, which I also didn't find surprising.I think it should be forbidden in all cases. I think it can be done with a flag in the range propagation.Yes, that would be awesome! Andrei
Jan 17 2011
Andrei Alexandrescu wrote:On 1/17/11 2:47 AM, Don wrote:Aargh, that should have been: short a = -1; ushort b = -1; assert( a == b ); // passes assert( a >>> 1 == b >>> 1); // fails Another example: uint x = 3; uint y = 8; ulong z = 0; ulong a = (z + x) - y; ulong b = z + (x - y); assert(a == b); // Thought addition was associative, did you? 'a' only involves size-extension, so it's OK. But 'b' has a subexpression which sets the carry bit. Actually it doesn't even need subtraction. uint x = uint.max; uint y = uint.max; ulong z = 0; ulong a = (z + x) + y; ulong b = z + (x + y); assert(a == b); // Still thought addition was associative? It's the same deal: you shouldn't be able to size-extend, when the state of the carry flag is unknown. Once you have performed an operation which can wrap around, you have discarded the carry bit. This means you have made a commitment to arithmetic modulo 2^^32. And then the next addition is arithmetic modulo 2^^64! Which is a fundamentally different, incompatible operation. It should be a type mismatch. Note that because small types get promoted to int, the problem mostly shows up with uint -> ulong (for smaller types, the carry bit is retained inside the int).Andrei Alexandrescu wrote:[snip]This is a new example of an old issue; it is in no way specific to 64 bits. Any expression which contains a size-extension AND a signed<->unsigned implicit conversion is almost always a bug. (unsigned - unsigned leaves the carry flag unknown, so sign extension is impossible). It happens a lot with ushort, ubyte. There are several examples of it in bugzilla. short a=-1; a = a>>>1; is a particularly horrific example.That doesn't compile. This does: short a = -1; a >>>= 1; a becomes 32767, which didn't surprise me. Replacing >>>= with >>= keeps a unchanged, which I also didn't find surprising.I think it should be forbidden in all cases. I think it can be done with a flag in the range propagation.Yes, that would be awesome! Andrei
Jan 17 2011