digitalmars.D - primitive value overflow
- luka8088 (17/17) May 16 2013 Hello everyone.
- Andrej Mitrovic (4/7) May 16 2013 Yeah, I don't like these bugs either. In the meantime you can
- Mr. Anonymous (7/16) May 16 2013 It's exactly the same as checking if(text1.length > text2.length).
- luka8088 (7/23) May 16 2013 I agree that it is exactly the same as checking if (text1.length >
- Jonathan M Davis (19/25) May 16 2013 Well, I'm not sure what can be done about that. length is size_t, which ...
- Mr. Anonymous (7/53) May 16 2013 I agree with Walter if we're talking about production code, but I
- 1100110 (10/58) May 16 2013 e
- Andrej Mitrovic (4/6) May 16 2013 We could add it in -debug mode perhaps. Ideally this sort of thing
- luka8088 (2/11) May 16 2013 Yes, thanks for the advice, I did something similar. =)
- Marco Leise (34/47) May 16 2013 Now that doesn't work if you deal with some text2 that is over
- Regan Heath (15/33) May 17 2013 This is not a problem I worry about in the general case. I can honestly...
- luka8088 (13/58) May 23 2013 I understand perfectly the issue that you are pointing out. But that is
- Marco Leise (11/24) May 23 2013
- Ary Borenszweig (3/14) May 17 2013 It makes sense if you don't want to have these subtle bugs in your
- Marco Leise (7/27) May 18 2013 Just saying: the subtraction of two integers creates a
- Minas Mina (2/2) May 18 2013 I agree that checks for overflow should exist in debug builds
- Peter Alexander (4/6) May 23 2013 What about code that relies on overflow? It's well-defined
- bearophile (20/23) May 23 2013 Do you rely on signed or unsigned overflow?
- luka8088 (5/27) May 29 2013 I agree completely!
- Marco Leise (6/9) May 23 2013 See my post about Delphi's approach. It could be disabled in D
- Timothee Cour (35/44) May 25 2013 A)
- Marco Leise (3/3) May 26 2013 Good that we talked about it.
Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length; and in case text2 was longer then text1 I got something around 4294967291. So I opened an issue: http://d.puremagic.com/issues/show_bug.cgi?id=10093 I know that there is a perfectly valid reason for this behavior, and that this behavior is not undefined, but it is unexpected, especially because unsigned is never mentioned in the code. One solution that comes to mind is changing length to signed, but that makes no sense because length is never negative. After some thinking a though came that maybe such value overflow should be treated the same way as array overflow and checked by druntime with optional disabling in production code (like array bound checks)? I think it would be very helpful to get an error for such mistake (that could very easily happen by accident), and on the other hand it can be disabled (like all other checks).
May 16 2013
On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length;Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
May 16 2013
On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:It's exactly the same as checking if(text1.length > text2.length). But the idea of checking for integer overflows in debug builds is really nice. P.S. I remember Microsoft had some serious bug because of an integer overflow, that allowed a remote machine to create a denial of service.Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length;Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
May 16 2013
On 16.5.2013. 22:35, Mr. Anonymous wrote:On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:It's exactly the same as checking if(text1.length > text2.length). But the idea of checking for integer overflows in debug builds is really nice. P.S. I remember Microsoft had some serious bug because of an integer overflow, that allowed a remote machine to create a denial of service.Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length;Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
May 16 2013
On Thursday, May 16, 2013 22:42:23 luka8088 wrote:I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.Well, I'm not sure what can be done about that. length is size_t, which is the unsigned integral value which matches the architecture (uint for 32-bit and ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't read it recently). If it's not, then it should be made clearer, but using size_t for length is pervasive in D as it is in C++, and if you know the standard types, you know what size_t is. As for overflow checking, it's come up quite a few times, and Walter is completely against it. The hardware doesn't support it, and it would definitely be slow if it were added. The standard way to handle that if you want it is to create user-defined integral type which does the checks (though that obviously won't help you when you can't control the types that you're dealing with). But if you want to add checks in your code, you can always create a wrapper function for doing the subtraction. And if you wanted the checks to only be there in non-release mode, then you could even put the checks in a version(assert) block. So, you can add checks if you want them, but there's pretty much no way that how unsigned or overlflow are handled in the language is going to change. - Jonathan M Davis
May 16 2013
On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote:On Thursday, May 16, 2013 22:42:23 luka8088 wrote:I agree with Walter if we're talking about production code, but I think it could be very helpful for debug builds. P.S.I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.Well, I'm not sure what can be done about that. length is size_t, which is the unsigned integral value which matches the architecture (uint for 32-bit and ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't read it recently). If it's not, then it should be made clearer, but using size_t for length is pervasive in D as it is in C++, and if you know the standard types, you know what size_t is. As for overflow checking, it's come up quite a few times, and Walter is completely against it. The hardware doesn't support it, and it would definitely be slow if it were added. The standard way to handle that if you want it is to create user-defined integral type which does the checks (though that obviously won't help you when you can't control the types that you're dealing with). But if you want to add checks in your code, you can always create a wrapper function for doing the subtraction. And if you wanted the checks to only be there in non-release mode, then you could even put the checks in a version(assert) block. So, you can add checks if you want them, but there's pretty much no way that how unsigned or overlflow are handled in the language is going to change. - Jonathan M DavisThe hardware doesn't support itThat's not completely true. e.g. x86, while it doesn't throw an exception on an overflow, it does set a flag, which could be relatively cheaply checked.
May 16 2013
On 05/16/2013 04:17 PM, Mr. Anonymous wrote:On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote:areOn Thursday, May 16, 2013 22:42:23 luka8088 wrote:I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aw=eof the fact that you are working with unsigned values. But in the cod=fthat I wrote there was no mentioning of unsigned so the possibility o=that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.Well, I'm not sure what can be done about that. length is size_t, which is the unsigned integral value which matches the architecture (uint for 32-bit and ulong for 64-bit). AFAIK, the documentation is clear on this (though I=inghaven't read it recently). If it's not, then it should be made clearer, but us=hesize_t for length is pervasive in D as it is in C++, and if you know t=sstandard types, you know what size_t is. As for overflow checking, it's come up quite a few times, and Walter i=rcompletely against it. The hardware doesn't support it, and it would definitely be slow if it were added. The standard way to handle that if you want it is to create user-defined integral type which does the checks (though that obviously won't help you when you can't control the types that you're dealing with). But if you want to add checks in your code, you can always create a wrappe=function for doing the subtraction. And if you wanted the checks to only be there in non-release mode, then you could even put the checks in a version(assert) block. So, you can add checks if you want them, but there's pretty much no way that how unsigned or overflow are handled in the language is going to change. - Jonathan M Davis=20 I agree with Walter if we're talking about production code, but I think=it could be very helpful for debug builds. =20 P.S.Perhaps a compiler flag would be nice, can't really argue performance if it's explicitly chosen. Like -profile or -cov.The hardware doesn't support itThat's not completely true. e.g. x86, while it doesn't throw an exception on an overflow, it does set a flag, which could be relatively cheaply checked.
May 16 2013
On 5/16/13, Jonathan M Davis <jmdavisProg gmx.com> wrote:The hardware doesn't support it, and it would definitely be slow if it were added.We could add it in -debug mode perhaps. Ideally this sort of thing would be its own switch, but yes that goes towards switch proliferation, I know the story.
May 16 2013
On 16.5.2013. 22:29, Andrej Mitrovic wrote:On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:Yes, thanks for the advice, I did something similar. =)Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length;Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
May 16 2013
Am Thu, 16 May 2013 22:39:16 +0200 schrieb luka8088 <luka8088 owave.net>:On 16.5.2013. 22:29, Andrej Mitrovic wrote:Now that doesn't work if you deal with some text2 that is over 2 GiB longer than text1. My approach is to see the close relation between any offset from beginning or length to the machine memory model. So any byte or char array in memory naturally has an unsigned length typed by the architecture's word size (e.g. 32 or 64 bit). With that in mind I _only_ ever subtract two values if I know the difference will be positive. That is the case for file_size - valid_offset for example. I don't know the context for your line of code, but if text1 and text2 are passed in as parameters to a function, a contract should verify that text1 is longer (or equal) than text2. Now feel free to tell me I'm wrong, but with the two lengths being natural numbers or "countable", I claim that a negative value for your offset variable would not have been usable anyway. It is a result that makes no sense. So on the next line you probably check "if (offset >= 0)" which is the same as putting "if (text1.length >= text2.length)" one line earlier to avoid running into the situation where you can end up with an over- or underflow because the result range of size_t - size_t fits neither size_t nor sizediff_t. Say text1 is 0 bytes long and text2 is 3_000_000_000 bytes long. Then -3_000_000_000 would be the result that cannot be stored in any 32-bit type. And thus it is important to think about possible input to your integer calculations and place if-else-branches there (or in-contracts), especially when the language accepts overflows silently. But I'd really like to see the context of your code if it is not a secret. :) -- MarcoOn Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:Yes, thanks for the advice, I did something similar. =)Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length;Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
May 16 2013
On Thu, 16 May 2013 23:23:20 +0100, Marco Leise <Marco.Leise gmx.de> wrote:Am Thu, 16 May 2013 22:39:16 +0200 schrieb luka8088 <luka8088 owave.net>:This is not a problem I worry about in the general case. I can honestly say I have never had to write any code which had to handle strings that long.. in fact the longest string length would probably have fit in ushort TBH. This is why, for better or worse, I cast all C strlen() calls from size_t to int as a matter of course. Sure, length cannot be negative, but for all practical purposes it will not be larger than signed int either. I have been doing this for ages now and have had exactly 0 bugs as a result, vs the few I have had using unsigned types and subtraction. Of course, were I coding something that was intended to specifically handle super sized strings, it would be a different story. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/On 16.5.2013. 22:29, Andrej Mitrovic wrote:Now that doesn't work if you deal with some text2 that is over 2 GiB longer than text1.On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:autoHello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length;Yeah, I don't like these bugs either. In the meantime you can swapwith 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.Yes, thanks for the advice, I did something similar. =)
May 17 2013
On 17.5.2013. 0:23, Marco Leise wrote:Am Thu, 16 May 2013 22:39:16 +0200 schrieb luka8088<luka8088 owave.net>:I understand perfectly the issue that you are pointing out. But that is not the real issue here. I know how computer arithmetic works, the understanding is not the issue here. The real issue is that at the time of writing unsigned was never mentioned, and it never came to my mind that it could be the issue (until I printed that actual value). So in reality I made a mistake, as I sometimes make a typo mistake. And I fix them (as I did this one) by debugging. What I want to point out is that this kind of mistakes can be pointed out to the programer in debug mode (and save programer time) by adding a runtime check. The only real benefit here would be shorter debug time, and the only real tradeoff would be slower executing in debug mode. Nothing else.On 16.5.2013. 22:29, Andrej Mitrovic wrote:Now that doesn't work if you deal with some text2 that is over 2 GiB longer than text1. My approach is to see the close relation between any offset from beginning or length to the machine memory model. So any byte or char array in memory naturally has an unsigned length typed by the architecture's word size (e.g. 32 or 64 bit). With that in mind I _only_ ever subtract two values if I know the difference will be positive. That is the case for file_size - valid_offset for example. I don't know the context for your line of code, but if text1 and text2 are passed in as parameters to a function, a contract should verify that text1 is longer (or equal) than text2. Now feel free to tell me I'm wrong, but with the two lengths being natural numbers or "countable", I claim that a negative value for your offset variable would not have been usable anyway. It is a result that makes no sense. So on the next line you probably check "if (offset>= 0)" which is the same as putting "if (text1.length>= text2.length)" one line earlier to avoid running into the situation where you can end up with an over- or underflow because the result range of size_t - size_t fits neither size_t nor sizediff_t. Say text1 is 0 bytes long and text2 is 3_000_000_000 bytes long. Then -3_000_000_000 would be the result that cannot be stored in any 32-bit type. And thus it is important to think about possible input to your integer calculations and place if-else-branches there (or in-contracts), especially when the language accepts overflows silently. But I'd really like to see the context of your code if it is not a secret. :)On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:Yes, thanks for the advice, I did something similar. =)Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length;Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
May 23 2013
Am Thu, 23 May 2013 19:55:36 +0200 schrieb luka8088 <luka8088 owave.net>:I understand perfectly the issue that you are pointing out. But that is not the real issue here. I know how computer arithmetic works, the understanding is not the issue here. The real issue is that at the time of writing unsigned was never mentioned, and it never came to my mind that it could be the issue (until I printed that actual value). So in reality I made a mistake, as I sometimes make a typo mistake. And I fix them (as I did this one) by debugging. What I want to point out is that this kind of mistakes can be pointed out to the programer in debug mode (and save programer time) by adding a runtime check. The only real benefit here would be shorter debug time, and the only real tradeoff would be slower executing in debug mode. Nothing else.Ok, I remember that from Delphi http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/devcommon/compdirsoverflowchecking_xml.html It is a good idea and probably better then some Checked!int construct. Overflows are only wanted in a few places, so it makes sense to have this check enabled globally and to disable it for some code locally. -- Marco
May 23 2013
On 5/16/13 5:24 PM, luka8088 wrote:Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length; and in case text2 was longer then text1 I got something around 4294967291. So I opened an issue: http://d.puremagic.com/issues/show_bug.cgi?id=10093 I know that there is a perfectly valid reason for this behavior, and that this behavior is not undefined, but it is unexpected, especially because unsigned is never mentioned in the code. One solution that comes to mind is changing length to signed, but that makes no sense because length is never negative.It makes sense if you don't want to have these subtle bugs in your programming language -.-
May 17 2013
Am Fri, 17 May 2013 14:43:08 -0300 schrieb Ary Borenszweig <ary esperanto.org.ar>:On 5/16/13 5:24 PM, luka8088 wrote:Just saying: the subtraction of two integers creates a resulting range that doesn't fit in ANY integer of same size. If you want to be safe disallow subtraction -.- -- MarcoHello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length; and in case text2 was longer then text1 I got something around 4294967291. So I opened an issue: http://d.puremagic.com/issues/show_bug.cgi?id=10093 I know that there is a perfectly valid reason for this behavior, and that this behavior is not undefined, but it is unexpected, especially because unsigned is never mentioned in the code. One solution that comes to mind is changing length to signed, but that makes no sense because length is never negative.It makes sense if you don't want to have these subtle bugs in your programming language -.-
May 18 2013
I agree that checks for overflow should exist in debug builds (and not exist in release builds).
May 18 2013
On Saturday, 18 May 2013 at 20:29:57 UTC, Minas Mina wrote:I agree that checks for overflow should exist in debug builds (and not exist in release builds).What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)
May 23 2013
Peter Alexander:What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)Do you rely on signed or unsigned overflow? My opinions on this topic have changed few times. A modern system language should offer the programmer both integral types for the rare situations where the overflow or wrap around are expected or acceptable, and other "default" integral types to be used in all the other situations, where overflow or wrap-around are unexpected and not desired. The implementation then should offer ways to optionally perform run-time tests on the second group of integrals. A very good system language should also offer various means to statically verify the bounds of a certain percentage of values and expression results, to reduce the amount of run-time tests needed (here things like "Liquid Types" help). D currently doesn't have such safe built-in types, and it doesn't offer means to create such efficient types in library code. I think such means should be provided: http://d.puremagic.com/issues/show_bug.cgi?id=9850 Bye, bearophile
May 23 2013
On 24.5.2013. 1:58, bearophile wrote:Peter Alexander:I agree completely! Would it maybe be a good consensus to allow operator overloading and invariants for primitive types? Or is there a reason why that is a bad idea?What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)Do you rely on signed or unsigned overflow? My opinions on this topic have changed few times. A modern system language should offer the programmer both integral types for the rare situations where the overflow or wrap around are expected or acceptable, and other "default" integral types to be used in all the other situations, where overflow or wrap-around are unexpected and not desired. The implementation then should offer ways to optionally perform run-time tests on the second group of integrals. A very good system language should also offer various means to statically verify the bounds of a certain percentage of values and expression results, to reduce the amount of run-time tests needed (here things like "Liquid Types" help). D currently doesn't have such safe built-in types, and it doesn't offer means to create such efficient types in library code. I think such means should be provided: http://d.puremagic.com/issues/show_bug.cgi?id=9850 Bye, bearophile
May 29 2013
Am Fri, 24 May 2013 01:35:42 +0200 schrieb "Peter Alexander" <peter.alexander.au gmail.com>:What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)See my post about Delphi's approach. It could be disabled in D using pragma(), UDAs or even scopes. -- Marco
May 23 2013
A) Requiring a construct such as Checked!int is too complicated for the user in general as it requires a lot of code change from the user. It may be useful in certain cases but overflow bugs will crop up in unexpected places. B) To help finding such bugs, introduce a special version identifier for checking arithmetic overflow (eg -version=check_arithmetic_overflow) (no new compiler flag syntax, just a built in version identifier). C) It shouldn't be implied by -debug flag as it may slow down debug code significantly, making debug code too slow to be useful. Conflating the "debug" statements and arithmetic overflow is a bad idea; debugging shouldn't be an all or nothing option. D) More generally, there's a trade-off between keeping the number of compiler flags low for simplicity and having and having more flags for more expressiveness. I feel dmd is trying too hard to keep the number low. There's a reason why gcc, clang etc have so many flags, which gives user finer control. There should be a good middle ground: the defaults should be simple (optimize vs debug), but when searching for bugs, optimizing or dealing with deprecated constructs, we want flexibility. F) Finally, some code may indeed depend on arithmetic overflow, and shouldn't be affected by -version=check_arithmetic_overflow. Maybe we need a way to "unset" a version identifier; here's a possible syntax: ---- module test; version = check_arithmetic_overflow; //existing syntax void fun(){} version -= check_arithmetic_overflow; //new syntax void fun2(){} //now check_arithmetic_overflow is unset ---- This would be very useful in many ways, eg make some code run fast while the rest is in debug mode (otherwise the debug code would be too slow). On Thu, May 23, 2013 at 9:13 PM, Marco Leise <Marco.Leise gmx.de> wrote:Am Fri, 24 May 2013 01:35:42 +0200 schrieb "Peter Alexander" <peter.alexander.au gmail.com>:What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)See my post about Delphi's approach. It could be disabled in D using pragma(), UDAs or even scopes. -- Marco
May 25 2013