digitalmars.D - GCC Undefined Behavior Sanitizer
- bearophile (4/4) Oct 16 2014 Just found with Reddit. C seems one step ahead of D with this:
- Paulo Pinto (14/18) Oct 17 2014 The sad thing about this tools is that they are all about fixing
- Marco Leise (20/45) Oct 17 2014 I have a feeling back then the C designers weren't quite sure
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (43/45) Oct 17 2014 Actually, this is the first thing I would change about D and make
- eles (3/7) Oct 17 2014 Nice idea, but how to persuade libraries to play that game?
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (27/33) Oct 17 2014 1. Provide a meta-language for writing propositions that
- eles (15/29) Oct 19 2014 That's complicated, to provide another langage for describing the
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (57/73) Oct 19 2014 I think D need to unify UDA, type-traits, template constraints
- eles (23/33) Oct 19 2014 I mostly agree with all that you are saying, still I am aware
- ketmar via Digitalmars-d (28/58) Oct 17 2014 do you know any widespread hardware with doesn't work this way?
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (43/77) Oct 17 2014 Yes, the carry flag is set if you add with carry. It means you
- ketmar via Digitalmars-d (46/91) Oct 17 2014 i was writing about 'if_carry_set'. yes, i really-really-really want
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (23/36) Oct 17 2014 If you want a circular type, then call it something to that
- ketmar via Digitalmars-d (29/61) Oct 17 2014 it would be nice. but i'm still against "integral wrapping is
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (24/40) Oct 17 2014 Actually it makes a lot of sense to be able to reuse 16-bit
- ketmar via Digitalmars-d (36/72) Oct 17 2014 and this is perfectly doable with fixed-size ints. just use 16-bit
- monarch_dodra (6/12) Oct 18 2014 Besides, the code uses x + 1, so the code is already in undefined
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (12/24) Oct 18 2014 It wasn't an overflow check as ketmar suggested… It was a check
- monarch_dodra (16/28) Oct 19 2014 Op usually suggested that all overflows should be undefined
- Iain Buclaw via Digitalmars-d (17/31) Oct 19 2014
- Walter Bright (4/6) Oct 19 2014 Yeah, but one has to be careful when using a backend designed for C that...
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (21/27) Oct 20 2014 8-I
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (25/38) Oct 19 2014 I don't agree with how C/C++ defines arithmetics. I think
- monarch_dodra (12/16) Oct 19 2014 Speed: How so?
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (31/41) Oct 19 2014 All kind of situations where you can prove that "expression1 >
- Walter Bright (4/5) Oct 18 2014 Oh come on!
- "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (3/7) Oct 18 2014 Hey, that was a historically motivated reflection on the smallest
- Iain Buclaw via Digitalmars-d (3/5) Oct 17 2014 *cough* GDC *cough* :o)
- Andrei Alexandrescu (2/9) Oct 17 2014 Do you mean ubsan will work with gdc? -- Andrei
- ketmar via Digitalmars-d (8/9) Oct 17 2014 On Fri, 17 Oct 2014 08:08:34 -0700
- Iain Buclaw via Digitalmars-d (6/20) Oct 17 2014 It doesn't out of the box, but adding in front-end support is a small
- eles (6/10) Oct 17 2014 "Not every software bug has as serious consequences as seen in
- Andrei Alexandrescu (2/15) Oct 17 2014 Still a step forward. -- Andrei
- eles (9/13) Oct 17 2014 While I agree, IIRC, Ariane was never tested in that particular
- Walter Bright (5/7) Oct 18 2014 On the other hand, D is one step ahead of C with many of those (they are...
Just found with Reddit. C seems one step ahead of D with this: http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/ Bye, bearophile
Oct 16 2014
On Thursday, 16 October 2014 at 21:00:18 UTC, bearophile wrote:
> Just found with Reddit. C seems one step ahead of D with this:
> http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
>
> Bye,
> bearophile

The sad thing about these tools is that they are all about fixing the holes introduced by C into the wild. So in the end, when using C and C++, we need to have compiler + static analyzer + sanitizers, in a real life example of "Worse is Better", instead of fixing the languages.

At least C++ is on the path of having fewer undefined behaviors, as the work group clearly saw the benefits don't outweigh the costs and is now in the process of cleaning the standard in that regard.

As an outsider, I think D would be better by having only defined behaviors.

-- Paulo
Oct 17 2014
On Fri, 17 Oct 2014 08:38:11 +0000, "Paulo Pinto" <pjmlp progtools.org> wrote:
> On Thursday, 16 October 2014 at 21:00:18 UTC, bearophile wrote:
>> Just found with Reddit. C seems one step ahead of D with this:
>> http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
>>
>> Bye,
>> bearophile
> The sad thing about these tools is that they are all about fixing the holes introduced by C into the wild. So in the end, when using C and C++, we need to have compiler + static analyzer + sanitizers, in a real life example of "Worse is Better", instead of fixing the languages. At least C++ is on the path of having fewer undefined behaviors, as the work group clearly saw the benefits don't outweigh the costs and is now in the process of cleaning the standard in that regard. As an outsider, I think D would be better by having only defined behaviors. -- Paulo

I have a feeling back then the C designers weren't quite sure how the language would work out on current and future architectures, so they gave implementations some freedom here and there. Now that C/C++ is the primary language for any architecture, the tables have turned and the hardware designers build chips that behave "as expected" in some cases that C/C++ left undefined. That in turn allows C/C++ to become more restrictive. Or maybe I don't know what I'm talking about.

What behavior is undefined in D? I'm not kidding, I don't really know of any list of undefined behaviors. The only thing I remember is that casting away immutable and modifying the content is undefined behavior. Similar to C/C++, I think this is to allow current and future compilers to perform as of yet unknown optimizations on immutable data structures. Once such optimizations become well known in 10 to 20 years or so, D will define that behavior, too. Just like C/C++.

-- Marco
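The one case Marco names, written out as a minimal sketch (what the optimizer does with the write is deliberately left open, which is exactly why the spec calls it undefined):

    void main()
    {
        immutable int x = 42;
        int* p = cast(int*) &x;   // the cast itself compiles...
        *p = 1;                   // ...but modifying immutable data through it
                                  // is undefined behaviour per the D spec
    }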
Oct 17 2014
On Friday, 17 October 2014 at 08:38:12 UTC, Paulo Pinto wrote:As an outsider, I think D would be better by having only defined behaviors.Actually, this is the first thing I would change about D and make it less dependent on x86. I think a system level language should enable max optimization on basic types and rather inject integrity tests for debugging/testing or support debug-exceptions where available. The second thing I would change is to make whole program analysis mandatory so that you can deduce and constrain value ranges. I don't believe the argument about separate compilation and commercial needs (and even then augmented object code is a distinct possibility). Even FFI is not a great argument, you should be able to specify what can happen in a foreign function. It is just plain wrong to let integers wrap by default in an accessible result. That is not integer behaviour. The correct thing to do is to inject overflow checks in debug mode and let overflow in results (that are accessed) be undefined. Otherwise you end up giving the compiler a difficult job: uint y=x+1; if (x < y){…} Should be optimized to: {…} In D (and C++) you would get: if (x < ((x+1)&0xffffffff)){…} As a result you are encouraged to use signed int everywhere in C++, since unsigned ints use modulo-arithmetic. Unsigned ints in C++ are only meant for bit-field stuff. And the C++ designers admit that the C++ library is ill-specified because it uses unsigned ints for integers that cannot be negative, while that is now considered a bad practice… In D it is even worse since you are forced to use a fixed size modulo even for int, so you cannot do 32 bit arithmetic in a 64 bit register without getting extra modulo operations. So, "undefined behaviour" is not so bad, as long as you qualify it. You could for instance say that overflow on ints leads to an unknown value, but no other side effects. That was probably the original intent for C, but compiler writers have taken it a step further… D has locked itself to Pentium-style x86 behaviour. Unfortunately it is very difficult to have everything be well-defined in a low level programming language. It isn't even obvious that a byte should be 8 bits, although the investments in creating UTF-8 resources on the Internet probably has locked us to it for the next 100 years… :)
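To make the contrast concrete, here is a small D sketch of the two policies being discussed: the first function shows what D's defined wrap-around forces the compiler to keep, the second emulates the "inject overflow checks in debug mode" idea by hand. checkedAdd is a name invented for this example, not an existing library function.

    // D today: unsigned arithmetic wraps, so this comparison is not
    // trivially true and cannot be folded away by the compiler.
    bool lessThanSuccessor(uint x)
    {
        uint y = x + 1;          // x == uint.max gives y == 0
        return x < y;            // false for uint.max, true otherwise
    }

    // Hand-rolled "check overflow in debug mode" policy (invented helper).
    uint checkedAdd(uint a, uint b)
    {
        uint r = a + b;
        assert(r >= a, "unsigned overflow");   // removed by -release
        return r;
    }

    unittest
    {
        assert(!lessThanSuccessor(uint.max));
        assert(checkedAdd(1, 2) == 3);
    }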
Oct 17 2014
On Friday, 17 October 2014 at 09:46:49 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 17 October 2014 at 08:38:12 UTC, Paulo Pinto wrote:
> The second thing I would change is to make whole program analysis mandatory so that you can deduce and constrain value ranges.

Nice idea, but how to persuade libraries to play that game?
Oct 17 2014
On Friday, 17 October 2014 at 10:30:14 UTC, eles wrote:
> On Friday, 17 October 2014 at 09:46:49 UTC, Ola Fosheim Grøstad wrote:
>> The second thing I would change is to make whole program analysis mandatory so that you can deduce and constrain value ranges.
> Nice idea, but how to persuade libraries to play that game?

1. Provide a meta-language for writing propositions that describe what libraries do if they are foreign (pre/post conditions). Could be used for "asserts" too.

2. Provide a C compiler that compiles to the same internal representation as the new language, so you can run the same analysis on C code.

3. Remove int so that you have to specify the range, and make typedefs local to the library.

4. Provide the ability to specify additional constraints on library functions you use in your project, or even probabilistic information.

Essentially it is a cultural thing, so the standard library has to be very well written. Point 4 above could let you specify properties of the input to a sort function at the call site and let the compiler use that information for optimization. E.g. if one million values are evenly distributed over a range of 0..100000 then a quick sort could break it down without using pivots. If the range is 0..1000 then it could switch to an array of counters. If the input is 99% sorted then it could switch to some insertion-sort based scheme.

If you allow both absolute and probabilistic meta-information then the probabilistic information can be captured on a corpus of representative test data. You could run the algorithm within the "measured probable range" and switch to a slower algorithm when you detect values outside it. Lots of opportunities for improving "state-of-the-art".
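As a rough illustration of point 1, that kind of value-range proposition can already be spelled with D's in/out contracts, which a whole-program analyser could in principle read and propagate. The function and its ranges below are invented for the example.

    // Hypothetical: stating a callee's value ranges as ordinary contracts.
    int scalePercent(int percent)
    in  { assert(percent >= 0 && percent <= 100); }
    out (result) { assert(result >= 0 && result <= 1000); }
    body
    {
        return percent * 10;
    }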
Oct 17 2014
On Friday, 17 October 2014 at 10:50:54 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 17 October 2014 at 10:30:14 UTC, eles wrote:
>> On Friday, 17 October 2014 at 09:46:49 UTC, Ola Fosheim Grøstad wrote:
>> Nice idea, but how to persuade libraries to play that game?
> 1. Provide a meta-language for writing propositions that describe what libraries do if they are foreign (pre/post conditions). Could be used for "asserts" too.

That's complicated, to provide another language for describing the behavior. And how? Embedded in the binary library?

Maybe a set of annotations that are exposed through the .di files. But, then, we are back to headers...

Another idea would be to simply make the in and out contracts of a function exposed in the corresponding .di file, or at least a part of them (we could use "public" for those).

Anyway, as far as I can imagine it, it would be like embedding Polyspace inside the compiler and stub functions inside libraries.

> 2. Provide a C compiler that compiles to the same internal representation as the new language, so you can run the same analysis on C code.

For source code. But for closed-source libraries?

> 3. Remove int so that you have to specify the range, and make typedefs local to the library.

Pascal arrays?

> Lots of opportunities for improving "state-of-the-art".

True. But a lot of problems too. And there is not much agreement on what is the state of the art...
Oct 19 2014
On Sunday, 19 October 2014 at 09:04:59 UTC, eles wrote:That's complicated, to provide another langage for describing the behavior.I think D need to unify UDA, type-traits, template constraints and other deductive facts and rules into a deductive database in order to make it more pleasant and powerful. And also provide the means to query that database from CTFE code. A commercial compiler could also speed up compilation of large programs with complex compile time logic. (storing facts in a persistent high performance database) There are several languages to learn from in terms of specifying "reach" in a graph/tree structure. E.g. XQuery You can view " nogc func(){} " as a fact: nogc('func). or perhaps: nogc( ('modulename,'func) ). then you could list the functions that are nogc in a module using: nogc( ('modulename,X) ) Same for type traits. If you build it into the typesystem then you can easily define new type constraints in complex ways. (You could start with something simple, like specifying if values reachable through a parameter escape the lifetime of the function call.)And how? Embedded in the binary library?The same way you would do it with C/C++ today. Some binary format allow extra meta-info, so it is possible… in the long term.Another idea would be to simply make the in and out contracts of a function exposed in the corresponding .di file, or at least a part of them (we could use "public" for those).That's an option. Always good to start with something simple, but with an eye for a more generic/powerful/unified solution in the future.Anyway, as far as I ca imagine it, it would be like embedding Polyspace inside the compiler and stub functions inside libraries.Yes, or have a semantic analyser check, provide and collect facts for a deductive database, i.e.: 1. collect properties that are cheap to derive from source, build database 2. CTFE: query property X 3. if database query for X succeeds return result 4. collect properties that are more expensive guided by (2), inject into database 5. return resultFor source code. But for cloused-source libraries?You need annotations. Or now that you are getting stuff like PNACL, maybe you can have closed source libraries in a IR format that can be analysed.subrange variables: var age : 0 ... 150; year: 1970 ... 9999;3. Remove int so that you have to specify the range and make typedefs local to the libraryPascal arrays?Right, and it get's worse the less specific the use scenario is. What should be created is a modular generic specification for a system programming language, based on what is, what should be, hardware trends and theory. Then you can see the dependencies among the various concepts. C, C++, D, Rust can form a starting point. I think D2 is to far on it's own trajectory to be modified, so I view D1 and D2 primarily as experiments, which is important too. But D3 (or some other language) should build on what the existing languages enable and unify concepts so you have something more coherent than C++ and D. Evolution can only take you so far, then you hit the walls set up by existing features/implementation.Lots of opportunities for improving "state-of-the-art".True. But a lot of problems too. And there is not much agreement on what is the state of the art...
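The Pascal-style subranges mentioned above can be approximated today with a checked wrapper; a sketch only, with names invented for this example:

    // Emulating "var age : 0 ... 150" with a range-checked struct.
    struct Ranged(int min, int max)
    {
        private int value;

        this(int v) { opAssign(v); }

        void opAssign(int v)
        {
            assert(v >= min && v <= max, "value out of range");
            value = v;
        }

        alias value this;   // reads convert back to int implicitly
    }

    alias Age  = Ranged!(0, 150);
    alias Year = Ranged!(1970, 9999);

    unittest
    {
        auto age = Age(42);
        int n = age;
        assert(n == 42);
        // Age(200) would trip the assert in a non-release build
    }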
Oct 19 2014
On Sunday, 19 October 2014 at 10:45:31 UTC, Ola Fosheim Grøstad wrote:On Sunday, 19 October 2014 at 09:04:59 UTC, eles wrote:I mostly agree with all that you are saying, still I am aware that much effort and coordination will be needed. OTOH, this would give D (and/aka the future of computing) a non-negligeable edge (being able to optimize across libraries).Some binary format allow extra meta-info, so it is possible… in the long term.Debug builds could be re-used for that, with some minor modifications, I think.I think it would not turn that bad. For the time being, putting the contracts in the .di files would cost barely nothing (but disk space). And, progressively, the compiler could be made to integrate those, when the .di files with contracts are available, in order to optimize the builds. It would be directly D code, so very easily to interpret. Basically, the optimizer would have the set of the asserts that limit the behaviour of that function at his hand. Anybody else who would like to comment on this?Another idea would be to simply make the in and out contracts of a function exposed in the corresponding .di file, or at least a part of them (we could use "public" for those).That's an option. Always good to start with something simple, but with an eye for a more generic/powerful/unified solution in the future.But D3People here traditionally don't like that word, but it has been unleased several times on the forum. Maybe not that stringent need, but I think that a somewhat disruptive "clean, clarify and fix glitches and bad legacy" release of D(2) is more and more needed and quite accepted as a good thing by the community (which is ready to take the effort to bring code up to date).
Oct 19 2014
On Fri, 17 Oct 2014 09:46:48 +0000, via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> It is just plain wrong to let integers wrap by default in an accessible result. That is not integer behaviour.

do you know any widespread hardware which doesn't work this way? yet i know a very widespread language which doesn't care. by a strange coincidence, programs in this language tend to have endless problems with overflows.

> The correct thing to do is to inject overflow checks in debug mode and let overflow in results (that are accessed) be undefined.

the correct thing is to not turn perfectly defined operations into undefined ones.

> Otherwise you end up giving the compiler a difficult job:
>
>   uint y = x + 1;
>   if (x < y) {…}
>
> Should be optimized to:
>
>   {…}

no, it shouldn't. at least not until there is something like 'if_carry_set'.

> In D (and C++) you would get:
>
>   if (x < ((x+1)&0xffffffff)) {…}

perfect. nice and straightforward way to do overflow checks.

> In D it is even worse since you are forced to use a fixed size modulo even for int, so you cannot do 32 bit arithmetic in a 64 bit register without getting extra modulo operations.

why should i, as programmer, care? what i *really* care about is portable code. having the size of base types not strictly defined is not helping at all.

> So, "undefined behaviour" is not so bad

yes, it's not bad, it's terrible. having "undefined behavior" in a language is like saying "hey, we don't know what to do with this, and we don't want to think about it. so we'll turn our problem into your problem. have a nice day, sucker!"

> You could for instance say that overflow on ints leads to an unknown value, but no other side effects. That was probably the original intent for C, but compiler writers have taken it a step further…

how does this differ from the current interpretation?

> D has locked itself to Pentium-style x86 behaviour.

oops. 2's complement integer arithmetic is "pentium-style x86" now... i bet x86_64 does everything in ternary, right? oh, and how about the pre-pentium era?

> Unfortunately it is very difficult to have everything be well-defined in a low level programming language. It isn't even obvious that a byte should be 8 bits

it is very easy. take current hardware, evaluate its popularity, do what the most popular hardware does. that's it. i, for myself, don't need a language for "future hardware", i need to work with what i have now. if we'll have some drastic changes in the future... well, we can always emulate old HW to work with old code, and rewrite that old code for new HW.
Oct 17 2014
On Friday, 17 October 2014 at 13:44:24 UTC, ketmar via Digitalmars-d wrote:On Fri, 17 Oct 2014 09:46:48 +0000 via Digitalmars-d <digitalmars-d puremagic.com> wrote:Yes, the carry flag is set if you add with carry. It means you SHOULD add to another hi-word with carry. :P You can also add with clamp with SSE, so you clamp to max/min. Too bad languages don't support it. I've always thought it be nice to have clamp operators, so you can say x(+)y and have the result clamped to the max/min values. Useful for stuff like DSP on integers.It is just plain wrong to let integers wrap by default in an accessible result. That is not integer behaviour.do you know any widespread hardware with doesn't work this way?Uh, so you want slow? If you want this you should also check the overflow flag so that you can catch overflows and throw an exception. But then you have a high level language. All high level languages should do this.if (x < ((x+1)&0xffffffff)){…}perfect. nice and straightforward way to do overflow checks.So you want to have lots of masking on your shiny new 64-bit register only CPU, because D is stuck on promoting to 32-bits by spec? That's not portable, that is "portable".In D it is even worse since you are forced to use a fixed size modulo even for int, so you cannot do 32 bit arithmetic in a 64 bit register without getting extra modulo operations.why should i, as programmer, care? what i *really* care about is portable code. having size of base types not strictly defined is not helping at all.Nah, it is saying: if your code is wrong then you will get wrong results unless you turn on runtime checks. What D is saying is: nothing is wrong even if you get something you never wanted to express, because we specify all operations to be boundless (circular) so that nothing can be wrong by definition (but your code will still crash and burn). That also means that you cannot turn on runtime checks, since it is by definition valid. No way for the compiler to figure out if it is intentional or not.So, "undefined behaviour" is not so badyes, it's not bad, it's terrible. having "undefined behavior" in language is like saying "hey, we don't know what to do with this, andThe overhead for doing 64bit calculations is marginal. Locking yourself to 32bit is a bad idea.D has locked itself to Pentium-style x86 behaviour.oops. 2's complement integer arithmetic is "pentium-style x86" now... i bet x86_64 does everything in ternary, right? oh, and how about pre-pentium era?it is very easy. take current hardware, evaluate it's popularity, do what most popular hardware does. that's it. i, for myself, don't need a language for "future hardware", i need to work with what i have now.My first computer had no division or multiply and 8 bit registers and was insanely popular. It was inconceivable that I would afford anything more advanced in the next decade. In the next 5 years I had two 16 bit computers, one with 16x RAM and GPU… and at a much lower price…if we'll have some drastic changes in the future... well, we always can emulate old HW to work with old code, and rewrite that old code for new HW.The most work on a codebase is done after it ships. Interesting things may happen on the hardware side in the next few years: - You'll find info on the net where Intel has planned buffered transactional memory for around 2017. - AMD is interested in CPU/GPU intergration/convergence - Intel has a many core "co-processor" - SIMD registers are getting wider and wider… 512 bits is a lot! etc...
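Written as a plain helper, the clamping operator x (+) y being asked for might look like the sketch below; addClamped is an invented name, not an existing API.

    // Saturating unsigned addition: clamp to uint.max instead of wrapping.
    uint addClamped(uint a, uint b)
    {
        uint r = a + b;
        return r < a ? uint.max : r;   // wrapped around? clamp to the maximum
    }

    unittest
    {
        assert(addClamped(uint.max, 10) == uint.max);
        assert(addClamped(1, 2) == 3);
    }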
Oct 17 2014
On Fri, 17 Oct 2014 14:38:29 +0000, via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>>> It is just plain wrong to let integers wrap by default in an accessible result. That is not integer behaviour.
>> do you know any widespread hardware which doesn't work this way?
> Yes, the carry flag is set if you add with carry. It means you SHOULD add to another hi-word with carry. :P

i was writing about 'if_carry_set'. yes, i really-really-really want either "propagating carry" via 'if_carry_flag_set', or a way to tell the compiler "do an overflow check on this expression and throw an exception on overflow".

> You can also add with clamp with SSE, so you clamp to max/min. Too bad languages don't support it. I've always thought it be nice to have clamp operators, so you can say x(+)y and have the result clamped to the max/min values. Useful for stuff like DSP on integers.

it's good. but this doesn't justify the decision to make 2's complement overflow undefined.

>>> if (x < ((x+1)&0xffffffff)) {…}
>> perfect. nice and straightforward way to do overflow checks.
> Uh, so you want slow? If you want this you should also check the overflow flag so that you can catch overflows and throw an exception.

i want a way to check integer overflows. i don't even want to think about dirty C code to do that ('cause, eh, our compilers are very smart and they, eh, know that there must be no overflows on ints, and they, eh, just remove some checks 'cause those checks are no-ops when there are no overflows, and now we, eh, have to cheat the compiler to... screw it, i'm going home!)

> So you want to have lots of masking on your shiny new 64-bit register only CPU, because D is stuck on promoting to 32-bits by spec?

yes. what's wrong with using long/ulong when you need 64 bits? i don't care about the work the CPU must perform to execute my code, the CPU was created to help me, not vice versa. yet i really care about 'int' being the same size on different architectures (size_t, you sux! i want you to go away!).

> That's not portable, that is "portable".

it's portable. and "portable" is when i have to make life easier for some silicon crap instead of the silicon crap making my life easier.

> Nah, it is saying: if your code is wrong then you will get wrong results unless you turn on runtime checks.

...and have a nice day, sucker!

> What D is saying is: nothing is wrong even if you get something you never wanted to express, because we specify all operations to be boundless (circular) so that nothing can be wrong by definition (but your code will still crash and burn).

perfect!

> That also means that you cannot turn on runtime checks, since it is by definition valid. No way for the compiler to figure out if it is intentional or not.

if you want such checks, you have a choice. you can either do such checks manually or use something like CheckedInt. this way when i see a CheckedInt variable i know the programmer's intentions from the start. and if the programmer is using a simple 'int' i know that the compiler will not "optimize away" some checking code.

> The overhead for doing 64bit calculations is marginal. Locking yourself to 32bit is a bad idea.

did you notice the long/ulong types in the D spec? and the reserved 'cent' type, for that matter?

> My first computer had no division or multiply and 8 bit registers and was insanely popular. It was inconceivable that I would afford anything more advanced in the next decade. In the next 5 years I had two 16 bit computers, one with 16x RAM and GPU… and at a much lower price…

that's why you don't use assembler to write your code now, right? i was trying to use C for Z80, and that wasn't a huge success in those days. why do you want to make my life harder by targeting D2 at some imaginary "future hardware" instead of targeting the current one? by the time "future hardware" becomes "current hardware" we will have D5 or so. nobody is using K&R C now, right?

> The most work on a codebase is done after it ships.

porting to another arch, for example. where... ah, FSCK, int is 29 bits there! shit! or 16 bits... "portable by rewriting", yeah.

> Interesting things may happen on the hardware side in the next few years:
>
> - You'll find info on the net where Intel has planned buffered transactional memory for around 2017.

(looking at 'date' output) ok, it's 2014 now. and i see no such HW around. let's talk about this when we have widespread HW with this feature.

> - AMD is interested in CPU/GPU integration/convergence

and me not. but i'm glad for AMD.

> - Intel has a many core "co-processor"

and?..

> - SIMD registers are getting wider and wider… 512 bits is a lot!

and i must spend time to make some silicon crap happy, again? teach compilers to transparently rewrite my code, i don't want to be a slave to CPUs.
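A stripped-down sketch of the kind of CheckedInt wrapper mentioned above, for illustration only (it is not a reference to any actual library type):

    // Addition is widened, range-checked, and throws instead of wrapping.
    struct CheckedInt
    {
        int value;

        CheckedInt opBinary(string op : "+")(CheckedInt rhs) const
        {
            long wide = cast(long) value + rhs.value;
            if (wide > int.max || wide < int.min)
                throw new Exception("integer overflow");
            return CheckedInt(cast(int) wide);
        }
    }

    unittest
    {
        assert((CheckedInt(2) + CheckedInt(3)).value == 5);
        bool threw = false;
        try { auto b = CheckedInt(int.max) + CheckedInt(1); }
        catch (Exception) { threw = true; }
        assert(threw);
    }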
Oct 17 2014
On Friday, 17 October 2014 at 15:17:12 UTC, ketmar via Digitalmars-d wrote:it's good. but this not justifies the decision to make 2's complement overflow undefined.If you want a circular type, then call it something to that effect. Not uint or int. Call it bits or wrapint.yes. what's wrong with using long/ulong when you need 64 bits?What is wrong and arbitrary is promoting to 32-bits by default.did you noticed long/ulong types in D specs? and reserved 'cent' type for that matter?If you want fixed width, make it part of the name: i8, i16, i24, i32, i64… Seriously, if you are going to stick to fixed register sizes you have to support 24 bit and other common register sizes too. Otherwise you'll get 24bit wrapping 32bit ints.i was trying to use C for Z80, and that wasn't a huge success that days.How did you manage to compile with it? ;-) The first good programming tool I had was an assembler written in Basic… I had to load the assembler from tape… slooow. And if my program hung I had to reset and reload it. Patience… Then again, that makes you a very careful programmer ;)have D5 or so. nobody using K&R C now, right?ANSI-C is pretty much the same, plenty of codebases are converted over from K&R. With roots in the 70s… :-Paround. let's talk about this when we'll have widespreaded HW with this feature.That goes real fast, because is probably cheaper to have it built into all CPUs of the same generation and just disable it on the ones that have to be sold cheap because they are slow/market demand.and i must spend time to make some silicon crap happy, again?If you want a high level language, no. If you want a system level language, yes!!!!!!!!!!
Oct 17 2014
On Fri, 17 Oct 2014 15:58:02 +0000, via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On Friday, 17 October 2014 at 15:17:12 UTC, ketmar via Digitalmars-d wrote:
>> it's good. but this doesn't justify the decision to make 2's complement overflow undefined.
> If you want a circular type, then call it something to that effect. Not uint or int. Call it bits or wrapint.

it would be nice. but i'm still against "integral wrapping is undefined". define it! either specify the result, or force the program to crash.

>> yes. what's wrong with using long/ulong when you need 64 bits?
> What is wrong and arbitrary is promoting to 32-bits by default.

64-bit ARMs aren't so widespread yet. oh, wait, are we talking about x86-compatible CPUs only? but why?

>> did you notice the long/ulong types in the D spec? and the reserved 'cent' type, for that matter?
> If you want fixed width, make it part of the name: i8, i16, i24, i32, i64…

i have nothing against this either. but i have a lot against an "integral with arbitrary size" type.

> Seriously, if you are going to stick to fixed register sizes you have to support 24 bit and other common register sizes too. Otherwise you'll get 24bit wrapping 32bit ints.

nope. if int is 32 bit, and its behavior is defined as a 2's complement 32-bit value, it doesn't matter what register size the HW has. it's the compiler's task to make this int behave right.

>> i was trying to use C for Z80, and that wasn't a huge success in those days.
> How did you manage to compile with it? ;-)

it was... painful. even with a disk drive. having only 64KB of memory (actually, only 48K free for use) doesn't help much either.

> Then again, that makes you a very careful programmer ;)

that almost turned me into a serial killer.

>> around. let's talk about this when we have widespread HW with this feature.
> That goes real fast, because it is probably cheaper to have it built into all CPUs of the same generation and just disable it on the ones that have to be sold cheap because they are slow/market demand.

i don't buy that "we'll make that pretty soon" PR. first they make it widespread, then i'll start caring, not vice versa.

>> and i must spend time to make some silicon crap happy, again?
> If you want a high level language, no.
> If you want a system level language, yes!!!!!!!!!!

this is a misconception. a "low level language" is not one that pleases the CPU down to bits and registers, it's about *conceptions*. for example, a good high-level language doesn't need pointers, yet a low-level one needs 'em. a good high-level language makes a lot of checks automatically (range checking, overflow checking and so on), a good low-level language allows the programmer to control what will be checked and how. a good high-level language can transparently use bigints on overflow, a good low-level language has clearly defined semantics for integer overflow and defined sizes for integral types. and so on. "going low-level" is not about pleasing the CPU (it's not assembler), it's about writing "low-level code" -- one with pointers, manual checks and such.
Oct 17 2014
On Friday, 17 October 2014 at 16:26:08 UTC, ketmar via Digitalmars-d wrote:i have nothing against this either. but i have alot against "integral with arbitrary size" type.Actually it makes a lot of sense to be able to reuse 16-bit library code on a 24-bit ALU. Like for loading a sound at 16-bit then process it at 24-bit.nope. if int is 32 but, and it's behavior is defined as 2's complement 32-bit value, it doesn't matter what register size HW has. it's compiler task to make this int behave right.And that will result in slow code.Yeah, IIRC I "cracked" it and put it on a diskette after a while…Then again, that makes you a very careful programmer ;)that almost turned me to serial killer.i don't buying that "we'll made that pretty soon" PR. first they making it widespreaded, then i'll start caring, not vice versa.C++ has a workgroup on transactional memory with expertise… So, how long can you wait with planning for the future before being hit by the train? You need to be ahead of the big mover if you want to gain positions in multi-threading (which is the most important area that is up for grabs in system level programming these days).this is a misconception. "low level language" is not one that pleases CPU down to bits and registers, it's about *conceptions*.for example, good high-level language doesn't need pointers, yet low-level one needs 'em.Bad example. Low level languages need pointers because the hardware use 'em. If you have a non-standard memory model you need deal with different aspects of pointers too (like segments or bank switching). If you cannot efficiently compute existing libraries on 24-bit, 48-bit or 64-bit ALUs then the programming language is tied to a specific CPU. That is not good and it will have problems being viewed as a general system level programming language. A system level language should not force you to be overly abstract in a manner that affects performance or restricts flexibility.
Oct 17 2014
On Fri, 17 Oct 2014 16:49:10 +0000, via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> i have nothing against this either. but i have a lot against an "integral with arbitrary size" type.
> Actually it makes a lot of sense to be able to reuse 16-bit library code on a 24-bit ALU. Like for loading a sound at 16-bit then processing it at 24-bit.

and this is perfectly doable with fixed-size ints. just use the 16-bit library when you can and write new code when you can't.

>> nope. if int is 32 bit, and its behavior is defined as a 2's complement 32-bit value, it doesn't matter what register size the HW has. it's the compiler's task to make this int behave right.
> And that will result in slow code.

i prefer slow code over incorrect code. if i find that some code is a real bottleneck (using a profiler, of course, almost noone can make a right guess here ;-), i'll write an arch-dependent assembler part to replace the slow code. but i prefer to first get it working after recompiling, and then start optimizing. what i don't want is to think each time "what if int has a different size here?" that's why i'm using types from stdint.h in my C code instead of just "int", "long" and so on.

>> i don't buy that "we'll make that pretty soon" PR. first they make it widespread, then i'll start caring, not vice versa.
> C++ has a workgroup on transactional memory with expertise… So, how long can you wait with planning for the future before being hit by the train?

indefinitely long, if the language guarantees that my existing code will continue to work as expected.

> You need to be ahead of the big mover if you want to gain positions in multi-threading (which is the most important area that is up for grabs in system level programming these days).

i don't care about positions. what i care about is a language with defined behavior. besides, threads sux. ;-) anyway, it's not hard to add a "transaction {}" block (from the language design POV, not from the implementor's POV). ;-)

>> for example, a good high-level language doesn't need pointers, yet a low-level one needs 'em.
> Bad example. Low level languages need pointers because the hardware uses 'em.

i really can't imagine hardware without pointers.

> If you have a non-standard memory model you need to deal with different aspects of pointers too (like segments or bank switching).

this must be accessible, but hidden from me unless i explicitly ask about the gory details.

> If you cannot efficiently compute existing libraries on 24-bit, 48-bit or 64-bit ALUs then the programming language is tied to a specific CPU.

are you saying that 32-bit operations on 64-bit CPUs sux? then those CPUs sux. throw 'em away. besides, having guaranteed and well-defined integer sizes and overflow values is what makes using such libs on different architectures possible. what *really* ties code to a CPU is "int size depends on host CPU", "overflow result depends on host CPU" and other such things.

> That is not good and it will have problems being viewed as a general system level programming language.

nope. this is a problem for HW designers and compiler writers, not a language problem. i still can't understand why i must write my code to please a C compiler. weren't compilers invented to please *me*? i'm not going to serve the servants.

> A system level language should not force you to be overly abstract in a manner that affects performance or restricts flexibility.

a system level language must provide the ability to go to "CPU level" if the programmer wants that, but it *must* abstract away unnecessary details by default. it's way easier to have not-superefficient but working code first and continually refine it than to try to write it the hard way from the start.
Oct 17 2014
On Friday, 17 October 2014 at 13:44:24 UTC, ketmar via Digitalmars-d wrote:
> On Fri, 17 Oct 2014 09:46:48 +0000, via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> In D (and C++) you would get: if (x < ((x+1)&0xffffffff)) {…}
> perfect. nice and straightforward way to do overflow checks.

Besides, the code uses x + 1, so the code is already in an undefined state. It's just as wrong as the "horrible code with UB" we were trying to avoid in the first place. So much for convincing me that it's a good idea...
Oct 18 2014
On Saturday, 18 October 2014 at 08:22:25 UTC, monarch_dodra wrote:
> On Friday, 17 October 2014 at 13:44:24 UTC, ketmar via Digitalmars-d wrote:
>>> In D (and C++) you would get: if (x < ((x+1)&0xffffffff)) {…}
>> perfect. nice and straightforward way to do overflow checks.
> Besides, the code uses x + 1, so the code is already in undefined state. It's just as wrong as the "horrible code with UB" we were trying to avoid in the first place.

It wasn't an overflow check as ketmar suggested… It was a check that should stay true, always, for this instantiation. So the wrong code is bypassed on overflow, possibly missing a termination. The code would have been correct with an optimization that set it to true or with a higher resolution register.

> So much for convincing me that it's a good idea...

Not sure if you are saying that modulo-arithmetic as a default is a bad or good idea? In D (and C++ for uint) it is modulo-arithmetic, so it is defined as a circular type with a discontinuity, which makes reasoning about integers harder.
Oct 18 2014
On Saturday, 18 October 2014 at 23:10:15 UTC, Ola Fosheim Grøstad wrote:
> On Saturday, 18 October 2014 at 08:22:25 UTC, monarch_dodra wrote:
>> Besides, the code uses x + 1, so the code is already in undefined state. It's just as wrong as the "horrible code with UB" we were trying to avoid in the first place. So much for convincing me that it's a good idea...
> Not sure if you are saying that modulo-arithmetic as a default is a bad or good idea?

OP usually suggested that all overflows should be undefined behavior, and that you could "pre-emptively" check for overflow with the above code. The code provided itself overflowed, so was also undefined. What I'm pointing out is that working with undefined-behavior overflow is exceptionally difficult, see later.

> In D (and C++ for uint) it is modulo-arithmetic so it is defined as a circular type with a discontinuity which makes reasoning about integers harder.

What is interesting is that overflow is only defined for unsigned integers. Signed integer overflow is *undefined*, and GCC *will* optimize away any conditions that rely on it.

One thing I am certain of is that making overflow *undefined* is *much* worse than simply having modulo arithmetic. In particular, implementing trivial overflow checks is much easier for the average developer. And worst case scenario, you can still have library-defined checked integers.
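For reference, the library route can be built on druntime's core.checkedint, assuming a compiler recent enough to ship it; the wrapper function below is invented for this example.

    import core.checkedint : adds;

    // Signed addition that reports overflow via a flag and throws,
    // instead of relying on wrapping or on undefined behaviour.
    int addOrThrow(int a, int b)
    {
        bool overflow = false;
        int r = adds(a, b, overflow);
        if (overflow)
            throw new Exception("signed integer overflow");
        return r;
    }

    unittest
    {
        assert(addOrThrow(2, 3) == 5);
        bool threw = false;
        try { addOrThrow(int.max, 1); } catch (Exception) { threw = true; }
        assert(threw);
    }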
Oct 19 2014
On 19 Oct 2014 09:40, "monarch_dodra via Digitalmars-d" <digitalmars-d puremagic.com> wrote:
> On Saturday, 18 October 2014 at 23:10:15 UTC, Ola Fosheim Grøstad wrote:
>>> Besides, the code uses x + 1, so the code is already in undefined state. It's just as wrong as the "horrible code with UB" we were trying to avoid in the first place. So much for convincing me that it's a good idea...
>> Not sure if you are saying that modulo-arithmetic as a default is a bad or good idea?
> OP usually suggested that all overflows should be undefined behavior, and that you could "pre-emptively" check for overflow with the above code. The code provided itself overflowed, so was also undefined. What I'm pointing out is that working with undefined-behavior overflow is exceptionally difficult, see later.
>
>> In D (and C++ for uint) it is modulo-arithmetic so it is defined as a circular type with a discontinuity which makes reasoning about integers harder.
> What is interesting is that overflow is only defined for unsigned integers. Signed integer overflow is *undefined*, and GCC *will* optimize away any conditions that rely on it.

Good thing that overflow is strictly defined in D then. You can rely on overflowing to occur rather than have it optimised away.

Iain.
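A tiny demonstration of that guarantee; both assertions hold by the language definition, not by accident:

    unittest
    {
        int  i = int.max;
        uint u = uint.max;
        assert(i + 1 == int.min);   // two's-complement wrap, defined in D
        assert(u + 1 == 0);         // modulo 2^32, defined in D
    }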
Oct 19 2014
On 10/19/2014 1:56 AM, Iain Buclaw via Digitalmars-d wrote:Good thing that overflow is strictly defined in D then. You can rely on overflowing to occur rather than be optimised away.Yeah, but one has to be careful when using a backend designed for C that it doesn't use the C semantics on that anyway. (I know the dmd backend does do the D semantics.)
Oct 19 2014
On Monday, 20 October 2014 at 06:17:40 UTC, Walter Bright wrote:
> On 10/19/2014 1:56 AM, Iain Buclaw via Digitalmars-d wrote:
>> Good thing that overflow is strictly defined in D then. You can rely on overflowing to occur rather than be optimised away.
> Yeah, but one has to be careful when using a backend designed for C that it doesn't use the C semantics on that anyway.

8-I

And here I was hoping that Iain was being ironic!

If you want to support wrapping you could do it like this:

  int x = wrapcalc( y + DELTA );

And clamping:

  int x = clampcalc( y + DELTA );

And overflow:

  int x = y + DELTA;
  if (x.status != 0) {
      x.status.carry…
      x.status.overflow…
  }

or

  if ( overflowed( x = a+b+c+d ) ) {
      if ( overflowed( x = cast(somebigint)a+b+c+d ) ) {
          throw …
      }
  }

or

  int x = throw_on_overflow(a+b+c+d)
Oct 20 2014
On Sunday, 19 October 2014 at 08:37:54 UTC, monarch_dodra wrote:On Saturday, 18 October 2014 at 23:10:15 UTC, Ola Fosheim Grøstad wrote:I don't agree with how C/C++ defines arithmetics. I think integers should exhibit monotonic behaviour over addition and multiplication and that the compiler should assume, prove or assert that assigned values are within bounds according to the programmer's confidence and specification. It is the programmers' responsibility to make sure that results stays within the type boundaries or to configure the compiler so that they will be detected. If you provide value-ranges for integers then the compiler could flag all values that are out of bounds and force the programmer to explicitly cast them back to the restricted type. If you default to 64 bit addition then getting overflows within simple expressions is not the most common problem.In D and (C++ for uint) it is modulo-arithmetic so it is defined as a circular type with at discontinuity which makes reasoning about integers harder.What interesting is that overflow is only defined for unsigned integers. signed integer overflow is *undefined*, and GCC *will* optimize away any conditions that rely on it.One thing I am certain of, is that making overflow *undefined* is *much* worst than simple having modulo arithmetic. In particular, implementing trivial overflow checks is much easier for the average developper. And worst case scenario, you can still have library defined checked integers.One big problem with that view is that "a < a+1" is not an overflow check, it is the result of aliasing. It should be optimized to true. That is the only sustainable interpretation from a correctness point of view. Even if "a < a+1" is meant to be an overflow check it completely fails if a is a short since it is promoted to int. So this is completely stuck in 32-bit land. In C++ you should default to int and avoid uint unless you do bit manipulation according to the C++ designers. There are three reasons: speed, portability to new hardware and correctness.
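The promotion problem mentioned above is easy to demonstrate; in D, as in C, the operands are promoted to int before the addition, so the "check" never sees a 16-bit wrap:

    unittest
    {
        short s = short.max;
        auto r = s + 1;                        // operands promoted to int
        static assert(is(typeof(r) == int));
        assert(r == 32768);                    // no wrap at 16 bits
        assert(s < s + 1);                     // trivially true after promotion
    }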
Oct 19 2014
On Sunday, 19 October 2014 at 09:56:44 UTC, Ola Fosheim Grøstad wrote:In C++ you should default to int and avoid uint unless you do bit manipulation according to the C++ designers. There are three reasons: speed, portability to new hardware and correctness.Speed: How so? Portability: One issue to keep in mind is that C works on *tons* of hardware. C allows hardware to follow either two's complement, or one's complement. This means that, at best, signed overflow can be implementation defined, but not defined by spec. Unfortunately, it appears C decided to outright go the undefined way. Correctness: IMO, I'm not even sure. Yeah, use int for numbers, but stick to size_t for indexing. I've seen too many bugs on x64 software when data becomes larger than 4G...
Oct 19 2014
On Sunday, 19 October 2014 at 10:22:37 UTC, monarch_dodra wrote:
> Speed: How so?

All kinds of situations where you can prove that "expression1 > expression2" holds, but have no bounds on the variables.

> Portability: One issue to keep in mind is that C works on *tons* of hardware. C allows hardware to follow either two's complement, or one's complement. This means that, at best, signed overflow can be implementation defined, but not defined by spec. Unfortunately, it appears C decided to outright go the undefined way.

I think you might be able to make it defined like this:

1. overflow is illegal and should not limit reasoning about monotonicity

2. after overflow, accessing a derived result can lead to a value where the overflow either led to a higher bit representation which was propagated, or led to a value which was truncated.

This is slightly different from "undefined". :-)

> Correctness: IMO, I'm not even sure. Yeah, use int for numbers, but stick to size_t for indexing. I've seen too many bugs on x64 software when data becomes larger than 4G...

Sure, getting C types right and correct is tedious. The type system does not help you a whole lot. And D and C++ do not make it a lot better. Maybe the implicit conversions are a bad thing.

In machine language there is often no difference between signed and unsigned instructions, which can be handy, but the typedness of multiplication is actually better than in C languages: "u64 mul(u32 a, u32 b)". Multiplication over int is dangerous!

Before compilers got good at optimization I viewed C as an annoying assembler. I assumed wrapping behaviour and wanted an easy way to reinterpret_cast between ints and uints (in C it gets rather ugly). These days I take the view that programmers should be explicit about "bit-crushing" operations. Maybe even for multiplication. If you are forced to explicitly truncate() when the compiler fails to rule out overflow then the problem areas also become more visible in the source code: "uint r = a*b/N" might overflow badly even if r is large enough to hold the result. "uint r = truncate(a*b/N)" makes you aware that you are on thin ice.
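A possible shape for the explicit truncate() step described above; the helper name and behaviour are invented for this example:

    // Narrowing goes through a helper that makes possible data loss
    // visible at the call site and checks it in debug builds.
    uint truncate(ulong wide)
    {
        assert(wide <= uint.max, "truncate: value does not fit in 32 bits");
        return cast(uint) wide;
    }

    unittest
    {
        ulong a = 3_000_000_000;
        assert(truncate(a) == 3_000_000_000u);
        // truncate(10_000_000_000) would trip the assert in a debug build
    }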
Oct 19 2014
On 10/17/2014 2:46 AM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" wrote:It isn't even obvious that a byte should be 8 bits,Oh come on! http://dlang.org/type.html
Oct 18 2014
On Saturday, 18 October 2014 at 23:45:37 UTC, Walter Bright wrote:On 10/17/2014 2:46 AM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" wrote:Hey, that was a historically motivated reflection on the smallest addressable unit. Not obvious that it should be 8 bit. :9It isn't even obvious that a byte should be 8 bits,Oh come on!
Oct 18 2014
On 16 October 2014 22:00, bearophile via Digitalmars-d <digitalmars-d puremagic.com> wrote:Just found with Reddit. C seems one step ahead of D with this: http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/*cough* GDC *cough* :o)
Oct 17 2014
On 10/17/14, 2:53 AM, Iain Buclaw via Digitalmars-d wrote:
> On 16 October 2014 22:00, bearophile via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> Just found with Reddit. C seems one step ahead of D with this:
>> http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
> *cough* GDC *cough* :o)

Do you mean ubsan will work with gdc? -- Andrei
Oct 17 2014
On Fri, 17 Oct 2014 08:08:34 -0700 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:Do you mean ubsan will work with gdc? -- Andreias far as i can understand, ubsan is GCC feature. not "GCC C compiler", but "GNU Compiler Collection". it works on IR representations, so GDC should be able to use ubsan almost automatically, without significant efforts from Iain. at least it looks like this.
Oct 17 2014
On 17 October 2014 16:08, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 10/17/14, 2:53 AM, Iain Buclaw via Digitalmars-d wrote:
>> On 16 October 2014 22:00, bearophile via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>>> Just found with Reddit. C seems one step ahead of D with this:
>>> http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
>> *cough* GDC *cough* :o)
> Do you mean ubsan will work with gdc? -- Andrei

It doesn't out of the box, but adding in front-end support is a small codegen addition for each plugin you wish to support. The rest is taken care of by GCC.

Iain.
Oct 17 2014
On Thursday, 16 October 2014 at 21:00:18 UTC, bearophile wrote:Just found with Reddit. C seems one step ahead of D with this: http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/ Bye, bearophile"Not every software bug has as serious consequences as seen in the Ariane 5 rocket crash." "if ubsan detects any problem, it outputs a “runtime error:” message, and in most cases continues executing the program." The latter won't really solve the former...
Oct 17 2014
On 10/17/14, 3:51 AM, eles wrote:
> On Thursday, 16 October 2014 at 21:00:18 UTC, bearophile wrote:
>> Just found with Reddit. C seems one step ahead of D with this:
>> http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
>>
>> Bye,
>> bearophile
>
> "Not every software bug has as serious consequences as seen in the Ariane 5 rocket crash."
>
> "if ubsan detects any problem, it outputs a “runtime error:” message, and in most cases continues executing the program."
>
> The latter won't really solve the former...

Still a step forward. -- Andrei
Oct 17 2014
On Friday, 17 October 2014 at 15:10:33 UTC, Andrei Alexandrescu wrote:
> On 10/17/14, 3:51 AM, eles wrote:
>> On Thursday, 16 October 2014 at 21:00:18 UTC, bearophile wrote:
>>> Just found with Reddit. C seems one step ahead of D with this:
> Still a step forward. -- Andrei

While I agree, IIRC, Ariane was never tested in that particular flight configuration that caused the bug (which was not a Heisenbug, as it was easy to reproduce but, you know, *afterwards*). Now, imagine that Ariane in space encountering a runtime error. Go back to Earth, anyone?... I specifically referred to the crash itself.
Oct 17 2014
On 10/16/2014 2:00 PM, bearophile wrote:Just found with Reddit. C seems one step ahead of D with this: http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/On the other hand, D is one step ahead of C with many of those (they are part of the language, not an add-on tool). Anyhow, for the remainder, https://issues.dlang.org/show_bug.cgi?id=13636
Oct 18 2014