digitalmars.D - OT (partially): about promotion of integers
- eles (40/40) Dec 11 2012 Hello,
- eles (7/13) Dec 11 2012 Rephrasing all that, it would be just like the fundamental type
- Andrei Alexandrescu (12/16) Dec 11 2012 [snip]
- eles (12/23) Dec 11 2012 Speed can be still optimized by the compiler, behind the scenes.
- Andrei Alexandrescu (10/33) Dec 11 2012 Agreed. But then that's one of them "sufficiently smart compiler"
- H. S. Teoh (6/21) Dec 11 2012 [...]
- Walter Bright (7/10) Dec 11 2012 Why stop at 64 bits? Why not make there only be one integral type, and i...
- eles (22/25) Dec 11 2012 You really miss the point here. Nobody will ask you to promote
- Walter Bright (9/11) Dec 11 2012 No, I don't miss the point. There are very few cases where the compiler ...
- Andrei Alexandrescu (3/5) Dec 11 2012 I thought my answer wasn't all that shoddy and not defensive at all.
- eles (1/3) Dec 11 2012 I step back. I agree. Thank you.
- Andrei Alexandrescu (4/6) Dec 11 2012 Somebody convinced somebody else of something on the Net. This has good
- eles (4/5) Dec 11 2012 About the non-defensiveness. As for the int's, I tend to consider
- bearophile (20/26) Dec 11 2012 Nope, this is a significant fallacy of yours.
- Walter Bright (8/29) Dec 11 2012 I know they didn't ask. But they did ask for 64 bits, and the exact same
- deadalnix (10/49) Dec 11 2012 I don't know about common LISP performances, never tried it in
- Walter Bright (9/19) Dec 11 2012 I'm interested in crafting D to be a language that people will like and ...
- bearophile (44/53) Dec 11 2012 Nowadays CommonLisp is not used much for anything (people at ITA
- Isaac Gouy (6/15) Dec 15 2012 I looked at that haskellwiki page but I didn't find anything to
- SomeDude (7/24) Dec 16 2012 Probably bearophile meant that the shootout allowed them to see
- jerro (3/6) Dec 16 2012 And especially if you also consider the fact that there Clean and
- Isaac Gouy (3/10) Dec 16 2012 See
- SomeDude (5/16) Dec 16 2012 Still, you don't explain why you picked, say ATS, which is
- SomeDude (5/24) Dec 16 2012 Proof is, it seems to me that you (Isaac Gouy) often come around
- Walter Bright (3/6) Dec 16 2012 Not really. You can set Google to email you whenever a search phrase tur...
- Isaac Gouy (5/13) Dec 17 2012 Yes, that's more or less what I do.
- H. S. Teoh (6/13) Dec 16 2012 I've used Clean before.
- Isaac Gouy (4/6) Dec 16 2012 I can make my own guesses, but I wanted to know what bearophile
- foobar (21/72) Dec 11 2012 All of the above relies on the assumption that the safety problem
- Walter Bright (7/16) Dec 11 2012 Trick? Not at all.
- foobar (35/57) Dec 11 2012 Thanks for proving my point. after all , you are a C++ developer,
- bearophile (7/8) Dec 11 2012 Plus one or two switches to disable such checking, if/when
- foobar (11/20) Dec 11 2012 Yeah, of course, that's why I said the C# semantics are _way_
- H. S. Teoh (27/49) Dec 11 2012 I don't agree that compiler switches should change language semantics.
- bearophile (19/22) Dec 11 2012 The idea was about two switches, one for signed integrals, and
- H. S. Teoh (36/56) Dec 11 2012 Two switches is even worse than one. The problem is that existing code
- Araq (5/16) Dec 12 2012 Yeah, it's not that easy; Nimrod uses a hygienic macro system
- Walter Bright (22/30) Dec 11 2012 The way to deal with this is to examine the implementation of CheckedInt...
- bearophile (7/13) Dec 11 2012 OK.
- d coder (6/7) Dec 11 2012 Greetings
- jerro (5/14) Dec 11 2012 The code is at https://github.com/TurkeyMan/phobos
- Walter Bright (2/3) Dec 11 2012 Manu posts here, reply to him!
- David Piepgrass (24/33) Dec 11 2012 I disagree with the analysis. I do want overflow detection, yet I
- Walter Bright (22/48) Dec 11 2012 You're not going to get performance with overflow checking even with the...
- David Piepgrass (31/51) Dec 12 2012 Thanks for the tip. Of course, I don't need and wouldn't use
- bearophile (7/11) Dec 11 2012 Here I have listed several problems in a library-defined SafeInt,
- Max Samukha (10/11) Dec 12 2012 OT: Why those are not allowed on module decls and local decls? We
- foobar (12/78) Dec 12 2012 I didn't say D should change the implementation of integers, in
- Walter Bright (2/5) Dec 12 2012 See my reply to bearophile about that.
- foobar (13/19) Dec 12 2012 Yeah, just saw that :)
- bearophile (9/19) Dec 12 2012 I think there were no references to ML in that part of Walter
- Walter Bright (2/7) Dec 12 2012 I don't know the ARM instruction set.
- Walter Bright (18/22) Dec 11 2012 I.e. the C# "solution".
- Walter Bright (17/47) Dec 11 2012 No, I'm an assembler programmer. I know how the machine works, and C, C+...
- bearophile (6/7) Dec 11 2012 OcaML, Haskell, F#, and so on are all languages derived more or
- Walter Bright (4/8) Dec 11 2012 Haskell is the language that everyone loves to learn and talk about, and...
- Timon Gehr (9/21) Dec 12 2012 (Sufficiently sane) languages are not slow or fast and I think the
- Walter Bright (8/11) Dec 12 2012 When you have a language that fundamentally disallows mutation, some alg...
- bearophile (18/21) Dec 12 2012 Two comments:
- Walter Bright (14/28) Dec 12 2012 I know people who use Java on server farms. They are very, very, very co...
- bearophile (11/22) Dec 12 2012 There are various kinds of code. In some kinds of programs you
- Walter Bright (3/7) Dec 12 2012 Fair enough, so I challenge you to write an Ocaml version of the input s...
- jerro (4/6) Dec 13 2012 Last I tried OCaml, "well used" in context of performance would
- SomeDude (9/32) Dec 13 2012 According to the Benchmark game, performance of Ocaml is good,
- Timon Gehr (32/47) Dec 12 2012 Here's a (real) quicksort:
- Walter Bright (3/5) Dec 12 2012 No language or compiler can prove code correct.
- Timon Gehr (53/58) Dec 12 2012 - There is no way to specify that a delegate is strongly pure without
- Walter Bright (2/10) Dec 12 2012 Are these in bugzilla?
- Timon Gehr (6/18) Dec 13 2012 Now they certainly are.
- Walter Bright (2/7) Dec 13 2012 Thank you.
- Walter Bright (23/27) Dec 12 2012 Ok, I'll bite.
- deadalnix (10/40) Dec 12 2012 You'll find a lot of trap into that in D, some that can kill your
- Timon Gehr (24/55) Dec 12 2012 You are testing some standard library functions that are implemented in
- Walter Bright (3/5) Dec 12 2012 So, please take the bait :-) and write a Haskell version that runs faste...
- SomeDude (14/20) Dec 13 2012 Well, you can write C code in D.
- SomeDude (3/10) Dec 13 2012 Hmmm, forget about this...
- Timon Gehr (3/19) Dec 14 2012 That is not what we are arguing.
- SomeDude (8/36) Dec 13 2012 Actually, a factor of 2 to 3 can be huge. Consider that java is
- Timon Gehr (5/42) Dec 14 2012 Most software I use is written in C or C++. I think some of it is way
- foobar (21/92) Dec 12 2012 This is precisely the point that signed and unsigned types are
- Walter Bright (3/5) Dec 12 2012 Um, there are many implementations of Javascript. In fact, I have implem...
- foobar (11/16) Dec 13 2012 Completely besides the point.
- Max Samukha (7/24) Dec 12 2012 Led Zep, too. Long time ago I read some pseudo-scientific book
- xenon325 (10/21) Dec 13 2012 Overlooked ? No. Using ? No. Disliked and abandoned ? No.
- Araq (17/20) Dec 12 2012 From http://embed.cs.utah.edu/ioc/
- bearophile (7/10) Dec 12 2012 There is some range analysis on shorter integral values. But
- Walter Bright (3/19) Dec 12 2012 D requires 2's complement arithmetic, it does not support 1's complement...
- Timon Gehr (11/20) Dec 12 2012 I think what he is talking about is that in C, if after a few steps of
- Walter Bright (5/26) Dec 12 2012 You're right in that the D optimizer does not take advantage of C "undef...
- Christopher Appleyard (2/2) Dec 12 2012 Hai :D I have seen your D program language on Google, it looks
- eles (2/5) Dec 11 2012 Clarification: to have those two types as fundamental (ie:
- Walter Bright (3/5) Dec 11 2012 Requiring integer operations to all be 64 bits would be a heavy burden o...
- Michael (11/11) Dec 12 2012 Machine/hardware have a explicitly defined register size and does
- Michael (1/5) Dec 12 2012 en link: http://msdn.microsoft.com/en-us/library/74b4xzyw.aspx
- eles (18/21) Dec 12 2012 Frankly, the hardware knows nothing about classes and about
- eles (3/8) Dec 12 2012 And the question is exactly that: what are the reasons to favor
- Michael (15/15) Dec 12 2012 I read all thread and conclude that developers want a one button
Hello,

The previous thread, about the int resulting from operations on bytes, raised a question for me that is somewhat linked to a difference between Pascal/Delphi/FPC (please, no flame here) and C/D. Basically, as far as I get it, both FPC and C use Integer (name it int, if you like) as a fundamental type. That means, among other things, that this is the preferred type to cast (implicitly) to.

Now, there is a difference between the int-FPC and the int-C: int-FPC is the *widest* integer type (and it is signed), and all other integral types are subranges of this int-FPC. That is, the unsigned type is simply a subrange of positive numbers, the char type is simply the subrange between -128 and +127, and so on. This looks to me like a great advantage, since implicit conversions are always straightforward and simple: everything is first converted to the fundamental (widest) type, the calculation is made (yes, there might be some optimizations made, but this should be handled by the compiler, not by the programmer), then the final result is obtained.

Note that this approach, of making unsigned integrals a subrange of the int-FPC, halves the maximum number representable as unsigned, since 1 bit is always reserved for the sign (albeit, for unsigned, it is always 0). OTOH, the fact that the int-FPC is the widest available makes it very natural as a fundamental type and justifies (I think, without doubt) casting all other types, and the result of an arithmetic operation, to this type. If the result is in a subrange, then it might get cast back to a subrange (that is, another integral type). In C/D, the problem is that int-C is the fundamental (and preferred for conversion) type, but it is not the widest. So, you have a plethora of implicit promotions.

Now, the off-topic question: the loss in unsigned range aside (which I find a small price for the earned clarity), is there any other reason (except C compatibility) that D would not implement that model, that is, the int-FPC-like model for integral types? (This is not a suggestion to do it now; I know D is almost ready for prime time. It is just a question.)

Thank you, Eles
Dec 11 2012
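For concreteness, a minimal sketch (not from the thread) of the C-style promotion D performs today, which is what prompted the question:

    void main()
    {
        byte a = 100, b = 100;
        auto c = a + b;              // operands promote to int, so c is int
        static assert(is(typeof(c) == int));
        assert(c == 200);            // the arithmetic happened at int width
        // byte d = a + b;           // error: cannot implicitly convert int to byte
        byte d = cast(byte)(a + b);  // explicit narrowing is required, and it wraps
        assert(d == -56);            // 200 does not fit in a signed byte
    }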
> Now, the off-topic question: the loss in unsigned range aside (which I find a small price for the earned clarity), is there any other reason (except C compatibility) that D would not implement that model, that is, the int-FPC-like model for integral types?

Rephrasing all that: it would be just as if the fundamental type in D were the widest integral type, and the unsigned variant of that widest integral type were dropped. Then, all operands in an integral operation would first be promoted to this widest integral, the computation would be made, and the final result might be demoted back (the compiler is free to optimize as it wants, but behind the scenes).
Dec 11 2012
On 12/11/12 10:20 AM, eles wrote:

> Hello, The previous thread, about the int resulting from operations on bytes, raised a question for me that is somewhat linked to a difference between Pascal/Delphi/FPC (please, no flame here) and C/D.

[snip]

There's a lot to be discussed on the issue. A few quick thoughts:

* 32-bit integers are a sweet spot for CPU architectures. There's rarely a provision for 16- or 8-bit operations; the action is at 32- or 64-bit.

* Then, although most 64-bit operations are as fast as 32-bit ones, transporting operands takes twice as much internal bus real estate and sometimes twice as much core real estate (i.e. there are units that do either two 32-bit ops or one 64-bit op).

* The whole reserving a bit and halving the range means extra costs of operating with a basic type.

Andrei
Dec 11 2012
> There's a lot to be discussed on the issue. A few quick thoughts: * 32-bit integers are a sweet spot for CPU architectures. There's rarely a provision for 16- or 8-bit operations; the action is at 32- or 64-bit.

Speed can still be optimized by the compiler, behind the scenes. The approach does not ask the compiler to promote everything to the widest integral, but to do the job "as if". Currently, the choice of int-C as the fastest integral instead of the widest integral moves the burden from the compiler to the user.

> * Then, although most 64-bit operations are as fast as 32-bit ones, transporting operands takes twice as much internal bus real estate and sometimes twice as much core real estate (i.e. there are units that do either two 32-bit ops or one 64-bit op). * The whole reserving a bit and halving the range means extra costs of operating with a basic type.

Yes, there is a cost. But, as always, there is a balance between advantages and drawbacks. What is favourable? Simplicity of promotion or a supplementary bit?

Besides, at the end of the day, a half-approach would be to have a widest-signed-integral and a widest-unsigned-integral type and only play with those two.

Eles
Dec 11 2012
On 12/11/12 11:29 AM, eles wrote:

> Speed can still be optimized by the compiler, behind the scenes. The approach does not ask the compiler to promote everything to the widest integral, but to do the job "as if". Currently, the choice of int-C as the fastest integral instead of the widest integral moves the burden from the compiler to the user.

Agreed. But then that's one of them "sufficiently smart compiler" arguments. http://c2.com/cgi/wiki?SufficientlySmartCompiler

> Yes, there is a cost. But, as always, there is a balance between advantages and drawbacks. What is favourable? Simplicity of promotion or a supplementary bit?

A direct and natural mapping between language constructs and machine execution is very highly appreciated in the market D is in. I don't see that changing in the foreseeable future.

> Besides, at the end of the day, a half-approach would be to have a widest-signed-integral and a widest-unsigned-integral type and only play with those two.

D has terrific abstraction capabilities. Leave primitive types alone and define a UDT that implements your desired behavior. You can always implement safe on top of fast but not the other way around.

Andrei
Dec 11 2012
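To make Andrei's suggestion concrete, here is a minimal sketch of such a UDT: a Pascal-style checked subrange layered over a wide primitive. The Ranged name and design are mine, not an existing library type:

    struct Ranged(long min, long max)
    {
        long value;
        this(long v) { opAssign(v); }
        void opAssign(long v)
        {
            assert(v >= min && v <= max, "subrange violation");
            value = v;
        }
        // compute at full width, then re-check the subrange on the result
        Ranged opBinary(string op)(Ranged rhs)
        {
            return Ranged(mixin("value " ~ op ~ " rhs.value"));
        }
    }

    unittest
    {
        alias Digit = Ranged!(0, 9);
        auto a = Digit(4), b = Digit(5);
        auto c = a + b;                  // fine: 9 is inside the subrange
        assert(c.value == 9);
        // auto d = Digit(7) + Digit(7); // would fail the subrange assert
    }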
On Tue, Dec 11, 2012 at 11:35:39AM -0500, Andrei Alexandrescu wrote:

> Agreed. But then that's one of them "sufficiently smart compiler" arguments. http://c2.com/cgi/wiki?SufficientlySmartCompiler

[...]

A sufficiently smart compiler can solve the halting problem. ;-)

T -- Obviously, some things aren't very obvious.
Dec 11 2012
On 12/11/2012 8:35 AM, Andrei Alexandrescu wrote:

> Besides, at the end of the day, a half-approach would be to have a widest-signed-integral and a widest-unsigned-integral type and only play with those two.

Why stop at 64 bits? Why not make there be only one integral type, of whatever precision is necessary to hold the value? This is quite doable, and has been done. But at a terrible performance cost. And, yes, in D you can create your own "BigInt" datatype which exhibits this behavior.
Dec 11 2012
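For reference, a minimal sketch of the library route Walter mentions, using Phobos's std.bigint:

    import std.bigint;

    void main()
    {
        // BigInt grows to whatever precision the value needs -- the
        // "one integral type" model, at the run-time cost Walter describes.
        auto x = BigInt(2) ^^ 100;      // far beyond what ulong can hold
        assert(x == BigInt("1267650600228229401496703205376"));
    }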
> Why stop at 64 bits? Why not make there be only one integral type, of whatever precision is necessary to hold the value? This is quite doable, and has been done.

You really miss the point here. Nobody will ask you to promote those numbers to 64 bits or whatever *unless necessary*. It will only modify the implicit promotion rule, from "at least to int" to "widest integral". You may choose, as a compiler, to promote the numbers only to 16 bits, or 32 bits, if you like, but only if the final result is not compromised. The compiler will be free to promote as it likes, as long as it guarantees that the final result is "as if" the promotion were to the widest integral. The point is that this way the promotion rules, quite complex now, become straightforward. Yes, the burden will be on the compiler rather than on the user. But this could improve in time: C++ classes are nothing else than a burden that falls on the compiler in order to make the programmer's life easier. Those classes too started as big behemoths, so slow that they scared everyone. Anyway, I will not defend this to the end of the world. Actually, if you look at my original post, you will see that this is a simple question, not a suggestion. Until now the question has received much pushback, but no answer. A bit shameful.
Dec 11 2012
On 12/11/2012 10:36 AM, eles wrote:

> You really miss the point here. Nobody will ask you to promote those numbers to 64 bits or whatever *unless necessary*.

No, I don't miss the point. There are very few cases where the compiler could statically prove that something will fit in less than 32 bits. Consider this:

    Integer foo(Integer i) { return i * 2; }

Tell me how many bits that should be.
Dec 11 2012
On 12/11/12 1:36 PM, eles wrote:

> Until now the question has received much pushback, but no answer. A bit shameful.

I thought my answer wasn't all that shoddy and not defensive at all.

Andrei
Dec 11 2012
> I thought my answer wasn't all that shoddy and not defensive at all.

I step back. I agree. Thank you.
Dec 11 2012
On 12/11/12 5:07 PM, eles wrote:

>> I thought my answer wasn't all that shoddy and not defensive at all.
>
> I step back. I agree. Thank you.

Somebody convinced somebody else of something on the Net. This has "good day" written all over it. Time to open that champagne. Cheers!

Andrei
Dec 11 2012
> Somebody convinced somebody else of something on the Net.

About the non-defensiveness. As for the ints, I tend to consider that the matter is controversial, but the balance between the drawbacks and advantages of either choice is more even than it seems.
Dec 11 2012
Walter Bright:

> Why stop at 64 bits? Why not make there be only one integral type, of whatever precision is necessary to hold the value? This is quite doable, and has been done.

I think no one has asked for *bignums by default* in this thread.

> But at a terrible performance cost.

Nope, this is a significant fallacy of yours. Common Lisp (and OCaml) uses tagged integers by default, and they are very far from being "terrible". Tagged integers cause no heap allocations if they aren't large. Also, the Common Lisp compiler is in various situations able to infer that an integer can't be too large, replacing it with some fixnum. And it's easy to add annotations in critical spots to ask the Common Lisp compiler to use a fixnum, to squeeze out all the performance. The result is code that's quick in most situations, but more often correct. In D you drive with eyes shut; sometimes it's hard for me to know whether some integral overflow has occurred in a long computation.

> And, yes, in D you can create your own "BigInt" datatype which exhibits this behavior.

Currently D bigints don't have a small-integer optimization. And even when this library problem is removed, I think the compiler doesn't perform on BigInts the optimizations it does on ints, because it doesn't know about bigint properties.

Bye, bearophile
Dec 11 2012
On 12/11/2012 10:45 AM, bearophile wrote:

> I think no one has asked for *bignums by default* in this thread.

I know they didn't ask. But they did ask for 64 bits, and the exact same argument will apply to bignums, as I pointed out.

> Nope, this is a significant fallacy of yours. Common Lisp (and OCaml) uses tagged integers by default, and they are very far from being "terrible". [...]

I don't notice anyone reaching for Lisp or Ocaml for high performance applications.

> Currently D bigints don't have a small-integer optimization.

That's irrelevant to this discussion. It is not a problem with the language. Anyone can improve the library one if they desire, or do their own.

> I think the compiler doesn't perform on BigInts the optimizations it does on ints, because it doesn't know about bigint properties.

I think the general lack of interest in bigints indicates that the built-in types work well enough for most work.
Dec 11 2012
On Tuesday, 11 December 2012 at 21:57:38 UTC, Walter Bright wrote:

> I know they didn't ask. But they did ask for 64 bits, and the exact same argument will apply to bignums, as I pointed out.

Agreed.

> I don't notice anyone reaching for Lisp or Ocaml for high performance applications.

I don't know about Common Lisp performance, having never tried it on something where that really matters. But OCaml is really very performant. I don't know how it handles integers internally.

> That's irrelevant to this discussion. It is not a problem with the language. Anyone can improve the library one if they desire, or do their own.

The library is part of the language. What is a language with no vocabulary?

> I think the general lack of interest in bigints indicates that the built-in types work well enough for most work.

That argument is fallacious. Something being more used doesn't really mean it's better, or PHP and C++ would be some of the best languages ever made.
Dec 11 2012
On 12/11/2012 3:15 PM, deadalnix wrote:

> The library is part of the language. What is a language with no vocabulary?

I think it is useful to draw a distinction.

> That argument is fallacious. Something being more used doesn't really mean it's better, or PHP and C++ would be some of the best languages ever made.

I'm interested in crafting D to be a language that people will like and use. Therefore, the things that make a language popular are of significant interest. I.e. it's meaningless to create the best language evar and be the only user of it. Now, if we have int with terrible problems, and bigint that solves those problems, and yet people still prefer int by a 1000:1 margin, that makes me very skeptical that those problems actually matter. We need to be solving the *right* problems with D.
Dec 11 2012
Walter Bright:

> I don't notice anyone reaching for Lisp or Ocaml for high performance applications.

Nowadays Common Lisp is not used much for anything (people at ITA use it to plan flights; their code is efficient, algorithmically complex, and used for heavy loads). OCaml, on the other hand, is regarded as quite fast (though it's not much used in general); it's sometimes chosen for its high performance combined with its greater safety, so some use it in automatic high-speed trading: https://ocaml.janestreet.com/?q=node/61 https://ocaml.janestreet.com/?q=node/82

> I think the general lack of interest in bigints indicates that the built-in types work well enough for most work.

Where do you see this general lack of interest in bigints? In D or in other languages? I use bigints often in D. In Python we use only bigints. In Scheme, OCaml and Lisp-like languages, multi-precision numbers are the default ones. I think if you give programmers better bigints (meaning efficient and as natural to use as ints), they will use them. I think currently in D there is no way to make bigints as efficient as ints, because there is no way to express in D the full semantics that built-in ints have. This is a language limitation. One way to solve this problem, and keep BigInts as Phobos code, is to introduce a built-in attribute usable to mark user-defined structs as int-like.

----------------------

deadalnix:

> But OCaml is really very performant.

It's fast considering it's a mostly functional language. OCaml vs C++ in the Shootout: http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=ocaml&lang2=gpp Versus Haskell: http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=ocaml&lang2=ghc But as usual you have to take such comparisons cum grano salis, because there are a lot more people working on the GHC compiler, and because the Shootout Haskell solutions are quite un-idiomatic (you can see this from the Shootout site itself, taking a look at the length of the solutions) and come from several years of maniac-level discussions (they have patched the Haskell compiler and its library several times to improve the results of those benchmarks): http://www.haskell.org/haskellwiki/Shootout

> I don't know how it handles integers internally.

It uses tagged integers, 31 or 63 bits long, with the tag on the less significant side: http://stackoverflow.com/questions/3773985/why-is-an-int-in-ocaml-only-31-bits

Bye, bearophile
Dec 11 2012
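To illustrate the scheme bearophile describes, a sketch in D (not OCaml's actual implementation): the value lives in the upper bits of a machine word, and the low bit distinguishes an unboxed integer from an aligned pointer:

    long tag(long v)      { return (v << 1) | 1; }
    long untag(long w)    { return w >> 1; }        // arithmetic shift keeps the sign
    bool isTagged(long w) { return (w & 1) != 0; }  // pointers are aligned, low bit 0

    unittest
    {
        assert(untag(tag(42)) == 42);
        assert(untag(tag(-7)) == -7);
        // addition needs almost no untagging: (2a+1) + (2b+1) - 1 == 2(a+b) + 1
        assert(untag(tag(3) + tag(4) - 1) == 7);
    }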
On Tuesday, 11 December 2012 at 23:59:29 UTC, bearophile wrote:

-snip-

> But as usual you have to take such comparisons cum grano salis, [...] (they have patched the Haskell compiler and its library several times to improve the results of those benchmarks): http://www.haskell.org/haskellwiki/Shootout

I looked at that haskellwiki page but I didn't find anything to suggest "they have patched the Haskell compiler and its library several times to improve the results of those benchmarks". Was it something the compiler writers told you?
Dec 15 2012
On Saturday, 15 December 2012 at 17:11:01 UTC, Isaac Gouy wrote:

> I looked at that haskellwiki page but I didn't find anything to suggest "they have patched the Haskell compiler and its library several times to improve the results of those benchmarks". Was it something the compiler writers told you?

Probably bearophile meant that the Shootout allowed them to see some weaknesses in some implementations, and therefore helped them improve on those. It's also something that would benefit D if, say, GDC were allowed to come back into the Shootout, given that it's now widely acknowledged (at least in the programming communities) to be one of the most promising languages around...
Dec 16 2012
> if, say, GDC were allowed to come back into the Shootout, given that it's now widely acknowledged (at least in the programming communities) to be one of the most promising languages around...

And especially if you also consider the fact that Clean and ATS are in the Shootout, and I'm guessing that very few people use those.
Dec 16 2012
On Sunday, 16 December 2012 at 15:45:32 UTC, jerro wrote:

> And especially if you also consider the fact that Clean and ATS are in the Shootout, and I'm guessing that very few people use those.

See http://www.digitalmars.com/d/archives/digitalmars/D/Why_did_D_leave_the_programming_language_shootout_and_will_it_return_144864.html#N144870
Dec 16 2012
On Sunday, 16 December 2012 at 19:59:31 UTC, Isaac Gouy wrote:

> See http://www.digitalmars.com/d/archives/digitalmars/D/Why_did_D_leave_the_programming_language_shootout_and_will_it_return_144864.html#N144870

Still, you don't explain why you picked, say, ATS, which is significantly more esoteric than D and much less likely to be used by the community at large. I argue that many more people would be interested in the performance of D.
Dec 16 2012
On Sunday, 16 December 2012 at 23:21:15 UTC, SomeDude wrote:

> Still, you don't explain why you picked, say, ATS, which is significantly more esoteric than D and much less likely to be used by the community at large. I argue that many more people would be interested in the performance of D.

Proof is, it seems to me that you (Isaac Gouy) often come around here. We can magically invoke you every time someone talks about the Shootout. Which is pretty astonishing for a language you aren't interested in.
Dec 16 2012
On 12/16/2012 3:24 PM, SomeDude wrote:

> Proof is, it seems to me that you (Isaac Gouy) often come around here. We can magically invoke you every time someone talks about the Shootout. Which is pretty astonishing for a language you aren't interested in.

Not really. You can set Google to email you whenever a search phrase turns up a new result.
Dec 16 2012
On Monday, 17 December 2012 at 01:14:37 UTC, Walter Bright wrote:

> Not really. You can set Google to email you whenever a search phrase turns up a new result.

Yes, that's more or less what I do. I have a couple of Google searches saved as bookmarks, and when the mood takes me I check what comments are being made about the benchmarks game.
Dec 17 2012
On Sun, Dec 16, 2012 at 04:45:31PM +0100, jerro wrote:

> And especially if you also consider the fact that Clean and ATS are in the Shootout, and I'm guessing that very few people use those.

I've used Clean before. But yeah, probably not many people would be familiar with it.

T -- Never trust an operating system you don't have source for! -- Martin Schulze
Dec 16 2012
On Sunday, 16 December 2012 at 13:05:50 UTC, SomeDude wrote:

-snip-

>> Was it something the compiler writers told you?
>
> Probably bearophile meant that...

I can make my own guesses, but I wanted to know what bearophile meant, so I asked him ;-)
Dec 16 2012
On Tuesday, 11 December 2012 at 16:35:39 UTC, Andrei Alexandrescu wrote:

> D has terrific abstraction capabilities. Leave primitive types alone and define a UDT that implements your desired behavior. You can always implement safe on top of fast but not the other way around.

All of the above relies on the assumption that the safety problem is due to the memory layout. Many other programming languages solve this from a different point of view - the problem lies in the implicit casts, not the memory layout. In other words, the culprit is code such as:

    uint a = -1;

which compiles under C's implicit coercion rules but _really shouldn't_. The semantically correct way would be something like:

    uint a = 0xFFFF_FFFF;

but C/C++ programmers tend to think the "-1" trick is less verbose and "better". Another way is to explicitly state the programmer's intention:

    uint a = reinterpret!uint(-1); // no run-time penalty should occur

D decided to follow C's coercion rules, which I think is a design mistake, but one that cannot be easily changed. Perhaps, as Andrei suggested, a solution would be to use a higher-level "Integer" type defined in a library that enforces better semantics.
Dec 11 2012
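For reference, a small check of the coercion foobar objects to, as current D compilers accept it (note that reinterpret!uint above is foobar's hypothetical syntax, not an existing Phobos function):

    void main()
    {
        uint a = -1;               // accepted: C-style implicit coercion
        assert(a == uint.max);     // i.e. 0xFFFF_FFFF
        uint b = 0xFFFF_FFFF;      // the explicit spelling
        uint c = cast(uint)-1;     // an explicit reinterpretation
        assert(a == b && b == c);
    }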
On 12/11/2012 10:44 AM, foobar wrote:

> In other words, the culprit is code such as: uint a = -1; which compiles under C's implicit coercion rules but _really shouldn't_. [...] C/C++ programmers tend to think the "-1" trick is less verbose and "better".

Trick? Not at all.

1. -1 is the size of an int, which varies in C.

2. -i means "complement and then increment".

3. Would you allow 2-1? How about 1-1? (1-1)-1?

Arithmetic in computers is different from the math you learned in school. It's 2's complement, and it's best to always keep that in mind when writing programs.
Dec 11 2012
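A quick check of point 2 in D (a sketch; the assertions hold on any two's-complement machine, which D requires):

    void main()
    {
        int i = 5;
        assert(-i == ~i + 1);             // negation is "complement, then increment"
        assert(cast(uint)-1 == uint.max); // so -1 reinterprets to all one bits
    }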
On Tuesday, 11 December 2012 at 22:08:15 UTC, Walter Bright wrote:

> Trick? Not at all. 1. -1 is the size of an int, which varies in C. 2. -i means "complement and then increment". 3. Would you allow 2-1? How about 1-1? (1-1)-1? Arithmetic in computers is different from the math you learned in school. It's 2's complement, and it's best to always keep that in mind when writing programs.

Thanks for proving my point; after all, you are a C++ developer, aren't you? :) Seriously though, it _is_ a trick and a code smell. I'm fully aware that computers use 2's complement. I'm also aware of the fact that the type has an "unsigned" label all over it. You see it right there in that 'u' prefix of 'int'. An unsigned type should semantically entail _no sign_ in its operations. You are calling a cat a dog and arguing that dogs bark? Yeah, I completely agree with that notion, except we are still talking about _a cat_. To answer your question, yes, I would enforce overflow and underflow checking semantics. You can claim that:

    uint a = -1;

is perfectly safe and has a well defined meaning (well, for C programmers that is), but what about:

    uint a = b - c;

What if that calculation results in a negative number? What should the compiler do? Well, there are _two_ equally possible answers: a. the overflow was intended, as in the mask = -1 case; or b. the overflow is a _bug_. The user should be made aware of this and should decide how to handle it. This should _not_ be implicitly handled by the compiler, letting bugs go unnoticed. An example of a language that gets this right would be (S)ML, which is a compiled language that requires _explicit conversions_ and has a very strong typing system. Its programs compile to efficient native executables, and the strong typing allows both the compiler and the programmer better reasoning about the code. Thus programs are more correct and can be optimized by the compiler. In fact, several languages are implemented in ML because of its higher guarantees.
Dec 11 2012
foobar:

> I would enforce overflow and underflow checking semantics.

Plus one or two switches to disable such checking, if/when someone wants it, to regain the C performance. (Plus some syntax to disable/enable such checking in a small piece of code.) Maybe someday Walter will change his mind about this topic :-)

Bye, bearophile
Dec 11 2012
On Wednesday, 12 December 2012 at 00:06:53 UTC, bearophile wrote:

> Plus one or two switches to disable such checking, if/when someone wants it, to regain the C performance.

Yeah, of course, that's why I said the C# semantics are _way_ better. (That's a self quote.) Btw, here's the link for SML, which does not use tagged ints - http://www.standardml.org/Basis/word.html#Word8:STR:SPEC "Instances of the signature WORD provide a type of unsigned integer with modular arithmetic and logical operations and conversion operations. They are also meant to give efficient access to the primitive machine word types of the underlying hardware, and support bit-level operations on integers. They are not meant to be a ``larger'' int."
Dec 11 2012
On Wed, Dec 12, 2012 at 01:26:08AM +0100, foobar wrote:

> Plus one or two switches to disable such checking, if/when someone wants it, to regain the C performance. (Plus some syntax to disable/enable such checking in a small piece of code.)

I don't agree that compiler switches should change language semantics. Just because you specify a certain compiler switch, it can cause unrelated breakage in some obscure library somewhere that assumes modular arithmetic with C/C++ semantics. And this breakage will in all likelihood go *unnoticed* until your software is running on the customer's site and then it crashes horribly. And good luck debugging that, because the breakage can be very subtle, plus it's *not* in your own code, but in some obscure library code that you're not familiar with. I think a much better approach is to introduce a new type (or new types) that *does* have the requisite bounds checking and static analysis. That's what a type system is for.

[...]

> "Instances of the signature WORD provide a type of unsigned integer with modular arithmetic and logical operations and conversion operations. [...] They are not meant to be a ``larger'' int."

It's kinda too late for D to rename int to word, say, but it's not too late to introduce a new checked int type, say 'number' or something like that (you can probably think of a better name). In fact, Andrei describes a CheckedInt type that uses operator overloading, etc., to implement an in-library solution to bounds checks. You can probably expand that into a workable lightweight int replacement. By wrapping an int in a struct with custom operators, you can pretty much have an int-sized type (with value semantics, just like "native" ints, no less!) that does what you want, instead of the usual C/C++ int semantics.

T -- In a world without fences, who needs Windows and Gates? -- Christian Surchi
Dec 11 2012
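A minimal sketch of the wrapper Teoh describes (my own toy code, not Andrei's actual CheckedInt): an int-sized value type whose arithmetic is computed at long width and then range-checked. The widen-then-check trick obviously only works for types narrower than the widest one:

    struct Checked
    {
        int value;

        Checked opBinary(string op)(Checked rhs)
            if (op == "+" || op == "-" || op == "*")
        {
            // compute at long width, then check the int range
            long wide = mixin("cast(long)value " ~ op ~ " cast(long)rhs.value");
            assert(wide >= int.min && wide <= int.max, "integer overflow");
            return Checked(cast(int)wide);
        }
    }

    unittest
    {
        auto ok = Checked(1_000) * Checked(1_000);  // fits in int
        assert(ok.value == 1_000_000);
        // Checked(int.max) + Checked(1);           // would trip the assert
    }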
H. S. Teoh:

> Just because you specify a certain compiler switch, it can cause unrelated breakage in some obscure library somewhere that assumes modular arithmetic with C/C++ semantics.

The idea was about two switches, one for signed integrals and the other for both signed and unsigned. But from other posts I guess Walter doesn't think this is a viable possibility. So the solutions I see now are to stop using D for some kinds of more important programs, or to use some kind of SafeInt and then work with the compiler writers to allow user-defined structs to be usable as naturally as possible as ints (and possibly as efficiently). Regarding SafeInt, I think today there is no way to write it efficiently in D, because the overflow flags are not accessible from D, and if you use inline asm, you lose inlining in DMD. This is just one of the problems. Another is the syntax incompatibilities of user-defined structs compared to built-in ints. Yet another is the probable lack of high-level optimizations done on such a user-defined type. We are very far from a good solution to such problems.

Bye, bearophile
Dec 11 2012
On Wed, Dec 12, 2012 at 02:15:24AM +0100, bearophile wrote:

> The idea was about two switches, one for signed integrals and the other for both signed and unsigned. But from other posts I guess Walter doesn't think this is a viable possibility.

Two switches is even worse than one. The problem is that existing code assumes certain kinds of behaviour from int, uint, etc. Such code may exist in common libraries imported by your code (directly or indirectly). Now you compile your code with a switch (or two switches) that modifies the behaviour of int, and things start to break. Even worse, if you only use the switches on certain critical source files, then you may end up with incompatible behaviour of the same library code in the same executable (e.g. a template got instantiated once with the switches enabled, once without). It leads to all kinds of inconsistencies and subtle breakages that totally outweigh whatever benefits it may have had.

> So the solutions I see now are to stop using D for some kinds of more important programs, or to use some kind of SafeInt and then work with the compiler writers to allow user-defined structs to be usable as naturally as possible as ints (and possibly as efficiently).

It's not too late to add a new native type (or types) to the language that supports this kind of checking. I see that as the best solution to this issue. Don't mess with the existing types, because too much already depends on them. Add a new type that has the desired behaviour. But you may have a hard time convincing Walter to put it in, though.

> Regarding SafeInt, I think today there is no way to write it efficiently in D, because the overflow flags are not accessible from D, and if you use inline asm, you lose inlining in DMD. [...]

These are implementation issues that we can work on improving. For one thing, I'd love to see D get closer to the point where the distinction between built-in types and user-defined types is gone. We may never actually reach that point, but the closer we get, the better. This will let us solve a lot of things, like drop-in replacements for AA's, etc., that are a bit ugly to do today. One thing I've always thought about is a way for user types to specify sub-expression optimizations that the compiler can apply. Basically, if I implement, say, a Matrix class, then I should be able to tell the compiler that certain Matrix expressions, say A*B+A*C, can be factored into A*(B+C), and have the optimizer automatically do this for me based on what is defined in the type. Or specify that write("a");writeln("b"); can be replaced by writeln("ab");. But I haven't come up with a good generic framework for actually making this implementable yet.

T -- I don't trust computers, I've spent too long programming to think that they can get anything right. -- James Miller
Dec 11 2012
> If I implement, say, a Matrix class, then I should be able to tell the compiler that certain Matrix expressions, say A*B+A*C, can be factored into A*(B+C), and have the optimizer automatically do this for me based on what is defined in the type. [...] But I haven't come up with a good generic framework for actually making this implementable yet.

Yeah, it's not that easy; Nimrod uses a hygienic macro system with term rewriting rules plus side-effect analysis and alias analysis for that ;-). http://build.nimrod-code.org/docs/trmacros.html http://forum.nimrod-code.org/t/70
Dec 12 2012
On 12/11/2012 5:15 PM, bearophile wrote:

> Regarding SafeInt, I think today there is no way to write it efficiently in D, because the overflow flags are not accessible from D, and if you use inline asm, you lose inlining in DMD. This is just one of the problems.

The way to deal with this is to examine the implementation of CheckedInt and design a couple of compiler intrinsics to use in its implementation that will eliminate the asm code. (This is how the high-level vector library Manu is implementing is done.)

> Another is the syntax incompatibilities of user-defined structs compared to built-in ints.

This is not an issue.

> Yet another is the probable lack of high-level optimizations done on such a user-defined type.

Using intrinsics deals with this issue nicely, as the optimizer knows about them.

> We are very far from a good solution to such problems.

No, we are not. The problem, as I see it, is that nobody actually cares about this. Why would I say something so provocative? Because I've seen D programmers go to herculean lengths to get around problems they are having in the language. These efforts make a strong case that they need better language support (UDAs are a primo example of this). I see nobody bothering to write a CheckedInt type and seeing how far they can push it, even though writing such a type is not a significant challenge; it's a bread-and-butter job. Also, as I said before, there is a SafeInt class in C++. So far as I can tell, nobody uses it. Want to prove me wrong? Implement such a user-defined type, and demonstrate user interest in it. (Also note the HalfFloat class I implemented for Manu, as a demonstration of how a user-defined type can implement a floating-point type that is unknown to the compiler.)
Dec 11 2012
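For what it's worth, this intrinsic route is what druntime later shipped as core.checkedint (well after this thread). A sketch of a checked a*b + c on top of it; the overflow flag is sticky across calls, so it need only be tested once:

    import core.checkedint : adds, muls;

    int checkedMulAdd(int a, int b, int c)
    {
        bool ovf = false;
        // compilers that recognize these functions can map them to the CPU flags
        int r = adds(muls(a, b, ovf), c, ovf);
        assert(!ovf, "integer overflow in a*b + c");
        return r;
    }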
Walter Bright:

> The way to deal with this is to examine the implementation of CheckedInt and design a couple of compiler intrinsics to use in its implementation that will eliminate the asm code.

OK, good. I didn't think of this option.

> Using intrinsics deals with this issue nicely, as the optimizer knows about them.

OK.

> The problem, as I see it, is that nobody actually cares about this.

Maybe you are right. I think I have never said there are a lot of people caring about this in D :-)

Bye, bearophile
Dec 11 2012
On Wed, Dec 12, 2012 at 8:14 AM, Walter Bright <newshound2 digitalmars.com> wrote:

> (This is how the high-level vector library Manu is implementing is done.)

Greetings. Where can I learn more about this library Manu is developing?

Regards - Puneet
Dec 11 2012
On Wednesday, 12 December 2012 at 04:42:57 UTC, d coder wrote:

> Where can I learn more about this library Manu is developing?

The code is at https://github.com/TurkeyMan/phobos It doesn't have anything to do with checked integers, though - Walter was just using it as an example of an approach that we could also use with checked integers.
Dec 11 2012
On 12/11/2012 8:42 PM, d coder wrote:

> Where can I learn more about this library Manu is developing?

Manu posts here, reply to him!
Dec 11 2012
> The problem, as I see it, is that nobody actually cares about this. [...] I see nobody bothering to write a CheckedInt type and seeing how far they can push it, even though writing such a type is not a significant challenge; it's a bread-and-butter job.

I disagree with the analysis. I do want overflow detection, yet I would not use a CheckedInt in D for the same reason I do not usually use one in C++: without compiler support, it is too expensive to detect overflow. In my C++ I have a lot of math code where I would otherwise prefer checking; constantly checking for overflow without hardware support would kill most of the performance advantage, so I don't do it. I do use "clipped conversion" though: e.g. ClippedConvert<short>(40000)==32767. I can afford the overhead in this case because I don't do type conversions as often as addition, bit shifts, etc. An exception on overflow, as in .NET's checked context, is convenient but is bad for performance if it happens regularly; it can also make a debugger almost unusable. Some sort of mechanism that works like an exception, but faster, would probably be better. Consider:

    result = a * b + c * d;

If a * b overflows, there is probably no point in executing c * d, so it may as well jump straight to a handler; on the other hand, the exception mechanism is costly, especially if the debugger is hooked in and causes a context switch every single time it happens. So... I dunno. What's the best semantic for an overflow detector?
Dec 11 2012
On 12/11/2012 9:51 PM, David Piepgrass wrote:

> I disagree with the analysis. I do want overflow detection, yet I would not use a CheckedInt in D for the same reason I do not usually use one in C++: without compiler support, it is too expensive to detect overflow. [...] Constantly checking for overflow without hardware support would kill most of the performance advantage, so I don't do it.

You're not going to get performance with overflow checking even with the best compiler support. For example, much arithmetic code is generated for the x86 using addressing mode instructions, like:

    LEA EAX,16[8*EBX][ECX]   ; for 16+8*b+c

The LEA instruction does no overflow checking. If you wanted it, the best code would be:

    MOV EAX,16
    IMUL EBX,8
    JO overflow
    ADD EAX,EBX
    JO overflow
    ADD EAX,ECX
    JO overflow

which is considerably less efficient. (The LEA is designed to run in one cycle.) Plus, often more registers are modified, which impedes good register allocation. Any language that does overflow checking does it under duress. It is not a conspiracy of pig-headed language developers :-)

> I do use "clipped conversion" though: e.g. ClippedConvert<short>(40000)==32767. I can afford the overhead in this case because I don't do type conversions as often as addition, bit shifts, etc.

You can't have both performant code and overflow detection.

> Some sort of mechanism that works like an exception, but faster, would probably be better. [...] So... I dunno. What's the best semantic for an overflow detector?

If you desire overflows to be programming errors, then you want an abort, not a thrown exception. I am perplexed by your desire to continue execution when overflows happen regularly.
Dec 11 2012
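To put the cost Walter describes in portable terms rather than x86 assembly: below is a minimal, hypothetical sketch (addChecked is an invented name, not a D or Phobos primitive) of what a software-checked 32-bit signed addition has to do on every operation - the wrapping add itself, plus extra bit operations and a branch:

    // Assumes 2's complement wraparound on the plain add (what DMD does).
    int addChecked(int a, int b)
    {
        int r = a + b; // wraps on overflow
        // Overflow occurred iff a and b have the same sign
        // and r has the opposite sign.
        if ((~(a ^ b) & (a ^ r)) < 0)
            assert(0, "integer overflow");
        return r;
    }

    void main()
    {
        assert(addChecked(2_000_000_000, -1) == 1_999_999_999);
        // addChecked(int.max, 1); // would abort
    }

The branch is usually well predicted, but it still inflates the code and constrains the optimizer, which is the trade-off being argued over in this thread.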
On Wednesday, 12 December 2012 at 06:19:14 UTC, Walter Bright wrote:

You're not going to get performance with overflow checking even with the best compiler support. For example, much arithmetic code is generated for the x86 using addressing mode instructions, like:

    LEA EAX,16[8*EBX][ECX]   for 16+8*b+c

The LEA instruction does no overflow checking. If you wanted it, the best code would be:

    MOV  EAX,16
    IMUL EBX,8
    JO   overflow
    ADD  EAX,EBX
    JO   overflow
    ADD  EAX,ECX
    JO   overflow

which is considerably less efficient. (The LEA is designed to run in one cycle.) Plus, often more registers are modified, which impedes good register allocation.

Thanks for the tip. Of course, I don't need and wouldn't use overflow checking all the time--in fact, since I've written a big system in a language that can't do overflow checking, you might say I "never need" overflow checking, in the same way that C programmers "never need" constructors, destructors, generics or exceptions, as demonstrated by the fact that they can and do build large systems without them. Still, the cost of overflow checking is a lot bigger, and requires a lot more cleverness, without compiler support. Hence I work harder to avoid the need for it.

If you desire overflows to be programming errors, then you want an abort, not a thrown exception. I am perplexed by your desire to continue execution when overflows happen regularly.

I explicitly say I want to handle overflows quickly, and you conclude that I want an unrecoverable abort? WTF! No, I think overflows should be handled efficiently, and should be nonfatal. Maybe it would be better to think in terms of the carry flag: it seems to me that a developer needs access to the carry flag in order to do 128+ bit arithmetic efficiently. I have written code to "make do" without the carry flag; it's just more efficient if it can be used. So imagine an intrinsic that gets the value of the carry flag*--obviously it wouldn't throw an exception. I just think overflow should be handled the same way. If the developer wants to react to overflow with an exception/abort, fine, but it should not be mandatory as it is in .NET.

* Yes, I know you'd usually just ADC instead of retrieving the actual value of the flag, but sometimes you do want to just get the flag. Usually when there is an overflow I just want to discard one data point and move on, or set the result to the maximum/minimum integer, possibly make a note in a log, but only occasionally do I want the debugger to break.
Dec 12 2012
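David's "making do" without the carry flag can be sketched directly: 128-bit addition built from two ulongs, recomputing by hand the carry bit that the hardware flag (and ADC) would provide for free. U128 and add128 are invented names for illustration only:

    struct U128 { ulong lo, hi; }

    U128 add128(U128 a, U128 b)
    {
        U128 r;
        r.lo = a.lo + b.lo;         // wraps modulo 2^64
        ulong carry = r.lo < a.lo;  // the bit ADC would consume for free
        r.hi = a.hi + b.hi + carry;
        return r;
    }

With carry-flag access this would compile to one ADD and one ADC; without it, the comparison and the extra add are exactly the "more efficient if it can be used" cost David mentions.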
David Piepgrass:

I do want overflow detection, yet I would not use a CheckedInt in D for the same reason I do not usually use one in C++: without compiler support, it is too expensive to detect overflow.

Here I have listed several problems in a library-defined SafeInt, but Walter has expressed willingness to introduce intrinsics, to give some compiler support, so it's a start of a solution:
http://forum.dlang.org/thread/jhkbsghxjmdrxoxaevzm@forum.dlang.org

Bye,
bearophile
Dec 11 2012
On Wednesday, 12 December 2012 at 02:44:42 UTC, Walter Bright wrote:

UDAs are a primo example of this.

OT: Why are those not allowed on module decls and local decls? We can't use UDAs on decls in unittest blocks. We can't use a UDA to mark a module reflectable, can't put an attribute on a "voldemort" type, etc. Please don't introduce arbitrary restrictions. That way you exclude many valid potential use cases, a recurring pattern that constantly pisses off D users. Features should be as general as reasonably possible. Otherwise, they *do* make us go to herculean lengths.
Dec 12 2012
On Wednesday, 12 December 2012 at 00:43:39 UTC, H. S. Teoh wrote:

On Wed, Dec 12, 2012 at 01:26:08AM +0100, foobar wrote:

I didn't say D should change the implementation of integers; in fact I said the exact opposite - that it's probably too late to change the semantics for D. Had D been designed from scratch, one could do that, or, as you suggest, go even further and have two distinct types (as in SML), which is even better. But by no means am I suggesting to change D's semantics _now_. Sadly, it's likely too late and we can only try to paper over it with additional library types. This isn't a perfect solution, since the compiler has builtin knowledge about int and does optimizations that will be lost with a library type.

On Wednesday, 12 December 2012 at 00:06:53 UTC, bearophile wrote:

I don't agree that compiler switches should change language semantics. Just because you specify a certain compiler switch, it can cause unrelated breakage in some obscure library somewhere that assumes modular arithmetic with C/C++ semantics. And this breakage will in all likelihood go *unnoticed* until your software is running on the customer's site and then it crashes horribly. And good luck debugging that, because the breakage can be very subtle, plus it's *not* in your own code, but in some obscure library code that you're not familiar with. I think a much better approach is to introduce a new type (or new types) that *does* have the requisite bounds checking and static analysis. That's what a type system is for. [...]

foobar:

I would enforce overflow and underflow checking semantics.

Plus one or two switches to disable such checking, if/when someone wants it, to regain the C performance. (Plus some syntax way to disable/enable such checking in a small piece of code). Maybe someday Walter will change his mind about this topic :-)

[...] better. (That's a self quote.) Btw, here's the link for SML, which does not use tagged ints - http://www.standardml.org/Basis/word.html#Word8:STR:SPEC "Instances of the signature WORD provide a type of unsigned integer with modular arithmetic and logical operations and conversion operations. They are also meant to give efficient access to the primitive machine word types of the underlying hardware, and support bit-level operations on integers. They are not meant to be a ``larger'' int."

It's kinda too late for D to rename int to word, say, but it's not too late to introduce a new checked int type, say 'number' or something like that (you can probably think of a better name). In fact, Andrei describes a CheckedInt type that uses operator overloading, etc., to implement an in-library solution to bounds checks. You can probably expand that into a workable lightweight int replacement. By wrapping an int in a struct with custom operators, you can pretty much have an int-sized type (with value semantics, just like "native" ints, no less!) that does what you want, instead of the usual C/C++ int semantics.

T
Dec 12 2012
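Here is a minimal sketch of the wrapper type Teoh describes. CheckedInt, its abort-on-overflow policy, and the single implemented operator are illustrative assumptions, not Andrei's actual code:

    struct CheckedInt
    {
        int value;

        CheckedInt opBinary(string op : "+")(CheckedInt rhs) const
        {
            int r = value + rhs.value; // wraps, 2's complement
            // same-sign operands with an opposite-sign result => overflow
            if ((~(value ^ rhs.value) & (value ^ r)) < 0)
                assert(0, "integer overflow");
            return CheckedInt(r);
        }
        // "-", "*", "/", opAssign, comparisons, etc. follow the same pattern.
    }

    void main()
    {
        auto a = CheckedInt(1) + CheckedInt(2);
        assert(a.value == 3);
        // CheckedInt(int.max) + CheckedInt(1); // would abort, not wrap
    }

Because it is an ordinary value-semantics struct around a single int, it is the "int-sized type" Teoh mentions; the per-operation check is the library-level cost discussed earlier.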
On 12/12/2012 2:33 AM, foobar wrote:This isn't a perfect solutions since the compiler has builtin knowledge about int and does optimizations that will be lost with a library type.See my reply to bearophile about that.
Dec 12 2012
On Wednesday, 12 December 2012 at 10:35:26 UTC, Walter Bright wrote:

On 12/12/2012 2:33 AM, foobar wrote:

This isn't a perfect solution since the compiler has builtin knowledge about int and does optimizations that will be lost with a library type.

See my reply to bearophile about that.

Yeah, just saw that :) So basically you're suggesting to implement Integer and Word library types using compiler intrinsics as a way to migrate to better ML-compatible semantics. This is a possible solution, if it can be proven to work. Regarding performance and overflow checking, the example you give is x86-specific. What about other platforms? For example, ARM is very popular nowadays in the mobile world, and there are many more smart-phones out there than there are PCs. Does the same issue exist, and if not (I suspect not, but really have no idea), should D be geared towards current platforms or future ones?
Dec 12 2012
foobar:

So basically you're suggesting to implement Integer and Word library types using compiler intrinsics as a way to migrate to better ML-compatible semantics.

I think there were no references to ML in that part of Walter's answer.

Regarding performance and overflow checking, the example you give is x86-specific. What about other platforms? For example, ARM is very popular nowadays in the mobile world, and there are many more smart-phones out there than there are PCs. Does the same issue exist, and if not (I suspect not, but really have no idea), should D be geared towards current platforms or future ones?

Currently DMD (and a bit D too) is firmly oriented toward x86, with a moderate orientation toward 64 bit too. Manu has asked for more attention toward ARM, but (as Andrei has said) maybe finishing const/immutable/shared is better now.

Bye,
bearophile
Dec 12 2012
On 12/12/2012 3:12 AM, foobar wrote:

Regarding performance and overflow checking, the example you give is x86-specific. What about other platforms? For example, ARM is very popular nowadays in the mobile world, and there are many more smart-phones out there than there are PCs. Does the same issue exist, and if not (I suspect not, but really have no idea), should D be geared towards current platforms or future ones?

I don't know the ARM instruction set.
Dec 12 2012
On Wednesday, 12 December 2012 at 21:27:35 UTC, Walter Bright wrote:

On 12/12/2012 3:12 AM, foobar wrote:

Regarding performance and overflow checking, the example you give is x86-specific. What about other platforms? [...]

I don't know the ARM instruction set.

I think it would be very hard, at this stage, to argue that you should be putting your effort into ARM rather than x86. It's a nice-to-have that doesn't seem very relevant to gaining traction for D; areas like bioinformatics seem more relevant.
Dec 12 2012
On Wednesday, 12 December 2012 at 22:36:35 UTC, ixid wrote:On Wednesday, 12 December 2012 at 21:27:35 UTC, Walter Bright wrote:http://santyhammer.blogspot.com/2012/11/something-is-changing-in-desktop.htmlOn 12/12/2012 3:12 AM, foobar wrote:I think it would be very hard, at this stage, to argue that you should be putting your effort into ARM rather than x86. It's a nice to have that doesn't seem very relevant to gaining traction for D, areas like bioinformatics seem more relevant.Regarding performance and overflow checking, the example you give is x86 specific. What about other platforms? For example ARM is very popular nowadays in the mobile world and there are many more smart-phones out there than there are PCs. Is the same issue exists and if not (I suspect not, but really have no idea) should D be geared towards current platforms or future ones?I don't know the ARM instruction set.
Dec 12 2012
On 12/11/2012 4:06 PM, bearophile wrote:

Plus one or two switches to disable such checking, if/when someone wants it, to regain the C performance. (Plus some syntax way to disable/enable such checking in a small piece of code).

1. The global switch "solution": What I hate about this was discussed earlier today in another thread. Global switches that change the semantics of the language are a disaster. It means you cannot write a piece of code and have confidence that it will behave in a certain way. It means your testing becomes a combinatorial explosion of cases - however many modules you have, you must (to be thorough) test every combination of switches across your whole project. If you have a 2-way switch and 8 modules, that's 256 test runs.

2. The checked block "solution": This is a blunt club that affects everything inside a block. What happens with template instantiations, inlined functions, and mixins, for starters? What if you want one part of the expression checked and not another? What a mess.

Maybe someday Walter will change his mind about this topic :-)

Not likely :-) What you (and anyone else) *can* do, today, is write a SafeInt struct that acts just like an int, but checks for overflow. It's very doable (one exists for C++). Write it, use it, and prove its worth. Then you'll have a far better case. Write a good one, and we'll consider it for Phobos.
Dec 11 2012
On 12/11/2012 3:44 PM, foobar wrote:

Thanks for proving my point. After all, you are a C++ developer, aren't you? :)

No, I'm an assembler programmer. I know how the machine works, and C, C++, and D map onto that, quite deliberately. It's one reason why D supports the vector types directly.

Seriously though, it _is_ a trick and a code smell.

Not to me. There is no trick or "smell" to anyone familiar with how computers work.

I'm fully aware that computers use 2's complement. I'm also aware of the fact that the type has an "unsigned" label all over it. You see it right there in that 'u' prefix of 'int'. An unsigned type should semantically entail _no sign_ in its operations. You are calling a cat a dog and arguing that dogs barf? Yeah, I completely agree with that notion, except, we are still talking about _a cat_.

Andrei and I have endlessly talked about this (he argued your side). The inevitable result is that signed and unsigned types *are* conflated in D, and have to be, otherwise many things stop working. For example, p[x]. What type is x? Integer signedness in D is not really a property of the data, it is only how one happens to interpret the data in a specific context.

To answer your question, yes, I would enforce overflow and underflow checking semantics. Any negative result assigned to an unsigned type _is_ a logic error. You can claim that: uint a = -1; is perfectly safe and has a well defined meaning (well, for C programmers that is), but what about: uint a = b - c; What if that calculation results in a negative number? What should the compiler do? Well, there are _two_ equally possible solutions: a. The overflow was intended, as in the mask = -1 case; or b. The overflow is a _bug_. The user should be made aware of this and should make the decision how to handle this. This should _not_ be implicitly handled by the compiler and allow bugs to go unnoticed.

C# deals with this by either using a checked { } block, or with a compiler switch. I don't see that as "solving" the issue in any elegant or natural way; it's more of a clumsy hack. Both of these rely on wraparound 2's complement arithmetic.

Another data point would be (S)ML, which is a compiled language that requires _explicit conversions_ and has a very strong typing system. Its programs are compiled to efficient native executables, and the strong typing allows both the compiler and the programmer better reasoning about the code. Thus programs are more correct and can be optimized by the compiler. In fact, several languages are implemented in ML because of its higher guarantees.

ML has been around for 30-40 years, and has failed to catch on.
Dec 11 2012
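For illustration, foobar's two readings of "uint a = b - c" can be written out explicitly; subChecked is a hypothetical helper, not a D primitive. Choice (a), modular wraparound, is what D does implicitly today; choice (b) is what checked semantics would enforce:

    uint subChecked(uint b, uint c)
    {
        if (b < c)                           // result would be "negative"
            assert(0, "unsigned underflow"); // choice (b): treat as a bug
        return b - c;                        // choice (a): modular result
    }

    void main()
    {
        assert(subChecked(5, 3) == 2);
        // subChecked(3, 5); // would abort instead of yielding 4294967294
    }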
Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

Several languages derive more or less directly from ML and share many of its ideas. Has Haskell caught on? :-)

Bye,
bearophile
Dec 11 2012
On 12/11/2012 5:05 PM, bearophile wrote:

Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

Several languages derive more or less directly from ML and share many of its ideas. Has Haskell caught on? :-)

Haskell is the language that everyone loves to learn and talk about, and few actually use. And it's significantly slower than D, in unfixable ways.
Dec 11 2012
On 12/12/2012 03:45 AM, Walter Bright wrote:

On 12/11/2012 5:05 PM, bearophile wrote:

Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

Several languages derive more or less directly from ML and share many of its ideas. Has Haskell caught on? :-)

Haskell is the language that everyone loves to learn and talk about, and few actually use. And it's significantly slower than D,

(Sufficiently sane) languages are not slow or fast, and I think the factor GHC/DMD cannot be more than about 2 or 3 for roughly equivalently written imperative code. Furthermore, no D implementation has any kind of useful performance for lazy functional-style D code. In some ways, D is very significantly slower than Haskell. The compilers optimize specific coding styles better than others.

in unfixable ways.

I disagree. That is certainly fixable. It is a mere QOI issue.
Dec 12 2012
On 12/12/2012 12:01 PM, Timon Gehr wrote:

That is certainly fixable. It is a mere QOI issue.

When you have a language that fundamentally disallows mutation, some algorithms are doomed to be slower. I asked Erik Meijer, one of the developers of Haskell, if the implementation does mutation "under the hood" to make things go faster. He assured me that it does not, that it follows the "no mutation" all the way.

I think the factor GHC/DMD cannot be more than about 2 or 3 for roughly equivalently written imperative code.

A factor of 2 or 3 is make or break for a large class of programs. Consider running a server farm. If you can make your code 5% faster, you need 5% fewer servers. That translates into millions of dollars.
Dec 12 2012
Walter Bright:

Consider running a server farm. If you can make your code 5% faster, you need 5% fewer servers. That translates into millions of dollars.

Two comments:
- I've seen Facebook start from PHP, go to PHP compiled in some ways, and lately start to switch to faster languages; so when you have tons of servers, the space and electricity used by CPUs become important for the bottom line. On the other hand, on similar servers lots of other people use languages with far more than your 5% overhead, such as Java. Often small performance differences are less important than several other considerations, like coding speed, how easy it is to find programmers, how cheap those programmers are, etc., even on server farms.
- If your code is buggy (because of overflows, or other causes), its output can be worthless or even harmful. This is why some people are using OCaml for high-speed trading (I have given two links in a previous post), where bugs risk being quite costly.

Bye,
bearophile
Dec 12 2012
On 12/12/2012 2:17 PM, bearophile wrote:Two comments: - I've seen Facebook start from PHP, go to PHP compiled in some ways, and lately start to switch to faster languages, so when you have tons of servers space and electricity used by CPUs becomes important for the bottom line. On the other hand on similar servers lot of other people use languages where there is far more than your 5% overhead, as Java.I know people who use Java on server farms. They are very, very, very cognizant of overhead and work like hell trying to reduce it, because reducing it drops millions of dollars right to the bottom line of profit. Java makes no attempt to detect integer overflows.Often small performance differences are not more important than several other considerations, like coding speed, how much easy is to find programmer, how cheap those programmers are, etc, even on server farms.The problem they have with C++ is it is hard to find C++ programmers, not because of overflow in C++ programs.- If your code is buggy (because of overflows, or other causes), its output can be worthless or even harmful. This is why some people are using OcaML for high-speed trading (I have given two links in a precedent post), where bugs risk being quite costly.I personally know people who write high speed trading software. These people are concerned with nanosecond delays. They write code in C++. They even hack on the compiler trying to get it to generate faster code. It doesn't surprise me a bit that some people who operate server farms use slow languages like Ruby, Python, and Perl on them. This does cost them money for extra hardware. There are always going to be businesses that have inefficient operations, poorly allocated resources, and who leave a lot of money on the table.
Dec 12 2012
Walter Bright:

Java makes no attempt to detect integer overflows.

There are various kinds of code. In some kinds of programs you want to be more sure that the result is correct, while in other kinds this need is less strong.

I personally know people who write high speed trading software. These people are concerned with nanosecond delays. They write code in C++. They even hack on the compiler trying to get it to generate faster code. It doesn't surprise me a bit that some people who operate server farms use slow languages like Ruby, Python, and Perl on them. This does cost them money for extra hardware. There are always going to be businesses that have inefficient operations, poorly allocated resources, and who leave a lot of money on the table.

One "important" firm uses OCaml for high-speed trading because it's both very fast (C++-class fast, faster than Java on certain kinds of code, if well used) and apparently quite safer to use than C/C++. And it's harder to find OCaml programmers than C++ ones.

Bye,
bearophile
Dec 12 2012
On 12/12/2012 5:32 PM, bearophile wrote:

One "important" firm uses OCaml for high-speed trading because it's both very fast (C++-class fast, faster than Java on certain kinds of code, if well used) and apparently quite safer to use than C/C++. And it's harder to find OCaml programmers than C++ ones.

Fair enough, so I challenge you to write an OCaml version of the input-sorting program I presented in this thread in Haskell and D versions.
Dec 12 2012
it's both very fast (C++-class fast, faster than Java on certain kinds of code, if well used) and apparently quite safer

Last I tried OCaml, "well used" in the context of performance meant avoiding many useful abstractions. One thing I remember is that using functors always has a run-time cost, and I don't see why it should.
Dec 13 2012
On Thursday, 13 December 2012 at 01:32:23 UTC, bearophile wrote:

One "important" firm uses OCaml for high-speed trading because it's both very fast (C++-class fast, faster than Java on certain kinds of code, if well used) and apparently quite safer to use than C/C++. And it's harder to find OCaml programmers than C++ ones.

Bye,
bearophile

According to the Benchmarks Game, the performance of OCaml is good, but not fantastic. And certainly not "C++-class" fast. It's more like "Java-class" fast (in fact it's slower than Java 7 on most tests, but uses much more memory). Unfortunately, D hasn't been on the game for a long time, but last time it was, it was effectively faster than g++. So really, we are not talking about the same kind of performance here. D is likely to be MUCH faster than OCaml.
Dec 13 2012
On 12/12/2012 10:35 PM, Walter Bright wrote:

On 12/12/2012 12:01 PM, Timon Gehr wrote:

That is certainly fixable. It is a mere QOI issue.

When you have a language that fundamentally disallows mutation,

It does not.

some algorithms are doomed to be slower.

Here's a (real) quicksort:
http://stackoverflow.com/questions/5268156/how-do-you-do-an-in-place-quicksort-in-haskell

I asked Erik Meijer, one of the developers of Haskell, if the implementation does mutation "under the hood" to make things go faster.

"Under the hood", obviously there must be mutation, as this is how the machine works.

He assured me that it does not, that it follows the "no mutation" all the way.

Maybe he misunderstood; i.e. DMD does not do this to immutable data either. E.g. Control.Monad.ST allows in-place state mutation of data types, e.g. from Data.STRef and Data.Array.ST. Such operations are sequenced, and crosstalk between multiple such 'threads' is excluded by the type system, as long as only safe operations are used. It is somewhat similar to (the still quite broken) 'pure' in D, but stronger. (E.g. it is possible to pass mutable references into the rough equivalent of 'strongly pure' code, but that code won't be able to read their values; the references can appear as part of the return type, and the caller will be able to access them again -- done using basically nothing but parametric polymorphism, which D lacks.) Eg:

    runST $ do          -- ()pure{
      x <- newSTRef 0   --   auto x = 0;
      writeSTRef x 2    --   x = 2; // mutate x
      y <- readSTRef x  --   auto y = x;
      writeSTRef x 3    --   x = 3; // mutate x
      z <- readSTRef x  --   auto z = x;
      return (y,z)      --   return tuple(y,z); }();
    (2,3)               -- tuple(2,3)

This paper describes how this is implemented in GHC (in-place mutation):
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.3299

The only reason I can see why this is not as fast as D is implementation simplicity on the compiler side. Here is some of the library code. It makes use of primitives (intrinsics):
http://www.haskell.org/ghc/docs/latest/html/libraries/base/src/GHC-ST.html#ST

I think the factor GHC/DMD cannot be more than about 2 or 3 for roughly equivalently written imperative code.

A factor of 2 or 3 is make or break for a large class of programs. Consider running a server farm. If you can make your code 5% faster, you need 5% fewer servers. That translates into millions of dollars.

Provided the code is correct.
Dec 12 2012
On 12/12/2012 3:23 PM, Timon Gehr wrote:

It is somewhat similar to (the still quite broken) 'pure' in D,

Broken how?

Provided the code is correct.

No language or compiler can prove code correct.
Dec 12 2012
On 12/13/2012 12:43 AM, Walter Bright wrote:

On 12/12/2012 3:23 PM, Timon Gehr wrote:

It is somewhat similar to (the still quite broken) 'pure' in D,

Broken how?

- There is no way to specify that a delegate is strongly pure without resorting to type deduction, because
- Member functions/local functions are handled inconsistently.
- Delegate types legally obtained from certain member functions are illegal to declare.
- 'pure' means 'weakly pure' for member functions and 'strongly pure' for local functions. Therefore it means 'weakly pure' for delegates, as those can be obtained from both.
- Delegates may break the transitivity of immutable, and by extension, shared.

A good first step in fixing up immutable/shared would be to make everything that is annotated 'error' pass, and the line annotated 'ok' should fail:

    import std.stdio;

    struct S{
        int x;
        int foo()pure{
            return x++;
        }
        int bar()immutable pure{
            // return x++; // error
            return 2;
        }
    }

    int delegate()pure s(){
        int x;
        int foo()pure{
            // return x++; // error
            return 2;
        }
        /+int bar()immutable pure{ // error
            return 2;
        }+/
        return &foo;
    }

    void main(){
        S s;
        int delegate()pure dg = &s.foo;
        // int delegate()pure immutable dg2 = &s.bar; // error
        writeln(dg(), dg(), dg(), dg()); // 0123
        immutable int delegate()pure dg3 = dg; // ok
        writeln(dg3(), dg3(), dg3(), dg3()); // 4567
        // static assert(is(typeof(cast()dg3)==int delegate() immutable pure)); // error
        auto bar = &s.bar;
        pragma(msg, typeof(bar)); // "int delegate() immutable pure"
    }

Provided the code is correct.

No language or compiler can prove code correct.

Sometimes it can. Certainly a compiler can check a user-provided proof. E.g.: http://coq.inria.fr/ A minor issue with proving code correct is of course that the proven specification might contain an error. The formal specification is often far more explicit and easier to verify manually than the program though.
Dec 12 2012
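As a toy illustration of the proof-checking point (in Lean rather than Coq, with an arbitrarily chosen property): the compiler rejects the file unless the user-supplied term really proves the stated claim.

    -- Lean 4: the elaborator checks the proof term against the claim;
    -- replacing Nat.add_comm with a wrong term makes compilation fail.
    theorem addCommExample (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b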
On 12/12/2012 5:16 PM, Timon Gehr wrote:

On 12/13/2012 12:43 AM, Walter Bright wrote:

On 12/12/2012 3:23 PM, Timon Gehr wrote:

It is somewhat similar to (the still quite broken) 'pure' in D,

Broken how?

- There is no way to specify that a delegate is strongly pure without resorting to type deduction, because [...]

Are these in bugzilla?
Dec 12 2012
On 12/13/2012 04:54 AM, Walter Bright wrote:On 12/12/2012 5:16 PM, Timon Gehr wrote:Now they certainly are. http://d.puremagic.com/issues/show_bug.cgi?id=9148 The following you can close if you think 'const' should not guarantee no mutation. It does not break other parts of the type system: http://d.puremagic.com/issues/show_bug.cgi?id=9149On 12/13/2012 12:43 AM, Walter Bright wrote:Are these in bugzilla?On 12/12/2012 3:23 PM, Timon Gehr wrote:- There is no way to specify that a delegate is strongly pure without resorting to type deduction, becauseIt is somewhat similar to (the still quite broken) 'pure' in D,Broken how?
Dec 13 2012
On 12/13/2012 4:46 AM, Timon Gehr wrote:Now they certainly are. http://d.puremagic.com/issues/show_bug.cgi?id=9148 The following you can close if you think 'const' should not guarantee no mutation. It does not break other parts of the type system: http://d.puremagic.com/issues/show_bug.cgi?id=9149Thank you.
Dec 13 2012
On 12/12/2012 3:23 PM, Timon Gehr wrote:

On 12/12/2012 10:35 PM, Walter Bright wrote:

some algorithms are doomed to be slower.

Here's a (real) quicksort:
http://stackoverflow.com/questions/5268156/how-do-you-do-an-in-place-quicksort-in-haskell

Ok, I'll bite. Here's a program in Haskell and D that reads from standard in, splits into lines, sorts the lines, and writes the result to standard out:

==============================
import Data.List
import qualified Data.ByteString.Lazy.Char8 as L
main = L.interact $ L.unlines . sort . L.lines
==============================
import std.stdio;
import std.array;
import std.algorithm;

void main() {
    stdin.byLine(KeepTerminator.yes).
        map!(a => a.idup).
        array.
        sort.
        copy(stdout.lockingTextWriter());
}
===============================

The D version runs twice as fast as the Haskell one. Note that there's nothing heroic going on with the D version - it's straightforward dumb code.
Dec 12 2012
On Wednesday, 12 December 2012 at 23:47:26 UTC, Walter Bright wrote:

On 12/12/2012 3:23 PM, Timon Gehr wrote:

Here's a (real) quicksort:
http://stackoverflow.com/questions/5268156/how-do-you-do-an-in-place-quicksort-in-haskell

Ok, I'll bite. Here's a program in Haskell and D that reads from standard in, splits into lines, sorts the lines, and writes the result to standard out:

==============================
import Data.List
import qualified Data.ByteString.Lazy.Char8 as L
main = L.interact $ L.unlines . sort . L.lines
==============================
import std.stdio;
import std.array;
import std.algorithm;

void main() {
    stdin.byLine(KeepTerminator.yes).
        map!(a => a.idup).
        array.
        sort.
        copy(stdout.lockingTextWriter());
}
===============================

The D version runs twice as fast as the Haskell one. Note that there's nothing heroic going on with the D version - it's straightforward dumb code.

You'll find a lot of traps like that in D, some that can kill your perfs. For instance:

    stdin.byLine(KeepTerminator.yes).
        map!(a => a.idup).
        filter!(a => a).
        array

And bazinga, you just doubled the number of memory allocations made.
Dec 12 2012
On 12/13/2012 12:47 AM, Walter Bright wrote:

On 12/12/2012 3:23 PM, Timon Gehr wrote:

Here's a (real) quicksort:
http://stackoverflow.com/questions/5268156/how-do-you-do-an-in-place-quicksort-in-haskell

Ok, I'll bite. Here's a program in Haskell and D that reads from standard in, splits into lines, sorts the lines, and writes the result to standard out: [...]

The D version runs twice as fast as the Haskell one.

You are testing some standard library functions that are implemented in wildly different ways in the two languages. They are not the same algorithms. For example, looking at just the first element of the sorted list will run in O(length) in Haskell. If you build a sort function with that property in D, it will be slower as well (if a rather Haskell-inspired implementation strategy is chosen, it will be a lot slower). The key difference is that the D version operates in a strict fashion on arrays, while the Haskell version operates in a lazy fashion on lazy lists. This just means that Data.List.sort is inadequate for high-performance code in case the entire contents of the list get looked at. This is a good treatment of the matter: You are using Data.List.sort. The best implementations shown there seem to be around 5 times faster. I do not know how large the I/O overhead is. Certainly, you can argue that the faster version should be in a prominent place in the standard library, but the fact that it is not does not indicate a fundamental performance problem in the Haskell language. Also, note that I am completely ignoring what kind of code is idiomatic in both languages. Fast Haskell code often looks similar to C code.

Note that there's nothing heroic going on with the D version - it's straightforward dumb code.

A significant part of the D code is spent arranging data into the right layout, while the Haskell code does nothing like that.
Dec 12 2012
On 12/12/2012 5:51 PM, Timon Gehr wrote:

A significant part of the D code is spent arranging data into the right layout, while the Haskell code does nothing like that.

So, please take the bait :-) and write a Haskell version that runs faster than the D one.
Dec 12 2012
On Thursday, 13 December 2012 at 01:51:27 UTC, Timon Gehr wrote:

Certainly, you can argue that the faster version should be in a prominent place in the standard library, but the fact that it is not does not indicate a fundamental performance problem in the Haskell language. Also, note that I am completely ignoring what kind of code is idiomatic in both languages. Fast Haskell code often looks similar to C code.

Well, you can write C code in D. You can compare top performance for both languages, but the fact is, if you write Haskell code extensively, you aren't going to write it like C, so comparing idiomatic Haskell vs idiomatic D does make sense. And comparing programs using the standard libraries also makes sense, because that's how languages are used. It probably doesn't make much sense in a microbenchmark, but in a larger program it certainly does. And if the standard library is twice as slow in implementation A as in implementation B, then most programs will feel *at least* twice as slow, and usually more, because if you call a function f that's twice as slow in A from another function that's also twice as slow in A, then the whole thing is 4 times slower.
Dec 13 2012
On Thursday, 13 December 2012 at 21:28:52 UTC, SomeDude wrote:

On Thursday, 13 December 2012 at 01:51:27 UTC, Timon Gehr wrote:

And if the standard library is twice as slow in implementation A as in implementation B, then most programs will feel *at least* twice as slow, and usually more, because if you call a function f that's twice as slow in A from another function that's also twice as slow in A, then the whole thing is 4 times slower.

Hmmm, forget about this...
Dec 13 2012
On 12/13/2012 10:28 PM, SomeDude wrote:

On Thursday, 13 December 2012 at 01:51:27 UTC, Timon Gehr wrote:

Well, you can write C code in D. You can compare top performance for both languages, but the fact is, if you write Haskell code extensively, you aren't going to write it like C, so comparing idiomatic Haskell vs idiomatic D does make sense.

Optimizing bottlenecks is idiomatic in every language.

And comparing programs using the standard libraries also makes sense, because that's how languages are used. It probably doesn't make much sense in a microbenchmark, but in a larger program it certainly does. ...

That is not what we are arguing.
Dec 14 2012
On Wednesday, 12 December 2012 at 20:01:43 UTC, Timon Gehr wrote:

On 12/12/2012 03:45 AM, Walter Bright wrote:

On 12/11/2012 5:05 PM, bearophile wrote:

Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

Several languages derive more or less directly from ML and share many of its ideas. Has Haskell caught on? :-)

Haskell is the language that everyone loves to learn and talk about, and few actually use. And it's significantly slower than D,

(Sufficiently sane) languages are not slow or fast, and I think the factor GHC/DMD cannot be more than about 2 or 3 for roughly equivalently written imperative code. Furthermore, no D implementation has any kind of useful performance for lazy functional-style D code. In some ways, D is very significantly slower than Haskell. The compilers optimize specific coding styles better than others.

in unfixable ways.

I disagree. That is certainly fixable. It is a mere QOI issue.

Actually, a factor of 2 to 3 can be huge. Consider that Java is within a factor of 2 of C++ in the Computer Language Benchmarks Game, and yet, you easily feel the difference every day in your desktop applications. But although the pure computation power is not very different, the real difference I believe lies in the memory management, which is probably far less efficient in Java than in C++.
Dec 13 2012
On 12/13/2012 09:09 PM, SomeDude wrote:

On Wednesday, 12 December 2012 at 20:01:43 UTC, Timon Gehr wrote:

Actually, a factor of 2 to 3 can be huge.

Sure.

Consider that Java is within a factor of 2 of C++ in the Computer Language Benchmarks Game, and yet, you easily feel the difference every day in your desktop applications.

Most software I use is written in C or C++. I think some of it is way too slow.

But although the pure computation power is not very different, the real difference I believe lies in the memory management, which is probably far less efficient in Java than in C++.

It still depends heavily on how well it is done in each case.
Dec 14 2012
On Wednesday, 12 December 2012 at 00:51:19 UTC, Walter Bright wrote:

On 12/11/2012 3:44 PM, foobar wrote:

Thanks for proving my point. After all, you are a C++ developer, aren't you? :)

No, I'm an assembler programmer. I know how the machine works, and C, C++, and D map onto that, quite deliberately. It's one reason why D supports the vector types directly.

I'm fully aware that computers use 2's complement. I'm also aware of the fact that the type has an "unsigned" label all over it. You see it right there in that 'u' prefix of 'int'. An unsigned type should semantically entail _no sign_ in its operations. You are calling a cat a dog and arguing that dogs barf? Yeah, I completely agree with that notion, except, we are still talking about _a cat_.

Andrei and I have endlessly talked about this (he argued your side). The inevitable result is that signed and unsigned types *are* conflated in D, and have to be, otherwise many things stop working. For example, p[x]. What type is x? Integer signedness in D is not really a property of the data, it is only how one happens to interpret the data in a specific context.

This is precisely the point: that signed and unsigned types are conflated *in D*. Other languages, namely ML, chose a different design. ML chose to have two distinct types: word and int. Word is for binary data, int for integer numbers. Words provide efficient access to the machine representation and have no overflow checks; ints represent numbers and do carry overflow checks. You can convert between the two, and the compiler/run-time can carry special knowledge about such conversions in order to provide better optimization. In ML, array indexing is done with an int, since it _is_ conceptually a number. Btw, SML was standardized in '97. I'll also dispute the claim that it hasn't caught on - there are many languages derived from it, and it is just as large, if not larger, than the C family of languages. It has influenced many languages, and it and its derivations are being used. One example that comes to mind is that the future version of JavaScript is implemented in ML. So no, not forgotten, but rather alive and kicking.

To answer your question, yes, I would enforce overflow and underflow checking semantics. Any negative result assigned to an unsigned type _is_ a logic error. You can claim that: uint a = -1; is perfectly safe and has a well defined meaning (well, for C programmers that is), but what about: uint a = b - c; What if that calculation results in a negative number? What should the compiler do? Well, there are _two_ equally possible solutions: a. The overflow was intended, as in the mask = -1 case; or b. The overflow is a _bug_. The user should be made aware of this and should make the decision how to handle this. This should _not_ be implicitly handled by the compiler and allow bugs to go unnoticed.

C# deals with this by either using a checked { } block, or with a compiler switch. I don't see that as "solving" the issue in any elegant or natural way; it's more of a clumsy hack. Both of these (like array slicing) rely on wraparound 2's complement arithmetic.

Another data point would be (S)ML, which is a compiled language that requires _explicit conversions_ and has a very strong typing system. Its programs are compiled to efficient native executables, and the strong typing allows both the compiler and the programmer better reasoning about the code. Thus programs are more correct and can be optimized by the compiler. In fact, several languages are implemented in ML because of its higher guarantees.

ML has been around for 30-40 years, and has failed to catch on.
Dec 12 2012
On 12/12/2012 2:53 AM, foobar wrote:One example that comes to mind is the future version of JavaScript is implemented in ML.Um, there are many implementations of Javascript. In fact, I have implemented it in both C++ and D.
Dec 12 2012
On Wednesday, 12 December 2012 at 21:05:05 UTC, Walter Bright wrote:

On 12/12/2012 2:53 AM, foobar wrote:

One example that comes to mind is that the future version of JavaScript is implemented in ML.

Um, there are many implementations of Javascript. In fact, I have implemented it in both C++ and D.

Completely beside the point. The discussion was about using ML in real life, not what you specifically chose to use. The fact that you use D has no bearing on your own assertion that ML is effectively dead. Fact is, _other people and organizations_ do use it to great effect. Also, I said _future version_ of JS, meaning the next version of the ECMAScript standard. ML was specifically chosen as it allows one both to be efficient and to verify the correctness of the implementation.
Dec 13 2012
On Wednesday, 12 December 2012 at 08:00:09 UTC, Walter Bright wrote:

On 12/11/2012 11:53 PM, Walter Bright wrote:

Led Zep, too. A long time ago I read some pseudo-scientific book called "Heavy Metal" (I don't remember the author) whose author claimed it is a rule: a couple of fans in the beginning, several years of desperation, and only after that - fame and fortune. Of course, the reality is much more complex.

On 12/11/2012 11:47 PM, Han wrote:

Walter Bright wrote:

Many languages wander in the wilderness for years before they catch on.

ML has been around for 30-40 years, and has failed to catch on.

Isn't D on that same historical path?

BTW, many rock bands burst out of nowhere onto the scene with instant success. Overlooked is the previous 10 years the band struggled in obscurity. This includes bands like The Beatles. Well, 6 years for The Beatles.
Dec 12 2012
On Wednesday, 12 December 2012 at 08:25:04 UTC, Han wrote:

Walter Bright wrote:

Overlooked is the previous 10 years the band struggled in obscurity.

You KNOW that D has not been "overlooked". Developers and users with applications give it a look (the former mostly) and then choose something else.

Overlooked? No. Using? No. Disliked and abandoned? No. Quite a few times I've seen on the web people saying something like: "D looks *really* nice, can't use it right now, but definitely keeping an eye on it." Same with myself. Keeping an eye on it since mid-2010.

Do you really think that D will ever have popularity to the level of The Beatles?

From what I've seen so far, I'd say that's quite possible.

Do you have a "Moses complex" (psychological) a little bit maybe?

You do understand this doesn't add anything to the discussion, right?
Dec 13 2012
On Friday, 14 December 2012 at 08:04:55 UTC, Han wrote:

Then put up a real-time tracking chart of that on the D website: "The popularity of D vs. the popularity of The Beatles". I think what you answered goes to show the level of disillusionment of (or shamefully insulting level of propagandism put forth by) the typical D fanboy.

Nice try, Mr. Troll, but no. You came here and started saying crap (sorry if it insults you) across almost 10 pages. If you really need feature "X" you can always implement it yourself and maybe even give your solution to the community; instead you are just flooding with the same template: "how can D be number one if (people/devs) there don't want to (...)?" (and no one really wants D to be number one; people want a good, usable language, right?). You call devs/people ignorant; they give you facts but you don't want to accept them, so everyone is bad? And now you are calling the whole community D fanboys. What is wrong with you, dude?

p.s. I don't want to continue this time-wasting discussion.
Dec 14 2012
Arithmetic in computers is different from the math you learned in school. It's 2's complement, and it's best to always keep that in mind when writing programs.

From http://embed.cs.utah.edu/ioc/ :

"Examples of undefined integer overflows we have reported:
- An SQLite bug
- Some problems in SafeInt
- GNU MPC
- PHP
- Firefox
- GCC
- PostgreSQL
- LLVM
- Python

We also reported bugs to BIND and OpenSSL. Most of the SPEC CPU 2006 benchmarks contain undefined overflows."

So how does D improve on C's model? If signed integers are required to wrap around in D (no undefined behaviour), you also prevent some otherwise possible optimizations (there is a reason it's still undefined behaviour in C).
Dec 12 2012
Araq:So how does D improve on C's model?There is some range analysis on shorter integral values. But overall it shares the same troubles.If signed integers are required to wrap around in D (no undefined behaviour),I think in D specs signed integers don't require the wrap-around (so it's undefined behaviour). Bye, bearophile
Dec 12 2012
On 12/12/2012 4:51 AM, Araq wrote:From http://embed.cs.utah.edu/ioc/ " Examples of undefined integer overflows we have reported: An SQLite bug Some problems in SafeInt GNU MPC PHP Firefox GCC PostgreSQL LLVM Python We also reported bugs to BIND and OpenSSL. Most of the SPEC CPU 2006 benchmarks contain undefined overflows."Thanks, this is interesting information.So how does D improve on C's model? If signed integers are required to wrap around in D (no undefined behaviour), you also prevent some otherwise possible optimizations (there is a reason it's still undefined behaviour in C).D requires 2's complement arithmetic, it does not support 1's complement as C does.
Dec 12 2012
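A sketch of the practical difference. The wraparound below is the observed 2's complement behavior on DMD/x86 (per bearophile's caveat above, whether the spec formally guarantees it is debatable), whereas a C optimizer may assume signed overflow never happens and transform the surrounding code away:

    void main()
    {
        int x = int.max;
        x = x + 1;              // wraps in practice
        assert(x == int.min);   // holds on DMD/x86; in C this is
                                // undefined behaviour
    }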
On 12/12/2012 10:25 PM, Walter Bright wrote:

On 12/12/2012 4:51 AM, Araq wrote:

So how does D improve on C's model? If signed integers are required to wrap around in D (no undefined behaviour), you also prevent some otherwise possible optimizations (there is a reason it's still undefined behaviour in C).

D requires 2's complement arithmetic, it does not support 1's complement as C does.

I think what he is talking about is that in C, if after a few steps of inlining and constant propagation you end up with something like:

    int x;
    // ...
    if (x > x+1) {
        // lots and lots of code
    } else return 0;

then a C compiler will assume that the addition does not overflow and reduce the code to 'return 0;', whereas a D compiler will not apply this optimization as it might change the semantics of valid D programs...
Dec 12 2012
On 12/12/2012 3:29 PM, Timon Gehr wrote:On 12/12/2012 10:25 PM, Walter Bright wrote:You're right in that the D optimizer does not take advantage of C "undefined behavior" in its optimizations. The article mentioned that many bugs were caused not by the actual wraparound behavior, but by aggressive C optimizers that interpreted "undefined behavior" as not having to account for those cases.On 12/12/2012 4:51 AM, Araq wrote:I think what he is talking about is that in C, if after a few steps of inlining and constant propagation you end up with something like: int x; // ... if(x>x+1) { // lots and lots of code }else return 0; Then a C compiler will assume that the addition does not overflow and reduce the code to 'return 0;', whereas a D compiler will not apply this optimization as it might change the semantics of valid D programs.... So how does D improve on C's model? If signed integers are required to wrap around in D (no undefined behaviour), you also prevent some otherwise possible optimizations (there is a reason it's still undefined behaviour in C).D requires 2's complement arithmetic, it does not support 1's complement as C does.
Dec 12 2012
Hai :D I have seen your D programming language on Google; it looks cool! How different is it from the C programming language?
Dec 12 2012
Besides, at the end of the day, a half-approach would be to have a widest-signed-integral and a widest-unsigned-integral type and only play with those two.Clarification: to have those two types as fundamental (ie: promotion-favourite) types, not the sole types in the language.
Dec 11 2012
On 12/11/2012 8:22 AM, Andrei Alexandrescu wrote:* 32-bit integers are a sweet spot for CPU architectures. There's rarely a provision for 16- or 8-bit operations; the action is at 32- or 64-bit.Requiring integer operations to all be 64 bits would be a heavy burden on 32 bit CPUs.
Dec 11 2012
Machines/hardware have an explicitly defined register size and know nothing about sign or data type. The fastest operation is unsigned and fits the register size. For example, in your case, an algorithm coded with chained if-checks may become unusable because it will be slow.

http://msdn.microsoft.com/ru-ru/library/74b4xzyw.aspx

By default it applies only to constants; for run-time expressions it must be explicitly enabled. I think this check should be handled by the developer, through a library or the compiler.
Dec 12 2012
http://msdn.microsoft.com/ru-ru/library/74b4xzyw.aspx By default it is only for constants. For expressions in runtime it must be explicitly enabled.en link: http://msdn.microsoft.com/en-us/library/74b4xzyw.aspx
Dec 12 2012
On Wednesday, 12 December 2012 at 14:39:40 UTC, Michael wrote:

Machines/hardware have an explicitly defined register size and know nothing about sign or data type. The fastest operation is unsigned and fits the register size.

Frankly, the hardware knows nothing about classes and virtual methods, either. The question is: why should the DEVELOPER know about that register size? There are many other things that the developer is unaware of (read: SSE instructions) and those are optimized behind his back. The choice to promote to the fastest type is a sweet thing for the compiler, but a burden for the developer. OTOH, I *never* asked for compulsory promotion, just mimicking it (in fact, I was not asking for anything, just addressing a question). The idea is to guarantee, by the compiler, that the final result of an integral arithmetic expression is AS IF all integrals there were promoted to some widest-integral type. Actual promotion would be made only if the compiler believes it is really necessary. In the current implementation too, speed is lost as soon as you have a long in there, as you need promotion beyond int.
Dec 12 2012
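For reference, the promotion D already performs (inherited from C) is observable directly; the "as if" guarantee proposed above would extend the same idea to a widest integral type:

    void main()
    {
        ubyte a = 200, b = 100;
        auto c = a + b;                  // both operands promoted to int
        static assert(is(typeof(c) == int));
        assert(c == 300);                // no wraparound at 8 bits
        auto d = cast(ubyte)(a + b);     // truncation must be explicit
        assert(d == 44);                 // 300 % 256
    }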
OTOH, I *never* asked for compulsory promotion, just mimicking it (in fact, I was not asking for anything, just addressing a question). The idea is to guarantee, by the compiler, that the final result of an integral arithmetic expression is AS IF all integrals there were promoted to some widest-integral type.

And the question is exactly that: what are the reasons to favor one view over another (that is, int-C over int-FPC)? Is FPC that slow? Is C that easy? Weighing the pros and cons?
Dec 12 2012
I read the whole thread and conclude that developers want a single button: "do everything I need". As mentioned above, for example, Python has arbitrary-precision ints (implemented as a C library ;)). C can be used on many platforms. For each platform, developers have solutions as libraries. The right way is creating something new instead of cutting something that exists. Each platform has its own limitations: memory, execution time, etc. It's good if different platforms can communicate with each other. Not all algorithms consume a lot of memory or need an arbitrary int. In some cases we have + or -. The good/right way is "-" -> library solution -> "+". Language features are fundamental features.
Dec 12 2012
For each platform, developers have solutions as libraries. The right way is creating something new instead of cutting something that exists.

Moving some things from the library to the language is hard and limiting, but sometimes it is worth the effort. An example: threads. C/C++ have those as an external library (not speaking about the standard library here), as pthreads, for example. This is very nice, but it limits the degree to which the compiler is able to optimize and to check the correctness of code, since the very notion/concept of a thread is alien to it. Such a library can be optimized with respect to one compiler, at most. Take Java or D: here, threads are part of the language/standard library. The compiler *knows* about them and about what it can do with them. There is a trade-off. (This issue is a bit off-topic, but it shows why it is important that some things be *standard*.)
Dec 12 2012
Threads (etc.) are a high-level abstraction that requires support by hardware/software/instruction set. If necessary, a library can be integrated into the language. And that's another question, about design.
Dec 12 2012
Threads (etc.) are a high-level abstraction that requires support by hardware/software/instruction set.

Not only. First of all, it requires that the compiler *knows* and *understands* the concept of a thread. This is why C mimicking C++ will *never* get as fast as a true C++ compiler, for the latter *knows* what a class is and what to expect from it, what the goals and the uses of such a concept are. The same stands for any other thing. The idea is: conceptualization. A compiler that does not know what a class is will only partially optimize, if at all. It is a blind compiler.
Dec 12 2012
On Wednesday, 12 December 2012 at 21:51:00 UTC, Michael wrote:

Threads (etc.) are a high-level abstraction that requires support by hardware/software/instruction set.

And you can happily do multi-threading on a single processor, with no parallelization and so on. It is just time-slicing. This could be implemented at many levels: at the hardware level, at the OS level, but also at the compiler level (through a runtime).
Dec 12 2012