digitalmars.D - Bits again. A proposal.
- larrycowan (73/73) Oct 14 2004 One more time...
- Anders F Björklund (18/29) Oct 14 2004 Note: This D suggestion has been rejected by Walter.
- larry cowan (15/30) Oct 14 2004 I know that. Not just once, many times. Still, too many people have the...
- Anders F Björklund (13/19) Oct 14 2004 Well, perhaps not. But it does make it familiar to many ?
- David L. Davis (6/8) Oct 14 2004 No, I think it was Al Gore who raised his hand first and took the credit...
- larrycowan (14/33) Oct 14 2004 So is the preprocessor.
- Sjoerd van Leent (4/14) Oct 14 2004 As a matter of fact, while MS was doing a presentation on Win95, they
- Ben Hinkle (15/20) Oct 14 2004 These definitions can go in object.d where "alias bit bool" already exis...
- Anders F Björklund (18/41) Oct 14 2004 I think they should go together...
- Sean Kelly (9/19) Oct 14 2004 This is a bit weird now that I think about it. If "bit" is supposed to
- Anders F Björklund (6/11) Oct 14 2004 This is not always a good thing. Sometimes "char" or "int" are
- Sean Kelly (11/21) Oct 14 2004 It seems that one design aspect of D is that all primitive types have
- larrycowan (32/58) Oct 14 2004 What? I can't think of any logic instructions using more cpu cycles tha...
- Sean Kelly (19/46) Oct 14 2004 This was the alternative suggestion. Move to "bool" which is always one...
- Anders F Björklund (11/20) Oct 14 2004 I just know that in my GCC 3.4, sizeof(bool) equals sizeof(int) ?
- Anders F Björklund (27/36) Oct 15 2004 More bit / bool inconsistencies:
- Charles Hixson (8/14) Oct 15 2004 But currently D's type IS bit. bool is an alias for convenience
- Anders F Björklund (19/29) Oct 16 2004 And I think that is just plain wrong. It should be the other way around.
- Charles Hixson (23/39) Oct 17 2004 Ah. To me NEITHER is an integer type.
- Anders F Björklund (9/14) Oct 17 2004 Back at the gates and voltages, it seems... (feels like a kid again)
- Charles Hixson (10/30) Oct 17 2004 Well, in Ada one could define a type (as opposed to a sub-type) and
- Derek Parnell (9/18) Oct 17 2004 Come to think of it, the only time I really use bits is when I'm mapping
- Anders F Björklund (28/34) Oct 16 2004 I'm not sure how I would use that ?
- Ben Hinkle (9/32) Oct 16 2004 It works fine if you replace
- Anders F Björklund (4/12) Oct 16 2004 Argh, you are right... Thinking in C, I guess. (or not at all)
- Sean Kelly (4/6) Oct 16 2004 They are on x86 hardware anyway. I would be surprised if this were
- Anders F Björklund (6/10) Oct 17 2004 This was on a big-endian machine... But I only meant within the
- Ben Hinkle (2/15) Oct 16 2004 Looks like bit arrays are packed into ints not bytes - so the sizeof wil...
- Anders F Björklund (11/28) Oct 16 2004 Not at all, I guess...
- Anders F Björklund (23/63) Oct 17 2004 It wasn't obvious to me what the sizes were,
- Sean Kelly (5/11) Oct 17 2004 Yup. There have been some proposals for addressing this issue (no pun
- Andy Friesen (13/16) Oct 14 2004 bit should probably be done away with entirely: the only thing that
- Charles Hixson (16/38) Oct 14 2004 ?????
- Andy Friesen (8/28) Oct 15 2004 I wasn't arguing that bitsets should be eradicated from existence. I
One more time...

C has it wrong in several ways. I have programmed C for over 20 years.

Bools are bits, but bits are not only boolean. Arithmetics, characters, and addresses are not boolean. Comparisons produce a boolean value. Bits are 1:0, true:false, yes:no, flags, switches, checkmarks, ...

Bits are not useful as numbers for arithmetic operations. Their arrays should not automatically become so. I have no problem with numbers and characters being subject to logic operations, but bits are not directly useful for arithmetic operations. Perhaps shifting of bit arrays could be useful when used as a rotating mask or some such. We have that with slicing of arrays as well as selection out of contiguous groups of flags or switches.

Comparative operations result in a true:false value entirely interpretable as a bit. They should be valid in the right hand side of an assignment or initialization for bits, but not numerics.

Casting of a bit array as a numeric to be able to compare it to a number does make some sense and could be implicit. Assembling a number in a bit array and then casting it is less useful, but maybe. We do have logic operations directly on numerics and characters to do anything that may be needed there. Explicit casting of a numeric or character as a bit array is probably necessary for analytic purposes, but should never be implicit. Explicit casting of a bit array as a character or numeric should probably be handled by unions.

I propose the following:

1. All usage of numerics, characters, and addresses as boolean values should be rejected by the compiler. The need for === as well as == should blow away use of addresses, and the evaluation of true as "anything not composed fully of zero bits" is just nasty.

2. All usage of bits and bit arrays for numeric operations (except comparison as allowed below) should be rejected by the compiler. Logic operations suffice and are more descriptive and are well-defined.

3. Comparisons between numbers and characters with bits and bit arrays are well-defined and should be supported with implicit casting. High order 0 bits should be assumed as necessary for compatible unit sizes.

4. No other implicit casts should be injected back and forth.

5. Shift operations on bit arrays are unnecessary and should not be supported except by slicing and concatenating. The compiler should be able to optimize down to minimal operations when appropriate.

6. There should be no need for compiler requirements of either packed or not-packed bit arrays or arrays of bit arrays. Structs and arrays of structs can fully describe anything there (see below). Implementation should be left to the compiler vendor for this. It's tied too closely to optimization.

7. Structs with successive bits and bit arrays should be packable down to minimum size|width. Offsetting can be accomplished explicitly with dummy bits. Bytes, shorts, ints, chars, etc. should not intrude - they have their own boundary considerations which should still be met with minimum of modulo-8 bit addresses. It would be nice if placeholder bits and bytes did not require names, but that's just syntactic sugar.

8. Unions which include bits and|or bit arrays can fill in the rest, and probably should be used rather than casting bit arrays elsewhere anyway. It therefore would be possible to disallow even explicit casting back and forth between logic and other representations, but I'm not proposing this.

9. Add on:off as keywords identical to true:false. Their valuation as 1:0 is consistent with most other languages and programmatic use (though often allowed as any:0 instead), so keep that.

10. Fix that C/C++ code. It won't hurt to check the logic anyway - it may not be exactly the same when pointers and references are considered properly. Both while(1) {} and for(;;) {} can easily be changed to while(true) {} and be more safely readable. [ We have "foreach" now, we could add "forever". I like the "forever break;" statement as a no-op. ] We have already blown away "for(...;...;...) ;" for good reasons, so there's some precedent for changes which require actually looking at C code (if you can find any without preprocessor use) before D compiling it.

11. Let's please take at least this step toward making type safety possible.

-------------------

Another thread should take up the ignorance of C about arithmetic and logical overflows and underflows and propose something there. (Not necessarily unaware, but no provision for access and handling.) The hardware knows, why can't the programmer? ints, uints, oints, ouints? Smart operations? An _ or a standard variable? This gets tangled with the problems of actual vs. potential lossy casts as well. And don't tell me to use assembler when I want this...

Do not reply to this in this thread, please.

--------------------
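Point 5 of the proposal says a shift or rotate of a bit array can be expressed as slicing and concatenating. A minimal sketch of that idea, written in C rather than D so it compiles without the 2004-era `bit` type. It assumes one flag per `unsigned char`, and `rotate_left` is a hypothetical helper name, not a library function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Rotate a flag array left by k positions, written as "slice and
   concatenate": out = in[k..n] followed by in[0..k].
   One flag per byte; rotate_left is an illustrative name only. */
void rotate_left(const unsigned char *in, unsigned char *out,
                 size_t n, size_t k)
{
    assert(in != out);              /* this sketch does not rotate in place */
    if (n == 0)
        return;
    k %= n;
    memcpy(out, in + k, n - k);     /* the slice in[k..n] goes first */
    memcpy(out + (n - k), in, k);   /* then the slice in[0..k]       */
}
```

With this layout no shift instruction appears in the source at all; whether a compiler could turn the same slicing into word-level shifts for a packed array is the optimization question the proposal leaves to the vendor.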
Oct 14 2004
larrycowan wrote:

> One more time...

New thread, old discussion...

> I propose the following:
> 1. All usage of numerics, characters, and addresses as boolean values should be rejected by the compiler. The need for === as well as == should blow away use of addresses, and the evaluation of true as "anything not composed fully of zero bits" is just nasty.

Note: This D suggestion has been rejected by Walter. D currently follows the same "nasty" rules as C/C++.

> 9. Add on:off as keywords identical to true:false. Their valuation as 1:0 is consistent with most other languages and programmatic use (though often allowed as any:0 instead), so keep that.

Hear, hear! This is more or less the same thing as I suggested earlier, keywords to the equivalent effect of the following #defines:

    const bit on = 1;
    const bit off = 0;

Preferably, a "bool" type (like C++) can be added to D? Barring that, the same kludge as in C99 would have to do:

    module std.stdbool;
    alias bit bool;
    const bool true = 1;
    const bool false = 0;

> 11. Let's please take at least this step toward making type safety possible.

The first step would be *having* two different types for bits and bools?

--anders
Oct 14 2004
In article <ckm861$2nng$1 digitaldaemon.com>, Anders F Björklund says...

> larrycowan wrote:
>> One more time...
> New thread, old discussion...
>> I propose the following:
>> 1. All usage of numerics, characters, and addresses as boolean values should be rejected by the compiler. The need for === as well as == should blow away use of addresses, and the evaluation of true as "anything not composed fully of zero bits" is just nasty.
> Note: This D suggestion has been rejected by Walter.

I know that. Not just once, many times. Still, too many people have the opinion that this should change.

> D currently follows the same "nasty" rules as C/C++.

So? Does that make it good?

>> 11. Let's please take at least this step toward making type safety possible.
> The first step would be *having* two different types for bits and bools?
> --anders

What justification is there for a bit type that is not only the boolean? Are you sure you don't want signed and unsigned bits? You could use the signed bit in a union with the high bit of an unsigned int to pretend it was signed and invent your own arithmetic. And an unsigned bit similarly to see if you were vulnerable to overflows... No - we don't need bits for arithmetic! Or to build utf-13 character sets with!

My proposal does not give us type safety (though it would greatly help), but it surely does make it less disruptive to provide it later.
Oct 14 2004
larry cowan wrote:

>> D currently follows the same "nasty" rules as C/C++.
> So? Does that make it good?

Well, perhaps not. But it does make it familiar to many ? Just like you, I had hoped for D to improve upon this - like it improves on other things that are "old" in C.

Isn't it? Right after they invented the Internet ? :-)

"as in Java", but they chose the "boolean" keyword instead ? It's not convertible to the other types, like int and such.

I wrote a long essay about the now somewhat boring subject in:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/11757

--anders
Oct 14 2004
Anders wrote:

> Isn't it? Right after they invented the Internet ? :-)

No, I think it was Al Gore who raised his hand first and took the credit for inventing the Internet...soon after he learned how to spell "potato." :) Sorry, I just couldn't help myself.

David L.
-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
Oct 14 2004
In article <ckmauc$2qpk$1 digitaldaemon.com>, Anders F Björklund says...

> larry cowan wrote:
>>> D currently follows the same "nasty" rules as C/C++.
>> So? Does that make it good?
> Well, perhaps not. But it does make it familiar to many ?

So is the preprocessor.

> Just like you, I had hoped for D to improve upon this - like it improves on other things that are "old" in C.
> Isn't it? Right after they invented the Internet ? :-)

No, I invented it. I just couldn't spare the time to implement it.

> "as in Java", but they chose the "boolean" keyword instead ? It's not convertible to the other types, like int and such.

A boolean class would provide whatever we wanted however we wanted it, but most likely not as efficiently. But that doesn't kill the bad C'isms we are carrying forward in regard to expression defaults based on "any nonzero is true" concepts. I don't even want "bit[] xa; ...; if (xa) {}" to be valid as "if any bit in xa is non-zero, then...".

> I wrote a long essay about the now somewhat boring subject in:
> http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/11757
> --anders

I read it (and carried forward copies of it). I have been around since last Feb, and have written a good bit of D code. I just don't post much.

Did you know D code can randomly deal and find the winner in over 20,000 7-player hands of Texas Hold'em per second keeping stats? (my 700 MHz W2K laptop) Makes Monte Carlo a fun thing. 5 times that fast to deal and evaluate 1-player hands.
Oct 14 2004
Anders F Björklund wrote:

> larry cowan wrote:
>
> Isn't it? Right after they invented the Internet ? :-)

As a matter of fact, while MS was doing a presentation on Win95, they clearly stated that they didn't believe in the Internet. So they sure did *NOT* invent it!
Oct 14 2004
[snip]

> Barring that, the same kludge as in C99 would have to do:
>
>     module std.stdbool;
>     alias bit bool;
>     const bool true = 1;
>     const bool false = 0;

These definitions can go in object.d where "alias bit bool" already exists. That way "true" and "false" can be removed as keywords and the code in src/dmd/parse.d

    case TOKtrue:
        e = new IntegerExp(loc, 1, Type::tbit);
        nextToken();
        break;

(similarly for TOKfalse) can be removed. If someone redefines true then the old true can be obtained by cast(bit)1.

The only reason I can see for keeping true/false as keywords is to help editors that do syntax highlighting of keywords. But that is a pretty weak reason. Another possible reason is that it makes the strict boolean people a tad happier to have true/false as keywords.

[snip]
Oct 14 2004
Ben Hinkle wrote:

> The only reason I can see for keeping true/false as keywords is to help editors that do syntax highlighting of keywords. But that is a pretty weak reason. Another possible reason is that it makes the strict boolean people a tad happier to have true/false as keywords.

I think they should go together... Either bool/true/false are *all* keywords (as C++), or none are (as C). And since I think they're good, best would be to add the missing bool ?

The strange part is that "bit" does have some pseudo-boolean qualities. For instance, it behaves strangely when cast from a bigger size integer:

    void main()
    {
        uint i = cast(uint) 0xFFFFFFFF00000000;
        short s = cast(ushort) 0xFFFF0000;
        ubyte b = cast(ubyte) 0xFF00;
        bit t = cast(bit) 0xFE;
        printf("%ld\n", i);
        printf("%d\n", s);
        printf("%d\n", b);
        printf("%d\n", t ? 1 : 0);
    }

Also shown by:

    void main()
    {
        bit b;
        for (int i = -2; i <= +2; i++)
        {
            b = cast(bit) i;
            printf("%+d = %.*s\n", i, b ? "true" : "false");
        }
    }

which prints:

    -2 = true
    -1 = true
    +0 = false
    +1 = true
    +2 = true

That sounds like boolean (i != 0) and not like a bit (i & 1) to me!

This, and the fact that "true" and "false" were of the bit type makes me think that the built-in D type bit really is our long lost *bool* type ? And that bit isn't an integer type at all, but instead a boolean type... Which makes it even more puzzling why the name "bit" was chosen for it ?

--anders
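The difference between the two conversions named in this post, boolean (i != 0) and one-bit integer (i & 1), can be put side by side. A small sketch in C99, used here because its `bool` conversion is standardized as "nonzero becomes 1"; `as_bool`, `low_bit` and `check_table` are names made up for the illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* C99 conversion to bool: any nonzero value becomes true (i != 0). */
bool as_bool(int i)
{
    return (bool)i;
}

/* What a one-bit integer would keep: only the lowest bit (i & 1). */
unsigned low_bit(int i)
{
    return (unsigned)i & 1u;
}

/* The -2..+2 table from the post, checked against the boolean rule. */
void check_table(void)
{
    for (int i = -2; i <= 2; i++)
        assert(as_bool(i) == (i != 0));
}
```

The two functions disagree exactly on the nonzero even values: `as_bool(2)` is true while `low_bit(2)` is 0, which is the behaviour the post observed for `cast(bit)`.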
Oct 14 2004
In article <ckmqnk$8c8$1 digitaldaemon.com>, Anders F Björklund says...

> -2 = true
> -1 = true
> +0 = false
> +1 = true
> +2 = true
>
> That sounds like boolean (i != 0) and not like a bit (i & 1) to me!

This is a bit weird now that I think about it. If "bit" is supposed to represent a bit, then shouldn't 2 be false, 3 be true, etc? I grant that this would likely be a significant source of bugs, but logic suggests that bit should behave the same as any other unsigned integer type.

> This, and the fact that "true" and "false" were of the bit type makes me think that the built-in D type bit really is our long lost *bool* type ? And that bit isn't an integer type at all, but instead a boolean type... Which makes it even more puzzling why the name "bit" was chosen for it ?

Just about. The only reason I can think to call it "bit" is that it implies a storage size (which is accurate in arrays).

Sean
Oct 14 2004
Sean Kelly wrote:

>> And that bit isn't an integer type at all, but instead a boolean type... Which makes it even more puzzling why the name "bit" was chosen for it ?
> Just about. The only reason I can think to call it "bit" is that it implies a storage size (which is accurate in arrays).

This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons. If "bool" was kept as an *abstract* concept, the compiler could then choose a representation that was optimal for the actual task ?

--anders
Oct 14 2004
In article <ckmv1a$dck$1 digitaldaemon.com>, Anders F Björklund says...

> Sean Kelly wrote:
>>> And that bit isn't an integer type at all, but instead a boolean type... Which makes it even more puzzling why the name "bit" was chosen for it ?
>> Just about. The only reason I can think to call it "bit" is that it implies a storage size (which is accurate in arrays).
> This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons. If "bool" was kept as an *abstract* concept, the compiler could then choose a representation that was optimal for the actual task ?

It seems that one design aspect of D is that all primitive types have well-defined storage attributes. byte is 8 bits, int is 32 bits, etc. C/C++ make no such claims for any type. While part of me does wonder if this is going to be a problem at some point for D (there are some rare systems where a byte is not 8 bits), I think the overall benefit is a good one. And as I said in the other thread, if packing of bits in arrays were removed as a feature in D, then the token name should change to "bool." This is the principal difference in my mind.

Sean
Oct 14 2004
In article <ckn0ii$era$1 digitaldaemon.com>, Sean Kelly says...

> In article <ckmv1a$dck$1 digitaldaemon.com>, Anders F Björklund says...
>> Sean Kelly wrote:
>>>> And that bit isn't an integer type at all, but instead a boolean type... Which makes it even more puzzling why the name "bit" was chosen for it ?
>>> Just about. The only reason I can think to call it "bit" is that it implies a storage size (which is accurate in arrays).
>> This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons.

What? I can't think of any logic instructions using more cpu cycles than any equivalent arithmetics. If you are moving them, and they are already in groups of 8 bits or 2^n multiples of that, the compiler can optimize the same way you would write the code yourself. If you are doing it so the code looks simpler, that's not performance. If bits are properly supported in arrays, the compiler will hide this complexity and it will be faster than using chars or ints in place of bits.

>> If "bool" was kept as an *abstract* concept, the compiler could then choose a representation that was optimal for the actual task ?
> It seems that one design aspect of D is that all primitive types have well-defined storage attributes. byte is 8 bits, int is 32 bits, etc. C/C++ make no such claims for any type. While part of me does wonder if this is going to be a problem at some point for D (there are some rare systems where a byte is not 8 bits),

1, 4, 9, 10, 12, 16, 18, ... and IBM 9000 series 10-digit words with no bits. I think we can dismiss these for our purposes - they are either obsolete or special purpose cpus, or are likely to have too small a memory to remember D, or C, (tiny C? - possibly).

> I think the overall benefit is a good one. And as I said in the other thread, if packing of bits in arrays were removed as a feature in D, then the token name should change to "bool." This is the principal difference in my mind.
> Sean

What is wrong with packing of bits in arrays? Please explain. Flags are essentially boolean units and are quite commonly stored 8 to a byte. If you are saying they should have an actual size which includes many unused bits, it seems to me that this must be because addressing of these is not unique without a bit number in a byte. Slicing could also require shifts, perhaps of more than a register size.

To get a set of 16 contiguous flags I should have to have an array of two packed structs with 8 individual bits and 8 different bit-position names? At this point I'd just do away with bit arrays entirely. (I wouldn't ever do this, I would just use logic ops to handle the bits individually and forget bit arrays.)

I can't ever see a need to have 1 bit per byte bit arrays rather than byte arrays using only the low order bit.

There are machine instructions which test or set individual bits. The only problem I see is consistency in addressing between bits and all larger uniquely addressable units (using only the address field of instructions) such as bytes, chars, longs, etc. This cannot be done away with the way you apparently want without losing more than we gain.

All the above is not to say that there aren't problems in the way our compiler and other C-derived compilers handle bits and bit arrays. Addressing has always been a kludge. Provide a good, self-consistent, understandable and complete solution and the world will thank you.
Oct 14 2004
larrycowan wrote:

> In article <ckn0ii$era$1 digitaldaemon.com>, Sean Kelly says...
>> I think the overall benefit is a good one. And as I said in the other thread, if packing of bits in arrays were removed as a feature in D, then the token name should change to "bool." This is the principal difference in my mind.
> What is wrong with packing of bits in arrays? Please explain. Flags are essentially boolean units and are quite commonly stored 8 to a byte. If you are saying they should have an actual size which includes many unused bits, it seems to me that this must be because addressing of these is not unique without a bit number in a byte. Slicing could also require shifts, perhaps of more than a register size.

These are the reasons.

> To get a set of 16 contiguous flags I should have to have an array of two packed structs with 8 individual bits and 8 different bit-position names? At this point I'd just do away with bit arrays entirely. (I wouldn't ever do this, I would just use logic ops to handle the bits individually and forget bit arrays.)

This was the alternative suggestion. Move to "bool" which is always one byte and let a library class handle the packing when needed.

> I can't ever see a need to have 1 bit per byte bit arrays rather than byte arrays using only the low order bit.

Agreed.

> There are machine instructions which test or set individual bits, the only problem I see is addressing consistency in the addressing between bits and all larger uniquely addressible (using only the address field of instructions) units such as bytes, chars, longs, etc. This cannot be done away with the way you apparently want without losing more than we gain.

Thus the quandary. I personally don't have any strong preference for either side of the issue. Packed bit arrays have the potential to make (robust) template code more difficult to write, they prohibit addressing elements in bit arrays, and they impose restrictions on slicing. At the same time, they are a convenient feature, it does make logical sense to represent a two-state value as a single bit when possible, and it isn't particularly difficult for a programmer to work around the problems (I already do this quite effortlessly with vector<bool>). But as I said in the other thread: from an idealistic perspective, is the tradeoff worthwhile? Walter certainly thinks it is. I'm undecided. Others don't like it one bit :)

> All the above is not to say that there aren't problems in the way our compiler and other C-derived compilers handle bits and bit arrays. Addressing has always been a kludge. Provide a good, self-consistent, understandable and complete solution and the world will thank you.

Definitely. If there were a straightforward and robust way to address these few problems, I would be quite happy.

Sean
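The addressing problem raised in this exchange is easier to see with a packed array written out by hand. A sketch in C, assuming 32 flags per `uint32_t` word with the least significant bit first; that layout, and the names `bit_get` and `bit_set`, are assumptions for the illustration, not a description of how the D compiler stores bit arrays:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 32u

/* Read bit number idx from an array packed 32 bits per word. */
int bit_get(const uint32_t *words, size_t idx)
{
    return (int)((words[idx / WORD_BITS] >> (idx % WORD_BITS)) & 1u);
}

/* Write bit number idx; value is treated as a boolean (nonzero = 1). */
void bit_set(uint32_t *words, size_t idx, int value)
{
    uint32_t mask = (uint32_t)1 << (idx % WORD_BITS);
    assert(words != NULL);
    if (value)
        words[idx / WORD_BITS] |= mask;
    else
        words[idx / WORD_BITS] &= ~mask;
}
```

Every element is named by a pair (word index, bit offset), so there is no plain pointer that can refer to one element, and a slice that does not start on a word boundary has to be shifted into place. Those are the restrictions on addressing and slicing that the post weighs against the space savings.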
Oct 14 2004
larry cowan wrote:

>> This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons.
> What? I can't think of any logic instructions using more cpu cycles than any equivalent arithmetics. If moving them, and they are already in 8-bit or 2^n multiples of this groups, the compiler can optimize the same way you want to write code to do. If you are doing it so the code looks simpler that's not performance. If bits are properly supported in arrays, the compiler will hide this complexity and it will be faster than using char or ints in place of bits.

I just know that in my GCC 3.4, sizeof(bool) equals sizeof(int) ? In C99/C++, the compiler can choose any representation of a "bool". (at least they standardized on a common name for it, in <stdbool.h>)

Just like an "int" is allowed to be anywhere between short and long, although a lot of new code just ignores 16-bit computers and then breaks if sizeof(int) is not the same as sizeof(long). Thus the <stdint.h>.

That being said, a "bit" looks like a perfect choice for bool if only it can have the pointer to it taken and be implemented reasonably sanely. But if it walks like a bool and quacks like a bool, why not name it bool?

--anders
Oct 14 2004
I earlier wrote:

> That being said, a "bit" looks like a perfect choice for bool if only it can have the pointer to it taken and be implemented reasonably sanely. But if it walks like a bool and quacks like a bool, why not name it bool?

More bit / bool inconsistencies:

http://www.digitalmars.com/d/htomodule.html

> A little global search and replace will take care of renaming the C types to D types. The following table shows a typical mapping for 32 bit C code:
>
>     C type    D type
>     [...]
>     bool      int

http://www.digitalmars.com/d/ctod.html

> C to D types
>
>     bool => bit

So at one place, bool is an integer. Of 32-bits, none the less. Hoping that the C compiler chose int for bool, and not char... (the first page should probably read: bool => char or int, just like wchar_t currently says: wchar_t => wchar or dchar)

In the other, bool is a bit (which now has a "boolean" cast operator). So the current "bit" type is definitely a bool, being 1 bit in size. (Ignoring whether or not it's a good thing that it converts to int) Why the type name was changed to reflect the storage is still unclear?

A questionable feature is whether a language *needs* any sub-byte integer types, such as e.g. bits (1 bit) and nybbles (4 bits)... ? Possibly because they could (potentially) be useful in packed arrays to avoid having to use any bit-operators (the nybble macros are nasty).

But D's practice of calling the boolean type "bit" is *not good*. The sooner it can be changed, the better! It could *work* the same.

Here is one idea: rename the D keyword from "bit" back to bool again. And then change the definition in object.d to read: "alias bool bit;"

Then all we need is some bool type-safety... Which could be added now, and checked later ? (i.e. start converting code, hope for D 2.0)

--anders

PS. Does anyone have any good usages of bit arrays they like to share? (and I am *not* talking about bool[] arrays, like in "sieve.d")
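For readers who have not had to write them, this is roughly what the hand-rolled nybble access mentioned above looks like. A sketch in C using functions instead of macros, assuming two 4-bit values per byte with even indices in the low half; the layout and the names `nybble_get` and `nybble_set` are choices made for the illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read nybble idx from a byte array holding two 4-bit values per byte
   (even indices in the low half, odd indices in the high half). */
unsigned nybble_get(const uint8_t *bytes, size_t idx)
{
    unsigned shift = (idx & 1u) ? 4u : 0u;
    return (bytes[idx / 2] >> shift) & 0x0Fu;
}

/* Overwrite nybble idx, leaving the other half of the byte untouched. */
void nybble_set(uint8_t *bytes, size_t idx, unsigned value)
{
    unsigned shift = (idx & 1u) ? 4u : 0u;
    assert(value <= 0x0Fu);
    bytes[idx / 2] = (uint8_t)((bytes[idx / 2] & ~(0x0Fu << shift))
                               | (value << shift));
}
```

A built-in packed sub-byte array type would let the compiler generate this masking and shifting, which is the convenience argument for having such types at all.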
Oct 15 2004
Anders F Björklund wrote:

> ...
> But D's practice of calling the boolean type "bit" is *not good*. The sooner it can be changed, the better! It could *work* the same.
> ...

But currently D's type IS bit. bool is an alias for convenience only. And currently bit arrays are packed, and thus bit[8] is equivalent to a bit addressable byte. This can be quite useful, but since I don't know your boundaries about how you think of bit and how you think of bool, I can't claim that there isn't an overlap, but to my mind, if you care how it's packed, then it's a bit type, otherwise, bool is probably a decent label.
Oct 15 2004
Charles Hixson wrote:

>> But D's practice of calling the boolean type "bit" is *not good*. The sooner it can be changed, the better! It could *work* the same.
> But currently D's type IS bit. bool is an alias for convenience only.

And I think that is just plain wrong. It should be the other way around. If it's all about storage, then we might just as well do away with all the character types too ? char => ubyte, wchar => ushort, dchar => uint

No, my request is for "bool" to be a proper language keyword in D.

> And currently bit arrays are packed, and thus bit[8] is equivalent to a bit addressable byte. This can be quite useful, but since I don't know your boundaries about how you think of bit and how you think of bool, I can't claim that there isn't an overlap, but to my mind, if you care how it's packed, then it's a bit type, otherwise, bool is probably a decent label.

I think that using a single bit for a boolean is an elegant solution, even if it does have a lot of pain implied on the implementation front...

As for my own "boundaries", I happen to think that:

- "bool" is a boolean type, that can have one of values "true" or "false"
  when you assign an integer to a bool, the end result is: b = (i != 0)

- "bit" is an integer type, size 1 bit, that can contain numbers 1 and 0
  when you assign an integer to a bit, the end result is: b = (i & 1)

Some people (me included) think that integers and booleans should not be assignable at all, but that's another discussion... (about type-safety)

And even if you use a "char" or an "int" to store values of type "bool", in the end it can only hold two result values: zero and non-zero... :-)

Currently D has a boolean type, called "bit". And *that's* confusing. (*especially* for all C99 and C++ programmers that are used to "bool")

--anders
Oct 16 2004
Anders F Björklund wrote:

> Charles Hixson wrote:
>> ...
> I think that using a single bit for a boolean is an elegant solution, even if does have a lot of pain implied on the implementation front...
>
> As for my own "boundaries", I happen to think that:
> - "bool" is a boolean type, that can have one of values "true" or "false"
>   when you assign an integer to a bool, the end result is: b = (i != 0)
> - "bit" is an integer type, size 1 bit, that can contain numbers 1 and 0
>   when you assign an integer to a bit, the end result is: b = (i & 1)
> ...
> --anders

Ah. To me NEITHER is an integer type.

Bool comes out of Boolean Logic, and is either true or false. Bit comes out of Information Theory, and is the smallest individual piece of information (and thus, strictly speaking, it should be typeless). But historically arrays of flip-flops (hardwired bit representations) were ganged together to form addressable hunks of memory that were called words (I'm leaving out a lot of steps) and the words were sub-divided into characters...usually with a 6-bit/character code. IBM later changed this into a 32 bit word and 8 bit byte. But do notice that bytes are being built out of bits.

So historically, there is an affinity between bits and pieces of bytes, also pieces of numbers. The bit itself is typeless. Having a type of bit is thus slightly anomalous, but not much so if you can use arrays of them to build bytes and words. (That you do this via a union is a bit peculiar, but not overwhelmingly so.)

So to me neither one is an integer type. It would probably be quite reasonable if D so defined them. But bit arrays should be the packed chunks of "smallest piece of information", and have other types that are buildable from them. (I.e., bit[8] should have a to_char method, bit[16] should have a to_ushort method [possibly also a to_short method].) I know that this isn't expressible within the normal context of D, but conceptually, that's how it *OUGHT* to be.
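The idea of building a byte or a short out of an array of typeless bits can be sketched as an explicit conversion function. The following is C, with one bit per `unsigned char` and element 0 taken as the least significant bit; both of those choices and the name `bits_to_uint` are assumptions for the illustration, standing in for the to_char / to_ushort methods suggested above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble n bits (n <= 16) into an unsigned value, bits[0] being the
   least significant. Plays the role of to_char / to_ushort. */
uint16_t bits_to_uint(const unsigned char *bits, size_t n)
{
    uint16_t value = 0;
    assert(n <= 16);
    for (size_t i = 0; i < n; i++) {
        assert(bits[i] <= 1);           /* each element holds one bit */
        value = (uint16_t)(value | (bits[i] << i));
    }
    return value;
}
```

Keeping the conversion in a named function makes the step from "bits" to "number" explicit at the call site, which is in the spirit of both this post and the original proposal's rule against implicit casts.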
Oct 17 2004
Charles Hixson wrote:
> Ah. To me NEITHER is an integer type. Bool comes out of Boolean Logic, and is either true or false. Bit comes out of Information Theory, and is the smallest individual piece of information (and thus, strictly speaking, it should be typeless). But historically arrays of flip-flops [..snip...]

Back at the gates and voltages, it seems... (feels like a kid again)

Anyway, it is still strange that "true" and "false" are of type "bit"? (I think they should either be plain 1 or 0 like in C, or type "bool")

I have mostly given in and up in the bit / bool wars, just wanted the D keywords to be somewhat consistent (preferably same as C99 or C++)

And I bet you are "thrilled" that *both* are integers in D, then ? (I must confess I hadn't heard the "bit should not be int" before)

--anders
Oct 17 2004
Anders F Björklund wrote:
> Charles Hixson wrote:
>> Ah. To me NEITHER is an integer type. Bool comes out of Boolean Logic, and is either true or false. Bit comes out of Information Theory, and is the smallest individual piece of information (and thus, strictly speaking, it should be typeless). But historically arrays of flip-flops [..snip...]
> Back at the gates and voltages, it seems... (feels like a kid again)
>
> Anyway, it is still strange that "true" and "false" are of type "bit"? (I think they should either be plain 1 or 0 like in C, or type "bool")
>
> I have mostly given in and up in the bit / bool wars, just wanted the D keywords to be somewhat consistent (preferably same as C99 or C++)
>
> And I bet you are "thrilled" that *both* are integers in D, then ? (I must confess I hadn't heard the "bit should not be int" before)
>
> --anders

Well, in Ada one could define a type (as opposed to a sub-type) and get a totally separate type. typedef doesn't seem to separate things quite as thoroughly. But I'll give up a lot for a language that's easier to use (even C++ fits here!) and has a garbage collector. (I never understood why Ada didn't include one.)

I don't really worry about int, etc., which will probably bite me some day, but hasn't so far. (I tend to think of bits as being inherently typeless... but it doesn't bother me to use bit as a boolean value. I never think of either bit or bool as integers.)
Oct 17 2004
On Sun, 17 Oct 2004 17:59:53 -0700, Charles Hixson wrote:
> Anders F Björklund wrote:
>> Charles Hixson wrote:
>>> Ah. To me NEITHER is an integer type. Bool comes out of Boolean Logic, and is either true or false. Bit comes out of Information Theory, and is the smallest individual piece of information (and thus, strictly speaking, it should be typeless). But historically arrays of flip-flops [..snip...]

[snip]

> I never think of either bit or bool as integers.)

Come to think of it, the only time I really use bits is when I'm mapping RAM structures. I can't think of why I'd use a bit or a bit array as a data item for anything else. I guess I don't get out as often as I should ;-)

--
Derek
Melbourne, Australia
18/10/2004 11:19:57 AM
Oct 17 2004
Charles Hixson wrote:
> And currently bit arrays are packed, and thus bit[8] is equivalent to a bit addressable byte. This can be quite useful, [...]

I'm not sure how I would use that ? My first attempt crashed the compiler:

    void main()
    {
        union U { ubyte bite; bit[8] bits; }
        U.bite = 0x80;
        foreach (bit b; U.bits) {
            printf(" %d", b ? 1 : 0);
        }
    }

bitarray.d: In function `main':
bitarray.d:5: internal compiler error: in d_expand_expr, at d/d-glue.cc:3000

The second attempt shows different sizes:

    void main()
    {
        ubyte bite;
        bit[8] bits;
        printf("byte: %d\n", bite.sizeof);
        printf("bits: %d\n", bits.sizeof);
    }

byte: 1
bits: 4

But maybe bit arrays have some other use I'm not aware of ? (and what about the "nybble" type ? nybble[2] hex_byte;)

Just that it all feels so Pascal to me: "Bytes as bit sets". Isn't sub-byte manipulation what the bit operators are for ?

Currently, I just view bit[] as a nice hack to store arrays of (1-bit) flags in an efficient format, just as char[] is a way of storing arrays of (32-bit) Unicode code points efficiently... (then again, one *could* just use ubyte[] and dchar[] too ?)

--anders
Oct 16 2004
Anders F Björklund wrote:
> Charles Hixson wrote:
>> And currently bit arrays are packed, and thus bit[8] is equivalent to a bit addressable byte. This can be quite useful, [...]
> I'm not sure how I would use that ? My first attempt crashed the compiler:
>
>     void main()
>     {
>         union U { ubyte bite; bit[8] bits; }
>         U.bite = 0x80;
>         foreach (bit b; U.bits) {
>             printf(" %d", b ? 1 : 0);
>         }
>     }
>
> bitarray.d: In function `main':
> bitarray.d:5: internal compiler error: in d_expand_expr, at d/d-glue.cc:3000

It works fine if you replace

    union U { ubyte bite; bit[8] bits; }

with

    union U_t { ubyte bite; bit[8] bits; }
    U_t U;

Your code was trying to access a type like a variable. The compiler shouldn't error, though, so I'd go ahead and post that example to D.bugs so that Walter can try to fix it.
Oct 16 2004
Ben Hinkle wrote:
> It works fine if you replace
>
>     union U { ubyte bite; bit[8] bits; }
>
> with
>
>     union U_t { ubyte bite; bit[8] bits; }
>     U_t U;
>
> Your code was trying to access a type like a variable. The compiler shouldn't error, though, so I'd go ahead and post that example to D.bugs so that Walter can try to fix it.

Argh, you are right... Thinking in C, I guess. (or not at all)

Also discovered that bit arrays are little-endian. (LSB first)

--anders
Oct 16 2004
Anders F Björklund wrote:
> Also discovered that bit arrays are little-endian. (LSB first)

They are on x86 hardware anyway. I would be surprised if this were preserved for big-endian machines.

Sean
Oct 16 2004
Sean Kelly wrote:
>> Also discovered that bit arrays are little-endian. (LSB first)
> They are on x86 hardware anyway. I would be surprised if this were preserved for big-endian machines.

This was on a big-endian machine... But I only meant within the byte, that is: bit[0] sets 0x01 and bit[7] sets 0x80 of the byte...

If you do things like unions or casts, then it'll probably preserve the native endian of the platform (since it just copies the bytes)

--anders
Oct 17 2004
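The within-byte numbering Anders describes (index 0 at mask 0x01, index 7 at mask 0x80) can be written down with plain shifts. The following C++ is only a sketch of that numbering, not of D's bit[] implementation, and the names `unpack_lsb_first` and `pack_lsb_first` are invented for the example. The packing direction is roughly what a to_char style helper on bit[8], as wished for earlier in the thread, would have to do.

```cpp
#include <array>
#include <cstdint>

// Unpack a byte into eight flags, least significant bit first:
// element 0 corresponds to mask 0x01, element 7 to mask 0x80.
std::array<bool, 8> unpack_lsb_first(std::uint8_t byte) {
    std::array<bool, 8> bits{};
    for (unsigned i = 0; i < 8; ++i) {
        bits[i] = ((byte >> i) & 1u) != 0;
    }
    return bits;
}

// Pack eight flags back into a byte using the same numbering.
std::uint8_t pack_lsb_first(const std::array<bool, 8>& bits) {
    std::uint8_t byte = 0;
    for (unsigned i = 0; i < 8; ++i) {
        if (bits[i]) {
            byte = static_cast<std::uint8_t>(byte | (1u << i));
        }
    }
    return byte;
}
```

Because the code only shifts values and never reinterprets memory, the result does not depend on the byte order of the machine. That is a separate question from what a union or cast over several bytes would show.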
> The second attempt shows different sizes:
>
>     void main()
>     {
>         ubyte bite;
>         bit[8] bits;
>         printf("byte: %d\n", bite.sizeof);
>         printf("bits: %d\n", bits.sizeof);
>     }
>
> byte: 1
> bits: 4

Looks like bit arrays are packed into ints not bytes - so the sizeof will always be a multiple of 4. Where does this matter?
Oct 16 2004
Ben Hinkle wrote:
>>     void main()
>>     {
>>         ubyte bite;
>>         bit[8] bits;
>>         printf("byte: %d\n", bite.sizeof);
>>         printf("bits: %d\n", bits.sizeof);
>>     }
>>
>> byte: 1
>> bits: 4
> Looks like bit arrays are packed into ints not bytes - so the sizeof will always be a multiple of 4. Where does this matter?

Not at all, I guess... Changing to int shows that you are right:

    void main()
    {
        uint bite;
        bit[32] bits;
        printf("int: %d\n", bite.sizeof);
        printf("bit: %d\n", bits.sizeof);
    }

int: 4
bit: 4

--anders
Oct 16 2004
I wrote:
>> Looks like bit arrays are packed into ints not bytes - so the sizeof will always be a multiple of 4. Where does this matter?
> Not at all, I guess...

It wasn't obvious to me what the sizes were, so I checked the current D implementation... Bit variables are stored in a byte, unless they occur in arrays. Then they are instead packed into blocks of 32 bits, for speed:

    void Bits::set(unsigned bitnum)
    {
        data[bitnum / 32] |= 1 << (bitnum & 31);
    }

    void Bits::clear(unsigned bitnum)
    {
        data[bitnum / 32] &= ~(1 << (bitnum & 31));
    }

    int Bits::test(unsigned bitnum)
    {
        return data[bitnum / 32] & (1 << (bitnum & 31));
    }

The "bitnum / 32" should really be "bitnum >> 5", but the compiler should be smart enough to optimize the division away...

    struct bit_dynamic { bit[] bits; }
    struct bit_static { bit[2] bits; }
    struct bit_fields { bit a; bit b; }

bit_dynamic.sizeof: 8 (Dynamic arrays are the usual length+pointer)
bit_static.sizeof: 4
bit_fields.sizeof: 2

So if you union a ubyte and a bit[8], the union occupies 4 bytes. Same (!) if you union a uint and a bit[32]: 4 bytes. (ulong with bit[64] is the expected 8 bytes)

Pointers to bits are funny, they *do* work if you access a byte-stored single bit var - but not if you try to access a single bit in an array ?

    void main()
    {
        static bit[32] t = 0;
        t[5] = 1;
        for (int i = 0; i < 32; i++) {
            bit *p = &(t[i]);
            printf("%d ", (*p) ? 1 : 0);
        }
        printf("\n");

        static ubyte[32] b = 0;
        b[5] = 1;
        for (int i = 0; i < 32; i++) {
            ubyte *p = &(b[i]);
            printf("%d ", (*p) ? 1 : 0);
        }
        printf("\n");
    }

Probably for the same reasons that bit[] slices have problems: pointers only know of whole bytes (while they would need to know a 0-7 bit offset too) ?

--anders
Oct 17 2004
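Both observations in the post above (sizes rounded up to whole 32-bit words, and plain pointers being unable to name a single packed bit) can be illustrated with a short C++ sketch. It is an assumption-laden illustration and not the D compiler's code: `packed_size_bytes`, `BitPtr` and `bit_address` are hypothetical names, and the word/offset split simply copies the `bitnum / 32` and `bitnum & 31` arithmetic from the quoted `Bits::set`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes occupied by nbits bits when storage is handed out in whole
// 32-bit words, the packing scheme described in the post.
std::size_t packed_size_bytes(std::size_t nbits) {
    return ((nbits + 31) / 32) * sizeof(std::uint32_t);
}

// Hypothetical "pointer to a bit": a word address plus a bit offset.
// An ordinary pointer only carries the address part.
struct BitPtr {
    std::uint32_t* word;
    unsigned offset; // 0..31 within *word

    bool get() const { return ((*word >> offset) & 1u) != 0; }

    void set(bool value) const {
        const std::uint32_t mask = std::uint32_t{1} << offset;
        if (value) {
            *word |= mask;
        } else {
            *word &= ~mask;
        }
    }
};

// Build a BitPtr to bit number bitnum of a packed array.
BitPtr bit_address(std::uint32_t* data, std::size_t bitnum) {
    assert(data != nullptr);
    return BitPtr{data + bitnum / 32, static_cast<unsigned>(bitnum & 31)};
}
```

With 4-byte words, `packed_size_bytes` gives 4 for 2, 8 and 32 bits and 8 for 64 bits, which matches the sizes reported in the post. The extra `offset` member is exactly the information a byte-granular pointer has no room for.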
Anders F Björklund wrote:
> Pointers to bits are funny, they *do* work if you access a byte-stored single bit var - but not if you try to access a single bit in an array ?
...
> Probably for the same reasons that bit[] slices have problems: pointers only know of whole bytes (while they would need to know a 0-7 bit offset too) ?

Yup. There have been some proposals for addressing this issue (no pun intended) but all seemed a bit kludgy.

Sean
Oct 17 2004
larrycowan wrote:
> One more time... [...smart things...]

bit should probably be done away with entirely: the only thing that makes it useful at all right now is bit arrays, and they can easily be implemented with a struct and a few overloaded operators.

Moreover, the fact that bit arrays are packed creates all sorts of special cases and warts (like the behaviour of the .sizeof property), all for a feature that is hardly ever actually put to use! (Phobos itself only uses them in the sense that it implements certain bit[] operations, like bit[].reverse)

I would very much like a stricter boolean, for which no implicit conversions exist, but, either way, there isn't a very compelling reason to keep bit at all.

-- andy
Oct 14 2004
Andy Friesen wrote:
> larrycowan wrote:
>> One more time... [...smart things...]
> bit should probably be done away with entirely: the only thing that makes it useful at all right now is bit arrays, and they can easily be implemented with a struct and a few overloaded operators.
>
> Moreover, the fact that bit arrays are packed creates all sorts of special cases and warts (like the behaviour of the .sizeof property), all for a feature that is hardly ever actually put to use! (Phobos itself only uses them in the sense that it implements certain bit[] operations, like bit[].reverse)
>
> I would very much like a stricter boolean, for which no implicit conversions exist, but, either way, there isn't a very compelling reason to keep bit at all.
>
> -- andy

????? I can see the desire for stricter booleans, and I can see arguments for limiting the ways in which bit arrays can be used (perhaps they could be required to be allocated in groups of, say, 32). But packed bit arrays are so useful that doing away with them seems... well, just very undesirable.

Limit them if you must. Make it so that slicing isn't implemented on them. Make them a library class. But don't eliminate them. I rarely want to slice a bit array, but I frequently have need for one. (One CAN get around this by masking and shifting, but that's a quite error-prone approach. At least, *I* find it quite error-prone.)

OTOH, I can certainly see making it a special library class, with constructors that take, say, the other basic types, and methods that return the value as packed into an array of one (or several) of the other basic types.
Oct 14 2004
Charles Hixson wrote:
> Andy Friesen wrote:
>
> ????? I can see the desire for stricter booleans, and I can see arguments for limiting the ways in which bit arrays can be used (perhaps they could be required to be allocated in groups of, say, 32). But packed bit arrays are so useful that doing away with them seems... well, just very undesirable.
>
> Limit them if you must. Make it so that slicing isn't implemented on them. Make them a library class. But don't eliminate them. I rarely want to slice a bit array, but I frequently have need for one. (One CAN get around this by masking and shifting, but that's a quite error-prone approach. At least, *I* find it quite error-prone.)
>
> OTOH, I can certainly see making it a special library class, with constructors that take, say, the other basic types, and methods that return the value as packed into an array of one (or several) of the other basic types.

I wasn't arguing that bitsets should be eradicated from existence. I was referring merely to the fact that they are currently built into the core language itself. :)

It would be very easy to write a little struct that implements the indexing and slicing operators and behaves like bit[] in pretty much every way.

-- andy
Oct 15 2004
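As a rough idea of the kind of library type Andy has in mind, here is a minimal sketch in C++ rather than D. Everything in it is an assumption made for illustration: the class name `PackedBits`, the 32-bit word size, and the choice of a `set` method for writes in place of an assignable index operator. Its `slice` returns a copy, whereas a D array slice refers to the original data, so it only approximates the behaviour discussed in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical library-level bit set: packed storage with indexed access.
class PackedBits {
public:
    explicit PackedBits(std::size_t nbits)
        : nbits_(nbits), words_((nbits + 31) / 32, 0u) {}

    std::size_t size() const { return nbits_; }

    // Read access, like indexing a bit array.
    bool operator[](std::size_t i) const {
        assert(i < nbits_);
        return ((words_[i / 32] >> (i % 32)) & 1u) != 0;
    }

    // Write access; a separate method keeps the sketch free of proxy objects.
    void set(std::size_t i, bool value) {
        assert(i < nbits_);
        const std::uint32_t mask = std::uint32_t{1} << (i % 32);
        if (value) {
            words_[i / 32] |= mask;
        } else {
            words_[i / 32] &= ~mask;
        }
    }

    // Copy of the bits in the half-open range [from, to).
    PackedBits slice(std::size_t from, std::size_t to) const {
        assert(from <= to && to <= nbits_);
        PackedBits out(to - from);
        for (std::size_t i = from; i < to; ++i) {
            out.set(i - from, (*this)[i]);
        }
        return out;
    }

private:
    std::size_t nbits_;
    std::vector<std::uint32_t> words_;
};
```

The masking and shifting that Charles finds error-prone is written once inside the class, so calling code only ever deals with indices and boolean values.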