digitalmars.D - Request for new types: int?, long?, etc. like .NET 2.0

reply MicroWizard <MicroWizard_member pathlink.com> writes:


The new CLR has expanded possibilities for value types which I've found
interesting. Maybe it can be introduced to D as well.

"int?" is a type like "int", but "null" is allowed, signalling that the variable is not
initialized or is out of range.

In C and C++ I have often faced this problem: I needed to set aside a value of an
integral type to signal that the value is out of range. Reserving zero or -1 is always
a pain. I had to take care of assignments, comparisons, etc.

Boxing it into an Object, as .NET does, seems like a nightmare to me; it would be much
better handled like ranges (as was mentioned here before).

To solve the problem, the compiler could allocate a value for "uninitialized" for
every '?' type, reducing the acceptable range of these types by one value. The
compiler could also manage the assignments and comparisons so that they do not cross
range boundaries.

To me it looks easy to implement, but I am curious what others think.
Please let me know your thoughts!

Tamás Nagy
Dec 19 2005
next sibling parent reply Anders F Björklund <afb algonet.se> writes:
MicroWizard wrote:

 To solve the problem the compiler could allocate a value for "uninitialized"
for
 every '?' types reducing the acceptable range by one value for these types.
Also
 the compiler could manage the assignments and comparisons, not to cross range
 boundaries.
D uses "nan" for the floats, ... "\xFFFF" for code units and ... (drum roll) ... "0" for ints. :-)
 In C and C++ I have often faced the problem, I needed to spare a value of an
 integral type to sign the value is out of range. Reserving zero or -1 is always
 a pain. I had to take care of assignments comparison, etc.
I think you will need to continue to reserve those values in D too. Not perfect, perhaps, and not really consistent either. But anyway... --anders
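For reference, the default initializers mentioned in this post can be checked directly. This is a minimal sketch, assuming a current D compiler and Phobos (`std.math.isNaN`); note that `char.init` is 0xFF, while 0xFFFF is the default for `wchar` and `dchar`.

```d
import std.math : isNaN;

unittest
{
    assert(int.init == 0);          // integers start at zero
    assert(isNaN(float.init));      // floating point starts as NaN
    assert(isNaN(double.init));
    assert(char.init == 0xFF);      // UTF-8 code unit: an invalid byte
    assert(wchar.init == 0xFFFF);   // UTF-16 code unit
    assert(dchar.init == 0xFFFF);   // UTF-32 code unit
    assert(bool.init == false);
}
```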
Dec 19 2005
parent reply MicroWizard <MicroWizard_member pathlink.com> writes:
D uses "nan" for the floats, ...
"\xFFFF" for code units and ...
(drum roll) ... "0" for ints. :-)
nan is fine for any kind of floats.
 In C and C++ I have often faced the problem, I needed to spare a value of an
 integral type to sign the value is out of range. Reserving zero or -1 is always
 a pain. I had to take care of assignments comparison, etc.
I think you will need to continue to reserve those values in D too. Not perfect, perhaps, and not really consistent either. But anyway...
Yep. The problem is not that I cannot solve the problem; I only feel there should be a nicer way.
Like we use unions and structs in D without bitfields, and we have bit arrays with their clumsy behaviours. They work now, but we still want them to work better ;-)
(((Or at least my not so humble being does :-)))

So I think it would be nice to have a clear solution like 'nan' for integral types too.

Tamás Nagy
Dec 19 2005
parent reply Anders F Björklund <afb algonet.se> writes:
MicroWizard wrote:

D uses "nan" for the floats, ...
"\xFFFF" for code units and ...
(drum roll) ... "0" for ints. :-)
nan is fine for any kind of floats.
I suppose, but I still think it would have been much more consistent to make all the values start out as zero. Including float/char, too ?
 So I think it would be nice to have something clear solution like 'nan' for
 integral types too.
Just to make *all* types have useless .init's, one could use "-1" ?
But we would still need three-valued* boolean, sorry "bit", values...

--anders

* true, false, unknown
Dec 19 2005
parent reply Sean Kelly <sean f4.ca> writes:
Anders F Björklund wrote:
 MicroWizard wrote:
 
 D uses "nan" for the floats, ...
 "\xFFFF" for code units and ...
 (drum roll) ... "0" for ints. :-)
nan is fine for any kind of floats.
I suppose, but I still think it would have been much more consistent to make all the values start out as zero. Including float/char, too ?
I'm normally a huge proponent for consistency, but in this case I prefer the D initialization method. I only wish there were hardware supported trap values for integers as well.
 Just to make *all* types have useless .init's, one could use "-1" ?
 But we would still need three-valued* boolean, sorry "bit", values...
-1 is still a valid integer value. At least 0 is what we typically want to default initialize ints to so the current default isn't quite as likely to introduce hard to find bugs. As for bit and such, were there an integral trap value then bit should be initialized to that, but false/0 fits in its absence. Sean
Dec 19 2005
parent reply Anders F Björklund <afb algonet.se> writes:
Sean Kelly wrote:

 I suppose, but I still think it would have been much more consistent
 to make all the values start out as zero. Including float/char, too ?
I'm normally a huge proponent for consistency, but in this case I prefer the D initialization method. I only wish there were hardware supported trap values for integers as well.
I prefer zero and "might not have been initialized" warnings, myself... It's just that with the floating point types, then it's an "error" to use the uninitialized values (and punishable by getting NaN results), but when it comes to integers then it's suddenly OK to rely on that the default initializer is zero and use it without assigning to it ? If this is a handy feature, then couldn't floats start at zero as well ? --anders
Dec 19 2005
parent reply MicroWizard <MicroWizard_member pathlink.com> writes:
Some of us lost the point :-(

There is a definite difference between "uninitialized" and
"invalid value".
The first case can be handled by compiler policies.
The second one can't.

The question is, should we distinguish between them or not?

If we want to save memory, we must accept the bits as they are.
If we want to use hardware traps for floats, do the same.
For integers we have a choice of what to do and how to do it.

Tamas Nagy

.
I prefer zero and "might not have been initialized" warnings, myself...
..
If this is a handy feature, then couldn't floats start at zero as well ?
..
Dec 19 2005
next sibling parent reply Anders F Björklund <afb algonet.se> writes:
MicroWizard wrote:

 Some of us lost the point :-(
Several, it seems... (including me) Sorry for hijacking your thread to whine about .init values
 There are definitive difference between "uninitialized" and
 "invalid value".
 The first case can be handled by compiler policies.
 The second one can't.
 
 The question is, should we make difference between them or not?
 
 If we want spare memory, we must accept bits as they are.
 If we want to use hardware traps for floats, do the same.
 For integers we have the choice what to do, and how to do.
I don't understand what you are trying to say here... And I don't know *how* you could make such a "nullable" integer - without making it into an object like .NET/Java ? --anders
Dec 19 2005
next sibling parent Sean Kelly <sean f4.ca> writes:
Anders F Björklund wrote
 MicroWizard wrote:
 There are definitive difference between "uninitialized" and
 "invalid value".
 The first case can be handled by compiler policies.
 The second one can't.

 The question is, should we make difference between them or not?

 If we want spare memory, we must accept bits as they are.
 If we want to use hardware traps for floats, do the same.
 For integers we have the choice what to do, and how to do.
I don't understand what you are trying to say here... And I don't know *how* you could make such a "nullable" integer - without making it into an object like .NET/Java ?
I don't think you could. Unless you tried reserving a bit to indicate this, and without hardware support this doesn't seem feasible. Sean
Dec 19 2005
prev sibling parent reply Jari-Matti Mäkelä <jmjmak utu.fi.invalid> writes:
Anders F Björklund wrote:
 MicroWizard wrote:
 
 Some of us lost the point :-(
Several, it seems... (including me) Sorry for hijacking your thread to whine about .init values
 There are definitive difference between "uninitialized" and
 "invalid value".
 The first case can be handled by compiler policies.
 The second one can't.

 The question is, should we make difference between them or not?

 If we want spare memory, we must accept bits as they are.
 If we want to use hardware traps for floats, do the same.
 For integers we have the choice what to do, and how to do.
I don't understand what you are trying to say here... And I don't know *how* you could make such a "nullable" integer - without making it into an object like .NET/Java ?
IMHO it would be much better to keep things simple. Wrappers around basic types add bloat and are bad for low level programming.

Reserving some values like 'int.max-1' for exceptional states works otherwise fine, but those magic values are too easily accessible, even with basic arithmetics.

I find the .init-values very useful since they simplify the process of using custom types. They also reduce the amount of code written, therefore reducing the amount of potential bugs. Float NaNs are very helpful too.

Isn't the idea of producing extra warnings for uninitialized types also against the philosophy of D? AFAIK all warnings should stop the compilation process. This would not be very helpful.

-- Jari-Matti
Dec 19 2005
next sibling parent Anders F Björklund <afb algonet.se> writes:
Jari-Matti Mäkelä wrote:

 I find the .init-values very useful since they simplify the process
 of using custom types. They also reduce the amount of code written,
 therefore reducing the amount of potential bugs. Float NaNs are very
 helpful too.
I'm not against the *concept* of .init-values, I think they're useful. Still not positive whether it's "OK" to use an uninitialized D integer as having the value zero, but from the code samples it looks that way ?
 Isn't the idea of producing extra warnings for uninitialized types also 
 against the philosophy of D? AFAIK all warnings should stop the 
 compilation process. This would not be very helpful.
It is, just what I'm used to from other languages (such as old C / Java) where they do break the build. And yes, this warning is sometimes false. --anders
Dec 19 2005
prev sibling parent reply Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:
Jari-Matti Mäkelä wrote:
 Anders F Björklund wrote:
 I don't understand what you are trying to say here...

 And I don't know *how* you could make such a "nullable"
 integer - without making it into an object like .NET/Java ?
IMHO it would be much better to keep things simple. Wrappers around basic types add bloat and are bad for low level programming.
I don't think the OP was talking about wrapping basic types but adding a new one that is nullable. That way i would see

    int? x;

as

    struct { int x = int.init; bool assigned = false; } theX;

    if (x is null)    as    if (theX.assigned == false)
    f(x)              as    f(theX.x);

and

    x = 5;            as    theX.x = 5, theX.assigned = true;

The compiler could do all this work for us.
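The lowering described here can be approximated today as a library type. The following is only a sketch: `Maybe`, `isNull` and `get` are hypothetical names chosen for illustration, not an existing D feature or the proposed `int?` syntax.

```d
// Hypothetical stand-in for the proposed `T?`: a value plus an "assigned" flag.
struct Maybe(T)
{
    private T value = T.init;
    private bool assigned = false;

    // `x = 5;` stores the value and marks it as assigned.
    void opAssign(T v) { value = v; assigned = true; }

    // `x = null;` returns to the unassigned state.
    void opAssign(typeof(null)) { value = T.init; assigned = false; }

    // Stand-in for `x is null`.
    bool isNull() const { return !assigned; }

    // Stand-in for passing `x` where a plain T is expected.
    T get()
    {
        assert(assigned, "Maybe!T read while unassigned");
        return value;
    }
}

unittest
{
    Maybe!int x;                 // starts out unassigned
    assert(x.isNull());
    x = 5;
    assert(!x.isNull() && x.get() == 5);
    x = null;
    assert(x.isNull());
}
```

The cost is the extra flag (plus padding) per variable, which is the storage-for-convenience trade discussed in this thread.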
 Reserving some values like 'int.max-1' for exceptional states works 
 otherwise fine, but those magic values are too easily accessible, even 
 with basic arithmetics.
I agree.
 
 I find the .init-values very useful since they simplify the process of 
 using custom types. They also reduce the amount of code written, 
 therefore reducing the amount of potential bugs. Float NaNs are very 
 helpful too.
 
Agree also.
Dec 19 2005
parent reply MicroWizard <MicroWizard_member pathlink.com> writes:
 And I don't know *how* you could make such a "nullable"
 integer - without making it into an object like .NET/Java ?
As I (and others also) have written before: with ranges. (Or magic values :-)

'byte?' could be stored as a simple byte, with no overhead, but the value 0x80 is reserved to store the null value.
Every assignment can be tracked by the compiler with compile-time and/or run-time checks.
Arithmetic operations must not result in 0x80.

Original range for 'byte' is 0x80-0x7f (-128 - +127)
Range for 'byte?' is 0x81-0x7f (-127 - +127)

For 'ubyte' the reserved value 0xff is more comfortable.
For 'enum A(x=0,y=1,z=2)' 3 is convenient. Just out of the range.

But this is only an idea. If we spend some CPU time, we could get this (reserved) version; if we spend some storage, we could box integers into objects.
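A rough library-level sketch of the reserved-value scheme for 'byte?' follows. `NullableByte` and its members are hypothetical names, and the checks are run-time asserts written by hand, not the compiler-generated checks the proposal asks for.

```d
// Hypothetical sketch of 'byte?': one byte of storage, with 0x80 (byte.min) reserved for null.
struct NullableByte
{
    enum byte nullBits = byte.min;   // 0x80, i.e. -128
    private byte bits = nullBits;    // starts out null

    bool isNull() const { return bits == nullBits; }

    // Run-time range check: only -127 .. +127 may be stored.
    void opAssign(int v)
    {
        assert(v > byte.min && v <= byte.max, "value outside the 'byte?' range");
        bits = cast(byte) v;
    }

    void opAssign(typeof(null)) { bits = nullBits; }

    byte get() const
    {
        assert(!isNull(), "read of a null 'byte?'");
        return bits;
    }

    // Arithmetic is done in int and re-checked when the result is stored.
    NullableByte opBinary(string op)(int rhs) const
        if (op == "+" || op == "-")
    {
        NullableByte r;
        r = mixin("get() " ~ op ~ " rhs");
        return r;
    }
}

unittest
{
    NullableByte b;
    assert(b.isNull());
    b = 100;
    auto c = b + 27;                  // 127 is still inside the reduced range
    assert(c.get() == 127);
    assert(NullableByte.sizeof == 1); // no storage overhead
}
```

Here the trade is the opposite of the flag approach: no extra storage, but every store pays for a range check.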
 Wrappers around basic types add bloat and are bad for low level 
 programming.
Yes. This is the problem with .NET implementation. But the possibility is not a must. I would like to have a choice ;-)
I don't think the OP was talking about wrapping basic types but adding a 
new one that is nullable.
Yes. 'int' is fast, 'int?' is convenient.
int? x; as struct{ int x = int.init; bool assigned=false; } theX;
...
The compiler could do all this work for us.
Smart idea.
 Reserving some values like 'int.max-1' for exceptional states works 
 otherwise fine, but those magic values are too easily accessible, even 
 with basic arithmetics.
No. The compiler could take care of it, as it does with array bounds. (And as it should when dealing with enums ;-)

<off topic>
Is there any movement to make enums really typesafe?
I mean, preventing the assignment of any invalid integer value by a range check or
some more sophisticated check.
</off topic>
 I find the .init-values very useful since they simplify the process of 
 using custom types. They also reduce the amount of code written, 
 therefore reducing the amount of potential bugs. Float NaNs are very 
 helpful too.
Yes. Tamás Nagy
Dec 19 2005
parent reply Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:
MicroWizard wrote:
I don't think the OP was talking about wrapping basic types but adding a 
new one that is nullable.
Yes. 'int' is fast, 'int?' is convenient.
int? x; as struct{ int x = int.init; bool assigned=false; } theX;
...
The compiler could do all this work for us.
Smart idea.
Reserving some values like 'int.max-1' for exceptional states works 
otherwise fine, but those magic values are too easily accessible, even 
with basic arithmetics.
No. Compiler could take care. Like does at array bounds. (And should do at dealing with enums ;-)
Ok i see now. Make a special value that tells us if the variable has
been assigned to. But what about something like

    struct Point { float x, y, z; }

    Point? getOrNotAPoint()
    {
        if (something) return null;
        else return Point(1, 2, 3);
    }

How would the compiler implement that?

The solution with ranges works great for numeric types, but what about
other types?

What if i want int[]? Where is that bit of information stored?
 
 <off topic>
 Is there any movement to make enums really typesafe?
 I  mean, avoid to assign any invalid integer value by range check or
 some more sofisticated check.
 </off topic>
 
:)
Dec 20 2005
parent reply MicroWizard <MicroWizard_member pathlink.com> writes:
Ok i see now. Make a special value that tells us if the variable has 
been assigned to. But what about something like

struct Point{float x, y, z;}

Point? getOrNotAPoint()
{
   is(something)return null;
   else return Point(1,2,3);
}

How would compiler implement that?
No easy way. I do not think that this '?' problem could ever be solved generally for every type. I can't see a general solution for structs, since they can be used without references (which can be null).
The solution with ranges works great for numeric types, but what about 
other types?
Yes. It works only for integral types. BUT. For an object, does this problem really exist? No. They can be 'null'.
What if i want int[]? where is that bit of information stored.
Boxing into Object should work. Not too nice for arrays :-(
I am not sure, but I remember that dynamic arrays work a bit like objects, so maybe they can be 'null' as well. If that is true, the problem is not a problem.

Tamás Nagy
Dec 20 2005
parent reply Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:
MicroWizard wrote:
Ok i see now. Make a special value that tells us if the variable has 
been assigned to. But what about something like

struct Point{float x, y, z;}

Point? getOrNotAPoint()
{
  is(something)return null;
  else return Point(1,2,3);
}

How would compiler implement that?
No easy way. I do not think that these '?' problem could be ever solved generally for every types. I can't see general solution for structs since they can be used without references (which can be null).
Yes easy way :) The problem could be solved generally if another byte or 4 bytes were added to the normal type. I see someType? being to someType something similar to what someType[] is to someType*: adding a bit of information (length in dynamic arrays and an assigned flag in ? types).
 
The solution with ranges works great for numeric types, but what about 
other types?
Yes. It works only for integral types. BUT. For an object does this problem really exists? No. They can be 'null'.
No doesn't really exist for objects.
 
 
What if i want int[]? where is that bit of information stored.
Boxing into Object should work. Not to nice for arrays :-(
Not nice at all.
 I am not sure but I remember as dynamic arrays work a bit similar as objects,
 so maybe they can be 'null' as well. If it is true the problem is not a
problem.
 
IMO dynamic arrays are objects (and IMO static arrays should also be, to make them runtime allocatable, but that is another discussion). i remember there was a discussion about that some time ago, and i can't remember what the conclusion was, but i think that in D an empty array and a null array are the same unless you reduce the length of an allocated array to 0, and then it isn't 0 anymore.

I may be wrong about what i said, but it was something equally confusing.

IMO there should be a way to tell apart 0 length and null arrays.
Dec 20 2005
parent reply Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:
Ivan Senji wrote:
 IMO dynamic arrays are objects (and IMO static arrays should also be to 
 make them runtime allocatable, but that is another discussion), but i 
 remember there was a discussion about that some time ago and i can't 
 remember what the conclusion was but i think that in D an empty array 
 and null array are the same unless you reduce the length of an alocated 
 array to 0, and then it isn't 0 anymore.
 
 I may be wrong about what i said but it was something as equaly confusing.
 
 IMO there should be a way to tell apart 0 length and null arrays.
Replying to myself: I did a little test and it seems that arrays work as expected. An array is null if its length is 0, but for example, although the length of "" is 0, that char[] isn't null. Makes sense.
Dec 20 2005
parent "Regan Heath" <regan netwin.co.nz> writes:
On Tue, 20 Dec 2005 16:02:16 +0100, Ivan Senji  
<ivan.senji_REMOVE_ _THIS__gmail.com> wrote:
 Ivan Senji wrote:
 IMO dynamic arrays are objects (and IMO static arrays should also be to  
 make them runtime allocatable, but that is another discussion), but i  
 remember there was a discussion about that some time ago and i can't  
 remember what the conclusion was but i think that in D an empty array  
 and null array are the same unless you reduce the length of an alocated  
 array to 0, and then it isn't 0 anymore.
  I may be wrong about what i said but it was something as equaly  
 confusing.
  IMO there should be a way to tell apart 0 length and null arrays.
Replying to myself: I did a little test and it seems that arrays work as expected. Array is null if it's length is 0, but for example although length of "" is 0 that char[] isn't null. Makes sense.
I think you're looking at it backwards :)

A null array has a null data pointer, an empty array has a non-null data pointer. Both have a length of 0. So, the difference is the data pointer, not the length, the length is the same in both cases.

To get an empty char array you can assign "" to it, i.e.

    char[] a = "";

or in the general case slice a 0 length section from another array i.e.

    int[] a;
    a ~= 1;
    int[] b = a[0..0];

The problem is that if you then set the length to 0, the data pointer is free'd or set to null. This means the empty array becomes a null array, which makes the distinction inconsistent and therefore less useful (and I believe it _is_ very useful).

The solution is to keep the distinction when setting the length to 0. Changes in length should never set the data pointer to null. Off the top of my head the only operation that should set the data pointer to null is:

    b = null;

Regan
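The null/empty distinction described in this post can be observed with a short check. This is a sketch assuming a current D compiler; `isNullArray` and `isEmptyArray` are hypothetical helper names, and the example deliberately does not test what happens when `.length` is set to 0.

```d
// True when the slice has no data pointer at all (a "null array").
bool isNullArray(T)(const T[] arr) { return arr.ptr is null; }

// True when the slice has zero elements, whether or not it has a data pointer.
bool isEmptyArray(T)(const T[] arr) { return arr.length == 0; }

unittest
{
    int[] n;                 // never assigned: null data pointer, length 0
    int[] a = [1];
    int[] e = a[0 .. 0];     // zero-length slice of real data: non-null pointer, length 0

    assert(isNullArray(n) && isEmptyArray(n));
    assert(!isNullArray(e) && isEmptyArray(e));
    assert(n == e);          // == compares contents, so the two compare equal
    assert(n is null);       // `is` compares pointer and length
    assert(e !is null);
}
```

Because `==` treats both as equal, code that cares about the difference has to use `is null` or inspect `.ptr`.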
Dec 20 2005
prev sibling parent Sean Kelly <sean f4.ca> writes:
MicroWizard wrote:
 Some of us lost the point :-(
 
 There are definitive difference between "uninitialized" and
 "invalid value".
 The first case can be handled by compiler policies.
 The second one can't.
I think Walter would argue that compiler policies for this sort of thing are inexact, and thus should be avoided. Though others might argue that some checking is better than none at all. This seems to be a simple difference in philosophies, and Walter's wins out in this case :-)
 The question is, should we make difference between them or not?
I'll admit that I'm kind of on the fence with this issue. It's very easy to become reliant on the integer default, and that does occasionally make me wish for zero default initialization for all primitives. But I still like the motivation behind NaN initialization and would even be willing to live with a change in the integer default if such a trap value comes available. So overall, I think it's slightly more useful to have trap initialization whenever possible, as it really does make the occasional initialization error easier to track down. Sean
Dec 19 2005
prev sibling next sibling parent Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:
MicroWizard wrote:

 
 The new CLR has expanded possibilities for value types which I've found
 interesting. Maybe it can be introduced to D as well.
 
 "int?" is a type like "int" but "null" is allowed signing, the variable is not
 initialized or is out of range.
 
 In C and C++ I have often faced the problem, I needed to spare a value of an
 integral type to sign the value is out of range. Reserving zero or -1 is always
 a pain. I had to take care of assignments comparison, etc.
 
 To box it into an Object like .NET does seems to me a nightmare, much better it
 can be handled like ranges (as it was mentioned here before).
 
 To solve the problem the compiler could allocate a value for "uninitialized"
for
 every '?' types reducing the acceptable range by one value for these types.
Also
 the compiler could manage the assignments and comparisons, not to cross range
 boundaries.
 
 For me it looks easy to implement. But I am curious also.
 Please let me know your thoughts!
 
I think the something? types are an interesting concept, but i doubt it will get into D soon. As for being hard to implement, i don't think it would be that hard. And it would certainly be useful.
Dec 19 2005
prev sibling next sibling parent reply "Walter Bright" <newshound digitalmars.com> writes:
"MicroWizard" <MicroWizard_member pathlink.com> wrote in message 
news:do6fq4$bgv$1 digitaldaemon.com...
 To solve the problem the compiler could allocate a value for 
 "uninitialized" for
 every '?' types reducing the acceptable range by one value for these 
 types. Also
 the compiler could manage the assignments and comparisons, not to cross 
 range
 boundaries.
You could do this:

    typedef int myint = cast(int)0x80000000;

which creates your own int type that default initializes to the smallest negative value (often used as a sort of 'nan' for ints).
Dec 19 2005
parent MicroWizard <MicroWizard_member pathlink.com> writes:
Forget the word 'initialize'. To have a non-zero initialization value is only
the smallest part of the idea of 'int?'

These 'new' types make life easier. If a function returns them,
you do not have to throw/catch exceptions and do not have to deal
with magic values. The 'D system' compiler/runtime deals with it.
It is only a convenience, as I've written.
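As an illustration of the "function returns them" case, present-day Phobos offers `std.typecons.Nullable`, which can stand in for the proposed syntax. This is a sketch only: `findIndex` is a hypothetical function, and `Nullable` is a library type, not the built-in 'int?' requested here.

```d
import std.typecons : Nullable, nullable;

// Hypothetical lookup: position of `needle` in `haystack`, or null if absent.
Nullable!size_t findIndex(const int[] haystack, int needle)
{
    foreach (i, v; haystack)
        if (v == needle)
            return nullable(i);
    return Nullable!size_t.init;   // "no result", without -1 or an exception
}

unittest
{
    auto hit = findIndex([3, 5, 7], 5);
    assert(!hit.isNull && hit.get == 1);

    auto miss = findIndex([3, 5, 7], 4);
    assert(miss.isNull);
}
```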

In article <do7cn2$1q2i$2 digitaldaemon.com>, Walter Bright says...
You could do this:

    typedef int myint = cast(int)0x80000000;

which creates your own int type that default initializes to the smallest 
negative value (often used as a sort of 'nan' for ints). 
It does nothing about range checks. Nice, but it is not related to the original posting. Sorry.
Maybe my post was not detailed enough and therefore misleading.

Tamás Nagy
Dec 19 2005
prev sibling next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 "int?" is a type like "int" but "null" is allowed signing, the variable is 
 not
 initialized or is out of range.
Hi, Tamás

A while ago I wrote an article for CodeProject about auto values (not those 'auto's)
http://www.codeproject.com/cpp/value_t.asp

The idea is simple - for each numeric type as a rule you can choose one value
which will serve as 'undefined' or 'null' value.

It is in C++ and in D it is possible to do something close.

Implementation is simple and takes almost nothing as you may see.

Andrew Fedoniouk.
http://terrainformatica.com
Dec 19 2005
parent reply MicroWizard <MicroWizard_member pathlink.com> writes:
Hi Andrew!

While ago I wrote an article for CodeProject about auto values (not those 
'auto's)
http://www.codeproject.com/cpp/value_t.asp

The idea is simple - for each numeric type as a rule you can choose one 
value
which will serve as 'undefined'  or 'null' value.
Thanks, it is perfectly what I've missed from D.
It is in C++ and  in D it is possible to do something close.
I hope I could do it in D easily.
Implementation is simple and takes almost nothing as you may see.
Yep. The idea of 'val' is cloudy for me, and the implementation also.

Please do not be angry, I admire your helpfulness and your code, but:
Sorry to say, it is a typical IT article: the writer understands it and therefore he thinks it is all clear. No. I (and I think I am not alone) need some explanation right at the point where you have written the word 'idea'.

Why is there this 'val' stuff? What is the problem with inheritance?
A detailed example would be nice.

Again, thanks a lot. The template implementation with overloaded operators is THE idea for me. The actual implementation in D won't be too hard.

Tamás Nagy
Dec 20 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 Why are these 'val' stuff? What is the problem with inheritance?
 A detailed example would be nice.
Agree that it is not obvious. Reason is simple: when you are dealing with variables allowing an undetermined state, at some point you need determined values.
So instead of

    length_v margin;
    ....
    int m;
    if( margin.undefined ) m = 10;

you can write:

    m = length_v(margin, 10);

Andrew.
Dec 20 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
Errata:

Instead of
 m = length_v(margin, 10);
read: m = length_v::val(margin, 10);
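For readers following along in D, the `val` idea (take the stored value, or a default when it is undefined) might look roughly like this. `LengthV` is a hypothetical D analogue written for illustration; it is not the code from the CodeProject article, and using `int.min` as the reserved value is an assumption.

```d
// Hypothetical D counterpart of the C++ `length_v`: an int with one reserved "undefined" value.
struct LengthV
{
    enum int undefinedBits = int.min;   // assumption: int.min is never a real length
    int bits = undefinedBits;           // default state is "undefined"

    bool undefined() const { return bits == undefinedBits; }

    // The stored value if defined, otherwise the caller's default.
    static int val(LengthV v, int defaultValue)
    {
        return v.undefined() ? defaultValue : v.bits;
    }
}

unittest
{
    LengthV margin;                          // undefined by default
    assert(LengthV.val(margin, 10) == 10);   // falls back to the default
    margin.bits = 4;
    assert(LengthV.val(margin, 10) == 4);    // uses the stored value
}
```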
Dec 20 2005
parent MicroWizard <MicroWizard_member pathlink.com> writes:
Now it is much clearer. Thank you.

Tamás

In article <do9m8t$210g$1 digitaldaemon.com>, Andrew Fedoniouk says...
Errata:

Instead of
 m = length_v(margin, 10);
read: m = length_v::val(margin, 10);
Dec 21 2005
prev sibling parent Dawid Ciężarkiewicz <dawid.ciezarkiewicz gmail.com> writes:
MicroWizard wrote:


Yeah.
 "int?" is a type like "int" but "null" is allowed signing, the variable is
 not initialized or is out of range.
the language that they once was told about (first year of college and it was C, after they left college as their boss earlier did) and so they used '?' thinking they were going to change it when they found out which character it was.

But I remember which char it was! I hope the behavior you would like is accessible using the '*' sign. What you want is:

    int* i;       // here i is uninitialized
    i = new int;  // here i is initialized
    // you can use *i like this:
    *i = 2 + *i;
    // (I guess you can even use alias or mixin to forget about '*')
    i = null;     // here i is again uninitialized

Hmmm. We could even send a letter to M$ to help those people in their quest.
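A compilable version of the pointer approach, as a sketch assuming a current D compiler. `doubleIfSet` is a hypothetical helper; note that this variant pays for a heap allocation per value, which is the overhead the reserved-value proposal tries to avoid.

```d
// Doubles the pointed-to int in place, or reports "no value" by returning false.
bool doubleIfSet(int* p)
{
    if (p is null)
        return false;   // the "null" state: nothing to operate on
    *p = 2 * *p;
    return true;
}

unittest
{
    int* i;                      // pointer .init is null: "no value"
    assert(!doubleIfSet(i));

    i = new int;                 // heap-allocated int, default-initialized to 0
    *i = 21;
    assert(doubleIfSet(i) && *i == 42);

    i = null;                    // back to "no value"
    assert(!doubleIfSet(i));
}
```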
Dec 20 2005