digitalmars.D.learn - size_t index=-1;
- Laeeth Isharc (2/2) Mar 16 2016 should it be a compiler warning to assign a negative literal to
- Mathias Lang (2/4) Mar 16 2016 yes it should. https://issues.dlang.org/show_bug.cgi?id=3468
- Jonathan M Davis via Digitalmars-d-learn (10/12) Mar 16 2016 Maybe? It's a common enough thing to do that I'm willing to bet that Wal...
- Steven Schveighoffer (7/9) Mar 16 2016 Why? They implicitly convert.
- Mathias Lang (7/18) Mar 16 2016 We can change it, and we should. But it should be deprecated
- Steven Schveighoffer (4/20) Mar 16 2016 No, please don't. Assigning a signed value to an unsigned (and vice
- Anonymouse (15/19) Mar 16 2016 I agree, but implicitly allowing for comparisons between the two
- Johan Engelen (6/9) Mar 16 2016 Although I also think it makes sense to warn (in specific cases)
- Mathias Lang (18/22) Mar 16 2016 I'm not talking about removing it completely. The implicit
- Jonathan M Davis via Digitalmars-d-learn (24/47) Mar 16 2016 Now, you're talking about comparing signed and unsigned values, which is...
- tsbockman (16/21) Mar 16 2016 Greater than 90% of the time, even in low level code, an
- tsbockman (20/24) Mar 16 2016 That's me (building on Robert Schadek's work):
- Steven Schveighoffer (8/27) Mar 17 2016 Converting unsigned to signed or vice versa (of the same size type) is
- tsbockman (20/24) Mar 17 2016 Saying that "no information is lost" in such a case, is like
- Ola Fosheim Grøstaf (10/15) Mar 17 2016 Only providing modular arithmetics is a significant language
- tsbockman (8/11) Mar 18 2016 `ulong.max` and `-1L` are fundamentally different semantically,
- Ola Fosheim Grøstad (12/18) Mar 19 2016 Different types implies different semantics, but not the literals
- tsbockman (4/9) Mar 19 2016 Both of the literals I used in my example explicitly indicate the
- Ola Fosheim Grøstad (8/10) Mar 19 2016 Yes, but few people specify unsigned literals and relies on them
- Steven Schveighoffer (31/53) Mar 18 2016 In practice, a variable that is unsigned or signed is expected to behave...
- tsbockman (13/20) Mar 18 2016 Actually, I think I confused things for you by mentioning to
- Jonathan M Davis via Digitalmars-d-learn (23/25) Mar 18 2016 See. Here's the fundamental disagreement. _No_ information is lost when
- tsbockman (9/19) Mar 19 2016 You do realize that, technically, there are no comparisons
- Basile B. (7/10) Mar 19 2016 Yes and that's the opposite that should happend: when signed and
- Ola Fosheim Grøstad (8/19) Mar 19 2016 I have no problem with C++ compilers complaining about
- Basile B. (11/33) Mar 19 2016 FPC (Object Pascal) too, but that not a surpise since it's in the
- tsbockman (13/17) Mar 19 2016 That would be reasonable. Whether it's actually faster than just
- Jonathan M Davis via Digitalmars-d-learn (5/29) Mar 19 2016 And I really should have proofread this message before sending it... :(
- Steven Schveighoffer (15/31) Mar 21 2016 Your definition of when "implicit casting is really a bad idea" is
- tsbockman (29/43) Mar 21 2016 This logic can be applied to pretty much any warning condition or
- Steven Schveighoffer (23/49) Mar 21 2016 Right, if we were starting over, I'd say let's make sure you can't make
- tsbockman (19/37) Mar 21 2016 My experimentation strongly suggests that your "99.99% false
- Steven Schveighoffer (24/57) Mar 21 2016 It matters not to the person who is very aware of the issue and doesn't
- tsbockman (12/22) Mar 21 2016 Well that's the real problem here then, isn't it?
- Marc Schütz (6/39) Mar 18 2016 Strictly speaking yes, but typically, an `int` isn't used as a
should it be a compiler warning to assign a negative literal to an unsigned without a cast?
Mar 16 2016
On Wednesday, 16 March 2016 at 18:40:56 UTC, Laeeth Isharc wrote:
> should it be a compiler warning to assign a negative literal to an unsigned without a cast?

Yes it should. https://issues.dlang.org/show_bug.cgi?id=3468
Mar 16 2016
On Wednesday, March 16, 2016 18:40:56 Laeeth Isharc via Digitalmars-d-learn wrote:
> should it be a compiler warning to assign a negative literal to an unsigned without a cast?

Maybe? It's a common enough thing to do that I'm willing to bet that Walter would object, but what you're really looking to do in most cases like that is to get something like uint.max, in which case it's better to just use the built-in constant. But I doubt that assigning negative literals to unsigned variables causes much in the way of bugs. The bigger problem is comparing signed and unsigned types, and a number of folks have argued that that should be a warning or error.

- Jonathan M Davis
Mar 16 2016
On 3/16/16 2:40 PM, Laeeth Isharc wrote:
> should it be a compiler warning to assign a negative literal to an unsigned without a cast?

Why? They implicitly convert.

int x = -1;
uint y = x;

I don't see a difference between this and your code. And we can't change this behavior of the second line; too much arguably valid code would break.

-Steve
Mar 16 2016
On Wednesday, 16 March 2016 at 20:11:41 UTC, Steven Schveighoffer wrote:
> On 3/16/16 2:40 PM, Laeeth Isharc wrote:
>> should it be a compiler warning to assign a negative literal to an unsigned without a cast?
>
> Why? They implicitly convert.
>
> int x = -1;
> uint y = x;
>
> I don't see a difference between this and your code. And we can't change this behavior of the second line; too much arguably valid code would break.
>
> -Steve

We can change it, and we should. But it should be deprecated properly, and we should put in place enough candy to make it viable (See http://forum.dlang.org/post/vbeohujwdsoqfgwqgasa@forum.dlang.org).
Mar 16 2016
On 3/16/16 4:55 PM, Mathias Lang wrote:
> On Wednesday, 16 March 2016 at 20:11:41 UTC, Steven Schveighoffer wrote:
>> On 3/16/16 2:40 PM, Laeeth Isharc wrote:
>>> should it be a compiler warning to assign a negative literal to an unsigned without a cast?
>>
>> Why? They implicitly convert.
>>
>> int x = -1;
>> uint y = x;
>>
>> I don't see a difference between this and your code. And we can't change this behavior of the second line; too much arguably valid code would break.
>
> We can change it, and we should. But it should be deprecated properly, and we should put in place enough candy to make it viable (See http://forum.dlang.org/post/vbeohujwdsoqfgwqgasa@forum.dlang.org).

No, please don't. Assigning a signed value to an unsigned (and vice versa) is very useful, and there is no good reason to break this.

-Steve
Mar 16 2016
On Wednesday, 16 March 2016 at 21:49:05 UTC, Steven Schveighoffer wrote:
> No, please don't. Assigning a signed value to an unsigned (and vice versa) is very useful, and there is no good reason to break this.
>
> -Steve

I agree, but implicitly allowing for comparisons between the two allows for easy and *silent* mistakes. Also for comparisons of less-than-zero, for those functions we have that return -1 on failure.

import std.string : indexOf;

size_t pos = "banana".indexOf("c");

if (pos > 0)
{
    // oops
}

The above is naturally a programmer error but it's not something that will net you an immediate crash. It will just silently not behave as you meant, and you'll find yourself with a lot of fake bananas.
Mar 16 2016
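The pitfall above is easy to reproduce. A minimal runnable sketch of it, assuming only std.string.indexOf (which returns ptrdiff_t, with -1 meaning "not found"):

```
import std.string : indexOf;

unittest
{
    // Keeping the natural signed return type makes the failure test behave:
    auto pos = "banana".indexOf("c");
    assert(pos == -1);              // not found
    assert(!(pos >= 0));            // a valid "was it found?" check

    // Storing the result in size_t is where the trouble starts:
    size_t upos = "banana".indexOf("c");
    assert(upos > 0);               // "oops": -1 wrapped to size_t.max
}
```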
On Wednesday, 16 March 2016 at 22:07:39 UTC, Anonymouse wrote:
> size_t pos = "banana".indexOf("c");
> if (pos > 0) {

Although I also think it makes sense to warn (in specific cases) about mixed-sign comparisons, the example you give here does nothing that we can warn about. It is a comparison of an unsigned "pos" with a literal that is unsigned too. ("0" literal must be considered signed and unsigned without any warnings)
Mar 16 2016
On Wednesday, 16 March 2016 at 21:49:05 UTC, Steven Schveighoffer wrote:
> No, please don't. Assigning a signed value to an unsigned (and vice versa) is very useful, and there is no good reason to break this.
>
> -Steve

I'm not talking about removing it completely. The implicit conversion should only happen when it's safe:

```
int s;
if (s >= 0) // VRP saves the day
{
    uint u = s;
}
```

```
uint u;
if (u > short.max)
    throw new Exception("Argument out of range"); // Or `assert`
short s = u;
```
Mar 16 2016
On Wednesday, March 16, 2016 22:37:40 Mathias Lang via Digitalmars-d-learn wrote:
> On Wednesday, 16 March 2016 at 21:49:05 UTC, Steven Schveighoffer wrote:
>> No, please don't. Assigning a signed value to an unsigned (and vice versa) is very useful, and there is no good reason to break this.
>>
>> -Steve
>
> I'm not talking about removing it completely. The implicit conversion should only happen when it's safe:
>
> ```
> int s;
> if (s >= 0) // VRP saves the day
> {
>     uint u = s;
> }
> ```
>
> ```
> uint u;
> if (u > short.max)
>     throw new Exception("Argument out of range"); // Or `assert`
> short s = u;
> ```

Now, you're talking about comparing signed and unsigned values, which is a completely different ballgame. Just assigning one to the other really isn't a problem, and sometimes you _want_ the wraparound. If you assume that it's always the case that assigning a negative value to an unsigned type is something that programmers don't want to do, then you haven't programmed in C enough. And while it could still be done by requiring casts, consider that every time you do a cast, you're telling the compiler to just shut up and do what you want, which makes it easy to hide stuff that you don't want hidden - especially when code changes later.

D purposefully allows converting between signed and unsigned types of the same or greater size. And based on what he's said on related topics in the past, there's pretty much no way that you're going to convince Walter that it's a bad idea. And I really don't see a problem with the current behavior as far as assignment goes. It's comparisons which are potentially problematic, and that's where you'd have some chance of getting a warning or error added to the compiler.

If you want to actually have the values checked to ensure that a negative value isn't assigned to an unsigned integer, then use std.conv.to to do conversions or wrap your integers in types that have more restrictive rules. IIRC, at least one person around here has done that already so that they can catch integer overflow - which is basically what you're complaining about here.

- Jonathan M Davis
Mar 16 2016
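Two points from the post above can be shown in a few lines: the deliberate wraparound idiom that gives the thread its title, and the checked alternative via std.conv.to that Jonathan mentions. A minimal sketch:

```
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

unittest
{
    // The intentional idiom: -1 is the shortest spelling of "all bits set",
    // often used as a sentinel value.
    size_t index = -1;
    assert(index == size_t.max);

    // The checked alternative: std.conv.to rejects the conversion at runtime.
    assertThrown!ConvOverflowException((-1).to!uint);
}
```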
On Thursday, 17 March 2016 at 01:57:16 UTC, Jonathan M Davis wrote:
> Just assigning one to the other really isn't a problem, and sometimes you _want_ the wraparound. If you assume that it's always the case that assigning a negative value to an unsigned type is something that programmers don't want to do, then you haven't programmed in C enough.

Greater than 90% of the time, even in low level code, an assignment, comparison, or any other operation that mixes signed and unsigned types is being done directly (without bounds checking) only for speed, laziness, or ignorance - not because 2's complement mapping of negative to positive values is actually desired.

Forcing deliberate invocations of 2's complement mapping between signed and unsigned types to be explicitly marked is a good thing, seeing as the intended semantics are fundamentally different.

I interpret any resistance to this idea simply as evidence that we haven't yet made it sufficiently easy/pretty to be explicit. Any idea that it's actually *desirable* for code to be ambiguous in this way is just silly.
Mar 16 2016
On Thursday, 17 March 2016 at 01:57:16 UTC, Jonathan M Davis wrote:
> or wrap your integers in types that have more restrictive rules. IIRC, at least one person around here has done that already so that they can catch integer overflow - which is basically what you're complaining about here.

That's me (building on Robert Schadek's work): https://code.dlang.org/packages/checkedint

Although I should point out that my `SmartInt` actually has *less* restrictive rules than the built-in types - all possible combinations of size and signedness are both allowed and safe for all operations, without any explicit casts.

A lot of what `SmartInt` does depends on (minimal) extra runtime logic, which imposes a ~30% performance penalty (when integer math is actually the bottleneck) with good compiler optimizations (GDC or LDC). But, a lot of it could also be done at no runtime cost, by leveraging VRP.

C's integer math rules are really pretty bad, even when taking performance into account. Something as simple as by default promoting to a signed, rather than unsigned, type would prevent many bugs in practice, at zero cost (except that it would be a breaking change).

There is also `SafeInt` with "more restrictive rules", if it is for some reason necessary to work inside the limitations of the built-in basic integer types.
Mar 16 2016
On 3/16/16 6:37 PM, Mathias Lang wrote:
> On Wednesday, 16 March 2016 at 21:49:05 UTC, Steven Schveighoffer wrote:
>> No, please don't. Assigning a signed value to an unsigned (and vice versa) is very useful, and there is no good reason to break this.
>
> I'm not talking about removing it completely. The implicit conversion should only happen when it's safe:
>
> ```
> int s;
> if (s >= 0) // VRP saves the day
> {
>     uint u = s;
> }
> ```
>
> ```
> uint u;
> if (u > short.max)
>     throw new Exception("Argument out of range"); // Or `assert`
> short s = u;
> ```

Converting unsigned to signed or vice versa (of the same size type) is safe. No information is lost. It's the comparison between the two which confuses the heck out of people. I think we can solve 80% of the problems by just fixing that. And the bug report says it's preapproved from Walter and Andrei.

VRP on steroids would be nice, but I don't think it's as trivial to solve.

-Steve
Mar 17 2016
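The comparison trap Steve refers to is easy to demonstrate; a small sketch of today's behavior:

```
unittest
{
    int  i = -1;
    uint u = 1;
    // Under the current rules, i is implicitly converted to uint before
    // the comparison, so -1 compares as 4294967295:
    assert(i > u);
}
```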
On Thursday, 17 March 2016 at 17:09:46 UTC, Steven Schveighoffer wrote:
> Converting unsigned to signed or vice versa (of the same size type) is safe. No information is lost.

Saying that "no information is lost" in such a case is like saying that if I encrypt my hard drive and then throw away the password, "no information is lost". Technically this is true: the bit count is the same as it was before. In practice, though, the knowledge of how information is encoded is essential to actually using it.

In the same way, using `cast(ulong)` to pass `-1L` to a function that expects a `ulong` results in a de-facto loss of information, because that `-1L` can no longer be distinguished from `ulong.max`, despite the fundamental semantic difference between the two.

> VRP on steroids would be nice, but I don't think it's as trivial to solve.

D's current VRP is actually surprisingly anemic: it doesn't even understand integer comparisons, or the range restrictions implied by the predicate when a certain branch of an `if` statement is taken. Lionello Lunesu made a PR a while back that adds these two features, and it makes the compiler feel a lot smarter. (The PR was not accepted at the time, but I have since revived it: https://github.com/D-Programming-Language/dmd/pull/5229)
Mar 17 2016
On Thursday, 17 March 2016 at 22:46:01 UTC, tsbockman wrote:
> In the same way, using `cast(ulong)` to pass `-1L` to a function that expects a `ulong` results in a de-facto loss of information, because that `-1L` can no longer be distinguished from `ulong.max`, despite the fundamental semantic difference between the two.

Only providing modular arithmetics is a significant language design flaw, but as long as all integers are defined to be modular then there is no fundamental semantic difference either. Of course, comparisons beyond equality don't work for modular arithmetics either, irrespective of sign...

You basically have to decide whether you want a line or a circle; Walter chose the circle for integers and the line for floating point. The circle is usually the wrong model, but that does not change the language definition...
Mar 17 2016
On Friday, 18 March 2016 at 05:20:35 UTC, Ola Fosheim Grøstaf wrote:
> Only providing modular arithmetics is a significant language design flaw, but as long as all integers are defined to be modular then there is no fundamental semantic difference either.

`ulong.max` and `-1L` are fundamentally different semantically, even with two's complement modular arithmetic. Just because a few operations (addition and subtraction, mainly) can use a common implementation for both, does not change that. Division, for example, cannot be done correctly without knowing whether the inputs are signed or not.
Mar 18 2016
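The division point can be checked directly: the same 32-bit pattern yields different quotients depending on the declared signedness. A small demonstration:

```
unittest
{
    int  a = -2;
    uint b = a;                      // same bit pattern: 0xFFFF_FFFE
    assert(a / 2 == -1);             // signed division
    assert(b / 2 == 2_147_483_647);  // unsigned division of the same bits
}
```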
On Friday, 18 March 2016 at 23:35:42 UTC, tsbockman wrote:
> `ulong.max` and `-1L` are fundamentally different semantically, even with two's complement modular arithmetic.

Different types imply different semantics, but not the literals in isolation. Under modular arithmetics for an ubyte the literals -128 and 128 both refer to 128. This follows from -128 == 0 - (128).

Unfortunately in D, the actual arithmetics is not done modulo 2^8, but modulo 2^32. So, what we should object to is modular arithmetics over integers as defined in D.

> Just because a few operations (addition and subtraction, mainly) can use a common implementation for both, does not change that. Division, for example, cannot be done correctly without knowing whether the inputs are signed or not.

Yes, both multiplication and division change with the type, but you usually don't want signed values in modular arithmetics? The major flaw is in how D defines arithmetics for integers.
Mar 19 2016
On Saturday, 19 March 2016 at 08:49:29 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 18 March 2016 at 23:35:42 UTC, tsbockman wrote:
>> `ulong.max` and `-1L` are fundamentally different semantically, even with two's complement modular arithmetic.
>
> Different types imply different semantics, but not the literals in isolation.

Both of the literals I used in my example explicitly indicate the type, not just the value.
Mar 19 2016
On Saturday, 19 March 2016 at 09:35:00 UTC, tsbockman wrote:
> Both of the literals I used in my example explicitly indicate the type, not just the value.

Yes, but few people specify unsigned literals; they rely on signed ones being implicitly cast to unsigned. You don't want to type 0UL and 1UL all the time. This is another thing that Go does better: numeric literals ought not to be bound to a concrete type.

So while I agree with you that the integer situation is messy, changing it to something better requires many changes. Which I am all for.
Mar 19 2016
On 3/17/16 6:46 PM, tsbockman wrote:
> On Thursday, 17 March 2016 at 17:09:46 UTC, Steven Schveighoffer wrote:
>> Converting unsigned to signed or vice versa (of the same size type) is safe. No information is lost.
>
> Saying that "no information is lost" in such a case is like saying that if I encrypt my hard drive and then throw away the password, "no information is lost". Technically this is true: the bit count is the same as it was before.

It's hard to throw away the "key" of 2's complement math.

> In practice, though, the knowledge of how information is encoded is essential to actually using it.

In practice, a variable that is unsigned or signed is expected to behave like it is declared. I don't think anyone expects differently. When I see:

size_t x = -1;

I expect x to behave like an unsigned size_t that was assigned -1. There is no ambiguity here. Where it gets confusing is if you didn't mean to type size_t. But the compiler can't know that.

When you start doing comparisons, then ambiguity creeps in. The behavior is well defined, but not very intuitive. You can get into trouble even without mixing signed/unsigned types. For example:

for(size_t i = 0; i < a.length - 1; ++i)

This is going to crash when a.length == 0. Better to do this:

for(size_t i = 0; i + 1 < a.length; ++i)

Unsigned math can be difficult, there is no doubt. But we can't just disable it, or disable unsigned conversions.

> In the same way, using `cast(ulong)` to pass `-1L` to a function that expects a `ulong` results in a de-facto loss of information, because that `-1L` can no longer be distinguished from `ulong.max`, despite the fundamental semantic difference between the two.

Any time you cast a type, the original type information is lost. But in this case, no bits are lost. In this case, the function is declaring "I don't care what your original type was, I want to use ulong". If it desires to know the original type, it should use a template parameter instead.

Note, I have made these mistakes myself, and I understand what you are asking for and why you are asking for it. But these are bugs. The user is telling the compiler to do one thing, and expecting it to do something else. It's not difficult to fix, and in fact, many lines of code are written specifically to take advantage of these rules. This is why we cannot remove them. The benefit is not worth the cost.

> D's current VRP is actually surprisingly anemic: it doesn't even understand integer comparisons, or the range restrictions implied by the predicate when a certain branch of an `if` statement is taken. Lionello Lunesu made a PR a while back that adds these two features, and it makes the compiler feel a lot smarter. (The PR was not accepted at the time, but I have since revived it: https://github.com/D-Programming-Language/dmd/pull/5229)

I'm not compiler-savvy enough to have an opinion on the PR, but I think more sophisticated VRP would be good.

-Steve
Mar 18 2016
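The a.length - 1 bug from the post above, distilled into a runnable check:

```
unittest
{
    int[] a = [];
    // a.length is size_t, so the subtraction wraps instead of going negative:
    assert(a.length - 1 == size_t.max);

    // The rewritten condition avoids the subtraction and never wraps:
    size_t iterations = 0;
    for (size_t i = 0; i + 1 < a.length; ++i)
        ++iterations;
    assert(iterations == 0);         // loop body never runs
}
```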
On Friday, 18 March 2016 at 14:51:34 UTC, Steven Schveighoffer wrote:
> Note, I have made these mistakes myself, and I understand what you are asking for and why you are asking for it. But these are bugs. The user is telling the compiler to do one thing, and expecting it to do something else. It's not difficult to fix, and in fact, many lines of code are written specifically to take advantage of these rules. This is why we cannot remove them. The benefit is not worth the cost.

Actually, I think I confused things for you by mentioning `cast(ulong)`. I'm not asking for a Java-style "no unsigned" system (I hate that; it's one of my biggest annoyances with Java). Rather, I'm picking on *implicit* conversions between signed and unsigned.

I'm basically saying, "because information is lost when casting between signed and unsigned, all such casts should be explicit". This could make code rather verbose - except that from my experiments, with decent VRP the compiler can actually be surprisingly smart about warning only in those cases where implicit casting is really a bad idea.
Mar 18 2016
On Friday, March 18, 2016 23:48:32 tsbockman via Digitalmars-d-learn wrote:
> I'm basically saying, "because information is lost when casting between signed and unsigned, all such casts should be explicit".

See. Here's the fundamental disagreement. _No_ information is lost when converting between signed and unsigned integers. e.g.

int i = -1;
uint ui = i;
int j = ui;
assert(j == -1);

But even if you convinced us, you'd have to convince Walter. And based on previous discussions on this subject, I think that you have an _extremely_ low chance of that. He doesn't even think that the fact that

void foo(bool bar) {}
void foo(long bar) {}
foo(1);

results in a call to the bool overload was a problem when pretty much everyone else did. The only thing that I'm aware of that Walter has thought _might_ be something that we should change is allowing the comparison between signed and unsigned integers, and if you read what he says in the bug report for it, he clearly doesn't think it's a big problem:

https://issues.dlang.org/show_bug.cgi?id=259

And that's something that clearly causes bugs in a way that converting between signed and unsigned integers does not. You're fighting for a lost cause on this one.

- Jonathan M Davis
Mar 18 2016
On Saturday, 19 March 2016 at 04:17:42 UTC, Jonathan M Davis wrote:
> The only thing that I'm aware of that Walter has thought _might_ be something that we should change is allowing the comparison between signed and unsigned integers, and if you read what he says in the bug report for it, he clearly doesn't think it's a big problem:
>
> https://issues.dlang.org/show_bug.cgi?id=259
>
> And that's something that clearly causes bugs in a way that converting between signed and unsigned integers does not. You're fighting for a lost cause on this one.
>
> - Jonathan M Davis

You do realize that, technically, there are no comparisons between basic signed and unsigned integers in D? The reason that *attempting* such a comparison produces such weird results is that the signed value is being implicitly cast to an unsigned type.

The thing you say *is* a problem is directly caused by the thing that you say is *not* a problem.
Mar 19 2016
On Saturday, 19 March 2016 at 09:33:25 UTC, tsbockman wrote:
> [...] The reason that *attempting* such a comparison produces such weird results is that the signed value is being implicitly cast to an unsigned type.

Yes, and the opposite is what should happen: when signed and unsigned are mixed in a comparison, the unsigned value should be implicitly cast to a wider signed value. And then it works!

- https://issues.dlang.org/show_bug.cgi?id=15805
- https://github.com/BBasile/iz/blob/v0.5.8/import/iz/sugar.d#L1017
Mar 19 2016
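For int/uint specifically, the widening Basile describes can be written out by hand. A standalone sketch (the name widenedLess is made up here for illustration; it is not the iz template linked above): promote both operands to long, where every int and every uint fits exactly:

```
/// Mixed comparison via widening: both operands fit losslessly in long.
bool widenedLess(int a, uint b)
{
    return cast(long) a < cast(long) b;
}

unittest
{
    assert(widenedLess(-1, 1));   // whereas (-1 < 1u) is false under today's rules
    assert(!widenedLess(1, 1));
}
```

The catch, as the follow-ups note, is that this approach needs a wider type, so it runs out of road at long versus ulong unless cent/ucent ever materialize.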
On Saturday, 19 March 2016 at 10:01:41 UTC, Basile B. wrote:
> Yes, and the opposite is what should happen: when signed and unsigned are mixed in a comparison, the unsigned value should be implicitly cast to a wider signed value. And then it works!
>
> - https://issues.dlang.org/show_bug.cgi?id=15805
> - https://github.com/BBasile/iz/blob/v0.5.8/import/iz/sugar.d#L1017

I have no problem with C++ compilers complaining about signed/unsigned comparisons. It sometimes means you should reconsider the comparison, so it leads to better code.

The better solution is to add 7, 15, 31 and 63 bit unsigned integer types that safely convert to signed (this is what Ada does) and remove implicit conversion for the unsigned 8, 16, 32, and 64 bit integers.
Mar 19 2016
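Ola's Ada-style idea reads naturally in D as a wrapper whose invariant guarantees the value always fits in the signed type. A hypothetical sketch (UInt31 and its interface are illustrative, not an existing library type):

```
/// A "31-bit unsigned" value type: holds only 0 .. int.max, so
/// converting it to int can never reinterpret a sign bit.
struct UInt31
{
    private int payload;   // invariant: payload >= 0

    this(long value)
    {
        assert(value >= 0 && value <= int.max, "out of range for UInt31");
        payload = cast(int) value;
    }

    /// Lossless by construction.
    @property int asInt() const { return payload; }
    alias asInt this;
}

unittest
{
    auto u = UInt31(42);
    int s = u;            // safe conversion: no wraparound is possible
    assert(s == 42);
}
```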
On Saturday, 19 March 2016 at 10:24:41 UTC, Ola Fosheim Grøstad wrote:
> I have no problem with C++ compilers complaining about signed/unsigned comparisons. It sometimes means you should reconsider the comparison, so it leads to better code.
>
> The better solution is to add 7, 15, 31 and 63 bit unsigned integer types that safely convert to signed (this is what Ada does)

FPC (Object Pascal) too, but that's not a surprise since it's in the same family.

> and remove implicit conversion for the unsigned 8, 16, 32, and 64 bit integers.

Yes, that's almost it, but in D the only solution I see is like in my template: widening. When widening is not possible (mainly on X86_64) then warn. The problem is that cent and ucent are not implemented; otherwise it would always work, even on a 64 bit OS.

I'd like to propose a PR for this (not for cent/ucent but for the widening) but it looks a bit overcomplicated for a first contrib in the compiler...
Mar 19 2016
On Saturday, 19 March 2016 at 10:01:41 UTC, Basile B. wrote:
> Yes, and the opposite is what should happen: when signed and unsigned are mixed in a comparison, the unsigned value should be implicitly cast to a wider signed value. And then it works!

That would be reasonable. Whether it's actually faster than just inserting an extra check for `signed_value < 0` in mixed comparisons is likely platform dependent, though.

Honestly though - even just changing the rules to implicitly convert both operands to a signed type of the same size, instead of an unsigned type of the same size, would be a huge improvement. Small negative values are way more common than huge (greater than signed_type.max) positive ones in almost all code. (This change will never happen, of course, as it would be far too subtle of a breaking change for existing code.)

Regardless, the first step is to implement the pre-approved solution to DMD 259: deprecate the current busted behavior.
Mar 19 2016
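The "extra check for signed_value < 0" mentioned above looks like this when written out; a sketch, with the helper name mixedLess made up for illustration:

```
/// Mixed signed/unsigned less-than with an explicit sign check,
/// so the sign bit is never silently reinterpreted.
bool mixedLess(int a, uint b)
{
    // A negative left operand is less than any unsigned value; otherwise
    // a fits in uint and the plain unsigned comparison is exact.
    return a < 0 || cast(uint) a < b;
}

unittest
{
    assert(mixedLess(-1, 0));        // correct, while (-1 < 0u) is false today
    assert(!mixedLess(2, 1));
    assert(mixedLess(1, uint.max));
}
```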
On Friday, March 18, 2016 21:17:42 Jonathan M Davis via Digitalmars-d-learn wrote:
> See. Here's the fundamental disagreement. _No_ information is lost when converting between signed and unsigned integers.
> [...]
> You're fighting for a lost cause on this one.

And I really should have proofread this message before sending it... :( Hopefully, you get what I meant though.

- Jonathan M Davis
Mar 19 2016
On 3/18/16 7:48 PM, tsbockman wrote:
> On Friday, 18 March 2016 at 14:51:34 UTC, Steven Schveighoffer wrote:
>> Note, I have made these mistakes myself, and I understand what you are asking for and why you are asking for it. But these are bugs. The user is telling the compiler to do one thing, and expecting it to do something else. It's not difficult to fix, and in fact, many lines of code are written specifically to take advantage of these rules. This is why we cannot remove them. The benefit is not worth the cost.
>
> Actually, I think I confused things for you by mentioning `cast(ulong)`. I'm not asking for a Java-style "no unsigned" system (I hate that; it's one of my biggest annoyances with Java). Rather, I'm picking on *implicit* conversions between signed and unsigned.

No, I understood you meant implicit casting.

> I'm basically saying, "because information is lost when casting between signed and unsigned, all such casts should be explicit". This could make code rather verbose - except that from my experiments, with decent VRP the compiler can actually be surprisingly smart about warning only in those cases where implicit casting is really a bad idea.

Your definition of when "implicit casting is really a bad idea" is almost certainly going to include cases where it really isn't a bad idea. The compiler isn't all-knowing, and there will always be cases where the user knows best (and did the conversion intentionally).

An obvious one is:

void foo(ubyte[] x)
{
   int len = x.length;
}

(let's assume 32-bit CPU)

I'm assuming the compiler would complain about this, since technically, len could be negative! Disallowing such code or requiring a cast is probably too much.

-Steve
Mar 21 2016
On Monday, 21 March 2016 at 17:38:35 UTC, Steven Schveighoffer wrote:
> Your definition of when "implicit casting is really a bad idea" is almost certainly going to include cases where it really isn't a bad idea.

This logic can be applied to pretty much any warning condition or safety/correctness-related compiler feature; if it were followed consistently the compiler would just always trust the programmer, like an ancient C or C++ compiler with warnings turned off.

> The compiler isn't all-knowing, and there will always be cases where the user knows best (and did the conversion intentionally).

That's what explicit casts are for.

> An obvious one is:
>
> void foo(ubyte[] x)
> {
>    int len = x.length;
> }
>
> (let's assume 32-bit CPU)
>
> I'm assuming the compiler would complain about this, since technically, len could be negative! Disallowing such code or requiring a cast is probably too much.

But that *is* a bad idea - there have been real-world bugs caused by doing stuff like that without checking. With respect to your specific example:

1) The memory limit on a true 32-bit system is 4GiB, not 2GiB. Even with an OS that reserves some of the address space, as much as 3GiB or 3.5GiB may be exposed to a user-space process in practice.

2) Many 32-bit CPUs have Physical Address Extension, which allows them to support way more than 4GiB. Even a non-PAE-aware process will probably still be offered at least 3GiB on such a system.

3) Just because your program is 32-bit does *not* mean that it will only ever run on 32-bit CPUs. On a 64-bit system, a single 32-bit process could easily have access to ~3GiB of memory.

4) Even on an embedded system (which D doesn't really support right now, anyway) with a true 2GiB memory limit, you still have the problem that such highly platform-dependent code is difficult to find and update when the time comes to port the code to more powerful hardware.

These kinds of things are why D has fixed-size integer types: to encourage writing portable code, without too many invisible assumptions about the precise details of the execution environment.
Mar 21 2016
On 3/21/16 4:27 PM, tsbockman wrote:
> On Monday, 21 March 2016 at 17:38:35 UTC, Steven Schveighoffer wrote:
>> Your definition of when "implicit casting is really a bad idea" is almost certainly going to include cases where it really isn't a bad idea.
>
> This logic can be applied to pretty much any warning condition or safety/correctness-related compiler feature; if it were followed consistently the compiler would just always trust the programmer, like an ancient C or C++ compiler with warnings turned off.

Right, if we were starting over, I'd say let's make sure you can't make these kinds of mistakes. We are not starting over though, and existing code will have intentional uses of the existing behavior that are NOT bugs. Even that may have been rejected by Walter since a goal is making C code easy to port.

Note that we already have experience with such a thing: if(arr). Fixing is easy, just put if(arr.ptr). It was rejected because major users of this "feature" did not see any useful improvements -- all their usages were sound.

>> The compiler isn't all-knowing, and there will always be cases where the user knows best (and did the conversion intentionally).
>
> That's what explicit casts are for.

Existing code doesn't need to cast. People are lazy. I only would insert a cast if I needed to. Most valid code just works fine without casts, so you are going to flag lots of valid code as a nuisance.

>> An obvious one is:
>>
>> void foo(ubyte[] x)
>> {
>>    int len = x.length;
>> }
>>
>> (let's assume 32-bit CPU)
>>
>> I'm assuming the compiler would complain about this, since technically, len could be negative! Disallowing such code or requiring a cast is probably too much.
>
> But that *is* a bad idea - there have been real-world bugs caused by doing stuff like that without checking.

It depends on the situation. foo may know that x is going to be short enough to fit in an int. The question becomes, if in 99% of cases the user knows that he was converting to a signed value intentionally, and in the remaining 1% of cases, 99% of those were harmless "errors", then this is going to be just a nuisance update, and it will fail to be accepted.

> With respect to your specific example:
>
> 1) The memory limit on a true 32-bit system is 4GiB, not 2GiB. Even with an OS that reserves some of the address space, as much as 3GiB or 3.5GiB may be exposed to a user-space process in practice.

Then make it long len = x.length on a 64-bit system. The only reason I said to assume it's 32-bit is because on a 64-bit CPU, using int is already an error. The architecture wasn't important for the example.

-Steve
Mar 21 2016
On Monday, 21 March 2016 at 22:29:46 UTC, Steven Schveighoffer wrote:
> It depends on the situation. foo may know that x is going to be short enough to fit in an int. The question becomes, if in 99% of cases the user knows that he was converting to a signed value intentionally, and in the remaining 1% of cases, 99% of those were harmless "errors", then this is going to be just a nuisance update, and it will fail to be accepted.

My experimentation strongly suggests that your "99.99% false positive" figure is way, *way* off. This stuff is both:

1) Harder for people to get right than you think (you can't develop good intuition about the extent of the problem, unless you spend some time thoroughly auditing existing code bases specifically looking for this kind of problem), and also

2) Easier for the compiler to figure out than you think - I was really surprised at how short the list of problems flagged by the compiler was, when I tested Lionello Lunesu's work on the current D codebase.

The false positive rate would certainly be *much* lower than your outlandish 10,000 : 1 estimate, given a good compiler implementation.

>> With respect to your specific example:
>>
>> 1) The memory limit on a true 32-bit system is 4GiB, not 2GiB. Even with an OS that reserves some of the address space, as much as 3GiB or 3.5GiB may be exposed to a user-space process in practice.
>
> Then make it long len = x.length on a 64-bit system. The only reason I said to assume it's 32-bit is because on a 64-bit CPU, using int is already an error. The architecture wasn't important for the example.

Huh? The point of mine which you quoted applies specifically to 32-bit systems. 32-bit array lengths can be greater than `int.max`.
Mar 21 2016
On 3/21/16 7:43 PM, tsbockman wrote:
> On Monday, 21 March 2016 at 22:29:46 UTC, Steven Schveighoffer wrote:
>> It depends on the situation. foo may know that x is going to be short enough to fit in an int. The question becomes, if in 99% of cases the user knows that he was converting to a signed value intentionally, and in the remaining 1% of cases, 99% of those were harmless "errors", then this is going to be just a nuisance update, and it will fail to be accepted.
>
> My experimentation strongly suggests that your "99.99% false positive" figure is way, *way* off. This stuff is both:

Maybe. What would be a threshold that people would find acceptable?

> 1) Harder for people to get right than you think (you can't develop good intuition about the extent of the problem, unless you spend some time thoroughly auditing existing code bases specifically looking for this kind of problem), and also

It matters not to the person who is very aware of the issue and doesn't write buggy code. His code "breaks" too.

I would estimate that *most* uses of if(arr) in the wild were/are incorrect. However, in one particular user's code *0* were incorrect, even though he used it extensively. This kind of problem is what led to the change being reverted. I suspect this change would be far more likely to create more headaches than help.

> 2) Easier for the compiler to figure out than you think - I was really surprised at how short the list of problems flagged by the compiler was, when I tested Lionello Lunesu's work on the current D codebase.

This is highly subjective to whose code you use it on.

> The false positive rate would certainly be *much* lower than your outlandish 10,000 : 1 estimate, given a good compiler implementation.

I wouldn't say it's outlandish given my understanding of the problem. The question is, does the pain justify the update? I haven't run it against my code or any code really, but I can see how someone is very good at making correct uses of the implicit conversion.

> Huh? The point of mine which you quoted applies specifically to 32-bit systems. 32-bit array lengths can be greater than `int.max`.

You seem to spend a lot of time focusing on 32-bit architecture, which was not my point at all. My point is that most arrays and uses are short enough to be handled with a signed value as the length. If this is a generic library function, sure, we should handle all possibilities. This doesn't mean someone's command line utility processing strings from the argument list should have to worry about that (as an example).

Breaking perfectly good code is something we should strive against.

-Steve
Mar 21 2016
On Tuesday, 22 March 2016 at 00:18:54 UTC, Steven Schveighoffer wrote:
> I wouldn't say it's outlandish given my understanding of the problem. The question is, does the pain justify the update? I haven't run it against my code or any code really, but I can see how someone is very good at making correct uses of the implicit conversion.

Well that's the real problem here then, isn't it? I wouldn't want this stuff "fixed" either, if I thought false positives would outnumber useful warnings by 10,000 : 1. However, I already *know* that's not the case, from my own tests.

But at this point I'm obviously not going to convince you, except by compiling some concrete statistics on what got flagged in some real code bases. And this I plan to do (in some form or other), once `checkedint` and/or the fix for DMD issue 259 are really ready. People can make an informed decision about the trade-offs then.
Mar 21 2016
On Thursday, 17 March 2016 at 17:09:46 UTC, Steven Schveighoffer wrote:
> Converting unsigned to signed or vice versa (of the same size type) is safe. No information is lost.

Strictly speaking yes, but typically, an `int` isn't used as a bit-pattern but as an integer (it's in the name). Such behaviour is very undesirable for integers.

> It's the comparison between the two which confuses the heck out of people. I think we can solve 80% of the problems by just fixing that.

That's probably true, anyway.
Mar 18 2016