
digitalmars.D - Which D features to emphasize for academic review article

reply "TJB" <broughtj gmail.com> writes:
Hello D Users,

The Software Editor for the Journal of Applied Econometrics has 
agreed to let me write a review of the D programming language for 
econometricians (econometrics is where economic theory and 
statistical analysis meet).  I will have only about 6 pages.  I 
have an idea of what I am going to write about, but I thought I 
would ask here what features are most relevant (in your minds) to 
numerical programmers writing codes for statistical inference.

I look forward to your suggestions.

Thanks,

TJB
Aug 09 2012
next sibling parent reply "dsimcha" <dsimcha yahoo.com> writes:
Ok, so IIUC the audience is academic, but consists of people interested 
in using D as a means to an end, not computer scientists?  I use D 
for bioinformatics, which IIUC has similar requirements to 
econometrics.  From my point of view:

I'd emphasize the following:

Native efficiency.  (Important for large datasets and monte carlo 
simulations)

Garbage collection.  (Important because it makes it much easier 
to write non-trivial data structures that don't leak memory, and 
statistical analyses are a lot easier if the data is structured 
well.)

Ranges/std.range/builtin arrays and associative arrays.  (Again, 
these make data handling a pleasure.)

Templates.  (Makes it easier to write algorithms that aren't 
overly specialized to the data structure they operate on.  This 
can also be done with OO containers but requires more boilerplate 
and compromises on efficiency.)

Disclaimer:  These last two are things I'm the primary designer 
and implementer of.  I intentionally put them last so it doesn't 
look like a shameless plug.

std.parallelism  (Important because you can easily parallelize 
your simulation, etc.)

dstats  (https://github.com/dsimcha/dstats  Important because a 
lot of statistical analysis code is already implemented for you.  
It's admittedly very basic compared to e.g. R or Matlab, but it's 
also in many cases better integrated and more efficient.  I'd say 
that it has the 15% of the functionality that covers ~70% of use 
cases.  I welcome contributors to add more stuff to it.  I 
imagine economists would be interested in time series, which is 
currently a big area of missing functionality.)
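
To make the std.parallelism point above concrete, here is a minimal 
sketch (just an illustration, not dstats code) of a toy Monte Carlo 
estimate of pi whose trials are split across cores by taskPool.reduce:

    import std.algorithm, std.parallelism, std.random, std.range, std.stdio;

    void main()
    {
        enum trials = 10_000_000;

        // 1 if a random point lands inside the unit quarter circle, else 0
        static int hit(int i)
        {
            immutable x = uniform(0.0, 1.0);
            immutable y = uniform(0.0, 1.0);
            return x * x + y * y <= 1.0 ? 1 : 0;
        }

        // the map is evaluated in parallel by the default task pool
        immutable hits = taskPool.reduce!"a + b"(map!hit(iota(trials)));

        writeln("pi is roughly ", 4.0 * hits / trials);
    }

Each worker draws from its own thread-local rndGen, so no locking is 
needed around the random number generation.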
Aug 09 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/9/2012 10:40 AM, dsimcha wrote:
 I'd emphasize the following:
I'd like to add to that:

1. Proper support for 80 bit floating point types. Many compilers' 
libraries have inaccurate 80 bit math functions, or don't implement 80 
bit floats at all. 80 bit floats reduce the incidence of creeping 
roundoff error.

2. Support for SIMD vectors as native types.

3. Floating point values are default initialized to NaN.

4. Correct support for NaN and infinity values.

5. Correct support for unordered operations.

6. Array types do not degenerate into pointer types whenever passed to 
a function. In other words, array types know their dimension.

7. Array loop operations, i.e.:

    for (size_t i = 0; i < a.length; i++)
        a[i] = b[i] + c;

can be written as:

    a[] = b[] + c;

8. Global data is thread local by default, lessening the risk of 
unintentional unsynchronized sharing between threads.
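
A small D sketch (my own illustration) showing points 3, 6 and 7 in action:

    import std.stdio;

    void main()
    {
        double x;                      // point 3: default initialized to NaN
        writeln(x);                    // prints "nan"

        double[] b = [1.0, 2.0, 3.0];
        double[] a = new double[b.length];
        double c = 0.5;

        a[] = b[] + c;                 // point 7: array-wise loop operation
        writeln(a);                    // [1.5, 2.5, 3.5]

        writeln(b.length);             // point 6: arrays carry their length
    }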
Aug 09 2012
next sibling parent reply "F i L" <witte2008 gmail.com> writes:
Walter Bright wrote:
 3. Floating point values are default initialized to NaN.
C# handles this more conveniently with just as much 
optimization/debugging benefit (arguably more so, because it catches 
NaN issues at compile-time):

    class Foo
    {
        float x; // defaults to 0.0f

        void bar()
        {
            float y;        // doesn't default
            y ++;           // ERROR: use of unassigned local

            float z = 0.0f;
            z ++;           // OKAY
        }
    }

This is the same behavior for any local variable, so where in D you 
need to explicitly set variables to 'void' to avoid assignment costs, 
C# catches these mistakes before runtime.

Sorry, I'm not trying to derail this thread. I just think D has other, 
much better advertising points than this one.
Aug 10 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/10/2012 1:38 AM, F i L wrote:
 Walter Bright wrote:
 3. Floating point values are default initialized to NaN.
 C# handles this more conveniently with just as much 
 optimization/debugging benefit (arguably more so, because it catches NaN 
 issues at compile-time):

     class Foo
     {
         float x; // defaults to 0.0f

         void bar()
         {
             float y;        // doesn't default
             y ++;           // ERROR: use of unassigned local

             float z = 0.0f;
             z ++;           // OKAY
         }
     }

 This is the same behavior for any local variable,
It catches only a subset of these at compile time. I can craft any 
number of ways of getting it to miss diagnosing it. Consider this one:

    float z;
    if (condition1)
         z = 5;
    ... lotsa code ...
    if (condition2)
         z++;

To diagnose this correctly, the static analyzer would have to determine 
that condition1 produces the same result as condition2, or not. This is 
impossible to prove. So the static analyzer either gives up and lets it 
pass, or issues an incorrect diagnostic. So our intrepid programmer is 
forced to write:

    float z = 0;
    if (condition1)
         z = 5;
    ... lotsa code ...
    if (condition2)
         z++;

Now, as it may turn out, for your algorithm the value "0" is an 
out-of-range, incorrect value. Not a problem as it is a dead 
assignment, right?

But then the maintenance programmer comes along and changes condition1 
so it is not always the same as condition2, and now the z++ sees the 
invalid "0" value sometimes, and a silent bug is introduced.

This bug will not remain undetected with the default NaN initialization.
 so where in D you need to
 explicitly set variables to 'void' to avoid assignment costs,
This is incorrect, as the optimizer is perfectly capable of removing 
dead assignments like:

    f = nan;
    f = 0.0f;

The first assignment is optimized away.
 I just think D's has other, much better advertising points that this one.
Whether you agree with it being a good feature or not, it is a feature unique to D and merits discussion when talking about D's suitability for numerical programming.
Aug 10 2012
next sibling parent reply "F i L" <witte2008 gmail.com> writes:
Walter Bright wrote:
 It catches only a subset of these at compile time. I can craft 
 any number of ways of getting it to miss diagnosing it. 
 Consider this one:

     float z;
     if (condition1)
          z = 5;
     ... lotsa code ...
     if (condition2)
          z++;

 To diagnose this correctly, the static analyzer would have to 
 determine that condition1 produces the same result as 
 condition2, or not. This is impossible to prove. So the static 
 analyzer either gives up and lets it pass, or issues an 
 incorrect diagnostic. So our intrepid programmer is forced to 
 write:

     float z = 0;
     if (condition1)
          z = 5;
     ... lotsa code ...
     if (condition2)
          z++;
Yes, but that's not really an issue since the compiler informs the 
coder of its limitation. You're simply forced to initialize the 
variable in this situation.
 Now, as it may turn out, for your algorithm the value "0" is an 
 out-of-range, incorrect value. Not a problem as it is a dead 
 assignment, right?

 But then the maintenance programmer comes along and changes 
 condition1 so it is not always the same as condition2, and now 
 the z++ sees the invalid "0" value sometimes, and a silent bug 
 is introduced.

 This bug will not remain undetected with the default NaN 
 initialization.
I had a debate on here a few months ago about the merits of 
default-to-NaN and others brought up similar situations, but since we 
can write:

    float z = float.nan;
    ...

explicitly, this could be thought of as a debugging feature available 
to the programmer. The problem I've always had with defaulting to NaN 
is that it's inconsistent with integer types, and while there may be 
merit to the idea of defaulting all types to NaN/null, it's simply 
unavailable for half of the number spectrum. I can only speak for 
myself, but I much prefer consistency over anything else because it 
means there are fewer discrepancies I need to remember when hacking 
things together. It also steepens the learning curve.

More importantly, what we have now is code where bugs -- like the one 
you mentioned above -- are still possible with ints, but also easy to 
miss, since "the other number type" behaves differently and programmers 
may accidentally assume a NaN will propagate where it will not.
 This is incorrect, as the optimizer is perfectly capable of 
 removing dead assignments like:

    f = nan;
    f = 0.0f;

 The first assignment is optimized away.
I thought there was some optimization by avoiding assignment, but IDK enough about memory at that level. Now I'm confused as to the point of 'float x = void' type annotations. :-\
 Whether you agree with it being a good feature or not, it is a 
 feature unique to D and merits discussion when talking about 
 D's suitability for numerical programming.
True, and I misspoke by saying it wasn't a "selling point". I only meant to raise issue with a feature that has been more of an annoyance rather than a boon to me personally. That said, I also agree that this thread was the wrong place to raise issue with it.
Aug 10 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/10/2012 9:01 PM, F i L wrote:
 I had a debate on here a few months ago about the merits of default-to-NaN and
 others brought up similar situations. but since we can write:

      float z = float.nan;
      ...
That is a good solution, but in my experience programmers just throw in an =0, as it is simple and fast, and they don't normally think about NaN's.
 explicitly, then this could be thought of as a debugging feature available to
 the programmer. The problem I've always had with defaulting to NaN is that it's
 inconsistent with integer types, and while there may be merit to the idea of
 defaulting all types to NaN/Null, it's simply unavailable for half of the
number
 spectrum. I can only speak for myself, but I much prefer consistency over
 anything else because it means there's less discrepancies I need to remember
 when hacking things together. It also steepens the learning curve.
It's too bad that ints don't have a NaN value, but interestingly enough, valgrind does default initialize them to some internal NaN, making it a most excellent bug detector.
 More importantly, what we have now is code where bugs-- like the one you
 mentioned above --are still possible with Ints, but also easy to miss since
"the
 other number type" behaves differently and programmers may accidentally assume
a
 NaN will propagate where it will not.
Sadly, D has to map onto imperfect hardware :-(

We do have NaN values for chars (0xFF) and pointers (the vilified 
'null'). Think how many bugs the latter has exposed, and then think of 
all the floating point code with no such obvious indicator of bad 
initialization.
 I thought there was some optimization by avoiding assignment, but IDK enough
 about memory at that level. Now I'm confused as to the point of 'float x =
void'
 type annotations. :-\
It would be used where the static analysis is not able to detect that the initializer is dead.
Aug 10 2012
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/10/2012 9:32 PM, Walter Bright wrote:
 On 8/10/2012 9:01 PM, F i L wrote:
 I had a debate on here a few months ago about the merits of default-to-NaN and
 others brought up similar situations. but since we can write:

      float z = float.nan;
      ...
That is a good solution, but in my experience programmers just throw in an =0, as it is simple and fast, and they don't normally think about NaN's.
Let me amend that. I've never seen anyone use float.nan, or whatever NaN is in the language they were using. They always use =0. I doubt that yelling at them will change anything.
Aug 10 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
Walter Bright wrote:
 Sadly, D has to map onto imperfect hardware :-(

 We do have NaN values for chars (0xFF) and pointers (the 
 villified 'null'). Think how many bugs the latter has exposed, 
 and then think of all the floating point code with no such 
 obvious indicator of bad initialization.
Yes, if 'int' had a NaN state it would be great. (Though I remember 
hearing about some hardware that did support it... somewhere.)
Aug 10 2012
prev sibling next sibling parent reply "F i L" <witte2008 gmail.com> writes:
Walter Bright wrote:
 That is a good solution, but in my experience programmers just 
 throw in an =0, as it is simple and fast, and they don't 
 normally think about NaN's.
See! Programmers just want usable default values :-P
 It's too bad that ints don't have a NaN value, but 
 interestingly enough, valgrind does default initialize them to 
 some internal NaN, making it a most excellent bug detector.
I heard somewhere before there's actually an (Intel?) CPU which supports NaN ints... but maybe that's just hearsay.
 Sadly, D has to map onto imperfect hardware :-(

 We do have NaN values for chars (0xFF) and pointers (the 
 villified 'null'). Think how many bugs the latter has exposed, 
 and then think of all the floating point code with no such 
 obvious indicator of bad initialization.
Ya, but I don't think pointers/refs and floats are comparable, because 
one has copy semantics and the other does not. Conceptually, pointers 
are only references to data while numbers are actual data. It makes 
sense that they would default to different things. Though if int did 
have a NaN value, I'm not sure which way I would side on this issue. I 
still think I would prefer having some level of compile-time indication 
of my errors, simply because it saves time when you're making something.
 It would be used where the static analysis is not able to 
 detect that the initializer is dead.
Good to know.
 However, and I've seen this happen, people will satisfy the 
 compiler complaint by initializing the variable to any old 
 value (usually 0), because that value will never get used. 
 Later, after other things change in the code, that value 
 suddenly gets used, even though it may be an incorrect value 
 for the use.
Maybe the perfect solution is to have the compiler initialize the value 
to NaN, but also do a bit of static analysis and give a compiler error 
when it can determine your variable is being used before being 
assigned, for the sake of productivity.

In fact, for consistency, you could always enforce that (compiler 
error) rule on every local variable, so even ints would be required to 
have explicit initialization before use. I still prefer float class 
members to be defaulted to a usable value, for the sake of consistency 
with ints.
Aug 11 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/11/12 3:11 AM, F i L wrote:
 I still prefer float class members to be defaulted to a usable value,
 for the sake of consistency with ints.
Actually there's something that just happened two days ago to me that's 
relevant to this, particularly because it's in a different language 
(SQL) and different domain (Machine Learning).

I was working with an iterative algorithm implemented in SQL, which 
performs some aggregate computation on some 30 billion samples. The 
algorithm is rather intricate, and each iteration takes the previous 
one's result as input. Somehow at the end there were NaNs in the sample 
data I was looking at (there weren't supposed to be). So I started 
investigating; the NaNs could appear only in a rare data corruption 
case. And indeed, before long I found 4 (four) samples out of 30 
billion that were corrupt. After one iteration, there were 300K NaNs. 
After two iterations, a few millions. After four, 800M samples were 
messed up. NaNs did save the day.

Although this case is not about default values but about the result of 
a computation (in this case 0.0/0.0), I think it still reveals the 
usefulness of having a singular value in the floating point realm.


Andrei
Aug 11 2012
parent reply "F i L" <witte2008 gmail.com> writes:
Andrei Alexandrescu wrote:
 [ ... ]

 Although this case is not about default values but about the 
 result of a computation (in this case 0.0/0.0), I think it 
 still reveals the usefulness of having a singular value in the 
 floating point realm.
My argument was never against the usefulness of NaN for debugging... 
only that it should be considered a debugging feature and explicitly 
defined, rather than intruding on convenience and consistency (with 
int) by being the default. I completely agree NaNs are important for 
debugging floating point math; in fact D's default-to-NaN has caught a 
couple of my construction mistakes before.

The problem is that this sort of construction mistake is bigger than 
just floating point and NaN. You can mis-set a variable, float or not, 
or you can fail to set an int when you should have. So the question 
becomes not what benefit NaN is for debugging, but what a person's 
thought process is when creating/debugging code, and herein lies the 
heart of my qualm.

In D we have a bit of a conceptual double standard within the number 
community. I have to remember these rules when I'm creating something, 
not just when I'm debugging it. As often as D may have caught a 
construction mistake specifically related to floats in my code, 10x 
more so it's produced NaN's where I intended a number, because I forgot 
about the double standard when adding a field or creating a variable. A 
C++ guy might not think twice about this because he's used to having to 
default values all the time (IDK, I'm not that guy), but coming from a 
language like C# it's a paper-cut on someone's opinion of the language.
Aug 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/11/2012 12:33 PM, F i L wrote:
 In D we have a bit of a conceptual double standard within the
 number community. I have to remember these rules when I'm creating something,
 not just when I'm debugging it. As often as D may have caught a construction
 mistake specifically related to floats in my code, 10x more so it's produced
 NaN's where I intended a number, because I forgot about the double standard
when
 adding a field or creating a variable.
I'd rather have a 100 easy to find bugs than 1 unnoticed one that went out in the field.
 A C++ guy might not think twice about this because he's used to having to
 default values all the time (IDK, I'm not that guy),
Only if a default constructor is defined for the type, which it often 
is not; otherwise you'll get garbage for a default initialization.
Aug 11 2012
parent reply "F i L" <witte2008 gmail.com> writes:
Walter Bright wrote:
 I'd rather have a 100 easy to find bugs than 1 unnoticed one 
 that went out in the field.
That's just the thing, bugs are arguably easier to hunt down when 
things default to a consistent, usable value. When variables are 
defaulted to zero, I have a guarantee that any propagated NaN bug is 
_not_ coming from them (directly). With NaN defaults, I only have a 
guarantee that the value _might_ be coming from said variable. Then, I 
also have more to be aware of when searching through code, because my 
ints behave differently than my floats. Arguably, you always have to be 
aware of this, but at least with explicit sets to NaN, I know the 
potential culprits earlier (because they'll have a distinct assignment).

With static analysis warning against local-scope NaN issues, there's 
really only one situation where setting to NaN catches bugs, and that's 
when you want to guarantee that a member variable is specifically 
assigned a value (of some kind) during construction. This is a 
corner-case situation because:

1. It makes no guarantee about what value is actually assigned to the 
variable, only that it's set to something. Which means it's either 
forgotten in favor of an 'if' statement, or used in combination with one.

2. Because of its singular debugging potential, NaN safeguards are most 
often intentionally put in place (or in D's case, left in place). This 
is why I think such situations should require an explicit assignment to 
NaN.

The "100 easy bugs" you mentioned weren't actually "bugs", they were 
times I forgot floats defaulted _differently_. The 10 times where NaN 
caught legitimate bugs, I would have had to hunt down the mistake 
either way, and it was trivial to do regardless of the NaN. Even if it 
wasn't trivial, I could have very easily assigned NaN to questionable 
variables explicitly.
Aug 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/11/2012 3:01 PM, F i L wrote:
 Walter Bright wrote:
 I'd rather have a 100 easy to find bugs than 1 unnoticed one that went out in
 the field.
That's just the thing, bugs are arguably easier to hunt down when things default to a consistent, usable value.
Many, many programming bugs trace back to assumptions that floating point numbers act like ints. There's just no way to avoid knowing and understanding the differences.
 When variables are defaulted to Zero, I have a
 guarantee that any propagated NaN bug is _not_ coming from them (directly).
With
 NaN defaults, I only have a guarantee that the value _might_ be coming said
 variable.
I don't see why this is a bad thing. The fact is, with NaN you know 
there is a bug. With 0, you may never realize there is a problem. 
Andrei wrote me about the output of a program he is working on having 
billions of result values, and he noticed a few were NaNs, which he 
traced back to a bug. If the bug had set the float value to 0, there's 
no way he would have ever noticed the issue.

It's all about daubing bugs with day-glo orange paint so you know 
there's a problem. Painting them with camo is not the right solution.
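
A small D sketch (with a hypothetical Sample type, purely for 
illustration) of the effect being described: the forgotten field 
surfaces as NaN in the output instead of silently biasing the total 
toward zero.

    import std.math, std.stdio;

    struct Sample
    {
        double weight;        // default-initialized to NaN in D
        double value = 1.0;
    }

    void main()
    {
        Sample[] data = new Sample[4];
        foreach (ref s; data[0 .. 3])
            s.weight = 0.25;  // oops: data[3].weight never gets set

        double total = 0.0;
        foreach (s; data)
            total += s.weight * s.value;

        writeln(total);            // nan -- the bug is visible in the output
        writeln(isNaN(total));     // true
    }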
Aug 11 2012
next sibling parent reply "F i L" <witte2008 gmail.com> writes:
Walter Bright wrote:
 That's just the thing, bugs are arguably easier to hunt down 
 when things default
 to a consistent, usable value.
Many, many programming bugs trace back to assumptions that floating point numbers act like ints. There's just no way to avoid knowing and understanding the differences.
My point was that the majority of the time there wasn't a bug 
introduced. Meaning the code was written and functioned as expected 
after I initialized the value to 0. I was only expecting the value to 
behave like its 'int' relative (in initial value), but received a NaN 
in the output because I forgot to be explicit.
 I don't see why this is a bad thing. The fact is, with NaN you 
 know there is a bug. With 0, you may never realize there is a 
 problem. Andrei wrote me about the output of a program he is 
 working on having billions of result values, and he noticed a 
 few were NaNs, which he traced back to a bug. If the bug had 
 set the float value to 0, there's no way he would have ever 
 noticed the issue.

 It's all about daubing bugs with day-glo orange paint so you 
 know there's a problem. Painting them with camo is not the 
 right solution.
Yes, and this is an excellent argument for using NaN as a debugging 
practice in general, but I don't see anything in favor of defaulting to 
NaN. If you don't do some kind of check against code, especially with 
such large data sets, bugs of various kinds are going to go unchecked 
regardless. A bug where an initial data value was accidentally 
initialized to 0 (by a third party later on, for instance) could be 
just as easy to miss, or easier if you're expecting a NaN to appear. In 
fact, an explicit set to NaN might discourage a third party from 
assigning without first questioning the original intention. In this 
situation I imagine best practice would be to write:

    float dataValue = float.nan; // MUST BE NaN, DO NOT CHANGE!
                                 // set to NaN to ensure it is set.
Aug 11 2012
parent dennis luehring <dl.soluz gmx.net> writes:
Am 12.08.2012 02:43, schrieb F i L:
 Yes, and this is an excellent argument for using NaN as a
 debugging practice in general, but I don't see anything in favor
 of defaulting to NaN. If you don't do some kind of check against
 code, especially with such large data sets, bugs of various kinds
 are going to go unchecked regardless.
It makes absolutely no sense to have different initialization styles in 
debug and release - and according to Andrei's example: there are many 
situations where slow debug code isn't capable of reproducing the error 
in a human timespan - especially when working with million- or 
billion-element datasets (like I also do...)
Aug 11 2012
prev sibling parent reply Don Clugston <dac nospam.com> writes:
On 12/08/12 01:31, Walter Bright wrote:
 On 8/11/2012 3:01 PM, F i L wrote:
 Walter Bright wrote:
 I'd rather have a 100 easy to find bugs than 1 unnoticed one that
 went out in
 the field.
That's just the thing, bugs are arguably easier to hunt down when things default to a consistent, usable value.
Many, many programming bugs trace back to assumptions that floating point numbers act like ints. There's just no way to avoid knowing and understanding the differences.
Exactly. I have come to believe that there are very few algorithms 
originally designed for integers, which also work correctly for 
floating point.

Integer code nearly always assumes things like, x + 1 != x, x == x, 
(x + y) - y == x.

    for (y = x; y < x + 10; y = y + 1) { .... }

How many times does it loop?
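
A quick D illustration (my own example) of how the trip count of that 
loop depends entirely on the magnitude of x:

    import std.stdio;

    void main()
    {
        // With ints this loop would run exactly 10 times; with floats the
        // trip count depends on the magnitude of x.
        float x = 1e20f;
        int count = 0;
        for (float y = x; y < x + 10; y = y + 1)
        {
            ++count;
            if (count > 100) break;   // guard, in case it never terminates
        }
        writeln(count);   // 0 -- x + 10 rounds back to x, so the test fails at once

        // At x = 2.0f ^^ 24 the spacing between adjacent floats is already 2,
        // so y + 1 rounds back to y and the loop would never terminate.
    }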
Aug 13 2012
next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 13/08/12 11:11, Don Clugston wrote:
 Exactly. I have come to believe that there are very few algorithms originally
 designed for integers, which also work correctly for floating point.
    ////////
    import std.stdio;

    void main()
    {
        real x = 1.0/9.0;
        writefln("x = %.128g", x);
        writefln("9x = %.128g", 9.0*x);
    }
    ////////

... well, that doesn't work, does it? Looks like some sort of cheat in 
place to make sure that the successive division and multiplication will 
revert to the original number.
 Integer code nearly always assumes things like, x + 1 != x, x == x,
 (x + y) - y == x.
There's always good old "if(x==0)" :-)
Aug 13 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/13/2012 5:38 AM, Joseph Rushton Wakeling wrote:
 Looks like some sort of cheat in place to
 make sure that the successive division and multiplication will revert to the
 original number.
That's called "rounding". But rounding always implies some, small, error that can accumulate into being a very large error.
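
A tiny D illustration (my own example) of that accumulation:

    import std.stdio;

    // Repeatedly adding a value that cannot be represented exactly lets the
    // rounding error accumulate; a single operation does not.
    void main()
    {
        float sum = 0.0f;
        foreach (i; 0 .. 10_000_000)
            sum += 0.1f;

        writefln("accumulated: %.7g", sum);               // noticeably off from 1e6
        writefln("single op:   %.7g", 10_000_000 * 0.1f); // much closer
    }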
Aug 13 2012
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 13/08/12 20:04, Walter Bright wrote:
 That's called "rounding". But rounding always implies some, small, error that
 can accumulate into being a very large error.
Well, yes. I was just remarking on the choice of rounding and the 
motivation behind it. After all, you _could_ round it instead as

    x = 1.0/9.0 == 0.11111111111111 ... 111   [finite number of decimal places]

but then

    9*x == 0.999999999999 ... 9999   [i.e. doesn't multiply back to 1.0]

... and this is probably more likely to result in undesirable error 
than the other rounding scheme. (I think the calculator app on Windows 
used to have this behaviour some years back.)
Aug 13 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Don Clugston:

 I have come to believe that there are very few algorithms 
 originally designed for integers, which also work correctly for 
 floating point.
And JavaScript programs that use integers? Bye, bearophile
Aug 13 2012
prev sibling parent reply "TJB" <broughtj gmail.com> writes:
On Monday, 13 August 2012 at 10:11:06 UTC, Don Clugston wrote:

  ... I have come to believe that there are very few algorithms 
 originally designed for integers, which also work correctly for 
 floating point.

 Integer code nearly always assumes things like, x + 1 != x, x 
 == x,
 (x + y) - y == x.


 for (y = x; y < x + 10; y = y + 1) { .... }

 How many times does it loop?
Don, I would appreciate your thoughts on the issue of re-implementing numeric codes like BLAS and LAPACK in pure D to benefit from the many nice features listed in this discussion. Is it feasible? Worthwhile? Thanks, TJB
Aug 13 2012
parent Don Clugston <dac nospam.com> writes:
On 14/08/12 05:03, TJB wrote:
 On Monday, 13 August 2012 at 10:11:06 UTC, Don Clugston wrote:

  ... I have come to believe that there are very few algorithms
 originally designed for integers, which also work correctly for
 floating point.

 Integer code nearly always assumes things like, x + 1 != x, x == x,
 (x + y) - y == x.


 for (y = x; y < x + 10; y = y + 1) { .... }

 How many times does it loop?
Don, I would appreciate your thoughts on the issue of re-implementing numeric codes like BLAS and LAPACK in pure D to benefit from the many nice features listed in this discussion. Is it feasible? Worthwhile? Thanks, TJB
I found that when converting code for Special Functions from C to D, the code quality improved enormously. Having 'static if' and things like float.epsilon as built-ins makes a surprisingly large difference. It encourages correct code. (For example, it makes any use of magic numbers in the code look really ugly and wrong). Unit tests help too. That probably doesn't apply so much to LAPACK and BLAS, but it would be interesting to see how far we can get with the new SIMD support.
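
As a rough sketch (my own, not code from Phobos) of what 'static if' 
plus the built-in floating point properties buy you in generic numeric 
code:

    import std.math, std.stdio, std.traits;

    // Relative-equality check whose tolerance adapts to the floating point
    // type via the built-in .epsilon and .mant_dig properties.
    bool approxEqualRel(T)(T a, T b) if (isFloatingPoint!T)
    {
        static if (T.mant_dig >= 64)      // 80-bit real
            enum T tol = 8 * T.epsilon;
        else
            enum T tol = 16 * T.epsilon;

        return fabs(a - b) <= tol * fmax(fabs(a), fabs(b));
    }

    unittest
    {
        assert(approxEqualRel(1.0f, 1.0f + float.epsilon));
        assert(!approxEqualRel(1.0, 1.125));
    }

    void main()
    {
        writeln(approxEqualRel(0.1 + 0.2, 0.3));  // true, within tolerance
        writeln(0.1 + 0.2 == 0.3);                // false with exact ==
    }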
Aug 14 2012
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 11 August 2012 at 04:33:38 UTC, Walter Bright wrote:
 It's too bad that ints don't have a NaN value, but 
 interestingly enough, valgrind does default initialize them to 
 some internal NaN, making it a most excellent bug detector.
The compiler could always have flags specifying if variables were used, 
and if they are false they are as good as NaN. The only downside is a 
performance hit unless you mark it as a release binary. It really comes 
down to whether it's worth implementing or considered a big change 
(unless it's a flag you have to specially turn on).

Example:

    int a;
    writeln(a++); //compile-time error, or throws an exception at runtime
                  //(read access before being set)

internally translated as:

    int a;
    bool _a_is_used = false;

    if (!_a_is_used)
        throw new Exception("a not initialized before use!");
    //passing to functions will throw the exception,
    //unless the signature is 'out'
    writeln(a);
    ++a;
    _a_is_used = true;
 Sadly, D has to map onto imperfect hardware :-(
Not so much imperfect hardware, just the imperfect 'human' variable.
 We do have NaN values for chars (0xFF) and pointers (the 
 villified 'null'). Think how many bugs the latter has exposed, 
 and then think of all the floating point code with no such 
 obvious indicator of bad initialization.
Aug 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/11/2012 1:30 AM, Era Scarecrow wrote:
 On Saturday, 11 August 2012 at 04:33:38 UTC, Walter Bright wrote:
 It's too bad that ints don't have a NaN value, but interestingly enough,
 valgrind does default initialize them to some internal NaN, making it a most
 excellent bug detector.
The compiler could always have flags specifying if variables were used, and if they are false they are as good as NaN. Only downside is a performance hit unless you Mark it as a release binary. It really comes down to if it's worth implementing or considered a big change (unless it's a flag you have to specially turn on)
Not so easy. Suppose you pass a pointer to the variable to another function. Does that function set it?
Aug 11 2012
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 11 August 2012 at 09:26:42 UTC, Walter Bright wrote:
 On 8/11/2012 1:30 AM, Era Scarecrow wrote:
 The compiler could always have flags specifying if variables 
 were used, and if they are false they are as good as NaN. Only 
 downside is a performance hit unless you Mark it as a release 
 binary. It really comes down to if it's worth implementing or 
 considered a big change (unless it's a flag you have to 
 specially turn on)
Not so easy. Suppose you pass a pointer to the variable to another function. Does that function set it?
I suppose there could be a second hidden pointer/bool as part of calls, 
but then it's completely incompatible with any C calling convention, 
meaning that is probably out of the question. Either a) pointers are 
low-level enough, like casting, that it's all up to the programmer; or 
b) same as before: unless an 'out' parameter is specified, it would 
likely throw an exception at that point (since attempting to read/pass 
the address of an uninitialized variable is the same as accessing it 
directly). After all, having a false positive is better than not being 
involved at all, right?

Of course, with that in mind, specifying a variable to begin as void 
(uninitialized) could be its own form of initialization? (Meaning it 
wouldn't be checking those, even though they hold known garbage.)
Aug 11 2012
prev sibling parent reply "F i L" <witte2008 gmail.com> writes:
F i L wrote:
 Walter Bright wrote:
 It catches only a subset of these at compile time. I can craft 
 any number of ways of getting it to miss diagnosing it. 
 Consider this one:

    float z;
    if (condition1)
         z = 5;
    ... lotsa code ...
    if (condition2)
         z++;
 
 [...]
Yes, but that's not really an issue since the compiler informs the coder of it's limitation. You're simply forced to initialize the variable in this situation.
In C#, fields are defaulted to a usable value, and locals have to be 
explicitly set before they're used... so, expanding on your example 
above:

    float z;
    if (condition1)
        z = 5;
    else
        z = 6; // 'else' required
    ... lotsa code ...
    if (condition2)
        z++;

On the first condition, without an 'else z = ...', or if the condition 
was removed at a later time, then you'll get a compiler error and be 
forced to explicitly assign 'z' somewhere above. C# catches these 
issues at compile-time, whereas in D you need to:

    1. run the program
    2. get bad result
    3. hunt down bug

And you can still opt into NaN debugging where fields are initialized 
in a constructor:

    class Foo
    {
        float f = float.NaN; // Can't use 'f' unless Foo is
                             // properly constructed.
    }
Aug 10 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/10/2012 9:55 PM, F i L wrote:
 On the first condition, without an 'else z = ...', or if the condition was
 removed at a later time, then you'll get a compiler error and be forced to


 whereas in D you need to:

    1. run the program
    2. get bad result
    3. hunt down bug
However, and I've seen this happen, people will satisfy the compiler complaint by initializing the variable to any old value (usually 0), because that value will never get used. Later, after other things change in the code, that value suddenly gets used, even though it may be an incorrect value for the use.
Aug 10 2012
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Saturday, 11 August 2012 at 05:41:23 UTC, Walter Bright wrote:
 On 8/10/2012 9:55 PM, F i L wrote:
 On the first condition, without an 'else z = ...', or if the 
 condition was removed at a later time, then you'll get a 
 compiler error and be forced to explicitly assign 'z' 


 compile-time, whereas in D you need to:
   1. run the program
   2. get bad result
   3. hunt down bug
However, and I've seen this happen, people will satisfy the compiler complaint by initializing the variable to any old value (usually 0), because that value will never get used. Later, after other things change in the code, that value suddenly gets used, even though it may be an incorrect value for the use.
Note to Walter:

You're obviously correct that you can make an arbitrarily complex 
program to make it too difficult for the compiler to enforce definite 
assignment (except in a few corner cases).

What you seem to be missing is that the issue you're saying is correct 
in theory, but too much of a corner case in practice. C# programmers 
rarely run into the problem you're mentioning, and even when they do, 
they don't have nearly as much of a problem with fixing it as you seem 
to think.

The only reason you run into this sort of problem (assuming you do, and 
it's not just a theoretical discussion) is that you're in the C/C++ 
mindset, and using variables in the C/C++ fashion. In C#, you simply 
_wouldn't_ try to make things so complicated when coding, and you 
simply _wouldn't_ run into these problems the way you /think/ you 
would, as a C++ programmer.

Regardless, it looks to me like you two are arguing for two orthogonal 
issues:

F i L: The compiler should detect uninitialized variables.
Walter: The compiler should choose to initialize variables with NaN.

What I'm failing to understand is, why can't we have both?

1. Compiler _warns_ about "uninitialized variables" (or scalars, at 
least), unless something takes the address of the variable, in which 
case the compiler gives up trying to analyze it.
   Bonus points: Try to detect a couple of common cases (e.g. if/else) 
instead of giving up so easily.

2. In any case, the compiler initializes the variable with whatever 
default value Walter deems useful.

Then you get the best of both worlds:

1. You force the programmer to manually initialize the variable in most 
cases, forcing him to think about the default value. It's almost no 
trouble for the programmer.

2. In the cases where it's not possible, the language helps the 
programmer catch bugs.

Nothing lost, anyway; plenty to be gained.
Aug 14 2012
next sibling parent "Michal Minich" <michal.minich gmail.com> writes:
On Tuesday, 14 August 2012 at 10:31:30 UTC, Mehrdad wrote:
 Note to Walter:

 You're obviously correct that you can make an arbitrarily 
 complex program to make it too difficult for the compiler to 

 cases).

 What you seem to be missing is that the issue you're saying is 
 correct in theory, but too much of a corner case in practice.


 you're mentioning, and even when they do, they don't have 
 nearly as much of a problem with fixing it as you seem to think.
In C# it sometimes takes quite hairy code (nested if/foreach/try) to 
make sure all cases are handled when initializing a variable. 
Compilation errors can be simply dismissed by assigning a 'default' 
value to the variable at the beginning of the function, but that is 
generally sloppy programming and you lose the useful help of the 
compiler.

I would like to see C#'s definite assignment rules applied to D: 
http://msdn.microsoft.com/en-us/library/aa691172%28v=vs.71%29.aspx
Aug 14 2012
prev sibling next sibling parent Don Clugston <dac nospam.com> writes:
On 14/08/12 12:31, Mehrdad wrote:
 On Saturday, 11 August 2012 at 05:41:23 UTC, Walter Bright wrote:
 On 8/10/2012 9:55 PM, F i L wrote:
 On the first condition, without an 'else z = ...', or if the
 condition was removed at a later time, then you'll get a compiler
 error and be forced to explicitly assign 'z' somewhere above using

 catches these issues at compile-time, whereas in D you need to:
   1. run the program
   2. get bad result
   3. hunt down bug
However, and I've seen this happen, people will satisfy the compiler complaint by initializing the variable to any old value (usually 0), because that value will never get used. Later, after other things change in the code, that value suddenly gets used, even though it may be an incorrect value for the use.
Note to Walter: You're obviously correct that you can make an arbitrarily complex program to make it too difficult for the compiler to enforce What you seem to be missing is that the issue you're saying is correct in theory, but too much of a corner case in practice. mentioning, and even when they do, they don't have nearly as much of a problem with fixing it as you seem to think. The only reason you run into this sort of problem (assuming you do, and it's not just a theoretical discussion) is that you're in the C/C++ mindset, and using variables in the C/C++ fashion. simply _wouldn't_ try to make things so complicated when coding, and you simply _wouldn't_ run into these problems the way you /think/ you would, as a C++ programmer. Regardless, it looks to me like you two are arguing for two orthogonal issues: F i L: The compiler should detect uninitialized variables. Walter: The compiler should choose initialize variables with NaN. What I'm failing to understand is, why can't we have both? 1. Compiler _warns_ about "uninitialized variables" (or scalars, at address of the variable, in which case the compiler gives up trying to Bonus points: Try to detect a couple of common cases (e.g. if/else) instead of giving up so easily. 2. In any case, the compiler initializes the variable with whatever default value Walter deems useful. Then you get the best of both worlds: 1. You force the programmer to manually initialize the variable in most cases, forcing him to think about the default value. It's almost no trouble for 2. In the cases where it's not possible, the language helps the programmer catch bugs.
DMD detects uninitialized variables if you compile with -O. It's hard to implement the full Monty at the moment, because all that code is in the backend rather than the front-end.


Completely agree. I always thought the intention was that assigning to NaN was simply a way of catching the difficult cases that slip through compile-time checks. Which includes the situation where the compile-time checking isn't yet implemented at all. This is the first time I've heard the suggestion that it might never be implemented. The thing which is really bizarre though, is float.init. I don't know what the semantics of it are.
Aug 14 2012
prev sibling next sibling parent reply "F i L" <witte2008 gmail.com> writes:
Mehrdad wrote:
 Note to Walter:

 You're obviously correct that you can make an arbitrarily 
 complex program to make it too difficult for the compiler to 

 cases).
 
 [ ... ]
I think some here are mis-interpreting Walter's position concerning 
static analysis from our earlier conversation, so I'll share my 
impression of his thoughts.

I can't speak for Walter, of course, but I'm pretty sure that early on 
in our conversation he agreed that having the compiler catch 
local-scope initialization issues was a good idea, or at least, wasn't 
a bad one (again, correct me if I'm wrong). I doubt he would be averse 
to eventually having DMD perform this sort of static analysis to help 
developers, though I doubt it's a high priority for him.

The majority of the conversation after that was concerning struct/class 
field defaults:

    class Foo
    {
        float x; // I think this should be 0.0f
                 // Walter thinks it should be NaN
    }

In this situation static analysis can't help catch issues, and we're 
forced to rely on a default value of some kind. Both Walter and I have 
stated the reasoning behind our opinions previously, so I won't repeat 
it here.
Aug 14 2012
next sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Tue, 14 Aug 2012 16:32:25 +0200, F i L <witte2008 gmail.com> wrote:

    class Foo
    {
        float x; // I think this should be 0.0f
                 // Walter thinks it should be NaN
    }

 In this situation static analysis can't help catch issues, and we're  
 forced to rely on a default value of some kind.
Really? We can catch (or, should be able to) missing initialization of 
stuff with @disable this(), but not floats?

Classes have constructors, which lend themselves perfectly to doing 
exactly this (just pretend the member is a local variable).

Perhaps there are problems with structs without disabled default 
constructors, but even those are trivially solvable by requiring a 
default value at declaration time.

-- 
Simen
Aug 14 2012
parent reply "F i L" <witte2008 gmail.com> writes:
On Tuesday, 14 August 2012 at 14:46:30 UTC, Simen Kjaeraas wrote:
 On Tue, 14 Aug 2012 16:32:25 +0200, F i L <witte2008 gmail.com> 
 wrote:

   class Foo
   {
       float x; // I think this should be 0.0f
                // Walter thinks it should be NaN
   }

 In this situation static analysis can't help catch issues, and 
 we're forced to rely on a default value of some kind.
Really? We can catch (or, should be able to) missing initialization of stuff with disable this(), but not floats? Classes have constructors, which lend themselves perfectly to doing exactly this (just pretend the member is a local variable). Perhaps there are problems with structs without disabled default constructors, but even those are trivially solvable by requiring a default value at declaration time.
You know, I never actually thought about it much, but I think you're right. I guess the same rules could apply to type fields.
Aug 14 2012
next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Tuesday, 14 August 2012 at 15:24:30 UTC, F i L wrote:
 Really? We can catch (or, should be able to) missing 
 initialization of stuff with @disable this(), but not floats?

 Classes have constructors, which lend themselves perfectly to 
 doing exactly this (just pretend the member is a local 
 variable).

 Perhaps there are problems with structs without disabled 
 default constructors, but even those are trivially solvable by 
 requiring a default value at declaration time.
You know, I never actually thought about it much, but I think you're right. I guess the same rules could apply to type fields.
Mmmm... What if you added a command that has a file/local scope? 
Perhaps, following @disable this(), it could be @disable init; or 
@disable .init. This would only work for built-in types, and possibly 
structs with variables that aren't explicitly set with default values. 
It sorta already fits with what's there.

    @disable init; //global scope in file, like @safe.

    struct someCipher
    {
        @disable init; //local scope, in this case the whole struct.
        int[][] tables; //now gives compile-time error unless @disable this() used.
        ubyte[] key = [1,2,3,4]; //explicitly defined as a default
        this(ubyte[] k, int[][] t){key=k;tables=t;}
    }

    void myfun()
    {
        someCipher x; //compile time error since struct fails
                      //(But not at this line unless @disable this() used)
        someCipher y = someCipher([1,2,3,4], [[1,2],[1,2]]); //should work as expected.
    }
Aug 14 2012
prev sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 14 August 2012 at 15:24:30 UTC, F i L wrote:
 On Tuesday, 14 August 2012 at 14:46:30 UTC, Simen Kjaeraas 
 wrote:
 On Tue, 14 Aug 2012 16:32:25 +0200, F i L 
 <witte2008 gmail.com> wrote:

  class Foo
  {
      float x; // I think this should be 0.0f
               // Walter thinks it should be NaN
  }

 In this situation static analysis can't help catch issues, 
 and we're forced to rely on a default value of some kind.
Really? We can catch (or, should be able to) missing initialization of stuff with disable this(), but not floats? Classes have constructors, which lend themselves perfectly to doing exactly this (just pretend the member is a local variable). Perhaps there are problems with structs without disabled default constructors, but even those are trivially solvable by requiring a default value at declaration time.
You know, I never actually thought about it much, but I think you're right. I guess the same rules could apply to type fields.
:) We could do the same for structs and classes... what I said doesn't just apply to local variables.
Aug 14 2012
prev sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 14 August 2012 at 14:32:26 UTC, F i L wrote:
 Mehrdad wrote:
 Note to Walter:

 You're obviously correct that you can make an arbitrarily 
 complex program to make it too difficult for the compiler to 

 cases).
 
 [ ... ]
I think some here are mis-interpreting Walters position concerning static analysis from our earlier conversation, so I'll share my impression of his thoughts. I can't speak for Walter, of course, but I'm pretty sure that early on in our conversation he agreed that having the compiler catch local scope initialization issues was a good idea, or at least, wasn't a bad one (again, correct me if I'm wrong). I doubt he would be adverse to eventually having DMD perform this sort of static analysis to help developers, though I doubt it's a high priority for him.
Ah, well if he's for it, then I misunderstood. I read through the entire thread (but not too carefully, just 1 read) and my impression was that he didn't like the idea because it would fail in some cases (and because D doesn't seem to love emitting compiler warnings in general), but if he likes it, then great. :)
Aug 14 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/14/2012 3:31 AM, Mehrdad wrote:
 Then you get the best of both worlds:

 1. You force the programmer to manually initialize the variable in most cases,
 forcing him to think about the default value. It's almost no trouble for

 2. In the cases where it's not possible, the language helps the programmer
catch
 bugs.



As I've explained before, user defined types have "default constructors". If builtin types do not, then you've got a barrier to writing generic code. Default initialization also applies to static arrays, tuples, structs and dynamic allocation. It seems a large inconsistency to complain about them only for local variables of basic types, and not for any aggregate type or user defined type.


As for the 'rarity' of the error I mentioned, yes, it is unusual. The trouble is when it creeps unexpectedly into otherwise working code that has been working for a long time.
Aug 14 2012
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 14 August 2012 at 21:13:01 UTC, Walter Bright wrote:
 On 8/14/2012 3:31 AM, Mehrdad wrote:
 Then you get the best of both worlds:

 1. You force the programmer to manually initialize the 
 variable in most cases,
 forcing him to think about the default value. It's almost no 
 trouble for

 2. In the cases where it's not possible, the language helps 
 the programmer catch
 bugs.



As I've explained before, user defined types have "default constructors". If builtin types do not, then you've got a barrier to writing generic code.
Just because they _have_ a default constructor doesn't mean the compiler should implicitly _call_ them on your behalf.


Huh? I think you completely misread my post... I was talking about "definite assignment", i.e. the _lack_ of automatic initialization.
 As for the 'rarity' of the error I mentioned, yes, it is 
 unusual. The trouble is when it creeps unexpectedly into 
 otherwise working code that has been working for a long time.
It's no "trouble" in practice, that's what I'm trying to say. It only looks like "trouble" if you look at it from the C/C++
Aug 14 2012
next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 14 August 2012 at 21:22:14 UTC, Mehrdad wrote:

Typo, scratch Java, it's N/A for Java.
Aug 14 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/14/2012 2:22 PM, Mehrdad wrote:
 I was talking about "definite assignment", i.e. the _lack_ of automatic
 initialization.
I know. How does that fit in with default construction?
Aug 14 2012
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 14 August 2012 at 21:58:20 UTC, Walter Bright wrote:
 On 8/14/2012 2:22 PM, Mehrdad wrote:
 I was talking about "definite assignment", i.e. the _lack_ of 
 automatic initialization.
I know. How does that fit in with default construction?
They aren't called unless the user calls them.

    void Bar<T>(T value) { }

    void Foo<T>() where T : new()  // generic constraint for default constructor
    {
        T uninitialized;
        T initialized = new T();
        Bar(initialized);    // error
        Bar(uninitialized);  // OK
    }

    void Test()
    {
        Foo<int>();
        Foo<Object>();
    }

D could take a similar approach.
Aug 14 2012
next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 14 August 2012 at 22:57:26 UTC, Mehrdad wrote:
 		Bar(initialized);  // error
 		Bar(uninitialized);  // OK
Er, other way around I mean...
Aug 14 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/14/2012 3:57 PM, Mehrdad wrote:
 I know. How does that fit in with default construction?
They aren't called unless the user calls them.
I guess they aren't really default constructors, then <g>. So what happens when you allocate an array of them?
 D could take a similar approach.
It could, but default construction is better (!).
Aug 14 2012
parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 15 August 2012 at 00:32:43 UTC, Walter Bright wrote:
 On 8/14/2012 3:57 PM, Mehrdad wrote:
 I guess they aren't really default constructors, then <g>.
I say potayto, you say potahto... :P
 So what happens when you allocate an array of them?
For arrays, they're called automatically.

Well, OK, that's a bit of a simplification. It's what happens from the 
user perspective, not the compiler's (or runtime's).

Here's the full story of how C# does it. And please read it carefully, 
since I'm __not__ saying D should copy it:

- You can define a custom default constructor for classes, but not structs.
- Structs _always_ have a zero-initializing default (no-parameter) constructor.
- Therefore, there is no such thing as "copy construction"; it's bitwise-copied.
- Ctors for _structs_ MUST initialize every field (or call the default ctor).
- Ctors for _classes_ don't have this restriction.
- Since initialization is "cheap", the runtime _always_ does it, for _security_.
- The above^ is IRRELEVANT to the compiler!
  * It enforces initialization where it can.
  * It explicitly tells the runtime to auto-initialize when it can't.
  -- You can ONLY take the address of a variable in unsafe{} blocks.
  -- This implies you know what you're doing, so it's not a problem.

What D would do _ideally_, IMO:

1. Keep the ability to define default (no-args) and postblit constructors.
2. _Always_ force the programmer to initialize _all_ variables explicitly.
   * No, this is NOT what C++ does.
   * Yes, it is tested & DOES work well in practice. But NOT in the C++ mindset.
   * If the programmer _needs_ vars to be uninitialized, he can say = void.
   * If the programmer wants NaNs, he can just say = T.init. Bingo.

It should work pretty darn well, if you actually give it a try. (Don't 
believe me? Put it behind a compiler switch, and see how many people 
start using it, and how many of them [don't] complain about it!)
 D could take a similar approach.
It could, but default construction is better (!).
Well, that's so convincing, I'm left speechless!
Aug 14 2012
prev sibling next sibling parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Friday, 10 August 2012 at 22:01:46 UTC, Walter Bright wrote:
 It catches only a subset of these at compile time. I can craft 
 any number of ways of getting it to miss diagnosing it. 
 Consider this one:

     float z;
     if (condition1)
          z = 5;
     ... lotsa code ...
     if (condition2)
          z++;

 To diagnose this correctly, the static analyzer would have to 
 determine that condition1 produces the same result as 
 condition2, or not. This is impossible to prove. So the static 
 analyzer either gives up and lets it pass, or issues an 
 incorrect diagnostic. So our intrepid programmer is forced to 
 write:

     float z = 0;
     if (condition1)
          z = 5;
     ... lotsa code ...
     if (condition2)
          z++;

 Now, as it may turn out, for your algorithm the value "0" is an 
 out-of-range, incorrect value. Not a problem as it is a dead 
 assignment, right?

 But then the maintenance programmer comes along and changes 
 condition1 so it is not always the same as condition2, and now 
 the z++ sees the invalid "0" value sometimes, and a silent bug 
 is introduced.

 This bug will not remain undetected with the default NaN 
 initialization.
C#'s rule doesn't try to prove that the variable is NOT set and then 
emit an error. It tries to prove that the variable IS set, and if it 
can't prove that, it's an error.

It's not an incorrect diagnostic, it does exactly what it's supposed to 
do, and the programmer has to be explicit when one takes on the 
responsibility of initialization. I don't see anything wrong with this; 
the C# programmers I've talked to love it (I much prefer it too).

Leaving a local variable initially uninitialized (or rather, not 
explicitly initialized) is a good way to portray the intention that it 
will be set later; if your program compiles, your variable is 
guaranteed to be initialized later but before use. This is a useful 
guarantee when reading/maintaining code.

In D, on the other hand, it's possible to write D code like:

    for(size_t i; i < length; ++i)
    {
        ...
    }

And I've actually seen this kind of code a lot in the wild. It boggles 
my mind that you think that this code should be legal. I think it's 
lazy - the intention is not clear. Is the default initializer being 
intentionally relied on, or was it unintentional? I've seen both cases. 
The for-loop example is an extreme one for demonstrative purposes, most 
examples are less obvious.

Saying that most programmers will explicitly initialize floating point 
numbers to 0 instead of NaN when taking on initialization 
responsibility is a cop-out - float.init and float.nan are obviously 
the values you should be going for. The benefit is easy for programmers 
to understand, especially if they already understand why float.init is 
NaN. You say yelling at them probably won't help - why not? I 
personally use float.init/double.init etc. in my own code, and I'm sure 
other informed programmers do too. I can understand why people don't do 
it in, say, C, with NaN being less defined there afaik. D promotes NaN 
actively and programmers should be eager to leverage NaN explicitly too.

None of this would change anything for non-local variables - they all 
have a defined default initializer. Note also that the local-variable 
analysis is limited to the scope of a single function body, it does not 
do inter-procedural analysis. I think this would be a great thing for 
D, and I believe that all code this change breaks is actually broken to 
begin with.
Aug 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/11/2012 1:57 AM, Jakob Ovrum wrote:

 set and then emits an error. It tries to prove that the variable IS set, and if
 it can't prove that, it's an error.

 It's not an incorrect diagnostic, it does exactly what it's supposed to do
Of course it is doing what the language requires, but it is an incorrect diagnostic because a dead assignment is required. And being a dead assignment, it can lead to errors when the code is later modified, as I explained. I also dislike on aesthetic grounds meaningless code being required.
 In D, on the other hand, it's possible to write D code like:

 for(size_t i; i < length; ++i)
 {
      ...
 }

 And I've actually seen this kind of code a lot in the wild. It boggles my mind
 that you think that this code should be legal. I think it's lazy - the
intention
 is not clear. Is the default initializer being intentionally relied on, or was
 it unintentional? I've seen both cases. The for-loop example is an extreme one
 for demonstrative purposes, most examples are less obvious.
That perhaps is your experience with other languages (that do not default initialize) showing. I don't think that default initialization is so awful. In fact, C++ enables one to specify default initialization for user defined types. Are you against that, too?
 Saying that most programmers will explicitly initialize floating point numbers
 to 0 instead of NaN when taking on initialization responsibility is a cop-out -
You can certainly say it's a copout, but it's what I see them do. I've never seen them initialize to NaN, but I've seen the "just throw in a 0" many times.
 float.init and float.nan are obviously the values you should be going for. The
 benefit is easy for programmers to understand, especially if they already
 understand why float.init is NaN. You say yelling at them probably won't help -
 why not?
Because experience shows that even the yellers tend to do the short, convenient one rather than the longer, correct one. Bruce Eckel wrote an article about this years ago in reference to why Java exception specifications were a failure and actually caused people to write bad code, including those who knew better.
Aug 11 2012
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Saturday, 11 August 2012 at 09:40:39 UTC, Walter Bright wrote:
 On 8/11/2012 1:57 AM, Jakob Ovrum wrote:
 Because experience shows that even the yellers tend to do the 
 short, convenient one rather than the longer, correct one. 
 Bruce Eckel wrote an article about this years ago in reference 
 to why Java exception specifications were a failure and 
 actually caused people to write bad code, including those who 
 knew better.
I have to agree here.

I spend my work time between JVM and .NET based languages, and checked exceptions are on my top 5 list of what went wrong with Java. You see lots of

    try {
        ...
    } catch (Exception e) {
        e.printStackTrace();
    }

in enterprise code.

--
Paulo
Aug 11 2012
prev sibling parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Saturday, 11 August 2012 at 09:40:39 UTC, Walter Bright wrote:
 Of course it is doing what the language requires, but it is an 
 incorrect diagnostic because a dead assignment is required.

 And being a dead assignment, it can lead to errors when the 
 code is later modified, as I explained. I also dislike on 
 aesthetic grounds meaningless code being required.
It is not meaningless, it's declarative. The same resulting code as now would be generated, but it's easier for the maintainer to understand what's being meant.
 That perhaps is your experience with other languages (that do 
 not default initialize) showing. I don't think that default 
 initialization is so awful. In fact, C++ enables one to specify 
 default initialization for user defined types. Are you against 
 that, too?
No, because user-defined types can have explicitly initialized members. I do think that member fields relying on the default initializer are ambiguous and should be explicit, but flow analysis on aggregate members is not going to work in any current point. even though D is my personal favourite.
 You can certainly say it's a copout, but it's what I see them 
 do. I've never seen them initialize to NaN, but I've seen the 
 "just throw in a 0" many times.
Again, I agree with this - except the examples are not from D, and certainly not from the future D that is being proposed. I don't blame anyone for steering away from NaN in other C-style languages. I do, however, believe that D programmers are perfectly capable of doing the right thing if informed.

And let's face it - there's a lot that relies on education in D, like whether to receive a string parameter as const or immutable, and using scope on a subset of callback parameters. Both of these examples require more typing than the intuitive/straight-forward choice (always receive `string` and no `scope` on delegates), but informed D programmers still choose the more lengthy, correct version.

Consider `pure` member functions - turns out most of them are actually pure, because the implicit `this` parameter is allowed to be mutated and it's rare for a member function to mutate global state, yet we all strive to correctly decorate our methods `pure` when applicable.
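To make the `pure` point concrete, here is a tiny illustrative example (the struct and names are made up): a member function that mutates `this` but touches no global mutable state, so it can still carry the `pure` annotation:

struct Accumulator
{
    double total = 0;

    // "Weakly pure": it may mutate the implicit `this` parameter,
    // but it reads/writes no global mutable state, so `pure` is allowed
    // and strongly pure callers can still use it on their own locals.
    void add(double x) pure
    {
        total += x;
    }
}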
 Because experience shows that even the yellers tend to do the 
 short, convenient one rather than the longer, correct one. 
 Bruce Eckel wrote an article about this years ago in reference 
 to why Java exception specifications were a failure and 
 actually caused people to write bad code, including those who 
 knew better.
I don't think the comparison is fair. Compared to Java exception specifications, the difference between '0' and 'float.nan'/'float.init' is negligible, especially in generic functions when the desired initializer would typically be 'T.init'. Java exception specifications have widespread implications for the entire codebase, while the difference between '0' and 'float.nan' is constant and entirely a local improvement.
Aug 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/11/2012 7:30 AM, Jakob Ovrum wrote:
 On Saturday, 11 August 2012 at 09:40:39 UTC, Walter Bright wrote:
 Of course it is doing what the language requires, but it is an incorrect
 diagnostic because a dead assignment is required.

 And being a dead assignment, it can lead to errors when the code is later
 modified, as I explained. I also dislike on aesthetic grounds meaningless code
 being required.
 It is not meaningless, it's declarative. The same resulting code as now would be generated, but it's easier for the maintainer to understand what's being meant.
No, it is not easier to understand, because there's no way to determine if the intent is to:

1. initialize to a valid value -or-
2. initialize to get the compiler to stop complaining
 I do, however, believe that D programmers are perfectly capable of doing the
 right thing if informed.
Of course they are capable of it. But experience shows they simply don't.
 Consider `pure` member functions - turns out most of them are actually pure
 because the implicit `this` parameter is allowed to be mutated and it's rare
for
 a member function to mutate global state, yet we all strive to correctly
 decorate our methods `pure` when applicable.
A better design would be to have pure be the default and impure would require annotation. The same for const/immutable. Unfortunately, it's too late for that now. My fault.
 Java exception specifications have widespread implications for the entire
 codebase, while the difference between '0' and 'float.nan' is constant and
 entirely a local improvement.
I believe there's a lot more potential for success when you have a design where the easiest way is the correct way, and you've got to make some effort to do it wrong. Much of my attitude on that goes back to my experience at Boeing on designing things (yes, my boring Boeing anecdotes again), and Boeing's long experience with pilots and mechanics and what they actually do vs what they're trained to do. (And not only are these people professionals, not fools, but their lives depend on doing it right.) Over and over and over again, the easy way had better be the correct way. I could bore you even more with the aviation horror stories I heard that justified that attitude.
Aug 12 2012
next sibling parent simendsjo <simendsjo gmail.com> writes:
On Sun, 12 Aug 2012 12:38:47 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:
 On 8/11/2012 7:30 AM, Jakob Ovrum wrote:
 On Saturday, 11 August 2012 at 09:40:39 UTC, Walter Bright wrote:
 Consider `pure` member functions - turns out most of them are actually  
 pure
 because the implicit `this` parameter is allowed to be mutated and it's  
 rare for
 a member function to mutate global state, yet we all strive to correctly
 decorate our methods `pure` when applicable.
 A better design would be to have pure be the default and impure would require annotation. The same for const/immutable. Unfortunately, it's too late for that now. My fault.
I have thought that many times. The same with default non-null class references. I keep adding assert(someClass) everywhere.
Aug 12 2012
prev sibling next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Am 12.08.2012 12:38, schrieb Walter Bright:
 On 8/11/2012 7:30 AM, Jakob Ovrum wrote:
 Consider `pure` member functions - turns out most of them are actually pure
 because the implicit `this` parameter is allowed to be mutated and it's rare
for
 a member function to mutate global state, yet we all strive to correctly
 decorate our methods `pure` when applicable.
 A better design would be to have pure be the default and impure would require annotation. The same for const/immutable. Unfortunately, it's too late for that now. My fault.
it's never too late - put it back on the list for D 3 - please (and local variables are immutable by default - or something like that)
Aug 12 2012
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 12 August 2012 at 11:34:20 UTC, dennis luehring wrote:
 Am 12.08.2012 12:38, schrieb Walter Bright:
 A better design would be to have pure be the default and 
 impure would require annotation. The same for const/immutable. 
 Unfortunately, it's too late for that now. My fault.
 it's never too late - put it back on the list for D 3 - please (and local variables are immutable by default - or something like that)
Agreed. If it is only a signature change then it might have been possible to accept such a change; I'm sure it would simplify quite a bit of signatures and only complicate a few. Probably the default signatures to try and include are: pure and @safe (others offhand I can't think of).

Make a list of all the issues/mistakes that can be addressed in D3 (be it ten or fifteen years from now); who knows, maybe the future is just around the corner if there's a big enough reason for it. The largest reason not to make big changes is so people don't get fed up and quit (especially while still trying to write library code); that, and this is supposed to be the 'stable' D2 language right now, with language changes having to be weighed heavily.
Aug 12 2012
prev sibling next sibling parent "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Sunday, 12 August 2012 at 10:39:01 UTC, Walter Bright wrote:
 No, it is not easier to understand, because there's no way to 
 determine if the intent is to:

 1. initialize to a valid value -or-
 2. initialize to get the compiler to stop complaining
If there is an explicit initializer, it means that the intent is either of those two. The latter case is probably quite rare, and might suggest a problem with the code - if the compiler can't prove your variable to be initialized, then the programmer probably has to spend some time figuring out the real answer. Legitimate cases of the compiler being too conservative can be annotated with a comment to eliminate the ambiguity. The interesting part is that you can be sure that variables *without* initializers are guaranteed to be initialized at a later point, or the program won't compile. Without the guarantee, the default value could be intended as a valid initializer or there could be a bug in the program. The current situation is not bad, I just think the one that allows for catching more errors at compile-time is much, much better.
 Of course they are capable of it. But experience shows they 
 simply don't.
If they do it for contagious attributes like const, immutable and pure, I'm sure they'll do it for a simple fix like using explicit 'float.nan' in the rare case the compiler can't prove initialization before use.
 A better design would be to have pure be the default and impure 
 would require annotation. The same for const/immutable. 
 Unfortunately, it's too late for that now. My fault.
I agree, but on the flip side it was easier to port D1 code to D2 this way, and that might have saved D2 from even further alienation by some D1 users during its early stages. The most common complaints I remember from the IRC channel were complaints about const and immutable which was now forced on D programs to some degree due to string literals. This made some people really apprehensive about moving their code to D2, and I can imagine the fallout would be a lot worse if they had to annotate all their impure functions etc.
 I believe there's a lot more potential for success when you 
 have a design where the easiest way is the correct way, and 
 you've got to make some effort to do it wrong. Much of my 
 attitude on that goes back to my experience at Boeing on 
 designing things (yes, my boring Boeing anecdotes again), and 
 Boeing's long experience with pilots and mechanics and what 
 they actually do vs what they're trained to do. (And not only 
 are these people professionals, not fools, but their lives 
 depend on doing it right.)

 Over and over and over again, the easy way had better be the 
 correct way. I could bore you even more with the aviation 
 horror stories I heard that justified that attitude.
Problem is, we've pointed out the easy way has issues and is not necessarily correct.
Aug 12 2012
prev sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Sun, 12 Aug 2012 03:38:47 -0700, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 8/11/2012 7:30 AM, Jakob Ovrum wrote:
 On Saturday, 11 August 2012 at 09:40:39 UTC, Walter Bright wrote:
 Of course it is doing what the language requires, but it is an  
 incorrect
 diagnostic because a dead assignment is required.

 And being a dead assignment, it can lead to errors when the code is  
 later
 modified, as I explained. I also dislike on aesthetic grounds  
 meaningless code
 being required.
 It is not meaningless, it's declarative. The same resulting code as now would be generated, but it's easier for the maintainer to understand what's being meant.
 No, it is not easier to understand, because there's no way to determine
 if the intent is to:

 1. initialize to a valid value -or-
 2. initialize to get the compiler to stop complaining
 I do, however, believe that D programmers are perfectly capable of  
 doing the
 right thing if informed.
 Of course they are capable of it. But experience shows they simply don't.
 Consider `pure` member functions - turns out most of them are actually  
 pure
 because the implicit `this` parameter is allowed to be mutated and it's  
 rare for
 a member function to mutate global state, yet we all strive to correctly
 decorate our methods `pure` when applicable.
 A better design would be to have pure be the default and impure would require annotation. The same for const/immutable. Unfortunately, it's too late for that now. My fault.
 Java exception specifications have widespread implications for the  
 entire
 codebase, while the difference between '0' and 'float.nan' is constant  
 and
 entirely a local improvement.
 I believe there's a lot more potential for success when you have a
 design where the easiest way is the correct way, and you've got to make
 some effort to do it wrong. Much of my attitude on that goes back to my
 experience at Boeing on designing things (yes, my boring Boeing
 anecdotes again), and Boeing's long experience with pilots and mechanics
 and what they actually do vs what they're trained to do. (And not only
 are these people professionals, not fools, but their lives depend on
 doing it right.)

 Over and over and over again, the easy way had better be the correct
 way. I could bore you even more with the aviation horror stories I heard
 that justified that attitude.
As a pilot, I completely agree!

--
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
Aug 12 2012
prev sibling parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 08/10/2012 06:01 PM, Walter Bright wrote:
 On 8/10/2012 1:38 AM, F i L wrote:
 Walter Bright wrote:
 3. Floating point values are default initialized to NaN.
 with just as much optimization/debugging benefit (arguably more so,
 because it catches NaN

     class Foo {
         float x; // defaults to 0.0f

         void bar() {
             float y; // doesn't default
             y ++; // ERROR: use of unassigned local

             float z = 0.0f;
             z ++; // OKAY
         }
     }

 This is the same behavior for any local variable,
 It catches only a subset of these at compile time. I can craft any
 number of ways of getting it to miss diagnosing it. Consider this one:

     float z;
     if (condition1)
          z = 5;
     ... lotsa code ...
     if (condition2)
          z++;

 To diagnose this correctly, the static analyzer would have to determine
 that condition1 produces the same result as condition2, or not. This is
 impossible to prove. So the static analyzer either gives up and lets it
 pass, or issues an incorrect diagnostic. So our intrepid programmer is
 forced to write:

     float z = 0;
     if (condition1)
          z = 5;
     ... lotsa code ...
     if (condition2)
          z++;

 Now, as it may turn out, for your algorithm the value "0" is an
 out-of-range, incorrect value. Not a problem as it is a dead assignment,
 right?

 But then the maintenance programmer comes along and changes condition1
 so it is not always the same as condition2, and now the z++ sees the
 invalid "0" value sometimes, and a silent bug is introduced.

 This bug will not remain undetected with the default NaN initialization.
To address the concern of static analysis being too hard: I wish we could have it but limit the amount of static analysis that's done. Something like this: the compiler will test branches of if-else statements and switch-case statements, but it will not drop into function calls with ref parameters nor will it accept initialization in looping constructs (foreach, for, while, etc). A compiler is an incorrect implementation if it implements /too much/ static analysis.

The example code you give can be implemented with such limited static analysis:

void lotsaCode()
{
    ... lotsa code ...
}

float z;
if ( condition1 )
{
    z = 5;
    lotsaCode();
    z++;
}
else
{
    lotsaCode();
}

I will, in advance, concede that this does not prevent people from just writing "float z = 0;".

In my dream-world the compiler recognizes a set of common mistake-inducing patterns like the one you mentioned and then prints helpful error messages suggesting alternative design patterns. That way, bugs are prevented and users become better programmers.
Aug 11 2012
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 11 August 2012 at 23:49:18 UTC, Chad J wrote:
 On 08/10/2012 06:01 PM, Walter Bright wrote:
 It catches only a subset of these at compile time. I can craft 
 any number of ways of getting it to miss diagnosing it. 
 Consider this one:

 float z;
 if (condition1)
 z = 5;
 ... lotsa code ...
 if (condition2)
 z++;

 To diagnose this correctly, the static analyzer would have to 
 determine that condition1 produces the same result as 
 condition2, or not. This is impossible to prove. So the static 
 analyzer either gives up and lets it pass, or issues an 
 incorrect diagnostic. So our intrepid programmer is forced to 
 write:

 float z = 0;
 if (condition1)
 z = 5;
 ... lotsa code ...
 if (condition2)
 z++;

 Now, as it may turn out, for your algorithm the value "0" is 
 an out-of-range, incorrect value. Not a problem as it is a 
 dead assignment, right?

 But then the maintenance programmer comes along and changes 
 condition1 so it is not always the same as condition2, and now 
 the z++ sees the invalid "0" value sometimes, and a silent bug 
 is introduced.

 This bug will not remain undetected with the default NaN 
 initialization.
Let's keep in mind every one of these truths:

1) Programmers are lazy; if you can get away with not initializing something then you'll avoid it. In C I've failed to initialize variables many times until a bug crops up, and it's difficult to find sometimes, where a NaN or equivalent would have quickly cropped them out before running with any real data.

2) There are a lot of inexperienced programmers. I worked for a company for a short period of time that did minimal training on a language like Java, where I ended up being seen as an utter genius (compared to even the teachers).

3) Bugs in a large environment and/or scenarios are far more difficult if not impossible to debug. I've made a program that handles merging of various dialogs (using double linked-like lists); I can debug them if there are 100 or fewer to work with, but after 100 (and often it's tens of thousands) it can become such a pain, based on its indirection and how the original structure was built, that I refuse based on difficulty vs end results (plus sanity).

We also need to sometimes laugh at our mistakes, and learn from others. I'll recommend everyone read from rinkworks a bit if you have the time and refresh yourselves.

http://www.rinkworks.com/stupid/cs_programming.shtml
Aug 11 2012
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
F i L:

 Walter Bright wrote:
 3. Floating point values are default initialized to NaN.
conveniently
An alternative possibility is to:

1) Default initialize variables just as currently done in D, with 0s, NaNs, etc;
2) Where the compiler is certain a variable is read before any possible initialization, it generates a compile-time error;
3) Warnings for unused variables and unused last assignments.

Where the compiler is not sure, not able to tell, or sees there is one or more paths where the variable is initialized, it gives no errors, and eventually the code will use the default initialized values, as currently done in D.

The D compiler is already doing this a little, if you compile this with -O:

class Foo {
    void bar() {}
}
void main() {
    Foo f;
    f.bar();
}

You get at compile-time:

temp.d(6): Error: null dereference in function _Dmain

A side effect of those rules is that this code doesn't compile, and similarly a lot of current D code:

class Foo {}
void main() {
    Foo f;
    assert(f is null);
}

Bye,
bearophile
Aug 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/11/2012 2:41 PM, bearophile wrote:
 2) Where the compiler is certain a variable is read before any possible
 initialization, it generates a compile-time error;
This has been suggested repeatedly, but it is in utter conflict with the whole notion of default initialization, which nobody complains about for user-defined types.
Aug 11 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/11/12 7:33 PM, Walter Bright wrote:
[snip]

Allow me to insert an opinion here. This post illustrates quite well how 
opinionated our community is (for better or worse).

The OP has asked a topical question in a matter that is interesting and 
also may influence the impact of the language to the larger community. 
Before long the thread has evolved into the familiar pattern of a debate 
over a minor issue on which reasonable people may disagree and that's 
unlikely to change. We should instead do our best to give a balanced 
high-level view of what D offers for econometrics.

To the OP - here are a few aspects that may deserve interest:

* Modeling power - from what I understand econometrics is 
modeling-heavy, which is more difficult to address in languages such as 
Fortran, C, C++, Java, Python, or the likes of Matlab.

* Efficiency - D generates native code for floating point operations and 
has control over data layout and allocation. Speed of generated code is 
dependent on the compiler, and the reference compiler (dmd) does a 
poorer job at it than the gnu-based compiler (gdc).

* Convenience - D is designed to "do what you mean" wherever possible 
and simplify common programming tasks, numeric or not. That makes the 
language comfortable to use even by a non-specialist, in particular in 
conjunction with appropriate libraries.

A few minuses I can think of:

- Maturity and availability of numeric and econometrics libraries is an 
obvious issue. There are some libraries (e.g. 
https://github.com/kyllingstad/scid/wiki) maintained and extended 
through volunteer effort.

- The language's superior modeling power and level of control comes at 
an increase in complexity compared to languages such as e.g. Python. So 
the statistician would need a larger upfront investment in order to reap 
the associated benefits.


Andrei
Aug 11 2012
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 - The language's superior modeling power and level of control 
 comes at an increase in complexity compared to languages such 
 as e.g. Python. So the statistician would need a larger upfront 
 investment in order to reap the associated benefits.
Statisticians often use the R language (http://en.wikipedia.org/wiki/R_language). Python contains much more "computer science" and CS complexity compared to R. Not just advanced stuff like coroutines, metaclasses, decorators, Abstract Base Classes, operator overloading, and so on, but even simpler things, like generators, standard library collections like heaps and deques, and so on. For some statisticians I've seen, even several parts of Python are too hard to use or understand. I have rewritten several of their Python scripts.

Bye,
bearophile
Aug 11 2012
parent reply "dsimcha" <dsimcha yahoo.com> writes:
On Sunday, 12 August 2012 at 03:30:24 UTC, bearophile wrote:
 Andrei Alexandrescu:

 - The language's superior modeling power and level of control 
 comes at an increase in complexity compared to languages such 
 as e.g. Python. So the statistician would need a larger 
 upfront investment in order to reap the associated benefits.
Statistician often use the R language (http://en.wikipedia.org/wiki/R_language ). Python contains much more "computer science" and CS complexity compared to R. Not just advanced stuff like coroutines, metaclasses, decorators, Abstract Base Classes, operator overloading, and so on, but even simpler things, like generators, standard library collections like heaps and deques, and so on. For some statisticians I've seen, even several parts of Python are too much hard to use or understand. I have rewritten several of their Python scripts. Bye, bearophile
For people with more advanced CS/programming knowledge, though, this is an advantage of D. I find Matlab and R incredibly frustrating to use for anything but very standard matrix/statistics computations on data that's already structured the way I like it. This is mostly because the standard CS concepts you mention are at best awkward and at worst impossible to express and, being aware of them, I naturally want to take advantage of them. Using Matlab or R feels like being forced to program with half the tools in my toolbox either missing or awkwardly misshapen, so I avoid it whenever practical.

(Actually, languages like C and Java that don't have much modeling power feel the same way to me now that I've primarily used D and to a lesser extent Python for the past few years. Ironically, these are the languages that are easy to integrate with R and Matlab respectively. Do most serious programmers who work in problem domains relevant to Matlab and R feel this way or is it just me?)

This was my motivation for writing Dstats and mentoring Cristi's fork of SciD. D's modeling power is so outstanding that I was able to replace R and Matlab for a lot of use cases with plain old libraries written in D.
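As a small taste of what that looks like, even plain Phobos gets you basic descriptive statistics out of a couple of range expressions (a sketch only - this is not dstats code, and the numbers are made up):

import std.algorithm, std.math, std.stdio;

void main()
{
    auto data = [1.2, 3.4, 2.2, 5.1, 4.8, 2.9];

    immutable n    = cast(double) data.length;
    immutable mean = reduce!"a + b"(0.0, data) / n;
    // Sample variance via a lazy map over squared deviations.
    immutable var  = reduce!"a + b"(0.0, data.map!(x => (x - mean) ^^ 2)) / (n - 1);

    writefln("mean = %s, sd = %s", mean, sqrt(var));
}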
Aug 12 2012
next sibling parent "TJB" <broughtj gmail.com> writes:
On Sunday, 12 August 2012 at 17:22:21 UTC, dsimcha wrote:

 ...  I find Matlab and R incredibly frustrating to use for 
 anything but very standard matrix/statistics computations on 
 data that's already structured the way I like it.
This is exactly how I feel, and why I am turning to D. My data sets are huge (64 TB for just a few years of data), my econometric methods are computationally intensive, and the limitations of Matlab and R almost instantly become constraining.
 Using Matlab or R feels like being forced to program with half 
 the tools in my toolbox either missing or awkwardly misshapen, 
 so I avoid it whenever practical.  Actually, languages like C 
 and Java that don't have much modeling power feel the same way 
 to me ...
Very well put - it expresses my feeling precisely. And C++ is such a complicated beast that I feel caught in between. I'd been dreaming of a language that offers modeling power as well as efficiency.
 ...  Do most serious programmers who work in problem domains 
 relevant to Matlab and R feel this way or is it just me?.
I certainly feel the same. I only use them when I have to or for very simple prototyping.
 This was my motivation for writing Dstats and mentoring 
 Cristi's fork of SciD.  D's modeling power is so outstanding 
 that I was able to replace R and Matlab for a lot of use cases 
 with plain old libraries written in D.
Thanks for your work on these packages! I will for sure be including them in my write up. I think they offer great possibilities for econometrics in D. TJB
Aug 12 2012
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 12/08/12 18:22, dsimcha wrote:
 For people with more advanced CS/programming knowledge, though, this is an
 advantage of D.  I find Matlab and R incredibly frustrating to use for anything
 but very standard matrix/statistics computations on data that's already
 structured the way I like it.  This is mostly because the standard CS concepts
 you mention are at best awkward and at worst impossible to express and, being
 aware of them, I naturally want to take advantage of them.
The main use-case and advantage of both R and MATLAB/Octave seems to me to be the plotting functionality -- I've seen some exceptionally beautiful stuff done with R in particular, although I've not personally explored its capabilities too far. The annoyance of R in particular is the impenetrable thicket of dependencies that can arise among contributed packages; it feels very much like some are thrown over the wall and then built on without much concern for organization. :-(
Aug 12 2012
parent "dsimcha" <dsimcha yahoo.com> writes:
On Monday, 13 August 2012 at 01:52:28 UTC, Joseph Rushton 
Wakeling wrote:
 The main use-case and advantage of both R and MATLAB/Octave 
 seems to me to be the plotting functionality -- I've seen some 
 exceptionally beautiful stuff done with R in particular, 
 although I've not personally explored its capabilities too far.

 The annoyance of R in particular is the impenetrable thicket of 
 dependencies that can arise among contributed packages; it 
 feels very much like some are thrown over the wall and then 
 built on without much concern for organization. :-(
I've addressed that, too :).

https://github.com/dsimcha/Plot2kill

Obviously this is a one-man project without nearly the same number of features that R and Matlab have, but like Dstats and SciD, it has probably the 20% of functionality that handles 80% of use cases. I've used it for the figures in scientific articles that I've submitted for publication and in my Ph.D. proposal and dissertation.

Unlike SciD and Dstats, Plot2kill doesn't highlight D's modeling capabilities that much, but it does get the job done for simple 2D plots.
Aug 12 2012
prev sibling next sibling parent reply "TJB" <broughtj gmail.com> writes:
On Sunday, 12 August 2012 at 02:28:44 UTC, Andrei Alexandrescu 
wrote:
 On 8/11/12 7:33 PM, Walter Bright wrote:
 [snip]

 Allow me to insert an opinion here. This post illustrates quite 
 well how opinionated our community is (for better or worse).

 The OP has asked a topical question in a matter that is 
 interesting and also may influence the impact of the language 
 to the larger community. Before long the thread has evolved 
 into the familiar pattern of a debate over a minor issue on 
 which reasonable people may disagree and that's unlikely to 
 change. We should instead do our best to give a balanced 
 high-level view of what D offers for econometrics.

 To the OP - here are a few aspects that may deserve interest:

 * Modeling power - from what I understand econometrics is 
 modeling-heavy, which is more difficult to address in languages 
 such as Fortran, C, C++, Java, Python, or the likes of Matlab.

 * Efficiency - D generates native code for floating point 
 operations and has control over data layout and allocation. 
 Speed of generated code is dependent on the compiler, and the 
 reference compiler (dmd) does a poorer job at it than the 
 gnu-based compiler (gdc) compiler.

 * Convenience - D is designed to "do what you mean" wherever 
 possible and simplify common programming tasks, numeric or not. 
 That makes the language comfortable to use even by a 
 non-specialist, in particular in conjunction with appropriate 
 libraries.

 A few minuses I can think of:

 - Maturity and availability of numeric and econometrics library 
 is an obvious issue. There are some libraries (e.g. 
 https://github.com/kyllingstad/scid/wiki) maintained and 
 extended through volunteer effort.

 - The language's superior modeling power and level of control 
 comes at an increase in complexity compared to languages such 
 as e.g. Python. So the statistician would need a larger upfront 
 investment in order to reap the associated benefits.


 Andrei
Andrei,

Thanks for bringing this back to the original topic and for your thoughts.

Indeed, a lot of econometricians are using MATLAB, R, Gauss, Ox and the like. But there are a number of econometricians who need the raw power of a natively compiled language (especially financial econometricians whose data are huge) who typically program in either Fortran or C/C++. It is really this group that I am trying to reach. I think D has a lot to offer this group in terms of programmer productivity and reliability of code. I think this applies to statisticians as well, as I see a lot of them in this latter group too.

I also want to reach the MATLABers because I think they can get a lot more modeling power (I like how you put that) without too much more difficulty (see Ox - nearly as complicated as C++ but without the power). Many MATLAB and R programmers end up recoding a good part of their algorithms in C++ and calling that code from the interpreted language. I have always found this kind of mixed language programming to be messy, time consuming, and error prone. Special tools are cropping up to handle this (see Rcpp). This just proves to me the usefulness of a productive AND powerful language like D for econometricians!

I am sensitive to the drawbacks you mention (especially lack of numeric libraries). I am so sick of wasting my time in C++ though that I have almost decided to just start writing my own econometric library in D. Earlier in this thread there was a discussion of extended precision in D and I mentioned the need to recode things like BLAS and LAPACK in D. Templates in D seem perfect for this problem. As an expert in template meta-programming, what are your thoughts? How is this different than what is being done in SciD? It seems they are mostly concerned about wrapping the old CBLAS and CLAPACK libraries.

Again, thanks for your thoughts and your TDPL book. Probably the best programming book I've ever read!

TJB
Aug 11 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/12/12 12:52 AM, TJB wrote:
 Thanks for bringing this back to the original topic and for your thoughts.

 Indeed, a lot of econometricians are using MATLAB, R, Guass, Ox and the
 like. But there are a number of econometricians who need the raw power
 of a natively compiled language (especially financial econometricians
 whose data are huge) who typically program in either Fortran or C/C++.
 It is really this group that I am trying to reach. I think D has a lot
 to offer this group in terms of programmer productivity and reliability
 of code. I think this applies to statisticians as well, as I see a lot
 of them in this latter group too.

 I also want to reach the MATLABers because I think they can get a lot
 more modeling power (I like how you put that) without too much more
 difficulty (see Ox - nearly as complicated as C++ but without the
 power). Many MATLAB and R programmers end up recoding a good part of
 their algorithms in C++ and calling that code from the interpreted
 language. I have always found this kind of mixed language programming to
 be messy, time consuming, and error prone. Special tools are cropping up
 to handle this (see Rcpp). This just proves to me the usefulness of a
 productive AND powerful language like D for econometricians!
I think this is a great angle. In our lab when I was a grad student in NLP/ML there was also a very annoying trend going on: people would start with Perl for text preprocessing and Matlab for math, and then, after the proof of concept, would need to recode most parts in C++. (I recall hearing complaints about large overheads in Matlab caused by eager copy semantics, is that true?)
 I am sensitive to the drawbacks you mention (especially lack of numeric
 libraries). I am so sick of wasting my time in C++ though that I have
 almost decided to just start writing my own econometric library in D.
 Earlier in this thread there was a discussion of extended precision in D
 and I mentioned the need to recode things like BLAS and LAPACK in D.
 Templates in D seem perfect for this problem. As an expert in template
 meta-programming what are your thoughts? How is this different than what
 is being done in SciD? It seems they are mostly concerned about wrapping
 the old CBLAS and CLAPACK libraries.
There's a large body of experience and many optimizations accumulated in these libraries, which are worth exploiting. The remaining matter is offering a convenient shell. I think Cristi's work on SciD goes that direction. Andrei
Aug 12 2012
parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 (I recall hearing complaints about large overheads in Matlab 
 caused by eager copy semantics, is that true?)
In Matlab there is COW: http://www.matlabtips.com/copy-on-write-in-subfunctions/ Bye, bearophile
Aug 12 2012
prev sibling parent reply "F i L" <witte2008 gmail.com> writes:
Andrei Alexandrescu wrote:
 * Efficiency - D generates native code for floating point 
 operations and has control over data layout and allocation. 
 Speed of generated code is dependent on the compiler, and the 
 reference compiler (dmd) does a poorer job at it than the 
 gnu-based compiler (gdc) compiler.
I'd like to add to this. Right now I'm reworking some libraries to include SIMD support using DMD on Linux 64bit. A simple benchmark between DMD and GCC of 2 million SIMD vector additions/subtractions actually runs faster with my DMD D code than the GCC C code. Only by ~0.8 ms, and that could be due to a difference between D's std.datetime.StopWatch() and C's time.h/clock(), but it's consistently faster nonetheless, which is impressive. That said, it's also much easier to accidentally slow that figure down significantly in DMD, whereas GCC usually always optimizes very well.

Also, and I'm not sure this isn't just me, but I ran a DMD (v2.057 I think) benchmark a while back (vs ~88ms), and a similar test compiled with DMD 2.060 now points to optimization improvements in the internal DMD compiler over the last few versions.
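For readers who haven't used it, the kind of operation being benchmarked looks roughly like this (a sketch, not the benchmark itself; it assumes a target and compiler where core.simd's float4 is available):

import core.simd;
import std.stdio;

void main()
{
    float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
    float4 b = [0.5f, 0.5f, 0.5f, 0.5f];

    // Element-wise operations on the whole 4-float lane at once.
    float4 sum  = a + b;
    float4 diff = a - b;

    writeln(sum.array);   // [1.5, 2.5, 3.5, 4.5]
    writeln(diff.array);  // [0.5, 1.5, 2.5, 3.5]
}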
Aug 12 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/12/2012 6:38 PM, F i L wrote:
 Also, and I'm not sure this isn't just me, but I ran a DMD (v2.057 T think)

ms

2.060

optimization
 improvements in the internal DMD compiler over the last few version.
There's a fair amount of low hanging optimization fruit that D makes possible that dmd does not take advantage of. I hope to get to this. One thing is I suspect that D can generate much better SIMD code than C/C++ can without compiler extensions. Another is that D allows values to be moved without needing a copyconstruct/destruct operation.
Aug 13 2012
prev sibling parent reply "TJB" <broughtj gmail.com> writes:
On Thursday, 9 August 2012 at 18:35:22 UTC, Walter Bright wrote:
 On 8/9/2012 10:40 AM, dsimcha wrote:
 I'd emphasize the following:
I'd like to add to that: 1. Proper support for 80 bit floating point types. Many compilers' libraries have inaccurate 80 bit math functions, or don't implement 80 bit floats at all. 80 bit floats reduce the incidence of creeping roundoff error.
How unique to D is this feature? Does this imply that things like BLAS and LAPACK, random number generators, statistical distribution functions, and other numerical software should be rewritten in pure D rather than calling out to external C or Fortran codes? TJB
Aug 10 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/10/2012 8:31 AM, TJB wrote:
 On Thursday, 9 August 2012 at 18:35:22 UTC, Walter Bright wrote:
 On 8/9/2012 10:40 AM, dsimcha wrote:
 I'd emphasize the following:
I'd like to add to that: 1. Proper support for 80 bit floating point types. Many compilers' libraries have inaccurate 80 bit math functions, or don't implement 80 bit floats at all. 80 bit floats reduce the incidence of creeping roundoff error.
How unique to D is this feature? Does this imply that things like BLAS and LAPACK, random number generators, statistical distribution functions, and other numerical software should be rewritten in pure D rather than calling out to external C or Fortran codes?
I attended a talk given by a physicist a few months ago where he was using C transcendental functions. I pointed out to him that those functions were unreliable, producing wrong bits in a manner that suggested to me that they were internally truncating to double precision. He expressed astonishment and told me I must be mistaken. What can I say? I run across this repeatedly, and that's exactly why Phobos (with Don's help) has its own implementations, rather than simply calling the corresponding C ones. I encourage you to run your own tests, and draw your own conclusions.
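For anyone who wants a quick sanity check of what the extra width buys, the properties below print the precision of each type on the local target (on x86 dmd, real is the 80-bit x87 format):

import std.stdio;

void main()
{
    // More significant decimal digits and a smaller machine epsilon
    // mean less creeping roundoff in long chains of operations.
    writefln("double: %s digits, epsilon = %s", double.dig, double.epsilon);
    writefln("real:   %s digits, epsilon = %s", real.dig,   real.epsilon);
}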
Aug 10 2012
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, August 10, 2012 15:10:47 Walter Bright wrote:
 What can I say? I run across this repeatedly, and that's exactly why Phobos
 (with Don's help) has its own implementations, rather than simply calling
 the corresponding C ones.
I think that it's pretty typical for programmers to think that something like a standard library function is essentially bug-free - especially for an older language like C. And unless you see results that are clearly wrong or someone else points out the problem, I don't know why you'd ever think that there was one. I certainly had no clue that C implementations had issues with floating point arithmetic before it was pointed out here. Regardless though, it's great that D gets it right. - Jonathan M Davis
Aug 10 2012
prev sibling parent "TJB" <broughtj gmail.com> writes:
On Friday, 10 August 2012 at 22:11:23 UTC, Walter Bright wrote:
 On 8/10/2012 8:31 AM, TJB wrote:
 On Thursday, 9 August 2012 at 18:35:22 UTC, Walter Bright 
 wrote:
 On 8/9/2012 10:40 AM, dsimcha wrote:
 I'd emphasize the following:
I'd like to add to that: 1. Proper support for 80 bit floating point types. Many compilers' libraries have inaccurate 80 bit math functions, or don't implement 80 bit floats at all. 80 bit floats reduce the incidence of creeping roundoff error.
How unique to D is this feature? Does this imply that things like BLAS and LAPACK, random number generators, statistical distribution functions, and other numerical software should be rewritten in pure D rather than calling out to external C or Fortran codes?
I attended a talk given by a physicist a few months ago where he was using C transcendental functions. I pointed out to him that those functions were unreliable, producing wrong bits in a manner that suggested to me that they were internally truncating to double precision. He expressed astonishment and told me I must be mistaken. What can I say? I run across this repeatedly, and that's exactly why Phobos (with Don's help) has its own implementations, rather than simply calling the corresponding C ones. I encourage you to run your own tests, and draw your own conclusions.
Hopefully this will help make the case that D is the best choice for numerical programmers. I want to do my part to convince economists.

Another reason to implement BLAS and LAPACK in pure D is that the old routines like dgemm, cgemm, sgemm, and zgemm (all defined for different types) seem ripe for templatization (see the sketch below).

Almost thou convinceth me ...

TJB
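To make the templatization idea concrete, here is a minimal, unoptimized sketch - not a real BLAS port (production gemm implementations are heavily blocked and vectorized), and the names and row-major layout are just assumptions for illustration:

import std.traits : isFloatingPoint;

// One generic routine where BLAS needs sgemm/dgemm (and, with a relaxed
// constraint, cgemm/zgemm): computes C = alpha*A*B + beta*C for dense,
// row-major matrices A (m x k), B (k x n), C (m x n).
void gemm(T)(size_t m, size_t n, size_t k,
             T alpha, const(T)[] a, const(T)[] b,
             T beta, T[] c) if (isFloatingPoint!T)
{
    assert(a.length == m * k && b.length == k * n && c.length == m * n);

    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
        {
            T acc = 0;
            foreach (p; 0 .. k)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
}

unittest
{
    double[] a = [1, 2, 3, 4];          // 2x2
    double[] b = [5, 6, 7, 8];          // 2x2
    auto c = new double[4];
    c[] = 0;
    gemm(2, 2, 2, 1.0, a, b, 0.0, c);
    assert(c == [19.0, 22.0, 43.0, 50.0]);
}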
Aug 10 2012
prev sibling next sibling parent reply Justin Whear <justin economicmodeling.com> writes:
On Thu, 09 Aug 2012 17:57:27 +0200, TJB wrote:

 Hello D Users,
 
 The Software Editor for the Journal of Applied Econometrics has agreed
 to let me write a review of the D programming language for
 econometricians (econometrics is where economic theory and statistical
 analysis meet).  I will have only about 6 pages.  I have an idea of what
 I am going to write about, but I thought I would ask here what features
 are most relevant (in your minds) to numerical programmers writing codes
 for statistical inference.
 
 I look forward to your suggestions.
 
 Thanks,
 
 TJB
Lazy ranges are a lifesaver when dealing with big data. E.g. read a large csv file, use filter and map to clean and transform the data, collect stats as you go, then output to a destination file. The lazy nature of most of the ranges in Phobos means that you don't need to have the data in memory, but you can write simple imperative code just as if it was.
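A sketch of such a pipeline, assuming a hypothetical input.csv whose first column is a positive numeric series and whose first row is a header:

import std.algorithm, std.conv, std.range, std.stdio;

void main()
{
    // Everything below is lazy: the file is streamed line by line,
    // never loaded into memory as a whole.
    auto values = File("input.csv")                         // hypothetical file
        .byLine()
        .drop(1)                                            // skip the header row
        .map!(line => line.splitter(',').front.to!double)   // first column
        .filter!(x => x > 0.0);                             // drop bad observations

    double sum = 0;
    size_t n;
    foreach (x; values) { sum += x; ++n; }                  // single pass
    writefln("%s observations, mean = %s", n, sum / n);
}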
Aug 09 2012
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 August 2012 at 18:20:08 UTC, Justin Whear wrote:
 On Thu, 09 Aug 2012 17:57:27 +0200, TJB wrote:

 Hello D Users,
 
 The Software Editor for the Journal of Applied Econometrics 
 has agreed
 to let me write a review of the D programming language for
 econometricians (econometrics is where economic theory and 
 statistical
 analysis meet).  I will have only about 6 pages.  I have an 
 idea of what
 I am going to write about, but I thought I would ask here what 
 features
 are most relevant (in your minds) to numerical programmers 
 writing codes
 for statistical inference.
 
 I look forward to your suggestions.
 
 Thanks,
 
 TJB
Lazy ranges are a lifesaver when dealing with big data. E.g. read a large csv file, use filter and map to clean and transform the data, collect stats as you go, then output to a destination file. The lazy nature of most of the ranges in Phobos means that you don't need to have the data in memory, but you can write simple imperative code just as if it was.
Ah, the beauty of functional programming and streams.
Aug 09 2012
prev sibling parent "Minas Mina" <minas_mina1990 hotmail.co.uk> writes:
1) I think compile-time function execution is a very big plus for 
people doing calculations.

For example:

ulong fibonacci(ulong n) { .... }

static x = fibonacci(50); // calculated at compile time! runtime 
cost = 0 !!!

2) It has support for a BigInt structure in its standard library 
(which is really fast!)
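A complete, compilable version of the fibonacci sketch in point 1, plus a taste of BigInt from point 2 (the factorial is just an arbitrary example):

import std.bigint, std.stdio;

ulong fibonacci(ulong n)
{
    ulong a = 0, b = 1;
    foreach (i; 0 .. n)
    {
        immutable t = a + b;
        a = b;
        b = t;
    }
    return a;
}

// Forced to run at compile time (CTFE): an enum initializer must be a constant,
// so the value is baked into the binary with zero runtime cost.
enum fib50 = fibonacci(50);

void main()
{
    writeln(fib50);        // no runtime computation happens here

    // BigInt from the standard library for results that overflow ulong:
    BigInt fact = 1;
    foreach (i; 1 .. 101)
        fact *= i;         // 100!
    writeln(fact);
}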
Aug 10 2012