digitalmars.D - 'Aliasing problem' and D


Dave <Dave_member pathlink.com> writes:

I'm very new to D (literally as of yesterday), but am very impressed with what
I'm seeing so far.

Being that I want this language to succeed and an important part of that will be
performance potential over C, I'm curious - how does/will D deal with the
pointer 'aliasing problem' that plagues C and C++ compiler developers?

From what little I've seen so far, it seems that this same problem has been
'forced' on D by its backward compatibility with C libraries and C/C++-like
support for pointers.

IMHO, any language that seeks to replace C/C++ should do its best to avoid this
problem, or at least discourage code that introduces it.

Aug 01 2004

Sha Chancellor <schancel pacific.net> writes:

In article <ceis7r$23bt$1 digitaldaemon.com>,
 Dave <Dave_member pathlink.com> wrote:

 I'm very new to D (literally as of yesterday), but am very impressed with what
 I'm seeing so far.
 
 Being that I want this language to succeed and an important part of that will be
 performance potential over C, I'm curious - how does/will D deal with the
 pointer 'aliasing problem' that plagues C and C++ compiler developers?
 
 From what little I've seen so far, it seems that this same problem has been
 'forced' on D by its backward compatibility with C libraries and C/C++-like
 support for pointers.
 
 IMHO, any language that seeks to replace C/C++ should do its best to avoid this
 problem, or at least discourage code that introduces it.

What aliasing problem would you be referring to? (No, D does not deal
with it with denial; it's an honest question :)

Aug 01 2004

Dave <Dave_member pathlink.com> writes:

In article <schancel-8833C7.08301601082004 digitalmars.com>, Sha Chancellor
says...
What aliasing problem would you be referring to? (No, D does not deal
with it with denial; it's an honest question :)

I should have been more specific - sorry.

Code like this:

extern int i;

Aug 01 2004

Dave <Dave_member pathlink.com> writes:

In article <ceja1n$28um$1 digitaldaemon.com>, Dave says...
I should have been more specific - sorry.

Code like this:

extern int i;

Bah - this was cut off somehow when I posted it, I'll try again..

Code like this:

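;---

#include <cstdio>

int i;  // global that 'ri' below may refer to

void func( int &ri )
{
    for( int j = 0; j < 10; j++ ) {
        ri++;   // write through the reference...
        i++;    // ...and through the global; same object when ri aliases i
    }
}

int main()
{
    i = 10;
    func(i);             // 'ri' now aliases 'i'
    printf("%d\n",i);    // prints 30: i is incremented twice per iteration
    return 0;
}

;---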

Where 'ri' refers to 'i', the compiler has to keep 'i' in memory and reload it
instead of binding it to a register, for example. I guess other optimization
opportunities are lost as well, even if the aliasing never actually happens in
the program, unless the compiler is sophisticated enough to keep track of
variables like this (which of course adds compile time, complexity, bugs, etc.).

C99 has the 'restrict' keyword, but many aren't happy with that solution. More
on that here: http://www.cbau.freeserve.co.uk/Compiler/RestrictPointers.html

This 'aliasing problem' is often cited as a reason that FORTRAN (for example)
is easier to write optimizers for.

Aug 01 2004

Dave <Dave_member pathlink.com> writes:

Was this that stupid of a question or what? Seriously..

Anyone want to take a crack at it? Walter??

The reason I think this may be important for D is because a) a lot of scientific
computing types aren't real happy with FORTRAN 95, b) they aren't real happy
with C++ for many of the same reasons we aren't, c) they aren't real happy with
Java (say what you will on the Java performance issue, but when you have to
write C code in an OOP language like Java to make it perform decently, it is
not suitable for HPC) and d) all this leaves it wide open for D to step in
and make them happy, what with its good support for native FP and built-in
complex types and all. Plus, it was designed by a compiler developer ;)

IMHO, get the HPC folks (both end-users and tool vendors) interested in an OOP
language that has the potential to crunch numbers fast and you have yourself a
good foothold on the language market.

Thanks..

Aug 03 2004

Sean Kelly <sean f4.ca> writes:

In article <ceobns$1akh$1 digitaldaemon.com>, Dave says...
Was this that stupid of a question or what? Seriously..

If I understand your question, the D version would be this:

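int i;

// 'inout' passes by reference (later versions of D spell this 'ref')
void func( inout int ri )
{
    for( int j = 0; j < 10; j++ ) {
        ri++;
        i++;    // same variable as 'ri' when called as func(i)
    }
}

int main()
{
    i = 10;
    func(i);
    printf("%d\n",i);
    return 0;
}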

Aug 03 2004

Dave <Dave_member pathlink.com> writes:

In article <ceoc3l$1apl$1 digitaldaemon.com>, Sean Kelly says...
int i;

void func( inout int ri )
{
    for( int j = 0; j < 10; j++ ) {
        ri++;
        i++;
    }
}

int main()
{
    i = 10;
    func(i);
    printf("%d\n",i);
    return 0;
}

Yes and it runs as the C version nominally would.

Walter, back here: http://www.digitalmars.com/drn-bin/wwwnews?D/28904 you mention:

 What Fortran has over C is the 'noalias' on function parameters which allows
 for aggressive optimization. What I'm thinking of is writing the spec for D
 functions so that parameters are always 'noalias' (for extern (C) functions
 this would not apply).

Is that or some other resolution to aliasing still under consideration??

Others - how much current code would that break?? Would it have a big effect on
the DTL implementation for example?

Thanks..

Aug 06 2004

Nick <Nick_member pathlink.com> writes:

In article <cf0l3e$q9p$1 digitaldaemon.com>, Dave says...
In article <ceoc3l$1apl$1 digitaldaemon.com>, Sean Kelly says...
int i;

void func( inout int ri )
{
    for( int j = 0; j < 10; j++ ) {
        ri++;
        i++;
    }
}

int main()
{
    i = 10;
    func(i);
    printf("%d\n",i);
    return 0;
}


<snip>
Others - how much current code would that break?? Would it have a big effect on
the DTL implementation for example?

I can't say if it would break anything, but I do like what you're proposing. A
coding style like the one above is IMO a very bad one.

I was a bit surprised that the above worked as it did; I thought func() would
make a local copy and "return" it when done with it. Not doing so could create
some unexpected effects, for example:

This outputs 10 and 11. I think you should be guaranteed that local variables
are not altered by "external" influences during the course of executing a
function. And if you really want the above behavior, use pointers.

Nick

Aug 06 2004

Dave <Dave_member pathlink.com> writes:

In article <cf0ocn$sg3$1 digitaldaemon.com>, Nick says...
In article <cf0l3e$q9p$1 digitaldaemon.com>, Dave says...
In article <ceoc3l$1apl$1 digitaldaemon.com>, Sean Kelly says...

I can't say if it would break anything, but I do like what you're proposing. A
coding style like the one above is IMO a very bad one.

I agree - and the C code I've been working with recently has quite a few LOC
like that, and I hope future D code will not ;)

The biggest thing for me though is the compiler optimization opportunities that
disappear for many functions that have an argument list like that (if C-style
aliasing is allowed, that is).

I think what Walter said here http://www.digitalmars.com/drn-bin/wwwnews?D/28904
would do D a lot of good.

Thanks.

PS: I'm new to D and these newsgroups, and I gotta say I am very impressed with
both. Here's to a grand weekend for everybody..3..2..1..it is now Miller Time
for this cat!

Aug 06 2004

Sean Kelly <sean f4.ca> writes:

In article <cf0ocn$sg3$1 digitaldaemon.com>, Nick says...
I was a bit surprised that the above worked as it did; I thought func() would
make a local copy and "return" it when done with it. Not doing so could create
some unexpected effects, for example:


This outputs 10 and 11. I think you should be guaranteed that local variables
are not altered by "external" influences during the course of executing a
function. And if you really want the above behavior, use pointers.

I don't know.  I would consider it a very bad idea to pass the same variable as
multiple parameters of a function with side effects.  This is a case where I
don't think it's necessary for the compiler to protect the programmer from
himself.  Besides, I may not always want to pay for the extra instructions and
such that this would require.


Sean

Aug 06 2004

Jay <Jay_member pathlink.com> writes:

In article <cf0qis$tlh$1 digitaldaemon.com>, Sean Kelly says...
I don't know.  I would consider it a very bad idea to pass the same variable as
multiple parameters of a function with side effects.  This is a case where I
don't think it's necessary for the compiler to protect the programmer from
himself.  Besides, I may not always want to pay for the extra instructions and
such that this would require.

Potential pointer aliasing crops up in all sorts of mundane situations. Anywhere
the compiler cannot maintain strict bounds on a pointer's domain, it has to
assume it could be pointing anywhere: at your current loop termination variable,
for example.

Imagine you've got a performance-critical "for" loop over "0" to "n" that calls
a function, which calls a function, which calls a function, which writes to a
pointer. Now imagine that the compiler, for whatever reason, cannot maintain
strict bounds on the address range of that pointer, which could possibly be
pointing at the stack. What should have been a very tight "for" loop has just
become much less optimizable, because "n" might have been altered by a
three-deep nested function that happens to write to a pointer for which strict
bounds cannot be determined.

It's a scary situation, and some compilers have an option to ignore pointer
aliasing during optimization, which I always enable (if I remember to).

I agree it is a serious problem and sure would be nice to address somehow, if
possible.

Aug 17 2004

Ben Hinkle <bhinkle4 juno.com> writes:

Not a stupid question at all. See previous threads, for example 
 http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/5274
Norbert and Walter (and probably others) had a nice discussion about future
possibilities. Walter doesn't like "restrict" much either.

Aug 03 2004

J C Calvarese <jcc7 cox.net> writes:

In article <ceoed5$1btb$1 digitaldaemon.com>, Ben Hinkle says...
Not a stupid question at all. See previous threads, for example 
 http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/5274
Norbert and Walter (and probably others) had a nice discussion about future
possibilities. Walter doesn't like "restrict" much either.

Also, I found some even older threads using "aliasing problem" at
http://www.digitalmars.com/advancedsearch.html:

http://www.digitalmars.com/d/archives/1913.html
http://www.digitalmars.com/d/archives/333.html

(I think they relate to your topic, but I'm not 100% sure.)

jcc7

Aug 03 2004

Dave <Dave_member pathlink.com> writes:

Doh! CTFA (Check the fargin' Archives). Sorry, and thanks for the responses and
pointers to the archives.

Here are some from back a while.. Can anyone tell me if the non-overlapping
array prohibition is still part of the spec.?

http://www.digitalmars.com/drn-bin/wwwnews?D/348
http://www.digitalmars.com/drn-bin/wwwnews?D/1926
http://www.digitalmars.com/drn-bin/wwwnews?D/18534
http://www.digitalmars.com/drn-bin/wwwnews?D/18543
http://www.digitalmars.com/drn-bin/wwwnews?D/17971

The following deal with exactly what my original post had in mind:

http://www.digitalmars.com/drn-bin/wwwnews?D/28260, to quote Walter:

"Historically, adding in special keywords for such optimizations has not worked
out well. That's why I was thinking of making it implicit for D function
parameters."

I AGREE - let's do that (make it implicit, but warn in debug mode), or is it too
late?

http://www.digitalmars.com/drn-bin/wwwnews?D/28377, to quote Walter again:

"I think it would be better to have the compiler assume they are not aliased
(since that is by far the usual case) and have to say when they are not aliased.
Also, a runtime check that they really are not aliased might be appropriate in
debug mode."

I, again, wholeheartedly AGREE.

I'm with Drew here: http://www.digitalmars.com/drn-bin/wwwnews?D/28904

What is the state of this as far as D goes?

God Bless you Walter, I can see you put a lot of thought into memory layout for
and vectorizing of arrays and such, and also the aliasing issue.

Hopefully these ideas (non-overlapping arrays and implicit no-alias with debug
warning) have stood the test of time and v1 implementation so far. Because if
they have, I think that covers a lot of the issue not only for HPC code, but also
for many other things nowadays, like writing high-throughput socket code, database
engines, AI engines, speech synthesis, etc., etc., etc...

It seems like Walter wants to give both us and compiler implementors a great
high-performance base to work with..

Not only that, but consider this: In these days of cool, productive and decent
performance interpreted languages like Perl, Python, etc. not to mention Java,
many people are just not going to switch just because of excellent new features
(most people think "their" language has enough features - after all, they've
been able to "get by with it so-far").

Now, if you give them:

- High performance on the order of Fortran,
- Intuitive (implicit aliasing like C is NOT intuitive to most people using
Perl, VB, Java, etc. and therefore is the source of a lot of bugs to them),
- True OOP language with all the features of D,

THAT in total is a great reason to switch.

How about the compiler developers?? Many of them would quit en-masse if you told
them they had to implement C++ all over again, except with MORE features ;)

Thanks for all of the pointers to the archived messages.

- Dave

Aug 03 2004

Ilya Minkov <minkov cs.tum.edu> writes:

Dave schrieb:

 Doh! CTFA (Check the fargin' Archives). Sorry, and thanks for the responses and
 pointers to the archives.

No problem. It just takes time and threads get forgotten, because the
newsgroup grows too fast for anyone to follow.

 Here are some from back a while.. Can anyone tell me if the non-overlapping
 array prohibition is still part of the spec.?

As far as i know, it is with whole-array operations (they are not
implemented yet iirc). It is not specified in which order the elements
would be processed, and thus they must be unaliased...

I don't think it holds true of functions in general. One solution might
be a sort-of unoverlapping assert, which would be standard and both keep
us safe and enable optimizations. BTW, asserts usually do enable
optimizations in D - in release mode, their true-ness is assumed for
optimization, although the assert code itself is left out.

Even if it's not there, the whole array operations would probably carry
the bulk of performance optimization by utilizing the SIMD units.

The check itself is technically very simple and costs next to nothing,
since arrays are pointer&length and one only has to check whether the two
ranges overlap. Since C doesn't have a concept of an array, I can hardly
imagine how such a check could be done there.

 http://www.digitalmars.com/drn-bin/wwwnews?D/348
 http://www.digitalmars.com/drn-bin/wwwnews?D/1926
 http://www.digitalmars.com/drn-bin/wwwnews?D/18534
 http://www.digitalmars.com/drn-bin/wwwnews?D/18543
 http://www.digitalmars.com/drn-bin/wwwnews?D/17971

Phew, you do some investigation work. :)

 The following deal with exactly what my original post had in mind:
 
 http://www.digitalmars.com/drn-bin/wwwnews?D/28260, to quote Walter:
 
 "Historically, adding in special keywords for such optimizations has not worked
 out well. That's why I was thinking of making it implicit for D function
 parameters."
 
 I AGREE - let's do that (make it implicit, but warn in debug mode), or is it
 too late?

No, i don't think things like that are late. The compiler doesn't look
like it's approaching the spec very soon, and code gets broken every now
and then.

 http://www.digitalmars.com/drn-bin/wwwnews?D/28377, to quote Walter again:
 
 "I think it would be better to have the compiler assume they are not aliased
 (since that is by far the usual case) and have to say when they are not
 aliased. Also, a runtime check that they really are not aliased might be
 appropriate in debug mode."
 
 I, again, wholeheartedly AGREE.

Hmmm... And what would one do when one is willing to accept aliased
arrays? And is there any necessity in this case anyway?

What i see in front of my eyes, is a function, which inputs 2 arrays and 
outputs a new one. I understand that there should be no aliasing if it 
was to output in one of the inputs, but aliasing between the inputs 
would be OK if the output is a new array.

You mention Fortran doesn't have the problem. How does Fortran deal with 
aliasing?

If i remember correctly, Sather has some way to identify possible 
aliasing hazards statically, and allow aliasing if nothing speaks 
against it, though i might be wrong - perhaps it was just planned. 
Sather is a whole-program compiler. D is geared towards whole-program 
compilation, but should also work without it. C++ is too weak for all that.

Aug 10 2004

"Carlos Santander B." <carlos8294 msn.com> writes:

"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:cfasdk$30jv$1 digitaldaemon.com
| You mention Fortran doesn't have the problem. How does Fortran deal with
| aliasing?

Fortran doesn't have pointers, so there's no aliasing.

-----------------------
Carlos Santander Bernal

Aug 11 2004

Dave <Dave_member pathlink.com> writes:

In article <cfdm6s$126s$1 digitaldaemon.com>, Carlos Santander B. says...
"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:cfasdk$30jv$1 digitaldaemon.com
| You mention Fortran doesn't have the problem. How does Fortran deal with
| aliasing?

Fortran doesn't have pointers, so there's no aliasing.

No, it's more innocuous than that even..

This f77 compiled with g77 (all I have):

      I=1
      CALL FOO(I,I)
      PRINT *,I
      END

      SUBROUTINE FOO(J,K)
c       D could have a debug build runtime 'assert' placed by the compiler here
c       checking if J & K referenced the same variable, along the same lines as
c       array bounds checking in D
      J = J + K
      K = J * K
      PRINT *, J, K
      END

produces the 'intuitively wrong' results through aliasing w/o any compiler or
runtime warnings:

4 4
4

The way g77 (at least) handles it is to expect you to follow the spec, which
says not to write code like that.

The proposal I suggested a little earlier today would not 'noalias' pointers or
the inside of 'common' data like arrays, only value types and the built-in
arrays passed by ref. (with a debug runtime warning as illustrated above, just
like what happens with array bounds checking for debug builds now).

- Dave

Aug 11 2004

"Carlos Santander B." <carlos8294 msn.com> writes:

"Dave" <Dave_member pathlink.com> wrote in message
news:cfdr1n$15ba$1 digitaldaemon.com
| No, it's more innocuous than that even..
|
| This f77 compiled with g77 (all I have):
|
|       I=1
|       CALL FOO(I,I)
|       PRINT *,I
|       END
|
|       SUBROUTINE FOO(J,K)
| c       D could have a debug build runtime 'assert' placed by the compiler here
| c       checking if J & K referenced the same variable, along the same lines as
| c       array bounds checking in D
|       J = J + K
|       K = J * K
|       PRINT *, J, K
|       END
|
| produces the 'intuitively wrong' results through aliasing w/o any compiler or
| runtime warnings:
|
| 4 4
| 4
|
| The way (g77 at least) handles it is it expects you to follow the specs. that
| say not to write code like that.
|
| The proposal I suggested a little earlier today would not 'noalias' pointers or
| the inside of 'common' data like arrays, only value types and the built-in
| arrays passed by ref. (with a debug runtime warning as illustrated above, just
| like what happens with array bounds checking for debug builds now).
|
| - Dave

Sorry about what I said: I had read it here on this ng, so I just repeated it.

I have a copy of Salford FTN77 Compiler (4.03, Personal Edition. Used to be
free, not anymore I think), and it produced the exact same result until I used
the "/UNSAFE" flag. From the help file: "/UNSAFE: Used in conjunction with
/OPTIMISE in order to improve the execution speed of certain programs by using
code re-arrangement techniques". With that flag I got 2's instead of 4's.

-----------------------
Carlos Santander Bernal

Aug 11 2004

"Carlos Santander B." <carlos8294 msn.com> writes:

"Carlos Santander B." <carlos8294 msn.com> wrote in message
news:cfejqa$1g7i$1 digitaldaemon.com
| I have a copy of Salford FTN77 Compiler (4.03, Personal Edition. Used to be
| free, not anymore I think)

It's still free. And their FTN95 Personal Edition is also free.
Just in case someone's interested...

-----------------------
Carlos Santander Bernal

Aug 11 2004

Dave <Dave_member pathlink.com> writes:

In article <cfek28$1gbg$1 digitaldaemon.com>, Carlos Santander B. says...
"Carlos Santander B." <carlos8294 msn.com> wrote in message
news:cfejqa$1g7i$1 digitaldaemon.com
| I have a copy of Salford FTN77 Compiler (4.03, Personal Edition. Used to be
| free, not anymore I think)

It's still free. And their FTN95 Personal Edition is also free.
Just in case someone's interested...

Might be! Thanks..

- Dave

-----------------------
Carlos Santander Bernal

Aug 11 2004

Dave <Dave_member pathlink.com> writes:

In article <schancel-8833C7.08301601082004 digitalmars.com>, Sha Chancellor
says...
In article <ceis7r$23bt$1 digitaldaemon.com>,
 Dave <Dave_member pathlink.com> wrote:

 I'm very new to D (literally as of yesterday), but am very impressed with 
 what I'm seeing so far.
 
 Being that I want this language to succeed and an important part of that will 
 be performance potential over C, I'm curious - how does/will D deal with the
 pointer 'aliasing problem' that plagues C and C++ compiler developers?


<snip>
What aliasing problem would you be referring to? (No, D does not deal
with it with denial; it's an honest question :)

--- Thump, thump - up on the soapbox ---

I hate to say it, but it's appearing more and more like it is being dealt with
by denial ;)

I've asked the question a couple of times in a couple of different ways and it
seems that none of the 'principals' of D design and implementation can or will
give me an answer on future direction [or even future plans] for this issue.

And all the while more code (that a change may break) is being written, large
efforts are underway to develop big libraries (like the DTL, parts of which
may break) and version 1.0 is just around the corner (which more-or-less
cements the issue for a while, or maybe forever - we are still living issues
like this down with C/C++).

Look, FWIW, I'm smitten with D. Absolutely love what I've been able to absorb in
a week's worth of late nights. I think Walter is a gutsy genius. Matthew as
well - brilliant guy, investing tons of time and stomach lining on developing
the DTL and writing about and promoting D. And I'm impressed with all the work
and input all the others have put into this language and this newsgroup.

What I'm saying here is all in the spirit of trying to make sure D is being used
by more than a couple thousand Die harD D Developers in the coming years. I want
this language to succeed because I'm sick and tired of basic inadequacies in C
and its derivative languages. Although I personally hate ocaml, I don't want to
see D end up like that - lots of promise, little future (although I'm damn glad
I'm not forced by popularity to use it. It may be intuitive if you speak French,
but I don't and loathe that language [ocaml, not French] ;).

I work 70+ hrs. a week, have a family that needs attention and STILL want to
help D any way I can, but I simply will not start working (as opposed to playing)
with D until fundamental issues like aliasing behaviour, runtime allocated
rectangular arrays, etc. are worked out better. All the OOP stuff is great, but
it's damn hard to write, improve and _trust the future of_ code developed with a
language that hasn't nailed down even more fundamental issues when there are
many other good choices out there already that have mature compilers, libraries,
etc.

It appears to me (given the versioning scheme of dmc, etc.) that Walter has a
different - maybe the correct, who knows - take on versioning than the rest of
us. In other words, v1.0 seems like more of an increment than a milestone to
him. Problem is almost no one else outside of this community sees it that way.
If D takes off (and I'm convinced it won't unless some basic issues like 'the
aliasing problem' are taken care of) the D community may have to live with
features/results of v1.0 for a long time, even if v1.0 itself is obsoleted in a
month.

I like what I've seen so far, but the aliasing issue is one of the underlying
reasons WHY C and C++ need to be REPLACED, and are not used for whole classes of
applications. Jagged dynamic arrays being the only built-in choice for runtime
allocated arrays is another big one. 

I hate to rain on the parade here, but I think this language will really suffer
in both popularity and somewhat in utility w/o a decent resolution to these two
issues (and there may be others as well, but these really stand out). Nowadays,
most people will simply not adopt another natively compiled language over

compiler vendors will not support a language where there is primarily C++ style
OOP wrapped in different semantics. Face it folks, in the mainstream OOP

man out here, but may well be a better choice than any of those technically. You
will win a few of those people over, but not many. In the C and Fortran worlds,
there is just no way that they will switch until D proves itself a better
performer, or at least there is now a reason to believe from the language design
that it can perform the same (Fortran) or better (C).

D fans - you gotta see it as most of the rest of the world will see it when a
new project comes along. When you think of it, it is not hard at all to imagine
this dialog:

Geek: "Boss, there is this new language out there called D that I really like.
Howz about we give it a shot for our next project."

PHB: "Ummm, well, why?"

Geek: "Because it's really cool and the compiler is free!"

PHB: "Ok, you've just described about 20 other languages that I can read about
and download on the net. Now give me some real good reasons or go back to your
C++ IDE, or if you are you still tuning that Java app., finish that.. I have a
tee time in a 1/2 hr."

Geek: "Ummm, well I can be maybe about 20% more productive with it, unless the
unproven compiler and libraries give me problems."

PHB: "Hmmm, sounds more risky than being near the fairway when I tee off.. How
about performance?"

Geek: "Ummm, well the compilers are young yet, but should give me performance on
par with Java and maybe reasonably close to C++, at least for some things. But
as I say, it's young yet.".

PHB: "Ok, I can understand that every great tool has to start somewhere and the
productivity sounds enticing.  You know how picky some users and the development
team can be about stuff like performance.. Is there a potential for this
language to provide us with better performance once the tools mature - can we
finally get rid of those damn Fortran legacy apps. we're always hiring
consultants for?"

Geek: "Umm, well, hmmmm, uhhh, well, I guess not, not really anyway. Ok, I'll
level with you - it inherited basically about the same performance potential as
C++ has, but, but, well it has some cool built-ins and auto-GC".

PHB: "Hmph, well Ok, GC is cool. How useful are the built-ins".

Geek: (Shit, I knew he'd ask that) "Well, for specialized high-performance
needs, we can bypass the GC if needed and build our own classes".

PHB: "Sounds like what we've already done and tested with C++"

Geek: "Hmmm, yep".

PHB: (Pats Geek on the shoulder) "Ok, we'll talk about it some other time
(once I get counsel from the finger-in-the-wind oracle). In the meantime, get
back out there and bust your ass with C++ or finish tuning that Java app. until
we sign that contract with that Indian company, would ya".

Geek: (Mumbling under his breath) "[ok - just try not to hit the CEO in the butt
with your golf ball again, dork.]"

If you are going to try and win mindshare for a high-performance language, then
you have to pay attention to basic issues like this, and do your best to get it
right the first time even if the release of the big 1.0 is delayed a while. It
doesn't take a rocket scientist to figure out that mimicking C/C++ aliasing and
jagged arrays is NOT THE WAY TO GO for a language aspiring to win developers
away from it.

I could friggen kick myself for not finding out about D earlier so I could have
bitched then. And I'm guessing that many reading this (if any) would like to
kick me now.. But anyhow...

PHP is a good example of how to promise and deliver on a language to initially
fill a good size niche and then expand to win more mind-share. They didn't try
to mimic Java (as used for that class of applications) or ASP on performance or
features. What that community started with, promised and delivered on was an
easy to learn and use dynamic website scripting language with most of the useful
features and decent performance.

I think D will need to do something similar as far as expectation management.

Fortran, some basic underlying design of the language has to be different enough
to attract them. IMO, adding more and more, or slightly improving existing, OOP
features is not the way to go. Adding high-performance orientated semantics to
the language along with /high-performance/ built-ins is the way to go.

In this group and in the archives, I've seen /A LOT/ of talk, argument and
mindshare wasted on how this or that OOP feature doesn't exactly coincide with
what they expected/wanted/whatever, but comparatively little on the performance
of the language and tools or changing the semantics to support high-performance
compiler development. The worst part of this, given the squeaky wheel truism, is
that Walter has been driven to pay so much attention to OOP nits that the other
more basic stuff hasn't gotten the attention it deserves.

Since this is a new language, I really think the most basic needs of the
underlying language should be tended to first, don't you??

--- THUMP off the soapbox.. ---

If you made it here, thanks for taking the time to read this..

- Dave

PS: I post more-or-less anonymously to these newsgroups. Rest assured it hasn't
been because I'm somehow gutless when it comes to expressing my opinions,
defending them or having them attributed to me forever. A family member was
recently the victim of ID theft, and I'm a bit wary of any info. I put out into
the public domain, including e-mail addresses, etc. (the ID theft modus-operandi
included use of an e-mail address).

Aug 08 2004

J C Calvarese <jcc7 cox.net> writes:

You obviously have passion and knowledge about the issue of the "Aliasing 
Problem" that far exceeds my interest. I hope that D doesn't disappoint 
you in this respect, but I fear that it will. Nonetheless, I do have a 
few minor points I'd like to make.

Dave wrote:
 In article <schancel-8833C7.08301601082004 digitalmars.com>, Sha Chancellor
 It appears to me (given the versioning scheme of dmc, etc.) that Walter has a
 different - maybe the correct, who knows - take on versioning than the rest of
 us. In other words, v1.0 seems like more of an increment than a milestone to
 him. Problem is almost no one else outside of this community sees it that way.
 If D takes off (and I'm convinced it won't unless some basic issues like 'the
 aliasing problem' are taken care of) the D community may have to live with
 features/results of v1.0 for a long time, even if v1.0 itself is obsoleted in a
 month.

Unless Walter has told you something that he hasn't told anyone else, I 
don't think anyone knows when DMD 1.0 is coming. I know DMD 0.98 has 
just been released, but unless Walter slows the releases way down I 
can't believe we're a couple releases away from DMD 1.0. I think we're 
going to see a 0.101, 0.102, 0.103, etc. before 1.0 appears. This 
"floating point" vs. "major.minor" controversy hasn't been commented on 
by Walter, but Walter hasn't indicated that he wants to release anything 
other than a polished D 1.0 that will make D look good and stable.

 I like what I've seen so far, but the aliasing issue is one of the underlying
 reasons WHY C and C++ need to be REPLACED, and are not used for whole classes of
 applications. Jagged dynamic arrays being the only built-in choice for runtime
 allocated arrays is another big one. 
 
 I hate to rain on the parade here, but I think this language will really suffer
 in both popularity and somewhat in utility w/o a decent resolution to these two
 issues (and there may be others as well, but these really stand out). Now a
 days, most people will simply not adopt another natively compiled language over


I'm not particularly concerned about the performance of C/C++. I care 
about the ease of programming in those languages (or lack thereof).

I don't like Java because of the performance issues and the requirement 
of using OOP, but the people who like Java usually don't mind the 
performance loss and enjoy the OOP aspects. If they're looking for 
better performance, they'll either like the D syntax or they won't. I 
doubt they'd switch to D just because it's x percent faster than C/C++. 
But I'm just guessing.

 compiler vendors will not support a language where there is primarily C++ style
 OOP wrapped in different semantics. Face it folks, in the mainstream OOP

 odd man out here, but may well be a better choice than any of those technically. You
 will win few of those people over, but not many. In the C and Fortran worlds,
 there is just no way that they will switch until D proves itself a better
 performer, or at least there is now a reason to believe from the language design
 that it can perform the same (Fortran) or better (C).

Honestly, I don't think D is even targeted at Fortran programmers. Or 
Cobol programmers.

It'd be great if D could appeal to programmers of every language, but I 
really don't think we're going to get many Visual Basic converts 
either. Or Caml or Haskell.

D is designed to be what C++ should have been. C programmers that don't 
like C++ because it's too complicated should like D. C++ programmers who 
don't think that C is powerful enough should like D. That's the niche 
that D is targeting. If the libraries become powerful enough so that 

vision has never been "one language to rule them all". Just the best 
all-around language yet (perhaps I overstate the goal?). Yes, 
performance is important, but so is the ease of programming.

 Since this is a new language, I really think the most basic needs of the
 underlying language should be tended to first, don't you??

If you're building a house and essentially all you have left to do 
before you sell it is install the carpet, is that a good time to pull 
all of the incandescent light fixtures and replace them with 
fluorescent lights? I think you're talking about big changes and 
recently Walter has been talking about being done with the design. If 
you were making these suggestions a year ago, you might have found more 
interest. But even then could have been too late. Walter's been 
designing D for several years.

 
 --- THUMP off the soapbox.. ---
 
 If you made it here, thanks for taking the time to read this..

Thanks for posting. Maybe Walter can think of an easy way to adjust the 
Aliasing to allay your performance fears.

We'd all like D to be the best language, but there are different ideas 
of how that's accomplished. With the increasing speed and memory of new 
hardware, performance issues aren't always the most important issue. And 
I think Walter's current goal is for DMD to work right and he'll work on 
getting it to be "leaner and meaner" later.

 
 - Dave
 
 PS: I post more-or-less anonymously to these newsgroups. Rest assured it hasn't
 been because I'm somehow gutless when it comes to expressing my opinions,
 defending them or having them attributed to me forever. A family member was
 recently the victim of ID theft, and I'm a bit wary of any info. I put out into
 the public domain, including e-mail addresses, etc. (the ID theft modus-operandi
 included use of an e-mail address).

A few people around here post more anonymously than you do, I wouldn't 
feel bad about it.

-- 
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/

Aug 08 2004

Dave <Dave_member pathlink.com> writes:

In article <cf66ab$350$1 digitaldaemon.com>, J C Calvarese says...
You obviously have passion and knowledge about the issue of the "Aliasing 
Problem" that far exceeds my interest. I hope that D doesn't disappoint 
you in this respect, but I fear that it will. Nonetheless, I do have a 
few minor points I'd like to make.

Passion yes, knowledge, maybe just a little bit. I'm no compiler writer, heck
I'm not even a numerics guy.

The reason I care about it is because D currently adopts something bad from C
that I think could finally be done away with w/o a lot of pain. And I hate
paying for something many times (in terms of clean code generation) that is
there to accommodate a bare minority of the code.

Who knows - maybe the reason Walter is not commenting on this thread is because
he's got it figured out or he no longer thinks it as important as he once did in
the grand scheme of things (check the archives or my earlier posts pointing to
some of these archives if you're curious about that).

just been released, but unless Walter slows the releases way down I 
can't believe we're a couple releases away from DMD 1.0. I think we're 
going to see a 0.101, 0.102, 0.103, etc. before 1.0 appears. This 
"floating point" vs. "major.minor" controversy hasn't been commented on 
by Walter, but Walter hasn't indicated that he wants to release anything 
other than a polished D 1.0 that will make D look good and stable.

Well, that's encouraging at least. 2 things made me think we were heading to
1.0: I believe there was some talk of a release date of March 04 in the
archives. Obviously we are way past that. 2nd, the version number scheme that is
used for other DM products indicates 1.0 is a couple of releases away unless
they start being numbered 0.99.1, 0.99.2, etc. or I guess 1.0 alpha1, 1.0
alpha2, 1.0 beta1, etc. could be used. But you're right - none of this is set in
stone.

I'm not particularly concerned about the performance of C/C++. I care 
about the ease of programming in those languages (or lack thereof).

I don't like Java because of the performance issues and the requirement 
of using OOP, but the people who like Java usually don't mind the 

I'm confused - you don't care about C++ performance, but do about Java
performance?

The truth is that, except for probably some (important) OOP stuff like
generics and others, a few Java runtimes are closing the gap with top C++
compilers in both benchmarks and more than just some 'real world' code, from what
I've heard from fellow developers who've used both.

That's another reason I want D to be superior in performance, or at this stage,
at least superior in performance /potential/.

performance loss and enjoy the OOP aspects. If they're looking for 
better performance, they'll either like the D syntax or they won't. I 
doubt they'd switch to D just because it's x percent faster than C/C++. 
But I'm just guessing.

Right now, according to some benchmarks I've run, D seems a good margin slower
for some things, especially when the built-in strings and AA's are used. For
example char[] concatenation in a tight loop like this:

;---
D:

import std.string;
import std.stream;

int main()
{
    char[] input, output;

    output = "<TABLE><TR><TD>\n";
    File f = new File("some_large_file",FileMode.In);
    // append every line to output with the built-in ~= operator
    while(!f.eof()) {
        input = f.readLine();
        output ~= input;
        output ~= "</TD><TD>\n";
    }
    f.close();

    output ~= "</TD></TR></TABLE>\n";

    printf("output length: %d\n",output.length);
    return(0);
}
output length: 6944735
real    0m37.747s
user    0m31.840s
sys     0m3.180s
;---

C++:

#include <cstdio>
#include <string>
#include <fstream>
using namespace std;

int main()
{
    string input, output;

    output = "<TABLE><TR><TD>\n";
    ifstream f("some_large_file",fstream::in);
    // append every line to output with basic_string's += operator
    while(getline(f,input)) {
        output += input;
        output += "</TD><TD>\n";
    }
    f.close();

    output += "</TD></TR></TABLE>\n";

    // length() returns size_t, so cast it to match the format specifier
    printf("output length: %lu\n",(unsigned long)output.length());
    return(0);
}
output length: 6944735
real    0m0.311s
user    0m0.260s
sys     0m0.040s
;---

is a lot slower than basic_string<> in C++. Now I know about OutBuffer, but
people will be drawn to using char[] for this, just like they are drawn to Java
String instead of StringBuffer or StringBuilder (Java v1.5). Here's the
OutBuffer version:

import std.string;
import std.stream;
import std.outbuffer;

int main()
{
    char[] input;
    OutBuffer output = new OutBuffer();

    output.write("<TABLE><TR><TD>\n");
    File f = new File("some_large_file",FileMode.In);
    // same loop as above, but appending through OutBuffer.write
    while(!f.eof()) {
        input = f.readLine();
        output.write(input);
        output.write("</TD><TD>\n");
    }
    f.close();

    output.write("</TD></TR></TABLE>\n");

    printf("output length: %d\n",output.toString().length);
    return(0);
}
output length: 6944735
real    0m5.217s
user    0m2.640s
sys     0m2.280s
;---

C++ is still more than 10x faster (about 17x by wall-clock time), and the Java version would probably not be far behind
C++ from what I've seen.
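
One factor that often separates fast and slow append loops is how the destination buffer grows. The sketch below is in C and is only an illustration of that one idea: StrBuf, strbuf_append and the doubling policy are hypothetical names and choices, not a description of how D's ~=, OutBuffer or basic_string are actually implemented, and it does not claim to explain the timings above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; all names are illustrative only. */
typedef struct {
    char  *data;
    size_t len;      /* bytes in use */
    size_t cap;      /* bytes allocated */
    size_t reallocs; /* how many times the storage grew */
} StrBuf;

/* Append n bytes from src, doubling the capacity whenever it runs out.
   Returns 0 on success, -1 if realloc fails (the buffer is left intact). */
static int strbuf_append(StrBuf *b, const char *src, size_t n)
{
    if (b->len + n > b->cap) {
        size_t new_cap = b->cap ? b->cap : 16;
        while (new_cap < b->len + n)
            new_cap *= 2;
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = new_cap;
        b->reallocs++;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
static void strbuf_free(StrBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = b->reallocs = 0;
}
```

With doubling, 100,000 appends of a 10-byte string need fewer than twenty reallocations. A buffer that grows by exactly the amount requested may reallocate, and possibly copy, on every append.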

Like it or not, many people will choose whether or not to give a language a 2nd
look based on benchmarks. _Especially_ when the competition is C/C++ or Java.

Honestly, I don't think D is even targeted at Fortran programmers. Or 
Cobol programmers.

Maybe not Cobol programmers, but check the archives on the Fortran question.
While maybe not an explicit goal, it seems to be a sincere wish of Walter and
others that D would be considered as a replacement for Fortran. It is one of the
areas C/C++ falls short. One of the reasons why C/C++ falls short here is [you
guessed it] 'the aliasing problem'.

It'd be great if D could appeal to programmers of every language, but I 
really don't think we're going to get many Visual Basic converts 
either. Or Caml or Haskell.

Caml or Haskell I agree with, but that is probably a pretty small minority.
Visual Basic I strongly disagree with. Apparently many of those people have

their alley.

D is designed to be what C++ should have been. C programmers that don't 
like C++ because it's too complicated should like D. C++ programmers who 

That's the point - C++ should have handled the aliasing problem (among other C
weaknesses) differently, but the principal designers of that language were too
damn busy fighting over arcane OOP issues (sound familiar? ;).

If you're building a house and essentially all you have left to do 
before you sell it is install the carpet, is that a good time to pull 
all of the incandescent light fixtures and replace them with 
fluorescent lights? I think you're talking about big changes and 
recently Walter has been talking about being done with the design. If 
you were making these suggestions a year ago, you might have found more 
interest. But even then could have been too late. Walter's been 
designing D for several years.

Funny you should mention that. Something like it happened to a house I had
built. It turns out some of the electrical would not pass inspection so they had
to change that before they could lay the carpet (true story).

And I /may not/ be talking about changes that are all that huge or drastically
hard to implement either. It's certainly worth more discussion IMHO, which is
why I keep at this thread.

 If you made it here, thanks for taking the time to read this..

Thanks for posting. Maybe Walter can think of an easy way to adjust the 
Aliasing to allay your performance fears.

I hope so.

of how that's accomplished. With the increasing speed and memory of new 
hardware, performance issues aren't always the most important issue. And 
I think Walter's current goal is for DMD to work right and he'll work on 
getting it to be "leaner and meaner" later.

Sorry, but I have to call you on that.. Many have been saying similar for years
and it seems that software complexity always fills the gap, and then some. Java
fans are quoted as saying the same ad nauseam in the early days and even now to
some extent. In the eight or so years since Java took off, people are still
complaining about it and /both/ machines and Java runtimes are quite a bit more
performant now than they were then.

I've heard tell that a machine costing under about $5000 hasn't even been built
yet that can run the next version of Windows with generally acceptable
performance. MS is literally betting the farm that Moore's Law will continue
unimpeded so they have something to run what they sell. It's a good bet, but
still a bet.

Maybe with a D spec. that takes care of the aliasing issue, they could have
written the new Windows with D and I could run it on my trusty 'old' Pentium 4
machine, and spend the $3000 I'd save on a new Harley ;)

- Dave

Aug 09 2004

Ben Hinkle <bhinkle4 juno.com> writes:

[mega-snip]

 ;---
 D:
 
 import std.string;
 import std.stream;
 
 int main()
 {
 char[] input, output;
 
 output = "<TABLE><TR><TD>\n";
 File f = new File("some_large_file",FileMode.In);
 while(!f.eof()) {
 input = f.readLine();
 output ~= input;
 output ~= "</TD><TD>\n";
 }
 f.close();
 
 output ~= "</TD></TR></TABLE>\n";
 
 printf("output length: %d\n",output.length);
 return(0);
 }

Have you tried using a BufferedFile? The default File is unbuffered. There
are a number of posters who think the default File should be buffered and
I'm starting to agree - just because people seem to assume it is buffered.
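
The same buffered/unbuffered distinction exists in C's stdio, which makes it easy to try in isolation. The sketch below is a C analogue and not the std.stream code: count_lines is a hypothetical helper that reads a file one byte at a time, with the buffering mode chosen by the caller through setvbuf.

```c
#include <stdio.h>

/* Count newline characters in the file at `path`, reading one byte at a
   time with fgetc. `mode` is _IONBF (unbuffered) or _IOFBF (fully
   buffered). Returns -1 if the file cannot be opened or the mode cannot
   be applied. */
static long count_lines(const char *path, int mode)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    /* setvbuf has to be called before any other operation on the stream. */
    if (setvbuf(f, NULL, mode, mode == _IONBF ? 0 : BUFSIZ) != 0) {
        fclose(f);
        return -1;
    }
    long lines = 0;
    int c;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;
    fclose(f);
    return lines;
}
```

Both modes return the same count. In unbuffered mode each fgetc typically turns into its own read request to the operating system, which is where the time goes on a large file; timing the two calls on the same input is a quick way to see whether buffering is the bottleneck.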

 output length: 6944735
 real    0m37.747s
 user    0m31.840s
 sys     0m3.180s
 ;---

[mega-snip]

Aug 09 2004

Berin Loritsch <bloritsch d-haven.org> writes:

Ben Hinkle wrote:
<snip>

 Have you tried using a BufferedFile? The default File is unbuffered. There
 are a number of posters who think the default File should be buffered and
 I'm starting to agree - just because people seem to assume it is buffered.

I would be careful with this.  Something I have found with Java IO
Streams is that buffering can backfire--and if there is no way to
turn it off then the developer is stuck.  Let me give an example:

In a web environment you need to handle as many requests as you can
at one time.  This is key to scalability.  Initial testing might
suggest that using a BufferedInputStream for file IO would speed up
the request/response time on the server.  Then later, when you are
doing load testing, you find that the extra KB or so of RAM taken
up by the buffer is adding up quickly and your system starts falling
apart at the seams due to the heavy load on the memory system.

This is something I have been bitten by in the past.  I would be
surprised to find it only limited to Java.  The solution to use
unbuffered IO or even a greatly reduced buffer size helped the
scalability of the webapp.
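
The arithmetic behind that trade-off is simple enough to sketch. The C fragment below is an illustration under assumed numbers, not a reconstruction of the Java setup described above: total_buffer_bytes and open_with_buffer are hypothetical helpers, and the stream counts and buffer sizes are made up for the example.

```c
#include <stdio.h>
#include <stddef.h>

/* Bytes tied up in stream buffers when `streams` files are open at once,
   each with its own buffer of `buf_bytes`. */
static size_t total_buffer_bytes(size_t streams, size_t buf_bytes)
{
    return streams * buf_bytes;
}

/* Open `path` for reading with a caller-supplied buffer of `size` bytes,
   so the per-stream cost is chosen by the caller rather than the library.
   The buffer must stay valid until the stream is closed. */
static FILE *open_with_buffer(const char *path, char *buf, size_t size)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return NULL;
    if (setvbuf(f, buf, _IOFBF, size) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

For example, 10,000 simultaneously open streams with 8 KiB buffers hold about 78 MiB in buffers alone, while the same streams with 64-byte buffers hold well under 1 MiB. Keeping the buffer size a parameter leaves both ends of that range available.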

Aug 09 2004

"Stratus" <vdai spamnet.net> writes:

Really shouldn't waste the effort in replying, but this post is so full of
FUD ... Puhhhleez ! I will, however, not pollute this decent topic further
by pointing out exactly how much fallacy is involved. Instead, I'll try to
assume it's an offbeat attempt at humor.

"Berin Loritsch" <bloritsch d-haven.org> wrote in message
news:cf8c0q$qnr$1 digitaldaemon.com...
 I would be careful with this.  Something I have found with Java IO
 Streams is that buffering can backfire--and if there is no way to
 turn it off then the developer is stuck.  Let me give an example:

 In a web environment you need to handle as many requests as you can
 at one time.  This is key to scalability.  Initial testing might
 suggest that using a BufferedInputStream for file IO would speed up
 the request/response time on the server.  Then later, when you are
 doing load testing, you find that the extra KB or so of RAM taken
 up by the buffer is adding up quickly and your system starts falling
 apart at the seams due to the heavy load on the memory system.

 This is something I have been bitten by in the past.  I would be
 surprised to find it only limited to Java.  The solution to use
 unbuffered IO or even a greatly reduced buffer size helped the
 scalability of the webapp.

Aug 09 2004

Berin Loritsch <bloritsch d-haven.org> writes:

Stratus wrote:

 Really shouldn't waste the effort in replying, but this post is so full of
 FUD ... Puhhhleez ! I will, however, not pollute this decent topic further
 by pointing out exactly how much fallacy is involved. Instead, I'll try to
 assume it's an offbeat attempt at humor.

Actually, it is an anecdote of history.

 
 "Berin Loritsch" <bloritsch d-haven.org> wrote in message
 news:cf8c0q$qnr$1 digitaldaemon.com...
 
I would be careful with this.  Something I have found with Java IO
Streams is that buffering can backfire--and if there is no way to
turn it off then the developer is stuck.  Let me give an example:

In a web environment you need to handle as many requests as you can
at one time.  This is key to scalability.  Initial testing might
suggest that using a BufferedInputStream for file IO would speed up
the request/response time on the server.  Then later, when you are
doing load testing, you find that the extra KB or so of RAM taken
up by the buffer is adding up quickly and your system starts falling
apart at the seams due to the heavy load on the memory system.

This is something I have been bitten by in the past.  I would be
surprised to find it only limited to Java.  The solution to use
unbuffered IO or even a greatly reduced buffer size helped the
scalability of the webapp.

Aug 09 2004

Berin Loritsch <bloritsch d-haven.org> writes:

Berin Loritsch wrote:

 Stratus wrote:
 
 Really shouldn't waste the effort in replying, but this post is so full of
 FUD ... Puhhhleez ! I will, however, not pollute this decent topic further
 by pointing out exactly how much fallacy is involved. Instead, I'll try to
 assume it's an offbeat attempt at humor.

 
 
 Actually, it is an anecdote of history.
 

Besides, what's wrong with supplying both an UnbufferedFile and a
BufferedFile?  You have the flexibility when you need it.  I assume
you don't want to take the time to discover how little fallacy is
involved.  Or how that particular premature optimization bit me.

Aug 09 2004

J C Calvarese <jcc7 cox.net> writes:

Berin Loritsch wrote:

 Ben Hinkle wrote:
 <snip>
 
 Have you tried using a BufferedFile? The default File is unbuffered. There
 are a number of posters who think the default File should be buffered and
 I'm starting to agree - just because people seem to assume it is buffered.

 
 
 I would be careful with this.  Something I have found with Java IO
 Streams is that buffering can backfire--and if there is no way to
 turn it off then the developer is stuck.  Let me give an example:

I read "the default File should be buffered" to mean both are allowed 
and it's the developer's choice.

Just because one is given a preferential name doesn't mean that the 
other isn't allowed.

No need for alarm. ;)

-- 
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/

Aug 09 2004

Dave <Dave_member pathlink.com> writes:

In article <cf8b1l$qb0$1 digitaldaemon.com>, Ben Hinkle says...
Have you tried using a BufferedFile? The default File is unbuffered. There
are a number of posters who think the default File should be buffered and
I'm starting to agree - just because people seem to assume it is buffered.

Here is that along with a similar Java version. Thanks for the tip on
BufferedFile().

;----------
D version:

import std.string;
import std.stream;
import std.outbuffer;

int main()
{
    char[] input;
    OutBuffer output = new OutBuffer();

    output.write("<TABLE><TR><TD>\n");
    BufferedFile f = new BufferedFile("some_large_file",FileMode.In);
    while(!f.eof()) {
        input = f.readLine();
        output.write(input);
        output.write("</TD><TD>\n");
    }
    f.close();

    output.write("</TD></TR></TABLE>\n");

    printf("output length: %d\n",output.toString().length);
    return(0);
}
;---
output length: 6944735
real    0m0.767s
user    0m0.710s
sys     0m0.060s

;;-----------------------

Java version:

import java.io.*;
import java.util.*;
import java.text.*;

public class html {
    public static void main(String[] args)
    {
        String input;
        StringBuffer output = new StringBuffer();

        output.append("<TABLE><TR><TD>\n");
        try {
            FileReader f = new FileReader("some_large_file");
            BufferedReader in = new BufferedReader(f);
            while ((input = in.readLine()) != null) {
                output.append(input);
                output.append("</TD><TD>\n");
            }
            f.close();
        } catch (IOException e) {
            System.err.println(e);
            return;
        }

        output.append("</TD></TR></TABLE>\n");

        System.out.println("output length: " + output.toString().length());
    }
}
;---


output length: 6944735
real    0m1.881s
user    0m0.780s
sys     0m0.050s

Aug 09 2004

Derek Parnell <derek psych.ward> writes:

On Sun, 8 Aug 2004 20:48:36 +0000 (UTC), Dave wrote:

 In article <schancel-8833C7.08301601082004 digitalmars.com>, Sha Chancellor
 says...
In article <ceis7r$23bt$1 digitaldaemon.com>,
 Dave <Dave_member pathlink.com> wrote:

 I'm very new to D (literally as of yesterday), but am very impressed with 
 what I'm seeing so far.
 
 Being that I want this language to succeed and an important part of that will 
 be performance potential over C, I'm curious - how does/will D deal with the
 pointer 'aliasing problem' that plagues C and C++ compiler developers?


 <snip>
What aliasing problem would you be referring to?  (No, D does not deal 
with it with denial, it's an honest question :)

 
 --- Thump, thump - up on the soapbox ---
 
 I hate to say it, but it's appearing more and more like it is being dealt with
 by denial ;)

[big snip]

I think I understand what the "aliasing" issue is. What I can't quite see
is what you think the resolution should be. Are you asking for the compiler to
detect (potential?) aliasing situations and issue an error message? Or is
it a run-time solution you are requiring? 

If someone on my team coded something like the example you gave, I'd have
them rewrite that potentially dangerous code into something a *lot* more
sane. Even if updating a shared variable absolutely had to be coded, I'd
insist on a runtime check, along the lines of ...

int i;

void func( inout int ri )
{
    if (&ri != &i)
    {
        for( int j = 0; j < 10; j++ ) {
            ri++;
            i++;
        }
    }
    else
        throw new Error("Aliased ri/i");
}

int main()
{
i = 10;
func(i);
printf("%d\n",i);
return 0;
}

//--------------Though this is a safer way...
int i;

int func( in int ri )
{
    for( int j = 0; j < 10; j++ ) {
        ri++;
        i++;
    }
    return ri;
}

int main()
{
int res;
i = 10;

i = func(i);
printf("i=%d\n",i);

return 0;
}

//-----------------

For a compiler to detect aliasing and abort when it is found is a dangerous
route. It assumes that the compiler absolutely knows what is in the mind of
the coder. Who's to say that the aliasing situation is not the desired
one in some circumstances, e.g. to demonstrate it to students?

I think Walter saying that it is better to assume that variables are not
aliased is the better way to go. And if the coder writes code that can
possibly result in it happening, then it is their responsibility to check
for it.
-- 
Derek
Melbourne, Australia
9/Aug/04 9:42:21 AM

Aug 08 2004

Dave <Dave_member pathlink.com> writes:

In article <cf6f15$557$1 digitaldaemon.com>, Derek Parnell says...
On Sun, 8 Aug 2004 20:48:36 +0000 (UTC), Dave wrote:
[big snip]

I think I understand what the "aliasing" issue is. What I can't quite see
is what you think the resolution should be. Are you asking for the compiler to
detect (potential?) aliasing situations and issue an error message? Or is
it a run-time solution you are requiring? 

What I'm specifically thinking of is something very close to what Walter
mentions here: http://www.digitalmars.com/drn-bin/wwwnews?D/28215

Strictly speaking of aliasing problems, since I think non-overlapping array
slices are covered by the spec. already, non-aliasing function params would give
the most bang for the buck because this is where it messes up optimization the
most and probably where it is easiest to check.

I think it would be do-able for a compile-time check to be made on out/inout
function/method parameters for native, built-in and object data types, but /not/
pointers, and not for extern (C) functions. That way, the compiler could be
'sure' that it could aggressively optimize for these types of functions and not
worry about side effects.

I think this would also have a lot of utility because D coding won't require
pointers as much as C and, to some extent, C++. And of course, it may keep
programmers from stepping on each other's (heck, their own) code quite a bit in
large programs with quite a few modules.

I think the above would be a reasonably workable solution for the following
types of situations, because the scoping, out/inout params. and 'import'
functionality of D gives the compiler the visibility it needs to determine
aliasing like this.

Ok, I know the following is simplistic and doesn't cover all situations, but
something like this is what I'm talking about.

;---
objx.d:
class ObjX {
int varY;
int varZ;
}

main.d:
import objx;

int i;
ObjX x, y, z;
// The compiler would "only" need to track variables declared outside
//  a function's scope, of the type(s) passed through out/inout param(s),
//  and accessed in a function. The scope of all variables inside a
//  function has to be known when parsing the function, right?
//  Otherwise an "undefined identifier" error would occur.
void foo(inout int ri) {  ri++;  i++; }
// Function _D5main3fooFKivz stored in a linked list attribute in the symbol
//  table for variable _D5main1ii

void bar(out int ri, inout int rj) {  ri--;  rj++; }
void baz(inout ObjX ox) { ox.varY *= 10; x.varZ--; }

class A {
int j;
}
class B : A {
int k;
this() { this.x = new ObjX; }
void foobar(out int z) { k++; z = k / 10; }
ObjX x;
}

B b;
void foobaz(inout A a) { b.k / a.j; }

int a[];
void snafu(inout int arr[]) {
arr[] = 10;
for(int i; i < 10; i++) {
a[i] /= arr[i];
}
}

void main(char[][] args)
{
foo(i);   // Compile error: i is accessed in foo(); can't be passed byref
bar(i,i); // Compile error: i is passed by ref. for more than 1 param.
x = new ObjX();
baz(x);   // Error: x is accessed in baz()
y = x;    // Stored in symbol table for y as currently referring to x
FuncY();  // Error: x is referred to by y and accessed in baz(), called by
          //  FuncY() (see * below)
z = new ObjX();
y = z;    // Symbol table says y is now referring to z
FuncY();  // Ok
FuncZ();  // Error: a refers to b which is accessed in foobaz

b.foobar(b.k); // Error: b.k is passed by ref. and accessed in b.foobar()

a.length = 10;
snafu(a); // Error: a is passed by ref. and accessed in snafu()
for(int i; i < 10; i++) {
printf("a[%d] = %d\n",i,a[i]);
}

y = b.x;
FuncY(); // Ok: y referring to b.x, which is not accessed in baz()

b.x = x;
FuncY(); // Error: b.x refers to x, which is accessed in baz(), called by
         //  FuncY()
}

// (*) A symbol table lookup would check the scope of y for use in FuncY
//      The symbol lookup on y and a check of the 'refers to' attribute could
//       tell the compiler that y currently refers to x
//      symbol table attribute for x would list baz() as a function with a
//        reference parameter of type ObjX inside x's scope, generating an
//        error.
//  Possible w/o jumping through hoops??
void FuncY() {  baz(y); }
void FuncZ() { b = new B; A a = b; foobaz(a); }

;---

If someone on my team coded something like the example you gave, I'd have
them rewrite that potentially dangerous code into something a *lot* more
sane. Even if updating a shared variable absolutely had to be coded, I'd
insist on a runtime check, along the lines of ...

I think you may be talking more of the specifics of the example, whereas I'm
talking more generalities exemplified by it. From what I've seen there is a lot
of code out there where file scope vars. could be passed into large functions by
reference, and break things. Those are nasty bugs to track down too, especially
in someone else's code.

With the C and C++ specs. (and right now for D without 'noalias function
reference parameters' in the spec.), the compiler has to produce the
semantically correct results, so the compiler can't safely optimize many
functions even when in actuality they are never used in a way that would be
broken by the optimizations.
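
A small C sketch of that point, using pointers where D would use inout. The
names add_twice, distinct_case and aliased_case are made up for illustration.
Because dst and src might point at the same int, the compiler has to re-read
*src after the first store; keeping it in a register would change the result
of the aliased call, even though most callers never alias.

```c
#include <assert.h>

/* Adds *src to *dst twice. If dst and src alias, the first store
   changes *src, so the second read must see the updated value. */
void add_twice(int *dst, const int *src)
{
    assert(dst && src);
    *dst += *src;
    *dst += *src;
}

/* Two distinct variables: a = 1 + 2 + 2. */
int distinct_case(void)
{
    int a = 1, b = 2;
    add_twice(&a, &b);
    return a;
}

/* Same variable passed twice: a = 1 + 1 = 2, then 2 + 2. */
int aliased_case(void)
{
    int a = 1;
    add_twice(&a, &a);
    return a;
}
```

distinct_case() yields 5 and aliased_case() yields 4. A version that cached
*src in a register would return 3 for the aliased call, which is why the
compiler cannot apply that optimization without knowing the arguments differ.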

That's the crux of the issue. Aliasing can affect the code generated for a lot
more than just the few functions where it may actually apply. It's one of
those "a few bad apples spoil the barrel" type of things.

C has another problem: because extern scope vars. can be accessed in library
functions that are linked in, the programmer and compiler can't reasonably check
for this in a lot of cases.

D on the other hand can check for this easier if I'm correct that the import
statement gives the compiler visibility to imported module variable definitions.
That's also why it should be able to inline functions better over an entire
program, which is a huge plus for D (if I'm right about how import works).

I'm guessing Intel spent several man-years on Whole Program Optimization and
alias tracking so they could safely use aggressive optimization techniques when
building their C/C++ compiler. Looks like they succeeded - comparable C code
often (but not always) seems to run as fast as code from their Fortran compiler,
even for numerical stuff, at least for artificial benchmark types of code, and
their Fortran compiler is supposed to be quite good.

I don't think the alias fix would take several man-years for Walter. Because of
the language design (and who we have writing the compiler), I suspect it may be
something do-able, maybe even before v1.0 is released.

The whole point of my earlier rant is that I think a reasonable amount of effort
could pay big dividends for D. It's worth some more discussion anyhow, I think..

- Dave

Aug 08 2004

Dave <Dave_member pathlink.com> writes:

Dave wrote:

[big snip]
 community sees it that way. If D takes off (and I'm convinced it won't
 unless some basic issues like 'the aliasing problem' are taken care of)

[another]

For the record, after a bit more experience with D and a lot more thought, I
hereby officially "retract" the above statement ;).

On the contrary and FWIW, I'm starting to become more and more convinced D
will be a hit. I think in many ways it is already given its maturity
relative to other languages.

It just offers too many other advantages (in terms of optimization
opportunities and other major areas) to 'fail' because of this issue.

Thanks,

- Dave

Aug 17 2004

"Sampsa Lehtonen" <snlehton cc.hut.fi> writes:

How about introducing a special keyword so that you could mark variables
that will not alias?

I suggest this because alias detection is very costly to do. During
compilation it is impossible in general (just imagine an array of
pointers/objects that is filled at runtime). At runtime you need to do it
all the time for it to be effective - always when utilizing two variables
of the same kind, or taken to the extreme: when accessing two memory
locations.

For example:

void mangle(inout int a, inout int b)
{
   b += a + 5;
   a += b;
}

generates something like this in pseudo-risc-asm:

// calculate b+a+5
mov reg1, a // reg1 <- a
mov reg2, b // reg2 <- b
add reg2, reg2, 5 // reg2 <- reg2 + 5
add reg2, reg2, reg1 // reg2 <- reg2 + reg1
// store b
mov b, reg2
// calculate a+b
mov reg1, a
add reg1, reg1, reg2
// store a
mov a, reg1

If the values do not alias, this code contains one unnecessary read and one
store that could have waited until the end. b needs to be stored in the first
statement because the value of a would be changed if they aliased. Similarly, we
need to read the value of a again in the latter statement because we don't
know if the variables aliased or not.

If you know that these variables will never ever alias, you can touch them  
with noalias keyword which will tell the compiler that it can perform some  
optimizations.

void mangle(inout noalias int a, inout noalias int b)
{
   b += a + 5;
   a += b;
}

that would generate:

// calculate b+a+5
mov reg1, a
mov reg2, b
add reg2, reg2, 5
add reg2, reg2, reg1
// calculate a+b
add reg1, reg1, reg2
// store both a and b
mov a, reg1
mov b, reg2

Now the reload of a in between is gone and the store of b has moved to the end,
so the algorithm would run faster. With more complex algorithms the benefits get
even bigger as more stuff could be kept in the registers.

This syntax might be familiar to some of you from a compiler called
VectorC. (http://www.codeplay.com/)
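
For comparison, C99's restrict qualifier expresses roughly the same promise as
the proposed noalias. The sketch below is only an illustration under that
assumption: mangle_noalias is a made-up name, and the assert is an optional
debug check of the promise, not something restrict does on its own.

```c
#include <assert.h>

/* Plain version: a and b may alias, so b has to be stored and a
   re-read between the two statements. */
void mangle(int *a, int *b)
{
    assert(a && b);
    *b += *a + 5;
    *a += *b;
}

/* C99 version: restrict is the programmer's promise that a and b do
   not overlap, so the compiler may keep both values in registers.
   Breaking the promise is undefined behavior; the assert catches the
   simplest violation in debug builds only. */
void mangle_noalias(int *restrict a, int *restrict b)
{
    assert(a && b && a != b);
    *b += *a + 5;
    *a += *b;
}
```

With a = 1 and b = 2, both versions leave b = 8 and a = 9. Calling
mangle(&x, &x) with x = 1 is legal and leaves x = 14, which is exactly the
behavior the compiler must preserve when no promise is given.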

To sum it up:

Automatic alias detection is a very hard task - it involves
complex data flow analysis at compile time and tedious checking at run time.
At compile time, the optimizations must be safe. If there is a chance that
variables might alias, they are expected to alias. This reduces the
chances to optimize while the analysis time is huge (compare for
example to inline optimizations). It works on local variables though
(which will be optimized anyway). Run-time checks bloat the code so that
the benefits vanish. And what would happen if variables alias?
Exception thrown?

But manually aiding the compiler is very easy to implement and can be  
efficient. Instead of introducing new keyword it could be done with the  
compiler extensions that D supports (this is a compiler design problem,  
after all). There is still risk of human error, but then again if you need  
to optimize your code, you should know what you are doing.

-- texmex/sampsa lehtonen



On Sun, 1 Aug 2004 13:46:35 +0000 (UTC), Dave <Dave_member pathlink.com>  
wrote:

 I'm very new to D (literally as of yesterday), but am very impressed with what
 I'm seeing so far.

 Being that I want this language to succeed and an important part of that will be
 performance potential over C, I'm curious - how does/will D deal with the
 pointer 'aliasing problem' that plagues C and C++ compiler developers?

 From what little I've seen so far, it seems that this same problem has been
 'forced' on D by its backward compatibility with C libraries and C/C++ -like
 support for pointers.

 IMHO, any language that seeks to replace C/C++ should do its best to avoid this
 problem, or at least discourage code that introduces it.

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/

Aug 10 2004

"Sampsa Lehtonen" <snlehton cc.hut.fi> writes:

Duh!

I should have read all the messages in the thread before posting... never  
mind.

I think that function parameters should be considered to alias, at least  
in the regular builds. In the optimized versions (release), they could be  
expected not to alias... But then again, this might have weird side  
effects, where debug code works as expected and optimized doesn't.

At least it would be a good idea to have an option for "safe-compile". Coupled
with compiler extensions for noalias/alias it could prove powerful.

-- texmex/sampsa lehtonen

On Tue, 10 Aug 2004 16:09:57 +0300, Sampsa Lehtonen <snlehton cc.hut.fi>  
wrote:

 Howabout introducing a special keyword so that you could mark variables  
 that will not alias?




Aug 10 2004

Dave <Dave_member pathlink.com> writes:

In article <opscip3stw35qbu1 macray>, Sampsa Lehtonen says...
Duh!

I should have read all the messages in the thread before posting... never  
mind.

<snip>

Thanks for posting - your example helped clarify the 'cost' of aliasing on
performance, and why.

Below is a simple example of what I'm talking about and why I think this issue
is important to consider.

#include <cstdio>
// This sample is in C++ because it can demonstrate the large
//  performance difference with an Intel C++ compiler switch.
//  Think of '&' as 'inout' in D
void foo(int& ri, int& rj)
{
// ri and rj are not related as used and are operated on separately.
//  But the C/C++ spec. says the compiler has to assume that they may
//  reference the same var., so this code cannot be optimized aggressively.
for(int idx = 0; idx < 10000000; idx++) { ri++; rj++; }
}

void bar(int& ri, int& rj, int& rk)
{
// Same here - successive calls passing around references (very common,
//  especially in OOP code) only exacerbates the issue.
foo(ri,rj);
for(int idx = 0; idx < 10000; idx++) { ri = rj % 10; rk += rj - rk; }
}

int main()
{
int i = 0, j = 0, k = 0;
for(int idx = 0; idx < 1000; idx++) bar(i,j,k);
printf("i = %d, j = %d, k = %d\n",i,j,k);
}





i = 8, j = 1410065408, k = 1410065408
real    0m17.371s
user    0m17.310s
sys     0m0.010s


i = 8, j = 1410065408, k = 1410065408
real    0m2.414s
user    0m2.410s
sys     0m0.000s

It's apparent this can make a very large difference in performance, yet the
results and code are identical.

Considering methods calling methods calling methods, etc... The end-results can
be pretty large.

The magnitude of this result actually surprised me also, and I knew it was a
problem.

The whole concern is that more often than not, the aliasing 'de-optimization'
affects common, often used, correctly used code because C/C++ compilers allow
aliasing.

My proposal would be something along the lines of changing out/inout
function/method params to 'noalias' by default, if a compile-time or even debug
run-time check could be done.

How about this for a proposal:
- 'noalias' for out/inout params limited to primitive types, structs and arrays
of primitive types.
- pointers left as-is for C compatibility, and so code like the above could be
done if the side effects of param1 and param2 referencing the same variable are
desired.
- For primitive type params (not pointers to prim. types), warn on passing of a
de-referenced pointer. For example:
int i;
int* j = &i;
foo(i,*j); //Compiler warning: de-referenced ptr. passed inout.

Among others I'm sure, that leaves this case in D:

class X {
int i;
}

int main()
{
X x = new X();
x.i = 10;
X y = x;        // y is now a second reference to the same object as x in D
foo(x.i,y.i);
}

Anyone think of a solution for that?? Maybe that would be a case for a debug
runtime check. Or could the compiler reasonably check for this at compile time?
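
One way to picture the debug runtime check for this case, sketched in C with
pointers standing in for D's object references and inout parameters. The names
try_foo and ints_alias are hypothetical, and returning false is just one
possible reaction; an exception or assert would be alternatives.

```c
#include <assert.h>
#include <stdbool.h>

struct X { int i; };

/* True when both parameters refer to the same int in memory. */
bool ints_alias(const int *p, const int *q)
{
    return p == q;
}

/* Stand-in for foo(inout int, inout int): refuses to run when the
   two parameters alias, otherwise increments both. */
bool try_foo(int *ri, int *rj)
{
    assert(ri && rj);
    if (ints_alias(ri, rj))
        return false;      /* x.i and y.i were the same field */
    (*ri)++;
    (*rj)++;
    return true;
}
```

With struct X x = {10} and struct X *y = &x, the call try_foo(&x.i, &y->i)
is rejected because both arguments have the same address, even though the
source text names two different variables. The check costs one pointer
comparison per parameter pair.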

Something like the above would probably satisfy the numerics crowd along with
many other applications of inout params., because often the most performance
sensitive code (i.e.: passed params used in tight loops) deals with primitive
types.

They added the 'restrict' keyword in C, and I guess many lib. function calls are
being re-written to use it. Even functions like fopen are being changed:

/* C89: */
FILE *fopen(const char *path, const char *mode);
// C99:
FILE *fopen(const char * restrict path, const char * restrict mode);

The difference can be that large I guess. Who would've thought fopen()? It's
used a lot but usually not in tight loops..

Think of UI code passing primitive types around. Think of the innards of a lot of
templates using native types. Think of socket calls de/serializing structs.

Just about anything called in functions passing around references used
repetitively can be affected in a big way by aliasing.

As for your mention of aliasing within arrays, I believe the spec. already
prohibits some 'overlapping' for slicing, so that takes care of part of your
concern for native type arrays, I think.
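
As a rough illustration of what an overlap rule for slices amounts to, here is
a C sketch of a range check plus a copy that enforces it in debug builds. The
names slices_overlap and checked_copy are made up, and comparing addresses as
integers assumes a flat address space; this is not how any D compiler is known
to implement its check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True when [a, a+n) and [b, b+m) share at least one element.
   Addresses are compared as integers because relational comparison
   of pointers into unrelated objects is not defined in C. */
bool slices_overlap(const int *a, size_t n, const int *b, size_t m)
{
    assert(a != NULL && b != NULL);
    if (n == 0 || m == 0)
        return false;
    uintptr_t a0 = (uintptr_t)a, a1 = (uintptr_t)(a + n);
    uintptr_t b0 = (uintptr_t)b, b1 = (uintptr_t)(b + m);
    return a0 < b1 && b0 < a1;
}

/* Copies n ints; overlapping source and destination are treated as
   a programming error and caught only when asserts are enabled. */
void checked_copy(int *dst, const int *src, size_t n)
{
    assert(!slices_overlap(dst, n, src, n));
    memcpy(dst, src, n * sizeof *dst);
}
```

Adjacent slices such as buf[0..3] and buf[3..7] do not overlap, while
buf[0..5] and buf[3..7] do.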

- Dave

Aug 10 2004

Dave <Dave_member pathlink.com> writes:

I inadvertently skewed the results when I caused an overflow by bumping the loop
count up to make the test run for a decent amt. of time.

If you change the loop in both foo() and bar() to 1000000 iterations, the
relative difference is even larger (>10x compared to >7x):


i = 0, j = 1000000000, k = 1000000000
real    0m1.010s
user    0m1.000s
sys     0m0.000s


i = 0, j = 1000000000, k = 1000000000
real    0m11.287s
user    0m10.810s
sys     0m0.020s
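
The overflow can be checked by hand: 10000000 increments per foo() call times
1000 calls is 10^10, which does not fit in a 32-bit int. The C sketch below
models the wraparound with unsigned arithmetic; the function names are made up
for illustration, and it only reproduces the printed j values rather than
saying anything about what a given compiler must do on signed overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Total increments applied to j: iterations per foo() call times the
   number of bar() calls from main(). */
uint64_t total_increments(uint64_t per_call, uint64_t calls)
{
    return per_call * calls;
}

/* What a 32-bit counter holds after that many increments if it wraps
   modulo 2^32. Signed overflow is formally undefined in C and C++, so
   this only models the wraparound the posted output suggests. */
uint32_t wrapped_32(uint64_t increments)
{
    return (uint32_t)(increments & 0xFFFFFFFFu);
}

/* Compares the model against the j values printed in the thread. */
void check_posted_values(void)
{
    assert(wrapped_32(total_increments(10000000u, 1000u)) == 1410065408u);
    assert(wrapped_32(total_increments(1000000u, 1000u)) == 1000000000u);
}
```

The first run's j = 1410065408 is 10^10 modulo 2^32, and the second run's
j = 1000000000 is below the 32-bit signed maximum, so no wraparound occurs.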

BTW - Just to be clear, my intention is /not/ to suggest a compiler switch for D
like the demonstration C/C++ compiler has.

Thanks,

- Dave

In article <cfaq4t$2v67$1 digitaldaemon.com>, Dave says...
[big snip]

Aug 10 2004

Sean Kelly <sean f4.ca> writes:

Kind of a contrived example, but still applicable I suppose.  Walter had
mentioned defaulting to "noalias" for function parameters.  If the compiler can
enforce this for inout and class parameters then I'm all for it.  I can
understand how this may not be possible for pointers, however.


Sean

Aug 10 2004

Regan Heath <regan netwin.co.nz> writes:

On Tue, 10 Aug 2004 19:37:13 +0000 (UTC), Sean Kelly <sean f4.ca> wrote:

 Kind of a contrived example, but still applicable I suppose.  Walter had
 mentioned defaulting to "noalias" for function parameters.  If the compiler can
 enforce this for inout and class parameters then I'm all for it.  I can
 understand how this may not be possible for pointers, however.


If checking for aliasing is difficult/time consuming then we could only 
check in debug builds. eg.

void foo(inout int a, inout int b)
{
   //check a and b and assert if aliased (only in debug builds).
}


If aliasing is rare then noalias is a good default, but then an 'alias' 
keyword is required to tell the compiler when a parameter could be aliased:

void bar(alias inout int a, alias inout int b)
{
   //no check and assert (even in debug builds).
}


If aliasing is only a problem with 'inout' parameters then, instead of a new 
keyword, what about a new parameter mode, e.g.:

void bar(alias int a, alias int b)
{
   //no check and assert (even in debug builds).
}

So we have:
'in'    - (as is currently)
'out'   - (as is currently)
'inout' - (should not be aliased, debug check and assert)
'alias' - (same as inout except could be aliased, no debug check)
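
The last two rows of that table can be approximated in C with assert, which is
compiled out when NDEBUG is defined, much like a debug-only check. This is only
a sketch of the idea; foo and bar mirror the names used above and the pointer
parameters stand in for D's inout.

```c
#include <assert.h>

/* 'inout' row: parameters must not alias. The check exists only in
   debug builds (assert vanishes when NDEBUG is defined). */
void foo(int *a, int *b)
{
    assert(a != b);
    *a += 1;
    *b += 1;
}

/* 'alias' row: same access, but aliasing is allowed and never
   checked, so the compiler must assume a and b may overlap. */
void bar(int *a, int *b)
{
    *a += 1;
    *b += 1;
}
```

Calling bar(&z, &z) with z = 0 is fine and leaves z = 2. Calling foo(&z, &z)
would abort in a debug build and be unchecked in a release build.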

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Aug 10 2004

"Sampsa Lehtonen" <snlehton cc.hut.fi> writes:

On Wed, 11 Aug 2004 10:28:15 +1200, Regan Heath <regan netwin.co.nz> wrote:

 So we have:
 'in'    - (as is currently)
 'out'   - (as is currently)
 'inout' - (should not be aliased, debug check and assert)
 'alias' - (same as inout except could be aliased, no debug check)

I understand that making 'noalias' the default for out and inout
parameters sounds tempting, but because it is then impossible to guarantee
the correctness of the program, it shouldn't be done. I think that the
decision whether to take the risk or not should be given to the user. It
might be a compiler flag ('assume-noalias') or the ability to manually mark
the parameters with a 'noalias' keyword or compiler extensions. And this
doesn't mean that the compiler wouldn't optimize the non-aliased variables
unless it has been ordered to do so. Of course it can make the
optimizations that it can be sure of. For example, if the variables that
are used at the same time are insulated in a certain scope, the compiler
can quite easily see if aliasing would occur. In my opinion, compilers
should produce 100% correct code unless the user is willing to take risks
and try out some optimizations.

Also, I think that a separation between the meaning of the program code  
and the optimization performed on it should be made. That's why the  
keyword noalias feels a bit bad... How about pragmas?

pragma(noalias, a, b)
void foo(inout int a, inout int b)
{

}


Btw, how can I define multiple pragmas that affect the same declaration?
Should it be

pragma(foo)
{
pragma(bar)
{
void foobar() { }
}
}

? Looks a bit awkward.

pragma(foo)
pragma(bar)
void foobar() {}

looks better.

Also, I find it a bit odd that the compiler must report unknown pragmas as
an error... if we had some optimization pragmas that not all compilers
support, we would have to wrap them in version statements. But then you can't
create a pragma that affects a function AND applies only to a certain
compiler:

version(DigitalMars)
{
   pragma(noalias)
}
void foobar(inout int a, inout int b) {}

Or does this work? If it does, it doesn't make sense, because isn't the
version statement considered a block of fully formed statements (which
the pragma currently isn't: it's not terminated with ; but instead it's
bound to foobar)?

Well, that went a bit OT... sorry :)

-texmex/sampsa lehtonen

Aug 11 2004

Dave <Dave_member pathlink.com> writes:

In article <opsckkmb1035qbu1 macray>, Sampsa Lehtonen says...
I understand that making the 'noalias' as a default for out and inout  
parameters sounds tempting, but because it is then impossible to guarantee  
the correctness of the program, it shouldn't be done. I think that the  

Please refer to my post just ahead of this one. Debug checks are already done
for array bounds and I think would be consistent for noalias for value types and
arrays if that is clearly spelled out in the language spec. (like here:
http://digitalmars.com/d/arrays.html for Array Bounds Checking spec.).

decision whether to take the risk or not should be given to the user. It  
might be a compiler flag ('assume-noalias') or ability to manually mark  
the parameters with 'noalias' keyword or compiler extensions. And this  
doesn't mean that the compiler wouldn't optimize the non-aliased variables  

I don't think the language needs more keyword/type specifier complexity to carry
this out, especially if it is clearly outlined in the spec. what the debug build
runtime checks would be responsible for.

I'm way against non-spec. compiler extensions. This language is intended to be
portable. And I don't want my CD full of code examples to break because I change
compilers.

Compiler flags like 'assume-noalias' to me cover the shortcomings of a language
and are one of the biggest learning-curve challenges to beginners, and add a lot
of time to optimizing/tuning code (i.e.: let's see, if I code this and set this
flag, it is this fast, hmmmm, what if I code this, and set this flag, or, oh
yea, how about this...). Plus, and this is a very 'real-world' scenario, what if
some code is changed that 'assume-noalias' breaks and the developer forgets to
tell the build-master about it (or change the build him/herself)?

I think flags like this would be one of the biggest nits for people coming from


Finally, it is possible, given that some of the infrastructure/architecture is
already there for debug array bounds checking, that the 'noalias' debug runtime
checks would both keep the compiler implementation simpler and be more
consistent with what users expect, as well as warning them of potential aliasing
issues when they are not intended.

Referring to my post just ahead of this one, another pitfall to that idea as
outlined there.. I realize that COM Interfaces may be a problem, as well as
extern and export. An addition to that spec. proposal would be to exempt those
from noalias. That again would be consistent with other parts of the language
spec.

unless it has been an order to do so. Of course it can take the  
optimizations that it can be sure of. For example, if the variables that  
are used at the same time are insulated in a certain scope, the compiler  
can quite easily see if aliasing would occur. In my opinion, compilers  

Yes - in this case, the compiler should abort/report/warn and not leave it to
the runtime debug checks, as is in the spec. for array bounds checking. But if
that is difficult to do or makes things more confusing, I think runtime debug
checks are good enough.

- Dave

Aug 11 2004

"Sampsa Lehtonen" <snlehton cc.hut.fi> writes:

On Wed, 11 Aug 2004 15:33:48 +0000 (UTC), Dave <Dave_member pathlink.com>  
wrote:


 I'm way against non-spec. compiler extensions. This language is intended to be
 portable. And I don't want my CD full of code examples to break because I change
 compilers.

Well, my idea was that the compiler extensions (pragmas) would just be  
hints for the compiler how to optimize the code. If compiler doesn't  
support them, it would ignore them.

 Compiler flags like 'assume-noalias' to me cover the shortcomings of a language
 and are one of the biggest learning-curve challenges to beginners, and add a lot
 of time to optimizing/tuning code (i.e.: let's see, if I code this and set this
 flag, it is this fast, hmmmm, what if I code this, and set this flag, or, oh
 yea, how about this...). Plus, and this is a very 'real-world' scenario, what if
 some code is changed that 'assume-noalias' breaks and the developer forgets to
 tell the build-master about it (or change the build him/herself)?

Been there, done that. I know it should be avoided. However, making a
compiler that produces optimal code automagically just isn't that easy.
The other solution is to give the aliasing info variable by variable.

 Finally, it is possible, given that some of the infrastructure/architecture is
 already there for debug array bounds checking, that the 'noalias' debug runtime
 checks would both keep the compiler implementation simpler and be more
 consistent with what users expect, as well as warning them of potential aliasing
 issues when they are not intended.

I hope you understand what these runtime checks would be. They aren't done  
just in the header of the function, but they should be done all the time.  
For example:

MyClass a = x;
MyClass b = y;
if (random > 0.5)
   b = x;
// check for aliasing here or assume aliasing
a.i += b.i;
b.i += a.i;

In that example, it can't be detected at compile time whether aliasing will
occur or not. The compiler cannot hold a.i in a register but must flush
it back to memory if aliasing is assumed, OR a check must be inserted
into the code if we want to detect aliasing. This is just a simple example;
with a more complex one the number of checks would be huge.

And the checks must be made against all variables that are live  
simultaneously, which would bloat the code even more.
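
To see why the a.i / b.i snippet cannot be reordered freely, here is the same
pair of statements as a C function, with a struct pointer standing in for a D
object reference. The function name update is made up for illustration.

```c
#include <assert.h>

struct MyClass { int i; };

/* The two statements from the example above. If a and b point at
   the same object, the first store changes what the second line
   reads, so a->i cannot simply stay in a register. */
void update(struct MyClass *a, struct MyClass *b)
{
    assert(a && b);
    a->i += b->i;
    b->i += a->i;
}
```

With two distinct objects holding 1 and 2, the result is a.i = 3 and b.i = 5.
With both pointers referring to one object holding 1, the result is 4: the
first statement doubles it to 2 and the second doubles it again.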

 Referring to my post just ahead of this one, another pitfall to that idea as
 outlined there.. I realize that COM Interfaces may be a problem, as well as
 extern and export. An addition to that spec. proposal would be to exempt those
 from noalias. That again would be consistent with other parts of the language
 spec.

Well all external variables should be considered aliasing. But then again,  
if the programmer knew that certain variables wouldn't alias, how would he  
tell this to the compiler?...

 unless it has been an order to do so. Of course it can take the
 optimizations that it can be sure of. For example, if the variables that
 are used at the same time are insulated in a certain scope, the compiler
 can quite easily see if aliasing would occur. In my opinion, compilers

 Yes - in this case, the compiler should abort/report/warn and not leave it to
 the runtime debug checks, as is in the spec. for array bounds checking. But if
 that is difficult to do or makes things more confusing, I think runtime debug
 checks are good enough.

Compiler might not be sure whether aliasing will occur and unnecessary  
warnings would be printed.

The aliasing problem isn't as simple as you might think. As I said, it  
isn't just a matter of function parameters, it's everything that has  
something to do with pointers. By pointers I mean _actual_ pointers to  
memory locations, not just * pointers in D or C/C++. Objects in D and Java  
are pointers as well.

There is no magic bullet to this matter. Compiler cannot do everything for  
the programmer.

-texmex/sampsa lehtonen

Aug 12 2004

Dave <Dave_member pathlink.com> writes:

In article <opscl6lhin35qbu1 macray>, Sampsa Lehtonen says...
I hope you understand what these runtime checks would be. They aren't done  
just in the header of the function, but they should be done all the time.  
For example:

MyClass a = x;
MyClass b = y;
if (random > 0.5)
   b = x;
// check for aliasing here or assume aliasing
a.i += b.i;
b.i += a.i;

_All_ this thread has been talking about is aliasing of _function parameters_.

Not necessarily all of them either. For example, the spec. could be written to
/just/ check aliasing on /value types and value type arrays/, not reference
types, not array members (the spec. disallows overlapping already), not arrays
of reference types, maybe leave structs out as well, and _not_ pointers.

extern (C) or extern (Windows) would be left as-is in any case.

void foo(inout int[] arr1, inout int[] arr2)
{
// debug runtime check here: if &arr1 == &arr2, warn

// Code following is compiled as it is now
MyClass a = x;
MyClass b = y;
if (random > 0.5)
   b = x;
// check for aliasing here or assume aliasing
a.i += b.i;
b.i += a.i;

}

And the checks must be made against all variables that are live  
simultaneously, which would bloat the code even more.

The run time checks would be debug only, like they are now for array bounds. No
release bloat, just smaller and faster release code ;)

See above for when the checks would be inserted. This is consistent with array
bounds checking.
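
A minimal sketch of what such a debug-only check might look like if written by hand, assuming current D, where `ref` plays the role that `inout` plays in this thread. The function name `addBoth` is made up for illustration; nothing here is an actual compiler feature.

```d
import std.stdio;

// Hypothetical helper mirroring the "a.i += b.i; b.i += a.i;" example above.
void addBoth(ref int a, ref int b)
{
    // Hand-written stand-in for the proposed compiler-inserted check.
    // assert is compiled out when building with -release.
    assert(&a !is &b, "addBoth: a and b refer to the same variable");
    a += b;
    b += a;
}

void main()
{
    int x = 1, y = 2;
    addBoth(x, y);      // distinct variables: the check passes
    writeln(x, " ", y); // prints "3 5"
    // addBoth(x, x);   // would trip the assert in a non-release build
}
```

As with array bounds checks, the cost is paid only in non-release builds.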

Well all external variables should be considered aliasing. But then again,  
if the programmer knew that certain variables wouldn't alias, how would he  
tell this to the compiler?...

If the spec. is limited to just function params., externs would be checked, i.e.
(C++ for clarity):

extern int i;
extern int *j;

void foo(int &ri)
{

ri++;

if(random > 5) {
// i is declared outside function scope and is accessed in this function and
//  the function has int ref. param.
// Compiler inserts ri != i check here - consistent with array bounds
i++;
}

if(random > 10) {
// pointer involved
// Code generated here would be the same as now - assume aliasing with j
*j = ++ri;
}
}

Ok - so *j = ++ri; might complicate the job of the compiler a bit. If that's the
case, switch back to assume aliasing for the whole function if it contains
pointers declared outside its scope. With D that probably won't affect nearly
as many functions as in C/C++.
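
The same idea for a variable declared outside the function, again as a hand-written sketch in current D rather than anything the compiler does today. The names `counter` and `bump` are hypothetical.

```d
int counter; // declared outside the function's scope

// Hypothetical function taking a by-reference parameter of the same type.
void bump(ref int ri)
{
    ri++;
    // The check would sit right before the outside variable is touched,
    // much like a bounds check sits at the indexing expression.
    assert(&ri !is &counter, "bump: ri aliases counter");
    counter++;
}

void main()
{
    int local = 0;
    bump(local);      // fine: local and counter are different variables
    // bump(counter); // would trip the assert in a non-release build
}
```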

Compiler might not be sure whether aliasing will occur and unnecessary  
warnings would be printed.

Fine - debug runtime checks only.. However, I would think that in some cases the
compile-time check would be pretty much foolproof, i.e.:

int i;
foo(i,i); // Warning during type checking of func. params

The aliasing problem isn't as simple as you might think. As I said, it  
isn't just a matter of function parameters, it's everything that has  

I think you're making it harder than it has to be ;) I'm /not/ talking about
global aliasing or even operations on pointers, just value type variables
and array function params.

Doing this for just function params. for value types (_as explained above_)
would probably provide a big bang for the buck and keep 'noalias' manageable and
reasonably safe.

Aug 12 2004

Regan Heath <regan netwin.co.nz> writes:

On Wed, 11 Aug 2004 16:19:13 +0300, Sampsa Lehtonen <snlehton cc.hut.fi> 
wrote:
 On Wed, 11 Aug 2004 10:28:15 +1200, Regan Heath <regan netwin.co.nz> 
 wrote:

 So we have:
 'in'    - (as is currently)
 'out'   - (as is currently)
 'inout' - (should not be aliased, debug check and assert)
 'alias' - (same as inout except could be aliased, no debug check)

 I understand that making the 'noalias' as a default for out and inout  
 parameters sounds tempting, but because it is then impossible to 
 guarantee the correctness of the program, it shouldn't be done.

Is it impossible? In a debug build couldn't the compiler insert checks to 
ensure the variables are not aliased?

I understand that for pointers you cannot determine whether they are 
aliased or not, but D's arrays can be checked trivially, and pointers (so 
far) seem much less important in D than in C/C++.

So perhaps 2 statements could be made:
  - pointers are assumed to be aliased unless 'noalias' is used.
  - other variables are assumed not to be aliased, unless 'alias' is used.

And in debug builds the validity of the above could be checked (where 
possible).
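
To illustrate why arrays are the easy case: a D array carries both a pointer and a length, so an overlap test is two comparisons. The following is only a sketch; `overlaps` is a hypothetical helper, not a library function.

```d
// Hypothetical helper: true when two slices share at least one element.
bool overlaps(const(int)[] a, const(int)[] b)
{
    if (a.length == 0 || b.length == 0)
        return false;
    // Two non-empty ranges intersect when each one starts before the other ends.
    return a.ptr < b.ptr + b.length && b.ptr < a.ptr + a.length;
}

void main()
{
    int[] data = [1, 2, 3, 4, 5, 6];
    assert( overlaps(data[0 .. 4], data[2 .. 6])); // share elements 2 and 3
    assert(!overlaps(data[0 .. 3], data[3 .. 6])); // adjacent, no shared element
}
```

A bare pointer has no length attached, which is why the same test cannot be written for it.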

 I think that the  decision whether to take the risk or not should be 
 given to the user. It  might be a compiler flag ('assume-noalias') or 
 ability to manually mark  the parameters with 'noalias' keyword or 
 compiler extensions. And this  doesn't mean that the compiler wouldn't 
 optimize the non-aliased variables  unless it has been an order to do 
 so. Of course it can take the  optimizations that it can be sure of. For 
 example, if the variables that  are used at the same time are insulated 
 in a certain scope, the compiler  can quite easily see if aliasing would 
 occur. In my opinion, compilers  should produce 100% correct code unless 
 the user is willing to take risks  and try out some optimizations.

I agree with the general statement here. The compiler should produce 
stable code by default.

I think for most variables in D you can verify whether they are aliased or 
not, pointers being the big exception that comes to mind. Given that, if a 
feature was added to the compiler to check them (where it can) in debug 
builds and optimise them (where it can) in release builds, wouldn't we get 
stable /and/ fast code?

 Also, I think that a separation between the meaning of the program code  
 and the optimization performed on it should be made. That's why the  
 keyword noalias feels a bit bad... How about pragmas?

I dislike pragmas in general.

That said, on one hand I see what you're saying, but on the other, an 
'alias' keyword does affect the meaning of the program code, or rather 
describes a property of the variable which then can have an effect on the 
program code.

 pragma(noalias, a, b)
 void foo(inout int a, inout int b)
 {

 }

Lastly, I don't want to have to type that much. :) Some call me lazy, I 
prefer efficient.

 Btw, how can i define multiple pragmas that affect same declaration?  
 Should it be

 pragma(foo)
 {
 pragma(bar)
 {
 void foobar() { }
 }
 }

 ? Looks a bit awkward.

 pragma(foo)
 pragma(bar)
 void foobar() {}

 looks better.

Is foobar missing its parameters foo and bar?
If so, I prefer

void foobar(alias inout foo, alias inout bar)
{
}

 Also, I find it a bit odd that the compiler must report unknown pragmas 
 as an error... if we had some optimization pragmas that not all 
 compilers support, we would have to wrap them in version statements. But 
 we can't create a pragma that affects a function AND applies only to a 
 certain compiler:

 version(DigitalMars)
 {
    pragma(noalias)
 }
 void foobar(inout a, inout b) {}

 Or does this work? If it does, it doesn't make sense, because isn't the  
 version statement considered a block of fully formed statements 
 (which the pragma currently isn't; it's not terminated with ; but 
 is instead bound to foobar)?

 Well, that went a bit OT... sorry :)

NP.. If you want this to get some attention I'd post it as its own topic 
(if you haven't already.. I have not checked).

Regards,
Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Aug 11 2004

Dave <Dave_member pathlink.com> writes:

In article <opsclc1kpg5a2sq9 digitalmars.com>, Regan Heath says...
On Wed, 11 Aug 2004 16:19:13 +0300, Sampsa Lehtonen <snlehton cc.hut.fi> 
wrote:
 I understand that making the 'noalias' as a default for out and inout  
 parameters sounds tempting, but because it is then impossible to 
 guarantee the correctness of the program, it shouldn't be done.

Is it impossible? In a debug build couldn't the compiler insert checks to 
ensure the variables are not aliased?

I understand that for pointers you cannot determine whether they are 
aliased or not, but D's arrays can be checked trivially, and pointers (so 
far) seem much less important in D than in C/C++.

So perhaps 2 statements could be made:
  - pointers are assumed to be aliased unless 'noalias' is used.
  - other variables are assumed not to be aliased, unless 'alias' is used.

<snip>

Just curious..

If:
- D still aliases by default with pointers (and D uses ptrs. less anyhow)
- non-pointers are changed to noalias
- pointers could still be used to get the alias side-effects if desired
- and the debug runtime checks would presumably work like array bounds
checking..

Why would D need new keywords, compiler extensions, pragmas or compiler
switches?

Thanks,

Dave

Aug 11 2004

Regan Heath <regan netwin.co.nz> writes:

On Thu, 12 Aug 2004 04:57:24 +0000 (UTC), Dave <Dave_member pathlink.com> 
wrote:

 In article <opsclc1kpg5a2sq9 digitalmars.com>, Regan Heath says...
 On Wed, 11 Aug 2004 16:19:13 +0300, Sampsa Lehtonen <snlehton cc.hut.fi>
 wrote:
 I understand that making the 'noalias' as a default for out and inout
 parameters sounds tempting, but because it is then impossible to
 guarantee the correctness of the program, it shouldn't be done.

 Is it impossible? In a debug build couldn't the compiler insert checks 
 to
 ensure the variables are not aliased?

 I understand that for pointers you cannot determine whether they are
 aliased or not, but D's arrays can be checked trivially, and pointers 
 (so
 far) seem much less important in D than in C/C++.

 So perhaps 2 statements could be made:
  - pointers are assumed to be aliased unless 'noalias' is used.
  - other variables are assumed not to be aliased, unless 'alias' is 
 used.

 <snip>

 Just curious..

 If:
 - D still aliases by default with pointers (and D uses ptrs. less anyhow)
 - non-pointers are changed to noalias
 - pointers could still be used to get the alias side-effects if desired
 - and the debug runtime checks would presumably work like array bounds
 checking..

 Why would D need new keywords, compiler extensions, pragmas or compiler
 switches?

Perhaps not. :)

Unless you're forced to use a pointer, know it's not aliased and want to 
have it optimise.

Regan


Aug 16 2004

Dave <Dave_member pathlink.com> writes:

Regan Heath wrote:

 On Thu, 12 Aug 2004 04:57:24 +0000 (UTC), Dave <Dave_member pathlink.com>
 wrote:
 
 Just curious..

 If:
 - D still aliases by default with pointers (and D uses ptrs. less anyhow)
 - non-pointers are changed to noalias
 - pointers could still be used to get the alias side-effects if desired
 - and the debug runtime checks would presumably work like array bounds
 checking..

 Why would D need new keywords, compiler extensions, pragmas or compiler
 switches?

 
 Perhaps not. :)
 
 Unless you're forced to use a pointer, know it's not aliased and want to
 have it optimise.
 

I was thinking along the lines of not adding complexity (from the user
standpoint) to the language or tools if it could be avoided and still cover
most cases.

I figure being 'forced' to use pointers in D would almost always happen when
calling libs. from other languages and in that case the rules for the other
language inside the lib. function would apply anyhow.

From my understanding of D so far, even things that would ideally run really
fast like iterators will be implicit references and not explicit pointers,
but I'm sure there are other cases.

This and the better code/data visibility of the D compiler through import
are two of the reasons I'm so eager to try and fix part of the aliasing
overhead with D - I think it can be done w/o complicated kludges like
"link-time code generation" as C/C++ compilers are forced to use.

The implications of import on inlining and finalizing along with other
things in D like foreach(...) and first-class arrays to simplify
optimization are pretty large I think, even if aliasing is not directly
addressed.

This really is a very, very cool language design, IMHO.

BTW - In an earlier rant, I mentioned I thought D would not succeed if the
aliasing issue is not addressed. I certainly /do not/ think that way
anymore.

Thanks,

- Dave

Aug 17 2004

Dave <Dave_member pathlink.com> writes:

In article <opscjfddfc5a2sq9 digitalmars.com>, Regan Heath says...
If checking for aliasing is difficult/time consuming then we could only
check in debug builds. eg.

void foo(inout int a, inout int b)
{
   //check a and b and assert if aliased (only in debug builds).
}

Is your proposal that the asserts be inserted by the compiler for debug
builds? If so, I like it!

It's consistent with other runtime checks for D, like array bounds, so that
would make sense, be intuitive and presumably more straightforward to implement
for the compiler developer.

I would guess that array bounds are checked at runtime and not compile time for
the same reasons it's hard for aliasing (i.e.: it's very hard for the compiler
to resolve if array bounds will be violated or not at compile time).

I think it would also have to check the case where the function accessed
a variable outside the function scope that is the same type as 'a':

int b;

Aug 11 2004

Dave <Dave_member pathlink.com> writes:

Bahhh! This happened again!! Either my browser or the news server somehow cut my
post short (Ok, quit clapping <g>).

Here it is again:

;---

In article <opscjfddfc5a2sq9 digitalmars.com>, Regan Heath says...
If checking for aliasing is difficult/time consuming then we could only
check in debug builds. eg.

void foo(inout int a, inout int b)
{
   //check a and b and assert if aliased (only in debug builds).
}

Is your proposal that the asserts be inserted by the compiler for debug
builds?

If so, I like it!

It's consistent with other runtime checks for D, like array bounds, so that
would make sense and be intuitive.

I think it would also have to check the case where the function accessed
a variable outside the function scope that is the same type as 'a':

int b;
int[] arr;

/* more code */

void foo(inout int a)
{
// (i) debug assert here?
a++;
arr.length = 5;
if(whatever == true) {
// or (ii) debug assert here - depends on a thorough test case?

b++;

// I'm for (ii) because that is consistent with debug build array bounds
//  checking now and would be the easiest to implement.

// runtime array bounds error for debug builds happens here now.
arr[5] = b;
}
}

If aliasing is rare then noalias is a good default, then an 'alias'
keyword is required to tell the compiler when a parameter could be aliased:

void bar(alias inout int a, alias inout int b)
{
   //no check and assert (even in debug builds).
}

If aliasing is only a problem with 'inout' parameters instead of a new
keyword, what about a new parameter mode, eg:

It would be 'out' and 'inout' for value types; in, out and inout for reference
types.

However, I think just changing how out and inout are handled for value types and
in/out/inout for array refs. would be Ok, since that would be pretty consistent
with both how value types vs. reference types are expected to be handled in D
now, and arrays are 'built-in' just like other value types like int, double,
etc.

The reason I say this is partly to remove complexity, but also partly because
the performance/memory advantage of passing objects by ref. instead of copying
them probably outweighs the advantages of 'noalias' most of the time (for
reference types). Again, consistent with how the language is implemented.

When I say debug check for array objects, I'm not proposing it be done for
individual elements. For value types that is covered by 'no overlapping' in the
spec. already and for ref. types it wouldn't matter because they are never
'noalias' anyway (so this would be consistent as above).

This would probably even be pretty intuitive for users of other languages like
Java too.

void bar(alias int a, alias int b)
{
   //no check and assert (even in debug builds).
}

Maybe we could get by w/o a new keyword at all, since pointers can be
used for value types if aliasing is desired. This would also enforce the
'noalias for value types' idea for developers.

I think something close to this just may work!

The proposal would be something close to:
- Change the spec. to 'noalias' for out/inout on value types.
- Same for array references, except all of in/out/inout. This is consistent
because arrays are built-ins, like value types, while other things passed by
ref. are not strictly built-ins.
- Since non-builtin ref. types are always passed by ref., they act as they do
now.
- Mimic the debug build runtime array bounds check for a 'noalias' check.
- (I added this the 2nd time around:) extern and export specifiers would exempt
the noalias.

This idea:
- Shares consistency with how D handles array bounds checks now.
- Shares consistency with how D handles value, ref. types and built-ins now,
providing a clear delineator that can be followed by developers.
- Much easier to implement than strict compile-time checks.
- Allows D a one-up over other languages:
  - Warns on aliasing issues (I believe that Fortran does not)
  - Allows for aggressive code gen. for these functions (C/C++ does not)
- Shares the by-value/by-ref. func. param. semantics of Java
- Takes into account COM interfaces, I think.

Pitfalls I can think of off-hand:
- pointer struct members or struct members that reference by ref. types.
  - How big a problem would this be out in the real world?
  - Could the runtime check be applied for each member here as well, w/o jumping
    through hoops?
  - Could the spec. leave this as undefined perhaps?

Other thoughts?

Thanks,

- Dave

Aug 11 2004

Norbert Nemec <Norbert Nemec-online.de> writes:

Hi there,

sorry I did not comment on this before - I just returned home after two
weeks of traveling.

Just a short statement about my current view on aliasing. ("current",
because this view is still evolving...)

First, the problem of aliasing in C was not something that came up because
of some design fault, but because of the existence of references. Fortran
has no references. The only possible cause for different names referring to
the same object in memory would be function arguments. There, aliasing is
prohibited, so Fortran has no aliasing at all, allowing very aggressive
optimizations. In C, function arguments are only one of many possible
causes for aliasing, so the "restrict" does not solve the problem but only
softens it a bit.
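
A small sketch of why possible aliasing blocks optimization, written in current D with `ref` parameters. The function `addTwice` is a hypothetical example, not something taken from this thread.

```d
import std.stdio;

// Hypothetical function: adds b to a twice.
void addTwice(ref int a, ref int b)
{
    a += b;
    a += b; // b must be re-read: if a and b alias, the first line changed it
}

void main()
{
    int x = 1, y = 10;
    addTwice(x, y); // distinct variables: 1 + 10 + 10
    writeln(x);     // prints 21

    int z = 1;
    addTwice(z, z); // aliased: 1 + 1 = 2, then 2 + 2
    writeln(z);     // prints 4
}
```

A compiler allowed to assume that `a` and `b` never alias could rewrite the body as `a += 2 * b`, which would give 3 instead of 4 in the aliased call. Fortran's rule makes that rewrite legal; in C and D the compiler has to prove it.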

As I see it, there are only two ways to rival the performance of Fortran:
either step back into stone-age and create a language without references or
step forward into the future and introduce high-level abstractions that
allow the compiler to know more about the semantics of the code.

Vectorized expressions, as they have been discussed before, will hopefully
solve the problem of aliasing in the most common cases (by allowing the
compiler to freely choose an order of execution). For other cases, we can
only wait for the problems to arise and solve them then (maybe even by
introducing something like the dreaded "restrict"). I strongly doubt that
anyone will have success in finding "the solution" for the problem as a
whole.
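
For illustration only: the vectorized expressions mentioned here were a proposal when this was written, but current D has array operations of roughly this shape. The sketch below assumes a current D compiler; `addArrays` is a hypothetical wrapper.

```d
// Hypothetical wrapper around a D array operation.
// Assumes b.length == c.length.
int[] addArrays(const(int)[] b, const(int)[] c)
{
    auto a = new int[b.length];
    // Element-wise add. No loop order is spelled out in the source,
    // and the destination must not overlap the operands.
    a[] = b[] + c[];
    return a;
}

void main()
{
    assert(addArrays([1, 2, 3], [10, 20, 30]) == [11, 22, 33]);
}
```

Because the expression names whole arrays rather than a sequence of element accesses, the question of whether one store affects a later load does not arise inside it.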

Ciao,
Norbert


Dave wrote:

 
 I'm very new to D (literally as of yesterday), but am very impressed with
 what I'm seeing so far.
 
 Being that I want this language to succeed and an important part of that
 will be performance potential over C, I'm curious - how does/will D deal
 with the pointer 'aliasing problem' that plagues C and C++ compiler
 developers?
 
 From what little I've seen so far, it seems that this same problem has
 been 'forced' on D by its backward compatibility with C libraries and
 C/C++ -like support for pointers.
 
 IMHO, any language that seeks to replace C/C++ should do its best to
 avoid this problem, or at least discourage code that introduces it.

Aug 14 2004
