digitalmars.D - Const, strings, and other things.

Jarrett Billingsley <kb3ctd2 yahoo.com> writes:

That topic name is _dangerously_ catchy.

Anyway some noobish thoughts were running through my constless brain 
today.  By constless I mean I'm in the "I've never used const and 
haven't really run into any cases where I've thought it 
necessary/useful" camp, although I can't deny that there are _some_ uses 
for it.  But the more I thought about it, the more it seemed to me 
that.. well, if we're not trying to be C++, and we're not trying to make 
it possible to have user types behave exactly like built-in types, then 
maybe a generic const system isn't really all that necessary.

What my thoughts boiled down to is that constness seems useful for 
strings and not much else.  I suppose it could be useful for other kinds 
of arrays, but overwhelmingly the use cases for const seem to be for 
strings, and constness helps to make some operations with strings more 
efficient.  Other than that, constness for other types seems more like a 
logical convenience.  So you want to return a const(Foo) where Foo is a 
class reference?  How often do you need to do that?  Passing const refs 
-- have YOU ever had a bug where you tried to modify a const ref param 
that was caught by the const?

The more I hear about const, and the more conversations I watch about 
it, the more complex and esoteric it gets.  I think that constness for 
strings only could cover a large majority of use cases for const without 
having to have a pervasive, complex addition to the type system.

(I just keep looking at D2 and seeing this awful const wart on it, and 
thinking I'd really like to try it out if it weren't for const.  Ugh.)

Nov 12 2007

"Janice Caron" <caron800 googlemail.com> writes:

I like const!

I like to know that when I pass /my/ data to /your/ function, then
your function is not going to mess with my data. const in the
declaration of your function is what gives me that guarantee.
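
A minimal sketch of that guarantee, written in D2 syntax. The `Image` class and the `brightest` function are made up for illustration and are not from any real library. The `const` parameter lets the callee read the data, while a write through it is rejected at compile time.

```d
class Image
{
    int[] pixels;
    this(int[] pixels) { this.pixels = pixels; }
}

// Hypothetical callee: `const` promises the caller that `img` is only read.
int brightest(const Image img)
{
    // const is transitive in D2, so writing through img does not compile.
    static assert(!__traits(compiles, img.pixels[0] = 0));

    int best = int.min;
    foreach (p; img.pixels)
        if (p > best) best = p;
    return best;
}

void main()
{
    auto mine = new Image([3, 9, 4]);
    assert(brightest(mine) == 9);
    assert(mine.pixels == [3, 9, 4]); // untouched by the call
}
```

The caller still holds a mutable `Image`; only the callee's view of it is read-only.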

Nov 12 2007

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.52.1194907553.2338.digitalmars-d puremagic.com...
 I like const!

 I like to know that when I pass /my/ data to /your/ function, then
 your function is not going to mess with my data. const in the
 declaration of your function is what gives me that guarantee.

Oh, _come on_.  Chances are if your code is passing data to my function, one 
or more of the following is true:

1) We're on the same team, and we have some sort of convention set up for 
this kind of thing.

2) You're using a library that I wrote, and it's probably well-documented, 
or at least self-obvious, whether or not my function modifies your data. 
(Keep in mind that even if there _is_ const available and my function is not 
well-documented, I don't _have_ to use const, and so you don't have any 
guarantee anyway.)

3) I'm not a complete moron and actually name my functions after what they 
do.  Face it, most people don't write functions with one name and have them 
do something completely different, and if that's happening, constness is not 
really going to help you with that.

My big question is: how often do you pass some reference type into a 
function and hope that it will/won't be modified?  Here are some possible 
reference types:

- Const refs to structs.  Why do these exist?  ref to get the compiler to 
pass the struct efficiently, and const to preserve value semantics. 
Wouldn't this be better served by some other mechanism, such as the 
optimizer?

- Pointers.  Come on, don't use pointers.  ;)

- Class instances.  I'll get back to this.

- Arrays of things other than characters.  OK, I can see a use for declaring 
an array as read-only, although most of the time I'm modifying my arrays 
left and right.

- Strings.  The single most common kind of array, and they even have special 
read-only literals built into the language.  It makes sense to provide 
read-only strings, if for nothing else efficiency.  Most of the time you're 
not going to be modifying the strings after you've done some machinations.

As for class instances, this is where it's a grey area.  So I did a bit of 
research into my own code, and this is what I've come up with, as far as why 
I'm passing class instances into functions:

- As some sort of object to be mutated, i.e. a context or state class (like 
a function that adds some symbol to a given symbol table).

- In order to construct/set up an aggregate object, such as some kind of 
complex IO class which takes an output stream, a layout instance, etc.  In 
virtually all cases these instances are subsequently modified by the methods 
of the owner object.

- As a sort of 'struct on steroids', that is, just a data-holding class 
instance with some methods.  These I guess are candidates for constness, but 
the incidence of these is so small and the functions which use them are so 
short and what they do is so obvious that I have no idea what gains I will 
get from using const.

I'm just absolutely curious as to what on earth, _other than strings_, that 
you'll be passing to other code that you don't want it to be modified.

Nov 12 2007

Bill Baxter <dnewsgroup billbaxter.com> writes:

Jarrett Billingsley wrote:
 "Janice Caron" <caron800 googlemail.com> wrote in message 
 news:mailman.52.1194907553.2338.digitalmars-d puremagic.com...
 I like const!

 I like to know that when I pass /my/ data to /your/ function, then
 your function is not going to mess with my data. const in the
 declaration of your function is what gives me that guarantee.


Also important to point out that "you" in the above often means "myself 
two months ago".

 Oh, _come on_.  Chances are if your code is passing data to my function, one 
 or more of the following is true:
 
 1) We're on the same team, and we have some sort of convention set up for 
 this kind of thing.

My convention in D1 is this:
    foo(/*const*/ ref T a, /*const*/ string b);

It works ok.  Puts the label right there in front of the parameter where 
you can see it when looking at the function signature.  Sure would be 
nice if it actually did some checking, though.

My other convention is to name ref parameters that are meant to be 
output like  "ref Foo something_out".  I also end up making a lot of out 
parameters be pointers instead of references, because then looking at the 
signature I know that it must be an out parameter, because there's no 
other reason to pass it as a pointer.

 2) You're using a library that I wrote, and it's probably well-documented, 
 or at least self-obvious, whether or not my function modifies your data. 

Come on.  Code is poorly documented.  For 90% of open source projects 
out there "better documentation" is in the top 5 on the TODO list.

 (Keep in mind that even if there _is_ const available and my function is not 
 well-documented, I don't _have_ to use const, and so you don't have any 
 guarantee anyway.)

Yes, the library writer might not have used const.  In C++ those are 
usually the ones I glance at quickly and conclude "this guy probably had 
no idea what he was doing ... tread with caution".

 3) I'm not a complete moron and actually name my functions after what they 
 do.  Face it, most people don't write functions with one name and have them 
 do something completely different, and if that's happening, constness is not 
 really going to help you with that.

Bad naming really has nothing to do with it.

 
 My big question is: how often do you pass some reference type into a 
 function and hope that it will/won't be modified?  Here are some possible 
 reference types:
 
 - Const refs to structs.  Why do these exist?  ref to get the compiler to 
 pass the struct efficiently, and const to preserve value semantics. 
 Wouldn't this be better served by some other mechanism, such as the 
 optimizer?

For me, I do this all the time.  Passing 4 different 4-vectors of 
doubles to a function is a good formula for killing your performance.
But I'm with you.  I shouldn't *have* to worry about the performance. 
If I could pass structs around efficiently it would kill 90% of what I 
want const for.  I think the idea of having the compiler quietly 
substitute in pass-by-reference for pass-by-value to improve efficiency 
has some promise.
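
For what it's worth, the hand-written version of that pattern might look like the following in D2 syntax. This is only a sketch: `Vec4` and `dot` are hypothetical names, and nothing here shows that a compiler could make the substitution automatically.

```d
struct Vec4
{
    double[4] v;
}

// `ref` avoids copying 32 bytes per argument; `const` keeps the
// by-value promise that the caller's vectors are not changed.
double dot(ref const Vec4 a, ref const Vec4 b)
{
    double sum = 0;
    foreach (i; 0 .. 4)
        sum += a.v[i] * b.v[i];
    return sum;
}

void main()
{
    auto x = Vec4([1.0, 2.0, 3.0, 4.0]);
    auto y = Vec4([4.0, 3.0, 2.0, 1.0]);
    assert(dot(x, y) == 20);   // 4 + 6 + 6 + 4
}
```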

 I'm just absolutely curious as to what on earth, _other than strings_, that 
 you'll be passing to other code that you don't want it to be modified. 

I do hardly any text processing.  I'm mostly passing vectors and 
matrices and meshes and images around.  And most of those things are 
big enough that you really don't want to make a copy unless you 
absolutely have to.  So whether you plan to modify the data or not 
you're going to pass it around by some sort of reference/pointer/handle.

--bb

Nov 12 2007

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Bill Baxter" <dnewsgroup billbaxter.com> wrote in message 
news:fhav0e$2stp$1 digitalmars.com...

 My other convention is to name ref parameters that are meant to be output 
 like  "ref Foo something_out".  I also end up making a lot of out 
 parameters be pointers instead of references, because then looking at the 
 signature I know that it must be an out parameter, because there's no 
 other reason to pass it as a pointer.


Which would be a good reason to require out at the call site, like C#. ... 
more pointer nonsense.

 Come on.  Code is poorly documented.  For 90% of open source projects out 
 there "better documentation" is in the top 5 on the TODO list.

:D

 3) I'm not a complete moron and actually name my functions after what 
 they do.  Face it, most people don't write functions with one name and 
 have them do something completely different, and if that's happening, 
 constness is not really going to help you with that.

 Bad naming really has nothing to do with it.

It has something to do with it.  If you have a function like so:

void drawImage(Image i, int x, int y) ...

And Image is a class type, do you expect images that you pass in to be 
modified?  No, because that would be stupid.  It _draws an image_.  Making 
this method take a const(Image) instead is kind of beating a dead horse. 
Yes, now the compiler will enforce it but why would anyone in their right 
mind modify the passed-in image anyway?  I've never made the stupid mistake 
of going "Oh!  Why don't I modify the image data in the draw method?!  Makes 
perfect sense to me!"

And if someone really really wanted to modify the image in drawImage, at 
that point they have to have the source code, so there's nothing preventing 
them from changing that const(Image) parameter to an Image.  So I *really* 
don't see what const gets us here.

Nov 12 2007

Paul Findlay <r.lph50+d gmail.com> writes:

 It has something to do with it.  If you have a function like so:
 
 void drawImage(Image i, int x, int y) ...
 
 And Image is a class type, do you expect images that you pass in to be
 modified?  No, because that would be stupid.

Yeah, but it happens with code I use in ruby land. This is a bug I reported:
http://projects.jkraemer.net/acts_as_ferret/ticket/181
(happens easily since ruby strings are like references to classes).

Having optional const may not have prevented the error, but if ruby had
something like a const attribute I imagine I could have simply added it and
tracked down the modification by waiting for a compiler/runtime error
rather than tracing the whole flow of the code manually.

I don't really know, but I wish the D compiler had support for ensuring
things that could be const are, and that functions that can return
null-references get their return value checked by their users. I
don't think it should substitute for programming habits, but a compiler
should be able to do so much more grunt work.

 - Paul

Nov 12 2007

"David B. Held" <dheld codelogicconsulting.com> writes:

Jarrett Billingsley wrote:
 [...]
 1) We're on the same team, and we have some sort of convention
 set up for this kind of thing.

Hahahahaha!!!  I see you've never worked for a company with more than 5 
people.  The idea that you can get even 5 people in a team in a large 
company to agree on a "convention" is rather amusing.  If you work in a 
company with at least 5 *teams*, you have another class of problem 
altogether.

 2) You're using a library that I wrote, and it's probably well-
 documented, or at least self-obvious, whether or not my function
 modifies your data. 

Wow, maybe you write libraries that way, but you should take a look 
around.  95% of software engineers don't.  Most of the code I see, I'd 
be happy if there were *comments*, let alone documentation about 
mutability and constness.

 (Keep in mind that even if there _is_ const available and my function is
 not well-documented, I don't _have_ to use const, and so you don't have
 any guarantee anyway.)

I have the guarantee that if you use const, you can't hack my data, and 
if you don't use const, I don't have to use your code.  At the worst, I 
will copy my data before hand and curse you for the performance hit. 
And I can show you code bases where you will do this for your own 
sanity; and yes, the code will make you cry.

 3) I'm not a complete moron and actually name my functions after what
 they do.  Face it, most people don't write functions with one name and
 have them do something completely different, and if that's happening,
 constness is not really going to help you with that.

So you're in the middle of an app that executes a bunch of business 
logic, and the function is a hundred lines long and is called 
processFoo(Foo foo, Bar bar, Baz baz).  Which of those arguments do you 
expect to get modified and why?  Does this sound like an unreasonable 
name?  Well, unreasonable or not, I see names like this a hundred times 
a day.  I don't have the luxury of going around and renaming them as 
processFooModifiesFooButNotBarOrBaz(Foo foo, Bar bar, Baz baz).

 My big question is: how often do you pass some reference type into a 
 function and hope that it will/won't be modified?

Just about every time I write some code.

 [...]
 - Const refs to structs.  Why do these exist?  ref to get the compiler
 to pass the struct efficiently, and const to preserve value semantics. 
 Wouldn't this be better served by some other mechanism, such as the 
 optimizer?

Sure.  And for the optimizer to tell that it can replace a copy with a 
const ref, what does it need to do?  It needs to prove that there are no 
mutable operations on the object.  I.e., it needs to prove that the 
object is const!  Granted, once you have const, you could have the 
optimizer do these replacements for you, but clearly, the const-proving 
mechanism itself is essential.

 - Pointers.  Come on, don't use pointers.  ;)

Easier said in D than C++.

 - Class instances.  I'll get back to this.

Yeah, since classes only cover maybe 2% of user types in D.

 - Arrays of things other than characters.  OK, I can see a use for
 declaring an array as read-only, although most of the time I'm modifying
 my arrays left and right.

This depends entirely on the nature of the code you're writing.  For 
some kinds of apps, in-place modification makes perfect sense.  For 
others, you want to transform data from one form to another, and the 
input and output types are different, so there is no opportunity for 
in-place modification.  In this case, you may want the input to be 
immutable so you can reuse it in a subsequent call.

 - Strings.  The single most common kind of array, and they even have
 special read-only literals built into the language.  It makes sense to
 provide read-only strings, if for nothing else efficiency.  Most of the
 time you're not going to be modifying the strings after you've done some
 machinations.

But clearly, nobody needs read-only vectors, because scientific 
computing *always* entails modifying your matrices, tensors, etc.  And 
if they do, they can just translate them to read-only strings instead, 
which are faster.

 [...]
 - As a sort of 'struct on steroids', that is, just a data-holding class 
 instance with some methods.  These I guess are candidates for constness,
 but the incidence of these is so small and the functions which use them
 are so short and what they do is so obvious that I have no idea what
 gains I will get from using const.

Have you ever heard of a little thing called "Service-Oriented 
Architecture"?  The idea there is to decompose a large distributed app 
into independent components so that you have loose coupling and 
well-defined interface boundaries, as well as nice opportunities for 
parallelization for scaling.  Well, in SOA, all the data gets passed 
between services as messages.  And in a business with big data objects, 
those messages can get rather large.  Most of the time, there is no 
reason to manipulate the message itself (any more than you need to 
manipulate a packet you received over a network).  When a function needs 
to dig into one of these, there's no reason for it to be non-const, and 
you may want to pass the message to multiple functions for processing.

Any time you read a business object off a database for display to a 
user, you don't want to mutate the object.  This kind of thing happens a 
*lot* in business apps.  Things that happen to be very dynamic and 
stateful will have more mutation (like games, simulations, etc.).  The 
rest are going to benefit a lot from const.

 I'm just absolutely curious as to what on earth, _other than strings_,
 that you'll be passing to other code that you don't want it to be
 modified.

How about "just about everything"?

Dave

Nov 12 2007

Bruce Adams <tortoise_74 yeah.who.co.uk> writes:

David B. Held Wrote:

 Jarrett Billingsley wrote:
 [...]
 1) We're on the same team, and we have some sort of convention
 set up for this kind of thing.

 
 Hahahahaha!!!  I see you've never worked for a company with more than 5 
 people.  The idea that you can get even 5 people in a team in a large 
 company to agree on a "convention" is rather amusing.  If you work in a 
 company with at least 5 *teams*, you have another class of problem 
 altogether.

I share your pain but most organisations that acknowledge they are doing
software (rather than some kind of 'business analysis') now accept the idea
of process improvement in principle at least.
 
 2) You're using a library that I wrote, and it's probably well-
 documented, or at least self-obvious, whether or not my function
 modifies your data. 

 
 Wow, maybe you write libraries that way, but you should take a look 
 around.  95% of software engineers don't.  Most of the code I see, I'd 
 be happy if there were *comments*, let alone documentation about 
 mutability and constness.

Again I share your pain. Leading by example helps. Doxygen is your friend
(DIYF) etc etc. Of course in D we have contracts so you get less out of the
@pre & @post part of doxygen.
 
 (Keep in mind that even if there _is_ const available and my function is
 not well-documented, I don't _have_ to use const, and so you don't have
 any guarantee anyway.)

 
 I have the guarantee that if you use const, you can't hack my data, and 
 if you don't use const, I don't have to use your code.  At the worst, I 
 will copy my data before hand and curse you for the performance hit. 
 And I can show you code bases where you will do this for your own 
 sanity; and yes, the code will make you cry.
 
 3) I'm not a complete moron and actually name my functions after what
 they do.  Face it, most people don't write functions with one name and
 have them do something completely different, and if that's happening,
 constness is not really going to help you with that.

 
 So you're in the middle of an app that executes a bunch of business 
 logic, and the function is a hundred lines long and is called 
 processFoo(Foo foo, Bar bar, Baz baz).  Which of those arguments do you 
 expect to get modified and why?  Does this sound like an unreasonable 
 name?  Well, unreasonable or not, I see names like this a hundred times 
 a day.  I don't have the luxury of going around and renaming them as 
 processFooModifiesFooButNotBarOrBaz(Foo foo, Bar bar, Baz baz).

Don't forget we have in, out and inout for documentation purposes.
I haven't tried it (mainly because it's a sick idea) but presumably you can get
ersatz constness with a contract:

pre {
   signature sigX = md5sum(X);
}
foo(/* const */ X) {
   ...
}
post {
  assert(sigX == md5sum(X));
}
   
This is a bit sick because it's a run-time check. Might be useful for unit tests.
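
A runnable approximation of the idea, as a sketch only. D's in and out contracts are separate scopes and do not share local variables, so this version swaps in `scope (exit)` inside the function body instead of a contract. It assumes Phobos' `std.digest.md`; the function `total` is hypothetical.

```d
import std.digest.md : md5Of;

// Hypothetical function that is only supposed to read `data`.
int total(int[] data)
{
    // Digest of the raw bytes on entry...
    immutable before = md5Of(cast(const(ubyte)[]) data);
    // ...checked again on every way out of the function.
    scope (exit)
        assert(before == md5Of(cast(const(ubyte)[]) data),
               "total() modified its argument");

    int sum = 0;
    foreach (v; data)
        sum += v;
    return sum;
}

void main()
{
    auto values = [1, 2, 3];
    assert(total(values) == 6);
}
```

The check only covers the array's own bytes, not anything its elements point to, and the assert is compiled out in release builds, so this is a debugging aid rather than a guarantee.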
 
 
 But clearly, nobody needs read-only vectors, because scientific 
 computing *always* entails modifying your matrices, tensors, etc.  And 
 if they do, they can just translate them to read-only strings instead, 
 which are faster.

Actually there are cases where it's useful. You might want copy-on-write
semantics (though those kill const through mutability). 
Another case is when you are slicing. I have seen applications in C++ where
successive filters are applied to a vector-like container. In fact each filter
creates a new slice of the data. (The structure is actually a tree so D array
slices wouldn't work). That would make a good use case for iterators and
opStar/opDeref/opSlice.
 
Regards,

Bruce.

Nov 13 2007

"Janice Caron" <caron800 googlemail.com> writes:

In C++ (and, I believe, D), it is possible for a statement like

    a = b;

to modify b. That's because the argument to operator=() or opAssign()
could be a non-const reference. You could argue that this would be a
silly thing to do, but auto_ptr<T> does that by design.

Likewise, it is possible in D for

    a[n] = b;

to modify b (and even n). You could argue that a function /shouldn't/
do stuff like that - but what if it does it by accident? As in,
because of a bug? const is your only guarantee that that won't happen.
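
A small sketch of the first case in D2 syntax, using a made-up `Handle` struct whose `opAssign` takes its right-hand side by mutable reference, in the spirit of auto_ptr:

```d
struct Handle
{
    int value;

    // Takes the right-hand side by mutable reference,
    // so it is free to change it.
    void opAssign(ref Handle rhs)
    {
        value = rhs.value;
        rhs.value = 0; // "steals" the value from the source
    }
}

void main()
{
    auto a = Handle(1);
    auto b = Handle(42);
    a = b;                 // looks harmless, but calls opAssign(ref Handle)
    assert(a.value == 42);
    assert(b.value == 0);  // b was modified by the assignment
}
```

Had the parameter been declared `ref const Handle rhs`, the line that writes to `rhs.value` would be rejected by the compiler.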

Nov 12 2007

Robert Fraser <fraserofthenight gmail.com> writes:

Janice Caron Wrote:

 I like to know that when I pass /my/ data to /your/ function, then
 your function is not going to mess with my data. const in the
 declaration of your function is what gives me that guarantee.

That's a rather paranoid approach. I never worry about that in Java, or D1, or
any other language without const. That information belongs in the
documentation, rather than waste the programmer's valuable time dealing with
obscure const bugs and typing it all over the place.

Nov 12 2007

Robert Fraser <fraserofthenight gmail.com> writes:

Jarrett Billingsley Wrote:

 What my thoughts boiled down to is that constness seems useful for 
 strings and not much else.  I suppose it could be useful for other kinds 
 of arrays, but overwhelmingly the use cases for const seem to be for 
 strings, and constness helps to make some operations with strings more 
 efficient.  Other than that, constness for other types seems more like a 
 logical convenience.  So you want to return a const(Foo) where Foo is a 
 class reference?  How often do you need to do that?  Passing const refs 
 -- have YOU ever had a bug where you tried to modify a const ref param 
 that was caught by the const?

Coming from a Java background, I agree with you quite a bit. However, I have
run into one case where having const in Java would help a lot (although I think
if I can get my debugger to set a modification watchpoint, it'll be a
non-issue). Somewhere in the Descent semantic analysis, some static members
that are supposed not to change are being somehow modified. Since the codebase
is large, figuring out where exactly that's happening is proving problematic.

That said, I agree -- having a const string would be good enough for software
engineering purposes for me. But I think Walter wants to be able to optimize
based on invariantness.

Nov 12 2007

Walter Bright <newshound1 digitalmars.com> writes:

Robert Fraser wrote:
 That said, I agree -- having a const string would be good enough for
 software engineering purposes for me. But I think Walter wants to be
 able to optimize based on invariantness.

I also want to support functional programming, and without invariants, 
that's dead in the water.

Nov 12 2007
