digitalmars.D - improvement request - enabling by-value-containers
- Simon Buerger (40/40) Dec 08 2010 For every lib it's a design decision if containers should be value- or
- Jonathan M Davis (39/80) Dec 08 2010 It's extremely rare in my experience that it makes any sense to copy a c...
- Simon Buerger (23/102) Dec 09 2010 From a pragmatic viewpoint you are right, copying containers is rare.
- Jesse Phillips (7/26) Dec 09 2010 Why? You put row in there and said there was 5 of them.
- Simon Buerger (11/35) Dec 09 2010 No, that line would duplicate row once, and store that same copy in
- Jonathan Schmidt-Dominé (58/58) Dec 14 2010 Hi!
- Andrei Alexandrescu (24/81) Dec 14 2010 I think this argument goes exactly the other way. C++ containers have
- bearophile (5/9) Dec 14 2010 Yes, this is a common thing (it happened to me too, with Python and othe...
- spir (18/20) Dec 15 2010 languages). You need to be careful and think three times before designi...
- Jonathan Schmidt-Dominé (26/46) Dec 14 2010 Probably, and with param_type bla it is not shorter, but at least it _is...
- Kagamin (2/5) Dec 14 2010 So you want your containers by reference everywhere?
- Jonathan Schmidt-Dominé (5/5) Dec 15 2010 Sorry, ref const scope for such containers and PODs etc., const scope fo...
- Jonathan Schmidt-Dominé (9/14) Dec 15 2010 Maybe you want to say that this is waste because containers often have o...
- Jonathan Schmidt-Dominé (6/11) Dec 14 2010 In my opinion by-value-types are good for mathematical objects and data ...
- spir (12/24) Dec 15 2010
- Michel Fortin (23/52) Dec 14 2010 That would depend on what you're doing. Sure, if the next thing you do
- Michel Fortin (10/16) Dec 14 2010 I have to echo a similar concern with by-reference containers from my
- bearophile (5/8) Dec 14 2010 A partial (but maybe better) solution to this problem is to introduce "l...
- KennyTM~ (2/10) Dec 15 2010 std.typecons.Unqiue ?
- KennyTM~ (2/17) Dec 15 2010 (BTW, I meant 'Unique' :) )
- Jonathan Schmidt-Dominé (2/6) Dec 15 2010 Well, then you would have a lot of null-ptrs when using the by-reference...
- Bruno Medeiros (29/64) Dec 21 2010 I would go further than that actually, it seems to me that the idea of
- bearophile (4/7) Dec 21 2010 I agree that in general collections are better managed by reference. But...
- Simon Buerger (48/77) Dec 22 2010 Identity is wrong, because if I pass th set {1,2,3} to a function, I
- Bruno Medeiros (32/91) Jan 27 2011 I don't understand this, it doesn't seem to make sense. You say you
- Kagamin (2/8) Dec 14 2010 Hmm... never needed to clone a container. Is there a use case for by-val...
- Jonathan Schmidt-Dominé (3/5) Dec 15 2010 I have implemented the Quine McCluskey algorithm in Ruby, it was really
- Kagamin (2/7) Dec 15 2010 What I understand from description, the algorithm consists of several st...
- Jonathan Schmidt-Dominé (4/6) Dec 15 2010 Can't remember, I had to move and copy some sets around and test which s...
For every lib it's a design decision if containers should be value- or reference-types. In C++ STL they are value-types (i.e. the copy-constructor does a real copy), while in Tango and Phobos the decision was to go for reference-types afaik, but I would like to be able to write value-types too, which isn't possible (in a really good way) currently. The following points would need some love (by-value containers are probably not the only area where these could be useful):

(1) Allow default-constructors for structs

I don't see a reason why "this(int foo)" is allowed, but "this()" is not. There might be some useful non-trivial init to do for complex structs.

(2) const parameters by reference

If a parameter to a function is read-only, the right notation depends on the type of that parameter, i.e. "in" for simple stuff like ints, and "ref const" for big structures. Using "in" for big data implies a whole copy, even though it's constant, and using "ref const" for simple types is a useless indirection. This is a problem for generic code, when the type is templated, because there is no way to switch between "in" and "ref const" with compile-time reflection.

Solution one: make "ref" a real type-constructor, so you could do the following (this is possible in C++):

    static if(is(T == struct))
        alias ref const T const_type;
    else
        alias const scope T const_type; // "const scope" is (currently) equivalent to "in"

    void foo(const_type x)

Solution two: let "in" decide whether to pass by reference or by value, depending on the type. Probably the better solution, because the programmer doesn't need to care about the decision himself anymore.

(3) make foreach parameters constant

When you do "foreach(x;a)" the x value gets copied in each iteration. Once again, that matters for big types, especially when you have a copy-constructor. The current work-around is prepending "ref": nothing gets copied, but the compiler won't know it is meant to be read-only. Solution: either allow "ref const" or "in" in foreach. Or you could even make x default to constant if not stated as "ref" explicitly. The last alternative seems logical to me, but it may break existing code.

Comments welcome,
Krox
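As an illustration of point (3), here is a minimal sketch of a read-only, copy-free foreach. It assumes a compiler that accepts "ref const" on foreach variables (which the post says was not reliably the case at the time). The struct `Big` and its copy counter are hypothetical names, and a postblit stands in for the "copy-constructor" mentioned above.

```d
import std.stdio;

// Hypothetical large value type; the postblit counts how often it is copied.
struct Big
{
    int[256] payload;
    static int copies;
    this(this) { ++copies; }
}

void main()
{
    auto items = new Big[3];

    Big.copies = 0;
    foreach (x; items) {}              // x is a copy of each element
    writeln("copies with plain foreach: ", Big.copies);

    Big.copies = 0;
    foreach (ref const x; items)       // no copy, and x is read-only
    {
        // Writing through x is rejected at compile time.
        static assert(!__traits(compiles, x.payload[0] = 1));
    }
    assert(Big.copies == 0);
}
```

The plain loop only prints its count, since the exact number of copies is up to the compiler; the "ref const" loop is guaranteed not to invoke the postblit.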
Dec 08 2010
On Wednesday, December 08, 2010 14:14:57 Simon Buerger wrote:
> For every lib it's a design decision if containers should be value- or reference-types. In C++ STL they are value-types (i.e. the copy-constructor does a real copy), while in Tango and Phobos the decision was to go for reference-types afaik, but I would like to be able to write value-types too, which isn't possible (in a really good way) currently. The following points would need some love (by-value containers are probably not the only area where these could be useful)

It's extremely rare in my experience that it makes any sense to copy a container on a regular basis. Having an easy means of creating a deep copy of a container, or of copying the elements from one container to another efficiently, would be good, but having containers be value types is almost always a bad idea. Copying containers just isn't a typical need - certainly not enough to have them be copied just because you passed them to a function or returned them from one. I think that reference types for containers is very much the correct decision. There should be good ways to copy containers, but copying shouldn't be the default for much of anything in the way of containers.

> (1) Allow default-constructors for structs
> I don't see a reason why "this(int foo)" is allowed, but "this()" is not. There might be some useful non-trivial init to do for complex structs.

It has to do with the init property. It has to be known at compile time for all types. For classes, that's easy because it's null, but for structs, it's what all of their member variables are directly initialized to. If you add a default constructor, then init would have to be whatever that constructor produced, which would shift it from compile time to runtime. It should be possible to have default constructors which are limited in a number of ways (like having to be nothrow and possibly pure), but that hasn't been sorted out, and even if it is, plenty of cases where people want default constructors still wouldn't likely work. It just doesn't work to have default constructors which can run completely arbitrary code. You could get exceptions thrown in weird places and a variety of other problems which we can't have in situations where init is used. Hopefully, we'll get limited default constructors at some point, but it hasn't happened yet (and probably won't without a good proposal that deals with all of the potential issues), and regardless, it will never be as flexible as what C++ does. It's primarily a side effect of insisting that all variables be default initialized if they're not directly initialized.

> (2) const parameters by reference
> If a parameter to a function is read-only, the right notation depends on the type of that parameter, i.e. "in" for simple stuff like ints, and "ref const" for big structures. Using "in" for big data implies a whole copy, even though it's constant, and using "ref const" for simple types is a useless indirection. This is a problem for generic code, when the type is templated, because there is no way to switch between "in" and "ref const" with compile-time reflection.
> Solution one: make "ref" a real type-constructor, so you could do the following (this is possible in C++):
>     static if(is(T == struct))
>         alias ref const T const_type;
>     else
>         alias const scope T const_type; // "const scope" is (currently) equivalent to "in"
>     void foo(const_type x)
> Solution two: let "in" decide whether to pass by reference or by value, depending on the type. Probably the better solution, because the programmer doesn't need to care about the decision himself anymore.

I think that auto ref is supposed to deal with some of this, but it's buggy at the moment, and I'm not sure exactly what it's supposed to do. There was some discussion on this one in a recent thread.

> (3) make foreach parameters constant
> When you do "foreach(x;a)" the x value gets copied in each iteration. Once again, that matters for big types, especially when you have a copy-constructor. The current work-around is prepending "ref": nothing gets copied, but the compiler won't know it is meant to be read-only. Solution: either allow "ref const" or "in" in foreach. Or you could even make x default to constant if not stated as "ref" explicitly. The last alternative seems logical to me, but it may break existing code.

I'd hate to see foreach variables be const by default. That would be overly limiting and would definitely break a lot of code. Making ref const work properly would be good (I think that it works in at least some cases) for structs that you don't want to be copied, but wouldn't be all that useful otherwise. Nothing in D is const by default, and I think that making anything const by default would clash with the rest of the language. Particularly since then how would you make it mutable? No, it should be possible to have const refs to structs for foreach variables, but it shouldn't be the default. The language as a whole just does not support that.

- Jonathan M Davis
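To make the point about init concrete, here is a small sketch of how struct default initialization behaves. The struct name `Matrix` and its fields are hypothetical and chosen only for illustration.

```d
// Hypothetical struct; the field initializers define Matrix.init.
struct Matrix
{
    int rows = 1;
    int cols = 1;

    this(int n) { rows = n; cols = n; }   // allowed: takes an argument
    // this() { rows = 3; }               // a no-argument struct constructor is rejected
}

// .init is a compile-time constant, so it can be checked statically.
static assert(Matrix.init.rows == 1 && Matrix.init.cols == 1);

void main()
{
    Matrix m;                  // default-initialized to Matrix.init, no user code runs
    assert(m == Matrix.init);

    auto sq = Matrix(3);       // explicit constructor call
    assert(sq.rows == 3 && sq.cols == 3);
}
```

Because `Matrix.init` has to be usable in a `static assert`, a constructor running arbitrary code at default-initialization time would not fit this model, which is the constraint described above.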
Dec 08 2010
On 08.12.2010 23:45, Jonathan M Davis wrote:
> On Wednesday, December 08, 2010 14:14:57 Simon Buerger wrote:
>> For every lib it's a design decision if containers should be value- or reference-types. In C++ STL they are value-types (i.e. the copy-constructor does a real copy), while in Tango and Phobos the decision was to go for reference-types afaik, but I would like to be able to write value-types too, which isn't possible (in a really good way) currently. The following points would need some love (by-value containers are probably not the only area where these could be useful)
>
> It's extremely rare in my experience that it makes any sense to copy a container on a regular basis. Having an easy means of creating a deep copy of a container, or of copying the elements from one container to another efficiently, would be good, but having containers be value types is almost always a bad idea. Copying containers just isn't a typical need - certainly not enough to have them be copied just because you passed them to a function or returned them from one. I think that reference types for containers is very much the correct decision. There should be good ways to copy containers, but copying shouldn't be the default for much of anything in the way of containers.

From a pragmatic viewpoint you are right, copying containers is rare. But on the other hand, classes imply a kind of identity, so that a set is a different object than another object with the very same elements. That feels wrong from an aesthetic or mathematical viewpoint. Furthermore, if you have for example a vector of vectors,

    Vector!int row = [1,2,3];
    auto vec = Vector!(Vector!int)(5, row);

then vec should be 5 rows, and not 5 times the same row.

>> (1) Allow default-constructors for structs
>> I don't see a reason why "this(int foo)" is allowed, but "this()" is not. There might be some useful non-trivial init to do for complex structs.
>
> It has to do with the init property. It has to be known at compile time for all types. For classes, that's easy because it's null, but for structs, it's what all of their member variables are directly initialized to. If you add a default constructor, then init would have to be whatever that constructor produced, which would shift it from compile time to runtime. It should be possible to have default constructors which are limited in a number of ways (like having to be nothrow and possibly pure), but that hasn't been sorted out, and even if it is, plenty of cases where people want default constructors still wouldn't likely work. It just doesn't work to have default constructors which can run completely arbitrary code. You could get exceptions thrown in weird places and a variety of other problems which we can't have in situations where init is used. Hopefully, we'll get limited default constructors at some point, but it hasn't happened yet (and probably won't without a good proposal that deals with all of the potential issues), and regardless, it will never be as flexible as what C++ does. It's primarily a side effect of insisting that all variables be default initialized if they're not directly initialized.

I partially see your point, the constructor would be called in places the programmer didn't expect, but actually, what's the problem with an exception? They can always happen anyway (at least outOfMemory).

>> (2) const parameters by reference
>> If a parameter to a function is read-only, the right notation depends on the type of that parameter, i.e. "in" for simple stuff like ints, and "ref const" for big structures. Using "in" for big data implies a whole copy, even though it's constant, and using "ref const" for simple types is a useless indirection. This is a problem for generic code, when the type is templated, because there is no way to switch between "in" and "ref const" with compile-time reflection.
>> Solution one: make "ref" a real type-constructor, so you could do the following (this is possible in C++):
>>     static if(is(T == struct))
>>         alias ref const T const_type;
>>     else
>>         alias const scope T const_type; // "const scope" is (currently) equivalent to "in"
>>     void foo(const_type x)
>> Solution two: let "in" decide whether to pass by reference or by value, depending on the type. Probably the better solution, because the programmer doesn't need to care about the decision himself anymore.
>
> I think that auto ref is supposed to deal with some of this, but it's buggy at the moment, and I'm not sure exactly what it's supposed to do. There was some discussion on this one in a recent thread.

Letting "in" decide would be cleaner IMO, but anyway good to hear that the problem is recognized. Will look for the other thread.

>> (3) make foreach parameters constant
>> When you do "foreach(x;a)" the x value gets copied in each iteration. Once again, that matters for big types, especially when you have a copy-constructor. The current work-around is prepending "ref": nothing gets copied, but the compiler won't know it is meant to be read-only. Solution: either allow "ref const" or "in" in foreach. Or you could even make x default to constant if not stated as "ref" explicitly. The last alternative seems logical to me, but it may break existing code.
>
> I'd hate to see foreach variables be const by default. That would be overly limiting and would definitely break a lot of code. Making ref const work properly would be good (I think that it works in at least some cases) for structs that you don't want to be copied, but wouldn't be all that useful otherwise. Nothing in D is const by default, and I think that making anything const by default would clash with the rest of the language. Particularly since then how would you make it mutable? No, it should be possible to have const refs to structs for foreach variables, but it shouldn't be the default. The language as a whole just does not support that.

You are right that default-const would be contrary to the rest of the language, but when I think longer about this... the same default-const should apply to all function parameters. They should be input, output or inout. But the "mutable copy of the original", which is common in C/C++/D/everything-alike, is actually pretty weird (modifying non-output parameters inside a function is considered bad style even in C++ and Java). But well, that would be really a step too big for D2... maybe I'll suggest it for D3 some day *g*

Krox
Dec 09 2010
Simon Buerger Wrote:
> Vector!int row = [1,2,3];
> auto vec = Vector!(Vector!int)(5, row);
> then vec should be 5 rows, and not 5 times the same row.

Why? You put row in there and said there were 5 of them.

    vec[] = row.dup;

I believe that would be the correct syntax if you wanted to store 5 different vectors of the same content (works for arrays).

> I partially see your point, the constructor would be called in places the programmer didn't expect, but actually, what's the problem with an exception? They can always happen anyway (at least outOfMemory).

I think there is even more to it. init is used during compile time so properties of the class/struct can be checked. I don't think exceptions are supported for CTFE.

> Letting "in" decide would be cleaner IMO, but anyway good to hear that the problem is recognized. Will look for the other thread.

I'm not sure if the spec says "in" must be passed by reference, only that that is how it is done. I'd think it'd be up to the compiler.

> You are right that default-const would be contrary to the rest of the language, but when I think longer about this... the same default-const should apply to all function parameters. They should be input, output or inout. But the "mutable copy of the original", which is common in C/C++/D/everything-alike, is actually pretty weird (modifying non-output parameters inside a function is considered bad style even in C++ and Java). But well, that would be really a step too big for D2... maybe I'll suggest it for D3 some day *g*
> Krox

I believe Bearophile has beaten you to that. I think it is even in Bugzilla. I think it would only make sense to add it to D3 if it becomes common to mark function parameters as "in". But I agree it is easier to think "I want to modify this" than it is to say "I'm not modifying this, so it should be in". Though currently I don't think there is a way to mark the current default behavior.
Dec 09 2010
On 09.12.2010 23:39, Jesse Phillips wrote:
> Simon Buerger Wrote:
>> Vector!int row = [1,2,3];
>> auto vec = Vector!(Vector!int)(5, row);
>> then vec should be 5 rows, and not 5 times the same row.
>
> Why? You put row in there and said there were 5 of them.
>     vec[] = row.dup;
> I believe that would be the correct syntax if you wanted to store 5 different vectors of the same content (works for arrays).

No, that line would duplicate row once, and store that same copy in every element of vec.

>> I partially see your point, the constructor would be called in places the programmer didn't expect, but actually, what's the problem with an exception? They can always happen anyway (at least outOfMemory).
>
> I think there is even more to it. init is used during compile time so properties of the class/struct can be checked. I don't think exceptions are supported for CTFE.

>> Letting "in" decide would be cleaner IMO, but anyway good to hear that the problem is recognized. Will look for the other thread.
>
> I'm not sure if the spec says "in" must be passed by reference, only that that is how it is done. I'd think it'd be up to the compiler.

Other way around: "in" is currently passed by value, though the spec does not explicitly disallow passing by reference, so it might be implemented without even changing the spec.

>> You are right that default-const would be contrary to the rest of the language, but when I think longer about this... the same default-const should apply to all function parameters. They should be input, output or inout. But the "mutable copy of the original", which is common in C/C++/D/everything-alike, is actually pretty weird (modifying non-output parameters inside a function is considered bad style even in C++ and Java). But well, that would be really a step too big for D2... maybe I'll suggest it for D3 some day *g*
>
> I believe Bearophile has beaten you to that. I think it is even in Bugzilla. I think it would only make sense to add it to D3 if it becomes common to mark function parameters as "in". But I agree it is easier to think "I want to modify this" than it is to say "I'm not modifying this, so it should be in". Though currently I don't think there is a way to mark the current default behavior.

Well, it would be good style to add in/out to each and every parameter there is, but I don't do it myself either (except in container implementations, where good style and the last bit of optimization seem important).

Krox
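The difference between "5 rows" and "5 times the same row" can be sketched with built-in D arrays, which have reference semantics for their contents. The helper names `fillAliased` and `fillIndependent` are hypothetical; the first uses the slice assignment from the exchange above, the second duplicates once per slot.

```d
// One duplicate, stored in every slot: all rows share the same memory.
int[][] fillAliased(int[] row, size_t n)
{
    auto result = new int[][](n);
    result[] = row.dup;          // row.dup is evaluated once
    return result;
}

// One duplicate per slot: every row owns its own memory.
int[][] fillIndependent(int[] row, size_t n)
{
    auto result = new int[][](n);
    foreach (ref r; result)
        r = row.dup;
    return result;
}

void main()
{
    int[] row = [1, 2, 3];

    auto a = fillAliased(row, 5);
    a[0][0] = 99;
    assert(a[4][0] == 99);       // the change shows up in every row
    assert(row[0] == 1);         // the original is untouched

    auto b = fillIndependent(row, 5);
    b[0][0] = 99;
    assert(b[4][0] == 1);        // the other rows are unaffected
}
```

A by-value container would give the second behavior without the explicit loop, which is the case being argued for here.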
Dec 09 2010
Hi!

Just about my experiences: When trying to hack some algorithms quickly in Ruby I made a lot of mistakes because I had to care about a .clone everywhere and because Array.new(5, []) does not work as expected (sorry, but Array.new(5) { return [] } is not nice). So in fact C++ made my life easier than the new, stylish, simple Ruby-programming-language, because of the great by-value-containers in the STL.

However, some reasons for by-value-containers:

* First of all you often have to deal with multi-dimensional data-structures: you map something to a map of lists of whatever and you want to manage such data in a simple and generic way. Simplifications of or extensions to the data-structure should not force you to refactor all more or less generic code-fragments. For example, copying some entries around should not look different just because you added a dimension to your data-structure. But without proper value-semantics you are forced to do that, because at some point you will have to switch from by-value to by-reference because of limitations made somewhere.

* Another argument: It should be very simple (at least in C++ it is, I have never had problems with it, I just added the & here and there) to handle references to by-value-types, but wrapping by-reference-types into by-value-types is really ugly, although it may be the right thing somewhere.

* By-value-containers support more generic code. A copy_if on a multi-dimensional container should of course copy the element and not just copy some references, and it would be bad if the generic implementation had to test whether the type is by-reference but a container supporting clone, then use clone, worry about whether it is a deep clone, etc. I just do not see a simple way to make such generic algorithms easy to implement (even with new language features) if by-value-types are not fully supported and not used for containers.

* Whether or not you think by-value-containers are good, a better alternative for "in" would be great for generic code. In C++ I can use something like parameter_type<T>::result, choosing by-reference or by-value automatically. It is not very nice, but it is simply impossible with D: there are no reference-types, there are no ways to implement such decisions in the parameter-lists, and in normal code it is even less possible. But imagine there were a simple symbol you could put in front of a variable or parameter declaration, choosing by-ref/by-value automatically, let's say §, and of course it should be const and scope. You would put a § before your parameters, because you may not know if the type is big or not (there are sometimes big PODs, somebody may want to pass a FILE-object to a generic function or whatever). You would use a § in a read-only foreach, and you would not have to bother about anything. When extracting a container-element temporarily (pivot in quicksort or whatever, you may want to sort your PODs, by-value-containers, primitives, pointers, class-objects etc.), you would use a § and it would be perfect. This would allow the implementation of by-value-containers (of course default-constructors are also required), but it would also allow writing more generic high-quality code everywhere. In D3 this could even be the default for function-parameters, the C-compatible default-behaviour is simply nonsense.

Regards
The User
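One way to approximate the by-type choice in D, without a new symbol such as §, is a pair of constrained template overloads. This is only a sketch under the assumption that "struct means pass by reference, everything else by value" is the desired rule; the names `describe` and `Pod` are hypothetical, and the strings are returned only to show which overload was picked.

```d
// Chosen for structs: read-only and passed by reference.
string describe(T)(ref const T x) if (is(T == struct))
{
    return "by ref";
}

// Chosen for everything else: read-only and passed by value.
string describe(T)(const T x) if (!is(T == struct))
{
    return "by value";
}

struct Pod { int[32] data; }

void main()
{
    Pod p;
    int i = 42;
    assert(describe(p) == "by ref");
    assert(describe(i) == "by value");
    assert(describe(3.14) == "by value");
}
```

The cost is that every generic function has to be written twice (or forward to a shared implementation), and the struct overload does not accept rvalues such as `Pod()`, so this does not fully replace what the post asks for.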
Dec 14 2010
On 12/14/10 3:29 PM, Jonathan Schmidt-Dominé wrote:
> Hi!
> Just about my experiences: When trying to hack some algorithms quickly in Ruby I made a lot of mistakes because I had to care about a .clone everywhere and because Array.new(5, []) does not work as expected (sorry, but Array.new(5) { return [] } is not nice). So in fact C++ made my life easier than the new, stylish, simple Ruby-programming-language, because of the great by-value-containers in the STL.

Thanks for sharing.

> However, some reasons for by-value-containers:
> * First of all you often have to deal with multi-dimensional data-structures: you map something to a map of lists of whatever and you want to manage such data in a simple and generic way. Simplifications of or extensions to the data-structure should not force you to refactor all more or less generic code-fragments. For example, copying some entries around should not look different just because you added a dimension to your data-structure. But without proper value-semantics you are forced to do that, because at some point you will have to switch from by-value to by-reference because of limitations made somewhere.

I think this argument goes exactly the other way. C++ containers have terrible compositional behavior. Using vector<vector<T> > or vector<map<T> > in C++98 is suicide. C++0x fixes that by means of introducing rvalue references, but reference semantics obviate all that.

> * Another argument: It should be very simple (at least in C++ it is, I have never had problems with it, I just added the & here and there) to handle references to by-value-types, but wrapping by-reference-types into by-value-types is really ugly, although it may be the right thing somewhere.

"here and there" is more like "every time I define a function". I mean that's a lot, no? Wrapping could work either way, and after thinking about it a lot I have difficulty decreeing one is considerably easier/simpler than the other.

> * By-value-containers support more generic code. A copy_if on a multi-dimensional container should of course copy the element and not just copy some references, and it would be bad if the generic implementation had to test whether the type is by-reference but a container supporting clone, then use clone, worry about whether it is a deep clone, etc. I just do not see a simple way to make such generic algorithms easy to implement (even with new language features) if by-value-types are not fully supported and not used for containers.

copy_if on a multidimensional container should not naively copy entire hyperplanes. More generally, I think that whenever an arbitrarily large object is to be copied, that should be explicit instead of implicit. A lot of focus in C++ is dedicated to making sure you don't copy the wrong thing.

> * Whether or not you think by-value-containers are good, a better alternative for "in" would be great for generic code. In C++ I can use something like parameter_type<T>::result, choosing by-reference or by-value automatically. It is not very nice, but it is simply impossible with D: there are no reference-types, there are no ways to implement such decisions in the parameter-lists, and in normal code it is even less possible. But imagine there were a simple symbol you could put in front of a variable or parameter declaration, choosing by-ref/by-value automatically, let's say §, and of course it should be const and scope. You would put a § before your parameters, because you may not know if the type is big or not (there are sometimes big PODs, somebody may want to pass a FILE-object to a generic function or whatever). You would use a § in a read-only foreach, and you would not have to bother about anything. When extracting a container-element temporarily (pivot in quicksort or whatever, you may want to sort your PODs, by-value-containers, primitives, pointers, class-objects etc.), you would use a § and it would be perfect. This would allow the implementation of by-value-containers (of course default-constructors are also required), but it would also allow writing more generic high-quality code everywhere. In D3 this could even be the default for function-parameters, the C-compatible default-behaviour is simply nonsense.

It would be great to have D just obviate the necessity of parameter_type<T>::result in the first place, which is what auto ref is meant for (it's §).

> The User

s/The/A/ I guess ;o).

One issue that I noticed about myself and other people coming to D from C++ is that we expect to bring with us, along with the many things that make C++ great, the baggage of common worries, misgivings, and just rote work that we got used to.

Andrei
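For reference, a minimal sketch of how auto ref behaves in current D. Note the assumption being illustrated: auto ref picks by-reference or by-value from whether the argument is an lvalue or an rvalue, not from the size or kind of the type, so it is related to the § idea but not identical to it. The names `passedByRef` and `Big` are hypothetical.

```d
struct Big { int[64] data; }

// auto ref is only available on template functions. Each instantiation
// takes x by reference for lvalue arguments and by value for rvalues.
bool passedByRef(T)(auto ref const T x)
{
    return __traits(isRef, x);
}

void main()
{
    Big b;
    int i = 1;

    assert(passedByRef(b));        // lvalue struct: by reference
    assert(!passedByRef(Big()));   // rvalue struct: by value
    assert(passedByRef(i));        // lvalue int: also by reference
    assert(!passedByRef(42));      // rvalue int: by value
}
```

The third assertion shows the difference from the proposal in the thread: a small primitive is still passed by reference when it happens to be an lvalue.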
Dec 14 2010
Andrei:
> One issue that I noticed about myself and other people coming to D from C++ is that we expect to bring with us, along with the many things that make C++ great, the baggage of common worries, misgivings, and just rote work that we got used to.

Yes, this is a common thing (it happened to me too, with Python and other languages). You need to be careful and think three times before designing things. Knowing several languages helps a bit against that.

I think reference semantics (but with final methods) for containers has the upper hand so far in this discussion. Surely other people will write other kinds of D containers (like the C++ containers used by Electronic Arts), but I think the std lib needs to be not too hard to use. (C++ is sometimes too hard to use for me. In D I was looking for a language that is a bit simpler to use.)

Bye,
bearophile
Dec 14 2010
On Tue, 14 Dec 2010 18:24:09 -0500 bearophile <bearophileHUGS lycos.com> wrote:
> Yes, this is a common thing (it happened to me too, with Python and other languages). You need to be careful and think three times before designing things. Knowing several languages helps a bit against that.

s/bit/lot/ ;-) ?

> I think reference semantics (but with final methods) for containers has the upper hand so far in this discussion. Surely other people will write other kinds of D containers (like the C++ containers used by Electronic Arts), but I think the std lib needs to be not too hard to use. (C++ is sometimes too hard to use for me. In D I was looking for a language that is a bit simpler to use.)

Would these ones be useful? https://bitbucket.org/denispir/denispir-d/src/a5975e94f15c/collections.d
(quickly written for personal use -- definitions as struct/class can be changed)

Denis
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
Dec 15 2010
Hi!

>> * Another argument: It should be very simple (at least in C++ it is, I have never had problems with it, I just added the & here and there) to handle references to by-value-types, but wrapping by-reference-types into by-value-types is really ugly, although it may be the right thing somewhere.
>
> "here and there" is more like "every time I define a function". I mean that's a lot, no?

Probably, and with param_type bla it is not shorter, but at least it _is_ possible to write generic sort, copy_if etc. without unnecessary copies or references.

> Wrapping could work either way, and after thinking about it a lot I have difficulty decreeing one is considerably easier/simpler than the other.

Wrapping a by-reference-type in a by-value-type induces a lot of overhead, wrapping by-value-types in by-reference-types does not. And structs without default-constructors are not very nice, because you would have to check first in any operation if the wrapped by-reference-container is null.

> copy_if on a multidimensional container should not naively copy entire hyperplanes. More generally, I think that whenever an arbitrarily large object is to be copied, that should be explicit instead of implicit. A lot of focus in C++ is dedicated to making sure you don't copy the wrong thing.

Why should it be explicit? The result of copy_if should be a copy and not a view on the original data, and even such a view should not rely on the input-types being by-reference (by-value types should also be accessible by reference in such a view, unless it is const, but that is offtopic). When functions like copy_if, map or fold are to be combined in a generic way, by-value-containers are much more intuitive and they work without guessing where to add a clone. And Array x(5, [0,1,2]) should result in a 5·3 array.

> It would be great to have D just obviate the necessity of parameter_type<T>::result in the first place, which is what auto ref is meant for (it's §).

For me something like ref const scope (or ref scope for primitives) should be used as often as possible, it should even be the default in many cases (function parameters, foreach-variables etc.), although I understand that this is unlikely because of compatibility. Then by-value containers would simply make the world consistent, and the places where you would have to write ref or & would be less obtrusive and more generic than all the clone-stuff you would have to write with by-reference-containers.

> s/The/A/ I guess ;o).

“User” has no specific meaning, it would even be my name if I did not use anything in this world. :D

The User
Dec 14 2010
Jonathan Schmidt-Dominé Wrote:
> For me something like ref const scope (or ref scope for primitives) should be used as often as possible, it should even be the default in many cases (function parameters, foreach-variables etc.)
So you want your containers by reference everywhere?
Dec 14 2010
Sorry, ref const scope for such containers and PODs etc., const scope for primitives and references. Of course, that is exactly what you want to have for function parameters and for temporary variables that briefly hold a read-only container element, like foreach-variables etc. That would not influence the value-semantics.
Dec 15 2010
> Sorry, ref const scope for such containers and PODs etc., const scope for primitives and references. Of course, that is exactly what you want to have for function parameters and for temporary variables that briefly hold a read-only container element, like foreach-variables etc. That would not influence the value-semantics.
Maybe you want to say that this is waste because containers often have only one or two elements, pointer and length, “d”-pointer, etc. Well, I think there should be a possibility to tell D that passing a container as const ref does not require the extra level of indirection. The same could be possible for non-const references and for swapping. swap(x, y) would simply check how to swap x and y in a nice way (I think swap(x, y) is always better than tmp = x; x = y; y = tmp;). That would make it optimal against extra indirection and unnecessary copying. However, even with normal references it would not require more indirection than by-reference types.
Dec 15 2010
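For illustration only (not code from the thread): a minimal D sketch of the swap(x, y) versus "tmp = x; x = y; y = tmp;" comparison made in the post above. ValueList and swapByHand are hypothetical names invented for this sketch; swap is the Phobos function from std.algorithm. The sketch only checks that both ways give the same result, it does not measure how many copies each one makes.

```d
import std.algorithm : swap;

/// Hypothetical by-value container; the name is made up for this sketch.
struct ValueList
{
    int[] items;

    // Postblit: after a copy, duplicate the array so copies never share storage.
    this(this) { items = items.dup; }
}

/// Exchanges two containers by hand; each of the three steps copies a container.
void swapByHand(ref ValueList x, ref ValueList y)
{
    auto tmp = x; // copy (postblit runs)
    x = y;        // copy
    y = tmp;      // copy
}

unittest
{
    auto a = ValueList([1, 2, 3]);
    auto b = ValueList([4, 5]);

    swapByHand(a, b);
    assert(a.items == [4, 5] && b.items == [1, 2, 3]);

    swap(a, b); // library swap: same observable result as the manual version
    assert(a.items == [1, 2, 3] && b.items == [4, 5]);
}
```

Compiling with -unittest runs the checks. Whether a library swap can avoid the intermediate copies is an implementation detail of the library, which is roughly the point of preferring it over the hand-written version.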
> copy_if on a multidimensional container should not naively copy entire hyperplanes. More generally, I think that whenever an arbitrarily large object is to be copied, that should be explicit instead of implicit. A lot of focus in C++ is dedicated to making sure you don't copy the wrong thing.
In my opinion by-value-types are good for mathematical objects and data like numbers, tuples, sets, lists, vectors, dictionaries etc.; by-reference-types are good for things like graphical objects/widgets, hardware resources, I/O handles, factories etc. Unnecessary copies? The compiler should care about that, I want to have clear semantics and generic syntax.
Dec 14 2010
On Wed, 15 Dec 2010 03:06 +0100 Jonathan Schmidt-Dominé <devel the-user.org> wrote:
>> copy_if on a multidimensional container should not naively copy entire hyperplanes. More generally, I think that whenever an arbitrarily large object is to be copied, that should be explicit instead of implicit. A lot of focus in C++ is dedicated to making sure you don't copy the wrong thing.
> In my opinion by-value-types are good for mathematical objects and data like numbers, tuples, sets, lists, vectors, dictionaries etc.; by-reference-types are good for things like graphical objects/widgets, hardware resources, I/O handles, factories etc. Unnecessary copies? The compiler should care about that, I want to have clear semantics and generic syntax.
+++
The choice of value/ref should always be based on (human) semantics _only_. Any other design (of language or app) is wrong.
Denis
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
Dec 15 2010
On 2010-12-14 17:05:47 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:
> On 12/14/10 3:29 PM, Jonathan Schmidt-Dominé wrote:
>> However, some reasons for by-value-containers: *First of all you often have to deal with multi-dimensional data structures, you map something to a map of lists of whatever and you want to manage such data in a simple and generic way; simplification or extensions to the data structure should not force you to refactor all more or less generic code fragments. For example copying some entries around should not look different just because you added a dimension in your data structure. But without proper value-semantics you are forced to do that, because at some point you will have to switch from by-value to by-reference because of limitations made somewhere.
> I think this argument goes exactly the other way. C++ containers have terrible compositional behavior. Using vector<vector<T> > or vector<map<T> > in C++98 is suicide. C++0x fixes that by means of introducing rvalue references, but reference semantics obviate all that.
That would depend on what you're doing. Sure, if the next thing you do is run an algorithm that swaps vectors or maps then it will be stupidly slow. But if you're just filling the containers and iterating on the data later then it works quite well, and is likely to perform better than by-reference containers. Also, won't move semantics fix this whole performance problem? Reference semantics isn't the only solution.
>> *Another argument: It should be very simple (at least in C++ it is, I have never had problems with it, I just added the & here and there) to handle references to by-value-types, but wrapping by-reference-types into by-value-types is really ugly, although it may be the right thing somewhere.
> "here and there" is more like "every time I define a function". I mean that's a lot, no?
I agree writing 'const T &' everywhere in C++ is a pain, and it shouldn't be that way. Perhaps what we need is a way to tell the compiler that a certain type should automatically be passed by reference when given as a function argument. By passing them by reference as in 'ref', you know the reference won't escape the function's scope, and your container can even be located on the stack.
> Wrapping could work either way, and after thinking about it a lot I have difficulty decreeing one is considerably easier/simpler than the other.
Wrapping a by-reference inside a by-value container is easy, but wasteful (extra allocation, extra dereference, extra null pointer check). I think that was the point.
--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Dec 14 2010
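For illustration only (not code from the thread): a small D sketch of the 'ref' idea from the post above, where a by-value container is passed without being copied. Bag and the three functions are hypothetical names made up here; only the struct, postblit and ref parameter features are standard D.

```d
/// Hypothetical by-value container, defined only for this sketch.
struct Bag
{
    int[] items;

    // Postblit: deep-copy the elements whenever the struct is copied.
    this(this) { items = items.dup; }
}

/// Receives a copy; the caller's container is left untouched.
void clearFirstByValue(Bag b) { b.items[0] = 0; }

/// Receives a reference to the caller's container; no copy is made.
void clearFirstByRef(ref Bag b) { b.items[0] = 0; }

/// Read-only access without a copy.
int sum(ref const Bag b)
{
    int total = 0;
    foreach (x; b.items) total += x;
    return total;
}

unittest
{
    auto bag = Bag([1, 2, 3]); // the struct itself lives on the stack

    clearFirstByValue(bag);
    assert(bag.items == [1, 2, 3]); // only the callee's copy was changed

    clearFirstByRef(bag);
    assert(bag.items == [0, 2, 3]); // the caller's container was changed

    assert(sum(bag) == 5);
}
```

The cost the thread argues about is visible in the signatures: with a by-value container the author of every function has to decide between a copy and ref, which is the "writing it every time" burden mentioned above.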
On 2010-12-14 16:29:18 -0500, Jonathan Schmidt-Dominé <devel the-user.org> said:
> Just about my experiences: When trying to hack some algorithms quickly in Ruby I made a lot of mistakes because I had to care about a .clone everywhere and because Array.new(5, []) does not work as expected (sorry, but Array.new(5) { return [] } is not nice). So in fact C++ made my life easier than the new, stylish, simple Ruby-programming-language, because of the great by-value-containers in the STL.
I have to echo a similar concern with by-reference containers from my experience of Cocoa. It's really too easy to have two references to the same container without realizing it. I feel much more secure that my logic is correct when I play with C++ containers than with Cocoa's.
--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Dec 14 2010
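For illustration only (not code from the thread): the aliasing surprise described in the two posts above, Ruby's Array.new(5, []) and two references to one Cocoa container, sketched in D. RefRow and ValRow are hypothetical stand-ins for a by-reference and a by-value container; they are not Phobos types.

```d
/// Hypothetical by-reference row: a class, so variables hold references.
class RefRow
{
    int[] items;
    this(int[] items) { this.items = items; }
}

/// Hypothetical by-value row: a struct whose postblit deep-copies the elements.
struct ValRow
{
    int[] items;
    this(this) { items = items.dup; }
}

/// Builds a 5-row grid from one row, changes row 0,
/// and returns the first cell of every row.
int[] firstCellsByReference()
{
    auto row = new RefRow([0, 0, 0]);
    auto grid = new RefRow[](5);
    grid[] = row;          // five references to the same object
    grid[0].items[0] = 7;  // meant for row 0 only
    int[] firsts;
    foreach (r; grid) firsts ~= r.items[0];
    return firsts;
}

int[] firstCellsByValue()
{
    auto row = ValRow([0, 0, 0]);
    auto grid = new ValRow[](5);
    foreach (ref r; grid) r = row; // each assignment makes an independent copy
    grid[0].items[0] = 7;
    int[] firsts;
    foreach (r; grid) firsts ~= r.items[0];
    return firsts;
}

unittest
{
    assert(firstCellsByReference() == [7, 7, 7, 7, 7]); // every row changed
    assert(firstCellsByValue() == [7, 0, 0, 0, 0]);     // only row 0 changed
}
```

With the by-reference row the fix is an explicit copy per slot, which is the ".clone everywhere" complaint; with the by-value row the copy happens implicitly, which is the cost the other side of the thread objects to.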
Michel Fortin:
> I have to echo a similar concern with by-reference containers from my experience of Cocoa. It's really too easy to have two references to the same container without realizing it.
A partial (but maybe better) solution to this problem is to introduce "linear types" in D, and then let the compiler allocate a container on the stack as an automatic optimization where possible: http://en.wikipedia.org/wiki/Linear_types
Bye,
bearophile
Dec 14 2010
On Dec 15, 10 14:23, bearophile wrote:
> Michel Fortin:
>> I have to echo a similar concern with by-reference containers from my experience of Cocoa. It's really too easy to have two references to the same container without realizing it.
> A partial (but maybe better) solution to this problem is to introduce "linear types" in D, and then let the compiler allocate a container on the stack as an automatic optimization where possible: http://en.wikipedia.org/wiki/Linear_types
> Bye,
> bearophile
std.typecons.Unqiue ?
Dec 15 2010
On Dec 16, 10 02:24, KennyTM~ wrote:
> On Dec 15, 10 14:23, bearophile wrote:
>> Michel Fortin:
>>> I have to echo a similar concern with by-reference containers from my experience of Cocoa. It's really too easy to have two references to the same container without realizing it.
>> A partial (but maybe better) solution to this problem is to introduce "linear types" in D, and then let the compiler allocate a container on the stack as an automatic optimization where possible: http://en.wikipedia.org/wiki/Linear_types
>> Bye,
>> bearophile
> std.typecons.Unqiue ?
(BTW, I meant 'Unique' :) )
Dec 15 2010
> A partial (but maybe better) solution to this problem is to introduce "linear types" in D, and then let the compiler allocate a container on the stack as an automatic optimization where possible: http://en.wikipedia.org/wiki/Linear_types
Well, then you would have a lot of null-ptrs when using the by-reference containers, not very intuitive. What should it be good for?
Dec 15 2010
On 09/12/2010 21:55, Simon Buerger wrote:
> On 08.12.2010 23:45, Jonathan M Davis wrote:
>> On Wednesday, December 08, 2010 14:14:57 Simon Buerger wrote:
>>> For every lib it's a design decision if containers should be value- or reference-types. In C++ STL they are value-types (i.e. the copy-constructor does a real copy), while in tango and phobos the decision was to go for reference-types afaik, but I would like to be able to write value-types too, which isn't possible (in a really good way) currently. Following points would need some love (by-value containers are probably not the only area where these could be useful)
>> It's extremely rare in my experience that it makes any sense to copy a container on a regular basis. Having an easy means of creating a deep copy of a container or copying the elements from one container to another efficiently would be good, but having containers be value types is almost always a bad idea. It's just not a typical need to need to copy containers - certainly not enough to have them be copied just because you passed them to a function or returned them from one. I think that reference types for containers is very much the correct decision. There should be good ways to copy containers, but copying shouldn't be the default for much of anything in the way of containers.
I would go further than that actually, it seems to me that the idea of by-value containers is completely idiotic. I was *hesitant* to say this because it goes against conventional C++ "wisdom" (or rather, C++ mentality), and I'm just a random junior programmer on a web forum, and I am saying it in a somewhat inflammatory way... But frankly, I've been thinking about it for the last few days (the issue came up earlier, in the "Destructors, const structs, and opEquals" thread), and I could not change my mind. For the love of life, how can anyone think this is a good idea? I'm struggling to find even one use-case where it would make sense. (a non-subjective use-case at least)
> From a pragmatic viewpoint you are right, copying containers is rare. But on the other hand, classes imply a kind of identity, so that a set is a different object than another object with the very same elements.
Yeah, classes have identity, but they retain the concept of equality. So what's wrong with that? Equality comparisons would still work the same way as by-value containers.
> That feels wrong from an aesthetical or mathematical viewpoint.
Aesthetics are very subjective (I can say the exact same thing about the opposite case). As for a mathematical viewpoint, yes, it's not exactly the same, but first of all, it's not generally a good idea to strictly emulate mathematical semantics in programming languages. So to speak, mathematical "objects" are immutable, and they exist in a magical infinite space world without the notion of execution or side-effects. Trying to model those semantics in a programming language brings forth a host of issues (most of them performance-related). But more important, even if you wanted to do that (to have it right from a mathematical viewpoint), mutable by-value containers are just as bad, you should use immutable data instead.
> Furthermore, if you have for example a vector of vectors, vector!int row = [1,2,3]; auto vec = Vector!(Vector!int)(5, row); then vec should be 5 rows, and not 5 times the same row.
Then instead of "Vector" use a static-length vector type, don't use a container.
--
Bruno Medeiros - Software Engineer
Dec 21 2010
Bruno Medeiros:
> For the love of life, how can anyone think this is a good idea? I'm struggling to find even one use-case where it would make sense. (a non-subjective use-case at least)
I agree that in general collections are better managed by reference. But if you need a hash that you know will not contain more than 10-20 items, and you need max performance, and you don't need to pass it around, then a value in-place hash may be useful. I have used it sometimes. This is not a generic case, but besides the normal collections, Phobos2 may add a few little value ones like this.
Bye,
bearophile
Dec 21 2010
On 21.12.2010 18:45, Bruno Medeiros wrote:
> On 09/12/2010 21:55, Simon Buerger wrote:
>> From a pragmatic viewpoint you are right, copying containers is rare. But on the other hand, classes imply a kind of identity, so that a set is a different object than another object with the very same elements.
> Yeah, classes have identity, but they retain the concept of equality. So what's wrong with that? Equality comparisons would still work the same way as by-value containers.
Identity is wrong, because if I pass the set {1,2,3} to a function, I would like to pass exactly these three values, not some mutable object. This may imply that the function-parameter should be const, which is probably a good idea anyway. I want it to be mutable, I want to use "out"/"ref", the same way as with the simple builtin-types.
>> That feels wrong from an aesthetical or mathematical viewpoint.
> Aesthetics are very subjective (I can say the exact same thing about the opposite case). As for a mathematical viewpoint, yes, it's not exactly the same, but first of all, it's not generally a good idea to strictly emulate mathematical semantics in programming languages. So to speak, mathematical "objects" are immutable, and they exist in a magical infinite space world without the notion of execution or side-effects. Trying to model those semantics in a programming language brings forth a host of issues (most of them performance-related). But more important, even if you wanted to do that (to have it right from a mathematical viewpoint), mutable by-value containers are just as bad, you should use immutable data instead.
You might be right that modeling mathematics is not perfect, at least in C/C++/D/java. Though functional programming is fine with it, and it uses immutable data just as you suggested. But I'm aware that that's not the way to go for D. Anyway, though total math-like behavior is impossible, with
auto A = Set(1,2,3);
auto B = A;
B.add(42);
letting A and B have different contents is much closer to math than letting both be equal. Though both are not perfect. And for the "immutable data": It's not perfectly possible, but in many circumstances it is considered good style to use "const" and "assumeUnique" as much as possible. It helps optimizing, multi-threading and code-correctness. So it is a topic not only in functional programming but also in D.
>> Furthermore, if you have for example a vector of vectors, vector!int row = [1,2,3]; auto vec = Vector!(Vector!int)(5, row); then vec should be 5 rows, and not 5 times the same row.
> Then instead of "Vector" use a static-length vector type, don't use a container.
Maybe you want to change that stuff later on, so static-length is no option. The following example might demonstrate the problem more clearly. It is intended to init a couple of sets to empty.
set!int[42] a;
version(by_reference_wrong):
a[] = set!int.empty; // this does not work as intended
version(by_reference_correct):
foreach(ref x; a) x = set!int.empty;
version(by_value):
//nothing to be done, already everything empty
Obviously the by_value version is the cleanest. Furthermore, the first example demonstrates that by-reference does not work together with the slice-syntax (which is equivalent to the constructor-call in my original example). Replacing "set!int.empty" with "new set!int" doesn't change the situation, but makes it sound only more weird in my ears: "new vector"? What was wrong with the old one? And I don't want "_an_ empty set", I want "_the_ empty set". Every empty set is equal, so there is only one.
Last but not least let me state: I do _not_ think that value-containers will go into phobos/tango some day, that would be too difficult in practice. I just want to state that there are certain reasons for it. (And originally this thread asked for some small changes in the language to make it possible, not the standard).
Krox
ps: I'll go on vacation now, see you next year, if there is still need for discussion. Merry christmas all :)
Dec 22 2010
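For illustration only (not code from the thread): the "auto A = Set(1,2,3); auto B = A; B.add(42);" example from the post above, written out in D for both semantics. ValSet and RefSet are hypothetical types invented for this sketch; the thread's Set type is not defined anywhere, so its API here (add, contains, construction from an array) is an assumption.

```d
/// Hypothetical by-value set: a struct whose postblit duplicates the storage.
struct ValSet
{
    private int[] items;

    this(int[] values) { foreach (v; values) add(v); }

    // Postblit: runs on every copy, so copies never share storage.
    this(this) { items = items.dup; }

    void add(int v) { if (!contains(v)) items ~= v; }

    bool contains(int v) const
    {
        foreach (x; items) if (x == v) return true;
        return false;
    }
}

/// Hypothetical by-reference set: a class wrapping the same storage.
final class RefSet
{
    private ValSet impl;

    this(int[] values) { impl = ValSet(values); }
    void add(int v) { impl.add(v); }
    bool contains(int v) const { return impl.contains(v); }
}

/// Does A see the element that was added through B?
bool originalSeesAddByValue()
{
    auto a = ValSet([1, 2, 3]);
    auto b = a;   // copy: b is a separate set
    b.add(42);
    return a.contains(42);
}

bool originalSeesAddByReference()
{
    auto a = new RefSet([1, 2, 3]);
    auto b = a;   // second reference to the same set
    b.add(42);
    return a.contains(42);
}

unittest
{
    assert(!originalSeesAddByValue());     // A is still {1,2,3}
    assert(originalSeesAddByReference());  // A and B are the same object
}
```

The sketch also shows the wrapping direction discussed earlier in the thread: putting a by-value type inside a class is a one-field wrapper, while the reverse would need an allocation and a null check.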
Sorry for the long delay in replying.
On 22/12/2010 12:04, Simon Buerger wrote:
> On 21.12.2010 18:45, Bruno Medeiros wrote:
>> On 09/12/2010 21:55, Simon Buerger wrote:
>>> From a pragmatic viewpoint you are right, copying containers is rare. But on the other hand, classes imply a kind of identity, so that a set is a different object than another object with the very same elements.
>> Yeah, classes have identity, but they retain the concept of equality. So what's wrong with that? Equality comparisons would still work the same way as by-value containers.
> Identity is wrong, because if I pass the set {1,2,3} to a function, I would like to pass exactly these three values, not some mutable object. This may imply that the function-parameter should be const, which is probably a good idea anyway. I want it to be mutable, I want to use "out"/"ref", the same way as with the simple builtin-types.
I don't understand this, it doesn't seem to make sense. You say you don't want the set to be "some mutable object", yet also say you "want it to be mutable". Does "it" refer to something else? I don't get it. Assuming just this text: "Identity is wrong, because if I pass the set {1,2,3} to a function, I would like to pass exactly these three values, not some mutable object." Then pass in some unmodifiable collection. Hard to suggest a better alternative without a concrete example.
>>> That feels wrong from an aesthetical or mathematical viewpoint.
>> Aesthetics are very subjective (I can say the exact same thing about the opposite case). As for a mathematical viewpoint, yes, it's not exactly the same, but first of all, it's not generally a good idea to strictly emulate mathematical semantics in programming languages. So to speak, mathematical "objects" are immutable, and they exist in a magical infinite space world without the notion of execution or side-effects. Trying to model those semantics in a programming language brings forth a host of issues (most of them performance-related). But more important, even if you wanted to do that (to have it right from a mathematical viewpoint), mutable by-value containers are just as bad, you should use immutable data instead.
> You might be right that modeling mathematics is not perfect, at least in C/C++/D/java. Though functional programming is fine with it, and it uses immutable data just as you suggested. But I'm aware that that's not the way to go for D. Anyway, though total math-like behavior is impossible, but with
Why is it not the way to go for D? Why is "total math-like behavior is impossible"?
>>> Furthermore, if you have for example a vector of vectors, vector!int row = [1,2,3]; auto vec = Vector!(Vector!int)(5, row); then vec should be 5 rows, and not 5 times the same row.
>> Then instead of "Vector" use a static-length vector type, don't use a container.
> Maybe you want to change that stuff later on, so static-length is no option. The following example might demonstrate the problem more clearly. It is intended to init a couple of sets to empty.
> set!int[42] a;
> version(by_reference_wrong):
> a[] = set!int.empty; // this does not work as intended
> version(by_reference_correct):
> foreach(ref x; a) x = set!int.empty;
> version(by_value):
> //nothing to be done, already everything empty
> Obviously the by_value version is the cleanest.
Not the best comparison, since the by_reference_correct version could be improved to something like:
applyFill(a, set!int.empty) // if the last parameter is lazy
or
applyFill(a, { set!int.empty }) // otherwise, param is delegate instead
but in any case this is just a very specific example. What about the other cases where by value would be more verbose than by reference? (particularly when you want to avoid needless copies)
> Replacing "set!int.empty" with "new set!int" doesn't change the situation, but makes it sound only more weird in my ears: "new vector"? What was wrong with the old one? And I don't want "_an_ empty set", I want "_the_ empty set". Every empty set is equal, so there is only one.
The only way to truly solve this problem for you is to use by-value containers, and actually use them with value semantics all the time (i.e., don't turn them into by-ref by passing pointers or other references to them)! ...Well, this does contradict a bit what I originally said, that "how can anyone think this is a good idea?" But note that I was talking about within the context of C++. If you use by-value containers in the way described above (with actual value-semantics usage) you will sooner or later be incurring heavy performance costs, such that it's no longer a good idea to be using C++ in the first place.
--
Bruno Medeiros - Software Engineer
Jan 27 2011
Jonathan Schmidt-Dominé Wrote:
> Just about my experiences: When trying to hack some algorithms quickly in Ruby I made a lot of mistakes because I had to care about a .clone everywhere and because Array.new(5, []) does not work as expected (sorry, but Array.new(5) { return [] } is not nice). So in fact C++ made my life easier than the new, stylish, simple Ruby-programming-language, because of the great by-value-containers in the STL.
Hmm... never needed to clone a container. Is there a use case for by-value containers?
Dec 14 2010
Kagamin wrote:
> Hmm... never needed to clone a container. Is there a use case for by-value containers?
I have implemented the Quine McCluskey algorithm in Ruby, it was really annoying and difficult to find the bugs.
Dec 15 2010
Jonathan Schmidt-Dominé Wrote:
> Kagamin wrote:
>> Hmm... never needed to clone a container. Is there a use case for by-value containers?
> I have implemented the Quine McCluskey algorithm in Ruby, it was really annoying and difficult to find the bugs.
From what I understand of the description, the algorithm consists of several steps of joining. Did you try to join in place?
Dec 15 2010
Kagamin wrote:
> What I understand from description, the algorithm consists of several steps of joining. Did you try to join in place?
Can't remember, I had to move and copy some sets around and test which set of sets is the best in some steps or something like that, however, it is somehow irrelevant.
Dec 15 2010