digitalmars.D - opIndexMutable - a proposal
- Kevin Bealer (42/42) Apr 25 2005 There are problems with the current division of opIndex and opIndexAssig...
- Ben Hinkle (8/16) Apr 25 2005 It's actually tempting to chuck opIndexAssign and make opIndex return T*...
- Kevin Bealer (21/39) Apr 25 2005 That would work, and D does that for the hash table (AA), but in the cas...
- Ben Hinkle (21/73) Apr 25 2005 The on-the-fly container doesn't seem practical to me. The only use I'm
- Ben Hinkle (17/31) Apr 25 2005 If anyone's following at home here's a concrete example of what I mean:
- Kevin Bealer (20/46) Apr 25 2005 Approach #1: prioritize the mechanisms. If we had all three opIndex's:
- Ben Hinkle (25/83) Apr 25 2005 Not only annoying but I think language writers would freak at the ambigu...
- Uwe Salomon (14/14) Apr 26 2005 I like this idea very much. After the AA-discussion (should lookup inser...
- Ben Hinkle (6/20) Apr 26 2005 I disagree that fine-tuning indexing semantics have to wait for array op...
- Uwe Salomon (5/9) Apr 26 2005 No. I'm just not sure if/how the opIndexMutable() depend on the way the ...
- Ben Hinkle (24/36) Apr 26 2005 [snip]
- Kevin Bealer (28/66) Apr 26 2005 I like this. One of the good things about the STL is that the mechanism...
- Ben Hinkle (8/33) Apr 26 2005 agreed. I remember seeing proposals for inout return values (though that...
- Kevin Bealer (28/65) Apr 26 2005 No, they could both exist; what I mean is, it might be useful to be abl...
- Ben Hinkle (4/8) Apr 26 2005 Note also returning T* from opIndexMutable and having x[i] rewritten as
There are problems with the current division of opIndex and opIndexAssign. Specifically, opIndexAssign() only handles one case of modifying a container-owned object. Other cases, like passing the object via an inout parameter, or using the object with a method call x.methodCall() (for struct objects), do not work cleanly.

The fix I propose is a new operator overload:

    T * opIndexMutable(int i);

The pointer "tunnels" the mutability of the object out of the container. Expressions would be converted from

    Z[i]  -->  (*Z.opIndexMutable(i))

    ContainerType z;
    void foo(inout Foo1 x);

    foo(z[10]);          -->  foo(*z.opIndexMutable(10));
    z[10].methodCall();  -->  z.opIndexMutable(10).methodCall();
                      OR -->  (*z.opIndexMutable(10)).methodCall();

This could be a post 1.0 feature. I think it could replace opIndexAssign() eventually, or just be applied to all "inout" and mutable situations where the "X[i]=?" pattern cannot be matched.

As a further example of opIndexMutable()'s semantics, here is opIndexAssign() written in terms of opIndexMutable():

    T opIndexAssign(int i, T value)
    {
        return *opIndexMutable(i) = value;
    }

Whether opIndexAssign() is kept or not, or both methods kept but remaining independent, is a subject for debate; not sure where I stand on that.

The motivating example (that I ran into) is:

    class Container {
        T opIndex(size_t i);
        T opIndexAssign(size_t i, T value);

    private:
        T[] data; // underlying storage
    }

Unfortunately, if T is a struct type, then:

    x[i].method();  // only modifies the temporary copy returned from opIndex
    x[i].field ++;  // likewise -- the struct field is not updated permanently.

In both cases, opIndexMutable() would provide faster results (no struct copy needed) and, of course, it would have the expected mutable action in these two cases. For classes, these mutable versions already work because a class is already returned by reference.

Kevin
Apr 25 2005
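To make the struct-copy problem concrete, here is a minimal sketch in present-day D. It is not code from the thread: the Counter type and the two helper functions are made up for illustration, and opIndexMutable is only an ordinary method here, so the proposed rewrite of x[i] into (*x.opIndexMutable(i)) is written out by hand.

```d
struct Counter
{
    int hits;
    void bump() { ++hits; }
}

struct Container
{
    private Counter[] data;

    // rvalue indexing as in the post: hands back a copy of the element
    Counter opIndex(size_t i) { return data[i]; }

    // hypothetical overload from the proposal; the compiler never calls it implicitly
    Counter* opIndexMutable(size_t i) { return &data[i]; }
}

// Mutates only the temporary copy; the stored element stays unchanged.
int bumpThroughCopy()
{
    auto c = Container(new Counter[2]);
    c[0].bump();
    return c.data[0].hits;
}

// The proposed rewrite, spelled out: c[0] becomes (*c.opIndexMutable(0)).
int bumpThroughPointer()
{
    auto c = Container(new Counter[2]);
    (*c.opIndexMutable(0)).bump();
    return c.data[0].hits;
}
```

bumpThroughCopy() leaves the stored counter at 0, while bumpThroughPointer() changes it to 1, which is the difference the proposal is after.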
"Kevin Bealer" <Kevin_member pathlink.com> wrote in message news:d4iqja$1qvr$1 digitaldaemon.com...There are problems with the current division of opIndex and opIndexAssign. Specifically, opIndexAssign() only handles one case of modifying a container-owned object. Other cases, like passing the object via an inout parameter, or using the object with a method call x.methodCall() (for struct objects) do not work cleanly. The fix I propose is a new operator overload: T * opIndexMutable(int i);It's actually tempting to chuck opIndexAssign and make opIndex return T* directly (as in C++). Then the compiler would turn x[n] into (*x.opIndex(n)) so that it becomes an lvalue. That would allow all the expressions x[n]=y, x[n]++, x[n].foo++ or whatever else one wants. Since D arrays keep the C++ indexing behavior (eg, looking up an element in an AA inserts) there's really no need for multiple indexing overloads.
Apr 25 2005
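The single-overload idea can be sketched with the ref return type that later versions of D provide; this is an illustration, not something available to the posters in 2005. The Item and Vec types and the demo function are hypothetical names. Because opIndex returns a reference, no pointer dereference or special rewrite is needed at the call site.

```d
struct Item
{
    int foo;
}

struct Vec
{
    private Item[] data;

    // One overload only: indexing yields an lvalue referring to the stored element.
    ref Item opIndex(size_t n) { return data[n]; }
}

int demo()
{
    auto v = Vec(new Item[3]);
    v[0] = Item(5);   // x[n] = y assigns through the returned reference
    v[0].foo++;       // x[n].foo++ updates the stored element
    v[1].foo += 10;   // compound assignment works the same way
    return v[0].foo + v[1].foo;
}
```

The trade-off raised later in the thread still applies: with one reference-returning overload the container cannot tell a read from a write.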
In article <d4iurp$1vig$1 digitaldaemon.com>, Ben Hinkle says...
> It's actually tempting to chuck opIndexAssign and make opIndex return T* directly (as in C++). Then the compiler would turn x[n] into (*x.opIndex(n)) so that it becomes an lvalue. That would allow all the expressions x[n]=y, x[n]++, x[n].foo++ or whatever else one wants. Since D arrays keep the C++ indexing behavior (eg, looking up an element in an AA inserts) there's really no need for multiple indexing overloads.

That would work, and D does that for the hash table (AA), but in the case of user designed containers, there may be a need for returning a copy of data, particularly if the data is constructed on the fly (returning the sum of two integers, maybe). The user might want the plain opIndex() version in that case.

Also, in the case of x[i].callMethod(), the class version doesn't modify the container, so the language can call opIndex. The language only needs to call opIndexMutable() for struct and value types, or for classes if the reference is to be changed.

    struct     x[i].method()      -> opIndexMutable
    struct     y = x[i].field     -> opIndexMutable
    class      .method()          -> opIndex
    class      inout arg          -> opIndexMutable
    class      x[i] = z;          -> opIndexMutable or opIndexAssign
    primitive  inout or ++        -> opIndexMutable
    primitive  y = x[i];          -> opIndex

In this way, the container can either enforce non-modifiability, or (for a tree based set or hash table) adjust the position in the tree if the value may have changed.

Kevin
Apr 25 2005
"Kevin Bealer" <Kevin_member pathlink.com> wrote in message news:d4jejt$2ge3$1 digitaldaemon.com...In article <d4iurp$1vig$1 digitaldaemon.com>, Ben Hinkle says...The on-the-fly container doesn't seem practical to me. The only use I'm aware of that depends on opIndex not changing the container is the concurrent AA use of opIndex to look up keys lock-free. I just grep'ed around in dfl, dtl, mango, mintl and they use/define opIndex in a way where it can be replaced with one that returns an lvalue (except for that concurrent AA). The uses of opIndexAssign can be more tricky actually since dfl, for example, uses opIndexAssign to keep a system resource (a combo-box) in sync with the D object. That wouldn't be possible with a single function. So there are fairly big downsides to going the C++ way."Kevin Bealer" <Kevin_member pathlink.com> wrote in message news:d4iqja$1qvr$1 digitaldaemon.com...That would work, and D does that for the hash table (AA), but in the case of user designed containers, there may be a need for returning a copy of data, particularly if the data is constructed on the fly (returning the sum of two integers, maybe). The user might want the plain Index() version in that case.There are problems with the current division of opIndex and opIndexAssign. Specifically, opIndexAssign() only handles one case of modifying a container-owned object. Other cases, like passing the object via an inout parameter, or using the object with a method call x.methodCall() (for struct objects) do not work cleanly. The fix I propose is a new operator overload: T * opIndexMutable(int i);It's actually tempting to chuck opIndexAssign and make opIndex return T* directly (as in C++). Then the compiler would turn x[n] into (*x.opIndex(n)) so that it becomes an lvalue. That would allow all the expressions x[n]=y, x[n]++, x[n].foo++ or whatever else one wants. 
Since D arrays keep the C++ indexing behavior (eg, looking up an element in an AA inserts) there's really no need for multiple indexing overloads.Also, in the case of X[i].callMethod(), the class version doesn't modify the container, so the language can call opIndex. The language only needs to call opIndexMutable() for struct and value types, or for classes if the reference is to be changed. struct x[i].method() -> opIndexMutable struct y = x[i].field -> opIndexMutable class .method() -> opIndex class inout arg -> opIndexMutable class x[i] = z; -> opIndexMutable or opIndexAssign primitive inout or ++ -> opIndexMutable primitive y = x[i]; -> opIndexThe ones with ++ or struct field dereferencing can be done the same way []= is done today pretty much. The table of what gets rewritten would be pretty large, though. I can see the inout rules as being hard/impractical to detect since it must come after the function resolution stage for the expression containing the indexing expression. If there were some syntax at the call be used to call opIndexMutable instead of opIndex. Something like y = func(inout x[n], 10, 20); The indexing rewriting rules would be getting more and more complex, but it might indeed be worth it.In this way, the container can either enforce non-modifiability, or (for a tree based set or hash table) adjust the position in the tree if the value may have changed.
Apr 25 2005
> I can see the inout rules as being hard/impractical to detect since it must come after the function resolution stage for the expression containing the indexing expression.

If anyone's following at home here's a concrete example of what I mean:

    void f(inout int x){...}
    void f(double x){...}
    struct X {
        int opIndexMutable(int n) {...}
        double opIndex(int n) {...}
    }
    ...
    X x;
    f(x[n]);

Which f is picked? The compiler can't decide on opIndexMutable or opIndex until it knows if the first arg to f is inout or not. But it can't answer that until it can finish overload resolution based on the return type of the x[n] expression.

> If there were some syntax at the call site to say that one wanted an lvalue, it could be used to call opIndexMutable instead of opIndex. Something like
>     y = func(inout x[n], 10, 20);
> The indexing rewriting rules would be getting more and more complex, but it might indeed be worth it.

With this variation the above example would pick opIndex. To get the opIndexMutable version you'd have to write

    f(inout x[n]);
Apr 25 2005
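The circular dependency disappears once the call site states which overload it wants. The sketch below shows that in present-day D, under some assumptions: ref stands in for the inout parameter of 2005, opIndexMutable returns int* as in the original proposal, and the explicit call plays the role of the suggested f(inout x[n]) syntax. The helper names viaOpIndex and viaOpIndexMutable are hypothetical.

```d
string f(ref int x)  { return "ref int"; }
string f(double x)   { return "double"; }

struct X
{
    int[4] cells;

    // rvalue indexing: yields a temporary double
    double opIndex(size_t n) { return cells[n]; }

    // hypothetical lvalue indexing, called explicitly in this sketch
    int* opIndexMutable(size_t n) { return &cells[n]; }
}

// Plain x[n] goes through opIndex, so only f(double) can accept the result.
string viaOpIndex()
{
    X x;
    return f(x[1]);
}

// Naming the mutable overload at the call site produces an int lvalue,
// which matches f(ref int) exactly.
string viaOpIndexMutable()
{
    X x;
    return f(*x.opIndexMutable(1));
}
```

With the choice made explicit, overload resolution of f runs on a known argument type in both cases and no ordering problem arises.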
In article <d4k7io$7ds$1 digitaldaemon.com>, Ben Hinkle says...
> If anyone's following at home here's a concrete example of what I mean:
>
>     void f(inout int x){...}
>     void f(double x){...}
>     struct X {
>         int opIndexMutable(int n) {...}
>         double opIndex(int n) {...}
>     }
>     ...
>     X x;
>     f(x[n]);
>
> Which f is picked? The compiler can't decide on opIndexMutable or opIndex until it knows if the first arg to f is inout or not. But it can't answer that until it can finish overload resolution based on the return type of the x[n] expression.

Approach #1: prioritize the mechanisms. If we had all three opIndex's:

    opIndex() is used in a "value consuming" context.
    opIndexAssign() is used if that can't be made to work.
    (*opIndexMutable()) is used if that can't be made to work.

But as you say, this requires function call resolution and operator rewriting to be decided together, which is probably an annoying burden.

[...] is ambiguous, so produce an error. You could work around it:

    f(*& x[i]);       // force mutable version... (like your "inout", but uglier.)
    f(x, y[i]=, z);   // invoke opIndexMutable(), by analogy with opIndexAssign.

> If there were some syntax at the call site to say that one wanted an lvalue, it could be used to call opIndexMutable instead of opIndex. Something like
>     y = func(inout x[n], 10, 20);
> The indexing rewriting rules would be getting more and more complex, but it might indeed be worth it.
>
> With this variation the above example would pick opIndex. To get the opIndexMutable version you'd have to write f(inout x[n]);

The truth is, I like the non-mutable opIndex() enough that I might be willing to stipulate that it can't coexist with opIndexMutable(). It's enough to be able to write non-mutable for the cases that need it.

What I'm thinking of is interfaces to SQL backed data or "tied" hashes, where the data on disk has to be updated. In C++ you might do this by returning a proxy object, but opIndexAssign() is as close as you can get to operator= in D (and that decision is fine with me, in general).

Kevin
Apr 25 2005
"Kevin Bealer" <Kevin_member pathlink.com> wrote in message news:d4k9nu$9hs$1 digitaldaemon.com...In article <d4k7io$7ds$1 digitaldaemon.com>, Ben Hinkle says...Not only annoying but I think language writers would freak at the ambiguity.opIndex() is used in a "value consuming" context. opIndexAssign() is used if that can't be made to work. (*opIndexMutable()) is used if that can't be made to work. But as you say, this requires function call resolution and operator rewriting to be decided together, which is probably an annoying burden.I can see the inout rules as being hard/impractical to detect since it must come after the function resolution stage for the expression containing the indexing expression.If anyone's following at home here's a concrete example of what I mean: void f(inout int x){...} void f(double x){...} struct X { int opIndexMutable(int n) {...} double opIndex(int n) {...} } ... X x; f(x[n]); Which f is picked? The compiler can't decide on opIndexMutable or opIndex until it knows if the first arg to f is inout or not. But it can't answer that until it can finish overload resolution based on the return type of the x[n] expression.is ambiguous, so produce an error. You could work around it: f(*& x[i]); // force mutable version... (like your "inout", but uglier.)Good point. The use of &x[i] actually raises a question that I had about opIndexMutable - does it work with the & operator? I guess so but it wasn't clear to me from the first couple of posts and there doesn't seem to be any reason not to allow it. The unary & isn't overloadable in general (not that it is directly related to opIndexMutable).f(x, y[i]=, z); // invoke opMutable(), by analogy with opIndexAssign.(and probably other languages, too - marking an argument as pass-by-reference feels familiar) plus it makes it obvious the requested value has to be an lvalue. 
Adding it should be a pretty simple set of changes to Parser::parseArguments in dmd/src/parse.c that look for the 'inout' token and some poking around in expression.c with preFunctionsArguments than preprocesses the function arguments.If there were some syntax at the call site that one wanted an lvalue (eg instead of opIndex. Something like y = func(inout x[n], 10, 20); The indexing rewriting rules would be getting more and more complex, but it might indeed be worth it.With this variation the above example would pick opIndex. To get the opIndexMutable version you'd have to write f(inout x[n]);The truth is, I like the non-mutable opIndex() enough that I might be willing to stipulate that it can't coexist with opIndexMutable(). It's enough to be able to write non-mutable for the cases that need it.That would be a pity to lose the ability to have both in the same container. I'd love to add opIndexMutable to my MinTL containers (or some way to have indexing work in lvalue contexts) while continuing to have opIndex for rvalue contexts.What I'm thinking of is interfaces to SQL backed data or "tied" hashes, where the data on disk has to be updated. In C++ you might do this by returning a proxy object, but opIndexAssign() is as close as you can get to operator= in D (and that decision is fine with me, in general).yeah- it all depends on how complex Walter wants to make things. The more control he gives us programmers the more we can do but it also makes it more complex and potentially confusing for everyone. I expect he realizes something like opIndexMutable will be needed and so he needs to decide if/how he wants to go there. The half-way solution of today won't work forever.
Apr 25 2005
I like this idea very much. After the AA discussion (should lookup insert a new element or not) I changed my opIndex() _not_ to insert a new element (if I recall correctly, Ben himself convinced me of that idea). opIndexAssign() inserts the new element, of course.

I think the point is that how this problem will be solved depends on how array operations are implemented. Walter said in an earlier thread that he moved the implementation of this feature to after 1.0 because he wants to do it perfectly (to make an impression on the Fortran guys?). Your idea may be a good way to implement array ops for user-defined data structures, but perhaps there are solutions that can be implemented faster? I guess we have to wait for a definitive decision until it is clear how array ops are going to be implemented.

Ciao
uwe
Apr 26 2005
"Uwe Salomon" <post uwesalomon.de> wrote in message news:opspum6vpy6yjbe6 sandmann.maerchenwald.net...I like this idea very much. After the AA-discussion (should lookup insert a new element or not) i changed my opIndex() _not_ to insert a new element (if i recall it right, Ben himself convinced me of that idea). opIndexAssign() inserts the new element, of course. I think the point is that how this problem will be solved depends on how array operations are implemented. Walter said in an earlier thread that he moved the implementation of this feature behind 1.0 because he wants to do it perfectly (to make some impress on the Fortran guys?). Your idea may be a good possibility to implement array ops for own datastructures, but perhaps there are solutions that can be implemented faster? We have to wait for a definitive decision until it is clear how array ops are going to be implemented, i guess. Ciao uweI disagree that fine-tuning indexing semantics have to wait for array ops. I see array ops as giving meaning to expressions like x[] + y[] and x[] += 10. I think those operations are different enough from x[n]+y[n] and x[n]+=10 that we shouldn't have to wait. Or am I misunderstanding?
Apr 26 2005
> I see array ops as giving meaning to expressions like x[] + y[] and x[] += 10. I think those operations are different enough from x[n]+y[n] and x[n]+=10 that we shouldn't have to wait. Or am I misunderstanding?

No. I'm just not sure if/how opIndexMutable() depends on the way the array ops are implemented. Thus I was expressing a "fear" more than an opinion.

Ciao
uwe
Apr 26 2005
"Kevin Bealer" <Kevin_member pathlink.com> wrote in message news:d4iqja$1qvr$1 digitaldaemon.com...There are problems with the current division of opIndex and opIndexAssign. Specifically, opIndexAssign() only handles one case of modifying a container-owned object. Other cases, like passing the object via an inout parameter, or using the object with a method call x.methodCall() (for struct objects) do not work cleanly. The fix I propose is a new operator overload: T * opIndexMutable(int i);[snip]T opIndexAssign(int i, T value) { return *opIndexMutable(i) = value; }This suggestion reminds me of the recent thread about AA indexing behavior http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/21045 I like the idea of introducing opIndexMutable to distinguish lvalue indexing from rvalue indexing. Assuming such a distinction becomes possible I'd like to see the AA API become: bit opIn(Key key) // possibly return value* instead Value opIndex(Key key) // throws on missing key Value* opIndexMutable(Key key) // lookup and insert if not present Value opIndexAssign(Value value, Key key) // see above definition bit contains(Key key, out Value value) void remove(Key key) // ignores missing key .. rest as before except "delete" is removed... (the difference between this list and the one I posted in that previous thread is "insert" is removed since it is covered by opIndexMutable) With this API Walter's word-count example could continue to use the statement count[word]++; since opIndexMutable would insert if not present. The key difference from today's AA behavior would be that rvalue indexing would not modify the container, the new "contains" function and the switch from "delete" to "remove".
Apr 26 2005
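The proposed AA semantics can be approximated today with a thin wrapper around a built-in associative array. The following is a rough sketch rather than the actual AA implementation: the Map template, countOf and missingKeyThrows are made-up names, bool replaces the old bit type, and opIndexMutable has to be called by hand because no compiler rewrite exists for it.

```d
struct Map(Key, Value)
{
    private Value[Key] table;

    // rvalue lookup: never inserts, throws on a missing key
    Value opIndex(Key key)
    {
        auto p = key in table;
        if (p is null)
            throw new Exception("missing key");
        return *p;
    }

    // hypothetical lvalue lookup: inserts Value.init if the key is absent
    Value* opIndexMutable(Key key)
    {
        if (auto p = key in table)
            return p;
        table[key] = Value.init;
        return key in table;
    }

    // reports presence and copies the value out when found
    bool contains(Key key, out Value value)
    {
        if (auto p = key in table)
        {
            value = *p;
            return true;
        }
        return false;
    }

    // ignores missing keys, as in the proposed API
    void remove(Key key) { table.remove(key); }
}

// Word count in the style of count[word]++, with the rewrite spelled out.
int countOf(string[] words, string target)
{
    Map!(string, int) count;
    foreach (w; words)
        (*count.opIndexMutable(w))++;
    int n;
    return count.contains(target, n) ? n : 0;
}

// Reading a missing key must throw instead of inserting.
bool missingKeyThrows()
{
    Map!(string, int) m;
    try
    {
        auto v = m["absent"];
        return false;
    }
    catch (Exception)
    {
        return true;
    }
}
```

The point of the split is visible in the two helpers: the counting loop may grow the table, while a plain read of an absent key fails loudly and leaves the table untouched.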
In article <d4lkqk$1m6m$1 digitaldaemon.com>, Ben Hinkle says...
> I like the idea of introducing opIndexMutable to distinguish lvalue indexing from rvalue indexing. Assuming such a distinction becomes possible I'd like to see the AA API become:
[snip]
> With this API Walter's word-count example could continue to use the statement count[word]++; since opIndexMutable would insert if not present. The key difference from today's AA behavior would be that rvalue indexing would not modify the container, the new "contains" function and the switch from "delete" to "remove".

I like this. One of the good things about the STL is that the mechanisms for every container in C++ are uniform. C++ is not great with this, but the STL is all user space, so it has this property. D has a good start on this; it would be great if built-in containers could emulate existing containers in every way. It is something like the "namespace" argument used by reiserfs, and others: the bigger the set of things that speak the same language (interface) the better.

After thinking about your "ref" argument, I'm thinking that the concept of opIndexMutable returning "T*" and the language doing (*op..) for you, is almost the same as if the method returned a "T &" or "inout T".

Maybe "inout" or "ref" return types are the real question here. In C++ you can write this as such:

    T   A::getData();
    T & B::getData();

    A a;
    B b;

You can then say:

    T x = a.getData();
    T y = b.getData();

.. but if you say:

    foo(T & z);

    foo(a.getData()); // fails - passing a temporary to a non-const &
    foo(b.getData()); // works

This is similar to the distinction between x[]= and =x[].

What I really want from opIndex{,Mutable} is the ability to return a reference or value without the client needing to distinguish. This is mostly doable with class but not struct or primitive types.

Kevin
Apr 26 2005
> After thinking about your "ref" argument, I'm thinking that the concept of opIndexMutable returning "T*" and the language doing (*op..) for you, is almost the same as if the method returned a "T &" or "inout T".

Agreed. I remember seeing proposals for inout return values (though that sounds a bit odd since the 'in' part wouldn't make sense). That could make the rewriting rules a little cleaner. To the user the only difference would be when writing the opIndexMutable function: whether the return value needs & in front of it, and whether the return type is "T*" or "inout T" (or whatever).

> Maybe "inout" or "ref" return types are the real question here. In C++ you can write this as such:
[snip]
> What I really want from opIndex{,Mutable} is the ability to return a reference or value without the client needing to distinguish. This is mostly doable with class but not struct or primitive types.

I'm not sure what you are driving at. Are you saying you would like to remove opIndex and only have opIndexMutable which returns a reference instead of a pointer?
Apr 26 2005
In article <d4luik$1vtl$1 digitaldaemon.com>, Ben Hinkle says...
> I'm not sure what you are driving at. Are you saying you would like to remove opIndex and only have opIndexMutable which returns a reference instead of a pointer?

No, they could both exist; what I mean is, it might be useful to be able to write (I'm using inout, but it could be one of the other syntaxen):

    class A {
        int b;
        inout T foo() { return b; }
        T bar() { return b; }
    }

client code:

    A a;
    int i = a.foo();
    int j = a.bar();

Essentially, I want the class to decide whether to provide a temporary or access to the internal 'foo'. If you rename foo() to opIndexMutable() and bar() to opIndex(), it's the behavior we've been talking about: foo returns a reference, bar returns a temporary.

In the case of opIndex and opIndexMutable, you don't need the rewriting of x.foo() to (*x.foo()) to be special cased. The class (A) would decide whether the returned object was a temporary; the client code could just consume it. C++ lets you do this by returning a "&". Currently D (AFAIK) only returns temporaries, although often they are object references, so they work about like references if you don't want to swap in a different reference.

The C++ temporary / const& restriction in D would be something like: you can't use an unnamed temporary in an inout or out context. Not sure if D has anything like this yet.

Kevin

(from the other reply)
> Note also returning T* from opIndexMutable and having x[i] rewritten as *x.opIndexMutable(i) is in keeping with the original C definition of x[i] as *(x+i). So in some sense opIndexMutable is the most easily explained indexing overload (ie - it lets you replace x+i with some other algorithm).

Yes... a return to the classics.

Kevin
Apr 26 2005
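The reference-versus-temporary choice described above can be sketched with the ref return type found in later versions of D; the thread's "inout T foo()" spelling was only a suggestion at the time. In this illustration the helper addOne and the function demo are hypothetical, and int is used in place of the unspecified T.

```d
class A
{
    private int b;

    // hands out a reference to the internal field
    ref int foo() { return b; }

    // hands out a temporary copy
    int bar() { return b; }
}

void addOne(ref int v) { ++v; }

int demo()
{
    auto a = new A;
    addOne(a.foo());      // binds to the field itself, so a.b is now 1
    int j = a.bar();      // copy of the current value
    // addOne(a.bar());   // rejected by default: an rvalue cannot bind to a ref parameter
    return j;
}
```

The class decides which flavour each method offers, and the caller consumes either one with the same syntax, which is the property Kevin asks for.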
> After thinking about your "ref" argument, I'm thinking that the concept of opIndexMutable returning "T*" and the language doing (*op..) for you, is almost the same as if the method returned a "T &" or "inout T".

Note also returning T* from opIndexMutable and having x[i] rewritten as *x.opIndexMutable(i) is in keeping with the original C definition of x[i] as *(x+i). So in some sense opIndexMutable is the most easily explained indexing overload (ie - it lets you replace x+i with some other algorithm).
Apr 26 2005