digitalmars.D - Possible rewrite of array operation spec
- Stewart Gordon (126/126) Feb 15 2005 This'll probably get people asking the prospect of array operations for
- xs0 (21/30) Feb 15 2005 Nice job on the spec (especially the different dimensions handling). I
- Stewart Gordon (17/31) Feb 15 2005 The essence of the essence is to keep the spec relatively simple and the...
- xs0 (15/28) Feb 15 2005 But it's not just an issue of optimization - if the spec says it must be...
- Stewart Gordon (26/47) Feb 16 2005 Well, the spec doesn't say that it _must_ be allocated. The spec states...
- xs0 (19/47) Feb 16 2005 Well, I read that in "a new array is created to hold the result";
- Stewart Gordon (24/66) Feb 16 2005 It's already supported. At the moment,
- xs0 (38/56) Feb 16 2005 but a[]=b doesn't look like a can be [re]allocated (and it can be, if it...
- Stewart Gordon (66/99) Feb 17 2005 What would be the sense of assigning in-place if the destination array
- xs0 (110/145) Feb 17 2005 that would, of course, make no sense :)
- Stewart Gordon (41/125) Feb 17 2005 Code that cares about performance would probably be written with at
- xs0 (73/115) Feb 17 2005 Again, why would this not be the default case (i.e. done by the
- Stewart Gordon (39/140) Feb 18 2005 At the moment, a[][][] is just the same as a[], since it's just slicing
- xs0 (63/104) Feb 18 2005 Well, I guess we understand semantics differently.. This is how I see it...
- Stewart Gordon (25/90) Feb 18 2005 Of course not, here it's obvious that I'm changing the contents of an
- xs0 (45/56) Feb 18 2005 Well, I should've written {{ int a; int *b=&a; a=2+3; }} and b still
- Georg Wrede (8/41) Feb 21 2005 What if we had an assignment operator that forces copy?
- Stewart Gordon (24/31) Feb 18 2005 Just thinking about this, at the moment we don't seem to have a means of...
- Stewart Gordon (9/22) Feb 21 2005
- Regan Heath (5/25) Feb 17 2005 Doesn't array bounds checking handle this?
- xs0 (12/12) Feb 15 2005 One more thing I just thought of - there's quite a difference between th...
- Regan Heath (13/19) Feb 15 2005 I think they should be allowed, I don't think they look weird, and I thi...
- Stewart Gordon (18/33) Feb 16 2005
- Regan Heath (9/37) Feb 16 2005 Which is exactly why I had them :)
- Stewart Gordon (11/25) Feb 17 2005 What do you mean? b = a on dynamic arrays is, by definition, a
- Regan Heath (33/56) Feb 17 2005 Gargh, you're totally right.
- Regan Heath (3/61) Feb 17 2005 forgot
- pragma (17/34) Feb 15 2005 That makes sense to me. Just allow arrays of function pointers, delegat...
- Unknown W. Brackets (14/21) Feb 15 2005 That, and the events, looks very nice in my opinion. Then again, would
- Stewart Gordon (25/45) Feb 16 2005 That would lead to such troubles as
- pragma (31/64) Feb 16 2005 That is a problem. On the one hand, this is obviously a potential sourc...
- Regan Heath (7/33) Feb 16 2005 What about:
- Norbert Nemec (3/7) Feb 16 2005 Bad idea: 'test[]' is an array by itself, so it has a length property by...
- Regan Heath (14/20) Feb 17 2005 Is it? I thought it was a class, with a length property.
- Stewart Gordon (14/21) Feb 18 2005 The trouble is that, at the moment, test[][] is equivalent to test[].
- Norbert Nemec (6/20) Feb 18 2005 Why "at the moment"? If you want to change this, I believe you are
- Norbert Nemec (22/22) Feb 16 2005 Nice work, Stewart! This is definitely much more than I have come up
- Stewart Gordon (7/7) Feb 16 2005 Another open question: should we allow all this on char arrays?
- Norbert Nemec (13/193) Feb 16 2005 One more detail: it should be clear, that the order of evaluation is not...
- Dave (16/209) Feb 17 2005 Norbert - a few questions on all this if you have the time.
- Norbert Nemec (25/40) Feb 17 2005 Very good questions indeed. I cannot easily give an answer to them.
- Dave (7/47) Feb 17 2005 Aye, in some ways at least. Maybe D can change some of that - I think th...
This'll probably get people asking for the prospect of array operations for 1.0 to be resurrected, but still.... Here is a possible specification for array operations that I feel is better defined than the one in the current, out-of-date spec. Of course, there are still some open questions, which I've put at the bottom.

Array operations
----------------
Arithmetic and bitwise operators are defined on array operands. An expression involving an array evaluates to a new array in which the operator has been applied to the elements of the operands in turn.

In essence, when an expression contains more than one array operation, a new array is created to hold the result of each operation. However, a quality implementation will optimize the evaluation of the expression to eliminate temporaries where possible.

Unary operations
~~~~~~~~~~~~~~~~
For the unary operators +, - and ~, the expression evaluates to a new array containing the result of applying the operator to each element. For example, with the declaration

    int[] x, y;

then the statement

    y = -x;

is simply equivalent to

    y = new int[x.length];
    for (int i = 0; i < y.length; i++) {
        y[i] = -x[i];
    }

Binary operations
~~~~~~~~~~~~~~~~~
The binary operations supported are +, -, *, /, %, &, |, ^, <<, >> and >>>.

If the two arrays are of the same dimension and of compatible types, then the expression evaluates to a new array in which each element is the result of applying the operator to corresponding elements of the operands. For example, with the declarations

    int[] x, y, z;

the statement

    z = x + y;

is equivalent to

    z = new int[x.length];
    for (int i = 0; i < z.length; i++) {
        z[i] = x[i] + y[i];
    }

Both operands must be of the same length. If they are not, an ArrayBoundsError is thrown.

For higher dimensions, this definition is applied recursively.
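The same-dimension binary rule above can be sketched outside D. Here is a minimal Python sketch (purely illustrative, not D; the function name `elementwise` is my own) of "apply the operator to corresponding elements, and complain if the lengths differ":

```python
def elementwise(op, x, y):
    # Mirror of the proposed rule: operands must match in length,
    # and the result is a brand-new array.
    if len(x) != len(y):
        raise IndexError("ArrayBoundsError: operand lengths differ")
    return [op(a, b) for a, b in zip(x, y)]

# z = x + y
z = elementwise(lambda a, b: a + b, [1, 2, 3], [10, 20, 30])
print(z)  # [11, 22, 33]
```

The length check is up front, matching the spec's requirement that a mismatch is an error rather than a silent truncation.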
For example, with

    int[][] x, y, z;

the statement

    z = x * y;

is equivalent to

    z = new int[][x.length];
    for (int i = 0; i < z.length; i++) {
        z[i] = x[i] * y[i];
    }

which is in turn equivalent to

    z = new int[][x.length];
    for (int i = 0; i < z.length; i++) {
        z[i] = new int[x[i].length];
        for (int j = 0; j < z[i].length; j++) {
            z[i][j] = x[i][j] * y[i][j];
        }
    }

If the operands do not match in dimension, then the operator is applied to each element of the higher-dimension operand with the whole of the lower-dimension one. For example, with

    int[] x, z;
    int y;

the statement

    z = x - y;

is equivalent to

    z = new int[x.length];
    for (int i = 0; i < z.length; i++) {
        z[i] = x[i] - y;
    }

Similarly,

    z = y - x;

is equivalent to

    z = new int[x.length];
    for (int i = 0; i < z.length; i++) {
        z[i] = y - x[i];
    }

This definition is applied recursively if the dimensions differ by two or more.

Assignment operations
~~~~~~~~~~~~~~~~~~~~~
When x is an array, the assignment

    x op= y;

is taken as equivalent to

    x = x op y;

whether y is an array of matching dimension, an array of lower dimension or a scalar. Thus the operation creates a new array and assigns it to x. If a sliced lvalue is used, the array is modified in place, so that

    x[] op= y;

is equivalent to

    x[] = x[] op y;

The preincrement and predecrement operators are handled in the same way.

User-defined types
~~~~~~~~~~~~~~~~~~
A class, struct or union type may have operators overloaded with array types as parameters. To avoid conflicts between overloaded operators and array operations, binary operations involving both array and user-defined types are resolved as follows:

1. The normal operator overloading rules are applied.
2. If no match is found, the array operation rules are applied until both operands are reduced to scalar type; operator overloading rules are then applied to the result.
3. If the expression still does not resolve, it is an error.

Open questions
~~~~~~~~~~~~~~
Should postincrement and postdecrement be allowed? 
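The mixed-dimension rule is recursive, so it can be captured in a few lines. Here is a Python sketch (illustrative only, not D; `depth` approximates an array's dimension by its nesting level, which works for the rectangular arrays the examples use) of "apply the operator to each element of the higher-dimension operand with the whole of the lower-dimension one":

```python
def depth(v):
    # Dimension of a value: 0 for a scalar, 1 for int[], 2 for int[][], ...
    return 1 + depth(v[0]) if isinstance(v, list) else 0

def broadcast_op(op, x, y):
    dx, dy = depth(x), depth(y)
    if dx == 0 and dy == 0:
        return op(x, y)                 # two scalars: plain operation
    if dx == dy:
        if len(x) != len(y):
            raise IndexError("ArrayBoundsError")
        return [broadcast_op(op, a, b) for a, b in zip(x, y)]
    if dx > dy:
        # x has the higher dimension: z[i] = x[i] op y
        return [broadcast_op(op, a, y) for a in x]
    # y has the higher dimension: z[i] = x op y[i]
    return [broadcast_op(op, x, b) for b in y]

print(broadcast_op(lambda a, b: a - b, [5, 6, 7], 2))  # [3, 4, 5]
```

Note that when the dimensions differ by two or more, the recursion reduces them one level at a time, exactly as the spec text says.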
How should they be handled?

Should we generalise the concept to function calls? If so, I guess that overload resolution would work in much the same way as for operations on user-defined types.

If we do allow it on function calls, should we allow it to work on functions of three or more parameters? In this case, the highest-dimension argument would be reduced to the dimension of the second highest, and then these two reduced together to match the third highest, and so on.

Of course, these questions raise one more: how easy or hard would these ideas be to implement?

Any thoughts?

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 15 2005
Nice job on the spec (especially the different dimensions handling). I do feel, however, that it should not be necessary to always create new arrays for the result -- I'd say that if the left-hand side is non-null and of the appropriate size (i.e. exactly equal to the size of the result), it need not be reallocated. Consider just iterating through a list of vectors to find a specific one (for example, by multiplying each with some constant vector (i.e. dot product)) - you wouldn't want a gazillion arrays created..

> Open questions
> ~~~~~~~~~~~~~~
> Should postincrement and postdecrement be allowed? How should they be handled?

I don't think so.. It would look weird and you can achieve the same by typing a+=1..

> Should we generalise the concept to function calls? If so, I guess that overload resolution would work in much the same way as for operations on user-defined types.

Well, AFAIK, the whole point of having language-supported ops is run-time efficiency - you can otherwise easily wrap your arrays in classes like Vectors and Matrices and overload ops. Functions just don't make much sense in that respect (assuming you mean calling func(int, int) for each element of int[], int[]?)

> Of course, these questions raise one more: how easy or hard would these ideas be to implement?

No idea :)

If this is done, I'd also suggest adding a few util functions to arrays, like .min, .minIndex, .max, .maxIndex, .sum and .vectorLength (as in sqrt(a[0]^2 + a[1]^2 + ...)). Dot product is then simply (a*b).sum, and the compiler can detect this is being done and not even produce the result array..

xs0
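The proposed utility properties are simple to pin down. A Python sketch (illustrative, not D; the names `vec_sum`, `dot` and `vector_length` are mine, standing in for the proposed .sum, (a*b).sum and .vectorLength) of what they would compute, including the fused dot product that never materialises the intermediate a*b array:

```python
import math

def vec_sum(a):
    # Proposed .sum property: sum of all elements.
    total = 0
    for v in a:
        total += v
    return total

def dot(a, b):
    # (a*b).sum computed without allocating the intermediate array --
    # the fusion the post hopes the compiler would perform.
    if len(a) != len(b):
        raise IndexError("ArrayBoundsError")
    return sum(x * y for x, y in zip(a, b))

def vector_length(a):
    # Proposed .vectorLength: sqrt(a[0]^2 + a[1]^2 + ...)
    return math.sqrt(dot(a, a))

print(dot([1, 2, 3], [4, 5, 6]))  # 32
```

The point of the fused `dot` is exactly the one made above: once the compiler sees the whole expression, the temporary result array of a*b can be skipped entirely.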
Feb 15 2005
xs0 wrote:

> Nice job on the spec (especially the different dimensions handling). I do feel, however, that it should not be necessary to always create new arrays for the result -- I'd say that if the left-hand side is non-null and of the appropriate size (i.e. exactly equal to the size of the result), it need not be reallocated. Consider just iterating through a list of vectors to find a specific one (for example, by multiplying each with some constant vector (i.e. dot product)) - you wouldn't want a gazillion arrays created..

The essence of the essence is to keep the spec relatively simple and the semantics clear. That's basically what I said here:

> However, a quality implementation will optimize the evaluation of the expression to eliminate temporaries where possible.

<snip>

> If this is done, I'd also suggest adding a few util functions to arrays, like .min, .minIndex, .max, .maxIndex, .sum and .vectorLength (as in sqrt(a[0]^2 + a[1]^2 + ...)). Dot product is then simply (a*b).sum, and the compiler can detect this is being done and not even produce the result array..

I started to mention this idea before.
http://www.digitalmars.com/drn-bin/wwwnews?D/21671

But calling them .min and .max is asking for confusion with the minimum and maximum allowable values of a type. Maybe .minValue and .maxValue, making them parallel with the Index counterparts?

And if there is a tie for maximum or minimum, which index should minIndex and maxIndex return? The first? The last? Any old one? What would be worth any difference in computational cost?

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 15 2005
> > [snip - don't reallocate result array if it is already the correct size (and not null, of course)]
>
> That's basically what I said here:
>
> > However, a quality implementation will optimize the evaluation of the expression to eliminate temporaries where possible.

But it's not just an issue of optimization - if the spec says it must be allocated, you can't change the semantics of that by optimization. Consider two objects holding a reference to the same array; if one of them does a vector op with the result stored in the variable holding the reference and if the spec says you must reallocate, they mustn't have the same reference after the op is finished (or actually just after it starts), and that can't be optimized away (at least according to such spec)..

> But calling them .min and .max is asking for confusion with the minimum and maximum allowable values of a type. Maybe .minValue and .maxValue, making them parallel with the Index counterparts?

or .minEl and .maxEl (El for element)? and perhaps min/maxElIndex would be more obvious, although verbosity doesn't seem to be preferred in D :)

> And if there is a tie for maximum or minimum, which index should minIndex and maxIndex return? The first? The last? Any old one? What would be worth any difference in computational cost?

When implemented by hand, this is usually the first matching index, I think, so that'd be an acceptable spec. Any old one is not very good - if it's the first (or the last), you can easily check for multiples with the slice syntax, otherwise you can't..

xs0
Feb 15 2005
xs0 wrote:

> But it's not just an issue of optimization - if the spec says it must be allocated, you can't change the semantics of that by optimization.

Well, the spec doesn't say that it _must_ be allocated. The spec states what rather than how. So what matters is that the resulting behaviour is consistent.

> Consider two objects holding a reference to the same array; if one of them does a vector op with the result stored in the variable holding the reference and if the spec says you must reallocate, they mustn't have the same reference after the op is finished (or actually just after it starts), and that can't be optimized away (at least according to such spec)..

The distinction between reference assignment and copying is already covered by the current spec. In-place vector modification follows directly from copying. For example, consider

    int[] x, y, z;
    ...
    y = x;
    ...
    y = z * 2;

then y refers to a new array, and x still refers to the original. If you want to modify/repopulate y in place, you would do

    y[] = z * 2;

by which you are indicating that you want to preserve the state of x and y referring to the same data. The same applies if op= is used instead of just =.

Of course, if x is never used again and has never been assigned to anything else by reference, the two would be effectively equivalent.

<snip>

> When implemented by hand, this is usually the first matching index, I think, so that'd be an acceptable spec. Any old one is not very good - if it's the first (or the last), you can easily check for multiples with the slice syntax, otherwise you can't..

How could I check for multiples with the slice syntax?

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 16 2005
> Well, the spec doesn't say that it _must_ be allocated. The spec states what rather than how. So what matters is that the resulting behaviour is consistent.

Well, I read that in "a new array is created to hold the result"; however, having re-read the paragraph, I see that it can be read both ways.. It should be clear, though, and I think that it should explicitly say that the result array will be reused if it exists (is non-null) and is exactly the right size. If you want a new array, you can always use (a+b*c).dup or set result=null before the expression (or we can introduce a new syntax: result = new(a+b*c) :). You should have the option of reusing the array, though, so it should be supported in the spec.

> The distinction between reference assignment and copying is already covered by the current spec. In-place vector modification follows directly from copying. For example, consider
>
>     int[] x, y, z;
>     ...
>     y = x;
>     ...
>     y = z * 2;
>
> then y refers to a new array, and x still refers to the original. If you want to modify/repopulate y in place, you would do
>
>     y[] = z * 2;
>
> by which you are indicating that you want to preserve the state of x and y referring to the same data. The same applies if op= is used instead of just =.

But why would you prefer a new array instead of in-place (when possible)? Considering arrays are references (with static arrays, you can't even create a new one, right?), it should be exactly the same if you say y=z*2 or x=z*2, as it is with objects, otherwise a whole lot of confusion and bugs will result from this, if you ask me..

>> When implemented by hand, this is usually the first matching index, I think, so that'd be an acceptable spec. Any old one is not very good - if it's the first (or the last), you can easily check for multiples with the slice syntax, otherwise you can't..
>
> How could I check for multiples with the slice syntax?

    int[] data;
    int idx = data.minElIndex;
    if (data[idx+1..length].minEl == data[idx]) {
        // more than one
    }

xs0
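The slice idiom above relies on minElIndex returning the *first* minimum. A Python sketch (illustrative, not D; `min_el_index` and `has_duplicate_min` are my names for the proposed .minElIndex property and the tail-slice check) shows why it works:

```python
def min_el_index(data):
    # Proposed .minElIndex: index of the FIRST minimum element.
    best = 0
    for i in range(1, len(data)):
        if data[i] < data[best]:
            best = i
    return best

def has_duplicate_min(data):
    # The slice idiom from the post: if the minimum of the tail past the
    # first minimum equals the minimum itself, it occurs more than once.
    idx = min_el_index(data)
    rest = data[idx + 1:]
    return bool(rest) and min(rest) == data[idx]

print(has_duplicate_min([3, 1, 2, 1]))  # True
```

If minElIndex were allowed to return "any old one" of the tied indices, the tail slice could start past an earlier duplicate and the check would fail, which is the point being made.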
Feb 16 2005
xs0 wrote:
<snip>

> Well, I read that in "a new array is created to hold the result"; however, having re-read the paragraph, I see that it can be read both ways.. It should be clear, though, and I think that it should explicitly say that the result array will be reused if it exists (is non-null) and is exactly the right size. If you want a new array, you can always use (a+b*c).dup or set result=null before the expression (or we can introduce a new syntax: result = new(a+b*c) :). You should have the option of reusing the array, though, so it should be supported in the spec.

It's already supported. At the moment,

    int[] a, b;
    ...
    a = b;

does a reference assignment. To reuse the result array, you use

    a[] = b;

This would remain the same when array operations are involved.

>> The distinction between reference assignment and copying is already covered by the current spec. In-place vector modification follows directly from copying. For example, consider
>>
>>     int[] x, y, z;
>>     ...
>>     y = x;
>>     ...
>>     y = z * 2;
>>
>> then y refers to a new array, and x still refers to the original. If you want to modify/repopulate y in place, you would do
>>
>>     y[] = z * 2;
>>
>> by which you are indicating that you want to preserve the state of x and y referring to the same data. The same applies if op= is used instead of just =.
>
> But why would you prefer a new array instead of in-place (when possible)?

Because you want x to still contain the same old data, of course.

> Considering arrays are references (with static arrays, you can't even create a new one, right?), it should be exactly the same if you say y=z*2 or x=z*2, as it is with objects, otherwise a whole lot of confusion and bugs will result from this, if you ask me..

Of course not. Why would you declare two references to the same array in the same scope if (x === y) is going to remain true throughout?

<snip>

>> How could I check for multiples with the slice syntax?
>
>     int[] data;
>     int idx = data.minElIndex;
>     if (data[idx+1..length].minEl == data[idx]) {
>         // more than one
>     }

Oh yes, that makes perfect sense. But can the idiom be made efficient?

A further idea might be to add minElCount and maxElCount. The compiler could optimise so that minEl, minElIndex and minElCount are calculated with one pass (a bit like common subexps) if two or more of them are used without the array being modified between retrievals. (Or maybe there could be something that returns a structure of value, index and count.)

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 16 2005
Stewart Gordon wrote:

> To reuse the result array, you use
>
>     a[] = b;
>
> This would remain the same when array operations are involved.

but a[]=b doesn't look like a can be [re]allocated (and it can be, if it is the wrong size or null). anyhow, it's not that important :)

>> But why would you prefer a new array instead of in-place (when possible)?
>
> Because you want x to still contain the same old data, of course.

But in this case you'd normally do

    x=y.dup;
    y=...;

Except in case of array expressions, where you suggest

    x=y;
    y=...; // y is new

> Of course not. Why would you declare two references to the same array in the same scope if (x === y) is going to remain true throughout?

Who said anything about the same scope? And anyhow, when I asked about your preference, I meant why do you think it's better to lose references as easily as possible, than the opposite? If there's only one reference, there is no argument - it's obviously better to reuse the array to avoid allocation/gc costs. If there's more than one reference, I'd definitely argue that the intention is usually to have them point to the same something (array or class) than to different somethings. If a snapshot of data is required, there's .dup (which is there in any case and serves exactly this purpose)

> Oh yes, that makes perfect sense. But can the idiom be made efficient?
>
> A further idea might be to add minElCount and maxElCount. The compiler could optimise so that minEl, minElIndex and minElCount are calculated with one pass (a bit like common subexps) if two or more of them are used without the array being modified between retrievals. (Or maybe there could be something that returns a structure of value, index and count.)

Sure, a struct would work, it might needlessly complicate things, though.. In most cases you just want the min/max value and checking for multiples is much rarer. I'd then prefer a .count(el) method which counts the number of occurrences of _any_ value, not just min/max.

The compiler can still easily optimize occurrences like

    data.count(data.maxEl)

or even

    int firstMinIdx;
    int minValue;
    int reps=data.count(minValue=data[firstMinIdx=data.minElIndex]);

while always avoiding counting and/or keeping min value/index when it's not necessary (and having these done as fast as possible is the whole point of compiler support - writing functions that do these things is trivial; and just consider how much faster a .count(1) can be on a bit[] than fetching each bit separately :).

There's another problem with a struct - if you'd want the whole struct, it would have to be templatized to hold any kind of value, producing a new template instance (costing disk, memory, compilation time etc..) on each type of array where you'd want access to these properties.

xs0
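The one-pass fusion suggested earlier in the thread (computing minEl, minElIndex and minElCount together) is straightforward to state. A Python sketch (illustrative, not D; `min_stats` is my name for the fused form a compiler could emit when several of the proposed properties are used on an unmodified array):

```python
def min_stats(data):
    # One pass over the array, tracking the minimum value, the index of
    # its FIRST occurrence, and how many times it occurs.
    val, idx, count = data[0], 0, 1
    for i in range(1, len(data)):
        if data[i] < val:
            val, idx, count = data[i], i, 1   # new minimum: reset count
        elif data[i] == val:
            count += 1                        # another tied minimum
    return val, idx, count

print(min_stats([3, 1, 2, 1]))  # (1, 1, 2)
```

This is the "common subexpression" idea: three properties, one traversal, no extra allocation.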
Feb 16 2005
xs0 wrote:

>> To reuse the result array, you use
>>
>>     a[] = b;
>>
>> This would remain the same when array operations are involved.
>
> but a[]=b doesn't look like a can be [re]allocated (and it can be, if it is the wrong size or null). anyhow, it's not that important :)

What would be the sense of assigning in-place if the destination array is going to be either the wrong size or null?

I see two motives for assigning in-place:

- to keep the data in sync when there are multiple references to it. But this already isn't going to work if you need to keep resizing the array, unless you wrap the array in a pointer or class to make it work.

- to save the overhead of memory allocation during calculations. But when the arrays can arbitrarily change in size, this overhead just comes back.

> But in this case you'd normally do
>
>     x=y.dup;
>     y=...;

Only if there are other references to the same data somewhere in the program, and I want x to break away so that I can modify it independently. Otherwise, the .dup is just wasteful.

> Except in case of array expressions, where you suggest
>
>     x=y;
>     y=...; // y is new

No "except" to it. The semantics of a = ... and a[] = ... are completely independent of the form of the right operand.

> Who said anything about the same scope? And anyhow, when I asked about your preference, I meant why do you think it's better to lose references as easily as possible, than the opposite?

I'm not sure what you mean by this.

My spec was written with three basic intentions:

(a) to be well-defined
(b) to keep D consistent within itself
(c) to be backward compatible with the current assignment semantics

Do you actually understand how my spec works? Let's look at some examples mechanically.

Current D (semantics will be preserved):

    int[] x, y, z;
    ...
    z = x;

assign x by reference to z

    z[] = x;

copy the contents of x into z (requires that z and x are already the same size)

    z = x.dup;

make a copy of x, and assign this copy by reference to z (The array previously referenced by z may still have references somewhere, or otherwise it would become eligible for GC.)

With array ops:

    z = x + y;

create an array of x[i] + y[i]; assign this new array by reference to z

    z[] = x + y;

create an array of x[i] + y[i]; copy the contents of this new array into z

(A decent compiler would implement this by adding the elements directly into z, bypassing the temporary array. But this is part of compiler optimisations, rather than of the basic semantics.)

    z = x * 2 + y;

create an array of x[i] * 2; create an array of result[i] + y[i]; assign this new result by reference to z

(x * 2 is a temporary, and so the compiler can optimise it away.)

    z[] = x * 2 + y;

create an array of x[i] * 2; create an array of result[i] + y[i]; copy the contents of this new array into z

(Here there are two temporaries: x * 2 and x * 2 + y.)

See how they fit together now?

<snip>

> Sure, a struct would work, it might needlessly complicate things, though. In most cases you just want the min/max value and checking for multiples is much rarer.

<snip>

Indeed, there's no reason we can't have both the struct and plain minEl and maxEl.

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
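The distinction being defended here - z = ... rebinds the reference, z[] = ... copies into existing storage - has a close analogue in Python, which may make the walkthrough easier to follow. This is an illustrative Python sketch, not D; Python's `z[:] = x` plays the role of D's `z[] = x`:

```python
x = [1, 2, 3]
z = [0, 0, 0]

# D's "z = x": reference assignment -- z now aliases x.
z = x
assert z is x

# D's "z[] = x": copy contents into z's existing storage.
z = [0, 0, 0]
z[:] = x
assert z == x and z is not x  # same contents, distinct arrays
```

In both languages the left-hand form, not the right-hand expression, decides whether you get aliasing or a copy, which is exactly the "semantics of = are independent of the form of the right operand" point.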
Feb 17 2005
>> but a[]=b doesn't look like a can be [re]allocated (and it can be, if it is the wrong size or null). anyhow, it's not that important :)
>
> What would be the sense of assigning in-place if the destination array is going to be either the wrong size or null?

that would, of course, make no sense :) that's why I don't like a[]=b+c; a[] implies that a is already allocated and the right size (you even state that yourself below).. With what you're suggesting, any code that cares about performance will look like:

    while (...) {
        if (a==null || a.length!=x.length)
            a=new int[x.length];
        a[]=x+y;
        ...
    }

And I'd like it to simply look like

    while (...) {
        a=x+y;
        ...
    }

while having the same thing done performance-wise..

> I see two motives for assigning in-place:
>
> - to keep the data in sync when there are multiple references to it. But this already isn't going to work if you need to keep resizing the array, unless you wrap the array in a pointer or class to make it work.

I don't think most cases will be working with arrays of multiple sizes; when they do, my guess is that the actual arrays will already be wrapped with their meta-data.. And like I suggested, you can alternatively leave other references alone by doing a=(x+y).dup where it's at least obvious you want the allocation done.

> - to save the overhead of memory allocation during calculations. But when the arrays can arbitrarily change in size, this overhead just comes back.

True, but when the arrays don't arbitrarily change (like I said, IMO that is the majority), there's a chance to save a lot of overhead (with small arrays, memory allocation will easily take more time than calculations, so what's the point of having compiler support anyway then?).

> I'm not sure what you mean by this.
>
> My spec was written with three basic intentions:
> (a) to be well-defined
> (b) to keep D consistent within itself
> (c) to be backward compatible with the current assignment semantics

I don't see how either of these is broken by reusing the result array even when [] is not used. The spec can easily be well-defined by "if the result is assigned to an array reference which is non-null and of exactly the same dimensions as the result, its allocated memory will be reused; otherwise, a new array will be allocated to store the result." And that's far more similar to how slices work now (memory is reused, unless you extend a slice beyond the original size).

As for consistency, consider:

    SomeClass a=new ...;
    SomeClass b=a;
    a+=5;
    // now, a is still the same object and a===b

    ----------

    int[] a=new ...;
    int[] b=a;
    a+=5;
    // with my suggestion, a is still the same object and a===b
    // according to your spec a!==b

OK, granted, reuse also breaks some consistency, but so does reallocating always.

> Do you actually understand how my spec works?

Completely :)

> Let's look at some examples mechanically. Current D (semantics will be preserved):

These all stay the same, even if arrays are reused.

> With array ops:
>
>     z = x + y;
>
> create an array of x[i] + y[i]; assign this new array by reference to z

I really fail to see why a new array is necessary here, if it already exists. To me it looks the same, as if:

    int a=3;
    int* b=&a;
    a=2+3; // or even a+=2
    if (*b==5) {
        // this is normal
    } else {
        // this is what you're suggesting with arrays
    }

Now that I think about it, I don't even think the result should be completely new, instead its length can be adjusted. Then, you can allocate a big chunk of memory (to hold even your biggest inputs) and just use it without any allocation whatsoever.

Basically, instead of what you wrote, z=x+y should translate to:

    assert(x.length==y.length);
    if (z==null)
        z=new int[x.length];
    else if (z.length!=x.length) { // unless this is already checked
        z.length=x.length;
    }
    for(...) {
        ...
    }

and z[]=x+y should translate to

    assert(z!=null && z.length==x.length && x.length==y.length);
    for(...) {
        ...
    }

This means that z[] is expected to be non-null and the same size as x and y (and it is an error if it is not so), exactly like it is now when arrays are just copied (and that's the only current array op, afaik).

>     z = x * 2 + y;
>     z[] = x * 2 + y;

This is exactly the same case.

> <snip> Indeed, there's no reason we can't have both the struct and plain minEl and maxEl.

Have you even read what I wrote? What type would the struct be? It'd have to be MinMaxValues!(int) or something. Like I said, that needlessly instantiates templates and having .count(value) is more useful as it can be used for things other than counting min/max values.. Actually, min/maxElIndex and count are all that is necessary, although data.minEl does look better than data[data.minElIndex].

To finish this post, I'd like to reiterate again that having array ops is not important just because of prettier syntax but because of the speed the compiler can achieve by exploiting the knowledge it gets when you use such operations (e.g. the whole point of Expression templates in C++ is to have the compiler see what it is you want to do, rather than just seeing parts of it). I assume you agree.

Now, if speed is important, you don't want to waste time allocating memory (it costs allocation, collection of the old array, breaks cache on processor etc. etc). So, it is better to reuse the array even when the programmer forgets to do so (or is too lazy to write the checks), as most probably, it was still the effect he wanted. If it is not, he can still easily create new arrays explicitly just by typing .dup, which is far fewer characters than those null and length checks :)

Finally, consider all the potential users that D will lose without this. I bet that someone who wants to evaluate D will not read enough to know better and will write a=b+c instead of a[]=b+c. Running that in a loop (of, say, 5_000_000_000 iterations :) just to see performance will make him run away, as billions of new vectors will get allocated. Why is it so slow? Why did something so simple allocate MBs of memory? I thought native code was supposed to be fast, so I guess this D and its compiler really suck. I'd better learn Fortran! :)

xs0
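The reuse-if-possible lowering proposed above can be sketched concretely. This is a Python sketch (illustrative only, not D; `add_into` is my name for the proposed lowering of z = x + y under the "reuse z's storage when it is non-null and the right size" rule):

```python
def add_into(z, x, y):
    # Proposed lowering of z = x + y: reuse z's storage when it exists
    # and has the right length; otherwise (re)allocate.
    if len(x) != len(y):
        raise IndexError("ArrayBoundsError")
    if z is None or len(z) != len(x):
        z = [0] * len(x)          # allocation only when unavoidable
    for i in range(len(x)):
        z[i] = x[i] + y[i]        # write results in place
    return z

buf = [0, 0, 0]
out = add_into(buf, [1, 2, 3], [4, 5, 6])
assert out is buf  # storage was reused: no allocation in the hot path
```

Run in a loop with a correctly sized buffer, this never allocates, which is the performance case the post is arguing for.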
Feb 17 2005
xs0 wrote: <snip>that's why I don't like a[]=b+c; a[] implies that a is already allocated and the right size (you even state that yourself below).. With what you're suggesting, any code that cares about performance will look like:Code that cares about performance would probably be written with at least some preconception of whether the array is going to be the same size.while (...) { if (a==null || a.length!=x.length) a=new int[x.length]; a[]=x+y; ... } And I'd like it to simply look like while (...) { a=x+y; ... }How about this? while (...) { a.length = x.length; a[] = x + y; } (Assuming that the contents of ... have some side effect on the length of x - otherwise how about making it even more efficient by putting the length assignment outside the loop?)while having the same thing done performance..Exactly. So it seems silly to convolute the language semantics for this minority of cases. <snip>I see two motives for assigning in-place: - to keep the data in sync when there are multiple references to it. But this already isn't going to work if you need to keep resizing the array, unless you wrap the array in a pointer or class to make it work.I don't think most cases will be working with arrays of multiple sizes;True, but when the arrays don't arbitrarily change (like I said, IMO that is the majority), there's a chance to save a lot of overhead (with small arrays, memory allocation will easily take more time than calculations, so what's the point of having compiler support anyway then?).Exactly. Then one would use the [].Where a is a dynamic array, a = ... by definition does a reference assignment. <snip>I'm not sure what you mean by this. 
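The "size once, then assign in place" pattern discussed above can be sketched like this (Python lists standing in for D dynamic arrays; the function name is illustrative):

```python
# The destination is sized once outside the hot loop, then refilled
# in place on every iteration, so the loop body allocates nothing.
def fill_sum(dest, x, y):
    for i in range(len(x)):
        dest[i] = x[i] + y[i]     # corresponds to a[] = x + y

x = [1, 2, 3]
y = [10, 20, 30]
a = [0] * len(x)                  # corresponds to a.length = x.length
for _ in range(5):                # hot loop reuses a's storage
    fill_sum(a, x, y)
assert a == [11, 22, 33]
```

This is the manual version of the reuse being debated: the programmer hoists the allocation, rather than the compiler or the assignment semantics doing it implicitly.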
My spec was written with three basic intentions: (a) to be well-defined (b) to keep D consistent within itself (c) to be backward compatible with the current assignment semanticsI don't see how either of these is broken by reusing the result array even when [] is not used.As for consistency, consider: SomeClass a=new ...; SomeClass b=a; a+=5; // now, a is still the same object and a===bYes, iff the class defines opAddAssign. (At least from what I can make of the spec.)---------- int[] a=new ...; int[] b=a; a+=5; // with my suggestion, a is still the same object and a===b // according to your spec a!==b OK, granted, reuse also breaks some consistency, but so does reallocating always.Hmm....Just think about it. Suppose z = x; does a reference assignment, but z = x + y; does an in-place modification. Then you are changing the semantics of the = operator by changing the right operand. And you're not even changing the type of the right operand, only its form. This is a cause of confusion that is bound to lead to bugs, never mind trying to generalise the semantics of = applied to arbitrary expressions.Do you actually understand how my spec works?Completely :)Let's look at some examples mechanically.Current D (semantics will be preserved):These all stay the same, even if arrays are reused.Because this is what the programmer asked for. <snip>With array ops: z = x + y; create an array of x[i] + y[i] assign this new array by reference to zI really fail to see why a new array is necessary here, if it already exists. To me it looks the same, as if:So, it is better to reuse the array even when the programmer forgots to do so (or is too lazy to write the checks), as most probably, it was still the effect he wanted. If it is not, he can still easily create new arrays explicitly just by typing .dup, which is far less characters than those nulls and lengths checking :)<snip> As has been said plenty of times, a program is only as good as the person who wrote it. 
And so if the programmer is lazy, he/she/it shouldn't be too surprised if the program is lazy. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 17 2005
How about this? while (...) { a.length = x.length; a[] = x + y; }Again, why would this not be the default case (i.e. done by the compiler?) Also consider that if a is int[][][], you'll also need to allocate all the int[][]'s and really really quickly that a=x+y becomes totally invisible, all because you didn't want to change your spec (which is completely fine, aside from the issue we're arguing over). And I'd guess that only the outer array gets reused with a[]=x+y? So what would you have the user type then? a[][][]=x+y? That's probably not even legal.. And what if a is of type T as in template(T)?Exactly. So it seems silly to convolute the language semantics for this minority of cases.There's nothing convoluted.. both a= and a[]= are defined only for things that are currently supported. x+y on arrays is not defined currently, so you can define = for array ops as you want. = doesn't do the same for structs and classes either, and no one seems to mind (even though they're really similar otherwise).<snip>you tend to snip my best arguments, methinks :PWhere a is a dynamic array, a = ... by definition does a reference assignment.I don't agree there's any such definition, but let's not go there. What if it is a static array, you'll force the users to type [] every time? When they (this time really) obviously want it in-place. Or would you have it work differently in each case?Yes, iff the class defines opAddAssign. (At least from what I can make of the spec.)obviously.. the point was, however, that a and b still point to the same object. a+=b acts like in-place for classes, is in-place for primitives, so why would it be different for arrays?Hmm what?int[] a=new ...; int[] b=a; a+=5; // with my suggestion, a is still the same object and a===b // according to your spec a!==b <snip>Hmm....Just think about it. 
Suppose z = x; does a reference assignment, but z = x + y; does an in-place modification.Maybe, but z+=x certainly indicates in-place modification, so it would seem you can't have it totally consistent anyhow.. If that is the case I'd certainly prefer in-place whenever possible.<snip>The semantics of the = operator are not always the same even without array ops. x+y is also not defined to be anything in particular (currently). Why do you keep insisting that they are? Considering that your spec itself says that the effect is the same as if you wrote the indicated code, you can just change the expansion in the spec to not reallocate when not necessary and there you have it.No, he didn't. I can't see from that statement that the programmer wishes to allocate a new z. _You_ say that is what he asked for; all I can see is he wanted z to contain the sum and if he really wanted a new array, he'd write (x+y).dup. It's the same with slices - sometimes you get a completely new memory space, and sometimes you don't, and the default is what's most efficient ; how come it doesn't bother anyone that setting subslice.length can have the effect of losing the reference to original memory? And that's exactly the same thing - you get to reuse memory when possible, and other cases are handled as well as they can be automatically. When you're certain you want a stand-alone copy, you use .dup, and that's it.Because this is what the programmer asked for.z = x + y;I really fail to see why a new array is necessary here, if it already exists. To me it looks the same, as if:Well, considering that everyone is a newbie at first, and that D does need more users, saying "they suck so why deal with them" is not helping anything.. 
class Sumator(T) {
    T sum;
    void set(T val) { sum = val; }
    void add(T val) { sum += val; }
    T get() { return sum; }
}

Newbie or not, Sumator!(int) is fine, while Sumator!(int[]) totally sucks (unless, of course, you prefer code that runs really slowly) and you can't even make it better, unless you specialize it for all array dimensions in which you use it, which makes using a template pointless.. For what reason? And this is just a trivial case...

To sum up, you'd have stuff defined like this:

a = b + c;      // create a new array (really bad performance-wise,
                // but still desired behavior like 1% of time)
a[] = b + c;    // fail if a is null/wrong size, otherwise work in-place
                // useful when the operands are really known to be the same
                // size all the time (that would be like 20% of time)
a = (b+c).dup;  // same as a=b+c (and as useless)

And I'd have it defined like this:

a = b + c;      // work in-place when possible (useful like 80% of the time)
a[] = b + c;    // same as above
a = (b+c).dup;  // same as above

So you'd have two syntaxes for the same crappy-most-of-the-time behavior, and you'd have the programmer write additional code for the most common case (for handling arbitrary lengths with memory reuse). Would you agree that is a good summary?

xs0

<snip on newbies><snip on newbies suck>
Feb 17 2005
xs0 wrote:At the moment, a[][][] is just the same as a[], since it's just slicing the outermost dimension again and again. Well done, you do have a point there. Maybe someone else has an idea....How about this? while (...) { a.length = x.length; a[] = x + y; }Again, why would this not be the default case (i.e. done by the compiler?) Also consider that if a is int[][][], you'll also need to allocate all the int[][]'s and really really quickly that a=x+y becomes totally invisible, all because you didn't want to change your spec (which is completely fine, aside from the issue we're arguing over). And I'd guess that only the outer array gets reused with a[]=x+y? So what would you have the user type then? a[][][]=x+y? That's probably not even legal.. And what if a is of type T as in template(T)?If you simply mean that structs are contained by value and classes by reference, then the semantics _are_ the same: copy what is represented by the right operand into the piece of memory represented by the left operand, whether this is a value or a reference.Exactly. So it seems silly to convolute the language semantics for this minority of cases.There's nothing convoluted.. both a= and a[]= are defined only for things that are currently supported. x+y on arrays is not defined currently, so you can define = for array ops as you want. = doesn't do the same for structs and classes either, and no one seems to mind (even though they're really similar otherwise).I wouldn't consider it at all obvious that someone who types z = x + y; really meant z[] = x + y; If I wrote the former, I would consider it obvious that it is what I meant. <snip><snip>you tend to snip my best arguments, methinks :PWhere a is a dynamic array, a = ... by definition does a reference assignment.I don't agree there's any such definition, but let's not go there. What if it is a static array, you'll force the users to type [] every time? When they (this time really) obviously want it in-place. 
Or would you have it work differently in each case?a+=b acts like in-place for classes,Like I said, only if the class defines opAddAssign. Otherwise, it's equivalent to a = a + b.is in-place for primitives,Because primitives are contained by value.so why would it be different for arrays?My thought is that it would be just like a class on which opAddAssign isn't defined.Good question....Hmm what?int[] a=new ...; int[] b=a; a+=5; // with my suggestion, a is still the same object and a===b // according to your spec a!==b <snip>Hmm....Again, there's always the option of using z[] += x.Just think about it. Suppose z = x; does a reference assignment, but z = x + y; does an in-place modification.Maybe, but z+=x certainly indicates in-place modification, so it would seem you can't have it totally consistent anyhow.. If that is the case I'd certainly prefer in-place whenever possible.It already does, in the second paragraph of the first heading. Except that it appears your use of the word "necessary" might be inconsistent with mine.<snip>The semantics of the = operator are not always the same even without array ops. x+y is also not defined to be anything in particular (currently). Why do you keep insisting that they are? Considering that your spec itself says that the effect is the same as if you wrote the indicated code, you can just change the expansion in the spec to not reallocate when not necessary and there you have it.The basic request is that any other references to the data pointed to by z before the assignment will still be pointing to the same data as they were. <snip>No, he didn't. I can't see from that statement that the programmer wishes to allocate a new z.Because this is what the programmer asked for.z = x + y;I really fail to see why a new array is necessary here, if it already exists. 
To me it looks the same, as if:To sum up, you'd have stuff defined like this: a = b + c; // create a new array (really bad performance-wise, // but still desired behavior like 1% of time) a[] = b + c; // fail if a is null/wrong size, otherwise work in-place // useful when the operands are really known to be the same // size all the time (that would be like 20% of time) a = (b+c).dup; // same as a=b+c (and as useless) And I'd have it defined like this: a = b + c; // work in-place when possible (useful like 80% of the time) a[] = b + c; // same as above a = (b+c).dup; // same as above So you'd have two syntaxes for the same crappy-most-of-the-time behavior, and you'd have the programmer write additional code for the most common case (for handling arbitrary lengths with memory reuse). Would you agree that is a good summary?Good at summarising, yes. Good at convincing me, I'm not sure. Maybe we should disallow a = b + c; altogether, instead requiring the coder to specify explicitly a[] = b + c; or a = (b + c).dup; Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 18 2005
At the moment, a[][][] is just the same as a[], since it's just slicing the outermost dimension again and again. Well done, you do have a point there. Maybe someone else has an idea....

Wow .. You actually agreed with something :)

If you simply mean that structs are contained by value and classes by reference, then the semantics _are_ the same: copy what is represented by the right operand into the piece of memory represented by the left operand, whether this is a value or a reference.

Well, I guess we understand semantics differently.. This is how I see it for the = operator:

int a=b;          // copy value of b to a
struct a=b;       // copy contents of b to a (might seem the same as
                  // the previous case, but a struct is a composite
                  // and furthermore is different from a class)
class a=b;        // make a point to b
array a=b;        // make a point to b
array a[]=(int)b; // set all elements of a to b
array a[]=b;      // copy data from b to a
array a=b[c..d];  // create a new slice with same contents as b[c..d]
                  // (memory is reused when possible, so watch out)
array a=b.dup;    // create a copy of b and make a point to it

These explanations are not all equal, so you can't say that the semantics are the same. IMHO, of course :)

I wouldn't consider it at all obvious that someone who types z = x + y; really meant z[] = x + y; If I wrote the former, I would consider it obvious that it is what I meant.

If you knew that all the vars are arrays, that z=x+y means that a new array gets created, and you wrote that, you would indeed consider it obvious. It's not obvious in itself; if x, y and z are ints, you don't expect a new int to get allocated.. You also don't expect (*a)=2+3 to change the pointer, yet it is just as much a reference to an int as the above z is a reference to an array..

Like I said, only if the class defines opAddAssign.
Otherwise, it's equivalent to a = a + b.

But regardless of whether opAddAssign is defined, when you see a+=b, you interpret it as in-place, even though behind the scenes something else happens. The difference is also that you write (or not) opAddAssign yourself, so you can change what is going on, which is not the case for arrays.

My thought is that it would be just like a class on which opAddAssign isn't defined.

But why would it not be like a class on which opAddAssign _is_ defined? At this point it can go either way, and I don't see any real benefits in your case, and I do see real benefits in my case.

Again, there's always the option of using z[] += x.

If you know in advance z is an array (and possibly even an array of dimension exactly 1). What would you write in all other cases?

It already does, in the second paragraph of the first heading. Except that it appears your use of the word "necessary" might be inconsistent with mine.

Well, the whole spec could just as easily be defined only with loop expansions of such expressions, and there'd be no need to create new arrays and then optimize them away (isn't that kind of pointless?). (a+b*c).dup can be defined as a special case that actually creates a new array, and there you have it. The only issue is with what checks are done, where I again suggest:

a=b+c (or any other expression with arrays) means that a is made to be a valid array to hold the result of b+c (the "it just works in 99% of cases" principle); if it already is, nothing is done; if it's null, it gets allocated; if it's the wrong size, its .length is adjusted. This is really useful functionality and saves a lot of typing, errors, CPU time and whatnot, even though it doesn't change a as a reference, only a's length/values/nullness..

a[]=b+c, on the other hand, means that a is already expected to be the right size, and it is an error if it is not (like it currently is with a[]=b or a[]=b[]).
Just like a=b+c except an added assert(a != null && a.length == b.length).

Why is this a basic request? Slices also have perfectly useful behavior, but it's not required that they point either to the same memory or to different memory; it works in a way that's efficient (memory gets reused) yet still user-friendly (if reuse is not possible, it automatically gets reallocated). Why not have the same with array ops? Automatically being the keyword - so it just works.

No, he didn't. I can't see from that statement that the programmer wishes to allocate a new z.

The basic request is that any other references to the data pointed to by z before the assignment will still be pointing to the same data as they were.

Maybe we should disallow a = b + c; altogether, instead requiring the coder to specify explicitly a[] = b + c; or a = (b + c).dup;

Well, that's an option, but I don't think it's a good one, unless we get a new operator that does what I'm trying to suggest (automatic handling of stuff that really can be automatically handled). Like <== or something, and it can be equal to = in all cases but arrays (so templates are possible). With templates you can then use = if you don't expect arrays (as in you feel that they will not work), and <== if you do expect/support them.

xs0
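One possible reading of the two forms being debated, sketched in Python (lists stand in for D arrays; `strict_assign` mirrors a[] = b + c, and `auto_assign` mirrors the suggested a = b + c with automatic allocation/resizing — both names are illustrative):

```python
# strict form: destination must already exist at the right length
def strict_assign(a, b, c):
    assert a is not None and len(a) == len(b) == len(c)
    for i in range(len(b)):
        a[i] = b[i] + c[i]

# suggested form: destination is allocated/resized automatically
def auto_assign(a, b, c):
    assert len(b) == len(c)
    if a is None or len(a) != len(b):
        a = [0] * len(b)          # allocated or resized as needed
    for i in range(len(b)):
        a[i] = b[i] + c[i]
    return a

a = auto_assign(None, [1, 2], [3, 4])   # a was null: gets allocated
assert a == [4, 6]
strict_assign(a, [5, 6], [7, 8])        # lengths already match: in place
assert a == [12, 14]
```

The difference is only in the precondition: the strict form turns a size mismatch into an error, the automatic form turns it into a reallocation.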
Feb 18 2005
xs0 wrote: <snip>Of course not, here it's obvious that I'm changing the contents of an already allocated piece of memory. Just like it is when I use [] to force in-place assignment. <snip>I wouldn't consider it at all obvious that someone who types z = x + y; really meant z[] = x + y; If I wrote the former, I would consider it obvious that it is what I meant.If you knew that all the vars are arrays, that z=x+y means that a new array gets created and you wrote that, you would indeed consider it obvious. It's not obvious in itself; if x, y and z are ints, you don't expect a new int to get allocated.. You also don't expect (*a)=2+3 to change the pointer, yet it is just as reference to int, as the above z is reference to array..Again, you have a point or two.My thought is that it would be just like a class on which opAddAssign isn't defined.But why would it not be like a class on which opAddAssign _is_ defined? At this point it can go either way and I don't see any real benefits in your case and I do see real benefits in my case.Again, there's always the option of using z[] += x.If you know in advance z is an array (and possibly even an array of dimension exactly 1). What would you write in all other cases?This would seem a step back towards the poorly-defined old spec. But I'm not sure. Moreover, we should certainly be able to pass an array expression to a function that expects an array. <snip>It already does, in the second paragraph of the first heading. Except that it appears your use of the word "necessary" might be inconsistent with mine.Well, the whole spec could just as easily be defined only with loop expansions of such expressions and there'd be no need to create new arrays and then optimize them away (isn't that kind of pointless?).0Yes, it should reuse memory where possible. But only where the compiler can determine that this doesn't affect the behaviour of the program - and hence again this would be part of optimisation rather than language spec. 
After all, D is designed with the optimisation technology of modern compilers in mind.The basic request is that any other references to the data pointed to by z before the assignment will still be pointing to the same data as they were.Why is this a basic request? Slices also have perfectly useful behavior, but it's not required that either they point to the same memory or to different memory; it works in a way that's efficient (memory gets reused) yet still user-friendly (if reuse is not possible, it automatically gets reallocated). Why not have the same with array ops? Automatically being the keyword - so it just works.Yes, generic programming implications are something else to consider. Which option are you suggesting that <== should be equivalent to? The "don't expect arrays" case could also cover instances that don't do arithmetic on the parameter type at all, and as such will work implicitly. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.Maybe we should disallow a = b + c; altogether, instead requiring the coder to specify explicitly a[] = b + c; or a = (b + c).dup;Well, that's an option, but I don't think it's a good one, unless we get a new operator that does what I'm trying to suggest (automatic handling of stuff that really can be automatically handled). Like <== or something, and it can be equal to = in all cases but arrays (so templates are possible). With templates you can then use = if you don't expect arrays (as in you feel that they will not work), and <== if you do expect/support them.
Feb 18 2005
Of course not, here it's obvious that I'm changing the contents of an already allocated piece of memory. Just like it is when I use [] to force in-place assignment.Well, I should've written {{ int a; int *b=&a; a=2+3; }} and b still points to a.. But anyhow, we now seem to agree that there are issues, so let's leave = alone :)Yes, it should reuse memory where possible. But only where the compiler can determine that this doesn't affect the behaviour of the program - and hence again this would be part of optimisation rather than language spec.Well, but this is exactly the problem - the compiler can't determine that (except for inner operations, as in + and * in (a+b*c) and possibly for local variables that are created within the function and don't get passed as parameters and don't get sliced or casted to pointers or whatnot..) But that totally ruins reuse between function calls, for example, and also a bunch of other cases.. (even if no other method touches an object var and it is protected from the outside world, you can still extend the class and write a method that does, for example). I don't believe it is a problem to reuse even when it could affect the behavior, if it is documented. First, in many cases it will not really affect the behavior, even if the compiler can't determine that. Second, in a lot of other cases, this will actually be desired. In the few cases it will not be desired, the programmer will know to take care of that, as it can be the first thing written in the description of <== operator and assuming that a=b+c is made illegal for arrays, all cases are covered.. And, for the umpteenth time, slices also reuse memory without causing major problems :)Yes, generic programming implications are something else to consider. Which option are you suggesting that <== should be equivalent to?a <== expr means variable a gets allocated (if null) or resized (if non-null but wrong size) so it matches expr, including all potential dimensions. 
This also means that it will be useful as a replacement of a[]=b (as in a<==b), when you'll want lazy allocation and/or automatic handling of different sizes.. For non-array types, it behaves exactly like =.The "don't expect arrays" case could also cover instances that don't do arithmetic on the parameter type at all, and as such will work implicitly.Well, if all you do is a=b, it will still work with arrays, although if that's really all you do, it shouldn't be an issue? Of course, there's also op=. Which one should it be ( op= or op<== ), assuming not both get supported (which will probably not happen.. think about <=<== or >>><== or !<>=<== :)? I'd vote for <==, as op= does mean in-place in the case where it is normally used (i.e. primitives). To further improve templating it might also make sense to extend .dup to value-based types (to handle cases where you do want copies made). Unlike most other properties, it does make sense on all types - it says I want a copy. On primitive types and structs, .dup would just be their value itself (and that is their copy already). Classes can also implement .dup to return their clone. This only leaves a slight inefficiency when you do something like {{ a=(b+c*d).dup }} with classes, but when this becomes an issue, you can handle all class types with a single template specialization for Object. With all that, .minIdx, .maxIdx, .sum and .count and decent compiler support, some really neat things could be done, effective both in amount of code needed and resulting speed. xs0
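The proposed generalized .dup can be sketched like this (Python; `dup` is an illustrative free function standing in for the property, and lists stand in for D arrays):

```python
# Primitives are their own copy; containers return an independent deep
# copy, so generic code can always write dup(x) when it wants a
# stand-alone value, whatever the type parameter turns out to be.
def dup(x):
    if isinstance(x, list):
        return [dup(v) for v in x]    # deep-copy nested "arrays"
    return x                          # value types copy themselves

a = [[1, 2], [3, 4]]
b = dup(a)
b[0][0] = 99
assert a[0][0] == 1                   # a is unaffected by changes to b
assert dup(5) == 5                    # .dup on a primitive is its value
```

This is the property that makes the template argument transparent: the same generic code works whether T is int, int[], or int[][].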
Feb 18 2005
xs0 wrote:....Of course not, here it's obvious that I'm changing the contents of an already allocated piece of memory. Just like it is when I use [] to force in-place assignment.Well, I should've written {{ int a; int *b=&a; a=2+3; }} and b still points to a.. But anyhow, we now seem to agree that there are issues, so let's leave = alone :)Well, if all you do is a=b, it will still work with arrays, although if that's really all you do, it shouldn't be an issue? Of course, there's also op=. Which one should it be ( op= or op<== ), assuming not both get supported (which will probably not happen.. think about <=<== or >>><== or !<>=<== :)? I'd vote for <==, as op= does mean in-place in the case where it is normally used (i.e. primitives). To further improve templating it might also make sense to extend .dup to value-based types (to handle cases where you do want copies made).What if we had an assignment operator that forces copy? That would certainly stand out better in the code than a .dup hidden somewhere in a long expression. (I know that mentioning this probably causes a riot (if anyone else is actually still reading this thread)), so the case would have to be really strong. Is it?Unlike most other properties, it does make sense on all types - it says I want a copy. On primitive types and structs, .dup would just be their value itself (and that is their copy already). Classes can also implement .dup to return their clone. This only leaves a slight inefficiency when you do something like {{ a=(b+c*d).dup }} with classes, but when this becomes an issue, you can handle all class types with a single template specialization for Object. With all that, .minIdx, .maxIdx, .sum and .count and decent compiler support, some really neat things could be done, effective both in amount of code needed and resulting speed. xs0
Feb 21 2005
xs0 wrote: <snip>And I'd guess that only the outer array gets reused with a[]=x+y? So what would you have the user type then? a[][][]=x+y? That's probably not even legal.. And what if a is of type T as in template(T)?Just thinking about this, at the moment we don't seem to have a means of deep in-place copying of nested dynamic arrays. If we're going to support deep in-place assignment when doing arithmetic, we should have a corresponding means of simply copying an array in this way. And so I guess the syntax would correspond. <snip>I don't agree there's any such definition, but let's not go there. What if it is a static array, you'll force the users to type [] every time? When they (this time really) obviously want it in-place. Or would you have it work differently in each case?<snip> Oops, I kind of slipped in my last reply. With static arrays, always assigning by value. I didn't get round to covering static arrays in my proposal, but figured it would be straightforward. Indeed, when assigning to a static array, obviously it would be in place, since static arrays are by value. And if you have int[6][] x = new int[6][42]; x[] = ...; then the assignment would be fully in place. But at least one question remains: how should assignment of an expression on static arrays to a dynamic array be handled? (Is this a reference assignment at the moment? I should expect so.) Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 18 2005
Stewart Gordon wrote:xs0 wrote: <snip><snip> Just thinking about it again. Is it really worth having deep in-place assignment, whether doing arithmetic or simple copying? Or might it just as well wait until we get true multidimensional arrays? Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.And I'd guess that only the outer array gets reused with a[]=x+y? So what would you have the user type then? a[][][]=x+y? That's probably not even legal.. And what if a is of type T as in template(T)?Just thinking about this, at the moment we don't seem to have a means of deep in-place copying of nested dynamic arrays. If we're going to support deep in-place assignment when doing arithmetic, we should have a corresponding means of simply copying an array in this way. And so I guess the syntax would correspond.
Feb 21 2005
On Thu, 17 Feb 2005 17:29:44 +0100, xs0 <xs0 xs0.com> wrote:Doesn't array bounds checking handle this? With bounds checking on, it will be handled at runtime with an exception. With it off, it will be fast as it has no checking. Reganthat would, of course, make no sense :) that's why I don't like a[]=b+c; a[] implies that a is already allocated and the right size (you even state that yourself below).. With what you're suggesting, any code that cares about performance will look like: while (...) { if (a==null || a.length!=x.length) a=new int[x.length]; a[]=x+y; ... } And I'd like it to simply look like while (...) { a=x+y; ... } while having the same thing done performance..but a[]=b doesn't look like a can be [re]allocated (and it can be, if it is the wrong size or null). anyhow, it's not that important :)What would be the sense of assigning in-place if the destination array is going to be either the wrong size or null?
Feb 17 2005
One more thing I just thought of - there's quite a difference between these: a = b; // copy _reference_ a = -b; // copy _contents_ and negate Which might not be good.. However, using slice syntax doesn't quite look that good: a[] = b[] + c[] * d[]; // abcd are somewhat lost in there Perhaps a slightly modified syntax could be used? Something like a <-- b + c * d; I'm sure not many people will like that :) It does clearly differentiate between the usual meaning of = and the looping on elements done with arrays, though.. Any thoughts? xs0
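Python lists show the same asymmetry noted above, so it can be demonstrated directly (lists standing in for D dynamic arrays):

```python
# Plain assignment copies the reference; an elementwise operation has
# to produce (or fill) storage of its own.
b = [1, 2, 3]
a = b                        # "a = b": copy reference, same object
assert a is b
a = [-v for v in b]          # "a = -b": copy contents and negate
assert a is not b
assert a == [-1, -2, -3]
assert b == [1, 2, 3]        # b is untouched
```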
Feb 15 2005
On Tue, 15 Feb 2005 19:15:45 +0100, xs0 <xs0 xs0.com> wrote:I think they should be allowed, I don't think they look weird, and I think they're useful eg. int[] a; int[] b; int[] c; b[] = a[]--; //assigns b[x] = a[x], then a[x] = a[x]-1; b[] = c[] + a[]--; //assigns b[x] = c[x] + a[x], then a[x] = a[x]-1; Basically it's done as 2 operations, same as: int i; int j; j = i--; ReganOpen questions ~~~~~~~~~~~~~~ Should postincrement and postdecrement be allowed? How should they be handled?I don't think so.. It would look weird and you can achieve the same by typing a+=1..
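The two-operation reading above can be spelled out as loops (Python lists in place of D arrays):

```python
# b[] = a[]-- : copy each element first, then decrement the source
# in place, exactly as the comments above describe.
a = [3, 4, 5]
b = [0, 0, 0]
for i in range(len(a)):
    b[i] = a[i]              # b[x] = a[x]
    a[i] = a[i] - 1          # then a[x] = a[x] - 1
assert b == [3, 4, 5]
assert a == [2, 3, 4]
```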
Feb 15 2005
Regan Heath wrote:On Tue, 15 Feb 2005 19:15:45 +0100, xs0 <xs0 xs0.com> wrote:<snip>Open questions ~~~~~~~~~~~~~~ Should postincrement and postdecrement be allowed? How should they be handled?I think they should be allowed, I don't think they look weird, and I think they're useful eg. int[] a; int[] b; int[] c; b[] = a[]--; //assigns b[x] = a[x], then a[x] = a[x]-1; b[] = c[] + a[]--; //assigns b[x] = c[x] + a[x], then a[x] = a[x]-1;<snip> If you've got the []s to indicate in-place assignment, then this makes sense. But what about b = a--; ? Two possible interpretations: b = a; a = a.dup; foreach (inout x; a) x--; or b = a.dup; foreach (inout x; a) x--; Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 16 2005
On Wed, 16 Feb 2005 12:58:46 +0000, Stewart Gordon <smjg_1998 yahoo.com> wrote:Regan Heath wrote:Which is exactly why I had them :)On Tue, 15 Feb 2005 19:15:45 +0100, xs0 <xs0 xs0.com> wrote:<snip>Open questions ~~~~~~~~~~~~~~ Should postincrement and postdecrement be allowed? How should they be handled?I think they should be allowed, I don't think they look weird, and I think they're useful eg. int[] a; int[] b; int[] c; b[] = a[]--; //assigns b[x] = a[x], then a[x] = a[x]-1; b[] = c[] + a[]--; //assigns b[x] = c[x] + a[x], then a[x] = a[x]-1;<snip> If you've got the []s to indicate in-place assignment, then this makes sense.But what about b = a--; ?I probably should have thought about those too...Two possible interpretations: b = a; a = a.dup; foreach (inout x; a) x--;Not this, because "b = a" is a copy i.e. "b = a.dup"or b = a.dup; foreach (inout x; a) x--;This is probably the most sensible. or Error: postdecrement is not allowed on 'a' did you mean 'a[]' Regan
Feb 16 2005
Regan Heath wrote:On Wed, 16 Feb 2005 12:58:46 +0000, Stewart Gordon <smjg_1998 yahoo.com> wrote:<snip>What do you mean? b = a on dynamic arrays is, by definition, a reference assignment.b = a; a = a.dup; foreach (inout x; a) x--;Not this, because "b = a" is a copy i.e. "b = a.dup"<snip> And it's also consistent with what can be done with opPostInc and opPostDec on a class. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.or b = a.dup; foreach (inout x; a) x--;This is probably the most sensible.
Feb 17 2005
On Thu, 17 Feb 2005 11:10:13 +0000, Stewart Gordon <smjg_1998 yahoo.com> wrote:Regan Heath wrote:Gargh, you're totally right. If "b = a;" is: - b reference a If "a--;" is: - create a new array - set new[x] = a[x] - 1; Then "b = a--;" is: - b reference a - create new array - set new[x] = a[x] - 1 - a reference new which is exactly what you had above :)On Wed, 16 Feb 2005 12:58:46 +0000, Stewart Gordon <smjg_1998 yahoo.com> wrote:<snip>What do you mean? b = a on dynamic arrays is, by definition, a reference assignment.b = a; a = a.dup; foreach (inout x; a) x--;Not this, because "b = a" is a copy i.e. "b = a.dup"At the time I was thinking it was consistent with: int i = 5; int j; j = i--; The result in both cases is however the same, b contains what a did, and a contains a[x]-1, eg.<snip> And it's also consistent with what can be done with opPostInc and opPostDec on a class.or b = a.dup; foreach (inout x; a) x--;This is probably the most sensible.- b reference a - create new array - new[x] = a[x] - 1 - a reference new ----b = a; a = a.dup; foreach (inout x; a) x--;- b reference a.dup - a[x] = a[x] - 1 ---- I think the top one is more consistent with D's reference semantics, the bottom one is consistent with D's value type semantics i.e. it's what happens with ints. So I reckon the top one is the 'correct' one. Reganb = a.dup; foreach (inout x; a) x--;
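Spelled out in the spec's own for-loop style, the reference-consistent reading of b = a--; would lower to something like this (a sketch of the proposed semantics, not anything a current compiler does):

```d
int[] a = new int[3];
a[0] = 1; a[1] = 2; a[2] = 3;
int[] b;

// b = a--;
b = a;                           // b refers to the original array
int[] tmp = new int[a.length];
for (int i = 0; i < a.length; i++)
    tmp[i] = a[i] - 1;           // build the decremented copy
a = tmp;                         // a now refers to the copy
// b still sees [1, 2, 3]; a sees [0, 1, 2]
```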
Feb 17 2005
On Fri, 18 Feb 2005 11:55:09 +1300, Regan Heath <regan netwin.co.nz> wrote:On Thu, 17 Feb 2005 11:10:13 +0000, Stewart Gordon <smjg_1998 yahoo.com> wrote:forgot - a reference newRegan Heath wrote:Gargh, you're totally right. If "b = a;" is: - b reference a If "a--;" is: - create a new array - set new[x] = a[x] - 1;On Wed, 16 Feb 2005 12:58:46 +0000, Stewart Gordon <smjg_1998 yahoo.com> wrote:<snip>What do you mean? b = a on dynamic arrays is, by definition, a reference assignment.b = a; a = a.dup; foreach (inout x; a) x--;Not this, because "b = a" is a copy i.e. "b = a.dup"Then "b = a--;" is: - b reference a - create new array - set new[x] = a[x] - 1 - a reference new which is exactly what you had above :)At the time I was thinking it was consistent with: int i = 5; int j; j = i--; The result in both cases is however the same, b contains what a did, and a contains a[x]-1, eg.<snip> And it's also consistent with what can be done with opPostInc and opPostDec on a class.or b = a.dup; foreach (inout x; a) x--;This is probably the most sensible.- b reference a - create new array - new[x] = a[x] - 1 - a reference new ----b = a; a = a.dup; foreach (inout x; a) x--;- b reference a.dup - a[x] = a[x] - 1 ---- I think the top one is more consistent with D's reference semantics, the bottom one is consistent with D's value type semantics i.e. it's what happens with ints. So I reckon the top one is the 'correct' one. Reganb = a.dup; foreach (inout x; a) x--;
Feb 17 2005
In article <cussnt$fe8$1 digitaldaemon.com>, Stewart Gordon says...Should we generalise the concept to function calls? If so, I guess that overload resolution would work in much the same way as for operations on user-defined types.That makes sense to me. Just allow arrays of function pointers, delegates and anything with opCall() defined to be callable en masse. At a minimum, it would make event-based (1 to n messaging) programming a snap. I would also think that it could be extended to object members and methods as well?interface Foo{ void foo(); int bar(); }Foo[] test; test.foo(); // calls .foo() for each object int[] result = test.bar(); // calls .bar() for each object, result is an array.Seems exotic at first, but it does lend to D's mantra of being 'context free'.If we do allow it on function calls, should we allow it to work on functions of three or more parameters? In this case, the highest-dimension argument would be reduced to the dimension of the second highest, and then these two reduced together to match the third highest, and so on.I'm not sure I follow you here. Sounds like you're talking about using the function syntax for mapping to the array dimension space... is this correct?Of course, these questions raise one more: how easy or hard would these ideas be to implement?Not hard, but it would require a certain familiarity with the frontend that only Walter and an intrepid few programmers share. There are also some holes that would have to be handled if one adhered to just this spec by itself. It would be much easier to implement the above as templates first (as a proof-of-concept) simply because each of your examples is merely a transform based on type and dimension. It could help with refining the spec. One question: What about associative arrays? - EricAnderton at yahoo
Feb 15 2005
That, and the events, looks very nice in my opinion. Then again, would it conflict with the current array overloading? int[] bar(Foo[] test); That's the only flaw I can see in it at this point. But it would make arrays of objects much more useful imho.Foo[] test; test.foo(); // calls .foo() for each object int[] result = test.bar(); // calls .bar() for each object, result is an array.Seems exotic at first, but it does lend to D's mantra of being 'context free'.One question: What about associative arrays?Well, if the two had matching sets of keys it might be possible: int[int] x, y; x = -y; foreach (int /* or char, etc. */ k; x.keys) x[k] = -y[k]; The question is, does this have to check that the keys all exist in both arrays, too? Meaning, x.keys.length == y.keys.length and that none of the evaluations cause an exception? Seems rather dangerous, to me... -[Unknown]
Feb 15 2005
pragma wrote: <snip>That makes sense to me. Just allow arrays of function pointers, delegates and anything with opCall() defined to be callable en masse. At a minimum, it would make event-based (1 to n messaging) programming a snap. I would also think that it could be extended to object members and methods as well?That would lead to such troubles as class Foo { int length; } Foo[] test; and then is test.length the length of the array, or an array of lengths of the Foo objects? <snip>interface Foo{ void foo(); int bar(); }Foo[] test; test.foo(); // calls .foo() for each object int[] result = test.bar(); // calls .bar() for each object, result is an array.I'm not sure I follow you here. Sounds like you're talking about using the function syntax for mapping to the array dimension space... is this correct?I'm not sure I follow you either. You seem to be talking about the ability to call arrays of functions. I was actually talking about the ability to pass array arguments to functions defined with scalar parameters. <snip>One question: What about associative arrays?Good question. I guess we could extend the concept to AAs. An operation on an AA would return a new AA in which the keys remain the same and the operation is applied to the values. Trying to do it on binary ops, let alone functions of three or more parameters, would require that the two arrays have the same set of keys ... but is this likely to happen in the real world? Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
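For what it's worth, a binary operation on AAs could be sketched as the following lowering, assuming matching key sets are required (the z = x + y form on AAs is entirely hypothetical):

```d
int[int] x, y, z;

// z = x + y;  could be taken as:
z = null;                  // start with an empty result AA
foreach (int k, int v; x)
{
    assert(k in y);        // key sets must match, or some error is raised
    z[k] = v + y[k];       // apply the operator to corresponding values
}
```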
Feb 16 2005
In article <cuvgv4$4fd$1 digitaldaemon.com>, Stewart Gordon says...pragma wrote:That is a problem. On the one hand, this is obviously a potential source of error, and could be flagged down by the compiler quite easily. On the other, it could be handled via some precedence given to array properties over array expression resolution (not a very good idea IMO). Perhaps an additional array pseudo-property could be added to avoid such conflicts? How about 'each' or 'every'?That would lead to such troubles as class Foo { int length; } Foo[] test; and then is test.length the length of the array, or an array of lengths of the Foo objects?Foo[] test; test.foo(); // calls .foo() for each object int[] result = test.bar(); // calls .bar() for each object, result is an array.Foo[] test; test.each.foo(); // calls .foo() for each object int[] result = test.each.bar();I see what you mean. It reminds me of database operations, when performing arithmetic in SQL statements. Perhaps the standard ruleset for dealing with null (non-existent) values might come into play here: any op performed against null is null. So far, you've applied the spec to associative arrays already. The dimensions thus far have merely been int's, so this much is done. I think that if you were to rewrite your translations based on AA's of int keys, you'd see that it's most of the way there. All that's really left is set-notation, which would be necessary the instant you get away from using scalars for your dimensions. Overloading '+' for superset, or union, for example would be a bad move because objects may have operator overloads that are applicable. So you'd need to extend the array properties to include operations apart from minValue() and maxValue(), like union(arr), superset(arr), exclusion(arr) and so forth. For example, assume that ValueType and KeyType can be *anything*, not just int or char[].One question: What about associative arrays?Good question. I guess we could extend the concept to AAs. 
An operation on an AA would return a new AA in which the keys remain the same and the operation is applied to the values. Trying to do it on binary ops, let alone functions of three or more parameters, would require that the two arrays have the same set of keys ... but is this likely to happen in the real world?alias ValueType[KeyType] ExampleAA; ExampleAA a,b,c; a = b + c;Which translates to:a = b.dup; foreach(KeyType key,ValueType value; c){ a[key] += value; }The above fails if ValueType is an object w/o opAdd() or a string, so the compiler would have to see the types involved ahead of time and generate a compiler error (Cannot add char[][int] to char[][int]). This syntax would give the needed behavior:a = b.superset(c);Which translates to:a = b.dup; foreach(KeyType key,ValueType value; c){ a[key] = value; // subtle, but important }- EricAnderton at yahoo
Feb 16 2005
On Wed, 16 Feb 2005 16:10:40 +0000 (UTC), pragma <pragma_member pathlink.com> wrote:In article <cuvgv4$4fd$1 digitaldaemon.com>, Stewart Gordon says...What about: test.length //length of array test[].length //length of each element in array I think this idea could possibly be a source of bugs. Reganpragma wrote:That is a problem. On the one hand, this is obviously a potential source of error, and could be flagged down by the compiler quite easily. On the other, it could be handled via some precedence given to array properties over array expression resolution (not a very good idea IMO). Perhaps an additional array pseudo-property could be added to avoid such conflicts? How about 'each' or 'every'?That would lead to such troubles as class Foo { int length; } Foo[] test; and then is test.length the length of the array, or an array of lengths of the Foo objects?Foo[] test; test.foo(); // calls .foo() for each object int[] result = test.bar(); // calls .bar() for each object, result is an array.
Feb 16 2005
Regan Heath schrieb:What about: test.length //length of array test[].length //length of each element in arrayBad idea: 'test[]' is an array by itself, so it has a length property by itself...
Feb 16 2005
On Thu, 17 Feb 2005 05:58:39 +0100, Norbert Nemec <Norbert Nemec-online.de> wrote:Regan Heath schrieb:Is it? I thought it was a class, with a length property. now, if it had an [] operator (opIndex?) then that would be a problem. I have no bright ideas yet... If it was an array... Foo[][] test; test.length == length of [][] test[].length == length of each [] in [] test[][].length == foo length property. Regan.What about: test.length //length of array test[].length //length of each element in arrayBad idea: 'test[]' is an array by itself, so it has a length property by itself...
Feb 17 2005
Regan Heath wrote: <snip>If it was an array... Foo[][] test; test.length == length of [][] test[].length == length of each [] in [] test[][].length == foo length property.The trouble is that, at the moment, test[][] is equivalent to test[]. It's just the whole array taken a slice of again. Another catch is that if test is a struct/class, by writing test[].length you could be asking to take the length of the array returned by its opSlice method. And so either some existing code may well be broken or the semantics wouldn't be context-free.... (OK, so one would be an lvalue, and one an rvalue, but I recall commenting on that PCC before....) Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Feb 18 2005
Stewart Gordon schrieb:Regan Heath wrote: <snip>Why "at the moment"? If you want to change this, I believe you are asking for deep trouble. (Not only compatibility, but consistency of the language) [] simply is the full-slice operator. If a regular slice operator returns an array, so should the full-slice operator.If it was an array... Foo[][] test; test.length == length of [][] test[].length == length of each [] in [] test[][].length == foo length property.The trouble is that, at the moment, test[][] is equivalent to test[]. It's just the whole array taken a slice of again.
Feb 18 2005
Nice work, Stewart! This is definitely much more than I have come up with over the past months. If only I could get my head together, sit down and write up the complete proposal on arrays, array operations, etc. that I have in my mind... The proposal that you bring up seems mostly similar to the one that was in the specs for a long time - except, of course, that it is a lot clearer and more detailed. For the time being, I see no fundamental flaw with it. However, the questions that are left for me are: * How does one extend these array operations? Interpreting every function call on an array as an elementwise operation is not a good idea. The other extreme would be to limit array operations to the fixed set, predefined by the language. This would give you a toy solution that is nice for the occasional user but does not hold up for someone digging in deeper. * How does this extend to multidimensional arrays? The kind of nested arrays that you mention (like int[][]) is all we have at the moment, but I still feel that D needs true multidimensional (i.e. rectangular) arrays. And for these, there should be some more flexibility to specify which dimension should be looped over etc. I know that rectangular arrays are a thing of the future, but I still think that they are needed, and I see a certain danger in specifying array operations without rectangular arrays in mind.
Feb 16 2005
Another open question: should we allow all this on char arrays? I can imagine someone wanting to do Caesar or Vigenère cipher stuff with this. But does it really make any sense to do it under the UTF constraints? Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
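For what it's worth, the hazard can be sketched like this (the s[] += 3 syntax is the proposed one, not current D):

```d
char[] s = "attack".dup;
s[] += 3;               // proposed: shifts each code unit; fine for pure ASCII

char[] u = "déjà".dup;  // 'é' and 'à' are each two UTF-8 code units
u[] += 3;               // shifts individual bytes, not characters -- the
                        // result is garbled and may not even be valid UTF-8
```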
Feb 16 2005
One more detail: it should be clear that the order of evaluation is not defined and that the temporary array is not guaranteed to be created. Any code that depends on the order of evaluation or the existence of temporaries is to be considered erroneous, like: a[1..9] = a[0..8]+1; (The compiler may not always be able to reliably detect such errors. Maybe a case for warnings?) If the order is unnecessarily defined in the specs, this might seriously limit the optimizability of the code. Currently, the wording "is equivalent to" sounds very dangerous in that respect. Ciao, Norbert Stewart Gordon schrieb:This'll probably get people asking the prospect of array operations for 1.0 to be resurrected, but still.... Here is a possible specification for array operations that I feel is better-defined than the one in the current out-of-date spec. Of course, there are still some open questions, which I've put at the bottom. Array operations ---------------- Arithmetic and bitwise operators are defined on array operands. An expression involving an array evaluates to a new array in which the operator has been applied to the elements of the operands in turn. In essence, when an expression contains more than one array operation, a new array is created to hold the result of each operation. However, a quality implementation will optimize the evaluation of the expression to eliminate temporaries where possible. Unary operations ~~~~~~~~~~~~~~~~ For the unary operators +, - and ~, the expression evaluates to a new array containing the result of applying the operator to each element. For example, with the declaration int[] x, y; then the statement y = -x; is simply equivalent to y = new int[x.length]; for (int i = 0; i < y.length; i++) { y[i] = -x[i]; } Binary operations ~~~~~~~~~~~~~~~~~ The binary operations supported are +, -, *, /, %, &, |, ^, <<, >> and >>>. 
If the two arrays are of the same dimension and of compatible types, then the expression evaluates to a new array in which each element is the result of applying the operator to corresponding elements of the operands. For example, with the declarations int[] x, y, z; the statement z = x + y; is equivalent to z = new int[x.length]; for (int i = 0; i < z.length; i++) { z[i] = x[i] + y[i]; } Both operands must be of the same length. If they are not, an ArrayBoundsError is thrown. For higher dimensions, this definition is applied recursively. For example, with int[][] x, y, z; the statement z = x * y; is equivalent to z = new int[][](x.length); for (int i = 0; i < z.length; i++) { z[i] = x[i] * y[i]; } which is in turn equivalent to z = new int[][](x.length); for (int i = 0; i < z.length; i++) { z[i] = new int[x[i].length]; for (int j = 0; j < z[i].length; j++) { z[i][j] = x[i][j] * y[i][j]; } } If the operands do not match in dimension, then the operator is applied to each element of the higher-dimension operand with the whole of the lower-dimension one. For example, with int[] x, z; int y; the statement z = x - y; is equivalent to z = new int[x.length]; for (int i = 0; i < z.length; i++) { z[i] = x[i] - y; } Similarly, z = y - x; is equivalent to z = new int[x.length]; for (int i = 0; i < z.length; i++) { z[i] = y - x[i]; } This definition is applied recursively if the dimensions differ by two or more. Assignment operations ~~~~~~~~~~~~~~~~~~~~~ When x is an array, the assignment x op= y; is taken as equivalent to x = x op y; whether y is an array of matching dimension, an array of lower dimension or a scalar. Thus the operation creates a new array and assigns it to x. If a sliced lvalue is used, the array is modified in place, so that x[] op= y; is equivalent to x[] = x[] op y; The preincrement and predecrement operators are handled in the same way. 
User-defined types ~~~~~~~~~~~~~~~~~~ A class, struct or union type may have operators overloaded with array types as parameters. To avoid conflicts between overloaded operators and array operations, binary operations involving both array and user-defined types are resolved as follows: 1. The normal operator overloading rules are applied. 2. If no match is found, the array operation rules are applied until both operands are reduced to scalar type; operator overloading rules are then applied to the result. 3. If the expression still does not resolve, it is an error. Open questions ~~~~~~~~~~~~~~ Should postincrement and postdecrement be allowed? How should they be handled? Should we generalise the concept to function calls? If so, I guess that overload resolution would work in much the same way as for operations on user-defined types. If we do allow it on function calls, should we allow it to work on functions of three or more parameters? In this case, the highest-dimension argument would be reduced to the dimension of the second highest, and then these two reduced together to match the third highest, and so on. Of course, these questions raise one more: how easy or hard would these ideas be to implement? Any thoughts? Stewart.
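To see why the overlapping-slice example above is dangerous, compare an element-by-element in-place evaluation with one that builds a temporary first (plain loops, nothing hypothetical):

```d
int[] a = new int[4];
a[0] = 5;              // a = [5, 0, 0, 0]

// In place, left to right: every write feeds the next read,
// so the first element smears across the whole slice.
for (int i = 1; i < a.length; i++)
    a[i] = a[i - 1] + 1;
// a is now [5, 6, 7, 8]

int[] b = new int[4];
b[0] = 5;              // b = [5, 0, 0, 0]

// With a temporary, the whole right-hand side is read first:
int[] tmp = new int[3];
for (int i = 0; i < 3; i++)
    tmp[i] = b[i] + 1;
for (int i = 0; i < 3; i++)
    b[i + 1] = tmp[i];
// b is now [5, 6, 1, 1] -- a shifted copy, the likely intent
```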
Feb 16 2005
In article <cv17tg$22nb$1 digitaldaemon.com>, Norbert Nemec says...One more detail: it should be clear that the order of evaluation is not defined and that the temporary array is not guaranteed to be created. Any code that depends on the order of evaluation or the existence of temporaries is to be considered erroneous, like: a[1..9] = a[0..8]+1; (The compiler may not always be able to reliably detect such errors. Maybe a case for warnings?) If the order is unnecessarily defined in the specs, this might seriously limit the optimizability of the code. Currently, the wording "is equivalent to" sounds very dangerous in that respect. Ciao, NorbertNorbert - a few questions on all this if you have the time. How has the F95 FORALL statement gone over in numerical computing? Is it being used in a lot of codes and/or has there been a lot of software being re-written to use it? How are the optimizing compilers dealing with it - are they using it to good advantage? I gather there has been at least a little movement towards C/++ from Fortran, especially since more and more numerical software specialists have gotten into the field and pried the keyboard out of the hands of the people with the white lab coats. Problem has been that Fortran is so darn good at what it does. So, is it possible for D (done correctly) to attract a good number of numerical computing people, or is Fortran so entrenched that this would be difficult with even a great language implementation? Is there/has there been a general feeling that it's time to "move past" Fortran in that field? Thanks, - DaveStewart Gordon schrieb:<snip>
Feb 17 2005
Dave schrieb:Norbert - a few questions on all this if you have the time. How has the F95 FORALL statement gone over in numerical computing? Is it being used in a lot of codes and/or has there been a lot of software being re-written to use it? How are the optimizing compilers dealing with it - are they using it to good advantage? I gather there has been at least a little movement towards C/++ from Fortran, especially since more and more numerical software specialists have gotten into the field and pried the keyboard out of the hands of the people with the white lab coats. Problem has been that Fortran is so darn good at what it does. So, is it possible for D (done correctly) to attract a good number of numerical computing people, or is Fortran so entrenched that this would be difficult with even a great language implementation? Is there/has there been a general feeling that it's time to "move past" Fortran in that field?Very good questions indeed. I cannot easily give an answer to them. Fact is that people move slowly. They use what they know and what has proven to work. Before they even think about changing, you'll have to demonstrate without a doubt that the alternative is superior. And even then, most will make a decision to change and need years before they find the time to actually take the step. In my immediate environment, people use * plain C, because they had some course as a junior * Fortran, because it has great libraries * Matlab, because it is so well documented and easy to start with * Python, because it is a beautiful wrapper around ugly C/Fortran code To pull any of them, it will take years. The only hope that I have is that D gains momentum for its general purpose qualities, and when the Numerics people finally arrive, they realize that everything is prepared for them to dive in. As far as I can observe, numerics people usually have a very pragmatic approach and take what's there. 
Supporting them is a very long-term investment, but once we have them, we can be sure that it will pay off through their investment in the quality of compilers. About the practical use of the vectorizing F95 features - I do not really know. I know that F95 is an awfully complicated language and that the implementations are far behind C++. I know that many people use F90 compilers only because of the nicer syntax but stick to their well-known programming style. The world is changing awfully slowly...
Feb 17 2005
In article <cv27t5$2k6$1 digitaldaemon.com>, Norbert Nemec says...Dave schrieb:Aye, in some ways at least. Maybe D can change some of that - I think the array operations that you, Walter and Stewart (sorry if I forgot anyone) are hashing out may be a big part of it. Even if the numerics developers are slow on the uptake, I bet the rest of us will find some use for what you're discussing! Thanks, - Dave<snip>
Feb 17 2005