
digitalmars.D.learn - in not working for arrays is silly, change my view

reply JN <666total wp.pl> writes:
assert(1 in [1, 2, 3]);

Error: incompatible types for (1) in ([1, 2, 3]): int and int[]

Yes, I know about .canFind(), but this is something that trips 
people up over and over.

I think it would be better if "in" worked for both assoc arrays 
and normal arrays, or didn't work at all, for added consistency.
Feb 29 2020
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 2/29/20 2:38 PM, JN wrote:
 assert(1 in [1, 2, 3]);
 
 Error: incompatible types for (1) in ([1, 2, 3]): int and int[]
 
 Yes, I know about .canFind(), but this is something that trips people 
 over and over.
 
 I think it would be better if "in" worked for both assoc arrays and 
 normal arrays, or didn't work at all, for added consistency.
1. in is supposed to be O(lg(n)) or better. Generic code may depend on 
this property. Searching an array is O(n).

2. Array elements are not guaranteed to be comparable. AA keys are.

3. As this cannot be done via library, the compiler would have to do it 
(at least to process the syntax), so we need to ensure it's something we 
want to do.

I don't see it adding much to the language. What MAY be a good idea is 
to have a specialized error telling people to import 
std.algorithm.canFind and use it when they try this.

Another thing that may be useful is having something like "isIn" 
instead of "canFind", which swaps the left and right parameters to read 
more naturally:

assert([1, 2, 3].canFind(1));

vs.

assert(1.isIn([1, 2, 3]));

I'd also add an overload to allow:

assert(1.isIn(1, 2, 3));

-Steve
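A minimal sketch of the isIn helper described above (the name and both overloads are hypothetical, not part of Phobos; it just forwards to std.algorithm's canFind):

```d
import std.algorithm.searching : canFind;
import std.range.primitives : isInputRange;

// Hypothetical helper: reads left to right, unlike canFind.
bool isIn(T, R)(T value, R haystack)
    if (isInputRange!R)
{
    return haystack.canFind(value);
}

// Overload so the candidates can be listed inline: 1.isIn(1, 2, 3)
bool isIn(T, Args...)(T value, Args candidates)
    if (Args.length > 1)
{
    foreach (c; candidates)
        if (value == c)
            return true;
    return false;
}

unittest
{
    assert(1.isIn([1, 2, 3]));
    assert(1.isIn(1, 2, 3));
    assert(!4.isIn([1, 2, 3]));
}
```

The template constraints keep the two overloads from colliding: a single array argument selects the range version, two or more arguments select the variadic one.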
Feb 29 2020
parent reply Andrea Fontana <nospam example.com> writes:
On Saturday, 29 February 2020 at 20:11:24 UTC, Steven 
Schveighoffer wrote:
 1. in is supposed to be O(lg(n)) or better. Generic code may 
 depend on this property. Searching an array is O(n).
Probably it should work if we're using a "SortedRange":

int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
auto p = assumeSorted(a);
assert(3 in p);
Mar 02 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/2/20 6:52 AM, Andrea Fontana wrote:
 On Saturday, 29 February 2020 at 20:11:24 UTC, Steven Schveighoffer wrote:
 1. in is supposed to be O(lg(n)) or better. Generic code may depend on 
 this property. Searching an array is O(n).
 Probably it should work if we're using a "SortedRange":
 
 int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
 auto p = assumeSorted(a);
 assert(3 in p);
That could work. Currently, you need to use p.contains(3). opIn could be 
added as a shortcut.

It only makes sense if you have it as a literal though, as p.contains(3) 
isn't that bad to use:

assert(3 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].assumeSorted);

-Steve
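One way the opIn shortcut could be layered on today is with a wrapper type (everything here is a hypothetical sketch, not Phobos API; it relies only on SortedRange.contains and D's operator-overloading rewrite for `in`):

```d
import std.range : SortedRange, assumeSorted;

// Hypothetical wrapper giving a SortedRange an `in` operator.
struct InSorted(R)
{
    R sorted; // the underlying SortedRange

    // `x in s` is rewritten by the compiler to s.opBinaryRight!"in"(x)
    bool opBinaryRight(string op : "in", T)(T value)
    {
        return sorted.contains(value); // O(lg n) binary search
    }
}

auto inSorted(R)(R sorted)
{
    return InSorted!R(sorted);
}

unittest
{
    auto p = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].assumeSorted.inSorted;
    assert(3 in p);
    assert(42 !in p); // !in is rewritten to !(42 in p)
}
```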
Mar 02 2020
parent reply aliak <something something.com> writes:
On Monday, 2 March 2020 at 15:47:26 UTC, Steven Schveighoffer 
wrote:
 On 3/2/20 6:52 AM, Andrea Fontana wrote:
 On Saturday, 29 February 2020 at 20:11:24 UTC, Steven 
 Schveighoffer wrote:
 1. in is supposed to be O(lg(n)) or better. Generic code may 
 depend on this property. Searching an array is O(n).
 Probably it should work if we're using a "SortedRange":
 
 int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
 auto p = assumeSorted(a);
 assert(3 in p);
 That could work. Currently, you need to use p.contains(3). opIn could 
 be added as a shortcut.
 
 It only makes sense if you have it as a literal though, as 
 p.contains(3) isn't that bad to use:
 
 assert(3 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].assumeSorted);
 
 -Steve
There's no guarantee that checking if a value is in a sorted list is any 
faster than checking if it's in a non-sorted list. It's why sort usually 
switches from a binary-esque algorithm to a linear one at a certain 
size. The list could potentially need to be _very_ large for p.contains 
to make a significant impact over canFind(p) AFAIK.

Here's a small test program; try playing with the numbers and see what 
happens:

import std.random;
import std.range;
import std.algorithm;
import std.datetime.stopwatch;
import std.stdio;

void main()
{
    auto count = 1_000;
    auto max = int.max;

    alias randoms = generate!(() => uniform(0, max));

    auto r1 = randoms.take(count).array;
    auto r2 = r1.dup.sort;
    auto elem = r1[uniform(0, count)];

    benchmark!(
        () => r1.canFind(elem),
        () => r2.contains(elem),
    )(1_000).writeln;
}

Use LDC and -O3 of course. I was hard pressed to get the sorted contains 
to be any faster than canFind.

This begs the question then: do these requirements on in make any sense? 
An algorithm can be O(lg n) (à la the sorted search) but still be a 
magnitude slower than a linear search... what has the world come to 🤦‍♂️

PS: Why is it named contains if it's on a SortedRange and canFind 
otherwise?
Mar 02 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/2/20 3:52 PM, aliak wrote:
 On Monday, 2 March 2020 at 15:47:26 UTC, Steven Schveighoffer wrote:
 On 3/2/20 6:52 AM, Andrea Fontana wrote:
 On Saturday, 29 February 2020 at 20:11:24 UTC, Steven Schveighoffer 
 wrote:
 1. in is supposed to be O(lg(n)) or better. Generic code may depend 
 on this property. Searching an array is O(n).
 Probably it should work if we're using a "SortedRange":
 
 int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
 auto p = assumeSorted(a);
 assert(3 in p);
 That could work. Currently, you need to use p.contains(3). opIn could 
 be added as a shortcut.
 
 It only makes sense if you have it as a literal though, as 
 p.contains(3) isn't that bad to use:
 
 assert(3 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].assumeSorted);
 There's no guarantee that checking if a value is in a sorted list is 
 any faster than checking if it's in a non-sorted list. It's why sort 
 usually switches from a binary-esque algorithm to a linear one at a 
 certain size.
Well of course! A binary search needs lg(n) comparisons for pretty much 
any value, whereas a linear search is going to end early when it finds 
it. So there's no guarantee that searching for an element in the list is 
going to be faster one way or the other. But binary search is going to 
be faster overall because the complexity is favorable.
 The list could potentially need to be _very_ large for p.contains 
 to make a significant impact over canFind(p) AFAIK.
 
 Here's a small test program, try playing with the numbers and see what 
 happens:
 
 import std.random;
 import std.range;
 import std.algorithm;
 import std.datetime.stopwatch;
 import std.stdio;
 
 void main()
 {
      auto count = 1_000;
      auto max = int.max;
 
      alias randoms = generate!(() => uniform(0, max));
 
      auto r1 = randoms.take(count).array;
      auto r2 = r1.dup.sort;
      auto elem = r1[uniform(0, count)];
auto elem = r1[$-1]; // try this instead
 
      benchmark!(
          () => r1.canFind(elem),
          () => r2.contains(elem),
      )(1_000).writeln;
 }
 
 Use LDC and -O3 of course. I was hard pressed to get the sorted contains 
 to be any faster than canFind.
 
 This begs the question then: do these requirements on in make any sense? 
 An algorithm can be log n (ala the sorted search) but still be a 
 magnitude slower than a linear search... what has the world come to
🤦‍♂️
 
 PS: Why is it named contains if it's on a SortedRange and canFind 
 otherwise?
 
A SortedRange uses O(lg n) steps vs. canFind, which uses O(n) steps.

If you change your code to test 1000 random numbers, instead of a random 
number guaranteed to be included, then you will see a significant 
improvement with the sorted version. I found it to be about 10x faster 
(most of the time, none of the other random numbers are included). Even 
if you randomly select 1000 numbers from the elements, the binary search 
will be faster. In my tests, it was about 5x faster.

Note that the compiler can do a lot more tricks for linear searches, and 
CPUs are REALLY good at searching sequential data. But complexity is 
still going to win out eventually over heuristics. Phobos needs to be a 
general library, not one that only caters to certain situations.

-Steve
Mar 02 2020
parent reply aliak <something something.com> writes:
On Monday, 2 March 2020 at 21:33:37 UTC, Steven Schveighoffer 
wrote:
 On 3/2/20 3:52 PM, aliak wrote:
 On Monday, 2 March 2020 at 15:47:26 UTC, Steven Schveighoffer 
 wrote:
 On 3/2/20 6:52 AM, Andrea Fontana wrote:
 On Saturday, 29 February 2020 at 20:11:24 UTC, Steven 
 Schveighoffer wrote:
 1. in is supposed to be O(lg(n)) or better. Generic code 
 may depend on this property. Searching an array is O(n).
 Probably it should work if we're using a "SortedRange":
 
 int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
 auto p = assumeSorted(a);
 assert(3 in p);
 That could work. Currently, you need to use p.contains(3). opIn could 
 be added as a shortcut.
 
 It only makes sense if you have it as a literal though, as 
 p.contains(3) isn't that bad to use:
 
 assert(3 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].assumeSorted);
 There's no guarantee that checking if a value is in a sorted list is 
 any faster than checking if it's in a non-sorted list. It's why sort 
 usually switches from a binary-esque algorithm to a linear one at a 
 certain size.
 Well of course! A binary search needs lg(n) comparisons for pretty 
 much any value, whereas a linear search is going to end early when it 
 finds it. So there's no guarantee that searching for an element in the 
 list is going to be faster one way or the other. But binary search is 
 going to be faster overall because the complexity is favorable.
Overall tending towards infinity maybe, but not overall on the average 
case it would seem. Branch prediction in CPUs changes that: with a 
binary search the branch is essentially always a miss, whereas with a 
linear search it's essentially always a hit.
 The list could potentially need to be _very_ large for 
 p.contains to make a significant impact over canFind(p) AFAIK.
 
 Here's a small test program, try playing with the numbers and 
 see what happens:
 
 import std.random;
 import std.range;
 import std.algorithm;
 import std.datetime.stopwatch;
 import std.stdio;
 
 void main()
 {
      auto count = 1_000;
      auto max = int.max;
 
      alias randoms = generate!(() => uniform(0, max));
 
      auto r1 = randoms.take(count).array;
      auto r2 = r1.dup.sort;
      auto elem = r1[uniform(0, count)];
auto elem = r1[$-1]; // try this instead
 
      benchmark!(
          () => r1.canFind(elem),
          () => r2.contains(elem),
      )(1_000).writeln;
 }
 
 Use LDC and -O3 of course. I was hard pressed to get the 
 sorted contains to be any faster than canFind.
 
 This begs the question then: do these requirements on in make 
 any sense? An algorithm can be log n (ala the sorted search) 
 but still be a magnitude slower than a linear search... what 
 has the world come to 🤦‍♂️
 
 PS: Why is it named contains if it's on a SortedRange and 
 canFind otherwise?
 
 A SortedRange uses O(lg n) steps vs. canFind, which uses O(n) steps.
canFind is supposed to tell the reader that it's O(n) and contains 
O(lg n)?
 If you change your code to testing 1000 random numbers, instead 
 of a random number guaranteed to be included, then you will see 
 a significant improvement with the sorted version. I found it 
 to be about 10x faster. (most of the time, none of the other 
 random numbers are included). Even if you randomly select 1000 
 numbers from the elements, the binary search will be faster. In 
 my tests, it was about 5x faster.
Hmm... What am I doing wrong with this code? And also how are you 
compiling?:

void main()
{
    auto count = 1_000_000;
    auto max = int.max;

    alias randoms = generate!(() => uniform(0, max - 1));

    auto r1 = randoms.take(count).array;
    auto r2 = r1.dup.sort;
    auto r3 = r1.dup.randomShuffle;

    auto results = benchmark!(
        () => r1.canFind(max),
        () => r2.contains(max),
        () => r3.canFind(max),
    )(5_000);

    results.writeln;
}

$ ldc2 -O3 test.d && ./test
[1 hnsec, 84 μs and 7 hnsecs, 0 hnsecs]
 Note that the compiler can do a lot more tricks for linear 
 searches, and CPUs are REALLY good at searching sequential 
 data. But complexity is still going to win out eventually over 
 heuristics. Phobos needs to be a general library, not one that 
 only caters to certain situations.
General would be the most common case. I don't think extremely large (for some definition of large) lists are the more common ones. Or maybe they are. But I'd be surprised. I also don't think phobos is a very data-driven library. But, that's a whole other conversation :)
 -Steve
Mar 02 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/2/20 5:21 PM, aliak wrote:
 On Monday, 2 March 2020 at 21:33:37 UTC, Steven Schveighoffer wrote:
 On 3/2/20 3:52 PM, aliak wrote:
 On Monday, 2 March 2020 at 15:47:26 UTC, Steven Schveighoffer wrote:
 On 3/2/20 6:52 AM, Andrea Fontana wrote:
 On Saturday, 29 February 2020 at 20:11:24 UTC, Steven Schveighoffer 
 wrote:
 1. in is supposed to be O(lg(n)) or better. Generic code may 
 depend on this property. Searching an array is O(n).
 Probably it should work if we're using a "SortedRange":
 
 int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
 auto p = assumeSorted(a);
 assert(3 in p);
 That could work. Currently, you need to use p.contains(3). opIn could 
 be added as a shortcut.
 
 It only makes sense if you have it as a literal though, as 
 p.contains(3) isn't that bad to use:
 
 assert(3 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].assumeSorted);
 There's no guarantee that checking if a value is in a sorted list is 
 any faster than checking if it's in a non-sorted list. It's why sort 
 usually switches from a binary-esque algorithm to a linear one at a 
 certain size.
 Well of course! A binary search needs lg(n) comparisons for pretty 
 much any value, whereas a linear search is going to end early when it 
 finds it. So there's no guarantee that searching for an element in the 
 list is going to be faster one way or the other. But binary search is 
 going to be faster overall because the complexity is favorable.
 Overall tending towards infinity maybe, but not overall on the average 
 case it would seem. Branch prediction in CPUs changes that: with a 
 binary search the branch is essentially always a miss, whereas with a 
 linear search it's essentially always a hit.
 The list could potentially need to be _very_ large for p.contains to 
 make a significant impact over canFind(p) AFAIK.

 Here's a small test program, try playing with the numbers and see 
 what happens:

 import std.random;
 import std.range;
 import std.algorithm;
 import std.datetime.stopwatch;
 import std.stdio;

 void main()
 {
      auto count = 1_000;
      auto max = int.max;

      alias randoms = generate!(() => uniform(0, max));

      auto r1 = randoms.take(count).array;
      auto r2 = r1.dup.sort;
      auto elem = r1[uniform(0, count)];
auto elem = r1[$-1]; // try this instead
      benchmark!(
          () => r1.canFind(elem),
          () => r2.contains(elem),
      )(1_000).writeln;
 }

 Use LDC and -O3 of course. I was hard pressed to get the sorted 
 contains to be any faster than canFind.

 This begs the question then: do these requirements on in make any 
 sense? An algorithm can be log n (ala the sorted search) but still be 
 a magnitude slower than a linear search... what has the world come to 
 🤦‍♂️

 PS: Why is it named contains if it's on a SortedRange and canFind 
 otherwise?
 A SortedRange uses O(lg n) steps vs. canFind, which uses O(n) steps.
 canFind is supposed to tell the reader that it's O(n) and contains 
 O(lg n)?
canFind means "find will not return empty", and find is a linear search. 
Probably not the clearest distinction, but contains isn't an algorithm, 
it's a member. I don't think there's any general naming convention for 
members like this, but I would say that if contains means O(lg n), then 
that's what we should use everywhere to mean that. I suppose the naming 
could be improved.
 
 If you change your code to testing 1000 random numbers, instead of a 
 random number guaranteed to be included, then you will see a 
 significant improvement with the sorted version. I found it to be 
 about 10x faster. (most of the time, none of the other random numbers 
 are included). Even if you randomly select 1000 numbers from the 
 elements, the binary search will be faster. In my tests, it was about 
 5x faster.
 Hmm... What am I doing wrong with this code? And also how are you 
 compiling?:
 
 void main()
 {
     auto count = 1_000_000;
     auto max = int.max;
 
     alias randoms = generate!(() => uniform(0, max - 1));
 
     auto r1 = randoms.take(count).array;
     auto r2 = r1.dup.sort;
     auto r3 = r1.dup.randomShuffle;
 
     auto results = benchmark!(
         () => r1.canFind(max),
         () => r2.contains(max),
         () => r3.canFind(max),
     )(5_000);
 
     results.writeln;
 }
 
 $ ldc2 -O3 test.d && ./test
 [1 hnsec, 84 μs and 7 hnsecs, 0 hnsecs]
Yeah, this looked very fishy to me. ldc can do some nasty "helpful" 
things to save you time! When I posted my results, I was using DMD.

I used run.dlang.io with ldc, and verified I get the same (similar) 
result as you. But searching through 5 billion integers can't be 
instantaneous, on any modern hardware. So I also tried this:

auto results = benchmark!(
    () => false,
    () => r2.contains(max),
    () => r3.canFind(max),
)(5_000);

[4 μs and 9 hnsecs, 166 μs and 8 hnsecs, 4 μs and 7 hnsecs]

Hey look! Returning a boolean is more expensive than a 1 million element 
linear search!

What I think is happening is that it determines nobody is using the 
result, and the function is pure, so it doesn't bother calling that 
function (probably not even the lambda, and then probably removes the 
loop completely).

I'm assuming for some reason, the binary search is not flagged pure, so 
it's not being skipped.

If I change it to this to ensure side effects:

bool makeImpure; // TLS variable outside of main

...

auto results = benchmark!(
    () => makeImpure = r1.canFind(max),
    () => makeImpure = r2.contains(max),
    () => makeImpure = r3.canFind(max),
)(5_000);

writefln("%(%s\n%)", results); // modified to help with the comma confusion

I now get:

4 secs, 428 ms, and 3 hnsecs
221 μs and 9 hnsecs
4 secs, 49 ms, 982 μs, and 5 hnsecs

More like what I expected!
 General would be the most common case. I don't think extremely large 
 (for some definition of large) lists are the more common ones. Or maybe 
 they are. But I'd be surprised. I also don't think phobos is a very 
 data-driven library. But, that's a whole other conversation :)
I don't think the general case is going to be large data sets either. 
But that doesn't mean Phobos should assume they are all small. And as 
you can see, when it actually has to run the code, the binary search is 
significantly faster.

-Steve
Mar 02 2020
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 02, 2020 at 06:27:22PM -0500, Steven Schveighoffer via
Digitalmars-d-learn wrote:
[...]
 Yeah, this looked very fishy to me. ldc can do some nasty "helpful"
 things to save you time! When I posted my results, I was using DMD.
 
 I used run.dlang.io with ldc, and verified I get the same (similar)
 result as you. But searching through 5 billion integers can't be
 instantaneous, on any modern hardware.
[...]

Beware that LDC's over-zealous optimizer can sometimes elide an entire 
function call tree if it determines that the return value is not used 
anywhere. I've also observed that it can sometimes execute the entire 
function call tree at compile-time and emit just a single instruction 
that loads the final result. So, always check the assembly output so 
that you're sure the benchmark is actually measuring what you think it's 
measuring.

To prevent the optimizer from eliding "useless" code, you need to do 
something with the return value that isn't trivial (assigning to a 
variable that doesn't get used afterwards is "trivial", so that's not 
enough). The easiest way is to print the result: the optimizer cannot 
elide I/O.

T

-- 
Freedom: (n.) Man's self-given right to be enslaved by his own depravity.
Mar 02 2020
prev sibling parent reply aliak <something something.com> writes:
On Monday, 2 March 2020 at 23:27:22 UTC, Steven Schveighoffer 
wrote:
 What I think is happening is that it determines nobody is using 
 the result, and the function is pure, so it doesn't bother 
 calling that function (probably not even the lambda, and then 
 probably removes the loop completely).

 I'm assuming for some reason, the binary search is not flagged 
 pure, so it's not being skipped.
Apparently you're right: https://github.com/dlang/phobos/blob/5e13653a6eb55c1188396ae064717a1a03fd7483/std/range/package.d#L11107
 If I change to this to ensure side effects:

 bool makeImpure; // TLS variable outside of main

 ...

     auto results = benchmark!(
         () => makeImpure = r1.canFind(max),
         () => makeImpure = r2.contains(max),
         () => makeImpure = r3.canFind(max),
     )(5_000);

 writefln("%(%s\n%)", results); // modified to help with the 
 comma confusion

 I now get:
 4 secs, 428 ms, and 3 hnsecs
 221 μs and 9 hnsecs
 4 secs, 49 ms, 982 μs, and 5 hnsecs

 More like what I expected!
Ahhhh damn! And here I was thinking that branch prediction made a HUGE difference! Ok, I'm taking my tail and slowly moving away now :) Let us never speak of this again.
 -Steve
Mar 02 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/2/20 7:32 PM, aliak wrote:
 On Monday, 2 March 2020 at 23:27:22 UTC, Steven Schveighoffer wrote:
 What I think is happening is that it determines nobody is using the 
 result, and the function is pure, so it doesn't bother calling that 
 function (probably not even the lambda, and then probably removes the 
 loop completely).

 I'm assuming for some reason, the binary search is not flagged pure, 
 so it's not being skipped.
Apparently you're right: https://github.com/dlang/phobos/blob/5e13653a6eb55c1188396ae064717a1a03fd7483/std/range/package.d#L11107
That's not definitive. Note that a template member or member of a struct template can be *inferred* to be pure. It's also entirely possible for the function to be pure, but the compiler decides for another reason not to elide the whole thing. Optimization isn't ever guaranteed.
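The inference Steve mentions is easy to demonstrate with a small sketch (the names here are illustrative): template functions in D get attributes like pure inferred from their bodies, so the absence of an explicit `pure` in the Phobos source proves nothing.

```d
// Attribute inference: template functions get pure/nothrow/@safe
// inferred from their bodies, without explicit annotations.
auto twice(T)(T x)
{
    return x + x; // no impure operations, so twice!int is inferred pure
}

pure unittest
{
    // This compiles only because twice!int was inferred pure:
    // a pure context cannot call an impure function.
    assert(twice(21) == 42);
}
```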
 
 
 If I change to this to ensure side effects:

 bool makeImpure; // TLS variable outside of main

 ...

     auto results = benchmark!(
         () => makeImpure = r1.canFind(max),
         () => makeImpure = r2.contains(max),
         () => makeImpure = r3.canFind(max),
     )(5_000);

 writefln("%(%s\n%)", results); // modified to help with the comma 
 confusion

 I now get:
 4 secs, 428 ms, and 3 hnsecs
 221 μs and 9 hnsecs
 4 secs, 49 ms, 982 μs, and 5 hnsecs

 More like what I expected!
Ahhhh damn! And here I was thinking that branch prediction made a HUGE difference! Ok, I'm taking my tail and slowly moving away now :) Let us never speak of this again.
LOL, I'm sure this will come up again ;) The forums are full of 
confusing benchmarks where LDC has elided the whole thing being tested. 
It's amazing at optimizing. Sometimes, too amazing.

On 3/2/20 6:46 PM, H. S. Teoh wrote:
 To prevent the optimizer from eliding "useless" code, you need to do
 something with the return value that isn't trivial (assigning to a
 variable that doesn't get used afterwards is "trivial", so that's not
 enough). The easiest way is to print the result: the optimizer cannot
 elide I/O.
Yeah, well, that means you are also benchmarking the I/O (which would 
dwarf the other pieces being tested).

I think assigning the result to a global fits the bill pretty well, but 
obviously only works when you're not inside a pure function.

-Steve
Mar 02 2020
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 02, 2020 at 07:51:34PM -0500, Steven Schveighoffer via
Digitalmars-d-learn wrote:
[...]
 On 3/2/20 6:46 PM, H. S. Teoh wrote:
 To prevent the optimizer from eliding "useless" code, you need to do
 something with the return value that isn't trivial (assigning to a
 variable that doesn't get used afterwards is "trivial", so that's
 not enough). The easiest way is to print the result: the optimizer
 cannot elide I/O.
Yeah, well, that means you are also benchmarking the i/o (which would dwarf the other pieces being tested).
Not necessarily. Create a global variable whose sole purpose is to accumulate the return values of the functions being tested, then return its value as the return value of main(). The optimizer is bound by semantics not to elide anything then.
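The accumulate-and-return idea could look something like this sketch (the variable name `sink` and the data setup are illustrative, not from the thread):

```d
import std.algorithm.searching : canFind;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

int sink; // accumulator whose value feeds main's return code

int main()
{
    auto data = new int[1_000_000]; // all zeros; int.max is never found

    auto results = benchmark!(
        () { sink += data.canFind(int.max); },
    )(100);
    writefln("%(%s\n%)", results);

    // Returning the accumulator makes it observable behavior, so the
    // optimizer cannot elide the benchmarked calls.
    return sink & 0x7f;
}
```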
 I think assigning the result to a global fits the bill pretty well,
 but obviously only works when you're not inside a pure function.
[...]

A sufficiently advanced optimizer would notice the global isn't referred 
to anywhere else, and is therefore of no effect, and elide it anyway. 
Not saying it actually would, 'cos I think you're probably right, but 
I'm leaving nothing to chance when the LDC optimizer is in question. :-P

T

-- 
It only takes one twig to burn down a forest.
Mar 02 2020
prev sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 2/29/20 11:38 AM, JN wrote:

 assert(1 in [1, 2, 3]);
Because you mentioned canFind, I think you want the semantics to be "is 
there an element with this value." If so, it would be confusing to use 
the same operator for two different things: for associative arrays, it 
means "is there an element accessible with this key."

Unless 'in' works with arrays to mean "is this index valid", I don't see 
the benefit. If we had it, I think more people would ask "why does 'in' 
work differently for arrays?"

Are there other languages that support this semantic? Checking... Ok, 
Python has it, highly likely because they don't have arrays to begin 
with.

Ali
Feb 29 2020
parent reply JN <666total wp.pl> writes:
On Saturday, 29 February 2020 at 21:56:51 UTC, Ali Çehreli wrote:
 Because you mentioned canFind, I think you want the semantics 
 to be "is there an element with this value." If so, it would be 
 confusing to use the same operator for two different things: 
 For associative arrays, it means "is there an element 
 accessible with this key."
Does it? I always viewed it as "is this value in the list of keys".
 Unless 'in' works with arrays to mean "is this index valid", 
 then I don't see the benefit. If we had it, I think more people 
 would ask "why does 'in' work differently for arrays?"
 Are there other languages that support this semantic? 
 Checking... Ok, Python has it, highly likely because they don't 
 have arrays to begin with.
Well, Python lists are for most purposes equivalent to arrays and it hasn't really been confusing for people.
Mar 02 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/2/20 6:39 AM, JN wrote:
 On Saturday, 29 February 2020 at 21:56:51 UTC, Ali Çehreli wrote:
 Because you mentioned canFind, I think you want the semantics to be 
 "is there an element with this value." If so, it would be confusing to 
 use the same operator for two different things: For associative 
 arrays, it means "is there an element accessible with this key."
Does it? I always viewed it as "is this value in list of keys"
Array keys are the element index. So essentially:

int[int] c1;
int[] c2 = new int[4];

c1[3] = 10;
c2[3] = 10;

assert(3 in c1); // true
assert(3 in c2); // what should this do?

-Steve
Mar 02 2020
parent aliak <something something.com> writes:
On Monday, 2 March 2020 at 15:50:08 UTC, Steven Schveighoffer 
wrote:
 On 3/2/20 6:39 AM, JN wrote:
 On Saturday, 29 February 2020 at 21:56:51 UTC, Ali Çehreli 
 wrote:
 Because you mentioned canFind, I think you want the semantics 
 to be "is there an element with this value." If so, it would 
 be confusing to use the same operator for two different 
 things: For associative arrays, it means "is there an element 
 accessible with this key."
Does it? I always viewed it as "is this value in list of keys"
 Array keys are the element index. So essentially:
 
 int[int] c1;
 int[] c2 = new int[4];
 
 c1[3] = 10;
 c2[3] = 10;
 
 assert(3 in c1); // true
 assert(3 in c2); // what should this do?
 
 -Steve
If in were to mean "is this value in list of keys", then to be 
consistent:

3 in c2 == 3 < c2.length
Mar 02 2020