digitalmars.D - !in
- bearophile (6/6) Feb 17 2010 I've seen this change:
- Michel Fortin (18/32) Feb 17 2010 Are you sure the 'not in' operator in the Python algorithm you liked to
- bearophile (14/34) Feb 17 2010 In Python "in" calls the standard method __contains__() of an object.
- Michel Fortin (10/14) Feb 17 2010 Actually that's an interesting topic. You're suggesting that 'in' for
- bearophile (12/18) Feb 17 2010 I have suggested this probably two years ago or more. The main differenc...
- Andrei Alexandrescu (5/29) Feb 17 2010 I think you're giving D programmers too much credit. I think a
- bearophile (5/7) Feb 17 2010 I can't agree. They are both arrays.
- bearophile (6/7) Feb 17 2010 if (5 in [1, 2, 5]) { ... // OK
- Andrei Alexandrescu (14/25) Feb 17 2010 Well I understand you don't agree from the frame you're now in, but you
- Andrei Alexandrescu (4/6) Feb 17 2010 Oops. I meant to say an array literal in the expression "a in [ ... ]".
- =?ISO-8859-1?Q?Pelle_M=E5nsson?= (6/18) Feb 17 2010 I don't see how this is any argument against opIn_r for array variables
- Andrei Alexandrescu (4/28) Feb 17 2010 Not more difficult, but not deceivingly easy either. I like how you can
- BCS (8/17) Feb 17 2010 How about expanding that to compile time constants of any kind. From tha...
- Michel Fortin (6/10) Feb 17 2010 I agree with this. I wrote an example of it in my reply to Andrei.
- daoryn (27/36) Feb 17 2010 I tried to rename the function opIn hoping it would get automatically as...
- Andrei Alexandrescu (17/31) Feb 17 2010 That algorithm is nice. Generally I'd be hesitant to make it all too
- Michel Fortin (17/25) Feb 17 2010 Are you talking about literals or compile-time constants? A literal can
- Andrei Alexandrescu (5/36) Feb 17 2010 I was thinking CTFE-able array literals, but I think linear search in a
- Walter Bright (2/3) Feb 17 2010 Indubitably. (sips brandy)
- Andrei Alexandrescu (4/8) Feb 17 2010 What's the cultural reference? I know of "indubitably" used as a
- Walter Bright (4/12) Feb 17 2010 Imagine a victorian era upper class gentlemen's club, where the members
- Andrei Alexandrescu (5/18) Feb 17 2010 My joke reference is better. Low-brow husband to low-brow woman: "What's...
- Walter Bright (2/3) Feb 18 2010 Harrumph. Pistols at dawn?
- Andrei Alexandrescu (3/7) Feb 18 2010 A stinky sock is my weapon of choice.
- BCS (4/10) Feb 18 2010 Now that would be D-isaster!
- Walter Bright (2/12) Feb 18 2010 Not really. I failed at marksmanship.
- Andrei Alexandrescu (4/17) Feb 18 2010 That means "really"! For the sake of D you'd better shoot me rather than...
-
Ellery Newcomer
(2/4)
Feb 18 2010
with a stinky sock?
I've seen this change: http://dsource.org/projects/dmd/changeset/388 !in is one of the things I have asked in the first list of requests I have posted in this newsgroup a lot of time ago :-) Glad to see few of those things get implemented. Thank you. The related thing I was asking for, is to use "in" for a linear search of single items inside an array, as in Python, a very handy thing that's used frequently (CPython it also implements an efficient algorithm to search for a substring: http://effbot.org/zone/stringlib.htm But probably Andrei wants such substring search to be a separated algorithm, even if it's very specifically tuned for strings only and not for arrays in general. I can understand this). Bye, bearophile
Feb 17 2010
On 2010-02-17 10:22:01 -0500, bearophile <bearophileHUGS lycos.com> said:I've seen this change: http://dsource.org/projects/dmd/changeset/388 !in is one of the things I have asked in the first list of requests I have posted in this newsgroup a lot of time ago :-) Glad to see few of those things get implemented. Thank you.Indeed, that's great.The related thing I was asking for, is to use "in" for a linear search of single items inside an array, as in Python, a very handy thing that's used frequently (CPython it also implements an efficient algorithm to search for a substring: http://effbot.org/zone/stringlib.htm But probably Andrei wants such substring search to be a separated algorithm, even if it's very specifically tuned for strings only and not for arrays in general. I can understand this).Are you sure the 'not in' operator in the Python algorithm you liked to works like D's '!in'? It seems to me like it's searching for the absence of a substring. In D 'in' and '!in' only look for one element. I agree that it'd be handy to have 'in' and '!in' work with arrays, especially with array literals: if (x in [1,2,5,6,10]) do(something); Perhaps this syntax could be allowed but only for array literals. The reason being that array literals are usually short so a linear search wouldn't be too bad. Moreover, an array literal being a literal, it's easy for the compiler to optimize things; you're not constrained to a linear search. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 17 2010
Michel Fortin:Are you sure the 'not in' operator in the Python algorithm you liked to works like D's '!in'? It seems to me like it's searching for the absence of a substring.In Python "in" calls the standard method __contains__() of an object. __contains__ performs hash search in an associative array and a linear search in a list, tuple, array, etc, or in a lazily generated sequence:Truetxt = "How are you?" "are" in txtTrueseq = [1, 2, 3, 4] 3 in seqTrueadict = {1:2, 3:4} 3 in adictTrue In Python there are no chars, there are just strings, a char is a string of length 1, so the __contains__ performs a substring search (with many optimizations, so if you look for a substring of length 1 it's faster, etc).from itertools import count evens = (2*i for i in count()) 16 in evensIn D 'in' and '!in' only look for one element.<Currently In D 'in' and '!in' only look for one element inside an AA, or calls the opIn/opIn_r if present.I agree that it'd be handy to have 'in' and '!in' work with arrays, especially with array literals: if (x in [1,2,5,6,10]) do(something); Perhaps this syntax could be allowed but only for array literals.No silly special cases please. If it works for arrays, it works for all arrays. Otherwise nothing is better. Special cases increase complexity and kill languages with a death by a thousand cuts.The reason being that array literals are usually short so a linear search wouldn't be too bad.<"bad" is meaningless. If I have to perform a single linear search in a long array I don't want a nanny compiler to tell me that's a bad thing. Not even Python does it.Moreover, an array literal being a literal, it's easy for the compiler to optimize things; you're not constrained to a linear search.<A linear search on a small array is quite quick in a low level language like D, faster than a badly implemeted hash search (see LDC+Tango, the breaking point for an array of integers is about 15 numbers. DMD is better here). Bye, bearophile
Feb 17 2010
On 2010-02-17 11:24:43 -0500, bearophile <bearophileHUGS lycos.com> said:In Python there are no chars, there are just strings, a char is a string of length 1, so the __contains__ performs a substring search (with many optimizations, so if you look for a substring of length 1 it's faster, etc).Actually that's an interesting topic. You're suggesting that 'in' for arrays could search not only for elements but also for subarrays? The interesting part is that the syntax is very natural. The less interesting part is that it's slower and slightly different from 'in' in associative arrays. I'm on the fence. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 17 2010
Michel Fortin:You're suggesting that 'in' for arrays could search not only for elements but also for subarrays?<I have suggested this probably two years ago or more. The main difference compared the Python situation is that D is statically typed, so the compiler can tell apart the case of searching of a single item from the case of searching a subsequence. A similar case: in Python lists have the .append() and .extend() methods because Python is dynamically typed, so it can't tell apart the two cases. But the static type system of D allows you to just use ~= for both cases.The interesting part is that the syntax is very natural. The less interesting part is that it's slower and slightly different from 'in' in associative arrays. I'm on the fence.I am for it, because the operation is the same, "in" is used to search inside a collection, then the collection uses the faster method it has to answer. But D2 is now finalized, so this feature might need to wait after the phase of debugging and improvement of D2. -------------- Andrei Alexandrescu:Generally I'd be hesitant to make it all too easy to do linear searches. Too much of that would encourage people to stick with linear searches because it's the path of least resistance.<D programmers know the difference between linear search, hash search and search in an ordered search tree. They are O(n), O(1), and O(ln n). Do you want to use a different name for each of those searches (and maybe a (n log log n if you search with interpolated search, etc) just because their computational complexity is different, or do you want just a syntax that asks the collection to find something using the faster code it has? Even in my Python programs I am well aware of the performance difference between a linear search in a list and an hash search in a set. If I know I will need to perform a search operation often, on in an important loop, I will use a set, otherwise using a list (that is an array) is often good enough. The syntax to search is the same so you can swap data structure in a moment, eventually.Searching a value in a literal should actually be allowed: x in [10, 20, 30, 0]<This is very wrong. No special cases, please. If you don't want to support the linear search on all arrays, then please don't add this feature at all, I will keep using my isIn templated function (that digests AAs too, lazy iterables, etc too of course). I don't want more warts and special cases in D, thank you. As Python Zen says: "Special cases aren't special enough to break the rules." Such rules must be read and understood by D devs too, if you want to make a language that's not a mess as C++. Bye, bearophile
Feb 17 2010
bearophile wrote:Andrei Alexandrescu:I think you're giving D programmers too much credit. I think a best-effort search is not widely useful.Generally I'd be hesitant to make it all too easy to do linear searches. Too much of that would encourage people to stick with linear searches because it's the path of least resistance.<D programmers know the difference between linear search, hash search and search in an ordered search tree. They are O(n), O(1), and O(ln n). Do you want to use a different name for each of those searches (and maybe a (n log log n if you search with interpolated search, etc) just because their computational complexity is different, or do you want just a syntax that asks the collection to find something using the faster code it has?Searching an array literal vs. an array variable is not a special case. AndreiSearching a value in a literal should actually be allowed: x in [10, 20, 30, 0]<This is very wrong. No special cases, please. If you don't want to support the linear search on all arrays, then please don't add this feature at all, I will keep using my isIn templated function (that digests AAs too, lazy iterables, etc too of course). I don't want more warts and special cases in D, thank you. As Python Zen says: "Special cases aren't special enough to break the rules." Such rules must be read and understood by D devs too, if you want to make a language that's not a mess as C++.
Feb 17 2010
Andrei Alexandrescu:I think a best-effort search is not widely useful.<I meant using a hash search in a hash and a linear search in an array, etc. And having the same syntax to search in different type of collections is widely useful.Searching an array literal vs. an array variable is not a special case.<I can't agree. They are both arrays. Bye, bearophile
Feb 17 2010
Andrei Alexandrescu:Searching an array literal vs. an array variable is not a special case.<if (5 in [1, 2, 5]) { ... // OK if (5 in [1, 2, 5].dup) { ... // not OK If you can't see a special case there you are blind. Bye, bearophile
Feb 17 2010
bearophile wrote:Andrei Alexandrescu:What evidence do you have that it's widely useful?I think a best-effort search is not widely useful.<I meant using a hash search in a hash and a linear search in an array, etc. And having the same syntax to search in different type of collections is widely useful.Well I understand you don't agree from the frame you're now in, but you haven't heard my argument yet. Under such conditions you shouldn't say "can't"! Here goes my argument. 1. An array literal has no name and cannot be aliased, hence it's private to the implementation. The implementation is therefore free to induce structure on the searched elements, e.g. by sorting the array or using a hash. 2. The size of an array variable must be conservatively assumed to scale with the size of the input. The size of an array literal scales with the size of the program, or in the worst case (code generation) with statically-known inputs to the program. AndreiSearching an array literal vs. an array variable is not a special case.<I can't agree. They are both arrays.
Feb 17 2010
Andrei Alexandrescu wrote:1. An array literal has no name and cannot be aliased, hence it's private to the implementation.Oops. I meant to say an array literal in the expression "a in [ ... ]". Generally array literals definitely can be aliased :o). Andrei
Feb 17 2010
On 02/17/2010 10:25 PM, Andrei Alexandrescu wrote:What evidence do you have that it's widely useful?I use it once in a while in python, evidence enough? :)Well I understand you don't agree from the frame you're now in, but you haven't heard my argument yet. Under such conditions you shouldn't say "can't"! Here goes my argument. 1. An array literal has no name and cannot be aliased, hence it's private to the implementation. The implementation is therefore free to induce structure on the searched elements, e.g. by sorting the array or using a hash. 2. The size of an array variable must be conservatively assumed to scale with the size of the input. The size of an array literal scales with the size of the program, or in the worst case (code generation) with statically-known inputs to the program.I don't see how this is any argument against opIn_r for array variables as well. I mean, the argument against it seems to be to make it more difficult to do a linear search over an array, which seems rather backwards to me.
Feb 17 2010
Pelle Månsson wrote:On 02/17/2010 10:25 PM, Andrei Alexandrescu wrote:Not more difficult, but not deceivingly easy either. I like how you can search with simple syntax in an associative array but not in a linear array. AndreiWhat evidence do you have that it's widely useful?I use it once in a while in python, evidence enough? :)Well I understand you don't agree from the frame you're now in, but you haven't heard my argument yet. Under such conditions you shouldn't say "can't"! Here goes my argument. 1. An array literal has no name and cannot be aliased, hence it's private to the implementation. The implementation is therefore free to induce structure on the searched elements, e.g. by sorting the array or using a hash. 2. The size of an array variable must be conservatively assumed to scale with the size of the input. The size of an array literal scales with the size of the program, or in the worst case (code generation) with statically-known inputs to the program.I don't see how this is any argument against opIn_r for array variables as well. I mean, the argument against it seems to be to make it more difficult to do a linear search over an array, which seems rather backwards to me.
Feb 17 2010
Hello Michel,I agree that it'd be handy to have 'in' and '!in' work with arrays, especially with array literals: if (x in [1,2,5,6,10]) do(something); Perhaps this syntax could be allowed but only for array literals. The reason being that array literals are usually short so a linear search wouldn't be too bad. Moreover, an array literal being a literal, it's easy for the compiler to optimize things; you're not constrained to a linear search.How about expanding that to compile time constants of any kind. From that, even with long arrays, you should always be able to get n-log(n) performance (tree search) or better (translate it into a normal AA 'in' or even a perfect hash). -- ... <IXOYE><
Feb 17 2010
On 2010-02-17 12:56:51 -0500, BCS <none anon.com> said:How about expanding that to compile time constants of any kind. From that, even with long arrays, you should always be able to get n-log(n) performance (tree search) or better (translate it into a normal AA 'in' or even a perfect hash). --I agree with this. I wrote an example of it in my reply to Andrei. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 17 2010
bearophile Wrote:I've seen this change: http://dsource.org/projects/dmd/changeset/388 !in is one of the things I have asked in the first list of requests I have posted in this newsgroup a lot of time ago :-) Glad to see few of those things get implemented. Thank you. The related thing I was asking for, is to use "in" for a linear search of single items inside an array, as in Python, a very handy thing that's used frequently (CPython it also implements an efficient algorithm to search for a substring: http://effbot.org/zone/stringlib.htm But probably Andrei wants such substring search to be a separated algorithm, even if it's very specifically tuned for strings only and not for arrays in general. I can understand this). Bye, bearophileI tried to rename the function opIn hoping it would get automatically assumed to be synonim to the "in" operator, but ended up having to put it on my "common.d" file and explicitly use it. public bool inArray(T)(in const(T)[] array, in T elem) pure { foreach(size_t i, T e; array) if (e==elem) return true; return false; } Would be nice also to have a general search function such as public int findFirst(T)(in const(T)[] haystack, in const(T)[] needle) pure in { assert(haystack.length!=0, "haystack.length on slices!() == 0"); assert(needle.length!=0, "haystack.length on slices!() == 0"); assert(haystack.length>=needle.length, "Needle is bigger than haystack."); } out (result) { assert(result>=-1); assert(result<=(haystack.length-needle.length)); } body { const int end=(haystack.length-needle.length)+1; for (int i=0; i<end; i++) if (haystack[i..i+needle.length]==needle) return i; return -1; } I found something similar on std.array but discovered that syntax was very obscure and ended up making my own functions.
Feb 17 2010
bearophile wrote:I've seen this change: http://dsource.org/projects/dmd/changeset/388 !in is one of the things I have asked in the first list of requests I have posted in this newsgroup a lot of time ago :-) Glad to see few of those things get implemented. Thank you. The related thing I was asking for, is to use "in" for a linear search of single items inside an array, as in Python, a very handy thing that's used frequently (CPython it also implements an efficient algorithm to search for a substring: http://effbot.org/zone/stringlib.htm But probably Andrei wants such substring search to be a separated algorithm, even if it's very specifically tuned for strings only and not for arrays in general. I can understand this).That algorithm is nice. Generally I'd be hesitant to make it all too easy to do linear searches. Too much of that would encourage people to stick with linear searches because it's the path of least resistance. Searching a value in a literal should actually be allowed: x in [10, 20, 30, 0] is great because the compiler has complete discretion in how to conduct the search (e.g. linear vs. binary vs. hash search; it can take the initiative of presorting the literal). But general search in an unstructured range... maybe not. That being said, the CPython string you pointed me to looks very interesting. Phobos has a Boyer-Moore search implementation, but, just like Fredrik Lundh, I'm unhappy that it does a fair amount of computation upfront and that it needs dynamic allocation. If someone has the time, please contribute a better implementation of find() in std.algorithm. Andrei
Feb 17 2010
On 2010-02-17 12:07:05 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Searching a value in a literal should actually be allowed: x in [10, 20, 30, 0] is great because the compiler has complete discretion in how to conduct the search (e.g. linear vs. binary vs. hash search; it can take the initiative of presorting the literal). But general search in an unstructured range... maybe not.Are you talking about literals or compile-time constants? A literal can be built using variables and functions, such as: x in [a, b, c, d, e] This would be mostly equivalent to this: x == a || x == b || x == c || x == d || x == e I'd tend to allow it as it makes it easier to write and read conditionals with repeated comparisons against the same variable. But I guess that's less important than supporting compile-time constants: const allowedCharacters = ['0','1','2','3','4','5','6','7','8','9']; if (x in allowedCharacters) ...; -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 17 2010
Michel Fortin wrote:On 2010-02-17 12:07:05 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:I was thinking CTFE-able array literals, but I think linear search in a short array (in a way all array literals are O(1)) is fair game too.Searching a value in a literal should actually be allowed: x in [10, 20, 30, 0] is great because the compiler has complete discretion in how to conduct the search (e.g. linear vs. binary vs. hash search; it can take the initiative of presorting the literal). But general search in an unstructured range... maybe not.Are you talking about literals or compile-time constants? A literal can be built using variables and functions, such as: x in [a, b, c, d, e] This would be mostly equivalent to this: x == a || x == b || x == c || x == d || x == e I'd tend to allow it as it makes it easier to write and read conditionals with repeated comparisons against the same variable.But I guess that's less important than supporting compile-time constants: const allowedCharacters = ['0','1','2','3','4','5','6','7','8','9']; if (x in allowedCharacters) ...;Indeed. Andrei
Feb 17 2010
Andrei Alexandrescu wrote:Indeed.Indubitably. (sips brandy)
Feb 17 2010
Walter Bright wrote:Andrei Alexandrescu wrote:What's the cultural reference? I know of "indubitably" used as a punchline to a great joke, but there's no brandy in that one. AndreiIndeed.Indubitably. (sips brandy)
Feb 17 2010
Andrei Alexandrescu wrote:Walter Bright wrote:Imagine a victorian era upper class gentlemen's club, where the members sit around sipping brandy and puffing on cigars, congratulating each other. (puffs on cigar)Andrei Alexandrescu wrote:What's the cultural reference? I know of "indubitably" used as a punchline to a great joke, but there's no brandy in that one.Indeed.Indubitably. (sips brandy)
Feb 17 2010
Walter Bright wrote:Andrei Alexandrescu wrote:My joke reference is better. Low-brow husband to low-brow woman: "What's with that vocabulary night class you go to, woman? I bet there's orgies in there!" "Indubitably." AndreiWalter Bright wrote:Imagine a victorian era upper class gentlemen's club, where the members sit around sipping brandy and puffing on cigars, congratulating each other. (puffs on cigar)Andrei Alexandrescu wrote:What's the cultural reference? I know of "indubitably" used as a punchline to a great joke, but there's no brandy in that one.Indeed.Indubitably. (sips brandy)
Feb 17 2010
Andrei Alexandrescu wrote:My joke reference is better.Harrumph. Pistols at dawn?
Feb 18 2010
Walter Bright wrote:Andrei Alexandrescu wrote:A stinky sock is my weapon of choice. AndreiMy joke reference is better.Harrumph. Pistols at dawn?
Feb 18 2010
Hello Walter,Andrei Alexandrescu wrote:Now that would be D-isaster! -- ... <IXOYE><My joke reference is better.Harrumph. Pistols at dawn?
Feb 18 2010
BCS wrote:Hello Walter,Not really. I failed at marksmanship.Andrei Alexandrescu wrote:Now that would be D-isaster!My joke reference is better.Harrumph. Pistols at dawn?
Feb 18 2010
Walter Bright wrote:BCS wrote:That means "really"! For the sake of D you'd better shoot me rather than the other way around :o). AndreiHello Walter,Not really. I failed at marksmanship.Andrei Alexandrescu wrote:Now that would be D-isaster!My joke reference is better.Harrumph. Pistols at dawn?
Feb 18 2010
On 02/18/2010 07:31 PM, Andrei Alexandrescu wrote:.. the other way around :o). Andrei<skeptical> with a stinky sock? </skeptical>
Feb 18 2010