digitalmars.D - Numerically largest entry in a trie of digits
- H. S. Teoh (15/15) Apr 19 2022 Just thought I'd pick all the bright minds here: I have a set of numbers
- rikki cattermole (2/2) Apr 19 2022 It sounds to me like you want to build a separate index for the trie
- Steven Schveighoffer (4/19) Apr 19 2022 Maybe store the longest chain in each node when inserting? Then you look...
- Stanislav Blinov (5/18) Apr 19 2022 Maybe maintain overall order via a linked list (head and tail
- Dom DiSc (4/7) Apr 19 2022 Is it necessary to store decimal numbers?
- H. S. Teoh (11/19) Apr 19 2022 Obviously, the same holds for decimal numbers. The question is, how to
- =?UTF-8?Q?Ali_=c3=87ehreli?= (7/12) Apr 19 2022 I think Dom DiSc meant zero pad all numbers to have the same maximum
- H. S. Teoh (13/27) Apr 19 2022 [...]
- H. S. Teoh (14/22) Apr 19 2022 [...]
- Gregor =?UTF-8?B?TcO8Y2ts?= (8/11) Apr 20 2022 Just to get the dumb answer out of the way: we need to assume
- Alexandru Ermicioi (6/19) Apr 20 2022 Perhaps have each node children sorted first by:
Just thought I'd pick all the bright minds here: I have a set of numbers in string form (as decimal digits), stored in a trie. How do I retrieve the largest number in the set? My initial guess was to find the rightmost child in the trie. Unfortunately, this is incorrect, because the rightmost child may not necessarily be the largest numerically. For example, consider the set {1, 9, 10}. In trie form, 10 would be a child of 1 (it would be stored as the digit path 1 -> 0), and the rightmost child would be 9, which is less than 10. It would appear that what I want is the rightmost *longest* entry in the trie. Is there a more efficient way to compute this, short of brute-force traversal over the entire trie? T -- Change is inevitable, except from a vending machine.
Apr 19 2022
It sounds to me like you want to build a separate index for the trie rather than do something clever.
Apr 19 2022
On 4/19/22 12:35 PM, H. S. Teoh wrote:Just thought I'd pick all the bright minds here: I have a set of numbers in string form (as decimal digits), stored in a trie. How do I retrieve the largest number in the set? My initial guess was to find the rightmost child in the trie. Unfortunately, this is incorrect, because the rightmost child may not necessarily be the largest numerically. For example, consider the set {1, 9, 10}. In trie form, 10 would be a child of 1 (it would be stored as the digit path 1 -> 0), and the rightmost child would be 9, which is less than 10. It would appear that what I want is the rightmost *longest* entry in the trie. Is there a more efficient way to compute this, short of brute-force traversal over the entire trie?Maybe store the longest chain in each node when inserting? Then you look for the longest chain first, and rightmost if there's more than one. -Steve
Apr 19 2022
On Tuesday, 19 April 2022 at 16:35:21 UTC, H. S. Teoh wrote:Just thought I'd pick all the bright minds here: I have a set of numbers in string form (as decimal digits), stored in a trie. How do I retrieve the largest number in the set? My initial guess was to find the rightmost child in the trie. Unfortunately, this is incorrect, because the rightmost child may not necessarily be the largest numerically. For example, consider the set {1, 9, 10}. In trie form, 10 would be a child of 1 (it would be stored as the digit path 1 -> 0), and the rightmost child would be 9, which is less than 10. It would appear that what I want is the rightmost *longest* entry in the trie. Is there a more efficient way to compute this, short of brute-force traversal over the entire trie? TMaybe maintain overall order via a linked list (head and tail pointers for the whole tree + next/prev on each node), modified at insertion/removal of nodes? That should provide O(1) access to largest and smallest number.
Apr 19 2022
On Tuesday, 19 April 2022 at 16:35:21 UTC, H. S. Teoh wrote:Just thought I'd pick all the bright minds here: I have a set of numbers in string form (as decimal digits), stored in a trie. How do I retrieve the largest number in the set?Is it necessary to store decimal numbers? Because if they would be stored binary, the highest number would automatically be among the longest strings.
Apr 19 2022
On Tue, Apr 19, 2022 at 09:16:18PM +0000, Dom DiSc via Digitalmars-d wrote:On Tuesday, 19 April 2022 at 16:35:21 UTC, H. S. Teoh wrote:Obviously, the same holds for decimal numbers. The question is, how to efficiently locate it in the trie (without traversing the entire trie). T -- A linguistics professor was lecturing to his class one day. "In English," he said, "A double negative forms a positive. In some languages, though, such as Russian, a double negative is still a negative. However, there is no language wherein a double positive can form a negative." A voice from the back of the room piped up, "Yeah, yeah."Just thought I'd pick all the bright minds here: I have a set of numbers in string form (as decimal digits), stored in a trie. How do I retrieve the largest number in the set?Is it necessary to store decimal numbers? Because if they would be stored binary, the highest number would automatically be among the longest strings.
Apr 19 2022
On 4/19/22 14:28, H. S. Teoh wrote:On Tue, Apr 19, 2022 at 09:16:18PM +0000, Dom DiSc via Digitalmars-dwrote:I think Dom DiSc meant zero pad all numbers to have the same maximum length? If there is indeed a maximum length that does not incur too much cost (I think tries naturally don't have that kind of cost?), then the rightmost will be the highest number. (?) AliIs it necessary to store decimal numbers? Because if they would be stored binary, the highest number would automatically be among the longest strings.Obviously, the same holds for decimal numbers.
Apr 19 2022
On Tue, Apr 19, 2022 at 02:37:07PM -0700, Ali Çehreli via Digitalmars-d wrote:On 4/19/22 14:28, H. S. Teoh wrote:[...] !!! Ali, you're a genius! Pad the numbers with 0's up to some maximum number of digits (for this particular trie I'm not expecting more than a handful of digits anyway), and that would ensure numerical order == lexicographic order, so the problem becomes trivial. It also simultaneously solves a bunch of other issues I have, all stemming from the fact that with digit strings of non-equal lengths, numerical order != lexicographic order. Thanks!!! T -- Why have vacation when you can work?? -- ECOn Tue, Apr 19, 2022 at 09:16:18PM +0000, Dom DiSc via Digitalmars-dwrote:I think Dom DiSc meant zero pad all numbers to have the same maximum length? If there is indeed a maximum length that does not incur too much cost (I think tries naturally don't have that kind of cost?), then the rightmost will be the highest number. (?)Is it necessary to store decimal numbers? Because if they would be stored binary, the highest number would automatically be among the longest strings.Obviously, the same holds for decimal numbers.
Apr 19 2022
On Tue, Apr 19, 2022 at 05:12:30PM -0700, H. S. Teoh via Digitalmars-d wrote:On Tue, Apr 19, 2022 at 02:37:07PM -0700, Ali Çehreli via Digitalmars-d wrote:[...]!!! Ali, you're a genius! Pad the numbers with 0's up to some maximum number of digits (for this particular trie I'm not expecting more than a handful of digits anyway), and that would ensure numerical order == lexicographic order, so the problem becomes trivial. It also simultaneously solves a bunch of other issues I have, all stemming from the fact that with digit strings of non-equal lengths, numerical order != lexicographic order.[...] Hmm, actually, it turns out that padding with 0's breaks the very reason I'm using a trie in the first place (it's for detecting potential matches of a set of numbers based on an incomplete input prefix). So it's a no-go, even though it was a genius idea. :-( Looks like the best solution is just to separately index the set with a sorted linear array... Steven's idea of storing max lengths in each node does work for this specific problem, but it makes other operations needlessly hard. So also no-go. :-/ T -- Be in denial for long enough, and one day you'll deny yourself of things you wish you hadn't.
Apr 19 2022
On Tuesday, 19 April 2022 at 16:35:21 UTC, H. S. Teoh wrote:Just thought I'd pick all the bright minds here: I have a set of numbers in string form (as decimal digits), stored in a trie. How do I retrieve the largest number in the set?Just to get the dumb answer out of the way: we need to assume that the trie is updated dynamically, right? So just trivially taking the maximum while inserting all the members won't work. In the dynamic case, keeping a separate sorted list sounds best if memory usage is of no concern to you. If you really need only the largest value and performance is a concern, a max heap would probably be faster than a fully sorted list.
Apr 20 2022
On Tuesday, 19 April 2022 at 16:35:21 UTC, H. S. Teoh wrote:Just thought I'd pick all the bright minds here: I have a set of numbers in string form (as decimal digits), stored in a trie. How do I retrieve the largest number in the set? My initial guess was to find the rightmost child in the trie. Unfortunately, this is incorrect, because the rightmost child may not necessarily be the largest numerically. For example, consider the set {1, 9, 10}. In trie form, 10 would be a child of 1 (it would be stored as the digit path 1 -> 0), and the rightmost child would be 9, which is less than 10. It would appear that what I want is the rightmost *longest* entry in the trie. Is there a more efficient way to compute this, short of brute-force traversal over the entire trie? TPerhaps have each node children sorted first by: - having themselves a child - digit itself Then rightmost/leftmost chain of nodes should give what you want. Alexandru.
Apr 20 2022