Calculate element which is most divisible by all elements in the same array (better than O(n^2))

Question

I have done this example in O(n^2). Given a array, I do the following:

max_key = 0
for k in set(keys):
    count = 0
    for divisor in keys:
        if key < divisor: break
        if key% divisor == 0: count += 1
    if count > max_key: max_key = count
print(max_key)

An example of this would be:

keys = [2,4,8,2]

Then the element most divisible by all elements in the keys is 8 because there are 4 elements (2,2,4,8) that can divide 8.

Can anyone suggest an approach better than O(n^2) ?

The question is not important. I want to come up with something better than n^2. — kl_divergence, Aug 09 '21 at 17:39
So this is from an ongoing competition, or why are you hiding the source? — Kelly Bundy, Aug 09 '21 at 17:40
This is not from a competition. I want to reduce inference time for the problem so that the output is received faster. — kl_divergence, Aug 09 '21 at 17:42
Must there be an element that is divisible by **all** elements in the list? — Yahya, Aug 09 '21 at 17:44
I'm not convinced that's possible in general. Maybe if there are for example limits mentioned in the task, but since you're hiding that, I can't tell. — Kelly Bundy, Aug 09 '21 at 17:45
Not all, but an element would exist that would be most divisible by all elements in the same list compared to others. — kl_divergence, Aug 09 '21 at 17:45
Consider an array of values `[b, b**n, b**(2*n), ...]`. Each value is divisible by all smaller values, so the O(n^2) bound is tight. (I *think* this example is general enough that you can't easily detect the pattern to avoid having to make all the necessary pairwise comparisons.) — chepner, Aug 09 '21 at 17:46
You might have some luck by decomposing every number into its prime factors. — Mark Ransom, Aug 09 '21 at 17:46
So the question is: What element in the list that is divisible by the **maximum number (i.e. not necessarily all other elements)** of other elements in the list? — Yahya, Aug 09 '21 at 17:46
No, the question is what element in the list is most divisible by elements in the same list. — kl_divergence, Aug 09 '21 at 17:48
Still did not get it, in your example, `8` is divisible by the **rest** of the elements in the list. And please tag my username when you reply so I get notified. — Yahya, Aug 09 '21 at 17:50
`8` is divisible by all in this case, but it might be the case that the answer in other cases, may not be divisible by all. We want the one that is divisible by most elements. — kl_divergence, Aug 09 '21 at 17:52
Yes, this is what I said in my previous comment, and you said *no*. — Yahya, Aug 09 '21 at 17:53
@chepner Not sure how you get an O(n^2) lower bound from your example. — kaya3, Aug 09 '21 at 17:55
I am not saying its a lower bound, all I am saying is, if anything better than n^2 exists, please let me know. — kl_divergence, Aug 09 '21 at 17:56
Besides crashing, your code also doesn't seem to print, return or in any way remember the desired element. And it relies on the order of the numbers. I suggest you first make it work, i.e., fix all the errors, and *then* ask about efficiency. — Kelly Bundy, Aug 09 '21 at 18:14
Can I sort the elements of the list for "free" or will you "charge" me `O(n log n)`? — JonSG, Aug 09 '21 at 19:32
@JonSG So, you have an algorithm that's otherwise faster than O(n log n)? Let's see it :-) — Kelly Bundy, Aug 09 '21 at 19:34
@KellyBundy no, I do not :-P However, if you can guarantee that the inputs are presorted then a naïve result can be determined in about half the steps as the original code (though this would still be still quadratic) — JonSG, Aug 09 '21 at 20:18
You can do O(n * sqrt(max_item)), but that's not necessarily less than O(n^2). — Matt Timmermans, Aug 09 '21 at 20:28
@kaya3 My idea is that you have to perform at least O(n^2) divisions into order to compute the correct count for each value in order to determine the maximum. While there are special cases that would let you bypass some of the divisions, I suspect you can't detect them all quickly enough to avoid the worst-case behavior. — chepner, Aug 09 '21 at 21:38
@chepner Well, the example you proposed seems to be designed to be a best case, because the smallest value `b` divides every other value and could plausibly be found in O(n) time as a special case of a suitable algorithm. — kaya3, Aug 09 '21 at 21:45
@kaya3 Assuming you know `b`. Use a less predictable series of exponents and throw in some values that don't have `b` as a factor, and I suspect it will take you too long to detect the special case for the special case to matter. — chepner, Aug 09 '21 at 21:51
@chepner It takes O(n) time to find `b` by a linear search for the smallest value, and O(n) time to find the elements which it divides, and a reasonable algorithm might start by doing those two things. On inputs like your example, the algorithm would be able to terminate immediately after doing that. — kaya3, Aug 09 '21 at 21:53
Whoops, I was thinking we were searching for the element dividing the most other elements; we're actually searching for the element which is divided by the most other elements, but the point stands; it is reasonable for an algorithm to start with a linear search for the largest element and then test how many numbers divide it. — kaya3, Aug 09 '21 at 21:57
@kaya3 OK, I'll give you a list where the largest value is coprime to any other element. That approach will just reduce a problem if size `n` to a problem of size `n-1`. — chepner, Aug 09 '21 at 22:00
@chepner Yes, but that still doesn't prove that it takes at least O(n^2) time. The hypothetical algorithm could start with a linear search to beat your first example in O(n) time, and then fall back to something else which is better than just doing repeated linear searches. — kaya3, Aug 09 '21 at 22:04
I just gave you an input that fails for a linear-search based algorithm. If the list is pairwise coprime, you'll make O(n) O(n) passes to discover that every value is divisible only by exactly one value (itself). — chepner, Aug 09 '21 at 22:08
@chepner No, you're assuming that the next thing the algorithm has to do is another linear search, then another linear search, and so on. But it doesn't; an algorithm can start with a linear search to beat your example and then do something else afterwards. To prove O(n^2) as a tight bound you need to show that *every* possible algorithm can't do better than O(n^2). — kaya3, Aug 09 '21 at 22:09
I think the worst is to get elements that are all relatively prime; I don't see how to avoid quadratic behaviour obviously in that case. Could factor it for a pseudo-polynomial running time? — Neil, Aug 09 '21 at 23:18
@chepner What do you think about Memoization in this case? It seems appealing since modulo is transitive. — Yahya, Aug 09 '21 at 23:53
Memoization would just speed up the individual divisions. I think the concern is the number of pairing to consider, not how long each pairing takes to consider. — chepner, Aug 10 '21 at 00:10
@chepner if an element x is divisible by y. And the latter is divisible by a set G, then x is divisible by G hence can skip all its elements. — Yahya, Aug 10 '21 at 00:20
@Yahya You have to show that a set `G` that will significantly decrease the runtime will always exist. It's one thing to find optimizations that will work in particular cases; quite another to show that *every* input can benefit from such optimizations. — chepner, Aug 10 '21 at 00:23
@chepner Yes. I agree. Although `G` might have a high chance to exist for large collection, yet it is not guaranteed to be large enough to significantly help in Memoization. — Yahya, Aug 10 '21 at 00:27
At the moment, I'm voting to close this because the code to be optimized doesn't work. Any discussion how to make it more efficient is futile. — Ulrich Eckhardt, Aug 10 '21 at 05:36

Neil · Answer 1 · 2021-08-10T03:25:04.710

I think you could change one of the n into a factor that is pseudo-polynomial by inserting the numbers into a dict (amortized, expected).

keys = [2,4,8,2]

# https://stackoverflow.com/a/280156/2472827
# max key, O(keys)
max_key = max(keys)

# https://stackoverflow.com/a/6582852/2472827
# occurrence counter, O(keys)
count = dict()
for k in keys:
    count[k] = count.get(k, 0) + 1

# https://stackoverflow.com/a/18634675/2472827
# transform into an answer counter, O(keys)
answer = dict.fromkeys(keys, 0)

# https://stackoverflow.com/a/1602964/2472827
# fill the answer, O(keys * max_key/min_key)
for a in answer:
    max_factor = int(max_key / a)
    for factor in range(1, max_factor + 1):
        number = a * factor
        if number in answer:
            answer[number] += count[a]

# O(keys)
a = max(answer, key = answer.get)
print answer[a], "/", len(keys), "list items dividing", a

I think it works in O(n * max_n/min_n) (expected.) In this case, it is pretty good, but if you have a high dynamic range of values, it's easy to make it go slow.

score 0 · Answer 2 · answered Aug 10 '21 at 02:26

keys = [2,4,5,8,2]

We can try something like memoization (from dynamic programming) to speed up while not doing any repeated calculations.

First, let's keep a hashmap, which stores all the divisors for a number in that array.

num_divisor = {} # hashmap

We keep another hashmap to store if a value is present in the array or not (count of that number).

cnt_num = {2: 2, 4: 1, 5: 1, 8: 1}

Now, we run prime sieve up to max(keys) to find the smallest prime factor for each number up to max(keys).

Now, we traverse the array while factoring out each number (factoring is only O(logn) given we know the smallest prime factor of each number now),

pseudo-code

for a in keys:
  temp_a = a # copy
  while temp_a != 1:
    prime_factor = smallest_prime_factor[temp_a]
    temp_a = temp_a / prime_factor
    if solution for new temp_a is already known in num_divisor just update it from there (no recalculation)
    else:
      if new temp_a is in keys, we increment the solution in num_divisor by 1 and continue

overall complexity: max(keys) * log(max(keys)) [for seive] + n * log(max(keys))

This should work well if the keys are uniformly distributed. For cases like,

keys = [2, 4, 1001210], it will do lots of unnecessary computation, so in those cases, it is better to avoid the sieve and instead compute the prime factors directly or in extreme cases, the pairwise divisor calculation should outperform.

I don't think this works. Could you include real code so we can test? — Stefan Pochmann, Aug 10 '21 at 03:13
My test would btw be something like `keys = [6, 30]`. For the 30, you'd right away go to 15 and never find out that it is divisible by the 6. — Stefan Pochmann, Aug 10 '21 at 03:20

Dave · Answer 3 · 2021-08-10T12:40:54.923

You could potentially improve your code by:

Account for duplicates by putting keys in a counting map first (e.g. so you don't have to parse the '2' twice in your example). This helps if there's a lot of repetition.
If the square root of the value being checked is smaller than the number of keys, check up to the square root of the value being checked (together with the value being checked divided by its divisors). This helps if there are lots of numbers having square roots that are smaller than the total number of elements.

E.g. If we're checking 30 and the list is big, we only need to check: 1 up to 5 to see if they divide 30 and their counts, as well as 30 divided by any of its divisors in this range (30/1=30, 30/2=15, 30/3=10, 30/5=6) and their counts.

E.g. if we're checking 10^100+17, and there are 10 items total, just check each of them in turn.

Neither of these affect worst case analysis since an adversary could choose inputs where they're useless. They may help in the problems you need to solve depending on your inputs, and may help more broadly if you have some guarantees on the inputs.

I don't understand why you don't want to check if 30 is divisible by 6, 10 or 15. — Ulrich Eckhardt, Aug 10 '21 at 05:33
@UlrichEckhardt They're covered by "as well as 30 divided by any of its divisors in this range". — Kelly Bundy, Aug 10 '21 at 09:56

kaya3 · Answer 4 · 2021-08-09T22:50:07.550

-1

Let's think about this a different way: instead of an array of numbers, consider a directed acyclic graph where each number becomes a vertex in the graph, and there is an edge from u → v if and only if u divides v. The problem is now to find a vertex with the greatest in-degree.

Note that we can't actually count the in-degree of a vertex directly in less than Θ(n) time, so the naive solution is to count the in-degree of every vertex in Θ(n²) time. To do better than Θ(n²), we need to take advantage of the fact that if there are edges u → v → w then there is also an edge u → w; we get knowledge of three edges for the price of two divisibility tests. If there are edges u → v → w → x then three divisibility tests can buy us knowledge of six edges in the graph, and so on.

However, crucially, we only get "free" information about edges if the numbers we test are divisible by each other. If we do a divisibility test and the result is negative, we get no "free" information about other possible edges. So in the worst case, there is only one edge in the graph (i.e. all the numbers in the array are not multiples of each other, except for one pair). For an algorithm to output the correct result, it must find the single number in the array which has a divisor in the array, but each divisibility test which doesn't find this pair gives no "free" information. So in the worst case, we would indeed have to test every pair to find the correct answer. This proves that a worst-case time complexity of Θ(n²) is the best you can do in the general* case.

^{*In case the array elements are bounded, then a pseudopolynomial algorithm could plausibly do better.}

edited Aug 09 '21 at 22:50

answered Aug 09 '21 at 22:07

kaya3

47,440
4
68
97

1

In fact, one can do better than `O(n^2)` in a little. – Yahya Aug 09 '21 at 22:11
@Yahya That is equivalent to saying you have an algorithm which, given an array of numbers where only one pair is divisible, you can find which pair it is without having to test every pair. So how do you propose to do that? – kaya3 Aug 09 '21 at 22:15
1

Does the "in the general case" rule out Matt's proposal, or what do you think about that? – Kelly Bundy Aug 09 '21 at 22:17
1

@KellyBundy Assuming you mean the proposal to do it in `O(n * sqrt(max element))` time, then as he points out, this isn't "better than O(n^2)" because it depends on the sizes of the elements in the array, which in general could be larger than n^2. If we want to be strictly formal, then the input size is measured in bits and the numeric value of an array element could be up to exponential in the input size. If it's known that there is a bound on the array elements, then there could well be a faster algorithm, but it's a different problem. – kaya3 Aug 09 '21 at 22:22
Yes, that's what I meant. Hmm... what if we combine the two algorithms, first spending O(n) time to decide which one to use, giving us O(min(n^2, n*sqrt(max element)))? Wouldn't that be better then, the same way that one class was called better than another by [this guy](https://stackoverflow.com/a/68543782/12671057)? :-) – Kelly Bundy Aug 09 '21 at 22:36
Yes, technically that would be better and I didn't word my answer carefully enough. What I mean is that no algorithm can have a better worst-case time complexity than Θ(n^2), and that includes such a construction. I do appreciate your sense of humour as usual though (-: – kaya3 Aug 09 '21 at 22:46

Calculate element which is most divisible by all elements in the same array (better than O(n^2))

4 Answers4