Is there an O(N) solution for this algorithm?

Question

includes duplicates and the reversed ordered pairs that add up to sum

numbers = [1, 2, 4, 4, 4, 4, 5, 5, 5, 7, 7, 8, 8, 8, 9]

match = []
for i in range(len(numbers)):
    for j in range(len(numbers)):
        if (i!=j):
            if(numbers[i] + numbers[j] == sum):
                match.append([numbers[i], numbers[j]])

I need to check for matches as well as duplicates, so the output needs to look like [[1, 9], [2, 8], [2, 8], [2, 8], [5, 5], [5, 5], [5, 5], [5, 5], [5, 5], [5, 5], [8, 2], [8, 2], [8, 2], [9, 1]]

If all the numbers are equal to `sum/2`, there will be O(n^2) pairs, thus O(n) (or more generally anything faster than O(n^2)) is impossible for the general case (because no algorithm can have a lower time complexity than the size of output it produces). — Bernhard Barker, Jun 16 '17 at 16:18
You'd do well to describe better textually what the algorithm is supposed to do. Code tells the computer what to do, only your comments can tell us what you _intended_ to do. — Richard, Jun 16 '17 at 16:18
includes duplicates and the reversed ordered pairs that add up to sum — joe, Jun 16 '17 at 16:19
The approaches given in [Design an algorithm to find all pairs of integers within an array which sum to a specified value](https://stackoverflow.com/questions/1494130/design-an-algorithm-to-find-all-pairs-of-integers-within-an-array-which-sum-to-a) should work with minor modifications. See also [2sum with duplicate values](https://stackoverflow.com/questions/18129171/2sum-with-duplicate-values) for a longer explanation as to why O(n^2) is impossible. — Bernhard Barker, Jun 16 '17 at 16:44
@joe what does that even mean includes duplicated? Can you explain what your trying to do. I'm pretty sure this can be an O(n log n). And you can have your second for look start at what ever the count i is. But you need to explain what your trying to do — johnny 5, Jun 16 '17 at 17:40

hiro protagonist · Answer 1 · 2017-06-17T06:10:30.900

this is a variant:

from collections import Counter


def sum_target(lst, target):

    unique_list = list(set(lst))
    counts = Counter(lst)

    remainders = {number: target-number for number in unique_list}

    match = set()
    for a, b in remainders.items():
        if b in remainders:
            match.add(tuple(sorted((a, b))))

    # restore multiplicities:
    ret = []
    for a, b in match:
        if a != b:
            mult = counts[a] * counts[b]
            ret.extend([a, b] for _ in range(mult))
            ret.extend([b, a] for _ in range(mult))
            # ret.append((mult, (a, b)))
            # ret.append((mult, (b, a)))
        else:
            mult = counts[a] * (counts[a]-1)
            ret.extend([a, a] for _ in range(mult))
            # if mult != 0:
                # ret.append((mult, (a, b)))

    return ret

not entirely sure the multiplicities are handled the way you want (in your example they are twice the ones i get... why are there 6 versions of [5, 5]? with three 5s in your original list you only get 3 pairs...)

if you have many duplicates in your list then given the length of your output i suggest changing the last part of my code to

# restore multiplicities:
ret = []
for a, b in match:
    if a != b:
        mult = counts[a] * counts[b]
        ret.append((mult, (a, b)))
        ret.append((mult, (b, a)))
    else:
        mult = counts[a] * (counts[a]-1)
        if mult != 0:
            ret.append((mult, (a, b)))

to get the result as

numbers = [1, 2, 4, 4, 4, 4, 5, 5, 5, 7, 7, 8, 8, 8, 9]
s = 10
print(sum_target(lst=numbers, target=s))
# [(3, (8, 2)), (3, (2, 8)), (6, (5, 5)), (1, (1, 9)), (1, (9, 1))]

Are you sure this algorithm is O(N)... by using lst.count inside a loop doesn't that make it O(N^2) — joe, Jun 16 '17 at 16:22
you have a point there... replaced it with a `Counter` object that transverses the list once only and has dict-speed lookup. — hiro protagonist, Jun 16 '17 at 16:28

Aleksei Shestakov · Answer 2 · 2017-06-16T17:05:36.640

0

As is was mentioned in comments there is no O(n) solution for general case, because in worst case you need to output n^2 pairs of numbers -- obviously it can't be done in O(n) time. But there is linear solution for the case when there is no equal numbers in array, you just need to use dictionary:

numbers = [1, 2, 4, 4, 4, 4, 5, 5, 5, 7, 7, 8, 8, 8, 9]
tgr = 10

num2idx = {numbers[i]: i for i in xrange(len(numbers))}
for i in xrange(len(numbers)):
    x = numbers[i]
    y = trg - x
    if y in num2idx:
        j = num2idx[y]
        while j >= 0 and numbers[j] == y:
            if i != j:
                print x, y
            j -= 1

edited Jun 16 '17 at 17:05

answered Jun 16 '17 at 16:32

Aleksei Shestakov

2,508
2
13
14

How fast is the above code? Is this actually faster than my original? You also said if theres no equal numbers in the array.. but there is.. – joe Jun 16 '17 at 16:35
Well, it has complexity `O(n^2)` and it's the best you can get -- because for the worst case (array like [5,5,5,5,5]) there is no algorithm faster than `O(n^2)`. Although this algorithm has the same complexity as yours, usually it will be faster because it doesn't waste time on pairs that don't match condition, and for the case with all different numbers it will even have linear time of work. – Aleksei Shestakov Jun 16 '17 at 16:53
Is there a way to write it without the enumerate keywords? I'm not too familiar with python and having a little trouble understanding exactly what it's doing – joe Jun 16 '17 at 16:59
Also I tested this with unsorted numbers and it doesn't seem to work.. Does this only work assuming the numbers are sorted? – joe Jun 16 '17 at 17:03
What exactly is "num2idx = {numbers[i]: i for i in xrange(len(numbers))}" – joe Jun 16 '17 at 17:07
Yep, only with sorted array. Sorting of an array takes `O(n log n)` so it doesn't ruin the overall complexity. `num2idx` is a map from number to its last index in array. – Aleksei Shestakov Jun 16 '17 at 17:09
so this + sort still > my original? – joe Jun 16 '17 at 17:11
It still has the same complexity `O(n^2)` (because `O(n^2 + n log n)` equals to `O(n^2)`) and still faster on random array. – Aleksei Shestakov Jun 16 '17 at 17:18
For a start - you could use `num2idx = {val: idx, for idx, val in enumerate(numbers)}`... – Jon Clements Jun 16 '17 at 17:31

G_M · Answer 3 · 2017-06-16T17:31:33.297

In[50]: numbers
Out[50]: [1, 2, 4, 4, 4, 4, 5, 5, 5, 7, 7, 8, 8, 8, 9]
In[51]: expected_result
Out[51]: 
[[1, 9],
 [2, 8],
 [2, 8],
 [2, 8],
 [5, 5],
 [5, 5],
 [5, 5],
 [5, 5],
 [5, 5],
 [5, 5],
 [8, 2],
 [8, 2],
 [8, 2],
 [9, 1]]
In[52]: from collections import Counter
   ...: 
   ...: 
   ...: def matches(nums, target):
   ...:     cnt = Counter(nums)
   ...:     result = []
   ...:     for num in nums:
   ...:         diff = target - num
   ...:         result.extend([[num, diff]] * (cnt[diff] - (diff == num)))
   ...:     return result
   ...: 
In[53]: matches(numbers, target=10) == expected_result
Out[53]: True

Is there an O(N) solution for this algorithm?

3 Answers3