26

Design an algorithm to find all pairs of integers within an array which sum to a specified value.

I have tried this problem using a hash table to store entries for the sum of array elements, but it is not an efficient solution.

What algorithm can I use to solve this efficiently?

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
Rachel
  • 100,387
  • 116
  • 269
  • 365
  • It was not an assignment question. Thank you all for the inputs. I had tried this problem using Hash table to store entry for sum of array elements but it is not the efficient solution and so want to know what are thoughts of other stackoverflow readers on it. Again, thanks for the inputs. – Rachel Sep 29 '09 at 20:25
  • The accepted answer here is ... less than ideal, given that the problem can be solved without the added complexity of binary search. – Bernhard Barker Jun 16 '17 at 17:05
  • 1
    Possible duplicate of [Find a pair of elements from an array whose sum equals a given number](https://stackoverflow.com/questions/4720271/find-a-pair-of-elements-from-an-array-whose-sum-equals-a-given-number) – Tot Zam Aug 20 '17 at 14:46

15 Answers15

34

I don't see why the hash table approach is inefficient, at least in algorithm analysis terms - in memory locality terms admittedly, it can be quite bad. Anyway, scan the array twice...

First scan - put all the array elements in the hash table - O(n) total. Individual inserts are only amortized O(1), but a neat thing about how amortized analysis works means the O(n) is absolute - not amortized.

Second scan - check for (sum - current) in the hash table - O(n) total.

This beats the O(n log n) sort-and-search methods, at least in theory.

Then, note that you can combine the two scans into one. You can spot a pair as soon as you encounter the second of that pair during the first scan. In pseudocode...

for i in array.range
  hashset.insert (array [i])

  diff = sum - array [i]
  if hashset.includes (diff)
    output diff, array [i]

If you need positions of the items, use a hashmap and store item positions in it. If you need to cope with duplicates, you might need to store counts in a hashmap. For positions and duplicates, you might need a hashmap of start pointers for linked lists of positions.

This makes assumptions about the hash table implementation, but fairly safe ones given the usual implementations in most current languages and libraries.

BTW - combining the scans shouldn't be seen as an optimisation. The iteration overhead should be insignificant. Memory locality issues could make a single pass slightly more efficient for very large arrays, but the real memory locality issues will be in the hashtable lookups anyway.

IMO the only real reason to combine the scans is because you only want each pair reported once - handling that in a two-scan approach would be a bit more hassle.

  • 1
    The question is ambiguous as to if the pair returned should be from unique array indexes. The first two solutions presented will output a solution when the duplicate of a single value in the array is the sum. For example if `sum=10` and `array=[5,1,2,3]`. The solution will output `(5,5)`. The second solution can be modified to move the insert at the end of the loop to avoid this behavior. – harshpatel991 Jul 29 '17 at 19:02
  • 1
    @harshpatel991 - good point about the ambiguity, I never thought of that. If you're saying my first (two-pass) approach assumes it's OK to use the same value twice, that's correct - you'd probably need a count or duplicate-detected flag in the hash table to handle the alternative interpretation. –  Jul 29 '17 at 23:21
  • 1
    If you're saying my second (one-pass) approach does the same, re-reading it now, I just assumed your fix from the start - even though the included code doesn't do that. As often happens, it's very easy to miss that you're making assumptions, even when those assumptions are contradicted - well spotted. –  Jul 29 '17 at 23:24
19
  1. If the array is sorted:

    Let i = 0, j = end of array, sum = the value you are looking for, then do:

    If i+j = sum, then output (i,j).
    If i+j < sum, then move i to the right one position.
    If i+j > sum, then move j to the left one position.

    Time complexity: O(n). Space complexity: O(1).

  2. If the array is not sorted, there are a few ways to approach this problem:

    1. Sort the array and then use the above approach.

    2. HashMap:

      Store all elements in a HashMap.

      a+b=sum, so b=sum-a. For each element a of the array, look up b from the HashMap.

      HashMap lookup takes amortized O(1).

      Time complexity: O(n). Space complexity: O(n).

    3. BitMap:

      Iterate through the input to create a bitmap where each bit corresponds to an element value. Say the input is {2,5,8}, then we toggle the bitmap array's indices 2, 5 and 8 from binary 0 to 1. This takes O(1) per element, thus O(n) in total.

      Go through the input again. We know b=sum-a, so for every element a in the input, look up its b, which can be done in O(1) since it's a bitmap index. This also takes O(n) in total.

      Time complexity: O(n) + O(n) = O(n). Space complexity: bitmap space = O(n).

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
Srujan Kumar Gulla
  • 5,721
  • 9
  • 48
  • 78
12

You don't even need to store all the elements in hashmap, and then scan. You can scan during the first iteration itself.

void foo(int[] A, int sum) {
    HashSet<Integer> set = new HashSet<Integer>();
    for (int e : A) {
        if (set.contains(sum-e)) {
            System.out.println(e + "," + (sum-e));
            // deal with the duplicated case
            set.remove(sum-e);
        } else {
            set.add(e);
        }
    }
}
Xiang
  • 143
  • 1
  • 4
  • 10
Ashok Bijoy Debnath
  • 1,493
  • 1
  • 17
  • 27
6

How about sorting the array, then marching in from both ends?

Beta
  • 96,650
  • 16
  • 149
  • 150
  • That's exactly where I was heading. Keeps number of operations to a minimum. – Bomlin Sep 29 '09 at 18:21
  • The sorting of the array itself can furher be optimized, by keeping a tab of the smallest and biggest number during the initial pass and calculating a maximum (and possibly a minimum) threshold that can be used to eliminate numbers that are too big (and possibly too small) during the next iteration. – mjv Sep 29 '09 at 18:39
  • 2
    You can discard numbers that are too big only if you assume that the integers are non-negative. And how can you discard a number for being too small? – Beta Sep 29 '09 at 20:12
  • I agree with @Beta. It gets too complicated if you have negative numbers. Which was not mentioned in the question. That's why I resorted to a full sort and binary search. – Ayman Sep 30 '09 at 04:22
5

Assume required sum = R

  1. sort the array
  2. for each number in the array A(n), do a binary search to find the number A(x) such that A(n) + A(x) = R
Ayman
  • 11,265
  • 16
  • 66
  • 92
  • 3
    No need to search - you can do this much more efficiently than that. – Nick Johnson Sep 29 '09 at 21:30
  • Efficiently in what sense? The sort is O(n log n), the cost of n binary searches is only O(n log n). Optimizing the searches is therefore pretty pointless. Do you mean you have an answer better than O(n log n) overall? – Alice Purcell Sep 29 '09 at 21:47
  • 4
    Isn't binary search O(log n)? – Ayman Sep 30 '09 at 04:19
  • 2
    If the array is already sorted, your algorithm is O( n log n ), but O(n) is possible. – leiz Sep 30 '09 at 05:48
  • Indeed, the sort is O(n log n), but the search can be O(n) by moving in from each end of the array. The overall complexity is still O(n log n), but that doesn't mean that the latter solution isn't strictly more efficient. – Nick Johnson Sep 30 '09 at 12:02
  • @NickJohnson But moving pointers in both directions doesn't work in case we have duplicate elements in the array.I meant we will miss a duplicate pair. – crackerplace Mar 11 '15 at 18:51
  • @whokares it won't if you keep a list of pairs for current item. e.g. for current item, you store the pairs in a list. when moving pointer to next item, you check if it's duplicate, if so, simply copy the list to result list and move on (you don't have to search one more time for the next item) – xialin May 09 '16 at 04:27
2

If you don't mind spending O(M) in space, where M is the sum you are seeking, you can do this in O(N + M) time. Set sums[i] = 1 when i <= M on a single pass over N, then check (sums[i] && sums[M-i]) on a single pass over M/2.

Blazemonger
  • 90,923
  • 26
  • 142
  • 180
1

Implemented in Python 2.7:

import itertools
list = [1, 1, 2, 3, 4, 5,]
uniquelist = set(list)
targetsum = 5
for n in itertools.combinations(uniquelist, 2):
    if n[0] + n[1] == targetsum:
    print str(n[0]) + " + " + str(n[1])

Output:

1 + 4
2 + 3
Erict2k
  • 15
  • 4
1

We can use C++ STL map to solve this

void subsetSum(int arr[], int n, int sum)
{
    map<int, int>Map;

    for(int i=0; i<n; i++)
    {
        Map[arr[i]]++;
        if(Map.count(sum-arr[i]))
        {
            cout<<arr[i]<<" "<<sum-arr[i]<<"\n";
        }
    }
}
pacholik
  • 8,607
  • 9
  • 43
  • 55
rashedcs
  • 3,588
  • 2
  • 39
  • 40
  • Using std::unordered_map this is the HashMap method (without the work of having to set up the hash map yourself.) – Don Slowik Aug 01 '21 at 00:30
  • I think the order of increment a Map element and checking if the count of element [sum-arr[i]] has to be reversed to deal with arr[i] == sum - arr[i] case. – Don Slowik Aug 01 '21 at 00:39
1
#include <iostream>
using namespace std;

#define MAX 15

int main()
{
 int array[MAX] = {-12,-6,-4,-2,0,1,2,4,6,7,8,12,13,20,24};
 const int find_sum = 0;
 int max_index = MAX - 1;
 int min_index = 0;
 while(min_index < max_index)
 {
  if(array[min_index] + array[max_index-min_index] == find_sum)
  {
   cout << array[min_index] << " & " << array[max_index-min_index] << " Matched" << endl;
   return 0;
  }
  if(array[min_index]+array[max_index-min_index] < find_sum)
  {
   min_index++;
   //max_index++;
  }
  if(array[min_index]+array[max_index-min_index] > find_sum)
  {
   max_index--;
  }
 }
 cout << "NO MATCH" << endl;
 return 0;
}
//-12 & 12 matched
Zaki
  • 6,997
  • 6
  • 37
  • 53
devsathish
  • 2,339
  • 2
  • 20
  • 16
0

Here is a solution witch takes into account duplicate entries. It is written in javascript and assumes array is sorted. The solution runs in O(n) time and does not use any extra memory aside from variable.

var count_pairs = function(_arr,x) {
  if(!x) x = 0;
  var pairs = 0;
  var i = 0;
  var k = _arr.length-1;
  if((k+1)<2) return pairs;
  var halfX = x/2; 
  while(i<k) {
    var curK = _arr[k];
    var curI = _arr[i];
    var pairsThisLoop = 0;
    if(curK+curI==x) {
      // if midpoint and equal find combinations
      if(curK==curI) {
        var comb = 1;
        while(--k>=i) pairs+=(comb++);
        break;
      }
      // count pair and k duplicates
      pairsThisLoop++;
      while(_arr[--k]==curK) pairsThisLoop++;
      // add k side pairs to running total for every i side pair found
      pairs+=pairsThisLoop;
      while(_arr[++i]==curI) pairs+=pairsThisLoop;
    } else {
      // if we are at a mid point
      if(curK==curI) break;
      var distK = Math.abs(halfX-curK);
      var distI = Math.abs(halfX-curI);
      if(distI > distK) while(_arr[++i]==curI);
      else while(_arr[--k]==curK);
    }
  }
  return pairs;
}

So here it is for everyone.

Start at both side of the array and slowly work your way inwards making sure to count duplicates if they exist.

It only counts pairs but can be reworked to

  • find the pairs
  • find pairs < x
  • find pairs > x

Enjoy and don't forget to bump it if its the best answer!!

Drone Brain
  • 419
  • 3
  • 8
0

A solution that takes into account duplicates and uses every number only one time:

void printPairs(int[] numbers, int S) {
    // toMap(numbers) converts the numbers array to a map, where
    // Key is a number from the original array
    // Value is a count of occurrences of this number in the array
    Map<Integer, Integer> numbersMap = toMap(numbers); 

    for (Entry<Integer, Integer> entry : numbersMap.entrySet()) {
      if (entry.getValue().equals(0)) {
        continue;
      }
      int number = entry.getKey();
      int complement = S - number;
      if (numbersMap.containsKey(complement) && numbersMap.get(complement) > 0) {
      for (int j = 0; j < min(numbersMap.get(number), 
                              numbersMap.get(complement)); j++) {
        if (number.equals(complement) && numbersMap.get(number) < 2) {
           break;
        }
        System.out.println(number, complement);
        numbersMap.put(number, numbersMap.get(number) - 1);
        numbersMap.put(complement, numbersMap.get(complement) - 1);
      }
    }
  }
}
Vitalii Fedorenko
  • 110,878
  • 29
  • 149
  • 111
0

Hashtable solution, in Ruby (quite straightforward to understand):

value = 100
pairs = [1,99,5,95]
hash_of_pairs = {}

pairs.map! do |pair|
  # Adds to hashtable the pair
  hash_of_pairs[pair] = pair

  # Finds the value the pair needs
  new_pair = hash_of_pairs[value - pair]

  # Returns the pair whenever the pair exists, or nil
  [new_pair, pair] if !new_pair.nil?
end.compact! # Cleans up the array, removing all nil values

print pairs # [[1,99], [5,95]]
thiagofm
  • 5,693
  • 4
  • 20
  • 26
0
@Test
public void hasPairWithSum() {
  assertFalse(hasPairWithSum_Ordered_Logarithmic(new int[] { 1, 2, 3, 9 }, 8));
  assertTrue(hasPairWithSum_Ordered_Logarithmic(new int[] { 1, 2, 4, 4 }, 8));

  assertFalse(hasPairWithSum_Ordered_Linear(new int[] { 1, 2, 3, 9 }, 8));
  assertTrue(hasPairWithSum_Ordered_Linear(new int[] { 1, 2, 4, 4 }, 8));

  assertFalse(hasPairWithSum_Unsorted_Linear(new int[] { 9, 1, 3, 2 }, 8));
  assertTrue(hasPairWithSum_Unsorted_Linear(new int[] { 4, 2, 1, 4 }, 8));

  assertFalse(hasPairWithSum_Unsorted_Quadratic(new int[] { 9, 1, 3, 2 }, 8));
  assertTrue(hasPairWithSum_Unsorted_Quadratic(new int[] { 4, 2, 1, 4 }, 8));
}

private boolean hasPairWithSum_Ordered_Logarithmic(int[] data, int sum) {
  for (int i = 0; i < data.length; i++) {
    int current = data[i];
    int complement = sum - current;
    int foundIndex = Arrays.binarySearch(data, complement);
    if (foundIndex >= 0 && foundIndex != i) {
      return true;
    }
  }
  return false;
}

private boolean hasPairWithSum_Ordered_Linear(int[] data, int sum) {
  int low = 0;
  int high = data.length - 1;
  while (low < high) {
    int total = data[low] + data[high];
    if (total == sum) {
      return true;
    } else if (total < sum) {
      low++;
    } else {
      high--;
    }
  }
  return false;
}

private boolean hasPairWithSum_Unsorted_Linear(int[] data, int sum) {
  Set<Integer> complements = Sets.newHashSet();
  for (int current : data) {
    if (complements.contains(current)) {
      return true;
    }
    complements.add(sum - current);
  }
  return false;
}

private boolean hasPairWithSum_Unsorted_Quadratic(int[] data, int sum) {
  for (int i = 0; i < data.length; i++) {
    int current = data[i];
    int complement = sum - current;
    for (int j = 0; j < data.length; j++) {
      if (data[j] == complement && i != j) {
        return true;
      }
    }
  }
  return false;
}
RKumsher
  • 2,787
  • 2
  • 14
  • 11
0

Creating a hash table and then looking for value in it.

function sum_exist(num : number, arr : any[]) {
    var number_seen = {};
    for(let item of arr){
        if(num - item in number_seen){
            return true
        }
        number_seen[item] = 0;
    }
    return false;
}

Test case (using Jest)

test('Given a list of numbers, return whether any two sums equal to the set number.', () => {
    expect(sum_exist(17 , [10, 15, 3, 7])).toEqual(true);
});


test('Given a list of numbers, return whether any two sums equal to the set number.', () => {
    expect(sum_exist(16 , [10, 15, 3, 7])).toEqual(false);
});
Asif Amin
  • 16
  • 3
0
#python 3.x
def sum_pairs(list_data, number):    
    list_data.sort()
    left = 0
    right = len(list_data)-1
    pairs = []
    while left < right:        
        if list_data[left]+list_data[right] == number:
            find_pairs = [list_data[left], list_data[right]]
            pairs.append(find_pairs)
            right = right-1
        elif list_data[left]+list_data[right] < number:
            left = left+1
        else:
            right = right-1
    return pairs
rajanbhadauria
  • 445
  • 4
  • 11