9

Possible Duplicate:
Given two arrays a and b .Find all pairs of elements (a1,b1) such that a1 belongs to Array A and b1 belongs to Array B whose sum a1+b1 = k .

Given : An unsorted array A of integers
Input : An integer k

Output : All the two element set with sum of elements in each set equal to k in O(n).

Example:

A = {3,4,5,1,4,2}

Input : 6
Output : {3,3}, {5,1}, {4,2}

Note : I know an O(n logn) solution but that would require to have the array sorted. Is there any way by which this problem can be solved in O(n). An non-trivial C++ data structure can be used i.e there's no bound on space

Community
  • 1
  • 1
Debanjan Chanda
  • 93
  • 1
  • 1
  • 3

5 Answers5

18

Make a constant-time lookup table (hash) so you can see if a particular integer is included in your array (O(n)). Then, for each element in the array, see if k-A[i] is included. This takes constant time for each element, so a total of O(n) time. This assumes the elements are distinct; it is not difficult to make it work with repeating elements.

  • Technically, short of a perfect hash function, a hash-table is only amortized time O(1). Though a clever solution nonetheless. If however the question to whether `3` can be used twice is *no*, then you'll have to count each member during hash construction. – edA-qa mort-ora-y Apr 12 '11 at 05:18
  • 1
    There's *no bound on space at all*, according to the question, and yet this is C++ in which integers are typically bounded, so you have a perfect hash function - the identity function. You just need a mega huge hash table (lookup table really), which takes very large - but *constant* - time to initialise! Tada! ;) What this proves is that complexity analysis can be misleading, especially when you postulate infinite memory. – Robin Green Apr 12 '11 at 17:35
  • @Robin A hash is different from the identity function in that it is indeed linear to the number of inputs. The identity function is limited to integers, and takes as much time as the interval between the smallest and largest elements; it is in no sense "linear". –  Apr 13 '11 at 03:32
  • @bdares The identity function doesn't take any time at all, when properly optimised out; it doesn't do anything. – Robin Green Apr 13 '11 at 04:27
  • @Robin: Infinite space is not the same as Infinite _initialized_ space. How do you initialize an infinite space hash table in constant time? I am pretty sure by unbounded space Debanjan meant you can allocate extra arrays etc. Isn't it a known fact that working say with O(n^2) space will require O(n^2) time? So I don't really see any credibility to your argument here. – user127.0.0.1 Apr 13 '11 at 17:45
  • btw, On some research, here is one which talks about using unsigned integer arrays without initializing them: http://eli.thegreenplace.net/2008/08/23/initializing-an-array-in-constant-time/. I believe these are called sparse arrays. So if we are allowed to do a random access of the infinite memory using the integer value as a 'pointer'/offset into an array/lookup table, then we have an easy O(n) solution. That seems like cheating though. – user127.0.0.1 Apr 13 '11 at 21:23
1

There are k pairs of integers that sum to k: {0,k}, {1,k-1}, ... etc. Create an array B of size k+1 where elements are boolean. For each element e of the array A, if e <= k && B[e] == false, set B[e] = true and if B[k-e] == true, emit the pair {e,k-e}. Needs to be extended slightly for negative integers.

cbare
  • 12,060
  • 8
  • 56
  • 63
1

Just a simple algorithm off the top of my head:

  • Create a bitfield that represents the numbers from 0 to k, labeled B
  • For each number i in A
    • Set B[i]
    • If B[k-i] is set, add (i, k-i) to the output

Now as people have raised, if you need to have two instances of the number 3 to output (3, 3) then you just switch the order of the last two statements in the above algorithm.

Also I'm sure that there's a name for this algorithm, or at least some better one, so if anyone knows I'd be appreciative of a comment.

Dominik Grabiec
  • 10,315
  • 5
  • 39
  • 45
  • Your algorithm takes time O(k*n), where k is the target number. This is much, much worse than O(n). –  Apr 12 '11 at 04:53
  • No, it's O(n), where n is determined by the length of A. The bitfield is a constant time lookup O(1) not a linear time lookup O(n) which you are thinking. – Dominik Grabiec Apr 12 '11 at 05:48
  • Yes, you're right. I misread your algorithm, and assumed you were reading every slot in B. –  Apr 12 '11 at 05:54
  • 1
    This algo needs O(k) space and equivalent to the proposed (java) hashset solutions (in c++ set is a tree) – Timofey Jan 30 '14 at 01:57
  • I think you might be confused Tim, I don't use a set data structure. I use a k sized bitfield, so that's 1 bit for each number from 0 to k, hence it uses `O( ceil( k / 8 ) )` bytes if you want to be pedantic. – Dominik Grabiec Jan 30 '14 at 08:17
  • The implementation http://ideone.com/jOOZX4 – puneet Oct 16 '14 at 01:52
1

http://codepad.org/QR9ptUwR

This will print all pairs. The algorithm is same as told by @bdares above.

I have used stl maps as we dont have hash tables in STL.

Vink
  • 1,019
  • 2
  • 9
  • 18
  • `std::map<>::insert(x)` is O(log N), and `std::map<>::count(x)` is O(N). Isn't your algorithm thus O(N^2)? – Robᵩ Apr 12 '11 at 04:54
  • @Rob Adams, I guess, count in map is O(logN) as it does not hold multiple elements with same key. Count in multimap is O(N). My algorithm above is O(NlogN). If we use a hashmap instead of map, we would have O(N) algorithm. – Vink Apr 12 '11 at 05:04
  • As of C++11, we kind of do: [unordered_map](http://en.cppreference.com/w/cpp/container/unordered_map) – Chiffa Sep 25 '13 at 22:40
  • 1
    why not to post the code here ;-) – Timofey Jan 30 '14 at 01:53
1

One can reduce the,

Element Uniqueness bit,

to this. No O(n).

Tarzan.
  • 11
  • 1
  • 1
    Can you give some hint what you mean by that? Or a link to a definition of element uniqueness bit and to the reduction? – kap Jan 22 '15 at 11:14