How to find 2-sum in linear time?

Question

I enrolled in the Algorithms, Part II course on Coursera, and one of the interview questions (ungraded) is as follows:

2-sum. Given an array a of n 64-bit integers and a target value T, determine whether there are two distinct integers i and j such that a[i] + a[j] = T. Your algorithm should run in linear time in the worst case.

Hint: sort the array in linear time.

I can think of solving it in several ways:

Insert the elements in a hash table in one pass. Then make a 2nd pass, looking for T - a[i] in the hash table. Space complexity is O(n) for the hash table, and O(n) for the 2 passes. This solution meets the time requirement stated in the question.
Sort the array, and then run 2 pointers i and j from the beginning and the end, respectively, looking for a[i] + a[j] = T. If a[i] + a[j] < T, increment i, else decrement j. Space complexity is dependent on the sorting algorithm; assuming, Quick Sort, no additional space necessary. Time complexity, nlogn, so it fails the time requirement stated in the question.
Considering that the question is given after the Radix Sort lecture, I'm assuming the intent is to use one of the Radix Sorts. Since the question specifies 64-bit integers, using binary representation of long, and using In-place MSD radix sort, the array can be sorted in-place in linear time. This seems like the best approach.

Other ideas?

P.S. I've seen this question, but it assumes a sorted array, and all the answers there use some kind of hashing.

I've also seen this question but it seems overly complicated for the specific case of 2-sum.

@lexicore the marked question is NOT a duplicate, because the array is specified to be sorted there and not here! — xrisk, Jun 10 '18 at 10:40
@pamcevoy Not necessary, This is a question on algorithms, and it should be abundantly clear that I’ve given it some thought. Do you’ve anything to add to this discussion? — Abhijit Sarkar, Jun 10 '18 at 17:21
With Radix sort you might be able to reject a few high numbers from the initial array immediately. If T = 15, say, then any number with bit 5 or higher can immediately be rejected, shortening the array to be searched for matches. — rossum, Jun 10 '18 at 20:45
Use solution 1, but a single pass (check each element you process against the hash; If such a pair exists, when you get to the 2nd element the first will be in the hash. — Dave, Jun 11 '18 at 17:21

NiVeR · Answer 1 · 2018-06-10T17:33:46.767

2

I analyzed your observations and regarding the one that seems to meet the requirement:

Insert the elements in a hash table in one pass. Then make a 2nd pass, looking for T - a[i] in the hash table. Space complexity is O(n) for the hash table, and O(n) for the 2 passes. This solution meets the time requirement stated in the question.

I believe that this approach doesn't meet the requirements as the hashtable worst case insert complexity theoretically is O(n).

As you said that you studied the Radix Sort, I believe that's the way to go, you can sort the array within the time requirements and afterwards you can use the Two Pointers technque to check if there is the sum T:

int left = 0, right = array_size - 1;
boolean found = false;
while (left < right) {

    if (a[left] + a[right] == T) {              
        found = true;
    }
    else if (a[left] + a[right] < T) {
        left++;
    }
    else {
        right--;
    }
}

edited Jun 10 '18 at 17:33

answered Jun 10 '18 at 11:25

NiVeR

9,644
4
30
35

The worst case complexity depends only on the compatibility or incompatibility of the hash function. As you know, with a bad hash function the entire hashtable is "useless". Could we not take for granted in these scenarios that the hash function does its job and that the complexity is indeed O(1) amortized? – JohEker Jun 10 '18 at 11:34
1

@JohEker We can discuss on this topic, but anyway I don't think that this is the approach to solve this problem. The hint suggests that sorting needs to be done. – NiVeR Jun 10 '18 at 11:38
It’s inconceivable that a hash function for integers would devolve into O(n). While it’s theoretically possible, we know the allowed range of input, and the fact that Java OOTB hash code would do just fine. If we had no idea about the input, then perhaps I’d worry about hash collisions. That said, I’m inclined towards Radix sort too simply because it’s a learning exercise from that chapter. – Abhijit Sarkar Jun 10 '18 at 17:31
@AbhijitSarkar I agree but yeah, it was clear that the sorting was the way to go. Did it solve your problem as expected? – NiVeR Jun 10 '18 at 17:34
I upvoted you, but your suggestion of using Radix sort was already mentioned in my question, so without working code, I don't feel like something additional was offered. I posted working code for those who might stumble across this post in the future. – Abhijit Sarkar Jun 11 '18 at 04:30
@AbhijitSarkar i disagree. Radix sort is the obvious part known from algorithms and data structures class from university. The key algorithm is what is done afterwards, which i did provide. – NiVeR Jun 11 '18 at 06:07

score 0 · Answer 2 · answered Jun 10 '18 at 12:06

As you traverse the array, put the values into a hash mapping the value to index. Since we are only looking for the sum of two numbers, look for the sum of the current number and the remainder in the hash to get the target.

public static int[] twoSumInOnePass(int[] values, int target) throws Exception {
    // value => index
    Map<Integer, Integer> valueToIndexMap = new HashMap<Integer, Integer>();
    for (int i = 0; i < values.length; i++) {
        int remainder = target - values[i];
        if (valueToIndexMap.containsKey(remainder)) {
            return new int[] { valueToIndexMap.get(remainder), i };
        }
        valueToIndexMap.put(values[i], i);
    }

    throw new Exception("Could not find indexes that sum to " + target);
}

https://leetcode.com/problems/two-sum/description/

It's just 1 pass. The values are inserted as we go and checked in the same pass. At each index we are asking if there exists a[j] with j < i such that a[j] + a[i] = target. — pamcevoy, Jun 11 '18 at 01:06

score 0 · Answer 3 · answered Jun 11 '18 at 04:26

Answering my own question, here's the working solution in Java:

public class RadixSortsInterviewQuestions {
    private static final int MSB = 64;

    static Map.Entry<Integer, Integer> twoSum(long[] a, long sum) {
        int n = a.length - 1;
        sort(a, MSB, 0, n);

        for (int i = 0, j = n; i < j; ) {
            long t = a[i] + a[j];
            if (t == sum) {
                return new SimpleImmutableEntry<>(i, j);
            } else if (t < sum) {
                i++;
            } else {
                j--;
            }
        }
        return null;
    }

    // Binary MSD radix sort: https://en.wikipedia.org/wiki/Radix_sort#In-place_MSD_radix_sort_implementations
    private static void sort(long[] a, int d, int lo, int hi) {
        if (hi < lo || d < 1) return;

        int left = lo - 1;
        int right = hi + 1;

        for (int i = left + 1; i < right; ) {
            if (isBitSet(a[i], d)) {
                swap(a, i, --right);
            } else {
                left++;
                i++;
            }
        }
        sort(a, d - 1, lo, left);
        sort(a, d - 1, right, hi);
    }

    private static boolean isBitSet(long x, int k) {
        boolean set = (x & 1L << (k - 1)) != 0;

        // invert signed bit so that all positive integers come after negative ones
        return (k == MSB) != set;
    }

    private static void swap(long[] a, int i, int j) {
        long tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

How to find 2-sum in linear time?

3 Answers3