Finding a testcase which fails the code in the 4 sum problem

Question

We need to find whether there exists 4 numbers a, b, c and d (all numbers should be at different indices) in an array whose sum equals to a constant k.

Now its hashing based solution goes like this: Make a hash having key as sum of every pair in array and value as array of pairs of indices whose sum is the key. Now iterate over every pair in array and try to find the remaining sum in the hash table we made, while also checking that no 2 indices should be common.

While the above solution is fine, a solution I saw on geeksforgeeks.com did this: In the hash table the value is a pair instead of array of pairs. It only stores the last pair which concludes to a sum. It clearly looks wrong to me but I still can't find a test case where it fails.

Their code:

    // A hashing based  CPP program to find if there are    
    // four elements with given sum. 
    #include <bits/stdc++.h>     
    using namespace std; 

    // The function finds four elements with given sum X     
    void findFourElements (int arr[], int n, int X)    
    {     
        // Store sums of all pairs in a hash table    
        unordered_map<int, pair<int, int>> mp;     
        for (int i = 0; i < n-1; i++)     
            for (int j = i+1; j < n; j++)    
                mp[arr[i] + arr[j]] = {i, j};      

        // Traverse through all pairs and search     
        // for X - (current pair sum).      
        for (int i = 0; i < n-1; i++)     
        {     
            for (int j = i+1; j < n; j++)     
            {     
                int sum = arr[i] + arr[j];     
          
                // If X - sum is present in hash table,                 
                if (mp.find(X - sum) != mp.end())     
                {               
                    // Making sure that all elements are     
                    // distinct array elements and an element 
                    // is not considered more than once.     
                    pair<int, int> p = mp[X - sum];     
                    if (p.first != i && p.first != j &&     
                            p.second != i && p.second != j)     
                    {     
                        cout << arr[i] << ", " << arr[j] << ", "     
                             << arr[p.first] << ", "     
                             << arr[p.second];     
                        return;     
                    }     
                }     
            }     
        }     
    }     
          
    // Driver program to test above function     
    int main()     
    {     
        int arr[] = {10, 20, 30, 40, 1, 2};     
        int n = sizeof(arr) / sizeof(arr[0]);     
        int X = 91;     
        findFourElements(arr, n, X); 
        return 0; 
    }

How can I find a testcase where this code fails, or if it is correct, how?

If the array was: [1 2 3 6 2 3] and the sum was 12 then you could match pairs (1,4) and (2,3) or you could match (1,4) with (5,6). You could store a hash with a key of 5 with the array [(2,3) (5,6)] but what's the point? You only need one pair that sums to 5 unless you are trying to find ALL the combination of four items that sums to 12. Are you just trying to find one or all of the combinations? — Jerry Jeremiah, Nov 26 '19 at 21:11
Ok, this isn't a proof but I think they are right. If there are repeated values in the array they will store the index of the last value but then when the search happens they will match against the other one. I can't think of a case where it wouldn't work. Maybe you need to ask at mathematics.stackexchange.com ... — Jerry Jeremiah, Nov 26 '19 at 21:48
[Do not use `#include `](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h) and [avoid `using namespace std;`](https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice). Individually they can cause problems, but together they can transform into a destroyer of souls as well as a waster of time. — user4581301, Nov 26 '19 at 22:01

ph3rin · Accepted Answer · 2019-11-29T21:24:55.253

The algorithm is correct. Consider a quadruple (a, b, c, d) which satisfies the following: (1) arr[a] + arr[b] + arr[c] + arr[d] == k; (2) a < b < c < d.

It is obvious that four distinct element of the array sum to k if and only if such quadruple (a, b, c, d) exists.

Now consider the pair (a, b). You have mentioned the program records the last pair (e, f) (e < f) that is a compliment of (a, b) (i.e. arr[a] + arr[b] + arr[e] + arr[f] == k). Note that since (e, f) is the last pair with such property, so e >= c. Therefore a < b < e < f. Now we have found a valid quadruple (a, b, e, f).

Since the second loop traverse through all pairs, the pair (a, b) must have been visited, and the quadruple must have been detected. Therefore the algorithm is correct.

score -1 · Answer 2 · answered Nov 26 '19 at 21:48

-1

It only stores the last pair which concludes to a sum.

Not quite. It stores all of the pairs, just like you stored all of your arrays of length 2. Their algorithm does that here:

// Store sums of all pairs in a hash table 
unordered_map<int, pair<int, int>> mp; 
for (int i = 0; i < n-1; i++) 
    for (int j = i+1; j < n; j++) 
        mp[arr[i] + arr[j]] = {i, j};

{i, j} is a pair consisting of i as the first value and j as the second.

I think you're confused about what happens here:

pair<int, int> p = mp[X - sum]; 
if (p.first != i && p.first != j && 
        p.second != i && p.second != j)

They're pulling a pair out of the map. Notably, the pair that they're matching with to form the X sum. They could do:

if (mp[X - sum].first != i && mp[X - sum].first != j &&
        mp[X - sum].second != i && mp[X - sum].second != j)

But that's both ugly and a lot of map lookups. So instead they decide to copy the pair they're concerned with in a local variable, p.

They then make sure that neither of the indexes in p are those they're looking at now, i and j. Does that make sense?

answered Nov 26 '19 at 21:48

scohe001

15,110
2
31
51

How does it store all the pairs with the value of the unordered_map only being a pair? Shouldn't it be vector of pairs to store all pairs? – Gyanshu Nov 27 '19 at 03:39
@Gyan it's not clear to me what you're asking. A map (unordered or not) is a collection of `(key, value)` pairs. The value here is an `std::pair`, so the map is a collection of mapped sums like `sum -> (index1, index2)`. A vector of pairs would work, but it'd require you to do an O(n^2) loop to find the pair that sums to what you're looking for (and recompute those sums every time you run that loop). With an unordered_map, that lookup is on average O(1), and you've already got the sums computed once and stored. – scohe001 Nov 27 '19 at 04:24
@scohe001 Consider `mp[10] = {2,8}` followed by `mp[10] = {3,7}`. Then ask yourself, "what is the value of `mp[10]`?" The point is that even though the code "stores" both pairs in the map, only one of those pairs is retained. The pair that's retained is the last pair for a given key. – user3386109 Nov 27 '19 at 18:41

Finding a testcase which fails the code in the 4 sum problem

2 Answers2