Count inversions in two arrays

Question

An inversion in an array is a pair of indices(i,j) such that a[i]>a[j] and i < j.

Given 2 arrays A and B and we have to return number of such pairs such that a[i]>b[j] and i < j.

Example :

Let n=3 and A[]=[5,6,7] and B[]=[1,2,3] then answer is 3. 3 pairs are (5,2) , (5,3) and (6,3).

My code :

#include <stdio.h>
#include <stdlib.h>
int main()
{
    int len;
    scanf("%d",&len);
    int a[len];
    int b[len];
    for(int i = 0; i < len; i++)
       scanf("%d",&a[i]);
    for(int i = 0; i < len; i++)
       scanf("%d",&b[i]);
    int count = 0;
    for (int i = 0;i < len; i++)
    {
        for(int j = i+1; j < len; j++)
        {
             if(a[i] > b[j])
             {
                 count++;
             }
         }
     }
     printf("%d",count);
}

But this is O(N^2) solution.I need a better solution as N<=200000.I know we can count inversions in same array in O(N*Log N) time.But how this can be done for two different arrays ?

Are the arrays sorted and while choosing a pair, do we have to choose one element from each array? — Abhishek Bansal, Dec 23 '14 at 13:25
@user1990169 Yeah you need to choose one element from each array and arrays need NOT be sorted — prtyush, Dec 23 '14 at 13:26
Is there an algorithms stack exchange site? Seems like this would be a good question for that one. — Barry, Dec 23 '14 at 13:32
This is from an ongoing contest @ http://www.hackerearth.com/christmas-coding-marathon/algorithm/inversions-revisited/ Is this considered cheating? — Rohan Verma, Dec 23 '14 at 15:25
Is your solution this solution https://www.quora.com/How-can-we-solve-a-count-inversion-problem-when-two-different-arrays-are-given ? — hlscalon, Dec 23 '14 at 15:35
Shouldn't this be tagged with C instead of C++? Your example implementation is C. — Kimi, Dec 23 '14 at 15:43
@RohanVerma Yes, but Stack Overflow's management doesn't care. — David Eisenstat, Dec 23 '14 at 16:29
@Rohan on the other hand, hacker earth might care if they can connect this user to a participant — Niklas B., Dec 23 '14 at 18:11
How is this a duplicate... READ people. The question is different and the solutions are also different — Niklas B., Dec 24 '14 at 14:13
I vote to reopen. The question that someone said this was a duplicate of only talked about a single array, but this question is dealing with two arrays. — ArtOfWarfare, Dec 24 '14 at 14:46
This question appears to be off-topic because it is about cheating by getting answers to this contest: http://www.hackerearth.com/login/?next=/christmas-coding-marathon/algorithm/inversions-revisited/ — LittleBobbyTables - Au Revoir, Jan 02 '15 at 14:45

score 6 · Accepted Answer · edited May 23 '17 at 11:59

6

I've written in the past about how to count inversions using a Fenwick tree, which is a very efficient type of binary tree that lets you compute prefix aggregations on a sequence.

Here is an adhoc modifcation for your scenario:

long long inversions(const vector<int>& a, const vector<int>& b) {
  int n = a.size();
  vector<int> values(a);
  for (int x: b) values.push_back(x);
  sort(begin(values), end(values));
  vector<int> counts(2*n + 1);
  long long res = 0;
  for (int i = n - 1; i >= 0; --i) {
    // compute sum of prefix 1..rank(a[i]) - 1
    for (int v = lower_bound(begin(values), end(values), a[i]) - begin(values);
         v; 
         v -= v & -v)
      res += counts[v];
    //add 1 to point rank(b[i])
    for (int v = lower_bound(begin(values), end(values), b[i]) - begin(values) + 1;
         v <= 2*n;
         v += v & -v)
      counts[v]++;
  }
  return res;
}

Basically we walk through the arrays from right to left, maintaining a data structure that represents the values of a we have already seen in the suffix. For every element b[i], we add to the final result the number of of elements x in the data structure with x <= b[i] - 1. Then we add a[i] to the data structure.

The array values is used to compress the range of values to 1..2n because Fenwick trees take space linear in the range size. We could avoid that step by choosing a more fullfeatured data structure like a balanced bjnary search tree with subtree size augmentation.

The complexity is O(n log n), and the constant factor is very low.

edited May 23 '17 at 11:59

Community

1
1

answered Dec 23 '14 at 13:55

Niklas B.

92,950
18
194
224

Can you tell what you are trying to do as i am bad in python. – prtyush Dec 23 '14 at 13:58
Zero-initialize a BIT `counts`, go through the array b in reverse, add for each element b[i] the number of items in counts that are smaller than b[i]. Insert a[i] into counts. – Niklas B. Dec 23 '14 at 14:06
I am not getting it..:( – prtyush Dec 23 '14 at 14:17
I added C++ code. The vector `values` and the calls to lower_bound are just for coordinate compression to bring the real values into the range 1..2n. The two inner for loops are the prefix-sum and update-point primitives of the binary-indexed tree. You will find lots of resources about those on the internet. – Niklas B. Dec 23 '14 at 14:36
How are you finding the rank ?Remaining i understand pretty well – prtyush Dec 23 '14 at 14:37
Just binary search in the sorted array of all the values that occur in a or b – Niklas B. Dec 23 '14 at 14:39
Its crashing what are these v; and v<=2*n; ? – prtyush Dec 23 '14 at 14:46
These are the terminating conditions for the loop. Check the topcoder page for the explanation of these routines. I can't test the code right now because I'm on a phone but I had an error in the initial version. Are you sure you are loking at the newest version? Also I assume that a.size () == b.size () – Niklas B. Dec 23 '14 at 14:57
Well at least you have the idea. You will have to do the debugging on your own for now. – Niklas B. Dec 23 '14 at 15:03
Also I found another problem and fixed it in the newest revision. That would have definitely caused a crash – Niklas B. Dec 23 '14 at 15:05
This equation, and even your original python one has issues with it. It appears to have a more accurate number with larger amounts of inversions, lets say `1, 2, 3, 4, 5` 5, 2, 3, 4, 1` Gives incorrect number. Also in the cases that there is the equivalent to 1 inversion, it gives 0 in many instances it clearly isn't I am not sure if your original single array one worked that you linked, but nothing you've posted here actually gives accurate numbers. – Adam namrog84 Jan 13 '15 at 03:26
@Adamnamrog84 For my convenience, can you suggest values of `a` and `b` where the function gives the wrong output so I can debug it faster? I admit I never got around to properly test it after writing it initially on a smartphone. I *am* however pretty sure that I random-tested the Python code very properly against a brute-force implementation, including small and large instances. – Niklas B. Jan 13 '15 at 09:55
@Adamnamrog84 Ok nevermind it's simple to find counterexamples. I'll fix it. EDIT I just had `a` and `b` swapped :) Now it should work properly – Niklas B. Jan 13 '15 at 10:18
@NiklasB. - How can we solve for array with duplicate values as `[1, 3, 4, 5, 6, 4]` using merge sort ? – huzefa biyawarwala Nov 16 '20 at 16:42

score 1 · Answer 2 · answered Mar 29 '19 at 13:03

You can find inversions between two arrays using merge sort idea!

consider you have two arrays of same size, lets call them A, B if we denote first half of A, A1 and second half of A, A2 and B1 and B2 respectively for B then we can conclude that answer is sum of:

inversions between A1 and B1
inversions between A2 and B2
inversions between A1 and B2

first two elements can be supported by calling the function recursively, but how to calculate third element?

the idea is to go through A1 and B2 from left to right, any where element in B1 is greater than element in A1 , then elements in A1 which are not visited yet should be add up to number of inversions, and at the end we just sort A1 and A2 to A and B1 and B2 to B

here is the code in python:

def find_inv(A, B):

if len(A) <= 1:
    return 0
mid = len(A) // 2
A1 = A[:mid]
A2 = A[mid:]
B1 = B[:mid]
B2 = B[mid:]

if len(A1) >= 1 and len(B2) >= 1:
    ans = find_inv(A1, B1) + find_inv(A2, B2)
else:
    A.sort()
    B.sort()
    ans = 0
len_A = len(A1)
index_A = 0
len_B = len(B2)
index_B = 0

for k in range(len_A + len_B):

    if A1[index_A] <= B2[index_B]:

        index_A += 1
        if index_A == len_A:
            merge(A1, A2, A)
            merge(B1, B2, B)

            return ans
    else:
        index_B += 1
        ans += (len_A - index_A)

        if index_B == len_B:
            merge(A1, A2, A)
            merge(B1, B2, B)
            return ans

def merge(A1, A2, dest):
i = 0
j = 0


while i < len(A1) and j < len(A2):
    if A1[i] < A2[j]:
        dest[i+j] = A1[i]
        i += 1
    else:
        dest[i+j] = A2[j]
        j += 1
while i < len(A1):
        dest[i+j] = A1[i]
        i += 1

while j < len(A2):
        dest[i+j] = A2[j]
        j += 1

Abhishek Bansal · Answer 3 · 2014-12-23T14:19:13.190

-1

One idea:
1. Merge the two original arrays such that each pairs of elements with the same indices are adjacent to one other. (You will need to place the elements such that the lower one comes before the higher one).
3. Count the number of inversions in the resulting array as explained below.

edit: sorry I had misinterpreted the question. If you want inversions from only a(I) to b (j) then you can mark each element with another field arraytype (a or b). Then while mergesorting you can increment the count only if the inversion is from arraytype a to b.

edited Dec 23 '14 at 14:19

answered Dec 23 '14 at 13:34

Abhishek Bansal

12,589
4
31
46

An pseudo code or a piece of code will definatly explain it a bit more. – prtyush Dec 23 '14 at 13:35
Explain this "Now merge the two original arrays such that each pairs of elements with the same indices are adjacent to one other." – prtyush Dec 23 '14 at 13:36
@prtyush For instance in your example arrays, the resulting merged array shall be {1,5,2,6,3,7}. – Abhishek Bansal Dec 23 '14 at 13:37
@prtyush Sorry am in a hurry right now. Will be able to add pseudocode later. – Abhishek Bansal Dec 23 '14 at 13:39
This would return the sum of this two. 1: number of a[i]>b[j] and i < j 2: number of b[i]>a[j] and i < j – Kicsi Dec 23 '14 at 13:56
@Kicsi Yeah you are right.Thats why i was saying to post pseudocode.user1990169 has interpreted question in wrong way.This was not required.We only want number of a[i]>b[j] and i < j – prtyush Dec 23 '14 at 14:00
@prtyush sorry had misinterpreted the question. Edited. – Abhishek Bansal Dec 23 '14 at 14:20

Count inversions in two arrays

3 Answers3