10

An inversion in an array is a pair of indices(i,j) such that a[i]>a[j] and i < j.

Given 2 arrays A and B and we have to return number of such pairs such that a[i]>b[j] and i < j.

Example :

Let n=3 and A[]=[5,6,7] and B[]=[1,2,3] then answer is 3. 3 pairs are (5,2) , (5,3) and (6,3).

My code :

#include <stdio.h>
#include <stdlib.h>
int main()
{
    int len;
    scanf("%d",&len);
    int a[len];
    int b[len];
    for(int i = 0; i < len; i++)
       scanf("%d",&a[i]);
    for(int i = 0; i < len; i++)
       scanf("%d",&b[i]);
    int count = 0;
    for (int i = 0;i < len; i++)
    {
        for(int j = i+1; j < len; j++)
        {
             if(a[i] > b[j])
             {
                 count++;
             }
         }
     }
     printf("%d",count);
}

But this is O(N^2) solution.I need a better solution as N<=200000.I know we can count inversions in same array in O(N*Log N) time.But how this can be done for two different arrays ?

prtyush
  • 157
  • 1
  • 7

3 Answers3

6

I've written in the past about how to count inversions using a Fenwick tree, which is a very efficient type of binary tree that lets you compute prefix aggregations on a sequence.

Here is an adhoc modifcation for your scenario:

long long inversions(const vector<int>& a, const vector<int>& b) {
  int n = a.size();
  vector<int> values(a);
  for (int x: b) values.push_back(x);
  sort(begin(values), end(values));
  vector<int> counts(2*n + 1);
  long long res = 0;
  for (int i = n - 1; i >= 0; --i) {
    // compute sum of prefix 1..rank(a[i]) - 1
    for (int v = lower_bound(begin(values), end(values), a[i]) - begin(values);
         v; 
         v -= v & -v)
      res += counts[v];
    //add 1 to point rank(b[i])
    for (int v = lower_bound(begin(values), end(values), b[i]) - begin(values) + 1;
         v <= 2*n;
         v += v & -v)
      counts[v]++;
  }
  return res;
}

Basically we walk through the arrays from right to left, maintaining a data structure that represents the values of a we have already seen in the suffix. For every element b[i], we add to the final result the number of of elements x in the data structure with x <= b[i] - 1. Then we add a[i] to the data structure.

The array values is used to compress the range of values to 1..2n because Fenwick trees take space linear in the range size. We could avoid that step by choosing a more fullfeatured data structure like a balanced bjnary search tree with subtree size augmentation.

The complexity is O(n log n), and the constant factor is very low.

Community
  • 1
  • 1
Niklas B.
  • 92,950
  • 18
  • 194
  • 224
  • Can you tell what you are trying to do as i am bad in python. – prtyush Dec 23 '14 at 13:58
  • Zero-initialize a BIT `counts`, go through the array b in reverse, add for each element b[i] the number of items in counts that are smaller than b[i]. Insert a[i] into counts. – Niklas B. Dec 23 '14 at 14:06
  • I am not getting it..:( – prtyush Dec 23 '14 at 14:17
  • I added C++ code. The vector `values` and the calls to lower_bound are just for coordinate compression to bring the real values into the range 1..2n. The two inner for loops are the prefix-sum and update-point primitives of the binary-indexed tree. You will find lots of resources about those on the internet. – Niklas B. Dec 23 '14 at 14:36
  • How are you finding the rank ?Remaining i understand pretty well – prtyush Dec 23 '14 at 14:37
  • Just binary search in the sorted array of all the values that occur in a or b – Niklas B. Dec 23 '14 at 14:39
  • Its crashing what are these v; and v<=2*n; ? – prtyush Dec 23 '14 at 14:46
  • These are the terminating conditions for the loop. Check the topcoder page for the explanation of these routines. I can't test the code right now because I'm on a phone but I had an error in the initial version. Are you sure you are loking at the newest version? Also I assume that a.size () == b.size () – Niklas B. Dec 23 '14 at 14:57
  • Well at least you have the idea. You will have to do the debugging on your own for now. – Niklas B. Dec 23 '14 at 15:03
  • Also I found another problem and fixed it in the newest revision. That would have definitely caused a crash – Niklas B. Dec 23 '14 at 15:05
  • This equation, and even your original python one has issues with it. It appears to have a more accurate number with larger amounts of inversions, lets say `1, 2, 3, 4, 5` 5, 2, 3, 4, 1` Gives incorrect number. Also in the cases that there is the equivalent to 1 inversion, it gives 0 in many instances it clearly isn't I am not sure if your original single array one worked that you linked, but nothing you've posted here actually gives accurate numbers. – Adam namrog84 Jan 13 '15 at 03:26
  • @Adamnamrog84 For my convenience, can you suggest values of `a` and `b` where the function gives the wrong output so I can debug it faster? I admit I never got around to properly test it after writing it initially on a smartphone. I *am* however pretty sure that I random-tested the Python code very properly against a brute-force implementation, including small and large instances. – Niklas B. Jan 13 '15 at 09:55
  • @Adamnamrog84 Ok nevermind it's simple to find counterexamples. I'll fix it. EDIT I just had `a` and `b` swapped :) Now it should work properly – Niklas B. Jan 13 '15 at 10:18
  • @NiklasB. - How can we solve for array with duplicate values as `[1, 3, 4, 5, 6, 4]` using merge sort ? – huzefa biyawarwala Nov 16 '20 at 16:42
1

You can find inversions between two arrays using merge sort idea!

consider you have two arrays of same size, lets call them A, B if we denote first half of A, A1 and second half of A, A2 and B1 and B2 respectively for B then we can conclude that answer is sum of:

  1. inversions between A1 and B1
  2. inversions between A2 and B2
  3. inversions between A1 and B2

first two elements can be supported by calling the function recursively, but how to calculate third element?

the idea is to go through A1 and B2 from left to right, any where element in B1 is greater than element in A1 , then elements in A1 which are not visited yet should be add up to number of inversions, and at the end we just sort A1 and A2 to A and B1 and B2 to B

here is the code in python:

def find_inv(A, B):

if len(A) <= 1:
    return 0
mid = len(A) // 2
A1 = A[:mid]
A2 = A[mid:]
B1 = B[:mid]
B2 = B[mid:]

if len(A1) >= 1 and len(B2) >= 1:
    ans = find_inv(A1, B1) + find_inv(A2, B2)
else:
    A.sort()
    B.sort()
    ans = 0
len_A = len(A1)
index_A = 0
len_B = len(B2)
index_B = 0

for k in range(len_A + len_B):

    if A1[index_A] <= B2[index_B]:

        index_A += 1
        if index_A == len_A:
            merge(A1, A2, A)
            merge(B1, B2, B)

            return ans
    else:
        index_B += 1
        ans += (len_A - index_A)

        if index_B == len_B:
            merge(A1, A2, A)
            merge(B1, B2, B)
            return ans

def merge(A1, A2, dest):
i = 0
j = 0


while i < len(A1) and j < len(A2):
    if A1[i] < A2[j]:
        dest[i+j] = A1[i]
        i += 1
    else:
        dest[i+j] = A2[j]
        j += 1
while i < len(A1):
        dest[i+j] = A1[i]
        i += 1

while j < len(A2):
        dest[i+j] = A2[j]
        j += 1
Mahsirat
  • 79
  • 7
-1

One idea:
1. Merge the two original arrays such that each pairs of elements with the same indices are adjacent to one other. (You will need to place the elements such that the lower one comes before the higher one).
3. Count the number of inversions in the resulting array as explained below.

edit: sorry I had misinterpreted the question. If you want inversions from only a(I) to b (j) then you can mark each element with another field arraytype (a or b). Then while mergesorting you can increment the count only if the inversion is from arraytype a to b.

Abhishek Bansal
  • 12,589
  • 4
  • 31
  • 46