Counting inversions in a segment with updates

Question

I'm trying to solve a problem which goes like this:

Problem

Given an array of integers "arr" of size "n", process two types of queries. There are "q" queries you need to answer.

Query type 1
    input: l r
   result: output number of inversions in [l, r]

Query type 2
    input: x y
   result: update the value at arr [x] to y

Inversion

For every index j < i, if arr [j] > arr [i], the pair (j, i) is one inversion.

Input

n = 5
q = 3
arr = {1, 4, 3, 5, 2}

queries:
    type = 1, l = 1, r = 5
    type = 2, x = 1, y = 4
    type = 1, l = 1, r = 5

Output

4
6

Constraints

Time: 4 secs

1 <= n, q <= 100000

1 <= arr [i] <= 40

1 <= l, r, x <= n

1 <= y <= 40

I know how to solve a simpler version of this problem without updates, i.e. to simply count the number of inversions for each position using a segment tree or fenwick tree in O(N*log(N)). The only solution I have to this problem is O(q*N*log(N)) (I think) with segment tree other than the O(q*N²) trivial algorithm. This however does not fit within the time constraints of the problem. I would like to have hints towards a better algorithm to solve the problem in O(N*log(N)) (if it's possible) or maybe O(N*log²(N)).

I first came across this problem two days ago and have been spending a few hours here and there to try and solve it. However, I'm finding it non-trivial to do so and would like to have some help/hints regarding the same. Thanks for your time and patience.

Updates

Solution

With the suggestion, answer and help by Tanvir Wahid, I've implemented the source code for the problem in C++ and would like to share it here for anyone who might stumble across this problem and not have an intuitive idea on how to solve it. Thank you!

Let's build a segment tree with each node containing information about how many inversions exist and the frequency count of elements present in its segment of authority.

node {
    integer    inversion_count : 0
    array [40] frequency       : {0...0}
}

Building the segment tree and handling updates

For each leaf node, initialise inversion count to 0 and increase frequency of the represented element from the input array to 1. The frequency of the parent nodes can be calculated by summing up frequencies of the left and right childrens. The inversion count of parent nodes can be calculated by summing up the inversion counts of left and right children nodes added with the new inversions created upon merging the two segments of their authority which can be calculated using the frequencies of elements in each child. This calculation basically finds out the product of frequencies of bigger elements in the left child and frequencies of smaller elements in the right child.

parent.inversion_count = left.inversion_count + right.inversion_count
for i in [39, 0]
    for j in [0, i)
        parent.inversion_count += left.frequency [i] * right.frequency [j]

Updates are handled similarly.

Answering range queries on inversion counts

To answer the query for the number of inversions in the range [l, r], we calculate the inversions using the source code attached below.

Time Complexity: O(q*log(n))

Note

The source code attached does break some good programming habits. The sole purpose of the code is to "solve" the given problem and not to accomplish anything else.

Source Code

/**
    Lost Arrow (Aryan V S)
    Saturday 2020-10-10
**/

#include "bits/stdc++.h"

using namespace std;

struct node {
    int64_t inv = 0;
    vector <int> freq = vector <int> (40, 0);

    void combine (const node& l, const node& r) {
        inv = l.inv + r.inv;
        for (int i = 39; i >= 0; --i) {
            for (int j = 0; j < i; ++j) {
                // frequency of bigger numbers in the left * frequency of smaller numbers on the right
                inv += 1LL * l.freq [i] * r.freq [j];
            }
            freq [i] = l.freq [i] + r.freq [i];
        }
    }
};

void build (vector <node>& tree, vector <int>& a, int v, int tl, int tr) {
    if (tl == tr) {
        tree [v].inv = 0;
        tree [v].freq [a [tl]] = 1;
    }
    else {
        int tm = (tl + tr) / 2;
        build(tree, a, 2 * v + 1, tl, tm);
        build(tree, a, 2 * v + 2, tm + 1, tr);
        tree [v].combine(tree [2 * v + 1], tree [2 * v + 2]);
    }
}

void update (vector <node>& tree, int v, int tl, int tr, int pos, int val) {
    if (tl == tr) {
        tree [v].inv = 0;
        tree [v].freq = vector <int> (40, 0);
        tree [v].freq [val] = 1;
    }
    else {
        int tm = (tl + tr) / 2;
        if (pos <= tm)
            update(tree, 2 * v + 1, tl, tm, pos, val);
        else
            update(tree, 2 * v + 2, tm + 1, tr, pos, val);
        tree [v].combine(tree [2 * v + 1], tree [2 * v + 2]);
    }
}

node inv_cnt (vector <node>& tree, int v, int tl, int tr, int l, int r) {
    if (l > r)
        return node();
    if (tl == l && tr == r)
        return tree [v];
    int tm = (tl + tr) / 2;
    node result;
    result.combine(inv_cnt(tree, 2 * v + 1, tl, tm, l, min(r, tm)), inv_cnt(tree, 2 * v + 2, tm + 1, tr, max(l, tm + 1), r));
    return result;
}

void solve () {
    int n, q;
    cin >> n >> q;
    vector <int> a (n);
    for (int i = 0; i < n; ++i) {
        cin >> a [i];
        --a [i];
    }

    vector <node> tree (4 * n);
    build(tree, a, 0, 0, n - 1);

    while (q--) {
        int type, x, y;
        cin >> type >> x >> y;
        --x; --y;

        if (type == 1) {
            node result = inv_cnt(tree, 0, 0, n - 1, x, y);
            cout << result.inv << '\n';
        }
        else if (type == 2) {
            update(tree, 0, 0, n - 1, x, y);
        }
        else
            assert(false);
    }
}

int main () {
    std::ios::sync_with_stdio(false);
    std::cin.tie(nullptr);
    std::cout.precision(10);
    std::cout << std::fixed << std::boolalpha;

    int t = 1;
//  std::cin >> t;
    while (t--)
        solve();

    return 0;
}

I found [this](https://stackoverflow.com/questions/47071541/how-to-get-inversion-count-with-update) similar question on SO but it's been left unanswered for 3 years now. — Arrow, Oct 09 '20 at 12:47
Let assume that you use the tree in N*logN why do you have an n squared? If i understood correctly: WC you have all 1st type queries their complexity is O(n*logn)*O(q) BC you have all 2nd type query with cost O(1) so total cost O(q) — Skin_phil, Oct 09 '20 at 14:48
@Skin_phil My segtree solution contains the squared term because I naively iterate from l to r to calculate inversions in O(N*log(N)). And because the number of queries can be as big as N, the time time complexity will be O(qNlog(N)) and not O(qN^2log(N)) (making it cubic). I'll edit my mistake. However, it's still not fast enough. — Arrow, Oct 09 '20 at 14:59
@Skin_phil The intuitive O(N^2) inversion counting with O(1) updates is indeed easier solution to implement but because I know this problem has something to do with segment trees, I need to implement a clever and fast solution. — Arrow, Oct 09 '20 at 15:04
Well i dont see any better solution unless you can improve on your inversion count method, i see that the elements you can put in the array are limited, there is probably something to figure out there, with the data structure — Skin_phil, Oct 09 '20 at 15:09
But there is also to take in account the average case, that probably gives a better cost, but it s pretty hard to calculate — Skin_phil, Oct 09 '20 at 15:11
@Skin_phil I was thinking somewhere along the same lines. Because arr [i] is only limited to 40 (a very small number), what could I do to exploit it to my benefit? So far nothing better than the naive algorithms have been intuitive to me. Probably someone will know how to go about the problem and join in sometime. — Arrow, Oct 09 '20 at 15:18

Tanvir Wahid · Accepted Answer · 2020-10-09T15:31:57.193

arr[i] can be at most 40. We can use this to our advantage. What we need is a segment tree. Each node will hold 41 values (A long long int which represents inversions for this range and a array of size 40 for count of each numbers. A struct will do). How do we merge two children of a node. We know inversions for left child and right child. Also know frequency of each numbers in both of them. Inversion of parent node will be summation of inversions of both children plus number of inversions between left and right child. We can easily find inversions between two children from frequency of numbers. Query can be done in similar way. Complexity O(40*qlog(n))

I don't think bitmasks will work. You will need actual count. — Tanvir Wahid, Oct 09 '20 at 15:58

Counting inversions in a segment with updates

Updates

1 Answers1