Finding count of distinct elements in every k subarray

Question

How to solve this question efficiently?

Given an array of size n and an integer k we need to return the sum of count of all distinct numbers in a window of size k. The window slides forward.

e.g. arr[] = {1,2,1,3,4,2,3};

Let k = 4.

The first window is {1,2,1,3}, count of distinct numbers is 2….(1 is repeated)
The second window is {2,1,3,4} count of distinct numbers is 4
The third window is {1,3,4,2} count of distinct numbers is 4
The fourth window is {3,4,2,3} count of distinct numbers is 2

What is the expected value range of the elements? Provided the range is moderate, you could use an array to register the presence of a given value. — Axel Kemper, Aug 08 '15 at 11:47

score 1 · Answer 1 · answered Aug 08 '15 at 13:17

You should keep track of

a map that counts frequencies of elements in your window
a current sum.

The map with frequencies can also be an array if the possible elements are from a limited set.

Then when your window slides to the right...

increase the frequency of the new number by 1.
if that frequency is now 1, add it to the current sum.
decrease the frequency of the old number by 1.
if that frequency is now 0, subtract it from the current sum.

score 1 · Answer 2 · edited May 23 '17 at 12:30

You can use a hash table H to keep track of the window as you iterate over the array. You also keep an additional field for each entry in the hash table that tracks how many times that element occurs in your window.

You start by adding the first k elements of arr to H. Then you iterate through the rest of arr and you decrease the counter field of the element that just leaves the windows and increase the counter field of the element that enters the window.

At any point (including the initial insertion into H), if a counter field turns 1, you increase the number of distinct elements you have in your window. This can happen while the last but one occurrence of an element leaves the window or while a first occurrence enters it. If a counter field turns to any other value but 1, you decrease the number of distinct elements you have in the window.

This is a linear solution in the number of elements in arr. Hashing integers can be done like this, but depending on the language you use to implement your solution you might not really need to hash them yourself. In case the range in which the elements of arr reside in is small enough, you can use a simple array instead of the hash table, as the other contributors suggested.

score 1 · Answer 3 · answered Aug 08 '15 at 17:37

1

Actually, I am the asker of the question, I am not answering the question, but i just wanted to comment on the answers, but I can't since I have very less reputation.

I think that for {1, 2, 1, 3} and k = 4, the given algorithms produce count = 3, but according to the question, the count should be 2 (since 1 is repeated)

answered Aug 08 '15 at 17:37

Boopesh

149
1
2
7

OK. I agree. My initial solution was solving a slightly different problem. Check my update. – cobarzan Aug 08 '15 at 17:58

score 0 · Answer 4 · answered Sep 18 '21 at 13:02

This is how I solved the problem

private static int[] getSolve(int[] A, int B) {

    Map<Integer, Integer> map = new HashMap<>();
    for (int i = 0; i < B; i++) {
        map.put(A[i], map.getOrDefault(A[i], 0) + 1);
    }

    List<Integer> res = new ArrayList<>();
    res.add(map.size());
    //4, 1, 3, 1, 5, 2, 5, 6, 7

    //3, 1, 5, 2, 5, 6     count = 5
    for (int i = B; i < A.length; i++) {

        if (map.containsKey(A[i - B]) && map.get(A[i - B]) == 1) {
            map.remove(A[i - B]);
        }
        if (map.containsKey(A[i - B])) {
            map.put(A[i - B], map.get(A[i - B]) - 1);
        }
        map.put(A[i], map.getOrDefault(A[i], 0) + 1);


        System.out.println(map.toString());
        res.add(map.size());
    }
    return res.stream().mapToInt(i -> i).toArray();
}

Finding count of distinct elements in every k subarray

4 Answers4