0

I'm trying to solve this question:

As we all know that Varchas is going on. So FOC wants to organise an event called Finding Occurrence.

The task is simple :

Given an array A[1...N] of positive integers. There will be Q queries. In the queries you will be given an integer. You need to find out the frequency of that integer in the given array.

INPUT:

First line of input comprises of integer N, the number of integers in given array. The next line will comprise of N space separated integers. The next line will be Q, number of Queries. The next Q lines will comprise of a single integer whose Occurrence you are supposed to find out.

OUTPUT:

Output single integer for each Query which is the frequency of the given integer.

Constraints:

1<=N<=100000

1<=Q<=100000

0<=A[i]<=1000000

And this is my code:

#include <iostream>
using namespace std;

int main()
{
    long long n=0;
    cin >> n;
    long long a[1000000];
    for (int i=1;i<=n;i++)
    {
        cin >> a[i];
    }
    long long q=0;
    cin >> q;
    while (q--)
    {
        long long temp=0,counter=0;
        cin >> temp;
        for (int k=1;k<=n;k++)
        {
            if (a[k]==temp)
                counter++;
        }
        cout << "\n" << counter;
        temp=0;
        counter=0;
    }
    return 0;
}

However, I encountered the 'Time Limit Exceeded' error. I suspect this is due to the failure to handle large values in arrays. Could someone tell me how to handle such large size arrays?

justhalf
  • 8,960
  • 3
  • 47
  • 74
user2295715
  • 62
  • 1
  • 1
  • 9
  • 1
    Please make your question self-sufficient instead of providing links. – Abhishek Bansal Nov 11 '14 at 05:50
  • The problem requires you to first sort the array then for each query, you simply do a binary search and find the range that contains that value and print the length of the range – smac89 Nov 11 '14 at 06:05
  • the problem is for each query, you need to scan through the whole array, which make the time complexity is O(n^2), which is clearly a TLE – Pham Trung Nov 11 '14 at 06:05
  • 1
    @Smac89: The question is not about range query, it's much simpler than that. The question is asking to count the frequency of an integer. – justhalf Nov 11 '14 at 06:06
  • Was this supposed to be tagged as C++? – James H Nov 11 '14 at 06:07
  • @justhalf can you explain? Note that you are not just getting one integer to count the frequency of – smac89 Nov 11 '14 at 06:07
  • @Smac89, it is so simple, because A[i] <= 10^5, so you just need to have an array fre[10^5 +1], which count the frequency of elements – Pham Trung Nov 11 '14 at 06:10
  • @Smac89: You just need to return how many times the integer specified in the query appears in the array. – justhalf Nov 11 '14 at 06:10
  • @PhamTrung: Correction: `A[i] <= 10^6` – justhalf Nov 11 '14 at 06:10
  • 1
    Ahh I see what you guys are doing now. I usually go for memory efficiency where as you guys are relying on the fact that Q is small. That is nice but it won't work for large Q – smac89 Nov 11 '14 at 06:12
  • Ah, now I see what you're suggesting. Your method also works, but I think given the constraints in the question, simply counting them beforehand is easier. And actually using your method you can also precalculate the count, just store as a hash map. – justhalf Nov 11 '14 at 06:13
  • @Smac89, so that why we need to look at constraints :), for example, if we need to count the frequency of human age, so clearly space is not an issue. – Pham Trung Nov 11 '14 at 06:13

3 Answers3

4

The failure is in the algorithm itself, note that for each query, you traverse the whole array. There are 100,000 queries and 100,000 elements. That means at worse case you're traversing 100,000 * 100,000 elements = 10,000,000,000 elements, which won't finish in time. If you analyze the complexity using the Big-O notation, your algorithm is O(nq), which is too slow for this problem, since n*q are large.

What you're supposed to do is to calculate the scores before any query is made, then store in an array (this is why the range of A[i] is given. You should be able to do this by traversing the array only once. (hint: you don't need to store the input into an array, you can just count directly).

By doing this, the algorithm will just be O(n), and since n is small enough (as a rule of thumb, less than one million is small), it should finish in time.

Then you can answer each query instantly, making your program fast enough to be under the time limit.

Another thing that you can improve is the data type of the array. The value stored in that array won't be larger than 1 million, and so you don't need long long, which uses more memory. You can just use int.

justhalf
  • 8,960
  • 3
  • 47
  • 74
4

Your algorithm was inefficient. You read all the numbers into an array, then you searched linearly through the array for each query.

What you should have done is make one array of counts. In other words, if you read the number 5, do count[5]++. Then for each query all you have to do is return the count from the array. For example, how many 5's were there in the array? Answer: count[5].

JS1
  • 4,745
  • 1
  • 14
  • 19
1

Since your maximum number can be 10^6, I think that your problem will take memory limit exceeded, even if it fits in time. Another solution is to sort the array( you can do it in N*logN using STL sort function) and for each query you can make two binary searches. First is used to find the first position where the element appears and the second is used to find the last position where your element appears, so the answer for each query will be lastPosition - firstPosition + 1.

Iulian Popescu
  • 2,595
  • 4
  • 23
  • 31