1

Here is my code for a problem in CSES problem set "Distinct Numbers". The problem basically inputs an integer (i) specifying the no. of elements and and an array of length equal i. We basically have to find the distinct no. of elements in that array.

#include <iostream>

using namespace std;

bool check(unsigned long long int val, unsigned long long int arr[], unsigned long long int range) {
    unsigned long long int i;
    bool return_val = true;
    for (i=0; i<range; i++) {
        if (arr[i] == val) {
            return_val = false;
        }
    }
    return return_val;
}

int main()
{
    //cout << "DEBUG-1" << endl;
    unsigned long long int i;
    cin >> i;

    unsigned long long int in_array[i];
    unsigned long long int j;
    for (j=0; j<i; j++) {
        cin >> in_array[j];
    }
    unsigned long long int sort_array[i];
    unsigned long int long sort_array_index = 0;
    for (j=0; j<i; j++) {
        if(check(in_array[j], sort_array, sort_array_index)) {
            sort_array[sort_array_index] = in_array[j];
            sort_array_index++;
         }
    }
    cout << sort_array_index;
}

When the input is not huge and is limited to 2 digit inputs, the code works fine.

The problem I'm facing is that when the input is 200000, my code breaks and shows there is no output as the time limit for compilation exceeds 1 second. Here is my issue: There's a test case on the site in which i=200000, and the array consists of all 1's. The obvious answer is 1 and my output is also 1. But, when i=200000, and all values are distinct the above issue occurs and the input displayed on the test is (empty).

Even when I include the first cout << "DEBUG-1" << endl; line (commented out), it is not seen in the output, whereas it is even before the first input line. It's like the program just rejects the input on seeing it. What I'm not able to understand that I wrote a similar program for an earlier problem and it worked fine on big inputs. Please tell me what is going wrong and include suggestions to write better code as well.

TrebledJ
  • 8,713
  • 7
  • 26
  • 48
  • 1
    Note that C++ doesn't really have [variable-length arrays](https://en.wikipedia.org/wiki/Variable-length_array), it's one compiler that have it as a ***non**-standard* and ***non**-portable* extension to C++ (much like [``](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h)). If you want a "dynamic" array whose size isn't known until run-time use [`std::vector`](https://en.cppreference.com/w/cpp/container/vector) instead. – Some programmer dude Dec 18 '20 at 06:34
  • As for your problem, you can never brute-force or use naive solutions on so-called "competition" sites. They are really all about trickery. – Some programmer dude Dec 18 '20 at 06:38
  • Anytime you hear the word `distinct` in a coding problem, just assume HASH TABLE is needed. In C++ that's the `std::unordered_map<>` class. As you have it now, your algorithm is `O(N²)` – selbie Dec 18 '20 at 06:44
  • `unsigned long long int sort_array[i];` isn't standard C++. That's a GNU extension that gives you an array of size `i` – selbie Dec 18 '20 at 06:45
  • Also, you don't need `long int long` or `long long int`. Just `long` or `long long` will do fine. – selbie Dec 18 '20 at 06:46
  • @selbie There may be a difference between `long` and `long long`. – Some programmer dude Dec 18 '20 at 06:47
  • @Someprogrammerdude - I know. But there's other issues to tackle in the OP's code before getting the integer width sorted out. – selbie Dec 18 '20 at 06:55

2 Answers2

3

Your program is using an N-squared algorithm.

That is, as your sort_array_index value keeps increasing, each subsequent scan of sort_array takes longer and longer. (One minor optimization - break out of the loop as soon you set return_val to false. But that's not going to be a significant savings).

Your entire program can be simplified to use the unordered_set class as a hash table for O(1) insertion and duplicate detection on each of the N elements inserted. Hence, an O(N) algorithm

#include <unordered_set>
#include <iostream>

using namespace std;

int main()
{
    size_t i;
    std::unordered_set<long long> table;

    cin >> i;  // number of elements to read

    // read the elements
    for (size_t j=0; j<i; j++) {
        long long value;
        cin >> value;
        table.insert(value);
    }

    size_t unique_element_count = table.size();
    cout << unique_element_count;

    return 0;
}
selbie
  • 100,020
  • 15
  • 103
  • 173
  • It certainly works but still could not compile in 1 sec for 2 occasions where value of i was 200000, but worked in all other cases. What really confused me was that there were two cases where i=200000, where there were 200000 distinct characters, code compiled for one but not for another though both arrays were fairly similar. – perpetualprime Dec 18 '20 at 09:12
  • I'm not sure what you mean by `but still could not compile in 1 sec for 2 occassions`. Do you mean time to **compile** the program or the time to **run** the program? Compile time of the code I shown is constant - unless you are doing something like jamming the input array into the code. Run time will be linear to the length of the input. You would need to share a sample input file and expected output value for me to comment further. – selbie Dec 18 '20 at 19:52
1

It might be that reading input takes too long. You can make it faster by reading all input at once (std::getline if they are on one line) and then splitting it.

Another problem could be that you are creating large arrays. Instead, use std::unordered_map for counting and create entries only for the numbers you need:

#include <unordered_map>
// ...

std::unordered_map<unsigned long long, unsigned int> sort_array;
for (j=0; j<i;j++) {
    sort_array[in_array[j]]++;
}
cout << sort_array.size();

Or you could read while you are counting, which eliminates the need of in_array:

std::unordered_map<unsigned long long, unsigned int> sort_array;
for (j=0; j<i;j++) {
    int number;
    cin >> number;
    sort_array[number]++;
}
cout << sort_array.size();

Note: This code also counts how many instances of each number there are, which is unnecessary for your problem but should take the same time as the other answer using unordered_set.

VLL
  • 9,634
  • 1
  • 29
  • 54