0

Canadian Computing Competition: 2020 Stage 1, Senior #3 You're given a string N, called the needle, and a string H, called the haystack, both of which contain only lowercase letters a..z.

Write a program to count the number of distinct permutations of N which appear as a substring of H at least once. Note that N can have anywhere between 1 and |N|! distinct permutations in total – for example, the string aab has 3 distinct permutations (aab, aba, and baa).

Problem: my program runs perfectly fine for testcase 1-113, yet no matter how I revise my program, the CCC grader will always show Receive Signal 6 at testcase 114.

#include <bits/stdc++.h>
using namespace std;

int alpha_a[26], alpha_b[26];

bool check() {
    for (int i = 0; i < 26; ++i) {
        if (alpha_a[i] != alpha_b[i]) {
            return false;
        }
    }
    return true;
}
int main() {
    string a, b;
    cin >> a >> b;
    if (b.size() < a.size()) {
        cout << "0" << endl;
        return 0;
    }
    memset(alpha_a, 0, 26);
    memset(alpha_b, 0, 26);
    for (int i = 0; i < a.size(); ++i) {
        ++alpha_a[a[i] - 'a'];
        ++alpha_b[b[i] - 'a'];
    }
    string process = b.substr(0, a.size());
    set<string> no_repetition = {};
    int begin = 0, end = a.size();
    for (; end != b.size(); ++end, ++begin) {
        if (check()) {
            no_repetition.emplace(process);
        }
        process.erase(0, 1);
        --alpha_b[b[begin] - 'a'];
        process += b[end];
        ++alpha_b[b[end] - 'a'];
    }
    if (check()) {
        no_repetition.emplace(process);
    }
    cout << no_repetition.size() << endl;
    return 0;
}
Zheng Pei
  • 3
  • 1
  • 3
    And the obligatory link to [Why should I not #include ?](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h). When you have a mystery bug, `#include ` and `using namespace std;` should be among the first things to go because they can spawn really freaky bugs without much outside help. – user4581301 Jul 07 '21 at 21:44
  • @user4581301 I don't see any evidence in the linked page that `bits/stdc++.h` can cause "really freaky bugs". Not one single person gave a compelling reason to avoid it in the online judge context. – Brian Bi Jul 08 '21 at 16:04

1 Answers1

2

This error message is most likely because you keep adding a copy of each permutation to the no_repetition vector, which eventually exhausts the grader's memory and results in std::bad_alloc being thrown; this exception is not caught, and an uncaught exception results in std::terminate being called, which calls abort, which terminates the program with SIGABRT, which usually has the signal number 6.

To obtain full marks on this problem, you need to find a way to deduplicate permutations without storing a copy of each permutation in a set.

Brian Bi
  • 111,498
  • 10
  • 176
  • 312
  • Your advice is really helpful! However, the memory exhaustion of the grader is caused by the size and the amount of string added into the `no_repetition` (the correct output for testcase 114 is 66958 and the size of string a is 50000, which means the program stores 66958x50000 char value)but not because the program adding a copy of each permutation(since std::set only allow me to store unique value), please correct me if I'm wrong. The solution I found is to convert each string permutation into a hash value before adding it into a set. – Zheng Pei Jul 08 '21 at 16:16
  • @ZhengPei That's what I meant. The substrings are already in the input string; you then decided to copy such substrings into new `std::string` objects and add them to your set. Even though there is only one copy of each unique permutation in the set, there are many such permutations stored unnecessarily. I hope you were able to get your solution accepted using your new approach with hash values. – Brian Bi Jul 08 '21 at 16:17
  • Thank you for the explanation :) – Zheng Pei Jul 08 '21 at 16:38