-1

I was solving this problem and then I encounter this weird behaviour between different versions of C++. When I use C++17, the code gives correct output, but when I switch to C++14, the output changes completely. I was compiling with g++ (x86_64-posix-seh-rev1, Built by MinGW-Builds project) 13.1.0. I would appreciate it if someone would point out what is happening.

This is my solution to that problem.

#include <bits/stdc++.h>

#define nl '\n'

using namespace std;

using i64 = long long;

struct PersistentSegmentTree {
  struct Node {
    int value, left, right;

    Node() : value(0), left(0), right(0) {}
  };
  vector<Node> T = {Node()};

  int update(int root, int l, int r, int p, int v) {
    int id = T.size();
    T.emplace_back();
    T[id] = T[root];
    if (l == r) {
      T[id].value += v;
      return id;
    }
    
    int mid = (l + r) / 2;
    if (p <= mid) {
      T[id].left = update(T[root].left, l, mid, p, v);
    } else {
      T[id].right = update(T[root].right, mid + 1, r, p, v);
    }
    
    T[id].value = T[T[id].left].value + T[T[id].right].value;
    return id;
  }

  int query(int from, int to, int l, int r, int k) {
    if (l == r) {
      return l;
    }
    int mid = (l + r) / 2;
    int left_count = T[T[to].left].value - T[T[from].left].value;
    if (left_count >= k) {
      return query(T[from].left, T[to].left, l, mid, k);
    } else {
      return query(T[from].right, T[to].right, mid + 1, r, k - left_count);
    }
  }
} tree;

signed main() {
  cin.tie(0), ios::sync_with_stdio(0);

  int n, q;
  cin >> n >> q;
  vector<int> a(n);
  for (int i = 0; i < n; i++) {
    cin >> a[i]; 
  }

  vector<int> roots;
  roots.push_back(0);
  
  auto vals = a;
  sort(vals.begin(), vals.end());
  vals.erase(unique(vals.begin(), vals.end()), vals.end());

  for (int i = 0; i < n; i++) {
    a[i] = lower_bound(vals.begin(), vals.end(), a[i]) - vals.begin();
    roots.push_back(tree.update(roots.back(), 0, vals.size() - 1, a[i], 1));
  }

  for (int qi = 0; qi < q; qi++) {
    int l, r, k;
    cin >> l >> r >> k;
    cout << vals[tree.query(roots[l - 1], roots[r], 0, vals.size() - 1, k)] << nl;    
  }
}

The test case is:

7 3
1 5 2 6 3 7 4
2 5 3
4 4 1
1 7 3

C++17 output (correct):

5 
6 
3

C++14 output (incorrect):

7
7
4

Thanks in advance.

kilkuwu
  • 47
  • 4
  • The code has Undefined Behavior: terminate called after throwing an instance of 'std::length_error' what(): cannot create std::vector larger than max_size() Program terminated with signal: SIGSEGV – 273K Aug 09 '23 at 02:45
  • 3
    1) sounds like your code has [undefined behaviour](https://en.cppreference.com/w/cpp/language/ub). 2) *Never* ever include `bits/stdc++.h` - see [Why should I not #include ?](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h). 3) `using namespace std` is pretty bad on its own, but when combined with `2`, it's a disaster - see [Why is "using namespace std;" considered bad practice?](https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice). – Jesper Juhl Aug 09 '23 at 02:50
  • 1
    See heap-use-after-free: https://godbolt.org/z/GThMEzxMW – 273K Aug 09 '23 at 02:50
  • Too many issues with this code to enumerate in comments. – Jesper Juhl Aug 09 '23 at 02:52
  • Different behavior under different compilers is about the surest sign of undefined behavior you can get. – Mark Ransom Aug 09 '23 at 02:53
  • `using i64 = long long;` - Nooo; `int64_t` is a thing. – Jesper Juhl Aug 09 '23 at 02:54
  • Thanks for your help but I really do not understand why the undefined behaviour doesn't exist in C++17, I want to know what is the difference between the two versions that causes this "strange" behaviour. – kilkuwu Aug 09 '23 at 02:54
  • I know that `using namespace std` is bad, but I am using this purely for competitive programming which requires fast typing and speed coding, we have to be as fast as possible to score higher on the leaderboard, sorry if my style of code disturbs you. – kilkuwu Aug 09 '23 at 02:56
  • The funny thing about undefined behavior is that sometimes it can look like it's working. When it does be very afraid, because it just means it's waiting for the worst possible moment to bite you. – Mark Ransom Aug 09 '23 at 02:57
  • Do you know what the type is that `T.size()` returns? Hint it is not an `int`. And why don't you give your variables some more meaningful names to make your code readable? With all the comments above (the link you posted) and (useless) code like `cin.tie(0), ios::sync_with_stdio(0);` it looks like you are learning C++ from competitive coding sites... don't do that you will only pickup bad practices. – Pepijn Kramer Aug 09 '23 at 03:07
  • 1
    Learn C++ from : [a recent C++ book](https://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list) or from [learncpp.com](https://www.learncpp.com/) that site is decent and pretty up to date. Then use [cppreference](https://en.cppreference.com/w/) for reference material (and examples). When you have done all that keep using [C++ core guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines) to keep up to date with the latest best practices since C++ keeps evolving. – Pepijn Kramer Aug 09 '23 at 03:08
  • @PepijnKramer Thanks for the materials, I really appreciate it. I know T.size() returns size_t type, the reason my code uses somewhat bad practices is because if I do not use them, I will be at a *disadvantage*, the code you reference (useless) helps to increase input and output time when the time limit for a problem is tight, the other bad practices (`using namespace std`, `#include`, etc.) is to help with the speed that I code. – kilkuwu Aug 09 '23 at 03:17
  • The reason I asked this question is because I am curious about what the difference between C++14 and C++17 that makes the same code behave differently, it seems the problem lies in the two lines of code: `int id = T.size(); T.emplace_back()`, so I think the root of the problem is the way the vector class is implemented differently in each version, but I sure do not know what that difference is. – kilkuwu Aug 09 '23 at 03:21
  • 1
    Different results from different compilers is a typical sign of undefined behaviour, implementation-defined, or unspecified behaviour - which all allow freedom for different implementations to produce different results. Your code accesses vector elements using computed indices, so the *most likely* explanation is one of those computed indices going out of range for the relevant vector. One option is, BEFORE using an index, check it is valid (e.g. between `0` and `the_vector.size() - 1` inclusive). Some compilers/libraries in debug (not release) builds do that check (and throw exceptions) – Peter Aug 09 '23 at 03:33
  • @kilkuwu When making production code, correctness goes before performance. Time limit for a problem should never be really tight and is NOT found in micro-optimizations but in the algorithms you use. Note : Competitive coding sites always run on systems shared with other users so they cannot guarantee your code will finish within very tight timing margins. (milliseconds). There will always be a random factor in the time your program runs. – Pepijn Kramer Aug 09 '23 at 03:42
  • When optimizing production code software timing is measured using profilers on a system that is fully under your own control. And then the code is optimized (or algorithms redesigned) where bottlenecks actually occur. But this always measurement driven and comes after code correctness (proven by unit tests). So NO competitive coding sites do not teach you good software engineering. – Pepijn Kramer Aug 09 '23 at 03:45
  • 1
    @kilkuwu *I am using this purely for competitive programming which requires fast typing and speed coding* -- Here is the issue with this -- you posted on StackOverflow, and there is no competition here. You are asking volunteers to look at and possibly run the code you posted. If your code contains invalid C++, weird macros, etc. then it makes the volunteers job harder, and for really no reason for it to be harder. Posting obfuscated code, just because at the site you claim you are using, you need to work like a flash doesn't excuse not taking the time to post coherent code here on SO. – PaulMcKenzie Aug 09 '23 at 08:00

1 Answers1

3

You have issues in all assignments like T.at(id).left = update(....

The evaluation of the left and right sides of the assignment is unspecified until C++17. T.at(id).left seems to be evaluated first, its result type is Node&. Then update() is called, that adds a new element to the vector T.emplace_back() - this invalidates all references, thus the first evaluated reference T.at(id).left is now a dangling reference to a freed memory in the heap. After uppdate() has finished, the assignent of the result value to the dangling pointer results in the heap-use-after-free.

The fix is quite trivial:

  const auto left = update(T.at(root).left, l, mid, p, v);
  T.at(id).left = left;

I hope you will fix the other assignment yourself.

273K
  • 29,503
  • 10
  • 41
  • 64