How did changing datatype of an array from int to long long and INT_MAX to LLONG_MAX in a C++ code result in a runtime error?

Question

The values in my input testcase files were such that at some point in the code, values would exceed the capacity of int, so I figured I'd change the datatype of this particular array holding this value greater than INT_MAX from int to long long and change maximum values in the code to LLONG_MAX from INT_MAX so that comparisons during runtime don't yield a wrong answer.

However, now the code seems to get stuck with a runtime error even before arriving at the mentioned testcase. It now fails at a case that it used to pass when the values were all int oriented. I don't understand how this is possible.

The testcase which passes with int but fails with ll is:

Below is the code that gives an error at this tc.

#include<bits/stdc++.h>
using namespace std;

# define ll long long int

ll update, previous; 
set<pair<ll, int>> dist;
auto it=dist.begin();
int ind=0, n, i, j;
pair<ll, int>p;

void dij(vector<pair<int, ll>> tree[], bool decided[], ll d[], int path[]) {
    ind=0;
    while(!dist.empty()) {
        it=dist.begin();
        if(it==dist.end()) return;
        ind=it->second;
        dist.erase(it);
        decided[ind]=1;
        for(j=0; j<tree[ind].size(); j++) {
            update=d[ind]+tree[ind][j].second;
            previous=d[tree[ind][j].first];
            if(update<previous) {
                p=make_pair(previous, tree[ind][j].first);
                dist.erase(dist.find(p));
                p=make_pair(update, tree[ind][j].first);
                dist.insert(p);
                path[tree[ind][j].first]=ind;
            }
            d[tree[ind][j].first]=min(update, previous);
        }
    }
}

int main()
{
    ll edges;
    ll x, y, weight;
    cin>>n>>edges;
    vector<pair<int, ll>> graph[n];
    for(i=0; i<edges; i++) {
        cin>>x>>y>>weight;
        x--; y--;
        graph[x].push_back({y, weight});
        graph[y].push_back({x, weight});
    }
    int src=1;
    src--;
    ll d[n];
    for(i=0; i<n; i++) {
        if(src==i) {
            dist.insert({0, i});
            d[i]=0;
        }
        else {
            dist.insert({LLONG_MAX, i});
            d[i]=LLONG_MAX;
        }
    }
    bool decided[n]={0};
    int path[n]={-1};
    for(int i=1; i<n; i++) path[i]=-2;
    dij(graph, decided, d, path);
    if(path[n-1]==-2) cout<<-1;
    else {
        vector<int> s;
        int final=n-1;
        while (final!=-1) {
            s.push_back(final);
            final=path[final];
        }
        reverse(s.begin(), s.end());
        for(auto pi:s) cout<<pi+1<<" ";
    }
    cout<<endl;
}

Below is the code that produces a correct output for this tc.

#include<bits/stdc++.h>
using namespace std;

# define ll long long int

ll update, previous; 
set<pair<ll, int>> dist;
auto it=dist.begin();
int ind=0, n, i, j;
pair<ll, int>p;

void dij(vector<pair<int, ll>> tree[], bool decided[], int d[], int path[]) {
    ind=0;
    while(!dist.empty()) {
    it=dist.begin();
    if(it==dist.end()) return;
    ind=it->second;
    dist.erase(it);
    decided[ind]=1;
    for(j=0; j<tree[ind].size(); j++) {
        update=d[ind]+tree[ind][j].second;
        previous=d[tree[ind][j].first];
        if(update<previous) {
            p=make_pair(previous, tree[ind][j].first);
            dist.erase(dist.find(p));
            p=make_pair(update, tree[ind][j].first);
            dist.insert(p);
            path[tree[ind][j].first]=ind;
        }
        d[tree[ind][j].first]=min(update, previous);
    }
    }
}

int main()
{
    ll edges;
    ll x, y, weight;
    cin>>n>>edges;
    vector<pair<int, ll>> graph[n];
    for(i=0; i<edges; i++) {
        cin>>x>>y>>weight;
        x--; y--;
        graph[x].push_back({y, weight});
        graph[y].push_back({x, weight});
    }
    int src=1;
    src--;
    int d[n];
    for(i=0; i<n; i++) {
        if(src==i) {
            dist.insert({0, i});
            d[i]=0;
        }
        else {
            dist.insert({INT_MAX, i});
            d[i]=INT_MAX;
        }
    }
    bool decided[n]={0};
    int path[n]={-1};
    for(int i=1; i<n; i++) path[i]=-2;
    dij(graph, decided, d, path);
    if(path[n-1]==-2) cout<<-1;
    else {
        vector<int> s;
        int final=n-1;
        while (final!=-1) {
            s.push_back(final);
            final=path[final];
        }
        reverse(s.begin(), s.end());
        for(auto pi:s) cout<<pi+1<<" ";
    }
    cout<<endl;
}

The only difference in the 2 codes are the following lines:

void dij(vector<pair<int, ll>> tree[], bool decided[], ll d[], int path[])

void dij(vector<pair<int, ll>> tree[], bool decided[], int d[], int path[])

ll d[n];

int d[n];

dist.insert({LLONG_MAX, i})

dist.insert({INT_MAX, i})

d[i]=LLONG_MAX

d[i]=INT_MAX

Could someone please point out how is this creating the following error which I read is related to "allocating memory where I should not" or "attempting to execute delete with a pointer value that was not obtained from new". What is causing this problem and how should I resolve it?

free(): invalid pointer
Aborted (core dumped)

Why use non-standard and risky variable-length arrays when you are aware of `std::vector`? — molbdnilo, Dec 15 '20 at 22:28
`int d[n];` -- To add to the previous comment, you're already using `std::vector`, so why did you not use it here also? It looks like you are not aware what vectors are used for. — PaulMcKenzie, Dec 15 '20 at 22:30
Using vectors will give you a fighting chance in solving the problem. With vector, you have the `at()` function that tests for boundary conditions. There is no such thing for those non-standard variable length arrays. And in any event, anyone who uses Visual C++ that would try to test your code would have to change it anyway. It would be better if you changed your code to not use any VLA's and use vector. — PaulMcKenzie, Dec 15 '20 at 22:32
The `vector` would certainly reduce the odds of a stack overflow. Judges love to feed in data sets that exceed the available Automatic storage. — user4581301, Dec 15 '20 at 22:33
Note that there's a lot of compiler-specific code in there. That's fine (with a loose definition of fine) for a competition where you know exactly what build environment is being used, but not so good if you find yourself having to port to Visual Studio or possibly even the next revision of GCC. — user4581301, Dec 15 '20 at 22:36
Take advantage of the `at` function Paul recommended. If the program breaches the bounds of any of the vectors you'll get a different error message. The freeing of an invalid pointer looks like it could be a buffer over-or-under flow that mauls the book-keeping information surrounding an allocation. — user4581301, Dec 15 '20 at 22:39
Not outright wrong, but why `int src=1; src--;` and not `int src=0;`? — user4581301, Dec 15 '20 at 22:41
I did use at after that suggestion... it is giving the same error what should I do now? — costheta_z, Dec 15 '20 at 22:41
It was meant for user entered value of src that's why when it asked me to find constant 1 to n I just kept the src — costheta_z, Dec 15 '20 at 22:42
@costheta_z `graph[x].push_back({y, weight}); graph[y].push_back({x, weight})` -- This is an out-of-bounds access for both the "working" and non-working code. The values of `x` and/or `y`, given the data you have, will at times be greater than `n-1`. Thus `graph[x]` or `graph[y]` would be out-of-bounds. This is the reason why you should use vector, and it's for errors like this: `using GraphType = std::vector>>;` and then `GraphType graph(n);` would then be `graph.at(x).push_back({x, weight})`, for example. — PaulMcKenzie, Dec 15 '20 at 22:59
@PaulMcKenzie could you point that data out? I mean which one do you mean will exceed n-1 because these are the constraints of the question itself the testcases shouldn't break them... They'sre all <=100 for this one no? Yes x and y will be greater than n-1, but x-1 and y-1 won't because maximum value of x and/or y can be n only. — costheta_z, Dec 15 '20 at 23:08
`32 100 117` -- x is 32, y is 100 (out-of-bounds). Also, your input statements don't match your `cin`'s. How are you reading in the first two values, `100` and `50`? — PaulMcKenzie, Dec 15 '20 at 23:12
@PaulMcKenzie 100 and 50 are read in n and edges respectively right at the beginning of main and x=31, y=99 because of the decrementng are both under 100 which is n for this case how is that out of bounds? — costheta_z, Dec 15 '20 at 23:14
[Here is your code, as translated to standard C++ as best as I can](https://ideone.com/qi7apE). There you see an error, even with the code that you claim works. The error I pointed out previously was due to trying to piece together your code into a [mcve]. Also, why are you using global variables?? The error, when run under Visual Studio, is this line: `dist.erase(dist.find(p));`. Why are you erasing when you have not confirmed the item exists? Anyway, the code is there, it's up to you to debug it. — PaulMcKenzie, Dec 15 '20 at 23:24
[link](https://ideone.com/D6Cil8) that is the code I claimed works, and it does... it gave the expected output, I'd submitted it on cf and it passed this test that's how I knew but here is the link cuz you ran the one I said didn't work... The ll one — costheta_z, Dec 15 '20 at 23:32
Welcome to the world of C++. Just because a code "works" doesn't mean it's correct. C++ has something called "undefined behavior". — PaulMcKenzie, Dec 15 '20 at 23:40
[link](https://ideone.com/ApARhU) and that is how `dist.erase(dist.find(p));` is working in one of them and not in the other, thanks for pointing it out but yeah the code I claimed to be working seems to find that pair (which it should it's just dijkstra's algo) the other one gives a bunch of "oops" as [link](https://ideone.com/Wvdly2) here. Thanks for your help I appreciate it. — costheta_z, Dec 15 '20 at 23:42
The undefined behavior occurs when i try to erase what doesn't exist, in the code I linked here I am checking for the pair and then erasing otherwise printing "oops" which should not give undefined behavior. By it "works" for the code I claimed to be working I mean it finds every pair and is not giving undefined behavior. — costheta_z, Dec 15 '20 at 23:45
[Classic example](https://www.youtube.com/watch?v=73wMnU7xbwE) The guy on the left is Bill Gates, and I guarantee you he was not going to be on that stage unless he'd seen that demonstration run perfectly more than a few times. Undefined behaviour is the purest form of Evil known to mankind. — user4581301, Dec 15 '20 at 23:59
But then if I am checking whether or not iterator points to vector.end() and only erasing when it doesn't then what is the cause of undefined behavior? @user4581301 sorry to be bothering you — costheta_z, Dec 16 '20 at 00:05
Your error message is highly suspicious since there's no `free` in the code. — Mark Ransom, Dec 16 '20 at 01:14
@PaulMcKenzie even the `[]` operator of vector can [check for out-of-bound conditions](https://stackoverflow.com/q/1290396/995714) in debug mode in MSVC — phuclv, Dec 17 '20 at 00:23
@phuclv -- Yes, that is true. My opinion is that Visual C++ is a better compiler to learn C++ on in terms of ease-of-use and debugging than, for example g++ or clang. — PaulMcKenzie, Dec 17 '20 at 00:31

score 0 · Answer 1 · edited Dec 21 '20 at 03:21

The problem was indeed related to long long and that is why the code with int was running fine, because the fact that update would create an overflow, as the variable is a sum of two long long type variables which would have had max value LLONG_MAX assigned in main() was overlooked.

As long long can not accommodate 2*LLONG_MAX it was neither holding nor finding that value in the set of pairs used as the min heap. Thus iterator pointed to end of the set and erasing set.end() would generate undefined behavior in the long long datatype oriented code whereas it wouldn't in the int oriented one.

Replacing LLONG_MAX with 1e18 instead, in the code solved the problem and the code runs for all test files smoothly.

Additionally to clarify all the reasons that were pointed out through the comments, I thought I should clarify that not checking if dist.find(p) exists and doesn't point to end of set before performing dist.erase(dist.find(p)) would not create any problems. This is because it is Dijkstra's Algorithm and as many times as update is found to be less than previous, the node that this updated distance is being calculated for, from the source will always be present in the set paired with the distance previous. This is because all the nodes are initially entered with a value of 10e8 and are being updated as the values are found in successive iterations of the while loop.

Below is the working code, the only difference is that instead of LLONG_MAX I have used 1e18 and it runs fine on all test files including the one I had mentioned in the question as being problematic.

#include<bits/stdc++.h>
using namespace std;

# define ll long long int

ll update, previous; 
set<pair<ll, int>> dist;
auto it=dist.begin();
int ind=0, n, i, j;
pair<ll, int>p;

void dij(vector<pair<int, ll>> tree[], bool decided[], ll d[], int path[]) {
    ind=0;
    while(!dist.empty()) {
        it=dist.begin();
        if(it==dist.end()) return;
        ind=it->second;
        dist.erase(it);
        decided[ind]=1;
        for(j=0; j<tree[ind].size(); j++) {
            update=d[ind]+tree[ind][j].second;
            previous=d[tree[ind][j].first];
            if(update<previous) {
                p=make_pair(previous, tree[ind][j].first);
                //cout<<p.first<<" intermediate "<<p.second<<endl;
                dist.erase(dist.find(p));
                p=make_pair(update, tree[ind][j].first);
                dist.insert(p);
                path[tree[ind][j].first]=ind;
            }
            d[tree[ind][j].first]=min(update, previous);
        }
    }
}

int main()
{
    ll edges;
    ll x, y, weight;
    cin>>n>>edges;
    vector<pair<int, ll>> graph[n];
    for(i=0; i<edges; i++) {
        cin>>x>>y>>weight;
        x--; y--;
        graph[x].push_back({y, weight});
        graph[y].push_back({x, weight});
    }
    int src=1;
    src--;
    ll d[n];
    for(i=0; i<n; i++) {
        if(src==i) {
            dist.insert({0, i});
            d[i]=0;
        }
        else {
            dist.insert({1e18, i});
            d[i]=1e18;
        }
    }
    bool decided[n]={0};
    int path[n]={-1};
    for(int i=1; i<n; i++) path[i]=-2;
    dij(graph, decided, d, path);
    if(path[n-1]==-2) cout<<-1;
    else {
        vector<int> s;
        int final=n-1;
        while (final!=-1) {
            s.push_back(final);
            final=path[final];
        }
        reverse(s.begin(), s.end());
        for(auto pi:s) cout<<pi+1<<" ";
    }
    cout<<endl;
}

Here is the output for the test file in the question:

Input

Participant's output

-1

Jury's answer

-1

The link to submission - Dijkstra

`Changing LLONG_MAX to 1e18 solved the problem` no you're not allowed to change standard macro values — phuclv, Dec 16 '20 at 17:29
I thought it was obvious... plus there is also the code in the answer so I didn't think it was understandable but I'll make sure I edit it to "replace" instead of "change" so that no one else gets confused... — costheta_z, Dec 16 '20 at 20:21
[Here is your answer, written in standard C++](https://ideone.com/g351jz). — PaulMcKenzie, Dec 17 '20 at 06:00

How did changing datatype of an array from int to long long and INT_MAX to LLONG_MAX in a C++ code result in a runtime error?

1 Answers1