
The code below demonstrates strange behaviour when trying to access an out-of-range index in a vector:

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> a_vector(10, 0);

    for(int i = 0; i < a_vector.size(); i++)
    {
        std::cout << a_vector[i] << ", ";
    }
    for(int j = 0; j <= a_vector.size(); j++)
    {
        std::cout << a_vector[j] << ", ";
    }
    return 0;
}

The first for loop produces the expected 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, output; however, the second loop produces 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1318834149,.

The last number produced by the second loop changes every time the code is run, and is always large, unless the length of the vector is between three and seven (inclusive), in which case the last number is 0. It also persists for larger indexes - for example, modifying the stop value of the second loop to j <= a_vector.size() + 2000 keeps producing large numbers until index 1139, at which point it reverts to 0.

Where does this number come from, does it mean anything, and most importantly why isn't the code throwing an 'out of range' error, which is what I would expect it to do when asked to access the 11th element of a vector 10 elements long?

Utumno
  • This is undefined behaviour. – Oliver Charlesworth Aug 06 '16 at 17:20
  • Just don't do it. Use [range `for` loops](http://en.cppreference.com/w/cpp/language/range-for) or iterators or [standard algorithms](http://en.cppreference.com/w/cpp/algorithm) instead. – Some programmer dude Aug 06 '16 at 17:22
  • Any reference of your choosing will say nothing about an `out_of_range` error from `operator[]`. – chris Aug 06 '16 at 17:22
  • this isn't an issue I'm grappling with, it turned up after making a daft error - I'm just curious about what caused it – Utumno Aug 06 '16 at 17:24
  • Learn this basic tenet of C++: You don't pay for what you don't use. You didn't ask for bounds-checking, e.g. by using `.at(index)`, so none is performed. Normally people don't deliberately write broken code, and when they don't, there is no possibility of out-of-bounds errors and hence UB, so they don't need to waste time checking and hence can have working _and_ fast code. – underscore_d Aug 06 '16 at 17:24
  • @chris why doesn't `operator[]` produce out of range errors? – Utumno Aug 06 '16 at 17:24
  • @Utumno because the language doesn't require it to... and because `vector` is just a clever, auto-managed dynamically allocated array, and basic arrays' `operator[]` doesn't check either, for reasons already given. Lack of bounds-checking and undefined behaviour are extremely widely discussed subjects; have you tried searching? – underscore_d Aug 06 '16 at 17:26
  • @underscore_d I did search, but found nothing relevant to my terms, but now that I know what I'm looking at I found this: [link](http://stackoverflow.com/questions/1239938/accessing-an-array-out-of-bounds-gives-no-error-why) which is the same issue – Utumno Aug 06 '16 at 17:30
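
To make the bounds-checking point from the comments above concrete, here is a minimal sketch: `operator[]` performs no check, while `.at()` is specified to check the index and throw `std::out_of_range` when it is invalid.

#include <iostream>
#include <stdexcept>
#include <vector>

int main()
{
    std::vector<int> a_vector(10, 0);

    // operator[] does no bounds check: a_vector[10] would be undefined behaviour,
    // so it may print leftover memory, crash, or appear to "work".

    // at() is required to check the index and throws std::out_of_range on failure.
    try
    {
        std::cout << a_vector.at(10) << "\n";
    }
    catch (const std::out_of_range& e)
    {
        std::cout << "caught: " << e.what() << "\n";
    }
    return 0;
}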

2 Answers


Did you mean this?

for(int j = 0; j < a_vector.size(); j++)
{
    std::cout << a_vector[j] << ", ";
}

Because you're going out of the vector's range, which is undefined behaviour and will return a "random" number every time you run it.
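
As a side note, the range-based for loop suggested in the comments under the question avoids manual indexing altogether, so this kind of off-by-one mistake cannot happen; a minimal sketch:

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> a_vector(10, 0);

    // The loop variable is each element in turn, never an index,
    // so there is no way to step past the end of the vector.
    for (int value : a_vector)
    {
        std::cout << value << ", ";
    }
    std::cout << "\n";
    return 0;
}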

Paul F.
  • The code I wrote is deliberately broken to produce the behaviour - I was curious as to where the 'random' number came from – Utumno Aug 06 '16 at 17:26
  • @Utumno If you write broken code that invokes undefined behaviour, then anything can happen, including a 'random' number. – underscore_d Aug 06 '16 at 17:27
  • Oh ok, my bad. I guess that the vector object is looking at the next "slot" in memory, which contains pretty much anything, and interprets it as a number. – Paul F. Aug 06 '16 at 17:29
  • @PaulF. Thanks anyway, I didn't realise the random number comes from the next memory slot – Utumno Aug 06 '16 at 17:31
  • @Utumno Array accesses and, by symmetry, `operator[]` are implemented as `*(base + index)`, so it's pretty much 'guaranteed' that the compiler will _try_ to access the next piece of memory _as if_ it were another element of the array. Whether that attempt will succeed is a different matter entirely, because the behaviour is undefined. For example, the next piece of memory might cause a segfault, or return a trap representation, or who knows? It's UB. So it must be avoided at all costs. – underscore_d Aug 06 '16 at 17:36
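
As a rough sketch of the `*(base + index)` point in the comment above (illustrative only; the out-of-bounds form is exactly the undefined behaviour being discussed):

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v(10, 0);

    // For a valid index j, v[j] reads the same element as *(v.data() + j):
    // operator[] is plain pointer arithmetic with no bounds check.
    std::size_t j = 3;
    std::cout << v[j] << " == " << *(v.data() + j) << "\n";

    // v[10] would be *(v.data() + 10), one past the allocated elements.
    // Nothing checks that address, so reading it is undefined behaviour.
    return 0;
}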

C++ is powerful, and with great power comes great responsibility. If you go out of range, it lets you. 99.99999999% of the time that isn't a good thing, but it still lets you do it.

As for why it changes every time, the computer is treating the hunk of memory after the end of the array as another int, then displaying it. The value of that int depends on what bits are left in that memory from when it was used last. It might have been used for a string allocated and discarded earlier in the program, it might be padding that the compiler inserts in memory allocations to optimize access, it might be active memory being used by another object. If you have no idea (like in this case), you have no way to know and shouldn't expect any sort of consistent behavior.
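
One way to make this failure loud instead of silent is to build with AddressSanitizer (assuming GCC or Clang; a sketch, not part of the original answer), which will typically report the read past the end of the vector's heap allocation as a heap-buffer-overflow instead of printing leftover memory:

// Build with:  g++ -g -fsanitize=address bad_index.cpp && ./a.out
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> a_vector(10, 0);

    // Deliberately reads one element past the end, as in the question.
    // Under AddressSanitizer this typically aborts with a diagnostic
    // pointing at this line, rather than quietly printing garbage.
    std::cout << a_vector[a_vector.size()] << "\n";
    return 0;
}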

(What is the 0.00000001% of the time when it is a good thing, you may ask? Once I intentionally reached beyond the range of an array in a subclass to access some private data in the parent class that had no accessor, in order to fix a bug. This was in a library I had no control over, so I had to do this or live with the bug. Perhaps not exactly a good thing, but since I was confident of the layout of memory in that specific case it worked.)

ADDENDUM: The case I mentioned was a pragmatic solution to a bad situation. Code needed to ship and the vendor wasn't going to fix a bug for months. Although I was confident of the behavior (I knew the exact target platform, the code was statically linked so wasn't going to be replaced by the user, etc.) this introduced code fragility and new responsibility for the future. Namely, the next time the library was updated it would almost certainly break.

So I commented the heck out of the code explaining the exact issue, what I was doing, and when it should be removed. I also used lots of CAPITAL LETTERS in my commit message. And I told all of the other programmers, just in case I got hit by a bus before the bug was fixed. In other words, I exercised the great responsibility needed to wield this great power.

Steve
  • `If you have no idea (like in this case), you have no way to know and shouldn't expect any sort of consistent behavior.` There is usually no way to know, since the memory is managed by the OS, not the program... and even if you _think_ you know, relying on that 'knowledge' is folly because accessing out-of-bounds is UB. Doing any 'trick' that relies on UB is just asking for trouble. It might've worked for you for a while but is completely broken and vulnerable to changing at the drop of a hat, for instance on another platform or compiler that pads differently. Please don't legitimise this idea – underscore_d Aug 06 '16 at 17:40
  • I agree, this was years ago and a situation where there was no legit mechanism to fix the problem -- code needed to ship and the vendor wasn't going to fix the bug for months. I still feel a little dirty thinking about it. Maybe the number is more like 0.00000001% :) – Steve Aug 06 '16 at 17:42
  • At least you feel suitably guilty ;-) Did it all get sorted eventually? I'd suggest including more of a warning sign in your answer that the whole thing is UB, whether or not one knows the layout of memory past-the-end. Such constructs must be avoided if at all possible - which is almost always, except in a vanishingly small number of cases. & don't get me wrong: those are amusing to hear about! It'd also be good to illustrate briefly _why_ not bounds-checking is part of C++'s power (speed when you want it) - which you and me will think is obvious, but new readers might not realise right away. – underscore_d Aug 06 '16 at 17:45
  • There is _never_ a good time to rely on UB by definition, one shouldn't even try to ascribe a non-zero number to the "percentage of time" when it's a good idea. – sjrowlinson Aug 06 '16 at 18:03
  • I agree that relying on undefined behavior is a horrible idea, but in the case I'm describing it wasn't "undefined". I knew the memory layout around the object, architecture it was being used on, etc. This wasn't code that was going to be compiled elsewhere (well before the days of sharing stuff on GitHub), it was going to be used by me and my team. – Steve Aug 06 '16 at 18:29