1

This has been bugging me for quite some time. I have a pointer. I declare an array of type int.

int* data;
data = new int[5];

I believe this creates an array of int with size 5. So I'll be able to store values from data[0] to data[4].

Now I create an array the same way, but without size.

int* data;
data = new int;

I am still able to store values in data[2] or data[3]. But I created an array of size 1. How is this possible?

I understand that data is a pointer pointing to the first element of the array. Though I haven't allocated memory for the next elements, I am still able to access them. How?

Thanks.

Rhathin
  • 6
    C++ doesn't have any bounds-checking. Going out of bounds leads to [*undefined behavior*](https://en.wikipedia.org/wiki/Undefined_behavior). – Some programmer dude Jan 07 '20 at 09:57
  • 3
    You just write somewhere in the memory. Use C arrays only, when really needed. Otherwise use `std::vector`, `std::array` or similar. – Simon Jan 07 '20 at 09:58
  • 4
    Does this answer your question? [Accessing an array out of bounds gives no error, why?](https://stackoverflow.com/questions/1239938/accessing-an-array-out-of-bounds-gives-no-error-why) – ChrisMM Jan 07 '20 at 10:00

7 Answers

3

Normally, there is no need to allocate an array "manually" with new. It is much more convenient and much safer to use std::vector<int> instead, and to leave the correct implementation of dynamic memory management to the authors of the standard library.

std::vector<int> optionally provides element access with bounds checking, via the at() method.

Example:

#include <vector>
int main() {
    // create resizable array of integers and resize as desired
    std::vector<int> data; 
    data.resize(5);
    // element access without bounds checking
    data[3] = 10;
    // optionally: element access with bounds checking
    // out-of-range access throws std::out_of_range (uncaught here, so the program terminates)
    data.at(10) = 0;
}
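To make the behavior of at() concrete, here is a minimal sketch that catches the exception instead of letting it terminate the program. The names try_get and run_at_example are hypothetical helpers made up for this illustration, not part of the standard library, and the sketch assumes a C++17 compiler for std::optional.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a standard function): returns the element at
// index i, or std::nullopt if i is out of range.
std::optional<int> try_get(const std::vector<int>& v, std::size_t i) {
    try {
        return v.at(i);  // at() throws std::out_of_range when i >= v.size()
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
}

// Small usage example for the helper above.
void run_at_example() {
    std::vector<int> data(5);  // five value-initialized ints
    data[3] = 10;              // in bounds, so operator[] is fine here
    assert(try_get(data, 3).has_value());
    assert(*try_get(data, 3) == 10);
    assert(!try_get(data, 10).has_value());  // index 10 is rejected, no undefined behavior
}
```

Calling run_at_example() from main exercises both the in-range and the out-of-range path. Whether you prefer an exception or an optional return value is a design choice; the point is only that the failed access is detected.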

The default mode in C++ is usually to let you shoot yourself in the foot with undefined behavior, as you have seen in your case.


Also, in the second case you don't allocate an array at all, but a single object. Note that you must use the matching form of delete too: delete[] for memory obtained with new[], and plain delete for memory obtained with new.

int main() {
    // allocate and deallocate an array
    int *arr = new int[5];
    delete[] arr;
    // allocate and deallocate a single object
    int *p = new int;
    delete p;
}

moooeeeep
3

When you use new int, accessing data[i] where i != 0 has undefined behaviour. But that doesn't mean the operation will fail immediately (or every time, or even ever). On most architectures it's very likely that the memory addresses just beyond the end of the block you asked for are mapped to your process, so you can access them. If you're not writing to them, it's no surprise you can access them (though you shouldn't). Even if you write to them, most memory allocators have a minimum allocation size, and behind the scenes you may well have been allocated space for more integers (4 is realistic) even though the code only requests 1.

You may also be overwriting some area of memory without ever getting tripped up. A common consequence of writing beyond the end of an array is corrupting the free-memory store itself. The consequence may be catastrophic, but may only show itself in a later allocation, possibly of a similarly sized object.

It's a dreadful idea to rely on such behaviour but it's not very surprising that it appears to work. C++ doesn't (typically or by default) perform strict range checking and accessing invalid array elements may work or at least appear to work initially.

This is why C and C++ can be plagued with bizarre and intermittent errors. Not all code that provokes undefined behaviour fails catastrophically in every execution.

Persixty
2

Going outside the bounds of an array in C++ is undefined behavior, so anything can happen, including things that appear to work "correctly".

In practical implementation terms on common systems, you can think of "virtual" memory as a large "flat" space from 0 up to the largest address a pointer can hold, and pointers are addresses in this space.

The "virtual" memory for a process is mapped to physical memory, page file, etc. Now, if you access an address that is not mapped, or try to write a read-only part, you will get an error, such as an access violation or segfault.

But this mapping is done in fairly large chunks for efficiency, such as 4 KiB "pages". The allocators in a process, such as new and delete (or the stack), will further split up these pages as required. So accessing other parts of a valid page is unlikely to raise an error.

This has the unfortunate result that it can be hard to detect out-of-bounds access, use-after-free, etc. In many cases writes will succeed, only to corrupt some other seemingly unrelated object, which may cause a crash later or incorrect program output. It is best to be very careful about C and C++ memory management.

int* data = new int;          // will be some virtual address
data[1000] = 5;               // undefined behavior; possibly the start of a 4K page, potentially allowing a great deal beyond it
int* other_int = new int[5];
other_int[10] = 10;           // undefined behavior; index 10 is past the 5 allocated elements
data[10000] = 42;             // with further pages beyond, so you can really make a mess of your program's memory
bool clobbered = (other_int[10] == 42); // perfectly possible to overwrite other things in unexpected ways

C++ provides many tools to help, such as std::string, std::vector and std::unique_ptr, and it is generally best to try and avoid manual new and delete entirely.
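As a rough illustration of that advice, the sketch below replaces both allocations from the question with standard library types. The helper names make_sequence, make_owned_int and run_ownership_example are hypothetical and exist only for this example; std::make_unique assumes C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical helper: builds {0, 1, ..., n-1} without any manual new/delete.
std::vector<int> make_sequence(std::size_t n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);  // fill with 0, 1, 2, ...
    return v;                          // the vector frees its storage automatically
}

// Hypothetical helper: a single heap-allocated int owned by a unique_ptr.
std::unique_ptr<int> make_owned_int(int value) {
    return std::make_unique<int>(value);  // delete runs when the unique_ptr is destroyed
}

// Small usage example for the helpers above.
void run_ownership_example() {
    std::vector<int> data = make_sequence(5);  // replaces: new int[5]
    assert(data.size() == 5);                  // the container knows its own size
    assert(data.at(4) == 4);                   // checked access to the last element

    std::unique_ptr<int> single = make_owned_int(42);  // replaces: new int
    assert(*single == 42);
}
```

Neither helper needs a matching delete or delete[], so the new/delete mismatch and the forgotten deallocation cannot occur in this version.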

Fire Lancer
1

new int allocates one integer only. If you access an offset larger than 0, e.g. data[1], you overwrite other memory.

nivpeled
  • 3
    You don't necessarily overwrite memory (even if that might be a common consequence). You have Undefined Behavior, so anything might happen. If the compiler notices this, it might also turn the entire function/code into a no-op. Or do anything else. – Max Langhof Jan 07 '20 at 10:05
1

int * is a pointer to something that's probably an int. When you allocate using new int, you're allocating one int and storing its address in the pointer. In reality, int * is just a pointer to some memory.

We can treat an int * as a pointer to a scalar element (i.e. new int) or to an array of elements; the language has no way of telling you what your pointer is really pointing to. That is a very good argument to stop using pointers and use only scalar values and std::vector.

When you write a[2], you access the memory 2 * sizeof(int) bytes past the address that a points to. If a points to a scalar value, anything could be located there, and reading it causes undefined behaviour (your program might actually crash; this is a real risk). Writing to that address will most likely cause problems; it is not merely a risk, but something you should actively guard against, i.e. use std::vector if you need an array and int or int& if you don't.
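The point that the type alone cannot tell you what a pointer refers to can be checked at compile time. This is only a sketch: the aliases SinglePtr and ArrayPtr and the helpers index_is_valid and run_type_example are hypothetical names chosen for the example. decltype does not evaluate its operand, so nothing is actually allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// decltype only inspects the type of the expression; no allocation happens.
using SinglePtr = decltype(new int);     // what "new int" gives you
using ArrayPtr  = decltype(new int[5]);  // what "new int[5]" gives you

// Both forms produce a plain int*, so the element count is lost in the type.
static_assert(std::is_same<SinglePtr, int*>::value, "new int yields int*");
static_assert(std::is_same<ArrayPtr, int*>::value, "new int[5] also yields int*");

// Hypothetical helper: a vector, unlike a raw pointer, carries its size,
// so an index can be validated before it is used.
bool index_is_valid(const std::vector<int>& v, std::size_t i) {
    return i < v.size();
}

// Small usage example for the helper above.
void run_type_example() {
    std::vector<int> one(1);          // a container holding exactly one int
    assert(index_is_valid(one, 0));   // index 0 exists
    assert(!index_is_valid(one, 2));  // index 2 can be rejected up front
}
```

With a raw int* there is nothing comparable to v.size() to check against, which is why the out-of-bounds access in the question goes unnoticed.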

Clearer
  • Reading from and writing to a scalar as if it were an array are hard bugs. You should do whatever you can to make sure it doesn't happen. – Clearer Jan 07 '20 at 11:50
1

The expression a[b], where one of the operands is a pointer, is another way to write *(a+b). Let's for the sake of sanity assume that a is the pointer here (but since addition is commutative it can be the other way around! try it!); then the address in a is incremented by b times sizeof(*a), resulting in the address of the bth object after *a.

The resulting pointer is dereferenced, resulting in a "name" for the object whose address is a+b.

Note that a does not have to be an array; if it is one, it "decays" to a pointer before the operator [] is applied. The operation is taking place on a typed pointer. If that pointer is invalid, or if the memory at a+b does not in fact hold an object of the type of *a, or even if that object is unrelated to *a (e.g., because it is not in the same array or structure), the behavior is undefined.
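The equivalence described above can be checked on an index that is actually in bounds. This is a minimal sketch; byte_offset and run_subscript_example are hypothetical helper names, and the array contents are arbitrary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance in bytes between a[0] and a[b].
// a + b must stay inside the array (or one past its end) to be valid.
std::ptrdiff_t byte_offset(const int* a, std::ptrdiff_t b) {
    const char* first  = reinterpret_cast<const char*>(a);
    const char* target = reinterpret_cast<const char*>(a + b);
    return target - first;
}

// Small usage example: all indices used here are within the array.
void run_subscript_example() {
    int arr[5] = {10, 11, 12, 13, 14};
    int* a = arr;  // the array decays to a pointer to its first element

    assert(a[2] == *(a + 2));  // subscripting is pointer arithmetic plus dereference
    assert(2[a] == a[2]);      // addition commutes, so the operands can be swapped
    assert(a[2] == 12);

    // The address advances by b * sizeof(*a) bytes.
    assert(byte_offset(a, 2) == 2 * static_cast<std::ptrdiff_t>(sizeof(int)));
}
```

The same arithmetic is performed for an index that is out of bounds; the only difference is that the resulting address no longer refers to an object you are allowed to access.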

In the real world, "normal" programs do not do any bounds checking but simply add the offset to the pointer and access that memory location. (Accessing out-of-bounds memory is, of course, one of the more common bugs in C and C++, and one of the reasons these languages are not recommended without restrictions for high-security applications.)

If the index b is small, the memory is probably accessible by your program. For plain old data like int the most likely result is then that you simply read or write the memory in that location. This is what happened to you.

Since you overwrite unrelated data (which may in fact be used by other variables in your program) the results are often surprising in more complex programs. Such errors can be hard to find, and there are tools out there to detect such out-of-bounds access.

For larger indices you'll at some point end up in memory which is not assigned to your program, leading to an immediate crash on modern systems like Windows NT and up, and unpredictable results on architectures without memory management.

Peter - Reinstate Monica
0

I am still able to store values in data[2] or data[3]. But I created an array of size 1. How is this possible?

The behaviour of the program is undefined.

Also, you didn't create an array of size 1, but a single non-array object instead. The difference is subtle.

eerorika