25

Let's say I have a class Derived which derives from class Base, where sizeof(Derived) > sizeof(Base). Now, if one allocates an array of Derived like this:

Base * myArray = new Derived[42];

and then attempts to access the n-th object using

doSomethingWithBase(myArray[n]);

then this will likely (but not always) cause undefined behaviour, because a Base is accessed at an invalid location.

What is the correct term for such a programming error? Should it be considered a case of object slicing?

jotik
  • "Using C-style arrays and pointers and `new`". That's the name of this programmer's error. – n. m. could be an AI May 05 '16 at 12:58
  • @n.m. this is not C-style array but classic dynamic array, I think OP raised very good question - completely legit code by syntax leads to disaster – Slava May 05 '16 at 13:01
  • Allocating arrays of anything that's not plain old C structures - that's your problem right there. –  May 05 '16 at 13:01
  • @Slava Not limited as such, but you really want to wrap them in a class that manages them properly, such as std::vector :) –  May 05 '16 at 13:05
  • 2
    @Slava That's classic dynamic **C-style** array. A C-style array is a sequence of objects exposed via a raw pointer to its first element. It doesn't matter how it is allocated. – n. m. could be an AI May 05 '16 at 13:05
  • @Slava *completely legit code by syntax leads to disaster* This is just your run-of-the-mill undefined behavior, nothing special. – n. m. could be an AI May 05 '16 at 13:09
  • 1
    @Slava Use `std::vector` (I don't believe I still have to say that several times a day). – n. m. could be an AI May 05 '16 at 13:10
  • 2
    @n.m. how vector allocates data inside? It is using new[] is it not? But according to you that's programmer's error – Slava May 05 '16 at 13:13
  • 2
    @Slava That's irrelevant to OP's question; please avoid asking here. – edmz May 05 '16 at 13:27
  • 3
    @Slava `how vector allocates data inside` is not anyone's business. It is guaranteed by the standard to work, that's all users need to know. `new`, if used, is called by the standard library. The standard library is allowed to use stuff which is not accessible or not advised for the users of the library. – n. m. could be an AI May 05 '16 at 13:42
  • Huh, I never thought of this. A legal implicit pointer conversion resulting in total nonsense code, really easily. Good going, C++! – Lightness Races in Orbit May 10 '16 at 20:03
  • @Slava: _"how vector allocates data inside? It is using new[] is it not?"_ Yes but `new char[]` most likely, then placement new into that buffer. Not related to this question in any way. – Lightness Races in Orbit May 10 '16 at 20:04
  • 2
    As a German I suggest ___Arrayelementsizemismatch___ as the correct term. Mhmm. Gotta shove a "base" in there somewhere... – sbi May 14 '16 at 08:23
  • 1
    In the special case where 42 is instead 1, it would be object slicing. But, really it's that since it is possible that sizeof(Base) != sizeof(Derived), array indexing gets the wrong memory location. If the sizeof(Base) == sizeof(Derived), then again it is object slicing. Object slicing is a special case of this! – eyeApps LLC May 15 '16 at 19:39

4 Answers

26

It is not slicing at all, rather it is undefined behavior because you are accessing a Derived object where none exists (unless you get lucky and the sizes line up, in which case it is still UB but might do something useful anyway).

It's a simple case of failed pointer arithmetic.
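To make the "failed pointer arithmetic" concrete, here is a minimal sketch that only compares byte offsets and never touches the array through a Base pointer, so it stays well defined. The types and function names are made up for illustration; the question does not show the real classes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types: the extra member makes Derived larger than Base.
struct Base { int id = 0; };
struct Derived : Base { double payload[4] = {}; };

static_assert(sizeof(Derived) > sizeof(Base),
              "this sketch assumes Derived is larger than Base");

// Byte offset of element n when the pointer is correctly typed as Derived*.
std::size_t actual_offset(std::size_t n) { return n * sizeof(Derived); }

// Byte offset that indexing through a Base* would compute for the same n.
std::size_t assumed_offset(std::size_t n) { return n * sizeof(Base); }

// The two computations agree only for n == 0 when the sizes differ.
bool offsets_agree(std::size_t n) {
    return actual_offset(n) == assumed_offset(n);
}

// Small self-check documenting the expectation above.
void check_offsets() {
    assert(offsets_agree(0));
    assert(!offsets_agree(1));
}
```

For any n > 0 the Base-typed computation lands inside some earlier Derived element rather than at the start of element n. Even when the offsets happen to agree, the arithmetic on the Base pointer is still undefined.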

John Zwinck
  • It's still undefined behaviour even if you "get lucky and the sizes line up". Pointers are pointers, not integers. Technically. :) – Lightness Races in Orbit May 10 '16 at 20:04
  • @LightnessRacesinOrbit: You're right, the way I worded that made it sound like it would be totally valid in that case but it's not. I've updated the answer to reflect that. – John Zwinck May 11 '16 at 01:40
19

This is not object slicing.

As noted, indexing myArray does not cause object slicing, but results in undefined behavior caused by indexing into an array of Derived as if it were an array of Base.

A kind of "array decay bug".

The bug introduced at the assignment of new Derived[42] to myArray may be a variation of an array decay bug.

In a true instance of this type of bug, there is an actual array:

Derived x[42];
Base *myArray = x;

The problem arises because an array of Derived decays into a pointer to Derived whose value is the address of the array's first element, and that pointer then converts implicitly to a pointer to Base, so the assignment compiles. The decay behavior is inherited from C, where it was a language design feature that allowed arrays to be "passed by reference".

This leads us to an even worse incarnation of this bug. The same feature gives C and C++ array-syntax semantics that turn array function parameters into aliases for pointer parameters.

void foo (Base base_array[42]) {
    //...
}

Derived d[42];
foo(d);          // Boom.

However, a new[] expression yields a pointer to the first element of the allocated array, so no array object is ever named and this is not a true instance of array decay (even though the array allocator is used). The bug symptoms are the same, though, and the intention of new Derived[42] is to get an array of Derived.
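The parameter adjustment described above can be checked at compile time. The sketch below uses hypothetical Base and Derived types and shows that an array parameter really is a pointer parameter, that a Derived array silently converts to Base*, and that a reference-to-array parameter (one possible way to opt out of decay, not something the original code uses) refuses the same argument.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical types for illustration.
struct Base { int id = 0; };
struct Derived : Base { int extra = 0; };

// An array parameter is adjusted to a pointer parameter, so both
// spellings name the same function type.
using ArrayParam   = void(Base[42]);
using PointerParam = void(Base*);
static_assert(std::is_same<ArrayParam, PointerParam>::value,
              "array parameters are adjusted to pointers");

// A Derived array decays to Derived*, which converts to Base*: this is
// the silent conversion that lets foo(d) compile.
constexpr bool decays_to_base_pointer =
    std::is_convertible<Derived(&)[42], Base*>::value;

// A reference to an array keeps element type and length, so a Derived
// array cannot bind to it.
constexpr bool binds_to_base_array_ref =
    std::is_convertible<Derived(&)[42], Base(&)[42]>::value;

// Hypothetical helper taking the array by reference; N is deduced.
template <std::size_t N>
std::size_t element_count(Base (&)[N]) { return N; }

std::size_t checked_count() {
    Base items[42];
    std::size_t n = element_count(items);
    assert(n == 42);
    return n;
}
```

Calling element_count with a Derived array fails to compile, which is the diagnostic the decaying version never gives you.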

Detecting and avoiding the bug.

Use a smart pointer.

This kind of problem can be avoided by using a smart pointer object instead of managing a raw pointer. For example, the analogous coding error with unique_ptr would look like:

std::unique_ptr<Base[]> myArray = new Derived[42];

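This would yield a compile-time error, because unique_ptr's constructor from a raw pointer is explicit. Direct initialization, std::unique_ptr<Base[]> myArray(new Derived[42]), is rejected as well, because the array form of unique_ptr does not accept a pointer to a derived element type.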
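As a rough check of the unique_ptr behaviour, the following sketch asserts at compile time that a Derived* is neither implicitly convertible to nor usable to construct a std::unique_ptr<Base[]>, and then shows one working alternative: own the array as unique_ptr<Derived[]> and convert each element to Base& only after indexing. The class definitions and the function value_of_element are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical polymorphic types for illustration.
struct Base {
    virtual ~Base() = default;
    virtual int value() const { return 1; }
};
struct Derived : Base {
    int extra = 7;
    int value() const override { return 2; }
};

// Copy-initialization from a raw pointer is rejected (explicit constructor).
static_assert(!std::is_convertible<Derived*, std::unique_ptr<Base[]>>::value,
              "no implicit conversion from Derived* to unique_ptr<Base[]>");

// Direct-initialization is rejected too for the array form.
static_assert(!std::is_constructible<std::unique_ptr<Base[]>, Derived*>::value,
              "unique_ptr<Base[]> cannot adopt a Derived array");

// With the matching element type, construction is accepted.
static_assert(std::is_constructible<std::unique_ptr<Derived[]>, Derived*>::value,
              "unique_ptr<Derived[]> can adopt a Derived array");

// Index with the correct element type, then view the element as Base.
int value_of_element(std::size_t n) {
    assert(n < 42);
    std::unique_ptr<Derived[]> items(new Derived[42]);
    Base& as_base = items[n];   // per-element conversion is fine
    return as_base.value();     // virtual dispatch reaches Derived
}
```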

Use a container, and maybe std::reference_wrapper.

Alternatively, you could avoid new[] and use std::vector<Derived>. You would then be forced to design a different way of passing this array to framework code that is only Base-aware, possibly a template function.

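#include <vector>

void my_framework_code (Base &object) {
    //...
}

// Accepts a vector of any element type that converts to Base&.
template <typename DERIVED>
void my_interface(std::vector<DERIVED> &v) {
    // Each element is converted to Base& individually, so no pointer
    // arithmetic is ever done through a Base pointer.
    for (DERIVED &element : v) {
        my_framework_code(element);
    }
}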

Or use std::reference_wrapper<Base>:

std::vector<Derived> v(42);
std::vector<std::reference_wrapper<Base>> myArray(v.begin(), v.end());
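
Here is a slightly fuller sketch of the reference_wrapper idea, assuming a hypothetical virtual name() member so the dispatch is observable. The Derived objects live in one vector, and the Base-only code sees them through a second vector of references.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical polymorphic types for illustration.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};
struct Derived : Base {
    int extra = 0;
    std::string name() const override { return "Derived"; }
};

// Stand-in for framework code that only knows about Base.
std::string my_framework_code(const Base& object) { return object.name(); }

std::vector<std::string> names_via_wrappers(std::size_t count) {
    std::vector<Derived> storage(count);   // owns the objects
    // Each wrapper refers to one Derived element, viewed as Base.
    std::vector<std::reference_wrapper<Base>> view(storage.begin(),
                                                   storage.end());
    std::vector<std::string> names;
    for (Base& object : view) {            // wrapper converts to Base&
        names.push_back(my_framework_code(object));
    }
    assert(names.size() == count);
    return names;
}
```

The wrappers do not own anything, so they are only valid while the underlying vector is alive and has not reallocated.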
jxh
  • If the sizes are the same, it will most likely work. Does the standard have anything to say about that? – sp2danny May 12 '16 at 07:50
  • @sp2danny: Once the array-to-pointer conversion is performed, the `Base` pointer must be treated as a pointer to a nonarray object. Only a `Derived` pointer can be treated as a pointer to an array object. Thus, pointer arithmetic on the `Base` pointer is limited to what is defined for an array of length one. – jxh May 12 '16 at 09:20
  • @jxh So contrary to ruakh's comment on n.m.'s answer, `myArray + 0` is NOT undefined behaviour? – jotik May 13 '16 at 06:20
  • @sp2danny: Okay, found n.m.'s citation in C++2014. – jxh May 13 '16 at 18:17
11

This is not object slicing in any way.

Object slicing is perfectly well defined by the C++ standard. It may be a violation of object-oriented design principles or whatever, but it is not a violation of C++ rules.
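
For contrast, this is what actual object slicing looks like, in a small sketch with hypothetical types. Copying a Derived into a Base object keeps only the Base subobject; the result is a perfectly valid Base, just not the object you may have wanted.

```cpp
#include <cassert>
#include <string>

// Hypothetical polymorphic types for illustration.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
    int id = 1;
};
struct Derived : Base {
    int extra = 2;
    std::string name() const override { return "Derived"; }
};

// Slicing: the copy is a complete Base object, the Derived part is dropped.
std::string name_after_slicing() {
    Derived d;
    Base copy = d;          // well-defined copy of the Base subobject
    return copy.name();     // dynamic type of copy is Base
}

// No slicing: a reference still refers to the original Derived object.
std::string name_through_reference() {
    Derived d;
    Base& ref = d;
    return ref.name();
}

// Small self-check documenting the expectation above.
void check_slicing() {
    assert(name_after_slicing() == "Base");
    assert(name_through_reference() == "Derived");
}
```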

This code violates 5.7 [expr.add] paragraph 7:

For addition or subtraction, if the expressions P or Q have type “pointer to cv T”, where T is different from the cv-unqualified array element type, the behavior is undefined. [Note: In particular, a pointer to a base class cannot be used for pointer arithmetic when the array contains objects of a derived class type. —end note].

The array subscript operator is defined to be equivalent to pointer arithmetic, 5.2.1 [expr.sub] paragraph 1:

The expression E1[E2] is identical (by definition) to *((E1)+(E2))
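
Given that equivalence, one way to stay within the rules is to do the indexing through a pointer of the real element type and convert to Base only afterwards. The sketch below assumes hypothetical Base and Derived types and a made-up read_id function standing in for doSomethingWithBase.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types: Derived is larger than Base.
struct Base { int id = 0; };
struct Derived : Base { double payload[4] = {}; };

// Stand-in for code that only knows about Base.
int read_id(const Base& object) { return object.id; }

int id_of_element(std::size_t n) {
    constexpr std::size_t count = 42;
    assert(n < count);

    Derived* items = new Derived[count];   // pointer keeps the element type
    for (std::size_t i = 0; i < count; ++i) {
        items[i].id = static_cast<int>(i);
    }

    // items[n] is *(items + n); the arithmetic uses sizeof(Derived).
    assert(&items[n] == items + n);

    // Converting a single element to Base& involves no pointer arithmetic.
    const Base& element = items[n];
    int id = read_id(element);

    delete[] items;
    return id;
}
```

The raw new[] and delete[] are kept only to mirror the question; a std::vector<Derived> would do the same job with less room for error.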

jotik
n. m. could be an AI
  • 1
    +1. Interestingly, this implies that even `myArray + 0` invokes undefined behavior. I wonder if any compilers currently use that fact for any weird optimizations? – ruakh May 05 '16 at 19:46
  • 1
    @ruakh: They could use it to determine that the dynamic type equals the static type and eliminate virtual function calls. Don't know, if any do though. – MikeMB May 06 '16 at 07:00
  • What do you mean by *"Object slicing is perfectly well defined by the C++ standard"*? – jotik May 06 '16 at 07:59
  • 2
    @jotik I mean object slicing does not violate any C++ rules. There's no undefined behaviour. – n. m. could be an AI May 06 '16 at 08:33
  • @MikeMB nice, that would be an excellent way to hint a compiler at the dynamic type of an object. – Johannes Schaub - litb May 14 '16 at 14:03
8

This is not a case of slicing, although it is very similar. Slicing is well defined. This is simply undefined behaviour (always, not just likely) due to illegal pointer arithmetic.

eerorika