Is the compiler allowed to optimise out private data members?

Question

If the compiler can prove that a (private) member of a class is never used, including by potential friends, does the standard allow the compiler to remove this member from the memory footprint of the class?

It is self-evident that this is not possible for protected or public members at compile time, but there could be circumstances in which such a proof can be constructed for private data members.

Related questions:

Behind the scenes of public, private and protected (sparked this question)
Is C++ compiler allowed to optimize out unreferenced local objects (about automatic objects)
Will a static variable always use up memory? (about static objects)

*Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object.* (9.2.14). However, it is not clear to me whether this implies that the compiler cannot remove an element. A removed element will not have a higher address, but then it doesn't need an address at all. – Damien, Dec 08 '20 at 15:22
Just trying to look for the spanner to throw into the works here ... but what if code elsewhere would use a `sizeof(myClass)` type of operation? I really can't think *why* it would do so, but that would possibly break if a member were optimized away. — Adrian Mole, Dec 08 '20 at 15:24
@AdrianMole: A very good point -- even if potential accesses were finite (friends and members), and even if all of those finite accesses were visible to the compiler in one compilation unit, it needs to assume that `sizeof(T)` is used in some other compilation unit which lacks visibility into some of the definitions. Since the other compilation unit would be unable to perform the optimization, no compilation unit can perform the optimization. — Ben Voigt, Dec 08 '20 at 15:35
@AdrianMole Why would that break? The size of a type is up to the compiler IIRC. It is allowed to inflate the size beyond the raw bytes needed for the members anyway, so why not reduce it as well? — bitmask, Dec 08 '20 at 15:36
@AdrianMole “I really can't think why it would do so” — It does that (under the hood) for every array access. But as OP correctly notes, the size doesn’t necessarily reflect the exact members of the class. — Konrad Rudolph, Dec 08 '20 at 15:38
Well, I said "would *possibly* break" ... but I'm aware of padding and related issues. However, in a situation where padding is turned off - or at least known - a 'missing' data member (especially if large) could set things awry. — Adrian Mole, Dec 08 '20 at 15:40
:-) Asking that question definitely did not hurt you. I am jealous but did as promised. ;-) – Yunnosch, Dec 08 '20 at 19:20
Answered in [Will a static variable always use up memory?](https://stackoverflow.com/questions/7755257/will-a-static-variable-always-use-up-memory) — Language Lawyer, Dec 08 '20 at 20:20
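
The ordering rule quoted in the first comment above can be observed directly. The following is a minimal sketch, assuming a hypothetical struct `Triple` whose members all share the same access control; the type and function names are illustrative and do not come from the discussion.

```cpp
#include <cassert>

// Hypothetical type for illustration; all members have the same access control.
struct Triple {
    int first;
    int second;
    int third;
};

// Returns true when members declared later sit at higher addresses,
// which is what the quoted rule requires for members with the same access.
bool later_members_have_higher_addresses() {
    Triple t{1, 2, 3};
    return &t.first < &t.second && &t.second < &t.third;
}

// Three distinct, addressable int members cannot fit in less than this.
static_assert(sizeof(Triple) >= 3 * sizeof(int),
              "all three members occupy storage");

void check_layout() {
    assert(later_members_have_higher_addresses());
}
```

The check only shows the relative order of members that are present. Whether the rule also forbids dropping a member that is never used is exactly the point the comment leaves open.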

Peter Cordes · Answer 1 · 2020-12-29T03:56:13.690

32

Possible in theory (along with unused public members), but not with the kind of compiler ecosystem we're used to (targeting a fixed ABI that can link separately-compiled code). Removing unused members could only be done with whole-program optimization that forbids separate libraries¹.

Other compilation units might need to agree on `sizeof(foo)`, but that isn't something you could derive from a `.h` alone if it depended on verifying that no member function's implementation depends on any private member.
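
To illustrate why the size has to be agreed on, here is a minimal sketch. The class body and the `stride` helper are assumptions made for illustration; only the name `foo` comes from the text above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical definition of `foo`, as it might appear in a shared header.
// `scratch` is never read or written by any member function.
class foo {
public:
    foo() : value(0) {}
    int get() const { return value; }
private:
    int value;
    char scratch[64];
};

// Any translation unit that includes the header computes this constant from
// the class definition alone, even if member function bodies live elsewhere.
// This holds on mainstream implementations that target a fixed ABI.
static_assert(sizeof(foo) >= sizeof(int) + 64,
              "scratch contributes to the size");

// Byte distance between neighbouring array elements, as array indexing uses it.
std::size_t stride() {
    foo items[2];
    const char* p0 = reinterpret_cast<const char*>(&items[0]);
    const char* p1 = reinterpret_cast<const char*>(&items[1]);
    return static_cast<std::size_t>(p1 - p0);
}

void check_stride() {
    assert(stride() == sizeof(foo));
}
```

If one translation unit dropped `scratch` and another did not, the two would disagree on this stride, so arrays of `foo` passed between them would be indexed incorrectly.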

Remember C++ only really specifies one program, not a way to do libraries. The language ISO C++ specifies is compatible with the style of implementation we're used to (of course), but implementations that take all the .cpp and .h files at once and produce a single self-contained non-extensible executable are possible.

If you constrain the implementation enough (no fixed ABI), aggressive whole-program application of the as-if rule becomes possible.

Footnote 1: I was going to add "or exports the size information somehow to other code being compiled" as a way to allow libraries, if the compiler could already see definitions for every member function declared in the class. But @PasserBy's answer points out that a separately-compiled library could be the thing that used the declared private members in ways that ultimately produce externally-visible side effects (like I/O). So we'd have to fully rule them out.

Given that, public and private members are equivalent for the purposes of such an optimization.

edited Dec 29 '20 at 03:56

answered Dec 08 '20 at 16:19

Peter Cordes

“Removing unused private members could … be done with whole-program optimization ….” Sure, but the same is true for public members, because it’s a trivial transformation under the as-if rule. – Konrad Rudolph Dec 08 '20 at 16:21
@KonradRudolph: Indeed. Very true. This is the "language lawyer" answer, not the "given real-world implementation techniques we definitely don't want to change" answer. – Peter Cordes Dec 08 '20 at 16:26
"whole-program optimization" would LTO count? – lights0123 Dec 09 '20 at 01:31
@lights0123: yes, that would be one way to implement it. But that does *not* avoid the constraint that no new code can be linked, so it's very much unlike current compilers' LTO, where a `.o` or `.so`/`.dll` that doesn't have LTO sections can just be linked in as opaque non-inlinable definitions. i.e. *every* object must be LTO, no libraries that are already separately-optimized into machine code. The key point is that you can tell the compiler *this is all the source file*, like `gcc -fwhole-program`. Not just "you can optimize across some of these object files". – Peter Cordes Dec 09 '20 at 04:20
I'd pound a compiler that deleted unused variables from a struct. They're supposed to be blittable to disk and back. – Joshua Dec 09 '20 at 18:42
@Joshua: An I/O function like `write` or `read` with a `void*` or `char*` to the object representation would be a "use", because it would be an externally-observable view of whatever the program had stored there. – Peter Cordes Dec 09 '20 at 18:51
@PeterCordes: Unfortunately, the Standard doesn't recognize the concept of an object's address "escaping" to code that may use it in ways that a compiler can't possibly be expected to know about, nor give any clue as to when compilers should or should not be expected to be pessimistic about objects whose address escapes. It would be reasonable to treat such things to a large extent as a Quality of Implementation issue outside the Standard's jurisdiction, if the Standard made clear that it makes no attempt to define everything that should be expected of a quality implementation, but... – supercat Dec 09 '20 at 21:40
...it fails to make clear that compilers should generally be cautious about things whose address escapes unless they have particular knowledge that such addresses won't be used in certain tricky ways, rather than assuming that such caution is only needed when compilers have particular knowledge that such addresses will be used in tricky ways. – supercat Dec 09 '20 at 21:45
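
The point made in the comments above, that I/O on the object representation counts as a use, can be sketched as follows. This is only an illustration: `Record`, `save` and `load` are hypothetical names, and `std::memcpy` into a byte buffer stands in for a real `write`/`read` call.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical record type. No member function ever names `reserved`.
class Record {
public:
    explicit Record(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
    unsigned char reserved = 0x5A;
};

static_assert(std::is_trivially_copyable<Record>::value,
              "byte-wise copying is only valid for trivially copyable types");

// Stand-in for write(): exposes the whole object representation,
// including the bytes of `reserved`, to code outside the class.
void save(const Record& r, unsigned char* out) {
    std::memcpy(out, &r, sizeof(Record));
}

// Stand-in for read(): rebuilds an object from a previously saved image.
Record load(const unsigned char* in) {
    Record r(0);
    std::memcpy(&r, in, sizeof(Record));
    return r;
}

bool round_trip_preserves_id() {
    unsigned char image[sizeof(Record)];
    save(Record(7), image);
    return load(image).id() == 7;
}

void check_round_trip() {
    assert(round_trip_preserves_id());
}
```

Once the bytes leave the program through a call like `save`, the stored value of `reserved` is externally observable, so an optimizer could not drop the member under the as-if rule.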

score 18 · Answer 2 · answered Dec 08 '20 at 15:26

> If the compiler can prove that a (private) member of a class is never used

The compiler cannot prove that, because private members can be used in other compilation units. Concretely, this is possible in the context of a pointer to member in a template argument according to [temp.spec]/6 of the standard, as originally described by Johannes Schaub.

So, in summary: no, the compiler must not optimise out private data members any more than public or protected members (subject to the as-if rule).

Konrad Rudolph

What if every member and friend has a definition in the current compilation unit? The definitions can't be different in other CUs, that would be an ODR violation. The key seems to be that *any* code can name a private member by doing so in a template argument. – Ben Voigt Dec 08 '20 at 15:29
@BenVoigt I don’t understand what you mean by that. My answer isn’t concerned with ODR. – Konrad Rudolph Dec 08 '20 at 15:31
Help me understand this argument. It appears to me that this is an example where the compiler would not be able to prove the member cannot be accessed outside. But how does this show that there can be *no* example whatsoever? What about non-template classes where all functions are defined in the same CU? Even forget about member functions, what about this class: `class A { int x; };`. – bitmask Dec 08 '20 at 15:41
@bitmask I’m not talking about class templates. I’m talking about regular classes with private members which are *used* in a template argument list (as a pointer to member). Your class `A` could still be used in another translation unit inside a template argument list to access `&A::x`. – Konrad Rudolph Dec 08 '20 at 15:47

Passer By · Answer 3 · 2020-12-09T04:54:12.490

13

No, because you can subvert the access control system legally.

```cpp
class A
{
    int x;
};

auto f();

template<auto x>
struct cheat
{
    friend auto f() { return x; }
};

template struct cheat<&A::x>;  // see [temp.spec]/6

int& foo(A& a)
{
    return a.*f();  // returns a.x
}
```

Given that the compiler must fix the ABI when A is first used, and that it can never know whether some future code may access x, it must fix the memory of A to contain x.
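
To show that the access is observable at run time, here is a self-contained sketch of the same trick with a check attached. It assumes C++17 (for `template<auto>`); the names `Vault`, `leak`, `Leaker`, `secret_of` and `demo` are hypothetical and only mirror the structure of the code above.

```cpp
#include <cassert>

// Hypothetical class: `secret` is private, and the class declares
// no member functions and no friends.
class Vault {
    int secret = 42;
};

auto leak();  // return type is deduced from the friend definition below

template<auto MemberPtr>
struct Leaker {
    // Defines the namespace-scope function leak() when instantiated.
    friend auto leak() { return MemberPtr; }
};

// Access checking is not applied to names in an explicit instantiation,
// so &Vault::secret is accepted here.
template struct Leaker<&Vault::secret>;

int& secret_of(Vault& v) {
    return v.*leak();
}

void demo() {
    Vault v;
    assert(secret_of(v) == 42);  // reads the private member
    secret_of(v) = 7;            // writes the private member
    assert(secret_of(v) == 7);
}
```

Because the explicit instantiation can appear in any translation unit that sees the class definition, the member has to keep its storage even though nothing inside the class refers to it.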

edited Dec 09 '20 at 04:54

answered Dec 08 '20 at 15:30

Passer By

ISO C++ doesn't specify that there has to be an ABI. I added an answer that points out that a whole-program optimizing compiler that doesn't allow libraries could do this in theory, because it knows it's seeing everything. Your answer puts the nail in the coffin for compilers that allow separately-compiled libraries, though. i.e. the kind of implementations we actually want to use. – Peter Cordes Dec 08 '20 at 16:23
It seems that you don't even need the `tag` type or `n` parameter at all in my experiments (C++17), which gets rid of a warning. What I don't understand (and perhaps you could elaborate in your answer) is why `cheat<&A::x, 0>` is even legal. – bitmask Dec 08 '20 at 16:57
@bitmask My answer explains why it’s legal (tl;dr: it’s due to [temp.spec]/6). – Konrad Rudolph Dec 08 '20 at 20:10
@KonradRudolph Yes, your answer together with this one form a good explanation. Both of them, however, I couldn't understand individually. – bitmask Dec 08 '20 at 20:13
