
A C++ class containing two Eigen vectors has a strange size. I have an MWE of my problem here:

#include <iostream>
#include "Eigen/Core"

class test0 {
  Eigen::Matrix<double,4,1> R;
  Eigen::Matrix<double,4,1> T;
};

class test1 {
  Eigen::Matrix<double,4,1> R;
  Eigen::Matrix<double,3,1> T;
};

class test2 {
  Eigen::Matrix<double,4,1> R;
  Eigen::Matrix<double,2,1> T;
};

class test3 {
  Eigen::Matrix<double,7,1> T;
};

class test4 {
  Eigen::Matrix<double,3,1> T;
};

int main(int argc, char *argv[])
{
    std::cout << sizeof(test0) << ", " << sizeof(test1) << ", " << sizeof(test2) << ", " << sizeof(test3) << ", " << sizeof(test4) << std::endl;
    return 0;
}

The output I get on my system (MacBook Pro, Xcode Clang++ compiler) is:

64, 64, 48, 56, 24

The class "test1" has some bizarre extra padding - I would have expected it to have size 56. I don't understand the reason for it, especially given that none of the other classes have any padding. Can anyone explain, or is this an error?

user664303
  • "I would have expected it to have size 56" - why? –  Jan 12 '17 at 01:25
    @latedeveloper, because it has 7 doubles. – zneak Jan 12 '17 at 01:25
  • I really meant "why do you think the compiler can't add whatever padding it likes, unless you explicitly control it?" –  Jan 12 '17 at 01:27
    @latedeveloper, there is literally no weaker reason for anything than "because it can". Padding has a memory cost, and I would generally try to avoid a compiler that increases cost "because it can" rather than because I get something in return. In this case, I fail to see benefits. – zneak Jan 12 '17 at 01:31
  • I use this class in an Eigen::Map, and it seems like those last 8 bytes are getting written to. Which is annoying (breaking things) if I'm passing in a pointer to an array of 7 doubles. Surely the compiler shouldn't allow writes to the padding. – user664303 Jan 12 '17 at 01:32
    I find your question interesting, but I have to leave it here that it is not legal C++ to take a pointer to one of these and expect it to be a valid pointer to doubles. – zneak Jan 12 '17 at 01:33
    @zneak If you are interested in padding (most people are not) then you _will_ need to control it explicitly via compiler-specific mechanisms. –  Jan 12 '17 at 01:36
  • @zneak: That's not what I'm doing. I am mapping a pointer to an array of doubles to one of these, using [Eigen::Map](https://eigen.tuxfamily.org/dox/classEigen_1_1Map.html). I assume this core maths library is using legal C++, but maybe I'm mistaken. My MWE simplifies things a little, too. – user664303 Jan 12 '17 at 01:39
  • @latedeveloper, I'm generally not interested in padding, but I'm interested in how efficient my code is, and "it can" is an insufficient explanation for inefficiency. – zneak Jan 12 '17 at 01:46
  • @zneak It's entirely sufficient, because that's what compilers do, but take up your gripes with the compiler writers, not me. –  Jan 12 '17 at 01:49
  • @zneak it can be efficient or inefficient, depending on how much memory access you have. The only way to know is profiling. In any case, worrying about this is probably premature optimization, which is the "root of all evil", unless you have carefully benchmarked the code and made sure that it's a bottleneck – phuclv Jan 12 '17 at 01:59
  • Correction to my earlier comment. I don't believe that the padding gets written to. However, I am telling the compiler that two sequential containers, the first of which is a test1, are not aliases, using the restrict keyword. Given the padding, they are in fact overlapping. This might be the cause of the errors. – user664303 Jan 12 '17 at 02:03
  • Placing `#pragma pack(8)` just above the test1 class changes its size to 56. – user664303 Jan 12 '17 at 02:11
  • @LưuVĩnhPhúc, I believe that whatever heuristics the compiler uses are explainable. I'm sure that there's a tradeoff involved, but I am not willing to take "optimization is the root of all evil" as the reason for why I shouldn't be interested in learning about that tradeoff. – zneak Jan 12 '17 at 02:31

2 Answers


This happens because of how the Eigen library is implemented; it is not a compiler trick. The backing storage for Eigen::Matrix<double, 4, 1> carries the EIGEN_ALIGN_TO_BOUNDARY(16) tag, which expands to a compiler-specific attribute requesting 16-byte alignment for the type. That makes the alignment of test1 16 bytes as well, and the size of a type must be a multiple of its alignment, so the compiler rounds the 56 bytes of data up to 64 by adding 8 bytes of padding at the end. Without that padding, the first matrix field of the second element in an array of test1 would not sit on a 16-byte boundary.

Eigen simply does not try to impose similar requirements on the backing storage of Eigen::Matrix<double, 7, 1>.

This happens in Eigen/src/Core/DenseStorage.

zneak
    You can use `Matrix` to locally disable explicit alignment and pay for slightly less efficient vectorized code. – ggael Jan 12 '17 at 15:49
  • Indeed. And Eigen::Map uses the unaligned format by default. So I think my problem was elsewhere. – user664303 Jan 12 '17 at 22:15

Padding requirements aren't mandated by the language; they're mandated by your processor architecture. Your class is being padded so that it's 64 bytes wide. You can override this, of course, but it's done so that structures sit neatly in memory and can be read efficiently, aligning to cache lines.

When a structure gets padded is a complex question, but generally speaking "memory is cheap, cycles are not". Modern computers have loads of memory, and performance gains are becoming harder to find as we approach the limits of copper, so trading some memory for performance is usually a good idea.

Some additional reading is available here.


Following up on the discussion in the comments, it's worth noting that the compiler isn't your god. Not every optimisation is a good idea, and even trivial changes to your code can have vast implications for some optimisations. If you don't like what your toolchain is producing and think you can do better, then do it! Take some benchmarks, make your changes and then measure again. As you do all of that, take note of how long you spend on it and then ask yourself: was that a good use of your or your employer's time? :)

Liam M
  • Any insight into why test1 is padded and test3 and test4 are not? – user664303 Jan 12 '17 at 01:44
  • I don't believe the alignment of member variables explains the existence of the padding here, either. The T variable will start on an 8 byte boundary with no padding, if R starts on an 8 byte boundary (which it must). – user664303 Jan 12 '17 at 01:47
  • As I've already commented on the question, I find that "it's legal" is a poor explanation for inefficiencies. – zneak Jan 12 '17 at 01:50
  • @user664303 it's not a matter of aligning members, it's about aligning instances of the class sequentially in memory. It's been some time since I've been involved in circumstances where this mattered too much to me, but it's worth noting that 48+56+24=128. Cache lines in x86 are 64 bytes long. – Liam M Jan 12 '17 at 01:53
  • @zneak it's not inefficient. It's your toolchain trying to cope with data that's a peculiar size, trying to ensure that it's packed into memory so that the processor can read and cache it efficiently. – Liam M Jan 12 '17 at 01:55
  • @LiamM: That issue is handled by alignment pragmas, and I do not believe it should affect the size of a class itself. Basically, if you as a programmer know that your class is going to be stored in a vector of classes and that you can accelerate computations on this class using for example SSE instructions that require particular alignments, then it is up to you to tell the compiler. Also, the compiler can align the classes without padding the classes themselves. – user664303 Jan 12 '17 at 01:57
  • Case in point: Eigen::Matrix has size 24 but is aligned on a 32 byte boundary (by default, though this can be over-ridden). – user664303 Jan 12 '17 at 01:58
  • @LiamM, I am willing to believe that, but I struggle to find a case where this is true. My best explanation is that this ensures that the first matrix is aligned on a 16-byte boundary when you have it in an array, but then it didn't do it for the case of one matrix of 7 doubles. Your answer goes along the line of "I'm not sure but I trust the compiler", which is more a statement of your philosophy than an actual answer (as you note in your edit). However, there is a very real chance that the compiler is doing it with a good reason, and I find that it would be more valuable. – zneak Jan 12 '17 at 02:05
  • @user664303 packing suppresses padding. If you're referring to the pack pragmas, these are used to override packing. Have a look at [this question](http://stackoverflow.com/questions/3318410/pragma-pack-effect) for an overview of padding vs packing. – Liam M Jan 12 '17 at 02:08
  • @zneak How the pack/pad decisions are made is a very complex question, and the answers given by a given toolchain can vary widely. It's not a statement of philosophy, rather just a statement of fact: compilers and optimisers make a lot of assumptions and a lot of guesses. Often they guess wrong; watch some presentations by Mike Acton if you'd like to see just how wrong they can be. My answer isn't "I'm not sure but trust the compiler", it's "this is the rationale, and what you get in a given circumstance is complicated. Measure, change, measure, assess". – Liam M Jan 12 '17 at 02:12
  • @LiamM: Funnily enough, just saw that before reading your comment. It does squish my padding. I was actually referring to __attribute__((aligned(64))) style preprocessor directives. I don't seem to be able to use the two together successfully though, which I find odd. – user664303 Jan 12 '17 at 02:23
  • To be clear, my understanding is that packing affects where a structure ends (i.e. its size), whereas alignment affects where it starts, but needn't affect its size. – user664303 Jan 12 '17 at 02:27