I am trying to minimise the size my class occupies in memory (both data and instructions). I know how to minimise data size, but I am not too familiar with how GCC places member functions.

Are they stored in memory in the same order they are declared in the class?

user997112
  • 1
    Does `-Os` not work well enough? – Colonel Thirty Two May 23 '15 at 15:23
  • 1
    Member functions are just functions. They do not add directly to the size of any instantiated objects, though they do take up space like other functions. But making them smaller is usually not a goal (apart from esoterically small devices). – Martin York May 23 '15 at 15:26
  • 1
    @ColonelThirtyTwo Does the compiler know my access pattern? Does it know which data members/functions are more likely to be called? Can it analyse my critical path? – user997112 May 23 '15 at 15:26
  • @user997112 I think GCC can if you give it profiling data. But how would any of that affect code size if you're trying to get the smallest possible binary? – Colonel Thirty Two May 23 '15 at 15:33
  • @ColonelThirtyTwo I want to allocate the functions with temporal locality, contiguously. – user997112 May 23 '15 at 15:38
  • 3
    If you're using GCC, the `hot` and `cold` [function attributes](https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes) may help. But reordering the functions doesn't affect the binary size. – Colonel Thirty Two May 23 '15 at 15:41
  • Thanks. Because there's no padding (unlike for data)? – user997112 May 23 '15 at 15:43
  • 1
    You only get *one* function in memory no matter how many *objects* you instantiate. Padding is not relevant to functions. – Galik May 23 '15 at 16:06
  • Actually, functions may be padded, e.g. their starting instruction is often aligned to 8 or 16 bytes (for cache performance), but that does not matter much. – Basile Starynkevitch May 23 '15 at 16:06
  • You should have shown some actual code in your question. – Basile Starynkevitch May 23 '15 at 16:24
  • Try setting your compiler optimizations for high code space savings. – Thomas Matthews May 23 '15 at 18:30
  • 1
    Do you have specific size (and perhaps performance) targets, or is this an aesthetic exercise or callisthenics? – PJTraill May 23 '15 at 19:43
  • Just a tip: don't bother asking such low-level optimization questions and tagging them C++ unless you like an avalanche of "it's not defined in the standard" or "why do you care" type comments and answers (and associated downvotes). Just use the other tags, which usually have few language-lawyer types monitoring them. – BeeOnRope Aug 25 '17 at 02:11

2 Answers

3

For the purpose of in-memory data representation, a C++ class can have either plain or static member functions, or virtual member functions (including a virtual destructor, if any).

Plain or static member functions do not take any space in data memory, but of course their compiled code takes some resources, e.g. as binary code in the text (code) segment of your executable or your process. They can also require static data (or thread-local data), or local data (e.g. local variables) on the call stack.

My answer is Linux-oriented. I don't know Windows, and I don't know how GCC works on it.

Virtual member functions are very often implemented through a virtual method table (or vtable); a class having some virtual member functions usually has instances with a single (assuming single inheritance) vtable pointer pointing to that vtable (which is practically some data packed in the text segment).

Notice that vtables are not mandatory and are not required by the C++11 standard. But I don't know of any C++ implementation that does not use them.

When you are using multiple inheritance, things become more complex: objects might have several vtable pointers.

So if you have a class (either a root class, or using single-inheritance), the consumption for virtual member functions is one vtable pointer per instance (plus the small space needed by the single vtable itself). It won't change (for each instance) if you have only one virtual member function (or destructor) or a thousand of them (what would change is the vtable itself). Each class has its own single vtable (unless it has no virtual member function), and each instance has generally one (for single-inheritance case) vtable pointer.

The GCC compiler is free to organize the vtable as it wishes (its order and layout are implementation details you should not care about); see also this. In practice (for single inheritance) in most recent GCC versions, the vtable pointer is the first word of the object, and the vtable contains function pointers in the order of virtual method declaration, but you should not depend on such details.

The GCC compiler is free to organize the functions in the code segment as it wishes, and it does actually reorder them (e.g. for optimizations). Last time I looked, it emitted them in reverse order, but you certainly should not depend on that! BTW, GCC can inline functions (even when not marked `inline`) and clone functions when optimizing. You could also compile and link with link-time optimization (e.g. `make CXX='g++ -flto -Os'`), and you could ask for profile-guided optimization (for GCC: `-fprofile-generate`, `-fprofile-use`, `-fauto-profile`, etc.).

You should not depend on how the compiler (and linker) organizes function code or vtables. Leave the optimizations to the compiler (such optimizations depend upon your target machine, your compiler flags, and the compiler version). You might also use function attributes to give hints to the GCC (or Clang/LLVM) compiler (e.g. `__attribute__((cold))`, `__attribute__((noinline))`, etc.).

If you really need to know how functions are placed (which IMHO is very wrong), study the generated assembly code (e.g. using `g++ -O -fverbose-asm -S`) and be aware that it could vary with compiler versions!

If you need to find the address of a function from its name at runtime on Linux and POSIX systems, consider using `dlsym` (for Linux, see dlsym(3), which also documents `dladdr`). Be aware of name mangling, which you can disable by declaring such functions as `extern "C"` (see C++ dlopen minihowto).

BTW, you might compile and link with `-rdynamic` (which is very useful for `dlopen` etc.). If you really need to know the address of functions, use nm(1) as `nm -C your-executable`.

You might also read the ABI specification and calling conventions for your target platform (and compiler), e.g. Linux x86-64 ABI spec.

Basile Starynkevitch
  • Thanks. So my question is, in the binary are the non-virtual functions located in the order they are declared? So the ordering could be non-static_func1, non-static-func2, static-func1, static-func2 etc? – user997112 May 23 '15 at 15:40
  • 2
    @user997112: why are you worrying about the order in which they're stored in memory? – Mat May 23 '15 at 15:56
  • 1
    You should not know such implementation details, leave them to the compiler. – Basile Starynkevitch May 23 '15 at 15:56
  • @user997112 - as others implied, you shouldn't need to care about that. But just in case you need them in a particular order, you can just create an array of pointers to the functions in whatever order you want and use that. Keep in mind non static member functions also need an object pointer to be called. – dtech May 23 '15 at 15:58
  • 1
    The ordering of the functions is up to the Linker and the Loader, and the optimization settings of the Linker. Some linkers alphabetize the functions. Some may order functions for best use of memory. Some functions may be *inlined* and not as separate functions. You may be able to give instructions to the linker as to where to put the functions. – Thomas Matthews May 23 '15 at 16:05
1

Let's say we have a type T with 4 instance methods.

class T {
    public:
        void member_function_1() { /* ... */ }
        void member_function_2() { /* ... */ }
        void member_function_3() { /* ... */ }
        void member_function_4() { /* ... */ }
};

The amount of memory that those methods take up is the same if we instantiate 1 copy of T, or if we instantiate 1 million copies of T.

Bill Lynch
  • 3
    @user997112: Everything in my answer is still true irrespective of the amount of code in the implementation. – Bill Lynch May 23 '15 at 15:29
  • You're saying that a function containing many instructions occupies the same amount of memory as a function which only increments? – user997112 May 23 '15 at 15:30
  • 2
    @user997112: No. I'm saying that the amount of memory that the methods use is not correlated with the number of instances of the object that you have created over the course of your program. – Bill Lynch May 23 '15 at 15:31
  • 1
    But that's not what I was asking. I wanted to know how GCC orders functions (in the binary). – user997112 May 23 '15 at 15:37
  • 2
    @user997112 You kinda are... _"I am trying to minimise the size my class occupies in memory"_. Function size and locality are two unrelated concepts. – Colonel Thirty Two May 23 '15 at 15:45
  • 1
    I meant the memory footprint the class member functions consume. – user997112 May 23 '15 at 15:48
  • 1
    Why assume @user997112 does not realise that methods will only be instantiated once? – PJTraill May 23 '15 at 19:45
  • 1
    It's clear to me after reading the other answer and comments that the OP understands that functions don't have a per-instance cost. He just wants to pack all his functions in a certain way, which isn't entirely unreasonable if you have really exhausted more usual avenues of optimization. – BeeOnRope Aug 25 '17 at 02:10