
I have been developing embedded software for the MicroBlaze processor in C++ for more than a year. My designs were not very complex, so I wasn't using the more powerful object-oriented features of the language.

For a while, I have been trying to improve the structure of my designs. To that end, I am trying to make wider use of C++ features such as inheritance and polymorphism. As a newbie, I believe that inheritance on its own doesn't affect the code size; only polymorphism has side effects, such as adding virtual table pointers and run-time type information. My problem started when I added a pure virtual member function to a base class.

To provide a runnable example, I will try to mimic the situation I am facing.

The code below compiles to 13292 bytes. There is no way this code needs that many instructions, but I believe some parts of the generated BSP are mandatory when producing an ELF file.

class Base{
public:
    Base() = default;
    ~Base() = default;
  
    virtual void func() {}
  
    int m_int;
};

class Derived : public Base{
public:
    Derived() = default;
    ~Derived() = default;
    
    void func() final {}
  
    int m_int2;
};

int main()
{
    Derived d;
  
    while(1);    
}
 

13KB is not much when you have nearly 128KB of usable RAM. In fact, I didn't even notice the size of the produced code until the problem with pure virtual functions emerged. The second version, below, has the same structure except that func() is now a pure virtual function. Building it produces code larger than the available RAM (128KB), so I modified the linker script to add some fake RAM just to get the build to link. After a successful build, the produced code is nearly 157KB!

class Base{
public:
    Base() = default;
    ~Base() = default;
  
    virtual void func() = 0;
  
    int m_int;
};

class Derived : public Base{
public:
    Derived() = default;
    ~Derived() = default;
    
    void func() final {}
  
    int m_int2;
};

int main()
{
    Derived d;
  
    while(1);    
}

I didn't change any compiler settings; all arguments are at their defaults. There are no additional libraries other than the auto-generated ones. What could the problem be?

Some Additional Notes

  • I tried the code in two different IDEs: Vivado SDK 2017.2 and Vitis 2019.2.
  • The same problem occurs with dynamic allocation calls (operator new and operator delete). Replacing them with C-style malloc and free solves the problem.
  • Building in release mode also solves the problem. In release mode, the produced code is 1900 bytes whether or not I use the pure virtual function.

I can provide additional information if needed. Thanks.

I asked the same question on the Xilinx forums; you can find it here

Caglayan DOKME
  • What makes `while(1);` undefined behavior? Isn't `;` a legitimate null statement? – Nathan Pierson Apr 15 '21 at 17:43
  • @NathanPierson Because of the rule `No thread of execution can execute forever without performing any of these observable behaviors.` Where *"these observable behaviors"* is a list of things that doesn't include the null statement. – eerorika Apr 15 '21 at 17:46
  • @eerorika This is an embedded system, so it should run as long as there is power – Caglayan DOKME Apr 15 '21 at 17:51
  • @ÇağlayanDÖKME What's the point of a program that runs forever while doing nothing? – eerorika Apr 15 '21 at 17:56
  • @eerorika This is just a trivial code piece. In real file implementations, the situations differ. This could be a code that blinks an LED, or just waits for an external interrupt to run some sequential processes. – Caglayan DOKME Apr 15 '21 at 18:05
  • Why don't you try disassembling both the binary versions and compare? I tried the same in GCC and saw some extra functions (mostly for handling exceptions) added in the pure-virtual version. I don't have access to the IDEs/compilers in question. So I can't see the exact difference. – debashish.ghosh Apr 15 '21 at 18:24
  • @debashish.ghosh Thanks for the suggestion. The produced code is too long and hard to inspect. I would appreciate it if you could suggest a tutorial or any other resource where I can learn how to do it. – Caglayan DOKME Apr 15 '21 at 18:50
  • The .map file produced by the linker should detail what memory is used for what components. Compare the .map files for your two builds. – kkrambo Apr 15 '21 at 18:51
  • @eerorika Actually we have such empty endless loops in real-world multi-threaded products, which are running just from events triggered by hardware and work by the handling functions. – the busybee Apr 15 '21 at 18:52
  • Check your map file to see what has been included and the sizes. I just tried it with ARMCC v6 with optimisation disabled and it comes to 1548 bytes including start-up code. The code for the object module containing this code was only 82 bytes. Enabling RTTI increased the size to 3208, but no impact on the 82 bytes attributed to this code. At `-O1` it reduces to 46 bytes. I know nothing about MicroBlaze, but clearly something is wrong. But do disable RTTI if it is not already. – Clifford Apr 15 '21 at 19:08
  • Compare the map file from a debug and release build to see what it is adding. – Clifford Apr 15 '21 at 19:19
  • @ÇağlayanDÖKME I am not aware of tutorials. You could try "readelf" command with -a flag. It gives you summary of the ELF. I bet comparing the reports would be a lot more manageable. :) – debashish.ghosh Apr 15 '21 at 19:32
  • [This question](https://stackoverflow.com/q/37946912/218774) talks about a similar behaviour for ARM. The problem seemed to be related with handling the possibility to call a pure virtual method. – J. Calleja Apr 15 '21 at 19:33
  • @J.Calleja This is the exact answer I've been looking for. I tried it on the project and it works! There is one more thing I should handle. It is the problem with operators `new` and `delete`. They also blow up the code size somehow. I will edit the question and provide a solution as soon as possible. Thanks for the help, you made my day :) – Caglayan DOKME Apr 16 '21 at 06:08
  • This is exactly why we don't use C++ in embedded systems. That original code should not generate 13kb! It should generate some 10-20 bytes RAM, some 100 bytes of flash at most + a bit of vtable. Your compiler seems to be far beyond broken. – Lundin Apr 16 '21 at 09:32
  • @NathanOliver Nah, `while(1);` is perfectly valid code and has been so since the dawn of time, long before C++ was even invented. CPUs have supported interrupts and hardware exceptions since I dunno, the 1960s? Having the main program stall in a busy-wait loop while interrupts execute in the background is a perfectly fine program. If the C++ language is nowadays so defective that it isn't even aware of how computers work, then it shouldn't be used. – Lundin Apr 16 '21 at 09:35

1 Answer

The solution is a little bit creepy :) Before beginning, special thanks to everyone who helped.

SHORT ANSWER

Just add the following piece of code to your main file:

extern "C" void __cxa_pure_virtual() { while(1); }

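If you also want to solve the problem related to operator new and operator delete, add the following code as well:

#include <cstddef>  // std::size_t
#include <cstdlib>  // std::malloc, std::free

// Non-throwing replacement: returns nullptr on failure instead of throwing std::bad_alloc.
void* operator new(const std::size_t size) noexcept
{
    return std::malloc(size);
}

// Matching replacement so memory obtained above is released with std::free.
void operator delete(void* p) noexcept
{
    std::free(p);
}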

DETAILS

The original solution is here. The investigation starts by pulling libstdc++ out of the picture completely. Doing so waives the right to use standard library functions, so we must provide our own implementations of calls such as malloc, new, and free. Even after reimplementing every required call, the linker still complains about a missing function called __cxa_pure_virtual(). This is the clue to the final solution.

The __cxa_pure_virtual function is an error handler that is invoked when a pure virtual function is called. It is easy to say that we would never make such a foolish call, but the compiler never trusts any software developer :) Therefore, when you write C++ code that includes pure virtual functions, the compiler implicitly references an error handler to deal with potential runtime errors. As you can guess, the default handler and the code it drags in are expensive for systems with limited resources, such as the MicroBlaze in our case.

So, if we are writing a C++ application that has pure virtual functions, we should supply our own __cxa_pure_virtual error handler. If you are not an especially demanding embedded software developer, you can just put an endless loop in your custom handler. Don't worry: as long as you follow the best practices of the language, you will never actually make the call that invokes the error handler.
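
A slightly more informative handler than the bare loop can be sketched as follows. This assumes a GCC-style toolchain following the Itanium C++ ABI, where the runtime hook is named `__cxa_pure_virtual`. The flag `g_pure_virtual_fault` and the helper `call_through_base` are hypothetical names used only for illustration; the flag simply gives a debugger something to inspect once the core has halted.

```cpp
// Hypothetical fault flag; the name is illustrative, not part of any library.
// It is volatile so that reading it in the loop below is an observable access.
volatile bool g_pure_virtual_fault = false;

// User-supplied handler: because the symbol is defined here, the linker does
// not need to take the default one (and its dependencies) from the runtime.
extern "C" void __cxa_pure_virtual()
{
    g_pure_virtual_fault = true;       // leave a trace for the debugger
    while (g_pure_virtual_fault) { }   // halt here; interrupts may still run
}

class Base {
public:
    virtual void func() = 0;           // pure virtual member function
    int m_int = 0;
};

class Derived : public Base {
public:
    void func() final { m_int = 42; }
};

// Normal dispatch through a complete object never reaches the handler.
int call_through_base(Base& b)
{
    b.func();
    return b.m_int;
}
```

Calling `call_through_base` on a `Derived` object dispatches to `Derived::func` and leaves the flag untouched. The handler only becomes reachable through misuse, for example a virtual call made indirectly while the base-class constructor is still running.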

The problem with operator new and operator delete is also related to the underlying exception mechanisms. To avoid the expensive exception-handling machinery, you can reimplement them so that they don't throw. The only thing to keep in mind is that you must check whether the allocation succeeded after calling operator new, as it will no longer throw on failure. I believe you will never need to call operator delete as long as you work on a bare-metal (OS-less) project.
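
The success check after an allocation might look like the sketch below. To stay standard and self-contained, it uses `std::malloc` plus placement `new` in place of the replaced global operator, which is a swapped-in technique rather than the one from the short answer. `Sample`, `create_sample`, and `destroy_sample` are hypothetical names.

```cpp
#include <cstdlib>   // std::malloc, std::free
#include <new>       // placement new

// Hypothetical payload type used only for this illustration.
struct Sample {
    int value;
    explicit Sample(int v) : value(v) {}
};

// Returns nullptr instead of throwing when no memory is available.
Sample* create_sample(int v)
{
    void* raw = std::malloc(sizeof(Sample));
    if (raw == nullptr) {
        return nullptr;            // the caller must handle this case
    }
    return new (raw) Sample(v);    // construct in the storage just obtained
}

// Counterpart of create_sample; accepts nullptr and does nothing for it.
void destroy_sample(Sample* s)
{
    if (s != nullptr) {
        s->~Sample();              // run the destructor explicitly
        std::free(s);              // then hand the storage back
    }
}
```

With the non-throwing `operator new` from the short answer in place, the same pattern should reduce to a plain `new` expression followed by a comparison against `nullptr` before the object is used.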

After applying this holy recipe to your own code, you will see the size of the executable fall back to its original value.

The answer is open to contributions and suggestions; I would appreciate any.

Caglayan DOKME
  • Another option is to build libstdc++ itself with --disable-libstdcxx-verbose option and use that instead. The exception handling code won't get added to your binary. Ref: [libstdc++ manual](https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html) – debashish.ghosh Apr 16 '21 at 19:15