on top what others suggest:
Limit use of c++ features, write like in ANSI C with minor extensions. Standard (std::) templates use a large system of dynamic allocation. If you can, avoid templates altogether. While not inherently harmful, they make it way too easy to generate lots and lots of machine code from just a couple simple, clean, elegant high-level instructions. This encourages writing in a way that - despite all the "clean code" advantages - is very memory hungry.
If you must use templates, write your own or use ones designed for embedded use, pass fixed sizes as template parameters, and write a test program so you can test your template AND check your -S output to ensure the compiler is not generating horrible assembly code to instantiate it.
Align your structures by hand, or use #pragma pack
{char a; long b; char c; long d; char e; char f; } //is 18 bytes,
{char a; char c; char d; char f; long b; long d; } //is 12 bytes.
For the same reason, use a centralized global data storage structure instead of scattered local static variables.
Intelligently balance usage of malloc()/new and static structures.
If you need a subset of functionality of given library, consider writing your own.
Unroll short loops.
for(i=0;i<3;i++){ transform_vector[i]; }
is longer than
transform_vector[0];
transform_vector[1];
transform_vector[2];
Don't do that for longer ones.
Pack multiple files together to let the compiler inline short functions and perform various optimizations Linker can't.