
I'm working on some Arduino code and have the following code:

uint8_t world[24][2][3];
bool getDispPixel(uint8_t x, uint8_t y, uint8_t num)
{
    static uint8_t rowByte = 0; // 0 means top 8, 1 means bottom 8
    static uint8_t rowBit = 0;

    if(y > 7)
    {
        rowByte = 1;
        rowBit = x - 8;
    }
    else
    {
        rowByte = 0;
        rowBit = x;
    }

    return (world[x][rowByte][num] & (1 << rowBit)) > 0;
}

void setDispPixel(uint8_t x, uint8_t y, uint8_t num, bool state)
{
    static uint8_t rowByte = 0; // 0 means top 8, 1 means bottom 8
    static uint8_t rowBit = 0;

    if(y > 7)
    {
        rowByte = 1;
        rowBit = x - 8;
    }
    else
    {
        rowByte = 0;
        rowBit = x;
    }

    if(state)
        world[x][rowByte][num] |= (1 << rowBit);
    else
        world[x][rowByte][num] &= ~(1 << rowBit);
}

What's weird is that these methods add a TON of size to the program. Even just parts of them do. If I comment out the following part from just one of the methods, the program size drops by 2536 bytes!

if(y > 7)
{
    rowByte = 1;
    rowBit = x - 8;
}
else
{
    rowByte = 0;
    rowBit = x;
}

Both methods are called quite often, over 200 times combined. I would believe it if they were marked as inline, but they are not. Any idea of what could be causing this?

Update: If I completely comment out those methods' contents it drops the size by 20k! Looks like every call to the function eats up 94 bytes. No idea why...

Adam Haile
  • What IDE are you using? Can you generate a mixed assembler / source listing? – Greycon Nov 25 '13 at 22:13
  • Regular Arduino IDE. Not sure how to do that – Adam Haile Nov 25 '13 at 22:15
  • Which operating system are you using? (for the available tools) – Étienne Nov 25 '13 at 22:23
  • I'd start with the static variables inside the functions, they tend to generate non-trivial code because the compiler must ensure they are initialized only once. No need for that at all. – Hans Passant Nov 25 '13 at 22:26
  • Windows... why should it matter? – Adam Haile Nov 25 '13 at 22:26
  • On Linux there are standard tools like `readelf`, `objdump`, `nm` to display the content of the generated program. Then it would be obvious what is taking space. You need a Windows equivalent. – Étienne Nov 25 '13 at 22:26
  • Tried making them non-static and it only helps by a few bytes – Adam Haile Nov 25 '13 at 22:27
  • Have a look at this: http://stackoverflow.com/questions/11054534/how-to-use-intall-gnu-binutils-objdump You can install binutils (`objdump`, `readelf`, ...) as part of MinGW and then use it to display the content of your object file (.o). Then you can easily see what is taking space. – Étienne Nov 25 '13 at 22:34
  • Not really sure what to do with that... – Adam Haile Nov 25 '13 at 22:45
  • Can you upload the object file containing this function and post a link to it? I can show you the relevant output in an answer if you want. – Étienne Nov 25 '13 at 22:51
  • Are you compiling with optimization turned on? Also, have you looked at the generated assembly code to see what the cost of each step is? – Joe Z Nov 25 '13 at 22:53
  • That would be awesome. Here it is: http://adamhaile.net/code/GOLClock.cpp.o – Adam Haile Nov 25 '13 at 22:54
  • Joe Z - No idea... it's just the default Arduino IDE. – Adam Haile Nov 25 '13 at 22:54
  • What do you mean when you say "not marked as inline"? If you mean not _declared_ inline, that's no guarantee that the compiler won't decide that it's able to inline a function. – user888379 Nov 25 '13 at 22:57
  • Yes, not declared as inline. If the compiler is deciding this should be inline it's got some problems :P – Adam Haile Nov 25 '13 at 22:59
  • You probably have somewhere in your IDE where you can select "optimize for speed" or "optimize for size". – Étienne Nov 25 '13 at 23:08
  • Just using the standard Arduino IDE. There's no settings for that. And it's happening on multiple machines with completely separate setups. – Adam Haile Nov 25 '13 at 23:12
  • @AdamHaile The compiler is allowed to inline any function unless you explicitly say it should not. The "inline" keyword can be completely ignored by your compiler and dates from the time when compilers were not clever enough to decide when to inline a function. – Étienne Nov 25 '13 at 23:39
  • hmmmm...weird. I guess I just don't get why it would think that making a function called 200 times inline is a good idea! Especially when Arduino defaults to optimizing for size! – Adam Haile Nov 25 '13 at 23:40
  • Probably Murphy's law! – Étienne Nov 25 '13 at 23:41

2 Answers


If the Arduino toolchain supports GCC extensions (and some quick searching suggests it does), then you can use __attribute__((noinline)) to disable inlining on these functions like so:

bool getDispPixel(uint8_t x, uint8_t y, uint8_t num) __attribute__((noinline));
bool getDispPixel(uint8_t x, uint8_t y, uint8_t num)
{
    // body of the function here
}

void setDispPixel(uint8_t x, uint8_t y, uint8_t num, bool state) __attribute__((noinline));
void setDispPixel(uint8_t x, uint8_t y, uint8_t num, bool state)
{
    // body of the function here
}

The extra line looks redundant, but isn't. It's how the syntax for the extension works.
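
For reference, here is a minimal, self-contained sketch of both placements of the attribute: on a separate declaration, as above, and directly in front of the definition, as suggested in the comments below. It assumes a GCC-compatible compiler such as avr-gcc. The function bodies are a simplified illustration rather than the asker's exact code, and they assume the bit index is meant to come from `y` (rows 0 to 15 split across two bytes).

```cpp
#include <cstdint>

// Display buffer from the question: 24 columns, 2 row bytes, 3 planes.
uint8_t world[24][2][3];

// Form 1: attribute on a separate declaration, followed by the definition.
bool getDispPixel(uint8_t x, uint8_t y, uint8_t num) __attribute__((noinline));
bool getDispPixel(uint8_t x, uint8_t y, uint8_t num)
{
    // Assumption: the bit index is derived from y, not x.
    uint8_t rowByte = (y > 7) ? 1 : 0;      // 0 = top 8 rows, 1 = bottom 8 rows
    uint8_t rowBit  = (y > 7) ? y - 8 : y;  // bit position inside that byte
    return (world[x][rowByte][num] & (1 << rowBit)) != 0;
}

// Form 2: attribute placed before the definition itself, no extra declaration.
__attribute__((noinline))
void setDispPixel(uint8_t x, uint8_t y, uint8_t num, bool state)
{
    uint8_t rowByte = (y > 7) ? 1 : 0;
    uint8_t rowBit  = (y > 7) ? y - 8 : y;

    if (state)
        world[x][rowByte][num] |= (1 << rowBit);   // set the pixel's bit
    else
        world[x][rowByte][num] &= ~(1 << rowBit);  // clear the pixel's bit
}
```

Either form should keep the compiler from expanding the body at every call site, so each of the 200-odd calls costs only the call sequence instead of a full copy of the function.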

Joe Z
  • you're awesome... that cut off almost 18k bytes! – Adam Haile Nov 25 '13 at 23:36
  • now I just wish I knew WHY it was inlining them in the first place! :P – Adam Haile Nov 25 '13 at 23:38
  • If the compiler is optimizing for speed it may make sense to inline them. – Étienne Nov 25 '13 at 23:40
  • Arduino defaults to -Os (size if I recall) – Adam Haile Nov 25 '13 at 23:41
  • You don't need the extra line, you can declare the attributes directly in one line with the function name. `__attribute((noinline)) void setDispPixel(uint8_t x, uint8_t y, uint8_t num, bool state) {...}` – Étienne Nov 25 '13 at 23:43
  • @Étienne: I've had trouble with that syntax in the past for some reason. Every example I've seen shows the form I gave above. But, whatever works... – Joe Z Nov 26 '13 at 00:03
  • @AdamHaile: Even if it defaults to `-Os`, it could be that the compiler had a faulty cost calculation. I've noticed other compilers err far too much in the opposite direction, ironically. Maybe the AVR 8-bit compiler isn't as tuned as it could be. – Joe Z Nov 26 '13 at 00:05

Here is the output of: nm --print-size --size-sort --radix=d --demangle GOLClock.cpp.o (Size of your objects ranked by size):

http://pastebin.com/rHEhuEKg

You can see that the assembly code for the function setDispPixel takes 148 bytes, and the code for getDispPixel takes 94 bytes.

If calling them causes a huge increase in the size of your binary, it probably means that the functions are getting inlined.

Étienne