1

I have following struct:

struct SkipListNode{
    void        *data;      // 8 bytes
    uint8_t     size;       // 1 byte
    // 7 bytes padding here...
    void        *next[1];   // dynamic array, 8 bytes each "cell"
};

I am using malloc() and I am allocating more space than sizeof(SkipListNode), so I am extending the next[] array.

I want to avoid 7 bytes waste. I can completely remove size field, but then I should keep single NULL (8 bytes) at the end of the array. However this does not help reducing the size.

Shall I use __ attribute__((__ packed__)) or there some different way I could do the trick?

This must be compiled under C and C++ as well.

Compiler is gcc.

alk
  • 69,737
  • 10
  • 105
  • 255
Nick
  • 9,962
  • 4
  • 42
  • 80
  • What is the problem with `attribute((packed))`? Is memory only concern and are you creating many instances (objects) of this type? – Mohit Jain Jun 23 '15 at 07:08
  • generally, no problem. Memory is the concern because I am making thousands of these (it is skip list node) – Nick Jun 23 '15 at 07:08
  • 2
    have you considered `#pragma pack(1)` – Acha Bill Jun 23 '15 at 07:12
  • 1
    Just to make sure,Compiler directive for packing structure is different for different compilers.Please specify the compiler as well in question. – Vagish Jun 23 '15 at 07:12
  • #pragma pack(1) is same as __ attribute__((__ packed__)) – Nick Jun 23 '15 at 07:13
  • __attribute__((packed)) is designed for that purpose, you should use it... – Ishay Peled Jun 23 '15 at 07:19
  • the truth is, even I use packed, the malloc() is aligned, so there will be padding after the struct - same reason why you can not really allocate less than 8 or probably 16 bytes. It smells to custom allocator anyway... – Nick Jun 23 '15 at 07:22
  • 1
    @Nick that's correct, but I'd give the OS some credit for managing it's own resources. When you use packed, the actual size if the smallest size possible for this structure. I believe the extra padding in the malloc level does not go to waste (think of the consequences of that!), but rather being reused for further malloc calls. Anyhow its easy to test... just run your process with 1000 structs and look at how much memory it consumes. Better do that before you write your own allocator – Ishay Peled Jun 23 '15 at 07:31
  • 3
    @Ishay: the memory in the unpacked struct goes to waste. It is added so that the variables sit on 'even' boundaries in regard to the machine word (usually 64bits nowadays). If you pack this, which is usually used for transmitting across machine boundaries, since only then the real structure is known, the vars sit tightly one after another. Which means, when you load them, there will be (afaik) additional operations to sort that out. So be aware you are trading memory for CPU. Packing should only be done when it is really,really, really, necessary. – Mario The Spoon Jun 23 '15 at 07:40
  • @MarioTheSpoon: best explanation about "reconstructing" memory I ever encounter. Speaking about it, probably the NULL at the end will be better, because I will access continuous array? – Nick Jun 23 '15 at 07:45
  • @MarioTheSpoon Agreed - there is overhead here, but the common use case is HW access where you actually require the struct to be written as a whole to the HW rather than transmitting stuff across machines which add so much overhead compared to this struct size it doesn't even matter.. In this case, a waste of 7 bytes over a structure of 9 bytes is definitely worth the minor overhead – Ishay Peled Jun 23 '15 at 07:45
  • @Nick - if you are referring to locality of reference it should not matter, all should be cached. But to answer your question in detail, I think it would be necessary to see the CPUs instructions and optimizations in detail. I dare say this would be micro-optimizing which may be undone by the compiler or the CPU itself... – Mario The Spoon Jun 23 '15 at 07:51
  • @Ishay . Agreed, as I added in the comment to your answer. Close to the machine level and across machines the reasoning changes. – Mario The Spoon Jun 23 '15 at 07:52
  • Does C++ support flexible array members? If so, then that would be the preferred solution, rather than some gcc-specific gibberish. – Lundin Jun 23 '15 at 09:44
  • 1
    @Lundin: no it is not support flexible members, this is why size is 1 instead of 0 or just []. But code works perfectly OK, you need just to remember this 1 when allocating. This works even with POD classes – Nick Jun 23 '15 at 09:46

2 Answers2

3
#include <stdio.h>
#include <stdint.h>

struct SkipListNode{
    void        *data;      // 8 bytes
    uint8_t     size;       // 1 byte
    void *next[1];
    };

struct SkipListNode_pack{
    void        *data;      // 8 bytes
    uint8_t     size;       // 1 byte
    void *next[1];
    } __attribute__((packed));

int main(int argc, char** argv)
{
    printf("%d:%d\n", sizeof(struct SkipListNode), sizeof(struct SkipListNode_pack));
    return 0;
}

The output from the above code:
ishaypeled@arania sandbox]$ ./test
24:17

So yeah, you should definitely use __attribute__((packed)) in your case if you want to save memory. On the other hand, like @MarioTheSpoon states - it may come with a performance penalty.

I'd check the padding solution to see if you can live with that penalty on your specific machines. My bet is you won't even notice it unless you're really performance critical.

unwind
  • 391,730
  • 64
  • 469
  • 606
Ishay Peled
  • 2,783
  • 1
  • 23
  • 37
  • I disagree! Only use #pragma pack(1) when it is really necessary. Wasting ~100k in a standard programm on a standard computer should not be the most of the worries. Of course, if it is some kind of embedded system or you are allocating millions of these records, the reasoning may be different. The system tricks you again, since malloc will again align the buffers on 64bit boundaries (on the described system). – Mario The Spoon Jun 23 '15 at 07:44
  • 1
    Is there any particular reason why you ommited the `next` member as per the OP's `struct`? – alk Jun 23 '15 at 07:46
  • 1
    @MarioTheSpoon The motivation for this question was to save memory... I already agreed it *may* cause CPU tradeoff, but that's really what the author requested... – Ishay Peled Jun 23 '15 at 07:53
  • 2
    There's an interesting discussion about the CPU penalty here: http://stackoverflow.com/questions/3454673/can-attribute-packed-affect-the-performance-of-a-program – Ishay Peled Jun 23 '15 at 07:56
0

I accepted the other answer,

however, I did considered refactoring of the program, I found a way to proceed correctly, without knowing the size of the array.

If anyone is interested here it is:
https://github.com/nmmmnu/HM3/blob/master/skiplist.cc

So I eliminated the size field and I eliminated the end NULL. Now the struct is aligned :)

While refactor, I remember that one may use malloc_usable_size() to find the size of allocated block. This is heavy non-portable hack, but could work well at some other cases.

Nick
  • 9,962
  • 4
  • 42
  • 80