5

I'm new in a company where the following use of a struct is done:

#include <stdio.h>
#include <string.h>

typedef unsigned char uint8;
typedef signed char int8;
typedef unsigned short int uint16;
typedef signed short int int16;

typedef struct padded_t {
    int8 array0[11];
    int8 array1[4];
    uint16 len2;
    int8 array2[25];
    uint16 len3;
    int8 array3[6];
    // many more len/arrays follow
} padded_t;

int main(int argc, char** argv)
{
    padded_t foo;
    memset((void*)&foo, 0, sizeof(padded_t));

    int8* str = "foobar";
    int16 size = (int16)strlen(str);

    int8* ptr = (int8*)&foo.len2;

    // please note that the memcpy references only the pointer to len
    // are the following couple of lines safe?
    memcpy ((void*)ptr, &size, 2);
    memcpy ((void*)&ptr[2], (void*)str, size);

    printf("%d\n", foo.len2);
    printf("%s\n", foo.array2);

    return 0;
}

I know some things about alignment and padding, and I assume the compiler (gnu C99 for an ARM9 device) will add some paddings to make the struct aligned.

But is this code safe?
As I understand it, it will be safe as long as the uint16 len variables are immediately followed by int8 array[] variables regardless of the other struct members.

Will it only add paddings before a uint16 when the size before it is odd?
Is this use correct? And more importantly, is it safe?

learning_frog
  • 305
  • 1
  • 3
  • 14
  • It is unsafe. You can't assume `len2` is not padded. – Eugene Sh. Aug 10 '16 at 17:32
  • 2
    Get a new job! 1) Homebrew fixed width types instead of using standard types. 2) not in a header, but the implementation 3) not `_static_assert`ed 4) Relying on implementation details/UB. 5) Wildly casting 6) Undefined behaviour. – too honest for this site Aug 10 '16 at 17:32
  • 1
    @Olaf Expand on 6, please.. – Eugene Sh. Aug 10 '16 at 17:33
  • `typedef char int8;` should be `typedef signed char int8;` – chqrlie Aug 10 '16 at 17:34
  • @Olaf 1 and 2 are not really true. I just wanted to add a completely compilable code. But the definitions are in the header of the device SDK. – learning_frog Aug 10 '16 at 17:35
  • The 2 might be obsolete. Still 1) seems to be true, otherwise you just had to use the standard names and include the standard header. Then get a different vendor. It is a bad sign they don't use standard types. – too honest for this site Aug 10 '16 at 17:37
  • Frankly, I side with Olaf, you should get a new job. Code like this is a good indication you will learn nothing useful there. – chqrlie Aug 10 '16 at 17:39
  • 1
    @chqrlie But you can contribute much and get some respect and recognition. – Eugene Sh. Aug 10 '16 at 17:40
  • 4
    The C standard guarantees that the members of a struct will be stored in the order they are declared, but says nothing about the alignment. The compiler is free to add any sort of padding it deems necessary between struct members to align them as needed. If this code works at all, it is very fragile and will break if ever ported to a different architecture or even a different compiler. – Jeff Loughlin Aug 10 '16 at 17:46
  • @EugeneSh.: if you are in a position to take charge, there is a lot you can do and indeed a lot to be done. If you are just an low ranking developper in the company, you will suffer and waste your time debugging ugly code. – chqrlie Aug 10 '16 at 17:48
  • `memcpy ((void*)ptr, &size, 2);` is invalid because `sizeof(size)` is 1. – Ian Abbott Aug 10 '16 at 17:50
  • @Ian Abbott: It should be int16. I mistyped it when translating the code – learning_frog Aug 10 '16 at 17:52
  • @Jeff Loughlin: The code works in the device I'm working (it is in production for a long time) and in my x86_64 computer. You can try it yourself. – learning_frog Aug 10 '16 at 18:03
  • 1
    @Gustavo It doesn't really matter if the code works on your computer, or Bob's computer. In fact, it doesn't matter if it works on the majority of the computers, or every single computer in the universe. If the standard says the compiler is allowed to add any sort of padding or alignment, then you shouldn't take the chance. The fact that there _might_ be the chance that a computer is created where this code does not work as expected means it is unsafe code. – iRove Aug 10 '16 at 18:15
  • @Gustavo: Oh, I don't doubt that it works, it's just not guaranteed to *always* work. You can pee on an electric fence and get away with it sometimes, but I don't recommend doing it too often. When it goes wrong, it goes *badly* wrong. – Jeff Loughlin Aug 10 '16 at 18:16
  • @iRove: I agree with you. I wouldn't do this kind of coding myself. But I can't change it either. – learning_frog Aug 10 '16 at 18:17
  • @Jeff Loughlin: I completely understand. As I said, I wouldn't do it myself. But since everyone says it is not safe, is there anything I can do to make it safer? Maybe adding `__attribute__ ((__packed__))` to the struct? – learning_frog Aug 10 '16 at 18:21
  • Is this by any chance some sort of network packet for some protocol? If it is, then the "proper" way is to pack the struct and manually insert padding so that it matches the layout of the packet. – Mysticial Aug 10 '16 at 18:38
  • @Mysticial: I could do that, but will it make it any safer? I mean, is there a way of knowing exactly where I should insert the paddings? – learning_frog Aug 10 '16 at 18:40
  • Packing a struct is not specified by the standard, but every major compiler supports it with their own extension (otherwise, it'd be difficult to implement network protocols). When you pack a struct, there is no padding at all. So if you need to do your own padding, just insert dummy fields. – Mysticial Aug 10 '16 at 18:43
  • Read this: http://www.catb.org/esr/structure-packing/ everything you ever wanted to know about structure packing – Jeff Loughlin Aug 10 '16 at 18:53
  • @iRove: Any useful program for a free-standing implementation must rely upon behaviors not defined by the Standard, since the Standard doesn't define any means by which a free-standing program could have any behavior. Nothing in the Standard specifies that pointers have to be two bytes long, but if an implementation's documentation says pointers are two bytes long a programmer should be entitled to rely upon that when targeting that implementation. – supercat Aug 12 '16 at 20:04

2 Answers2

3

You don't need to write code that works on every system. What you should do is write code that has predictable behavior. Either it works as designed or if your assumptions don't hold, a static assertion aborts the compilation.

The second memcpy line fails to do that. It assumes that offsetof(struct padded_t, len2) + 2 == offsetof(struct padded_t, array2). An assumption which will often hold, but is utterly stupid.

Why not just write

foo.len2 = strlen(str);
memcpy (foo.array2, str, foo.len2);
//possibly, foo.array2[foo.len2] = '\0';

Code is readable. No unneeded variables. No unnecessary casts. No unpredictable behavior. The original doesn't look like code you will learn a lot from, the clutter doesn't look like something I expect someone proficient in C to write.

To answer a comment of yours, making them packed is the wrong fix. Because it will misalign members and will just open another can of worms.

Also keep wary of custom fixed-size typedefs. I once had the fun of debugging a typedef char int8; on a system that had chars unsigned

Community
  • 1
  • 1
a3f
  • 8,517
  • 1
  • 41
  • 46
  • The real code is not that simple. I simplified the example to get to the point, but I just don't have no means to do what you suggested. I could show a code similar to the real one, but there is no reason for it, really. The point is already addressed by the question. Anyways, thank you for the answer. – learning_frog Aug 10 '16 at 18:38
  • 1
    I can only judge by the code you posted and that's just messy. – a3f Aug 10 '16 at 18:41
  • @Gustavo You don't want to show the code because _"there is no reason for it,"_ but this code does not work for you. Clearly it would be useful for us to know your _real constraints_. Otherwise how do you suggest we provide an answer which fits your needs? :-) – iRove Aug 10 '16 at 20:02
  • @iRove: There is no reason for it because the question is the same. I cannot post the company code here, as you are surely aware. It would take a lot of time for me to translate the code and it would not add meaning to the question. Anyways, I appreciate the effort to help me. – learning_frog Aug 10 '16 at 20:13
  • @iRove: The code doesn't work for me simply because it references the name of the member variable. Maybe I should have pointed in the question that this is exactly what they didn't want when they implemented the code. – learning_frog Aug 10 '16 at 20:23
  • @Gustavo Yes, knowing that would help. I see where you are coming from. Good luck! – iRove Aug 10 '16 at 20:29
2

But is this code safe?

As I understand it, it will be safe as long as the uint16 len variables are immediately followed by int8 array[] variables regardless of the other struct members.

It is not safe in the sense that compilers are allowed free rein to insert any amount of padding between or after structure members, so you cannot be confident that &ptr[2] points to the first byte of foo.array2. Provided that uint16 is indeed two bytes wide, however, (which is in no way guaranteed by the language) you can be confident that if size is less than 25 then the

memcpy ((void*)&ptr[2], (void*)str, size);

will modify neither any bytes of foo.len2 nor the last byte of foo.array2. Since foo was earlier filled with zeroes, this will leave foo.array2 as a properly terminated C string. It is thus safe to print it with printf(). On the other hand, C does not guarantee that the result of doing so will be the same as the result of printing str.

Will it only add paddings before a uint16 when the size before it is odd?

This is at the discretion of the compiler. It might be influenced by pragmas, command-line options, configuration options, language extensions (though none of these are in use in the example), target architecture, or anything else that the compiler wants to use to make such decisions.

Is this use correct?

As far as I can tell, the program is conforming, if that's what you mean.

And more importantly, is it safe?

The program's output is not predictable from its code alone, so in that sense, no, it is not safe.

John Bollinger
  • 160,171
  • 8
  • 81
  • 157