How do I hide some fields of struct in C?

Question

I'm trying to implement a struct person, and I need to hide some fields or make them constant. A trick for create private fields.

Header:

#pragma once

#define NAME_MAX_LEN 20

typedef struct _person {
    float wage;
    int groupid;
} Person;

const char const *getName (Person *p);
int getId (Person *p);

/// OTHER FUNCTIONS

Source

#include "person.h"


struct _person
{
    int id;

    float wage;
    int groupid;

    char name[NAME_MAX_LEN];
};

/// FUNCTIONS

GCC says that person.c:7:8: error: redefinition a 'struct _person' struct _person

I can write this in a header, but after it, I can't use fields of a struct.

typedef struct _person Person;

C doesn't let you selectively hide fields. There's no `private` here. — user2357112, Mar 22 '19 at 20:43
@user2357112 How to protect from edit my variables (`id` and `name`)? — Wootiae, Mar 22 '19 at 20:58
An unusual (not say BOFH) approach to the *"no really, just because I let you see the internals doesn't mean it is OK for you to mess with them"* problem in a comment elsewhere on the site: https://stackoverflow.com/questions/31195551/opaque-types-allocatable-on-stack-in-c#comment50396436_31195551 — dmckee --- ex-moderator kitten, Mar 23 '19 at 00:37

dbush · Answer 1 · 2019-03-22T21:04:06.007

24

A struct cannot have multiple conflicting definitions. As such, you can't create a struct that hides some of the fields.

What you can do however it declare that the struct exists in the header without defining it. Then the caller is restricted to using only a pointer to the struct and using functions in your implementation to modify it.

For example, you could define your header as follows:

typedef struct _person Person;

Person *init(const char *name, int id, float wage, int groupid);

const char *getName (const Person *p);
int getId (const Person *p);
float getWage (const Person *p);
int getGroupid (const Person *p);

And your implementation would contain:

#include "person.h"

struct _person
{
    int id;

    float wage;
    int groupid;

    char name[NAME_MAX_LEN];
};

Person *init(const char *name, int id, float wage, int groupid)
{
    Person *p = malloc(sizeof *p);
    strcpy(p->name, name);
    p->id = id;
    p->wage= wage;
    p->groupid= groupid;
    return p;
}

...

edited Mar 22 '19 at 21:04

answered Mar 22 '19 at 20:48

dbush

205,898
23
218
273

I would add `const` pointers: `int getId (const Person *p);` so functions can be called with constant pointers (since they're just getters) – Jean-François Fabre Mar 22 '19 at 20:54
1

@Jean-FrançoisFabre Good idea. Updated. Also, congrats on the diamond! – dbush Mar 22 '19 at 20:56
Can I "show" `wage` and `groupid`? for use `p->wage`? – Wootiae Mar 22 '19 at 21:00
@Wootiae Not in the calling code, because it doesn't know what `Person` contains. Your implementation needs an accessor function to allow the user to read it. – dbush Mar 22 '19 at 21:02
How to use `sizeof` with it? – Wootiae Mar 22 '19 at 21:10
@Wootiae You can't from outside the implementation because the type isn't fully defined. You shouldn't need to either, as the implementations should do any manipulation that needs to be done. – dbush Mar 22 '19 at 21:12

score 19 · Accepted Answer · answered Mar 22 '19 at 21:04

C has no mechanism for hiding individual members of a structure type. However, by operating only in terms of pointers to such a type, and not providing a definition, you can make the whole type opaque. Users would then have to use the functions you provide to manipulate instances in any way. This is a thing that is sometimes done.

To some extent, you may be able to achieve something like what you describe with a hidden context. For example, consider this:

header.h

typedef struct _person {
    float wage;
    int groupid;
} Person;

implementation.c

struct _person_real {
    Person person;  // must be first, and is a structure, not a pointer.
    int id;
    char name[NAME_MAX_LEN];
};

Now you can do this:

Person *create_person(char name[]) {
    struct _person_real *pr = malloc(sizeof(*pr));

    if (pr) {
        pr->person.wage = DEFAULT_WAGE;
        pr->person.groupid = DEFAULT_GROUPID;
        pr->id = generate_id();
        strncpy(pr->name, name, sizeof(pr->name));
        pr->name[sizeof(pr->name) - 1] = '\0';

        return &pr->person;  // <-- NOTE WELL
    } else {
        return NULL;
    }
}

A pointer to the first member of a structure always points also to the whole structure, too, so if the client passes a pointer obtained from that function back to you, you can

struct _person_real *pr = (struct _person_real *) Person_pointer;

and work on the members from the larger context.

Be well aware, however, that such a scheme is risky. Nothing prevents a user from creating a Person without the larger context, and passing a pointer to it to a function that expects the context object to be present. There are other issues.

Overall, C APIs generally either take the opaque structure approach or just carefully document what clients are permitted to do with the data they have access to, or even just document how everything works, so that users can make their own choices. These, especially the latter, are well aligned with overall C approaches and idioms -- C does not hold your hand, or protect you from doing harm. It trusts you to know what you're doing, and to do only what you intend to do.

*just document how everything works, so that users can make their own choices.* The problem with that is you become locked into a specific implementation of your structure - which can only be a bad thing. If you miss something in your implementation, or your implementation precludes some new functionality you didn't think of when you designed it, you likely can only make a change if you're willing to break user's code. — Andrew Henle, Mar 22 '19 at 22:37
Better to use a fully opaque pointer anyway so the user doesn't try allocating their own `Person`. — Kevin, Mar 22 '19 at 23:52
@AndrewHenle, yet both C and POSIX take that approach in multiple places. They designate numerous structure types for which they specify a certain minimum set of members and what those mean. Clearly the committees that maintain these standards disagree that "this can only be a bad thing". It does have drawbacks, but these are offset by the very great advantages of users being able to declare objects of the relevant structure type, to access them directly instead of via function calls, and to avoid requiring either dynamic memory management or static data to use them. — John Bollinger, Jul 06 '19 at 14:35
@JohnBollinger Those examples in both POSIX and C are pretty much documenting existing interfaces that have been hammered out over literally **years** of widespread use. Creating an entirely new interface and assuming you've covered how it may evolve in the future is entirely different. Given the migration of standard structures like `FILE` towards opacity (both Linux and Solaris have moved from open to opaque `FILE` structures over the decades) I'll stand by my assertion that open structures are problematic. — Andrew Henle, Jul 06 '19 at 15:52
(cont) And that's not even getting into the possibility that `struct` layouts can actually change on some systems [just by changing compiler options](https://docs.oracle.com/cd/E37069_01/html/E37074/bjapp.html#OSSCGbjavc). Kinda hard to have a non-opaque structure that has different layouts because of potentially different optimization options. — Andrew Henle, Jul 06 '19 at 15:58

score 4 · Answer 3 · answered Mar 22 '19 at 21:04

4

You can use a mixin style; e.g. write in the header:

struct person {
    float wage;
    int groupid;
};

struct person *person_new(void);
char const *getName (struct person const *p);
int getId (struct person const *p);

and in the source

struct person_impl {
    struct person   p;
    char            name[NAME_MAX_LEN];
    int             id;
}

struct person *person_new(void)
{
    struct person_impl *p;

    p = malloc(sizeof *p);
    ...
    return &p->p;
}

chra const *getName(struct person const *p_)
{
    struct person_impl *p =
           container_of(p_, struct person_impl, p);

    return p->name;
}

See e.g. https://en.wikipedia.org/wiki/Offsetof for details of container_of().

answered Mar 22 '19 at 21:04

ensc

6,704
14
22

But then if the caller ever does anything with the returned `struct person *` other than getting fields with `->` the hidden `name` and `id` will be lost and `getName` will return garbage. C won't realize that that is invalid at all. – pbfy0 Mar 23 '19 at 03:18
@pbfy0 I do not see a problem here; you can do this everywhere in C. E.g. with the opaque pointer from the other answer you can override internals with `memset(persion, 23, 42)`. – ensc Mar 23 '19 at 14:02
Person a = *person_new(); That code doesn't seem that unreasonable to C users who aren't super experienced. `memset`ing a random offset into the structure seems like a more obvious trigger of unwanted behavaior – pbfy0 Mar 23 '19 at 18:28

Myst · Answer 4 · 2019-07-11T06:56:21.490

Addendum to John Bollinger's answer:

Although, IMHO, opaque pointer types with accessor functions (init/get/set/destroy) are the most secure approach, there's another option that allows users to place objects on the stack.

It's possible to allocate a single "typeless" chunk of memory as part of the struct and use that memory explicitly (bit by bit / byte by byte) instead of using additional types.

i.e.:

// public
typedef struct {
    float wage;
    int groupid;
    /* explanation: 1 for ID and NAME_MAX_LEN + 1 bytes for name... */
    unsigned long private__[1 + ((NAME_MAX_LEN + 1 + (sizeof(long) - 1)) / sizeof(long))];
} person_s;

// in .c file (private)
#define PERSON_ID(p) ((p)->private__[0])
#define PERSON_NAME(p) ((char*)((p)->private__ + 1))

This is a very strong indicator that access to the data in the private__ member should be avoided. Developers that don't have access to the implementation file won't even know what's in there.

Having said that, the best approach is an opaque type, as you may have encountered when using the pthread_t API (POSIX).

typedef struct person_s person_s;
person_s * person_new(const char * name, size_t len);
const char * person_name(const person_s * person);
float person_wage_get(const person_s * person);
void person_wage_set(person_s * person, float wage);
// ...
void person_free(person_s * person);

Notes:

avoid typedef with a pointer. It only confuses developers.

It's better to keep pointers explicit, so all developers can know that the type they're using is dynamically allocated.

EDIT: Also, by avoiding "typedefing" a pointer type, the API promises that future / alternative implementations will also use a pointer in it's API, allowing developers to trust and rely on this behavior (see comments).
When using an opaque type, the NAME_MAX_LEN could be avoided, allowing names of arbitrary length (assuming renaming requires a new object). This is an extra incentive to prefer the opaque pointer approach.
avoid placing the _ at the beginning of an identifier when possible (i.e., _name). Names starting with _ are assumed to have a special meaning and some are reserved. The same goes for types ending with _t (reserved by POSIX).

Notice how I use the _s to mark the type as a struct, I don't use _t (which is reserved).
C is more often snake_case (at least historically). The best known APIs and most of the C standard is snake_case (except where things were imported from C++).

Also, being consistent is better. Using CamelCase (or smallCamelCase) in some cases while using snake_case for other things could be confusing when developers try to memorize your API.

When it comes to opaque types, I would recommend typedefing pointers. If you're worried about freeing the allocated memory. Declare a function for that too. — klutt, Jul 10 '19 at 12:15
@klutt - typedefing pointers to opaque types proved to be a mistake in many occasions, `pthread_t` being one of them. Typedefing a pointer doesn't promise that future / alternative implementations will use a pointer based type (it might be a struct in a different implementation, keeping the API intact, if not the ABI). This makes the design less portable. It can cause double indirection and add allocations when the type needs to be dynamically allocated and managed by developers using the API (i.e., storing and passing a `ptherad_t *`). — Myst, Jul 10 '19 at 16:10
Hmmm, I don't see the problem. Do you have a link describing it? — klutt, Jul 10 '19 at 16:31
@klutt - Assume you have a task queue, where a task accepts a `void *` pointer (i.e., `void task(void* arg);` - how do you pass a `thread_t` object to a task? If the answer was "casting", then the answer isn't portable (and wrong), you don't **know** that `pthread_t` is **always** a pointer type and you don't **know** if `sizeof(pthread_t) == sizeof(void *)`. Your only choice is to `malloc(sizeof(pthread_t))`. This is because `pthread_t` is typedefing a pointer (on most systems). This choice makes it impossible for you to assume other pthread implementations will also use a pointer. — Myst, Jul 10 '19 at 16:55
@klutt - as for a link, the [facil.io documentation](http://facil.io/0.7.x/fio#fio_thread_new) and [code](https://github.com/boazsegev/facil.io/blob/fe7574d70c0bd3aa73a8d592cb902cfd6b8b8837/lib/facil/fio.c#L583-L605) encountered a similar problem when abstracting away the thread library implementation details - since `pthread_t` was a typedef, facil.io couldn't trust that all systems will implement `pthread_t` as a pointer (even if most do). — Myst, Jul 10 '19 at 16:59
Interesting. Cannot say that I understand it fully, but I'll look into it. But btw. *"avoid placing the _ at the beginning of an identifier when possible"* When would this not be possible? — klutt, Jul 11 '19 at 04:32
@klutt - *When would this not be possible*? I'm being polite :) but also, sometimes we don't dictate names or naming conventions for a project. — Myst, Jul 11 '19 at 06:52

Grabusz · Answer 5 · 2019-07-08T14:31:57.683

What John Bollinger wrote is a neat way of utilising how structs and memory works, but it's also an easy way to get a segfault (imagine allocating an array of Person and then later passing the last element to a 'method' which accesses the id or it's name), or corrupt your data (in an array of Person the next Person is overwriting 'private' variables of the previous Person). You'd have to remember that you must create an array of pointers to Person instead of array of Person (sounds pretty obvious until you decide to optimise something and think that you can allocate and initialise the struct more efficiently than the initialiser function).

Don't get me wrong, it's a great way to solve the problem, but you've got to be careful when using it. What I'd suggest (though using 4/8 bytes more memory per Person) is to create a struct Person which has a pointer to another struct which is only defined in the .c file and holds the private data. That way it'd be harder to make a mistake somewhere (and if it's a bigger project then trust me - you'll do it sooner or later).

.h file:

#pragma once

#define NAME_MAX_LEN 20

typedef struct _person {
    float wage;
    int groupid;

    _personPriv *const priv;
} Person;

void personInit(Person *p, const char *name);
Person* personNew(const char *name);

const char const *getName (Person *p);
int getId (Person *p);

.c file:

typedef struct {
    int id;
    char name[NAME_MAX_LEN];
} _personPriv;

const char const *getName (Person *p) {
    return p->priv->name;
}

int getId (Person *p) {
    return p->priv->id;
}

_personPriv* _personPrivNew(const char *name) {
    _personPriv *ret = memcpy(
        malloc(sizeof(*ret->priv)),
        &(_personPriv) {
            .id = generateId();
        },
        sizeof(*ret->priv)
    );

    // if(strlen(name) >= NAME_MAX_LEN) {
    //     raise an error or something?
    //     return NULL;
    // }

    strncpy(ret->name, name, strlen(name));

    return ret;
}

void personInit(Person *p, const char *name) {
    if(p == NULL)
        return;

    p->priv = memcpy(
        malloc(sizeof(*p->priv)),
        &(_personPriv) {
            .id = generateId();
        },
        sizeof(*p->priv)
    );

    ret->priv = _personPrivNew(name);
    if(ret->priv == NULL) {
        // raise an error or something
    }
}

Person* personNew(const char *name) {
    Person *ret = malloc(sizeof(*ret));

    ret->priv = _personPrivNew(name);
    if(ret->priv == NULL) {
        free(ret);
        return NULL;
    }
    return ret;
}

Side note: this version can be implemented so that private block is allocated right after/before the 'public' part of the struct to improve locality. Just allocate sizeof(Person) + sizeof(_personPriv) and initialise one part as Person and second one as _personPriv.

You shouldn't create identifiers that begin with double-underscores. Those are reserved in all contexts. — Christian Gibbons, Jul 08 '19 at 14:14

How do I hide some fields of struct in C?

5 Answers5