Use struct member pointer to fill-in a struct in C++

Question

So I have the following available:

struct data_t {
    char field1[10];
    char field2[20];
    char field3[30];
};
const char *getData(const char *key);
const char *field_keys[] = { "key1", "key2", "key3" };

This code is given to my and I cannot modify it in any way. It comes from some old C project.

I need to fill in the struct using the getData function with the different keys, something like the following:

struct data_t my_data;
strncpy(my_data.field1, getData(field_keys[0]), sizeof(my_data.field1));
strncpy(my_data.field1, getData(field_keys[1]), sizeof(my_data.field2));
strncpy(my_data.field1, getData(field_keys[2]), sizeof(my_data.field3));

Of course, this is a simplification, and more things are going on in each assignment. The point is that I would like to represent the mapping between keys and struct member in a constant structure, and use that to transform the last code in a loop. I am looking for something like the following:

struct data_t {
    char field1[10];
    char field2[20];
    char field3[30];
};

typedef char *(data_t:: *my_struct_member);
const std::vector<std::pair<const char *, my_struct_member>> mapping = {
    { "FIRST_KEY" , &my_struct_t::field1},
    { "SECOND_KEY", &my_struct_t::field2},
    { "THIRD_KEY",  &my_struct_t::field3},
};

int main()
{
    data_t data;

    for (auto const& it : mapping) {
        strcpy(data.*(it.second), getData(it.first));
        // Ideally, I would like to do
        // strlcpy(data.*(it.second), getData(it.first), <the right sizeof here>);
    }
}

This, however, has two problems:

It does not compile :) But I believe that should be easy to solve.
I am not sure about how to get the sizeof() argument for using strncpy/strlcpy, instead of strcpy. I am using char * as the type of the members, so I am losing the type information about how long each array is. In the other hand, I am not sure how to use the specific char[T] types of each member, because if each struct member pointer has a different type I don't think I will be able to have them in a std::vector<T>.

Semi-related, Is the reason you're not just using dynamic `std::string` because these are fixed-length structures for fixed-record-size storage somewhere? Understandable if that is the case; just curious. — WhozCraig, Oct 23 '18 at 15:20
Fixed length character buffers will end you if you're not *extremely* careful. `strcpy` is not careful enough. Use `std::string` as WhozCraig suggests, it'll save you innumerable hours of agony. — tadman, Oct 23 '18 at 15:21
This is some glue code between C and C++, unfortunately, I cannot modify the struct. — José D., Oct 23 '18 at 15:22
`This code is given to my and I cannot modify it in any way. It` That's kind of unfortunate, considering the given code is broken (in C as well as C++): https://repl.it/repls/MammothObedientBytes — eerorika, Oct 23 '18 at 15:27
@user2079303 Thanks for the warning, I have fixed the code so that it compiles. — José D., Oct 23 '18 at 15:43
@JoséD. It is unclear what you need. A few questions: why do you want to keep the type information? Why isn't `strcpy` the right solution here? Why do you need to loop over the members? ...? Depending on the answers, there may be different solutions that fit quite nicely. — Acorn, Oct 23 '18 at 15:43
@Acorn answer your question in order: So I can use strncpy/strlcpy to safely copy without going outside the struct. strcpy I don't think is the right solution because it could go over the layout of the struct. I don't need to loop over the member, but I think it simplifies the code, and it avoids repetition, so I wonder about how to do it. — José D., Oct 23 '18 at 16:02
`getData()` takes 1 argument and returns no value. You're trying to call it with 2 arguments and expect the return value to be a C string. — Barmar, Oct 23 '18 at 16:06
The edit fixed the problem with the return value, but you're still calling it with 2 arguments instead of just 1. Why does it need a size argument? — Barmar, Oct 23 '18 at 16:08
@JoséD. Well, if your mapping is correct, you shouldn't be going over it. If you cannot prove it, you could always store the available size in runtime (like `vector` or `string` does). In any case, looping is not really simplifying much, because now you have to take care of a mapping (that might not even be close to the loop), so the code becomes harder to read. You could play some tricks to approximate what you want, but you end up with something even more complex. When you need to do this, is because you have to change the mapping at runtime. If the mapping is fixed, I don't see the value. — Acorn, Oct 23 '18 at 16:09
`sizeof(my_data.field1)` should be an argument to `strncpy()`, not `getData()`. — Barmar, Oct 23 '18 at 16:09
@Barmar sorry, I wrote the question too quickly. I have reviewed it again and I tried to make it clearer. — José D., Oct 23 '18 at 16:09
@Acorn "If the mapping is fixed, I don't see the value." It avoids repeating code. This is a simplification but what if the struct has way more fields, and the works is not as easy as a ```strlcpy```? — José D., Oct 23 '18 at 16:13
@JoséD. No, you shouldn't be repeating code. If you can put enough info in a mapping to do a loop, then you can also put enough information in a function or lambda and call that in the loop. If the struct has many fields, it does not matter: you will still have the same amount of lines, either in the mapping or in calls to the function/lambda. I will create an answer with an example. — Acorn, Oct 23 '18 at 18:15

score 1 · Answer 1 · answered Oct 23 '18 at 18:40

A variadic templates based solution:

struct my_struct_t {
    char one_field[30];
    char another_field[40];
};

template<typename T1, typename T2>
void do_mapping(T1& a, T2& b) {
    std::cout << sizeof(b) << std::endl;
    strncpy(b, a, sizeof(b));
}

template<typename T1, typename T2, typename... Args>
void do_mapping(T1& a, T2& b, Args&... args) {
    do_mapping(a, b);
    do_mapping(args...);
}

int main()
{
    my_struct_t ms;
    do_mapping(
        "FIRST_MAPPING",  ms.one_field, 
        "SECOND_MAPPING", ms.another_field
    );
    return 0;
}

Acorn · Accepted Answer · 2018-10-24T09:28:00.297

As explained in my comment, if you can store enough information to process a field in a mapping, then you can write a function that does the same.

Therefore, write a function to do so, using array references to ensure what you do is safe, e.g.:

template <std::size_t N>
void process_field(char (&dest)[N], const char * src)
{
    strlcpy(dest, getData(src), N);

    // more work with the field...
};

And then simply, instead of your for loop:

process_field(data.field1, "foo");
process_field(data.field2, "bar");
// ...

Note that the amount of lines is the same as with a mapping (one per field), so this is not worse than a mapping solution in terms of repetition.

Now, the advantages:

Easier to understand.
Faster: no memory needed to keep the mapping, more easily optimizable, etc.
Allows you to write different functions for different fields, easily, if needed.

Further, if both of your strings are known at compile-time, you can even do:

template <std::size_t N, std::size_t M>
void process_field(char (&dest)[N], const char (&src)[M])
{
    static_assert(N >= M);
    std::memcpy(dest, src, M);

    // more work with the field...
};

Which will be always safe, e.g.:

process_field(data.field1, "123456789");  // just fits!
process_field(data.field1, "1234567890"); // error

Which has even more pros:

Way faster than any strcpy variant (if the call is done in run-time).
Guaranteed to be safe at compile-time instead of run-time.

This is nice, but wouldn't work for the original question. Note that the original question talks about an API functino 'getData' which may return different strings in runtime, so it is a bit different that using string literals at compile time. — José D., Oct 24 '18 at 09:05
@JoséD. In that case, simply use a single template parameter. I have updated the solution. I thought `getData` was a const/pure function, and given the parameters that you pass to it are fixed (they come from the mapping), I assumed the output would be known ahead of time. — Acorn, Oct 24 '18 at 09:24

Caleth · Answer 3 · 2018-10-23T16:28:45.957

The mapping can be a function to write the data into the appropriate member

struct mapping_t
{
    const char * name;
    std::function<void(my_struct_t *, const char *)> write;
};

const std::vector<mapping_t> mapping = {
    { "FIRST_KEY",  [](data_t & data, const char * str) { strlcpy(data.field1, str, sizeof(data.field1); } }
    { "SECOND_KEY", [](data_t & data, const char * str) { strlcpy(data.field2, str, sizeof(data.field2); } },
    { "THIRD_KEY",  [](data_t & data, const char * str) { strlcpy(data.field3, str, sizeof(data.field3); } },
};

int main()
{
    data_t data;

    for (auto const& it : mapping) {
        it.write(data, getData(it.name));
    }
}

score 0 · Answer 4 · answered Oct 23 '18 at 16:16

Since data_t is a POD structure, you can use offsetof() for this.

const std::vector<std::pair<const char *, std::size_t>> mapping = {
    { "FIRST_FIELD" , offsetof(data_t, field1},
    { "SECOND_FIELD", offsetof(data_t, field2)}
};

Then the loop would be:

for (auto const& it : mapping) {
    strcpy(static_cast<char*>(&data) + it.second, getData(it.first));
}

I don't think there's any way to get the size of the member similarly. You can subtract the offset of the current member from the next member, but this will include padding bytes. You'd also have to special-case the last member, subtracting the offset from the size of the structure itself, since there's no next member.

This feels like UB to me, but then again the use of `offsetof` is valid and I don't know why else `offsetof` would be provided if not for this, so it must just be me. — Lightness Races in Orbit, Oct 23 '18 at 16:26

score 0 · Answer 5 · answered Oct 23 '18 at 18:00

To iterate over struct member you need:

offset / pointer to the beginning of that member
size of that member

struct Map {
    const char *key;
    std::size_t offset;
    std::size_t size;
};

std::vector<Map> map = {
    { field_keys[0], offsetof(data_t, field1), sizeof(data_t::field1), },
    { field_keys[1], offsetof(data_t, field2), sizeof(data_t::field2), },
    { field_keys[2], offsetof(data_t, field3), sizeof(data_t::field3), },
};

once we have that we need strlcpy:

std::size_t mystrlcpy(char *to, const char *from, std::size_t max)
{

    char * const to0 = to;
    if (max == 0) 
        return 0;
    while (--max != 0 && *from) {
        *to++ = *from++;
    }
    *to = '\0';
    return to0 - to - 1;
}

After having that, we can just:

data_t data;

for (auto const& it : map) {
    mystrlcpy(reinterpret_cast<char*>(&data) + it.offset, getData(it.key), it.size);
}

That reinterpret_cast looks a bit ugly, but it just shift &data pointer to the needed field.

We can also create a smarter container which takes variable pointer on construction, thus is bind with an existing variable and it needs a little bit of writing:

struct Map2 {
    static constexpr std::size_t max = sizeof(field_keys)/sizeof(*field_keys);

    Map2(data_t* pnt) : mpnt(pnt) {}

    char* getDest(std::size_t num) {
        std::array<char*, max> arr = {
            mpnt->field1,
            mpnt->field2,
            mpnt->field3,
        };
        return arr[num];
    }

    const char* getKey(std::size_t num) {
        return field_keys[num];
    }

    std::size_t getSize(std::size_t num) {
        std::array<std::size_t, max> arr = {
            sizeof(mpnt->field1),
            sizeof(mpnt->field2),
            sizeof(mpnt->field3),
        };
        return arr[num];
    }

private:
    data_t* mpnt;
};

But probably makes the iterating more readable:

Map2 m(&data);
for (std::size_t i = 0; i < m.max; ++i) {
    mystrlcpy(m.getDest(i), getData(m.getKey(i)), m.getSize(i));
}

Live code available at onlinegdb.

Use struct member pointer to fill-in a struct in C++

5 Answers5