I'm doing some work in C++ for a company that has everything else written in C (using C isn't an option for me :( ). They have a number of data structures that are VERY similar (i.e., they all have fields such as "name", "address", etc.). But, for whatever reason, there isn't a common structure that they used to base everything else off of (which makes doing anything hell). Anywho, I need to do a system-wide analysis of these structs that are in memory, and throw it all into a table. Not too bad, but the table has to include entries for all the fields of all the variables, even if they don't have the field (struct b may have field "latency", but struct a doesn't; in the table, the entry for each instance of a must have an empty entry for "latency").

So, my question is, is there a way to determine at runtime if a structure that has been passed into a template function has a specific field? Or will I have to write some black magic macro that does it for me? (The problem is basically that I can't use template specialization)

Thanks! If you have any questions please feel free to ask!

Here's a snippet of what I was thinking...

struct A
{
  char name[256];
  int index;
  float percision;
};

struct B
{
  int index;
  char name[256];
  int latency;
};

/* More annoying similar structs... note that all of the above are defined in files that were compiled as C - not C++ */

struct Entry
{
  char name[256];
  int index;
  float percision;
  int latency;
  /* more fields that are specific to only 1 or more structure */
};

template<typename T> Entry gatherFrom( T *ptr )
{
  Entry entry;

  strncpy( entry.name, ptr->name, sizeof( entry.name ) );
  entry.index = ptr->index;
  /* Something like this perhaps? */
  entry.percision = type_contains_field( "percision" ) ? ptr->percision : -1;
  return entry;
}

int main()
{
  struct A a;
  struct B b;

  /* initialization.. */

  Entry e  = gatherFrom( &a );
  Entry e2 = gatherFrom( &b );

  return 0;
}
datenwolf
Neil
  • 6
    «Using C isn't an option for me»: maybe I'm an idealist, but I once dreamed that C++ programmers were actually able to code C stuff... – Macmade Jul 05 '12 at 21:53
  • 7
    *determine at runtime* and *template function* don't really go together... templates are resolved at compile time. You can probably use SFINAE to detect at *compile time* whether the field exists or not. – David Rodríguez - dribeas Jul 05 '12 at 21:54
  • 2
    Maybe [this](http://stackoverflow.com/questions/1005476/how-to-detect-whether-there-is-a-specific-member-variable-in-class) is what you want? – jxh Jul 05 '12 at 21:57
  • I know they don't go together... that's part of the problem. And I hadn't heard of SFINAE. Another part of the issue is that I don't want it to throw a compile error if the field doesn't exist. – Neil Jul 05 '12 at 22:03
  • There is no way in C to do run time type identification. Since you mentioned 'black magic macros', can you provide an example of what the macro would do? Maybe that could give a clue about what information is available at run time. – Deepan Jul 05 '12 at 21:57
  • I'm working in C++ trying to pull data from a set of C structures. – Neil Jul 05 '12 at 22:01
  • @nbsdx But if you can't modify the C code, then the structures are still C, so you won't get RTTI. – Deepan Jul 05 '12 at 22:29
  • yeah, that's why I was hoping something existed already to do this – Neil Jul 05 '12 at 22:39
  • @nbsdx: Sorry I'm late to the party. Your problem is resolvable at compile-time (even better than runtime, since it incurs no performance penalty vs hand-crafted code). The trick, as DavidRodríguez-dribeas suggests, is to use SFINAE. See [my answer](http://stackoverflow.com/a/17689868/9990) for details. – Marcelo Cantos Jul 17 '13 at 06:49

3 Answers

3

You can do this at compile-time without touching the source of the original structs:

#include <iostream>
#include <limits>
#include <cstring>

struct A
{
    char name[256];
    int index;
    float percision;
};

struct B
{
    int index;
    char name[256];
    int latency;
};

struct Entry
{
    char name[256];
    int index;
    float percision;
    int latency;
    /* more fields that are specific to only 1 or more structure */
};

inline
std::ostream & operator<<(std::ostream & os, Entry const & e) {
    return os << e.name << "{" << e.index << ", " << e.percision << ", " << e.latency << "}";
}

template <typename T>
inline
void assign(T & dst, T const & src) {
    dst = src;
}

template <size_t N>
inline
void assign(char (&dst)[N], char const (&src)[N]) {
    memcpy(dst, src, N);
}

#define DEFINE_ENTRY_FIELD_COPIER(field)                            \
    template <typename T>                                           \
    inline                                                          \
    decltype(T::field, true) copy_##field(T const * t, Entry & e) { \
        assign(e.field, t->field);                                  \
        return true;                                                \
    }                                                               \
                                                                    \
    inline                                                          \
    bool copy_##field(void const *, Entry &) {                      \
            return false;                                           \
    }

DEFINE_ENTRY_FIELD_COPIER(name)
DEFINE_ENTRY_FIELD_COPIER(index)
DEFINE_ENTRY_FIELD_COPIER(percision)
DEFINE_ENTRY_FIELD_COPIER(latency)

template <typename T>
Entry gatherFrom(T const & t) {
    Entry e = {"", -1, std::numeric_limits<float>::quiet_NaN(), -1};
    copy_name(&t, e);
    copy_index(&t, e);
    copy_percision(&t, e);
    copy_latency(&t, e);
    return e;
}

int main() {
    A a = {"Foo", 12, 1.2};
    B b = {23, "Bar", 34};

    std::cout << "a = " << gatherFrom(a) << "\n";
    std::cout << "b = " << gatherFrom(b) << "\n";
}

The DEFINE_ENTRY_FIELD_COPIER() macro defines a pair of overloaded functions for each field you want to extract. One overload (copy_##field(T const * t, …), which becomes copy_name(T const * t, …), copy_index(T const * t, …), etc.) defines its return type as decltype(T::field, true), which resolves to type bool if T has a data member called name, index, etc. If T doesn't have such a field, the substitution fails, but rather than causing a compile-time error, this first overload is simply treated as if it doesn't exist (this is called SFINAE) and the call thus resolves to the second overload, copy_##field(void const * t, …), which accepts any type at all for its first argument and does nothing.
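Stripped of the macro, the same SFINAE mechanism can be sketched as a standalone check for a single field. This is only an illustration, assuming a C++11 compiler; the names has_latency, latency_or and latency_or_impl are made up here and do not appear in the code above.

```cpp
#include <type_traits>
#include <utility>

struct A { char name[256]; int index; float percision; };
struct B { int index; char name[256]; int latency; };

// Primary template: used whenever the specialization below drops out.
template <typename T, typename = void>
struct has_latency : std::false_type {};

// This specialization is only viable if `t.latency` is a well-formed
// expression for an object t of type T. If it is not, substitution fails
// silently (SFINAE) and the primary template is used instead.
template <typename T>
struct has_latency<T, decltype(void(std::declval<T&>().latency))>
    : std::true_type {};

// Two overloads selected by the trait; only the first one touches t.latency.
// They are constexpr only so the result can also be checked at compile time.
template <typename T>
constexpr int latency_or_impl(T const & t, int, std::true_type) {
    return t.latency;
}

template <typename T>
constexpr int latency_or_impl(T const &, int fallback, std::false_type) {
    return fallback;
}

// Returns t.latency if T has such a field, otherwise the fallback value.
template <typename T>
constexpr int latency_or(T const & t, int fallback) {
    return latency_or_impl(t, fallback, has_latency<T>());
}

static_assert(!has_latency<A>::value, "A has no latency field");
static_assert(has_latency<B>::value, "B has a latency field");
```

With this in place, latency_or(b, -1) reads b.latency for a B, while latency_or(a, -1) compiles for an A and simply yields -1. The overload pair generated by the macro achieves the same effect without a separate trait.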

Notes:

  1. Because this code resolves the overloads at compile-time, gatherFrom() is optimal, in the sense that the generated binary code for gatherFrom<A>(), for example, will look as if you tuned it for A by hand:

    Entry handCraftedGatherFromA(A const & a) {
        Entry e;
        e.latency = -1;
        memcpy(e.name, a.name, sizeof(a.name));
        e.index = a.index;
        e.percision = a.percision;
        return e;
    }
    

    Under g++ 4.8 with -O3, gatherFrom<A>() and handCraftedGatherFromA() generate identical code:

    pushq   %rbx
    movl    $256, %edx
    movq    %rsi, %rbx
    movl    $-1, 264(%rdi)
    call    _memcpy
    movss   260(%rbx), %xmm0
    movq    %rax, %rcx
    movl    256(%rbx), %eax
    movss   %xmm0, 260(%rcx)
    movl    %eax, 256(%rcx)
    movq    %rcx, %rax
    popq    %rbx
    ret
    

    Clang 4.2's gatherFrom<A>() doesn't do as well, unfortunately; it redundantly zero-initialises the entire Entry. So it's not all roses, I guess.

    By using NRVO, both versions avoid copying e when returning it. However, I should note that both versions would save one op-code (movq %rcx, %rax) by using an output parameter instead of a return value.

  2. The copy_…() functions return a bool result indicating whether the copy happened or not. This isn't currently used, but it could be used, e.g., to define int Entry::validFields as a bitmask indicating which fields were populated.

  3. The macro isn't required; it's just for DRY. The essential ingredient is the use of SFINAE.

  4. The assign() overloads also aren't required. They just avoid having a different almost-identical macro to handle char arrays.

  5. The above code relies on C++11's decltype keyword. If you are using an older compiler, it's messier, but still possible. The cleanest solution I've managed to come up with is the following. It's C++98-conformant and still based on the SFINAE principle:

    template <typename C, typename F, F (C::*), typename T>
    struct EnableCopy {
        typedef T type;
    };
    
    #define DEFINE_ENTRY_FIELD_COPIER(field, ftype)             \
        template <typename T>                                   \
        inline                                                  \
        typename EnableCopy<T, ftype, &T::field, bool>::type    \
        copy_##field(T const * t, Entry & e) {                  \
            assign(e.field, t->field);                          \
            return true;                                        \
        }                                                       \
                                                                \
        inline                                                  \
        bool copy_##field(void const *, Entry &) {              \
            return false;                                       \
        }
    
    DEFINE_ENTRY_FIELD_COPIER(name     , char[256])
    DEFINE_ENTRY_FIELD_COPIER(index    , int)
    DEFINE_ENTRY_FIELD_COPIER(percision, float)
    DEFINE_ENTRY_FIELD_COPIER(latency  , int)
    

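    Note that std::numeric_limits<float>::quiet_NaN() is itself available in C++98 (it just isn't constexpr there), so the initializer in gatherFrom() can stay as it is. If your platform has no quiet NaN, use some trick (0.0f/0.0f seems to work) or choose another magic value.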
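Note 2 above mentions using the bool results to build a bitmask of populated fields. A minimal sketch of that idea follows, assuming C++11. The FieldBit enum and its values are hypothetical, and Entry gains the validFields member the note describes; the macro and the assign() overloads are the ones from the answer.

```cpp
#include <cstddef>
#include <cstring>
#include <limits>

struct A { char name[256]; int index; float percision; };
struct B { int index; char name[256]; int latency; };

// Hypothetical flag values, one bit per Entry field.
enum FieldBit {
    HAS_NAME      = 1 << 0,
    HAS_INDEX     = 1 << 1,
    HAS_PERCISION = 1 << 2,
    HAS_LATENCY   = 1 << 3
};

struct Entry {
    char name[256];
    int index;
    float percision;
    int latency;
    int validFields;  // OR-ed FieldBit values for the fields that were copied
};

template <typename T>
inline void assign(T & dst, T const & src) { dst = src; }

template <std::size_t N>
inline void assign(char (&dst)[N], char const (&src)[N]) {
    std::memcpy(dst, src, N);
}

// Same overload pair as in the answer: the template copies the field and
// returns true; the void const * fallback does nothing and returns false.
#define DEFINE_ENTRY_FIELD_COPIER(field)                            \
    template <typename T>                                           \
    inline                                                          \
    decltype(T::field, true) copy_##field(T const * t, Entry & e) { \
        assign(e.field, t->field);                                  \
        return true;                                                \
    }                                                               \
                                                                    \
    inline                                                          \
    bool copy_##field(void const *, Entry &) {                      \
        return false;                                               \
    }

DEFINE_ENTRY_FIELD_COPIER(name)
DEFINE_ENTRY_FIELD_COPIER(index)
DEFINE_ENTRY_FIELD_COPIER(percision)
DEFINE_ENTRY_FIELD_COPIER(latency)

template <typename T>
Entry gatherFrom(T const & t) {
    Entry e = {"", -1, std::numeric_limits<float>::quiet_NaN(), -1, 0};
    // Each copier reports whether T actually had the field.
    if (copy_name(&t, e))      e.validFields |= HAS_NAME;
    if (copy_index(&t, e))     e.validFields |= HAS_INDEX;
    if (copy_percision(&t, e)) e.validFields |= HAS_PERCISION;
    if (copy_latency(&t, e))   e.validFields |= HAS_LATENCY;
    return e;
}
```

A table writer could then test e.validFields & HAS_LATENCY to decide whether to print the value or leave the cell empty, instead of relying on sentinel values such as -1 or NaN.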

Marcelo Cantos
1

everything else written in C (using C isn't an option for me :( ).

First I'd like to quote what Linus Torvalds had to say about this issue:


From: Linus Torvalds <torvalds <at> linux-foundation.org>
Subject: Re: [RFC] Convert builin-mailinfo.c to use The Better String Library.
Newsgroups: gmane.comp.version-control.git
Date: 2007-09-06 17:50:28 GMT (2 years, 14 weeks, 16 hours and 36 minutes ago)

C++ is a horrible language. It's made more horrible by the fact that a lot 
of substandard programmers use it, to the point where it's much much 
easier to generate total and utter crap with it. Quite frankly, even if 
the choice of C were to do *nothing* but keep the C++ programmers out, 
that in itself would be a huge reason to use C.

http://harmful.cat-v.org/software/c++/linus


They have a number of data structures that are VERY similar (i.e., they all have fields such as "name", "address", etc.). But, for whatever reason, there isn't a common structure that they used to base everything else off of (which makes doing anything hell).

They may have had very sound reasons for this. Putting common fields into a single base structure (class) may sound like a great idea. But it makes things really difficult if you want to apply major changes to one of the structures (replace some fields, change types, etc.) while leaving the rest intact. OOP is certainly not the one true way to do things.

So, my question is, is there a way to determine at runtime if a structure that has been passed into a template function has a specific field?

No, this is not possible, neither in C nor in C++, because all information about types gets discarded when the binary is created. There's neither reflection nor introspection in C or C++. Well, technically the debug information the compiler emits does provide this information, but there's no built-in language feature to access it. Also, this sort of debug information relies on analysis performed at compile time, not at runtime. C++ has RTTI, but this is only a very coarse system to identify which class an instance is of. It does not help with class or struct members.
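To illustrate how coarse RTTI is, here is a small sketch. The helper name sameType is made up for this example; the structs are the ones from the question.

```cpp
#include <typeinfo>

struct A { char name[256]; int index; float percision; };
struct B { int index; char name[256]; int latency; };

// RTTI can tell whether two objects have the same type...
template <typename T, typename U>
bool sameType(T const & lhs, U const & rhs) {
    return typeid(lhs) == typeid(rhs);
}

// ...but std::type_info only offers comparison, before(), name() and
// (since C++11) hash_code(). Nothing in it enumerates data members, so
// "does this object have a latency field?" cannot be asked through it.
```

So sameType(a, b) is false for an A and a B, and that is about as much as typeid can report here.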

But why do you care to do this at runtime anyway?

Anywho, I need to do a system-wide analysis of these structs that are in memory, and throw it all into a table.

You should actually be happy that you have to analyse C and not C++, because C is really, really easy to parse (unlike C++, which is tremendously difficult to parse, mostly because of those darn templates), especially structs. I'd just write a small and simple script that extracts all the struct definitions from the C sources. However, since structs are of constant size, they often contain pointers to dynamically allocated data. And unless you want to patch your allocator, I think the easiest way to analyse this is to hook into a debugger and record the memory usage of every unique object whose pointer is assigned to a struct member.

datenwolf
  • 1
    I completely agree with the point on C vs C++ - I hate C++, but that's what I was told to use, so I figured I'd try to take advantage of the templates. So, could I check at compile time, or during the preprocessor stage to see if the call was invalid, and if so take a different route? (Similar to a #ifdef)? I think that would be optimal - and I think that it's what SFINAE is all about too, but that isn't a part of the standard, correct? – Neil Jul 05 '12 at 22:34
  • To add to the fun see [here](http://harmful.cat-v.org/software/c++/I_did_it_for_you_all). – Gigi Jul 05 '12 at 22:51
  • 2
    @MSalters: Aha, lack of knowledge of C++. Enlighten me. FYI: C++ is statically compiled. C++ has no built in reflection. C++ has no built in introspection (reflection was a planned feature of C++0x, that however got removed). Now tell me: Without reflection or introspection, how do you determine at *runtime* what struct/class members there are in an object? – datenwolf Jul 06 '12 at 18:36
  • @datenwolf: No experience with customers either, I guess. The fact that he _asks_ for runtime resolution doesn't mean in the least that he _needs_ it. Classic XY problem. Note that he also thinks "black magic macros" may work, which (unlike C++ templates) don't grok types at all. Those macros are definitely compile time, so we see he's contradicting himself right there. – MSalters Jul 06 '12 at 21:25
  • @MSalters: I don't see people asking on SO as customers. OP had a specific question and he got a specific answer, namely that what he tries to do (runtime inspection of C structs and classes) is not possible. I also offered a compiletime solution. Also I might add gccxml to it, but this only adds the introspection part, but gives no code/memory coverage data. (BTW, the question title was very unspecific). And I explicitly mentioned, that doing compile time analysis is easier for C than C++, because of the simpler grammar. – datenwolf Jul 07 '12 at 10:03
  • Maybe I should have mentioned that about 8 years ago I had this little project whose aim was to add introspection and reflection data to C++ classes, so that I could just "plug" them into script language interfaces. I soon (about 3 months into it) gave up on the project, due to the sheer complexity. – datenwolf Jul 07 '12 at 10:08
  • 1
    @datenwolf: The stated problem is not to discover all the fields of a struct, but which subset of a predetermined set it possesses. This is possible using compile-time introspection. See [my answer](http://stackoverflow.com/a/17689868/9990) for details. – Marcelo Cantos Jul 17 '13 at 06:46
  • There are a lot of C people bashing C++. 99.9% of the time their opinion of C++ is uninformed simply because they don't know it, including Linus Torvalds. The way to read such rants is to remove all subjective adjectives, like _horrible_ and see if there is any factual content left. – Maxim Egorushkin Jul 17 '13 at 09:14
  • 1
    @MaximYegorushkin: After my adventures in Turbo Pascal in the early 1990s I first learned C++ and spent years using it (until circa 2005) before I actually learned to think and code in C. And ever since I strongly prefer coding in C over coding in C++. I think an important catalyst for this was that around that time I got exposed to functional programming, which was kind of an eye opener on how weird and convoluted C++ actually is. Heck, I even wasted some time trying to implement a templating capable C++ parser similar to Qt's *moc*. I went through C++ detox and try to stay clear of it. – datenwolf Jul 17 '13 at 11:07
  • @datenwolf So, your argument boils down to _C++ is hard to learn._ – Maxim Egorushkin Jul 17 '13 at 12:31
  • 1
    @MaximYegorushkin: My argument is that C++ became so convoluted that it effectively prohibits efficient code reuse. For any given project a common subset of language features and used idioms will be chosen. As soon as you try to integrate that with another C++ project all hell breaks loose. In 1998 I wrote a really nice framework for terminal based interactive applications. Think Qt, but for the console. It was all done in C++; I now deeply regret writing it in C++, because I can hardly reuse it in new projects. – datenwolf Jul 17 '13 at 13:03
  • 1
    @MaximYegorushkin: The library itself is not hard to reuse with my own programs. But it's hard to reuse if used together with other libraries also written in C++. For example the way my library deals with exceptions may clash with the exception handling of the other library (speaking of C++ exceptions, my advice, now: Don't use them). Or template metaprogramming: Unless it's a single coherent set of libraries adhering to the same code and programming style (like Boost) it's virtually impossible to mix template metaprogramming libs. – datenwolf Jul 17 '13 at 14:27
  • I don't see how derogatory comments about developers of a certain language can be considered a helpful answer. – scigor Jan 18 '19 at 10:41
1

Yes, this isn't hard at all. Just put both an A and an Entry in a single object, and make the Entry a second-class citizen:

struct Entry {
  int x;
  int y;
};
void setDefaultValues(Entry*); // You should be able to provide these.
struct Indirect : public Entry { };
template<typename T> struct EntryOr : public T, Indirect
{
  EntryOr() { setDefaultValues(this); }
};

// From C code
struct A {
  int x;
};

int main()
{
  EntryOr<A> foo;
  foo.x = 5; // A::x
  std::cout << foo.x << foo.y; // Prints A::x and Entry::y
}


MSalters
  • I don't have time to play with it right now, but that's really interesting. I'll mess around with it later tonight or this weekend. – Neil Jul 06 '12 at 19:07
  • Are we talking about C++?? If so: error C2385: ambiguous access of 'x', a class cannot inherit from more than one base – Johnny Pauling Jul 10 '12 at 14:09