5

I would appreciate your help!

Which data structure is used to implement heterogeneous array in C or C++ ? The array may contain any standard data types like int,float, char, string etc...

leader
  • 51
  • 1
  • 4

3 Answers3

10

As ravi mentions, the appropriate data structure is called a tag-union (also called variant record). One way to implement it is this:

typedef union {
    int tag;
    struct {
        int tag;
        int val;
    } int_;
    struct {
        int tag;
        double val;
    } double_;
} object;

Then you can make an array of these.

object arr[5];

You use the tag field to indicate which union member is in use. Typically with an enum.

enum type { INT, DOUBLE };

Then set the tag when creating the object, and check the tag before accessing. This can be encapsulated by using constructor functions.

object new_int(int i){
    object ret;
    ret.tag = INT;
    ret.int_.val = i;
    return ret;
}

object new_double(double d){
    object ret;
    ret.tag = DOUBLE;
    ret.double_.val = d;
    return ret;
}

And you usually want to use a switch on the tag for accessing, writing a different case for each type.

void print_object(object x){
    switch(x.tag){
    case INT: printf("%d\n", x.int_.val); break;
    case DOUBLE: printf("%f\n", x.double_.val); break;
    }
}

Or in some circumstances, you may want to fold the array into a single type so it can be accessed without checking the tag each time.

for (i = 0; i < sizeof arr/sizeof*arr; i++)
    if (arr[i].tag == INT)
        arr[i] = new_double(arr[i].int_.val);
luser droog
  • 18,988
  • 3
  • 53
  • 105
  • 1
    Naïve question: isn't a union supposed to only "have" one element at a time, so that if either the int or double struct is used, the tag won't have any available memory? Are those tag elements in the structs there to pad the memory to somehow allow this? How weird. – Andres Salas Oct 23 '18 at 23:41
  • 2
    Yes, exactly. The `int tag;` inside each struct overlays the same memory as the one which is the union member. All union members are at the same memory location as the union itself with no padding at the front. And the first member of a struct is at the same memory location as the struct itself. There are other ways to organize the same kind of data structure. This one makes the definition a little weird but it makes the expressions look nicer IMO, but it's a trade-off. – luser droog Oct 24 '18 at 03:09
  • I see! I suppose one could create a struct with elements tag and the union with the two members if such a thing is possible. I like it though, thanks! This should have more attention for clearly addressing heterogeneous array-making! – Andres Salas Oct 24 '18 at 15:51
  • Yes, that way is used a lot and it works. What I don't like is having to add the `.u.` in all the expressions to access the inner union. I eventually moved on to a third way which is described [over here](https://codereview.stackexchange.com/questions/140994/encoding-data-into-integer-handles-for-language-interpreter) – luser droog Oct 24 '18 at 15:55
  • That explains it, the expressions are indeed neater this way. Bit stealing, that's clever, if trickier to implement. So long as the memory ordering is not somehow compiler-dependent this works and looks good. – Andres Salas Oct 24 '18 at 16:13
  • 1
    The behavior of structs and unions is defined by the standard. There may be padding added in the middle, but never at the front. The compiler may offer a `#pragma pack` option to select how it pads in the middle. But the members of a struct must be laid out in the order the code specifies. This allows things like the `offsetof()` macro to work reliably. ... So, if you were to make the tag be a `char` instead of `int` the whole thing probably wouldn't get any smaller, it would just introduce 3 bytes of padding. – luser droog Oct 24 '18 at 17:32
1

There is no such array in c++ which can store elements of different types nor there is container in stl. Although there's one way to store different element in a container but condition is those types should be related through inheritance.

In C there's a concept called tagged union which can store different types giving tag as a means to specify which variable is actually there.

One more way to do this is using an array of void* pointers. Although that would be quite ugly C++. This would not be truly heterogeneous as you are using a type of pointer that any pointer can be cast into. It is similar to making a collection of base class type and then storing objects of derived classes.

This I got from Stroustrup article:-

If you need a heterogeneous container in C++, define a common interface for all the elements and make a container of those. For example:

class Io_obj { /* ... */ }; // the interface needed to take part in object I/O

vector<Io_obj*> vio;        // if you want to manage the pointers directly
vector< Handle<Io_obj> > v2;    // if you want a "smart pointer" to handle the objects

Apart from that Boost::Any can also be used:-

vector<Any> v;
ravi
  • 10,994
  • 1
  • 18
  • 36
  • Please have a look at [this](http://www.boost.org/doc/libs/1_54_0/doc/html/any.html). Can it be used to implement heterogeneous array? – leader Nov 06 '14 at 06:15
  • @leader: yes, it will work fine to move around stuff of "hidden" types, but as you need to specify the type to get the data out of it as you'll need to perform any operation on it. – Matteo Italia Nov 06 '14 at 07:21
0

I guess you could keep an array of pointers to anything

void* stuff[size];
const char* str = "hello";
int x = 20;
int *array = malloc(sizeof(int) * 5);
stuff[0] = str;
stuff[1] = &x;
stuff[2] = array;

Alternatively, an array of unions if you knew all the types before hand.

ryanpattison
  • 6,151
  • 1
  • 21
  • 28
  • How will you take the input into your data structure? Can you explain? I guess you will use type casting and it requires checking the data type before hand right? – leader Nov 06 '14 at 06:20
  • I added some examples. An array of structs, each with a 'type' field and `void*` element may prove to be useful as well. – ryanpattison Nov 06 '14 at 06:27
  • filename.cpp [Error] 'stuff' does not name a type – leader Nov 06 '14 at 06:30
  • More importantly, after you store 1000's of things in `stuff`, **How do you know how to access it?** If I want to later access `stuff[963]`, **What tells me whether it is an `int`, `char *` or `array of ints`??** No downvote, but **How do you use the collection of varying types in `stuff[]` in any meaningful way??** – David C. Rankin Nov 06 '14 at 07:39
  • @DavidC.Rankin I gave a minimal example to show it can be done. In the comments here is a suggestion for an array of structs with type fields. – ryanpattison Nov 06 '14 at 07:43
  • I agree with the `array of structs`, that makes sense, the only penalty you pay is the additional storage (per element) for the unused types. The tagged union eliminates the wasted storage, but isn't the most convenient. But with an array of void pointers, I was really left wondering, How would you ever know what to cast any one element in the array to in order to ever make use of it again? I guess you could only add something like a tagged union, or, create a separate array of ints that represent the type of each element. It just seemed unworkable. – David C. Rankin Nov 06 '14 at 07:59