Your implication that unions are useless if you cannot read their inactive members is wrong. Consider the following simplified implementation of a string class:
    class string {
        char* data_;
        size_t size_;
        union {
            size_t capacity_;
            char buffer_[16];
        };
        string(const char* str) : size_(strlen(str)) {
            if (size_ < 16)
                data_ = buffer_; // short string, buffer_ will be active
            else {
                capacity_ = size_; // long string, capacity_ is active
                data_ = new char[capacity_ + 1];
            }
            memcpy(data_, str, size_ + 1);
        }
        bool is_short() const { return data_ == buffer_; }
        ...
    public:
        size_t capacity() const { return is_short() ? 15 : capacity_; }
        const char* data() const { return data_; }
        ...
    };
Here, if the stored string has fewer than 16 characters, it is stored in buffer_ and data_ points to it. Otherwise, data_ points to a dynamically allocated buffer. Consequently, you can distinguish between the two cases (short/long string) by comparing data_ with buffer_. When the string is short, buffer_ is active and you don't need to read capacity_, since you know it is 15. When the string is long, capacity_ is active and you don't need to read buffer_, since it is unused.
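To make the short/long distinction concrete, here is a minimal compilable sketch of the same idea. The name small_string, the public constructor, the destructor, and the deleted copy operations are my own additions for illustration and are not part of the snippet above; treat it as a sketch under those assumptions rather than a complete string class.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, trimmed-down variant of the class above (not a full string type).
class small_string {
    char* data_;
    std::size_t size_;
    union {
        std::size_t capacity_;  // meaningful only for long strings
        char buffer_[16];       // meaningful only for short strings
    };

    // The pointer itself tells us which union member is in use.
    bool is_short() const { return data_ == buffer_; }

public:
    explicit small_string(const char* str) : size_(std::strlen(str)) {
        if (size_ < 16) {
            data_ = buffer_;                  // short: characters live inside the object
        } else {
            capacity_ = size_;                // long: capacity_ is the member in use
            data_ = new char[capacity_ + 1];
        }
        std::memcpy(data_, str, size_ + 1);   // copy including the terminating '\0'
    }

    ~small_string() {
        if (!is_short()) delete[] data_;      // only long strings own heap memory
    }

    // Copying is disabled to keep the sketch short (assumption, not in the original).
    small_string(const small_string&) = delete;
    small_string& operator=(const small_string&) = delete;

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return is_short() ? 15 : capacity_; }
    const char* data() const { return data_; }
};
```

Constructing small_string("hello") takes the short path and reports a capacity of 15 without ever reading capacity_, while a string of 16 or more characters takes the long path and never reads buffer_.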
Exactly this approach is used in libstdc++. It is a bit more complicated there since std::string is just a specialization of the std::basic_string class template, but the idea is the same. Source code from include/bits/basic_string.h:
    enum { _S_local_capacity = 15 / sizeof(_CharT) };
    union
    {
        _CharT _M_local_buf[_S_local_capacity + 1];
        size_type _M_allocated_capacity;
    };
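To see what the expression 15 / sizeof(_CharT) works out to for different character types, here is a small sketch. The helper names local_capacity and local_buffer_bytes are hypothetical and only mirror the arithmetic of the excerpt; the values in the comments assume a common platform where char16_t is 2 bytes and char32_t is 4 bytes.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the enum expression: 15 / sizeof(_CharT).
template <typename CharT>
constexpr std::size_t local_capacity() {
    return 15 / sizeof(CharT);
}

// Size in bytes of a local buffer holding local_capacity + 1 characters.
template <typename CharT>
constexpr std::size_t local_buffer_bytes() {
    return (local_capacity<CharT>() + 1) * sizeof(CharT);
}

// On the assumed platform:
//   char     -> capacity 15, buffer 16 bytes
//   char16_t -> capacity 7,  buffer 16 bytes
//   char32_t -> capacity 3,  buffer 16 bytes
constexpr std::size_t cap_char   = local_capacity<char>();
constexpr std::size_t cap_char16 = local_capacity<char16_t>();
constexpr std::size_t cap_char32 = local_capacity<char32_t>();
```

Under those assumptions the local buffer occupies 16 bytes for each of these character types, so only the number of characters that fit in it changes.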
Sharing storage through the union can save a lot of space if your program works with many strings at once (consider, e.g., databases). Without the union, each string object would take 8 more bytes in memory (the size of a size_t on a typical 64-bit platform).
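The size difference can be checked with a short sketch that compares two layouts. The struct names with_union and without_union are made up for this illustration, and the concrete numbers in the comments assume a typical 64-bit ABI with 8-byte pointers and size_t.

```cpp
#include <cstddef>

// Layout as in the string class: capacity and buffer share storage.
struct with_union {
    char* data;
    std::size_t size;
    union {
        std::size_t capacity;
        char buffer[16];
    };
};

// Same members, but capacity and buffer each get their own storage.
struct without_union {
    char* data;
    std::size_t size;
    std::size_t capacity;
    char buffer[16];
};

// On the assumed 64-bit ABI: sizeof(with_union) is 32, sizeof(without_union) is 40.
constexpr std::size_t saved_bytes = sizeof(without_union) - sizeof(with_union);
```

On such a platform saved_bytes is 8, i.e. exactly one size_t per string object.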