Would it be beneficial to use a union?

Question

While working on my classes I started to experiment with some design decisions that I might want to incorporate into my existing Register class. I was thinking of combining the corresponding std::uint8_t, std::uint16_t, std::uint32_t and or std::uint64_t into a nested nameless struct - union data structure with a std::bitset<value> within my templated class depending on the class's template type upon instantiation where value is the size of bits of that type.

Here is the arbitrary class that I was working on to incorporate the data structure design. It is using the same header files that my Register class uses that can be found below.

template<typename T>
struct MyRegT {
    union {
        struct {
            T value : (sizeof(T) * CHAR_BIT);
        };
        std::bitset<sizeof(T) * CHAR_BIT> bits;
    } reg;

    explicit MyRegT(T value) : reg{ value } {}
};

Both class's compile and are reporting the same expected values. The only difference between the two is that the class above uses less internal memory due to the union in which can be a good thing.

However, when using union's and bitfields one does have to be careful due to many different factors such as the endian of the architecture, the operating system itself, the compiler in how it pads the memory within structures.

In this specific case what would be the implications if I did incorporate this; would it be an issue with portability, thread safety or thread capable, does it maintain readability, etc?

These are just concerns that I have before I change my existing class. I already mentioned the benefits of reusing the same memory and in this context I believe it should be okay as in a specific use case:

MyRegT<std::uint8_t> r8{32};

Here the union would be between a std::uint8_t and a std::bitset<8> and the value would contain 32 while the bitset would be able to provide the user with a string of the current stored bits or binary representation of the value 32 stored as a std::uint8_t as well as any of the other features that std::bitset has to offer. Any changes in one should reflect in the other. As they are mutually the same entity just different representations of it.

If I was to use something on this lines, would it be better to have the std::bitset<...> inside of the nameless struct and the std::uintN_t where N is the size in bits outside of the struct?

I would like to know are there any other missing pros and all of the cons towards this approach before I decide make any major changes to my already existing code base.

Here is my full existing Register class with all of its current functionality for a complete reference.

Register.h

#pragma once

#include <algorithm>
#include <bitset>
#include <cassert>
#include <climits>
#include <cstdint>
#include <exception>
#include <iterator>
#include <iostream>
#include <iomanip>
#include <limits>
#include <type_traits>

namespace vpc {
    using u8  = std::uint8_t;
    using u16 = std::uint16_t;
    using u32 = std::uint32_t;
    using u64 = std::uint64_t;

    template<typename T>
    struct Register;

    using Reg8  = Register<u8>;
    using Reg16 = Register<u16>;
    using Reg32 = Register<u32>;
    using Reg64 = Register<u64>;

    template<typename T>
    struct Register {
        T value;
        std::bitset<sizeof(T)* CHAR_BIT> bits;

        Register() : value{ 0 }, /*previous_value{ 0 },*/ bits{ 0 } {}

        template<typename U, std::enable_if_t<(sizeof(U) > sizeof(T))>* = nullptr>
        explicit Register(const U val, const u8 idx = 0) :
            value{ static_cast<T>((val >> std::size(bits) * idx) &
                  std::numeric_limits<std::make_unsigned_t<T>>::max()) },
            bits{ value }
        {
            constexpr u16 sizeT = sizeof(T);
            constexpr u16 sizeU = sizeof(U);
            assert((idx >= 0) && (idx <= ((sizeU / sizeT) - 1)) );
        }

        template<typename U, std::enable_if_t<(sizeof(U) < sizeof(T))>* = nullptr>
        explicit Register(const U val, const u8 idx = 0) :
            value{ static_cast<T>((static_cast<T>(val) << sizeof(U)*CHAR_BIT*idx) &
                  std::numeric_limits<std::make_unsigned_t<T>>::max()) },
            bits{ value }
        {
            constexpr u16 sizeT = sizeof(T);
            constexpr u16 sizeU = sizeof(U);
            assert((idx >= 0) && (idx <= ((sizeT / sizeU) - 1)) );
        }

        template<typename U, std::enable_if_t<(sizeof(U) == sizeof(T))>* = nullptr>
        explicit Register(const U val, const u8 idx = 0) :
            value{ static_cast<T>( val ) }, bits{ value }
        {}

        template<typename... Args>
        Register(Args... args) {}

        template<typename U>
        Register(const Register<U>& reg, const u8 idx = 0) : Register(reg.value, idx) {}

        void changeEndian() {
            T tmp = value;
            char* const p = reinterpret_cast<char*>(&tmp);
            for (size_t i = 0; i < sizeof(T) / 2; ++i)
                std::swap(p[i], p[sizeof(T) - i - 1]);
            bits = tmp;
        }

        Register& operator=(const Register& obj) {
            this->value = obj.value;
            //this->previous_value = obj.previous_value;
            this->bits = obj.bits;
            return *this;
        }

        template<typename Lhs, typename Rhs>
        friend auto operator+(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Lhs, typename Rhs>
        friend auto operator-(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Lhs, typename Rhs>
        friend auto operator*(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Lhs, typename Rhs>
        friend auto operator/(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Lhs, typename Rhs>
        friend auto operator%(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Lhs, typename Rhs>
        friend auto operator&(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Lhs, typename Rhs>
        friend auto operator|(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Lhs, typename Rhs>
        friend auto operator^(const Register<Lhs>& l, const Register<Rhs>& r);

        template<typename Reg>
        friend auto operator~(const Register<Reg>& l);
    };

    template<typename Lhs, typename Rhs>
    auto operator+(const Register<Lhs>& l, const Register<Rhs>& r) {    
        return Register<decltype(l.value + r.value)>{ l.value + r.value };
    }

    template<typename Lhs, typename Rhs>
    auto operator-(const Register<Lhs>& l, const Register<Rhs>& r) {
        return Register<decltype(l.value - r.value)>{ l.value - r.value };
    }

    template<typename Lhs, typename Rhs>
    auto operator*(const Register<Lhs>& l, const Register<Rhs>& r) {
        return Register<decltype(l.value * r.value)>{ l.value * r.value };
    }

    template<typename Lhs, typename Rhs>
    auto operator/(const Register<Lhs>& l, const Register<Rhs>& r) {
        if (r.value == 0)
            throw std::exception( "Division by 0\n" );
        return Register<decltype(l.value / r.value)>{ l.value / r.value };
    }

    template<typename Lhs, typename Rhs>
    auto operator%(const Register<Lhs>& l, const Register<Rhs>& r) {
        return Register<decltype(l.value % r.value)>{ l.value % r.value };
    }

    template<typename Lhs, typename Rhs>
    auto operator&(const Register<Lhs>& l, const Register<Rhs>& r) {
        return Register<decltype(l.value & r.value)>{ l.value & r.value};
    }

    template<typename Lhs, typename Rhs>
    auto operator|(const Register<Lhs>& l, const Register<Rhs>& r) {
        return Register<decltype(l.value | r.value)>{ l.value | r.value};
    }

    template<typename Lhs, typename Rhs>
    auto operator^(const Register<Lhs>& l, const Register<Rhs>& r) {
        return Register<decltype(l.value ^ r.value)>{ l.value ^ r.value };
    }

    template<typename Reg>
    auto operator~(const Register<Reg>& r) {
        return Register<decltype(~r.value)>{~r.value};
    }

    template<typename T>
    std::ostream& operator<<(std::ostream& os, const Register<T>& r) {
        return os << "Reg" << std::size(r.bits) << '(' << +r.value << ")"
            << "\nhex: 0x" << std::uppercase << std::setfill('0') << std::setw(sizeof(T) * 2) << std::hex
            << +r.bits.to_ullong() << std::dec << "\nbin: "
            << r.bits << "\n\n";
    }

    template<typename T>
    T changeEndian(T in) {
        char* const p = reinterpret_cast<char*>(&in);
        for (size_t i = 0; i < sizeof(T) / 2; ++i)
            std::swap(p[i], p[sizeof(T) - i - 1]);
        return in;
    }

    template<typename T>
    Register<T> reverseBitOrder(Register<T>& reg, bool copy = false) {
        static constexpr u16 BitCount = sizeof(T) * CHAR_BIT;

        auto str = reg.bits.to_string();
        std::reverse(str.begin(), str.end());

        if (copy) { // return a copy
            Register<T> cpy;
            cpy.bits = std::bitset<BitCount>(str);
            cpy.value = static_cast<T>(cpy.bits.to_ullong());
            return cpy;
        }
        else {
            reg.bits = std::bitset<BitCount>(str);
            //reg.previous_value = reg.value;
            reg.value = static_cast<T>(reg.bits.to_ullong());
            return {};
        }
    }    
} // namespace vpc

@L.F. Oh in my original class when I create an instance of one it would look like this is user code: `Reg8 r8{32};`, `Reg16 r16{0xABCD};` etc. as I'm not using the `template` directly since I have `Reg8`, `Reg16` etc. declared above the class. As for the test class that I'm working on; I am templating it out `MyRegT r32{0xABCDEF01};` — Francis Cugler, Sep 28 '19 at 03:02
@L.F. It could be, and I'm not 100% sure, but I'm not having any issues in Visual Studio 2017... And I have written functions to do unit testing on the original class, but haven't gotten around to doing it for the modified version... I have tried various values for the different types `u8`, `u16`, `u32` and `u64` where each of them correspond to `std::uintX_t`. And so far they are producing the same results as my current class. — Francis Cugler, Sep 28 '19 at 03:03
@L.F. In the original class I've also tested the operators and they seem to work too. Now one can construct a `Reg64` from a `Reg32` or 2 `Reg32` objects due to the constructors. As for the operators; I haven't tested them on different types, but as long as they are the same type the operators are working. — Francis Cugler, Sep 28 '19 at 03:06
Hmm ... "it seems to be working correctly" is just one possible outcome of [undefined behavior](https://stackoverflow.com/q/2397984). The code can break due to compiler update, or new optimization strategies, or change in environment temperature, ... — L. F., Sep 28 '19 at 03:09

score 3 · Answer 1 · edited Sep 28 '19 at 05:13

3

Union is not a drop-in replacement for your class. You have member functions some of which read bits member while others read value (maybe there is some reading both, I didn't through it all).

Only one of the union members can be active at any moment, and behaviour of reading the inactive member is undefined (there is an exception, which don't apply here).

Furthermore, anonymous structs are ill-formed in C++.

edited Sep 28 '19 at 05:13

answered Sep 28 '19 at 03:27

eerorika

232,697
12
197
326

Thank you for your feed back; I'm familiar with but rarely use them so I just wanted some clarification. I could probably incorporate it but would have to rewrite the entire class around the `union { struct { }; };` object. The current way I'm doing this is i'm storing 2 variables so if I have a `Reg8` object there's at least 1 byte for the `std::uint8_t` and how ever many bytes it takes for `std::bitset<8>` not counting padding and other things... So I was thinking of trying to consolidate the memory of footprint of the two members into a single entity since they do represent the same thing. – Francis Cugler Sep 28 '19 at 03:54
@FrancisCugler You could simply store a `std::uint8_t`. You can access its bits using bitwise operations; no bitset needed. – eerorika Sep 28 '19 at 03:55
I know I could probably do that; however I like the representation of bitset to easily view the stored bits either as a `string` or `ulong` type and some of the built in operations that the class template can do such as it's manipulators and the `stream` insertion and extraction operators. I'm in the process of using this class in a `Hardware - CPU` emulator program. I can easily declare a bunch of registers as a CPU would contain but instead of doing a bunch of `array[]` manipulation I can store these into a `vector` index the one I need and easily get the binary or hex representation out... – Francis Cugler Sep 28 '19 at 04:02
I'm thinking this is more for displaying and debugging purposes while I build my emulator and once I have the `emulated hardware` such as the `6502` or some other arbitrary CPU then I could exchange the existing class with a basic register class without the `bitset` portion once I know the code is working. And the replacement should be fairly easy. – Francis Cugler Sep 28 '19 at 04:04
1

@FrancisCugler Then write a converter from your integer to a bitset. You cannot have the same memory viewed as both an integer and a bitset in standard-compliant C++. – Yakk - Adam Nevraumont Sep 28 '19 at 16:30
@Yakk-AdamNevraumont Okay; that's what I wanted to know. – Francis Cugler Sep 28 '19 at 17:03

Would it be beneficial to use a union?

1 Answers1