4

I want to write a template function that writes tables to HDF5 files. The signature should look similar to

template<typename record> void writeTable(const std::vector<record>& data);

where record is a struct, or

template<typename... elements> 
    void writeTable(const std::vector<std::tuple<elements...>>& data);

The actual implementation would have more parameters to determine the destination, etc.

To write the data I need to define an HDF5 compound type, which contains the name and the offset of each member. Usually you would use the HOFFSET macro to get the field offset, but as I don't know the struct fields beforehand I can't do that.
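For comparison, this is roughly what the known-fields case looks like. As far as I know, HOFFSET is a thin wrapper around the standard offsetof macro, so the sketch below uses offsetof directly and does not need HDF5 at all. The Particle struct and the FieldInfo descriptor are hypothetical names made up for illustration; they are not part of HDF5.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical record type, used only for illustration.
struct Particle {
  int id;
  double mass;
  char tag;
};

// offsetof is only unconditionally supported for standard-layout types.
static_assert(std::is_standard_layout_v<Particle>);

// Hypothetical descriptor: the per-member data a compound type needs.
struct FieldInfo {
  const char *name;
  std::size_t offset;
  std::size_t size;
};

// One entry per member; this is the part that has to be written by hand
// when the fields are known, and generated when they are not.
constexpr std::array<FieldInfo, 3> particle_fields = {{
  {"id", offsetof(Particle, id), sizeof(int)},
  {"mass", offsetof(Particle, mass), sizeof(double)},
  {"tag", offsetof(Particle, tag), sizeof(char)},
}};
```

The problem in the question is producing a table like particle_fields when the member list only exists as a parameter pack.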

What I tried so far was constructing a struct type from the typename pack. The naive implementation did not have standard layout, but the implementation here does. All that's left is to get the offsets of the members. I would like to expand the parameter pack into an initializer list with the offsets:

#include <cstddef>
#include <string>
#include <vector>

template<typename... members> struct record {};

template<typename member, typename... members> struct record<member, members...> : 
    record<members...> {
  record(member m, members... ms) : record<members...>(ms...), tail(m) {}
  member tail;
};

template<typename... Args> void 
    make_table(const std::string& name, const std::vector<record<Args...>>& data) {
  using record_type = record<Args...>;
  // get_offset is pseudocode: it stands for the missing piece this question asks about
  std::vector<size_t> offsets = { get_offset(record_type,Args)... };
}

int main() {
  std::vector<record<int, float>> table = { {1, 1.0}, {2, 2.0} };
  make_table("table", table);
}

Is there a possible implementation for get_offset? I would think not, because in the case of record<int, int> it would be ambiguous. Is there another way to do it?

Or is there any other way I could approach this problem?

max66
maufl
  • constexpr version of this question https://stackoverflow.com/questions/70647441/how-to-determine-the-offset-of-an-element-of-a-tuple-at-compile-time – alfC Jan 10 '22 at 05:35

2 Answers

4

Calculating offsets is quite simple. Given a tuple with types T0, T1 ... TN, the offset of T0 is 0 (as long as you use alignas(T0) on your char array). The offset of T1 is sizeof(T0) rounded up to alignof(T1).

In general, the offset of TB (which comes after TA) is round_up(offset_of<TA>() + sizeof(TA), alignof(TB)).
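To make the arithmetic concrete, here is a small sketch that applies the formula by hand to the sequence char, int, double and compares the result with a plain struct that has the same members. Sample, off_a, off_b and off_c are hypothetical names for this example only, and the comparison assumes a typical ABI where a struct places its members in order with natural alignment.

```cpp
#include <cstddef>

// Round num up to the next multiple of `multiple` (assumed non-zero).
constexpr std::size_t round_up(std::size_t num, std::size_t multiple) {
  return (num + multiple - 1) / multiple * multiple;
}

// Hypothetical struct with the same member sequence as <char, int, double>.
struct Sample {
  char a;
  int b;
  double c;
};

// The formula applied step by step: each offset is the end of the
// previous member rounded up to the alignment of the next one.
constexpr std::size_t off_a = 0;
constexpr std::size_t off_b = round_up(off_a + sizeof(char), alignof(int));
constexpr std::size_t off_c = round_up(off_b + sizeof(int), alignof(double));
```

On such an ABI, off_b and off_c should agree with offsetof(Sample, b) and offsetof(Sample, c).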

Calculating the offsets of elements in a std::tuple could be done like this:

#include <cstddef>
#include <tuple>

constexpr size_t roundup(size_t num, size_t multiple) {
  const size_t mod = num % multiple;
  return mod == 0 ? num : num + multiple - mod;
}

template <size_t I, typename Tuple>
struct offset_of {
  static constexpr size_t value = roundup(
    offset_of<I - 1, Tuple>::value + sizeof(std::tuple_element_t<I - 1, Tuple>),
    alignof(std::tuple_element_t<I, Tuple>)
  );
};

template <typename Tuple>
struct offset_of<0, Tuple> {
  static constexpr size_t value = 0;
};

template <size_t I, typename Tuple>
constexpr size_t offset_of_v = offset_of<I, Tuple>::value;

Here's a test suite. As you can see from the first test, the alignment of elements is taken into account.

static_assert(offset_of_v<1, std::tuple<char, long double>> == 16);
static_assert(offset_of_v<2, std::tuple<char, char, long double>> == 16);
static_assert(offset_of_v<3, std::tuple<char, char, char, long double>> == 16);
static_assert(offset_of_v<4, std::tuple<char, char, char, char, long double>> == 16);

static_assert(offset_of_v<0, std::tuple<int, double, int, char, short, long double>> == 0);
static_assert(offset_of_v<1, std::tuple<int, double, int, char, short, long double>> == 8);
static_assert(offset_of_v<2, std::tuple<int, double, int, char, short, long double>> == 16);
static_assert(offset_of_v<3, std::tuple<int, double, int, char, short, long double>> == 20);
static_assert(offset_of_v<4, std::tuple<int, double, int, char, short, long double>> == 22);
static_assert(offset_of_v<5, std::tuple<int, double, int, char, short, long double>> == 32);

I hardcoded the offsets in the above tests. The offsets are correct if the following tests succeed.

static_assert(sizeof(char) == 1 && alignof(char) == 1);
static_assert(sizeof(short) == 2 && alignof(short) == 2);
static_assert(sizeof(int) == 4 && alignof(int) == 4);
static_assert(sizeof(double) == 8 && alignof(double) == 8);
static_assert(sizeof(long double) == 16 && alignof(long double) == 16);
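To get back to what the question asked for (one offset per element of the pack), the index can be expanded with std::index_sequence. This is only a sketch: all_offsets and all_offsets_impl are hypothetical helper names, roundup and offset_of are repeated from above so the snippet compiles on its own, and the values describe an in-order layout, which is not necessarily how your standard library lays out std::tuple.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <utility>

constexpr std::size_t roundup(std::size_t num, std::size_t multiple) {
  const std::size_t mod = num % multiple;
  return mod == 0 ? num : num + multiple - mod;
}

// Same recursion as offset_of above, repeated so the sketch is self-contained.
template <std::size_t I, typename Tuple>
struct offset_of {
  static constexpr std::size_t value = roundup(
    offset_of<I - 1, Tuple>::value + sizeof(std::tuple_element_t<I - 1, Tuple>),
    alignof(std::tuple_element_t<I, Tuple>));
};

template <typename Tuple>
struct offset_of<0, Tuple> {
  static constexpr std::size_t value = 0;
};

// Hypothetical helper: expand the index pack into one offset per element.
template <typename Tuple, std::size_t... Is>
constexpr auto all_offsets_impl(std::index_sequence<Is...>) {
  return std::array<std::size_t, sizeof...(Is)>{offset_of<Is, Tuple>::value...};
}

// Hypothetical entry point: all_offsets<int, float>() yields {0, 4} on
// platforms where int and float are both 4 bytes with 4-byte alignment.
template <typename... Elems>
constexpr auto all_offsets() {
  return all_offsets_impl<std::tuple<Elems...>>(std::index_sequence_for<Elems...>{});
}
```

The result is a constexpr std::array, so it can be copied into a std::vector<size_t> or handed element by element to whatever builds the compound type.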

std::tuple seems to store its elements sequentially (without reordering them to reduce padding), which is what the following tests check. I don't think the standard requires std::tuple to be implemented this way, so the following tests are not guaranteed to succeed.

#include <cassert>

template <size_t I, typename Tuple>
size_t real_offset(const Tuple &tup) {
  const char *base = reinterpret_cast<const char *>(&tup);
  return reinterpret_cast<const char *>(&std::get<I>(tup)) - base;
}

int main(int argc, char **argv) {
  using Tuple = std::tuple<int, double, int, char, short, long double>;
  Tuple tup;
  assert((offset_of_v<0, Tuple> == real_offset<0>(tup)));
  assert((offset_of_v<1, Tuple> == real_offset<1>(tup)));
  assert((offset_of_v<2, Tuple> == real_offset<2>(tup)));
  assert((offset_of_v<3, Tuple> == real_offset<3>(tup)));
  assert((offset_of_v<4, Tuple> == real_offset<4>(tup)));
  assert((offset_of_v<5, Tuple> == real_offset<5>(tup)));
}
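Whether those assertions hold depends on the standard library. As far as I know, libstdc++ stores the elements in reverse order, so the comparison may fail there even if it passes with libc++. A sketch that only measures and reports instead of asserting a particular order could look like this; measured_offsets, measured_offsets_impl and tuple_is_standard_layout are hypothetical names for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Distance in bytes from the start of the tuple object to element I.
template <std::size_t I, typename Tuple>
std::size_t real_offset(const Tuple &tup) {
  const char *base = reinterpret_cast<const char *>(&tup);
  const char *elem = reinterpret_cast<const char *>(&std::get<I>(tup));
  // Sanity check: the element must live inside the tuple object.
  assert(elem >= base && elem < base + sizeof(Tuple));
  return static_cast<std::size_t>(elem - base);
}

template <typename Tuple, std::size_t... Is>
std::array<std::size_t, sizeof...(Is)> measured_offsets_impl(
    const Tuple &tup, std::index_sequence<Is...>) {
  return {{real_offset<Is>(tup)...}};
}

// Hypothetical helper: measured offset of every element, in index order.
template <typename... Elems>
std::array<std::size_t, sizeof...(Elems)> measured_offsets(
    const std::tuple<Elems...> &tup) {
  return measured_offsets_impl(tup, std::index_sequence_for<Elems...>{});
}

// Whether this library's std::tuple<int, double> is standard layout.
constexpr bool tuple_is_standard_layout =
  std::is_standard_layout_v<std::tuple<int, double>>;
```

Printing the array for a std::tuple<int, double, int> on your own toolchain shows quickly whether the elements are stored in index order, in reverse, or in some other arrangement.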

Now that I've gone to all of this effort, would that real_offset function suit your needs?


This is a minimal implementation of a tuple that accesses a char[] with offset_of. It is undefined behavior, though, because of the reinterpret_cast: even though I'm constructing the object in the same bytes and accessing the object in the same bytes, it's still UB. See this answer for all the standardese. It works on every compiler you can find, so despite the UB I would just use it anyway. This tuple is standard layout (unlike std::tuple). If the elements of your tuple are all trivially copyable, you can remove the copy and move constructors and replace them with memcpy.

#include <cstddef>
#include <new>
#include <tuple>
#include <type_traits>
#include <utility>

template <typename... Elems>
class tuple;

template <size_t I, typename Tuple>
struct tuple_element;

template <size_t I, typename... Elems>
struct tuple_element<I, tuple<Elems...>> {
  using type = std::tuple_element_t<I, std::tuple<Elems...>>;
};

template <size_t I, typename Tuple>
using tuple_element_t = typename tuple_element<I, Tuple>::type;

template <typename Tuple>
struct tuple_size;

template <typename... Elems>
struct tuple_size<tuple<Elems...>> {
  static constexpr size_t value = sizeof...(Elems);
};

template <typename Tuple>
constexpr size_t tuple_size_v = tuple_size<Tuple>::value;

constexpr size_t roundup(size_t num, size_t multiple) {
  const size_t mod = num % multiple;
  return mod == 0 ? num : num + multiple - mod;
}

template <size_t I, typename Tuple>
struct offset_of {
  static constexpr size_t value = roundup(
    offset_of<I - 1, Tuple>::value + sizeof(tuple_element_t<I - 1, Tuple>),
    alignof(tuple_element_t<I, Tuple>)
  );
};

template <typename Tuple>
struct offset_of<0, Tuple> {
  static constexpr size_t value = 0;
};

template <size_t I, typename Tuple>
constexpr size_t offset_of_v = offset_of<I, Tuple>::value;

template <size_t I, typename Tuple>
auto &get(Tuple &tuple) noexcept {
  return *reinterpret_cast<tuple_element_t<I, Tuple> *>(tuple.template addr<I>());
}

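template <size_t I, typename Tuple>
const auto &get(const Tuple &tuple) noexcept {
  // addr() is a non-const member, so cast constness away for the lookup and add it back on the result
  return *reinterpret_cast<const tuple_element_t<I, Tuple> *>(const_cast<Tuple &>(tuple).template addr<I>());
}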

template <typename... Elems>
class tuple {
  // align the buffer for the strictest element, not just the first one
  alignas(Elems...) char storage[offset_of_v<sizeof...(Elems), tuple<Elems..., char>>];
  using idx_seq = std::make_index_sequence<sizeof...(Elems)>;

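  template <size_t I>
  void *addr() {
    // storage decays to char *, so the addition advances by bytes
    return static_cast<void *>(storage + offset_of_v<I, tuple>);
  }

  // the friend declarations must match the signatures of the get overloads above
  template <size_t I, typename Tuple>
  friend auto &get(Tuple &) noexcept;

  template <size_t I, typename Tuple>
  friend const auto &get(const Tuple &) noexcept;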

  template <size_t... I>
  void default_construct(std::index_sequence<I...>) {
    (new (addr<I>()) Elems{}, ...);
  }
  template <size_t... I>
  void destroy(std::index_sequence<I...>) {
    (get<I>(*this).~Elems(), ...);
  }
  template <size_t... I>
  void move_construct(tuple &&other, std::index_sequence<I...>) {
    (new (addr<I>()) Elems{std::move(get<I>(other))}, ...);
  }
  template <size_t... I>
  void copy_construct(const tuple &other, std::index_sequence<I...>) {
    (new (addr<I>()) Elems{get<I>(other)}, ...);
  }
  template <size_t... I>
  void move_assign(tuple &&other, std::index_sequence<I...>) {
    (static_cast<void>(get<I>(*this) = std::move(get<I>(other))), ...);
  }
  template <size_t... I>
  void copy_assign(const tuple &other, std::index_sequence<I...>) {
    (static_cast<void>(get<I>(*this) = get<I>(other)), ...);
  }

public:
  tuple() noexcept((std::is_nothrow_default_constructible_v<Elems> && ...)) {
    default_construct(idx_seq{});
  }
  ~tuple() {
    destroy(idx_seq{});
  }
  tuple(tuple &&other) noexcept((std::is_nothrow_move_constructible_v<Elems> && ...)) {
    move_construct(std::move(other), idx_seq{});
  }
  tuple(const tuple &other) noexcept((std::is_nothrow_copy_constructible_v<Elems> && ...)) {
    copy_construct(other, idx_seq{});
  }
  tuple &operator=(tuple &&other) noexcept((std::is_nothrow_move_assignable_v<Elems> && ...)) {
    move_assign(std::move(other), idx_seq{});
    return *this;
  }
  tuple &operator=(const tuple &other) noexcept((std::is_nothrow_copy_assignable_v<Elems> && ...)) {
    copy_assign(other, idx_seq{});
    return *this;
  }
};

Alternatively, you could use this function:

template <size_t I, typename Tuple>
size_t member_offset() {
  return reinterpret_cast<size_t>(&std::get<I>(*static_cast<Tuple *>(nullptr)));
}

template <typename Member, typename Class>
size_t member_offset(Member (Class::*ptr)) {
  return reinterpret_cast<size_t>(&(static_cast<Class *>(nullptr)->*ptr));
}

template <auto MemPtr>
size_t member_offset() {
  return member_offset(MemPtr);
}

Once again, this is undefined behavior (because of the nullptr dereference and the reinterpret_cast) but it will work as expected with every major compiler. The function cannot be constexpr (even though member offset is a compile-time calculation).
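If an instance of the class is available, the null pointer can be avoided by measuring on that object instead. This is a sketch of that variant, not a replacement blessed by the standard: member_offset_in and Row are hypothetical names, and I am only claiming that it sidesteps the null dereference, not that the pointer subtraction is beyond all doubt.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant of member_offset that measures on a real object,
// so no null pointer is dereferenced.
template <typename Member, typename Class>
std::size_t member_offset_in(const Class &obj, Member Class::*ptr) {
  const char *base = reinterpret_cast<const char *>(&obj);
  const char *field = reinterpret_cast<const char *>(&(obj.*ptr));
  // Sanity check: the member must live inside the object.
  assert(field >= base && field < base + sizeof(Class));
  return static_cast<std::size_t>(field - base);
}

// Hypothetical struct for illustration.
struct Row {
  int id;
  float value;
};
```

For a standard-layout struct like Row, the result should agree with offsetof(Row, value), which remains the simplest option whenever the member names are known.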

Indiana Kernick
  • This looks very good. Is the calculation in real_offset always correct? Even if the tuple does not have standard layout? – maufl Mar 11 '19 at 09:23
  • @maufl The calculation inside `real_offset` is always correct and works for any type (not just tuples). However, it uses `reinterpret_cast` so it's technically undefined behavior and it cannot be used in a `constexpr` context (even though it's a compile-time calculation). The calculation inside `offset_of` only works for standard layout types. `std::tuple` is standard layout so it will work for all tuples. A standard layout type is still standard layout even if it contains non-standard layout members. I'd recommend using `offset_of` because it's not undefined behavior. – Indiana Kernick Mar 11 '19 at 09:43
  • Thanks. I checked but in gcc, `std::tuple` does not have standard layout. Undefined behavior is what scares me, and why I did not pick an implementation yet. – maufl Mar 11 '19 at 22:12
  • @maufl I just checked with clang and you're right. Damn! – Indiana Kernick Mar 11 '19 at 22:21
  • @maufl Maybe you could find a way of implementing a tuple that is standard layout. One idea would be to store an array of bytes and use offset_of and placement new. That would be standard layout. – Indiana Kernick Mar 11 '19 at 22:23
  • @maufl I just had a look at the libc++ implementation of std::tuple. It's not standard layout because it uses multiple inheritance and stores an element in each of the base classes. The order that base classes are laid out in an object is undefined so std::tuple cannot be standard layout. Your record struct suffers the same problem so it isn't standard layout either. – Indiana Kernick Mar 11 '19 at 22:34
  • @maufl I implemented a tuple that stores a char array and accesses it with `offset_of`. This is undefined behavior though. In my opinion, the standard is wrong. `real_offset` will work on every platform using any data type but it's undefined behavior. Just use `real_offset` (maybe give it a different name!). In most cases, compilers will just do "the sensible thing" when they encounter undefined behavior. In the case of `real_offset`, every compiler will just return the offset of the member without doing any funny business. – Indiana Kernick Mar 11 '19 at 23:33
1

I'm not sure I understand exactly what you want, but... what about recursion based on an index sequence (available from C++14), something like the following?

#include <cstddef>
#include <iostream>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

template <typename... members>
struct record
 { };

template <typename member, typename... members>
struct record<member, members...> : record<members...>
 {
   record (member m, members... ms) : record<members...>(ms...), tail(m)
    { }

   member tail;
 };

template <std::size_t, typename, std::size_t = 0u>
struct get_offset;

template <std::size_t N, typename A0, typename ... As, std::size_t Off>
struct get_offset<N, record<A0, As...>, Off> 
   : public get_offset<N-1u, record<As...>, Off+sizeof(A0)>
 { };

template <typename A0, typename ... As, std::size_t Off>
struct get_offset<0u, record<A0, As...>, Off> 
   : public std::integral_constant<std::size_t, Off>
 { };

template <typename... Args, std::size_t ... Is>
auto make_table_helper (std::string const & name,
                        std::vector<record<Args...>> const & data,
                        std::index_sequence<Is...> const &)
 { return std::vector<std::size_t>{ get_offset<Is, record<Args...>>::value... }; }

template <typename... Args>
auto make_table (std::string const & name,
                 std::vector<record<Args...>> const & data)
 { return make_table_helper(name, data, std::index_sequence_for<Args...>{}); }

int main ()
 {
   std::vector<record<int, float>> table = { {1, 1.0}, {2, 2.0} };

   auto v = make_table("table", table);

   for ( auto const & o : v )
      std::cout << o << ' ';

   std::cout << std::endl;
 }

Unfortunately, this isn't an efficient solution because the last value is calculated n times.
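A possible way around the repeated work is to compute all offsets in a single constexpr pass, which also makes it easy to account for alignment. The sketch below is a different technique from the recursion above, and sequential_offsets is a hypothetical name. It assumes C++17 and describes members placed in template-argument order; with the record struct above, common ABIs put base-class subobjects first, so the real member order may well be reversed.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: offsets of Ts... placed in declaration order,
// each rounded up to its own alignment, computed in one pass.
template <typename... Ts>
constexpr std::array<std::size_t, sizeof...(Ts)> sequential_offsets() {
  constexpr std::size_t n = sizeof...(Ts);
  // One extra slot keeps the arrays non-empty when the pack is empty.
  constexpr std::size_t sizes[n + 1] = {sizeof(Ts)..., 0};
  constexpr std::size_t aligns[n + 1] = {alignof(Ts)..., 1};
  std::array<std::size_t, n> offsets{};
  std::size_t cursor = 0;
  for (std::size_t i = 0; i < n; ++i) {
    // Round the running position up to the alignment of element i.
    cursor = (cursor + aligns[i] - 1) / aligns[i] * aligns[i];
    offsets[i] = cursor;
    cursor += sizes[i];
  }
  return offsets;
}
```

Each offset is computed exactly once, and the result can be used in a constant expression or copied into the std::vector returned by make_table_helper.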

max66
  • This doesn't take into account the alignment though? If the type doesn't have standard layout, is there any guarantee of the offset at all? – maufl Mar 11 '19 at 09:21
  • @maufl - right; this was just to give an idea of the recursion part. For the alignment, instead of `Off+sizeof(A0)`, I suppose you can use something like `Off+roundup(A0, alignof(A0))` (see the `roundup()` in the Kerndog73 answer) but, even in that case, I'm not sure that you have any guarantee at all. – max66 Mar 11 '19 at 10:09