124

Is there is any difference between using a std::tuple and a data-only struct?

typedef std::tuple<int, double, bool> foo_t;

struct bar_t {
    int id;
    double value;
    bool dirty;
}

From what I have found online, I found that there are two major differences: the struct is more readable, while the tuple has many generic functions that can be used. Should there be any significant performance difference? Also, is the data layout compatible with each other (interchangeably casted)?

Alex Koay
  • 2,915
  • 4
  • 18
  • 16
  • 1
    I just remarked I had forgotten about the *cast* question: the implementation of the `tuple` is implementation defined, therefore it depends on your implementation. Personally, I would *not* count on it. – Matthieu M. Jan 12 '13 at 18:29
  • I'm looking for a way to solve the following problem: https://stackoverflow.com/questions/888531/enumerate-members-of-a-structure – Maf Oct 11 '22 at 16:13

13 Answers13

49

We have a similar discussion about tuple and struct and I write some simple benchmarks with the help from one of my colleague to identify the differences in term of performance between tuple and struct. We first start with a default struct and a tuple.

struct StructData {
    int X;
    int Y;
    double Cost;
    std::string Label;

    bool operator==(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) == std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }

    bool operator<(const StructData &rhs) {
        return X < rhs.X || (X == rhs.X && (Y < rhs.Y || (Y == rhs.Y && (Cost < rhs.Cost || (Cost == rhs.Cost && Label < rhs.Label)))));
    }
};

using TupleData = std::tuple<int, int, double, std::string>;

We then use Celero to compare the performance of our simple struct and tuple. Below is the benchmark code and performance results collected using gcc-4.9.2 and clang-4.0.0:

std::vector<StructData> test_struct_data(const size_t N) {
    std::vector<StructData> data(N);
    std::transform(data.begin(), data.end(), data.begin(), [N](auto item) {
        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<> dis(0, N);
        item.X = dis(gen);
        item.Y = dis(gen);
        item.Cost = item.X * item.Y;
        item.Label = std::to_string(item.Cost);
        return item;
    });
    return data;
}

std::vector<TupleData> test_tuple_data(const std::vector<StructData> &input) {
    std::vector<TupleData> data(input.size());
    std::transform(input.cbegin(), input.cend(), data.begin(),
                   [](auto item) { return std::tie(item.X, item.Y, item.Cost, item.Label); });
    return data;
}

constexpr int NumberOfSamples = 10;
constexpr int NumberOfIterations = 5;
constexpr size_t N = 1000000;
auto const sdata = test_struct_data(N);
auto const tdata = test_tuple_data(sdata);

CELERO_MAIN

BASELINE(Sort, struct, NumberOfSamples, NumberOfIterations) {
    std::vector<StructData> data(sdata.begin(), sdata.end());
    std::sort(data.begin(), data.end());
    // print(data);

}

BENCHMARK(Sort, tuple, NumberOfSamples, NumberOfIterations) {
    std::vector<TupleData> data(tdata.begin(), tdata.end());
    std::sort(data.begin(), data.end());
    // print(data);
}

Performance results collected with clang-4.0.0

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    196663.40000 |            5.08 | 
Sort            | tuple           | Null            |              10 |               5 |         0.92471 |    181857.20000 |            5.50 | 
Complete.

And performance results collected using gcc-4.9.2

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    219096.00000 |            4.56 | 
Sort            | tuple           | Null            |              10 |               5 |         0.91463 |    200391.80000 |            4.99 | 
Complete.

From the above results we can clearly see that

  • Tuple is faster than a default struct

  • Binary produce by clang has higher performance that that of gcc. clang-vs-gcc is not the purpose of this discussion so I won't dive into the detail.

We all know that writing a == or < or > operator for every single struct definition will be a painful and buggy task. Let replace our custom comparator using std::tie and rerun our benchmark.

bool operator<(const StructData &rhs) {
    return std::tie(X,Y,Cost, Label) < std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
}

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    200508.20000 |            4.99 | 
Sort            | tuple           | Null            |              10 |               5 |         0.90033 |    180523.80000 |            5.54 | 
Complete.

Now we can see that using std::tie makes our code more elegant and it is harder to make mistake, however, we will loose about 1% performance. I will stay with the std::tie solution for now since I also receive a warning about comparing floating point numbers with the customized comparator.

Until now we have not has any solution to make our struct code run faster yet. Let take a look at the swap function and rewrite it to see if we can gain any performance:

struct StructData {
    int X;
    int Y;
    double Cost;
    std::string Label;

    bool operator==(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) == std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }

    void swap(StructData & other)
    {
        std::swap(X, other.X);
        std::swap(Y, other.Y);
        std::swap(Cost, other.Cost);
        std::swap(Label, other.Label);
    }  

    bool operator<(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) < std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }
};

Performance results collected using clang-4.0.0

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    176308.80000 |            5.67 | 
Sort            | tuple           | Null            |              10 |               5 |         1.02699 |    181067.60000 |            5.52 | 
Complete.

And the performance results collected using gcc-4.9.2

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    198844.80000 |            5.03 | 
Sort            | tuple           | Null            |              10 |               5 |         1.00601 |    200039.80000 |            5.00 | 
Complete.

Now our struct is slightly faster than that of a tuple now (around 3% with clang and less than 1% with gcc), however, we do need to write our customized swap function for all of our structs.

hungptit
  • 1,414
  • 15
  • 16
  • It would be a better test if you add elementwise swap for your struct since swap(tuple, tuple) used in std::sort is implemented using elementwise swap – erzya Jul 09 '21 at 05:38
26

If you're using several different tuples in your code you can get away with condensing the number of functors you are using. I say this because I've often used the following forms of functors:

template<int N>
struct tuple_less{
    template<typename Tuple>
    bool operator()(const Tuple& aLeft, const Tuple& aRight) const{
        typedef typename boost::tuples::element<N, Tuple>::type value_type;
        BOOST_CONCEPT_REQUIRES((boost::LessThanComparable<value_type>));

        return boost::tuples::get<N>(aLeft) < boost::tuples::get<N>(aRight);
    }
};

This might seem like overkill but for each place within the struct I'd have to make a whole new functor object using a struct but for a tuple, I just change N. Better than that, I can do this for every single tuple as opposed to creating a whole new functor for each struct and for each member variable. If I have N structs with M member variables that NxM functors I would need to create (worse case scenario) that can be condensed down to one little bit of code.

Naturally, if you're going to go with the Tuple way, you're also going to need to create Enums for working with them:

typedef boost::tuples::tuple<double,double,double> JackPot;
enum JackPotIndex{
    MAX_POT,
    CURRENT_POT,
    MIN_POT
};

and boom, you're code is completely readable:

double guessWhatThisIs = boost::tuples::get<CURRENT_POT>(someJackPotTuple);

because it describes itself when you want to get the items contained within it.

wheaties
  • 35,646
  • 15
  • 94
  • 131
  • 8
    Uh... C++ has functions pointers, so `template struct struct_less { template bool operator()(C const&, C const&) const; };` should be possible. Spelling it out is slightly less convenient, but it's only written once. – Matthieu M. Aug 19 '15 at 18:38
18

Tuple has built in default(for == and != it compares every element, for <.<=... compares first, if same compares second...) comparators: http://en.cppreference.com/w/cpp/utility/tuple/operator_cmp

edit: as noted in the comment C++20 spaceship operator gives you a way to specify this functionality with one(ugly, but still just one) line of code.

NoSenseEtAl
  • 28,205
  • 28
  • 128
  • 277
  • 1
    In C++20, this remedied with minimal boilerplate using [the spaceship operator](https://devblogs.microsoft.com/cppblog/simplify-your-code-with-rocket-science-c20s-spaceship-operator/). – John McFarlane Sep 25 '20 at 06:06
7

Well, here's a benchmark that doesn't construct a bunch of tuples inside the struct operator==(). Turns out there's a pretty significant performance impact from using tuple, as one would expect given that there's no performance impact at all from using PODs. (The address resolver finds the value in the instruction pipeline before the logic unit ever even sees it.)

Common results from running this on my machine with VS2015CE using the default 'Release' settings:

Structs took 0.0814905 seconds.
Tuples took 0.282463 seconds.

Please monkey with it until you're satisfied.

#include <iostream>
#include <string>
#include <tuple>
#include <vector>
#include <random>
#include <chrono>
#include <algorithm>

class Timer {
public:
  Timer() { reset(); }
  void reset() { start = now(); }

  double getElapsedSeconds() {
    std::chrono::duration<double> seconds = now() - start;
    return seconds.count();
  }

private:
  static std::chrono::time_point<std::chrono::high_resolution_clock> now() {
    return std::chrono::high_resolution_clock::now();
  }

  std::chrono::time_point<std::chrono::high_resolution_clock> start;

};

struct ST {
  int X;
  int Y;
  double Cost;
  std::string Label;

  bool operator==(const ST &rhs) {
    return
      (X == rhs.X) &&
      (Y == rhs.Y) &&
      (Cost == rhs.Cost) &&
      (Label == rhs.Label);
  }

  bool operator<(const ST &rhs) {
    if(X > rhs.X) { return false; }
    if(Y > rhs.Y) { return false; }
    if(Cost > rhs.Cost) { return false; }
    if(Label >= rhs.Label) { return false; }
    return true;
  }
};

using TP = std::tuple<int, int, double, std::string>;

std::pair<std::vector<ST>, std::vector<TP>> generate() {
  std::mt19937 mt(std::random_device{}());
  std::uniform_int_distribution<int> dist;

  constexpr size_t SZ = 1000000;

  std::pair<std::vector<ST>, std::vector<TP>> p;
  auto& s = p.first;
  auto& d = p.second;
  s.reserve(SZ);
  d.reserve(SZ);

  for(size_t i = 0; i < SZ; i++) {
    s.emplace_back();
    auto& sb = s.back();
    sb.X = dist(mt);
    sb.Y = dist(mt);
    sb.Cost = sb.X * sb.Y;
    sb.Label = std::to_string(sb.Cost);

    d.emplace_back(std::tie(sb.X, sb.Y, sb.Cost, sb.Label));
  }

  return p;
}

int main() {
  Timer timer;

  auto p = generate();
  auto& structs = p.first;
  auto& tuples = p.second;

  timer.reset();
  std::sort(structs.begin(), structs.end());
  double stSecs = timer.getElapsedSeconds();

  timer.reset();
  std::sort(tuples.begin(), tuples.end());
  double tpSecs = timer.getElapsedSeconds();

  std::cout << "Structs took " << stSecs << " seconds.\nTuples took " << tpSecs << " seconds.\n";

  std::cin.get();
}
Khatharr
  • 119
  • 2
  • 8
  • 1
    Thanks for this. I noticed that when optimised with `-O3`, `tuples` took less time than `structs`. – Simog Apr 13 '20 at 00:27
  • Yes, I added the edit. The tuples are almost 7 times faster. https://godbolt.org/z/h3eaEPv8q – Amin Ya Apr 28 '21 at 07:10
7

Also, is the data layout compatible with each other (interchangeably casted)?

Oddly I can't see a direct response to this part of the question.

The answer is: no. Or at least not reliably, as the layout of the tuple is unspecified.

Firstly, your struct is a Standard Layout Type. The ordering, padding and alignment of the members are well-defined by a combination of the standard and your platform ABI.

If a tuple was a standard layout type, and we knew the fields were laid out in the order the types are specified, we might have some confidence it would match the struct.

The tuple is normally implemented using inheritance, in one of two ways: the old Loki/Modern C++ Design recursive style, or the newer variadic style. Neither is a Standard Layout type, because both violate the following conditions:

  1. (prior to C++14)

    • has no base classes with non-static data members, or

    • has no non-static data members in the most derived class and at most one base class with non-static data members

  2. (for C++14 and later)

    • Has all non-static data members and bit-fields declared in the same class (either all in the derived or all in some base)

since each leaf base class contains a single tuple element (NB. a single-element tuple probably is a standard layout type, albeit not a very useful one). So, we know the standard does not guarantee the tuple has the same padding or alignment as the struct.

Additionally, it's worth noting that the older recursive-style tuple will generally lay out the data members in reverse order.

Anecdotally, it has sometimes worked in practice for some compilers and combinations of field types in the past (in one case, using recursive tuples, after reversing the field order). It definitely doesn't work reliably (across compilers, versions etc.) now, and was never guaranteed in the first place.

Useless
  • 64,155
  • 6
  • 88
  • 132
6

Judging by other answers, performance considerations are minimal at best.

So it really should come down to practicality, readability and maintainability. And struct is generally better because it creates types which are easier to read and to understand.

Sometimes, a std::tuple (or even std::pair) might be necessary to deal with code in a highly generic way. For example, some operations related to variadic parameter packs would be impossible without something like std::tuple. std::tie is a great example of when std::tuple can improve code (prior to C++20).

But anywhere you can use a struct, you probably should use a struct. It will bestow semantic meaning on the elements of your type. That's invaluable in understanding and using the type. In turn, this can help avoid silly mistakes:

// hard to get wrong; easy to understand
cat.arms = 0;
cat.legs = 4;

// easy to get wrong; hard to understand
std::get<0>(cat) = 0;
std::get<1>(cat) = 4;
John McFarlane
  • 5,528
  • 4
  • 34
  • 38
4

Well, a POD struct can often be (ab)used in low-level contiguous chunk reading and serializing. A tuple might be more optimized in certain situations and support more functions, as you said.

Use whatever is more appropriate for the situation, there ain't no general preference. I think (but I haven't benchmarked it) that performance differences won't be significant. The data layout is most likely not compatible and implementation specific.

orlp
  • 112,504
  • 36
  • 218
  • 315
4

Don't worry about speed or layout, that's nano-optimisation, and depends on the compiler, and there's never enough difference to influence your decision.

You use a struct for things that meaningfully belong together to form a whole.

You use a tuple for things that are together coincidentally. You can use a tuple spontaneously in your code.

gnasher729
  • 51,477
  • 5
  • 75
  • 98
3

My experience is that over time functionality starts to creep up on types (like POD structs) which used to be pure data holders. Things like certain modifications which shouldn't require inside knowledge of the data, maintaining invariants etc.

That is a good thing; it's the foundation of object orientation. It is the reason why C with classes was invented. Using pure data collections like tuples are not open to such logical extension; structs are. That's why I would almost always opt for structs.

Related is that like all "open data objects", tuples violate the information hiding paradigm. You cannot change that later without throwing out the tuple wholesale. With a struct, you can move gradually towards access functions.

Another issue is type safety and self-documenting code. If your function receives an object of type inbound_telegram or location_3D it's clear; if it receives a unsigned char * or tuple<double, double, double> it is not: the telegram could be outbound, and the tuple could be a translation instead of a location, or perhaps the minimum temperature readings from the long weekend. Yes, you can typedef to make intentions clear but that does not actually prevent you from passing temperatures.

These issues tend to become important in projects which exceed a certain size; the disadvantages of tuples and advantages of elaborate classes become not visible and indeed are an overhead in small projects. Starting with proper classes even for inconspicuous little data aggregates pays late dividends.

Of course one viable strategy would be to use a pure data holder as the underlying data provider for a class wrapper which provides operations on that data.

Peter - Reinstate Monica
  • 15,048
  • 4
  • 37
  • 62
3

As far as the "generic function" go, Boost.Fusion deserves some love... and especially BOOST_FUSION_ADAPT_STRUCT.

Ripping from the page: ABRACADBRA

namespace demo
{
    struct employee
    {
        std::string name;
        int age;
    };
}

// demo::employee is now a Fusion sequence
BOOST_FUSION_ADAPT_STRUCT(
    demo::employee
    (std::string, name)
    (int, age))

This means that all Fusion algorithms are now applicable to the struct demo::employee.


EDIT: Regarding the performance difference or layout compatibility, tuple's layout is implementation defined so not compatible (and thus you should not cast between either representation) and in general I would expect no difference performance-wise (at least in Release) thanks to the inlining of get<N>.

Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • 16
    I do not believe that this is the top voted answer. It doesn't even reply to the question. The question is about `tuple`s and `struct`s, not boost! – gsamaras Apr 28 '14 at 12:04
  • @G.Samaras: The question is about the difference between tuples and `struct`, and notably the abundance of algorithms to manipulate tuples against the absence of algorithms to manipulate structs (beginning by iterating over its fields). This answer shows that this gap can be bridged by using Boost.Fusion, bringing to `struct`s as many algorithms as there are on tuples. I added a small blurb on the exact two questions asked. – Matthieu M. Jul 25 '14 at 06:18
1

There shouldn't be a performance difference (even an insignificant one). At least in the normal case, they will result in the same memory layout. Nonetheless, casting between them probably isn't required to work (though I'd guess there's a pretty fair chance it normally will).

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • 4
    Actually I think there might be a small difference. A `struct` must allocate at least 1 byte for each subobject while I think that a `tuple` can get away with optimizing out the empty objects. Also, with regard to packing and alignment, it could be that tuples have more leeway. – Matthieu M. May 02 '11 at 07:00
0

I know it is an old theme, however I am now about to make a decision about part of my project: should I go the tuple-way or struct-way. After reading this thread I have some ideas.

  1. About the wheaties and the performance test: please note that you can usually use memcpy, memset and similar tricks for structs. This would make the performance MUCH better than for tuples.

  2. I see some advantages in tuples:

    • You can use tuples to return a collection of variables from function or method and decrease a number of types you use.
    • Based on the fact that tuple has predefined <,==,> operators you can also use tuple as a key in map or hash_map which is much more cost effective that struct where you need to implement these operators.

I have searched the web and eventually reached this page: https://arne-mertz.de/2017/03/smelly-pair-tuple/

Generally I agree with a final conclusion from above.

Peter - Reinstate Monica
  • 15,048
  • 4
  • 37
  • 62
Tom K
  • 1
-2

There is no burden of compatible C memory layout, etc., which is more conducive to optimization.

sunxing
  • 15
  • 6
  • 1
    As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Dec 30 '21 at 09:17