17

Is there no function in the standard library like this?

set<T> set::union(set<T> other)

Or even this?

set<T> getUnion(set<T> a, set<T> b)

set_union is the right function in name only. It can operate on vector also, which means it may not be as efficient as a set-only function.

I am not appending. Appending destroys the original set. I want a new set representing the union.

Chris Redford
  • 16,982
  • 21
  • 89
  • 109
  • @Dlotan I'm not trying to append. I want a new set representing the union. – Chris Redford May 02 '14 at 21:16
  • 2
    so someone there talked about: http://www.cplusplus.com/reference/algorithm/set_union/ – Florian Groetzner May 02 '14 at 21:17
  • @ChrisRedford See the second answer of the duplicate, `set_union` – Daniel Frey May 02 '14 at 21:17
  • @ChrisRedford When you append, you get a union. – juanchopanza May 02 '14 at 21:18
  • @DanielFrey I was aware of `set_union` before asking and it is the right function in name only. It can operate on vectors also, which means it may not be as efficient as a `set`-only function. – Chris Redford May 02 '14 at 21:19
  • @juanchopanza When you append, you destroy the original set. – Chris Redford May 02 '14 at 21:20
  • @ChrisRedford Don't underestimate the efficiency of [`set_union`](http://en.cppreference.com/w/cpp/algorithm/set_union), as it requires ***sorted*** ranges (as given by a `std::set`). It is, in fact, linear to the number of elements in both ranges, I don't see how you can hope for more. – Daniel Frey May 02 '14 at 21:24
  • @ChrisRedford Well then you copy the original into another one. – juanchopanza May 02 '14 at 21:24
  • @DanielFrey `set_union` has to insert into an output set, right? And these insertions are log(N). – juanchopanza May 02 '14 at 21:25
  • @juanchopanza Good point, but OP want a new `set` which is the union of both existing sets. Since the inserts are in-order, `set_union` might even use a hint to insert after `end()` in each case. I don't know if it can get any better than that, but my point is that `set_union` is not a stupid O(n^2) algorithm :) – Daniel Frey May 02 '14 at 21:29
  • @keyser that ignores the cost of inserting into the output container. – juanchopanza May 02 '14 at 21:30
  • @DanielFrey That is a good point. I might go and profile this if I find a spare moment. – juanchopanza May 02 '14 at 21:31
  • in c++ 17, you can get a merge method, which would be great https://en.cppreference.com/w/cpp/container/set/merge . But the Complexity N*log(size()+N)), where N is source.size() , is the same as you just insert it – Y00 Sep 15 '19 at 21:56

2 Answers2

28

You can use the two-iterator std::set::insert template for this:

template <typename T>
std::set<T> getUnion(const std::set<T>& a, const std::set<T>& b)
{
  std::set<T> result = a;
  result.insert(b.begin(), b.end());
  return result;
}

Note: Following some of the comments suggesting I take one of the parameters by value because I need a copy anyway, I chose this implementation to avoid disallowing RVO, which is not allowed when returning parameter taken by value. To better deal with rvalue arguments, overloads of this function taking rvalue reverences and leveraging move semantics could be provided.

juanchopanza
  • 223,364
  • 34
  • 402
  • 480
  • 1
    wouldn't `std::set_union` be more efficient? – sehe May 02 '14 at 21:18
  • 5
    Pro tip: when you're passing a const reference to a function then immediately taking a copy of it, just pass by value instead. – Mark Ransom May 02 '14 at 21:19
  • 2
    @MarkRansom I am returning the copy, and I don't want to inhibit RVO. Copying the argument and returning it would inhibit it. So I don't think the pro tip is a good one in this case. – juanchopanza May 02 '14 at 21:20
  • @sehe I'm not sure it is. It is linear in comparisons, but each set insertion is log(N). Or did I miss something? – juanchopanza May 02 '14 at 21:23
  • @juanchopanza `insert` is guaranteed to attempt insertion, so it could be considerably more costly if you did e.g. `b = a; a.insert(b.begin(), b.end());` (it's likely up to QoI though) – sehe May 02 '14 at 21:24
  • @juanchopanza why it inhibit RVO ? use some thing like this : http://paste.ofcode.org/uhGfDnXeum8HMwc65dM5Qn – uchar May 02 '14 at 21:25
  • @omid Yes, that inhibits RVO. Otherwise I would have suggested it in my answer. – juanchopanza May 02 '14 at 21:26
  • @omid but make `b` by `const&`. – sehe May 02 '14 at 21:27
  • @omid See http://stackoverflow.com/questions/9444485/why-is-rvo-disallowed-when-returning-a-parameter – juanchopanza May 02 '14 at 21:35
  • 1
    @juanchopanza, thanks for the link, I was curious too. Just goes to show that you can't rely on *any* rules of thumb. – Mark Ransom May 02 '14 at 22:57
  • @juanchopanza each set insertion is O(log(N)) where N is the current size, but this is repeated b.size() times. According to cppreference.com, this insert is "often implemented as a loop that calls the overload (3) with end() as the hint; they are optimized for appending a sorted sequence (such as another set) whose smallest element is greater than the last". Using the position after the last inserted element would be a better choice for the hint, since it would perform just as well in the given case while also making your usage linear rather than NlogN. – cxseven May 05 '20 at 04:28
1

There's std::set_union.

The example from that page uses vectors and arrays, so it's pretty versatile:

// set_union example
#include <iostream>     // std::cout
#include <algorithm>    // std::set_union, std::sort
#include <vector>       // std::vector

int main () {
  int first[] = {5,10,15,20,25};
  int second[] = {50,40,30,20,10};
  std::vector<int> v(10);                      // 0  0  0  0  0  0  0  0  0  0
  std::vector<int>::iterator it;

  std::sort (first,first+5);     //  5 10 15 20 25
  std::sort (second,second+5);   // 10 20 30 40 50

  it=std::set_union (first, first+5, second, second+5, v.begin());
                                               // 5 10 15 20 25 30 40 50  0  0
  v.resize(it-v.begin());                      // 5 10 15 20 25 30 40 50

  std::cout << "The union has " << (v.size()) << " elements:\n";
  for (it=v.begin(); it!=v.end(); ++it)
    std::cout << ' ' << *it;
  std::cout << '\n';

  return 0;
}

Output:

The union has 8 elements:
 5 10 15 20 25 30 40 50
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57
keyser
  • 18,829
  • 16
  • 59
  • 101