1

I have seen some similar questions, like this, this and this, however they all are quite old and might be obsolete.

It is 2023 and the latest C++ standard is C++20 which was published in 2020, and different standards of C++ are very different.

So in C++20 are there library methods for set difference, intersection and union of unordered_set that was included using #include <unordered_set>?

I know in Python, if a and b are sets, a & b is the intersection, a - b is one way difference (elements in a that are not in b), a ^ b is the symmetric difference, a | b is the union. Are there similar methods in C++ that are more efficient than for loop based solutions?

I am reading this and there seems to be set_union, set_difference and set_intersection methods, but I don't know what those are.

Of course I know how to brute force it:

#include <iostream>
#include <unordered_set>

using intSet = std::unordered_set<int>;

void print_set(const intSet& set) {
    std::cout << "{";
    for (int i : set) {
        std::cout << i << ", ";
    }
    std::cout << "}\n";
}

int main()
{
    intSet a = { 16, 1, 14, 7, 18, 5, 12, 19 };
    intSet b = { 0, 9, 1, 8, 19, 2, 18, 13 };
    intSet intersection_set;
    intSet diff_set;
    intSet union_set;
    for (int i : a) {
        if (b.count(i)) {
            intersection_set.insert(i);
        } else {
            diff_set.insert(i);
        }
        union_set.insert(i);
    }
    for (int i : b) {
        union_set.insert(i);
    }
    std::cout << "a = ";
    print_set(a);
    std::cout << "b = ";
    print_set(b);
    std::cout << "a & b = ";
    print_set(intersection_set);
    std::cout << "a - b = ";
    print_set(diff_set);
    std::cout << "a | b = ";
    print_set(union_set);
}
a = {16, 1, 14, 7, 18, 5, 12, 19, }
b = {8, 0, 1, 9, 19, 18, 2, 13, }
a & b = {1, 18, 19, }
a - b = {16, 14, 7, 5, 12, }
a | b = {16, 1, 14, 7, 18, 5, 12, 19, 8, 0, 9, 2, 13, }
Jarod42
  • 203,559
  • 14
  • 181
  • 302
Ξένη Γήινος
  • 2,181
  • 1
  • 9
  • 35
  • 5
    The Python article just seems wrong, as in C++ an `unordered_set` is very different from a `set`. The set algorithms have a precondition that the sets are ordered, which the `unordered_set` obviously is not. – BoP Aug 25 '23 at 09:52
  • You say, *more efficient than for loop based solutions* ... but what makes you think, if such methods were implemented for **unordered** sets, that they wouldn't be (of necessity) loop-based? – Adrian Mole Aug 25 '23 at 11:08
  • 1
    You won't get much better than your "brute force" method. For the intersection, you can iterate over the smaller set, but that is all you can do when dealing with *unordered* sets. – Nelfeal Aug 25 '23 at 11:14
  • Maybe you could implement `&`, `|` and others for unordered (and even ordered) sets and offer them to the Standards Committee for release in C++26? – Adrian Mole Aug 25 '23 at 11:22

1 Answers1

4

The algorithms from <algorithm> aren't designed to work with std::unordered_set but with ordered ranges. For std::unordered_set, you can use its member functions in order to compute unions and differences, and std::erase_if for intersections.

In-place

#include <unordered_set>

std::unordered_set<int> a, b; // TODO: fill with data

// a | b  (set union)
a.insert(b.begin(), b.end());

// a - b  (set difference)
a.erase(b.begin(), b.end());

// a & b  (set intersection)
std::erase_if(a, [&b](int e) {
    return !b.contains(e);
});

Non-modifying

#include <unordered_set>
#include <ranges>

const std::unordered_set<int> a, b; // TODO: fill with data

// a | b  (set union)
auto ab = {a, b};
auto joined = ab | std::views::join;
std::unordered_set<int> u(joined.begin(), joined.end());

// a - b  (set difference)
auto diff = a | std::views::filter([&b](int e) {
    return !b.contains(e);
});
std::unordered_set<int> d(diff.begin(), diff.end());

// a & b  (set intersection)
auto isect = a | std::views::filter([&b](int e) {
    return b.contains(e);
});
std::unordered_set<int> i(isect.begin(), isect.end());

Note: these declarations can be simplified in C++23 with std::from_range.

Ξένη Γήινος
  • 2,181
  • 1
  • 9
  • 35
Jan Schultke
  • 17,446
  • 6
  • 47
  • 96
  • Your methods are indeed better than mine, however I don't want to mutate the original sets inplace, but I want rather a copy. How to do that with `std::copy_if`? – Ξένη Γήινος Aug 25 '23 at 13:07
  • @ΞένηΓήινος I've added non-modifying versions to the answer too. It's generally best to use constructors whenever possible instead of algorithms. `std::copy_if` with `std::inserter` could technically be used as well, but it's not necessary for any of these operations. – Jan Schultke Aug 25 '23 at 13:14