Efficient combined min and mean computation of a vector

Question

I need both the minimum and the mean of the values in a vector.

I'm computing them separately with the following:

template <class T>
T Minimum(std::vector<T> & v){ return *min_element(begin(v), end(v)); }

template <class T>
T Mean(std::vector<T> & v)
{
    T sum = std::accumulate(v.begin(), v.end(), static_cast<T>(0));
    T mean = sum / v.size();
    return mean;
}

Both of these have to sweep the vector.

Is there a std efficient way to compute both minimum and mean of a vector sweeping it only once?

Nothing to do exactly this exists in the `` library. It is achievable if you create some comparator that does extra work, but I wouldn't. The library is not intended to be exhaustive though, but to provide a model for implementing your own algorithms in a simliar style, so I would write this new algorithm myself. It is pretty tiny anyway. — BoBTFish, Jun 17 '19 at 11:20
'Sweeping' vectors is extremely cheap, I wouldn't worry about this really and not try to optimize where optimization isn't necessary at all. — nada, Jun 17 '19 at 11:22
I would, as opposed to @BoBTFish, suggest creating a functor that would do the work. As you can see in [this answer](https://stackoverflow.com/a/28592394/7151494), the author uses `std::accumulate` with their own functor, which calculates the average. You can easily alter the example so it will store the min element and mean. By the way, they way you're computing `mean` suggests that you're interested in *average*, not a *mean*. — Fureeish, Jun 17 '19 at 11:23
I wouldn't be surprised to see a compiler fuse the loops if you call both of those in succession — Caleth, Jun 17 '19 at 11:24

Caleth · Answer 1 · 2019-06-17T12:47:44.953

8

Yes, you can accumulate the minimum and the sum in the same call. No, it probably won't be more efficient, nor will it be less efficient.

template <typename T>
std::pair<T, T> MeanAndMin(const std::vector<T> & v)
{
    auto zero = std::make_pair(static_cast<T>(0), std::numeric_limits<T>::max());
    auto plus = [](auto pair, auto elem) { return std::make_pair(pair.first + elem, std::min(pair.second, elem)); };
    auto res = std::accumulate(begin(v), end(v), zero, plus);
    res.first /= v.size();
    return res;
}

edited Jun 17 '19 at 12:47

answered Jun 17 '19 at 11:29

Caleth

52,200
2
44
75

Would be interesting to see a benchmark, so we'd know if this really is more efficient. – nada Jun 17 '19 at 12:14
@nada the [generated assembly](https://godbolt.org/z/hhsuUG) looks very similar too – Caleth Jun 17 '19 at 12:21
1

@nada Of course I dont know what OP really had in mind, but note that nobody asked for "more efficient". It is a matter of style, but imho when it can be done in one pass it should be done in one pass and I think the code much better expresses what it actually is supossed to do when it uses a single pass. Hence, being not less efficient is already fine. – 463035818_is_not_an_ai Jun 17 '19 at 12:36
Ok, it's not really more efficient. Both are about the same time-wise – WurmD Jun 17 '19 at 12:38
@WurmD I still fail to see the point in all of this. Why was having two functions not o.k.? – nada Jun 17 '19 at 12:42
@nada if you get told to fetch eggs and apples from the market, do you go to the market twice? You do not necesarily need to go once and you do not necessarily need to go twice, but given the task going there exactly once is the most natural – 463035818_is_not_an_ai Jun 17 '19 at 14:05
@formerlyknownas_463035818 Because in programming the compiler should worry about how many times to go to the market, not the developer. – nada Jun 18 '19 at 08:01
@nada I dont want to extend the discussion in comments, so just one last remark: code is not primarily for the compiler but for others to read, it expresses my intent (and the compiler cares for an efficient realization). It is not my intent to go to the market twice, hence if I explicitly write in the code "go to the market twice", the compiler can of course optimize it, but the code it not telling what I actually wanted to say – 463035818_is_not_an_ai Jun 18 '19 at 08:04
1

@formerlyknownas_463035818 And I was taught to write functions so that they solve exactly 1 problem. – nada Jun 18 '19 at 08:06

Fureeish · Answer 2 · 2019-06-17T11:35:46.430

You can use std::accumulate, paired with custom functors:

#include <iostream>
#include <vector>
#include <numeric>
#include <limits>
#include <algorithm>

struct average_and_min {
    int sum = 0;
    int min = std::numeric_limits<int>::max();
    std::size_t num_of_elements = 0;

    int get_sum() {
        return sum;
    }

    double get_average() {
        return static_cast<double>(sum) / num_of_elements;
    }

    int get_min() {
        return min;
    }
};

int main() {
    std::vector<int> vec = {1, 2, 5, 4, 4, 2, 4};

    auto func_accumulate = [](average_and_min acc, int value) {
        return average_and_min{acc.sum + value, std::min(acc.min, value), acc.num_of_elements + 1};
    };

    auto data = std::accumulate(vec.cbegin(), vec.cend(), average_and_min{}, func_accumulate);

    std::cout << "avg: " << data.get_average() << '\n'
              << "min: " << data.get_min() << '\n';
}

EDIT:

As @Caleth suggested in the comments, it might be a good idea not to use lambdas to combine your struct and the value - you can overload operator + inside average_and_min like so:

average_and_min operator + (int value) {
    return average_and_min{sum + value, std::min(min, value), num_of_elements + 1};
}

and the line

auto data = std::accumulate(vec.cbegin(), vec.cend(), average_and_min{}, func_accumulate);

can now become

auto data = std::accumulate(vec.cbegin(), vec.cend(), average_and_min{});

I'd pull `func_accumulate` out as `average_and_min operator+(average_and_min, int)` — Caleth, Jun 17 '19 at 11:31
@Caleth that's also a way to do it. Whether that's a replacement is debatable. Gonna add this as an edit. — Fureeish, Jun 17 '19 at 11:32

Efficient combined min and mean computation of a vector

2 Answers2