13

Consider three values x, y, z.

What would be the formula to get the mid value (not the mean value but the value which is neither the min nor the max)?

const double min = std::min(x, std::min(y, z));
const double mid = /* what formula here ? */
const double max = std::max(x, std::max(y, z));
Vincent
  • 57,703
  • 61
  • 205
  • 388

6 Answers6

13

To find all three at once in a symmetrical fashion:

min = x; med = y; max = z;

if (min > med) std::swap(min, med);
if (med > max) std::swap(med, max);
if (min > med) std::swap(min, med);
n. m. could be an AI
  • 112,515
  • 14
  • 128
  • 243
  • Isn't this a kind of Bubble sort? – Abhishek Bansal May 04 '14 at 11:06
  • @AbhishekBansal this is indeed bubble sort. – n. m. could be an AI May 04 '14 at 11:08
  • @OliCharlesworth: I know right. I hate to ask this: so what? I cannot say that it will not work in C? What do you think "just saying" means? ;) – László Papp May 04 '14 at 12:13
  • @LaszloPapp: Just not sure what the purpose of your previous comment was, that's all (this code won't work in Java either!). (It could easily be seen as a mis-read of the tags, for example.) – Oliver Charlesworth May 04 '14 at 12:15
  • @OliCharlesworth: "just saying" usually means it is not a biggie and the person is aware of the circumstances. Comparing Java to C in C++ context when talking about compatibility is odd, I must say that, especially given that my answer indicated the intent of that one hour ago. Fwiw, I even upvoted the answer (as well as question) regardless I had an alternative answer. – László Papp May 04 '14 at 12:18
13

The answer from this link shared in the comments:

const double mid = std::max(std::min(x,y),std::min(std::max(x,y),z));

Edit - As pointed out by Alan, I missed out on a case. I have given now a more intuitive proof.

Direct Proof: Without Loss of Generality with respect to x and y.
Starting with the innermost expression, min(max(x,y),z) ...

  1. If it returns z, we have found the relations: max(x,y) > z. Then the expression evaluates to max(min(x,y),z). Through this we are able to determine the relation between min(x,y) and z.
    If min(x,y) > z, then z is smaller than x and y both (as the relation becomes: max(x,y) > min(x,y) > z). Therefore the min(x,y) is indeed the median and max(min(x,y),z) returns that.
    If min(x,y) < z, then z is indeed the median (as min(x,y) < z < max(x,y)).

  2. If it returns x, then we have x < z and y < z. The expressions evaluates to: max(min(x,y),x). Since max(x,y) evaluated to x, min(x,y) evaluates to y. Getting the relation z > x > y. We return the max of x and y (as the expression becomes max(y,x)) which is x and also the median. (Note that the proof for y is symmetrical)

Proof Ends


Old Proof - Note it is NOT complete (Direct):

Without loss of generality: Assume x > y > z
Min of x and y is y. And min of (max of x and y) and z is z.
The max of y and z is y which is the median.

Assume x = y > z
Min of x and y say is x. And min of (max of x and y is x) and z is z.
Max of the above two is x, which is the median.

Assume x > y = z
Min of x and y is y. And min of (max of x and y is x) and z is z.
Max of the above two is y, which is the median.

Finally, assume x = y = z
Any of the three numbers will be the median., and the formula used will return some number.

Community
  • 1
  • 1
digvijay91
  • 697
  • 1
  • 6
  • 21
  • Yes, it works. However, it is not very readable and consequently would be difficult to maintain; at first glance it is not clear what is happening. In contrast, with my solution 'auto mid = median(x,y,z)', it is instantly recognizable that the function returns the median of the three values. – Ricky65 May 06 '14 at 12:02
  • I'm not entirely convinced by your proof. What about the case where x < y < z? (You say "without loss of generality" - but the expression is not symmetric in the three values.) However, I believe the code is correct. – Alan Stokes May 10 '14 at 13:39
  • @AlanStokes Good catch. I will look into it, and improve the answer. (possibly try to give a more elegant proof if possible). – digvijay91 May 10 '14 at 13:53
12

This seems like cheating, but: x + y + z - min - max

Alan Stokes
  • 18,815
  • 3
  • 45
  • 64
  • 2
    I like it. The only problem would be overflow. – keyser May 04 '14 at 10:52
  • @keyser: true, so it is worth posting a different answer, too, just in case. Also, this is not necessarily returning the variable, just the value. – László Papp May 04 '14 at 10:53
  • @Laszlo True (although arguably that's a misfeature of min and max). SInce we're not taking a reference to the answer it doesn't matter. – Alan Stokes May 04 '14 at 11:00
  • You can get med without mix and max, so their limitation is not necessaily relevant. There are at least two options as far as I am aware or not using min and max. – László Papp May 04 '14 at 12:04
4

It is a bit uglier than Alan's trick, but it cannot cause overflow, nor numerical errors, and so on:

int x, y, z, median;
...

if (x <= y && y <= z || y >= z && y <= x) median = y;
else if (y <= x && x <= z || x >= z && x <= y) median = x;
else median = z;

The algorithm is simply this:

  • check if x is between y and z, if yes, that is it.

  • check if y is between x and z, if yes, that is it.

  • It must be z since it was neither x, nor y.

=====================================================

You could also get this more flexibly if you have more than three elements, with sorting.

// or xor implementation, does not matter...

void myswap(int* a, int* b) { int temp = *b; *b = *a; *a = temp; }

int x, y, z;
// Initialize them
int min = x;
int med = y;
int max = z;

// you could also use std::swap here if it does not have to be C compatible
// In that case, you could just pass the variables without the address operator.
if (min > med) myswap(&min, &med);
if (med > max) myswap(&med, &max);
if (min > med) myswap(&min, &med);
László Papp
  • 51,870
  • 39
  • 111
  • 135
  • If you want all three (min, mid, max), just sorting the list is probably best. One comparison and swap to get the first two in order, then at most two tests to work out the order of the other two. – Alan Stokes May 04 '14 at 11:04
  • Yes, although that would be a specific solution, and this solution will also work in C unlike std::swap. :P – László Papp May 04 '14 at 11:05
  • True - although the general algorithm for finding the median (or nth_greatest) is based on quick sort, but skipping unnecessary work. – Alan Stokes May 04 '14 at 11:06
  • @AlanStokes: sure, I agree. This version is optimized for readability and compreheniveness, while that one is for efficiency and flexibility. There is no efficiency problem here and the OP seems to be asking about three values, however, so I think both are valid. But yeah, sorting works gently for more than 3 elements, too; just need to make sure about odd numbers. – László Papp May 04 '14 at 11:07
  • @AlanStokes: updated a swap sorting version (i.e. bubble sort) with C compatibility, just in case. :D – László Papp May 04 '14 at 11:16
  • I hate to quibble, but I don't believe C has references. – Alan Stokes May 04 '14 at 11:46
3

A variant of Alan's "cheat" that (kind of and sometimes) prevents overflow:

#include <iostream>
#include <algorithm>

using namespace std;
int main(int argc, char *argv[]) {
    double a = 1e308;
    double b = 6e306;
    double c = 7.5e18;

    double mn = min(a,min(b,c));
    double mx = max(a,max(b,c));
    double avg = mn + (mx-mn)*0.5;
    double mid = a - avg + b - avg + c;

    cout << mid << endl;
}

Output:

6e+306

It makes use of the avg-formula often used in binary search to prevent overflow:

The average of two values can be calculated as low + (high-low)/2

However, it only works for positive values. Possible fallbacks include Alan's answer, or simply (x+y)/2 for the avg calculation.

Note that double precision comes into play here, and may cause issues in the mid-calculation. It works really well for positive integers though :)

keyser
  • 18,829
  • 16
  • 59
  • 101
  • The problem of not accessing the middle variable is still in there. – László Papp May 04 '14 at 11:09
  • 2
    @LaszloPapp That wasn't part of the question. It was about finding the _value_ – keyser May 04 '14 at 11:25
  • No, it was not, but the readers can be made aware of the limitation of an algorithm. :) – László Papp May 04 '14 at 11:28
  • 2
    It's not a limitation, it's a specification :p Finding the variable is a completely separate issue according to me. I mean, it's wrong to call it a _problem_. – keyser May 04 '14 at 11:29
  • Note that Stack Overflow is used by many people and they might take algorithms from here into generic cases. Yes, you can say, "it is their problem", if they take it wrong, but preventing that is better IMHO. It is not to say, your answer is not usable, etc, of course. – László Papp May 04 '14 at 11:30
  • That is not what I said. And I really can't imagine any scenario where this code could be misused. Beginners don't usually assume that they have created magic references, quite the opposite actually :p – keyser May 04 '14 at 11:48
  • Sorry? You cannot imagine a place where users need one element of a set either through index or reference? Imagine harder. :) – László Papp May 04 '14 at 11:50
  • I cannot imagine how they would ever assume that _this_ code in any way attempts to find such a reference. It simply calculates a value, which is not in any way confusing. – keyser May 04 '14 at 11:51
  • I do not follow you. It is not about code, but algorithm, big difference! You can change your code in whatever way you want with this algorithm, but it will never be really good other than just getting the mere value... it has the limitation of not being able to return the index and/or reference. You only concentrate on reference, but it is the same limitation telling which index that is. One has to take that into account when reusing. It is as simple as that to me, no more, no less. – László Papp May 04 '14 at 11:53
  • It is obvious to me that you have misunderstood most of what I'm saying when you say things like _"getting the mere value"_ and _"limitation"_. I won't try to explain further as we seem to be stuck. – keyser May 04 '14 at 12:00
  • Your algorith is unflexible for such things, and if one wants such a thing, this algorithm could not be considered. It is up to the readers what use cases they have. I would have personally made it clear if I were you, as I also made the pro/cons in my answer. – László Papp May 04 '14 at 12:02
  • Yes, just like the "algorithm" for finding the mean of two numbers isn't useful for much more than finding the mean of two numbers. – keyser May 04 '14 at 12:03
  • Exactly the point. For a flexible algorithm, sorting is the bestest. – László Papp May 04 '14 at 12:03
  • So you're criticising math as a whole? You're [over-engineering](http://www.doxdesk.com/img/updates/20091116-so-large.gif). – keyser May 04 '14 at 12:05
  • No, and those are quite non-worthy comments to be honest with you. I am telling one disadvantage of the algorithm to make the readers available and hence providing better Q/A threads, and you take it as offense rather than admitting, and then you throw such non-worthy comments. I am off. – László Papp May 04 '14 at 12:12
  • I didn't take offense. I'm simply saying that you're overthinking it. You obviously took offense, and I'm guessing it's because I criticised your way of thinking. It's understandable. – keyser May 04 '14 at 12:15
  • I will refrain from improving the understanding and quality of your posts; pardon me. I do not need this "Only feedback is accepted positive" attitude. Even I wrote the limitation of my algos into the answer for clarity... – László Papp May 04 '14 at 12:16
  • I get negative feedback all the time, and that's mainly why I'm here. You're still thinking about _limitations_ when it's just a matter of math (and overflow). Your attitude is admirable, but not applicable here. You're right about the inflexibility, it's just not what we were talking about. The mean of a and b is (a+b)/2, and there's not much more to it. The use case you're referencing is simply off-topic. You're free to discuss it of course, but putting disclaimers on other answers is simply not necessary. I'm not sure who you imagine helping. – keyser May 04 '14 at 12:39
  • OK, nvm. Let us move on. It is not worth this argument. – László Papp May 04 '14 at 13:34
  • @Jarod42 You're right, it certainly does. Not much to do about here without switching types. – keyser May 04 '14 at 15:44
3

The best way to do this is with a generic median function template. No copying, swapping or mathematical operations are required.

template <typename T>
const T& median(const T& a, const T& b, const T& c)
{
    if (a < b)
        if (b < c)
            return b;
        else if (a < c)
            return c;
        else
            return a;
    else if (a < c)
        return a;
    else if (b < c)
        return c;
    else
        return b;
}
Ricky65
  • 1,657
  • 18
  • 22
  • "The best way" is actually unreadable with the if/elseif/else jungle for a non-performance critical code. :) I think it is better to be comprehensive in such cases than micro optimization. Also, this will not fly with more than three elements when you need to calculate the median for odd amount of numbers.. – László Papp May 04 '14 at 11:58
  • imo it is not unreadable; it is elegant. You've probably just glanced at it for a few seconds and immediately declared it unreadable without putting enough thought into it. Also, he's only looking for the median of three values. – Ricky65 May 04 '14 at 12:05
  • I was actually considering templates before, but then I decided not to mention them because IMHO it is more complex than needed. Also, I do not see any non-negligible performance gain out of it. It is basically the templatized version of my first solution with different if/elseif/else ordering, but I am not sure about the gain of template here compared to my non-template code. To me, it seems to make it more complex than necessary. One thing I agree about is that it is useful if you have different types to deal with, so it is flexible in that sense. – László Papp May 04 '14 at 12:07