-1

Given the sorted list

item_list = [41, 53, 54, 57, 315, 324, 340]

How would I find the three values that have the lowest standard deviation from their own average? The correct answer for the example is these three values

answer = [53, 54, 57]

They have an average of 54,66 and a standard deviation of 1,55. Any other combination would yield a higher standard deviation.

All I can think of at this moment is simply iterate through available options and calculate each. Given this 7 item list that would be 7 * 6 * 5 = 210 options I believe. But since in my actual application the list is much longer, this would perhaps lead to issues.

Also, as the list size increases, I need to be able do the same for 4, 5, 6 etc. number of elements.

Edit:

It's not a duplicate I believe. There is no code example as I didn't think my nested implementation of this bit of pseudocode;

for item in item_list:
  # calculate average with all variations of 2 other elements
  for each average:
    # calculate standard deviation

Would be very useful, nor readable. I tried to present the minimal viable case, starting from scratch, and really had no idea where to go from there.

Eloque
  • 311
  • 2
  • 10
  • To be clear, you're trying to find the subset of the list which has the smallest standard deviation? – Patrick Michaelsen Sep 20 '17 at 21:58
  • Basically you want to minimize stdev(x0,x1,x2) subject to the constraint that x0,x1,x2 are elements of `item_list`, and no item in the list is repeated. – Mako212 Sep 20 '17 at 22:08
  • Perhaps there's some way to narrow down the number of choices to approximately N. What property is likely to be true of numbers in a sorted list, if the numbers are as close to each other as possible? Does that hold true for the example you gave here? – Kenny Ostrom Sep 20 '17 at 22:09
  • 1
    Iterate over the list, in threes, find sigma for each triad and store in a container, find the minimum of all the sigma's. – wwii Sep 20 '17 at 22:18
  • from what I can tell, in a sorted list, @wwii 's solution is a simple and effective solution – Patrick Michaelsen Sep 20 '17 at 22:18
  • Welcome to SO. Unfortunately this isn't a code writing service or a study group. Please read [ask] and [mcve]. – wwii Sep 20 '17 at 22:19
  • Possible duplicate of [Iterate over n successive elements of list (with overlapping)](https://stackoverflow.com/q/38151445/2823755.) – wwii Sep 20 '17 at 22:31
  • Possible duplicate of [Making all possible combinations of a list in python](https://stackoverflow.com/q/8371887/2823755) depending on what you actually want. – wwii Sep 20 '17 at 22:37
  • - updated original post – Eloque Sep 21 '17 at 09:07
  • @Eloque do either of these answers [answer your question](https://stackoverflow.com/help/someone-answers)? – Brad Solomon Oct 04 '17 at 17:45

2 Answers2

1

Assuming a sorted list and a function pstdev which finds the population std for a list:

list = [41, 53, 54, 57, 315, 324, 340]
size = 3
subset = list[:size]
minstd = pstdev(list[:size])
for i in range(size, len(list)):
    std = pstdev(list[i-size:i])
    if(std < minstd):
        minstd = std
        subset = list[i-size:i]

print(subset)
print(minstd)

This is O(n), which I believe, is the best you can do for this no matter what tricks you try.

Std without numpy sourced from: https://stackoverflow.com/a/27758326/5013193

1

Here's an implementation with NumPy and itertools.

import itertools
import numpy as np

item_list = [41, 53, 54, 57, 315, 324, 340]
combos = np.array(list(itertools.combinations(item_list, 3)))
s = combos.std(axis=1).argmin()
print(combos[s].tolist())
# [53, 54, 57]
Brad Solomon
  • 38,521
  • 31
  • 149
  • 235