I have N large lists of different lengths, where each value in a list represents the signal over a fixed window of length 25. I.e., I take the average value of the signal every 25 seconds/bases/etc., and I store that value in a list.
I do this for different experiments/devices that run for different lengths of time (all multiples of 25, by the way).
I.e., list1 is a 1000 run, with 1000/25 = 40 values in list1; list2 is a 1025 run, with 1025/25 = 41 values in list2; list3 is a 2525 run, with 2525/25 = 101 values in list3; etc.
Now, for the sake of comparison, I'd like to rescale each list to the same number of bins, let us say 40 bins.
list1resized will have length 40 and its values will not change, since 1000/40 = 25 exactly. list2resized would go from 41 values to 40 values, and list3resized would go from 101 values to 40 values (so that all lists end up the same size).
And here comes the question. How would I resize each list to a fixed length of 40 by taking the weighted averages over the appropriate bins?
An example will clarify the question.
list1 = [4.8, 6.9, ...] #40 values for the 1000 run
list2 = [5.6, 7.8, 8.9, 13.4, ...] #41 values for the 1025 run
list3 = [4.1, 5.6, 10.3, 9.8, 40, 30, 21.4, 3, 2,...] #101 values for the 2525 run
Now, the resized lists should look like:
list1resized = [4.8*25/25, 6.9*25/25,...] #40 values for the 1000 run
list2resized = [(5.6*25+7.8*0.625)/25.625, (7.8*24.375+8.9*1.25)/25.625, (8.9*23.75+13.4*1.875)/25.625,...] # 40 values, averaged accordingly, for the 1025 run
list3resized = [(4.1*25+5.6*25+10.3*13.125)/(63.125), (10.3*11.875+9.8*25+40*25+30*1.25)/(63.125),...] # 40 values, averaged accordingly, for the 2525 run
To obtain each element of a resized list, I took the weighted average over the new, resized bins (i.e., bins of width 1000/40 = 25 for list1, 1025/40 = 25.625 for list2, 2525/40 = 63.125 for list3, etc.), weighting each original value by the length of its overlap with the new bin. Here are the same lists again, this time showing how I derived the weights:
list1resized = [4.8*25/25, 6.9*25/25,...] #40 values for the 1000 run
list2resized = [(5.6*25+7.8*0.625)/25.625, (7.8*24.375+8.9*(25.625-24.375))/25.625, (8.9*23.75+13.4*(25.625-23.75))/25.625,...] # 40 values, averaged accordingly, for the 1025 run
list3resized = [(4.1*25+5.6*25+10.3*13.125)/63.125, (10.3*(25-13.125)+9.8*25+40*25+30*(63.125-25*3+13.125))/63.125,...] # 40 values, averaged accordingly, for the 2525 run
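To make the target computation concrete, here is a minimal sketch of what I am after, assuming NumPy is acceptable. The function name `rebin_mean` and its parameters are made up for illustration. It treats each list as a piecewise-constant signal, integrates it cumulatively, reads that integral off at the new bin edges, and divides by the new bin width, which should give the same overlap-weighted averages as the formulas above.

```python
import numpy as np

def rebin_mean(values, n_bins, width=25.0):
    """Overlap-weighted average of fixed-width bins onto n_bins equal bins.

    `values` holds one average per original window of length `width`
    (hypothetical helper, not taken from any library).
    """
    values = np.asarray(values, dtype=float)
    # Edges of the original windows: 0, 25, 50, ...
    old_edges = np.arange(len(values) + 1) * width
    # Integral of the piecewise-constant signal at each original edge.
    cumulative = np.concatenate(([0.0], np.cumsum(values * width)))
    # Edges of the new bins, covering the same total run length.
    new_edges = np.linspace(0.0, old_edges[-1], n_bins + 1)
    # The integral is piecewise linear, so linear interpolation is exact.
    cumulative_at_new = np.interp(new_edges, old_edges, cumulative)
    # Signal accumulated in each new bin, divided by the new bin width.
    return np.diff(cumulative_at_new) / np.diff(new_edges)
```

With this sketch, `rebin_mean(list2, 40)` returns 40 values for the 41-value list, and `rebin_mean(list1, 40)` returns list1 unchanged (up to floating-point rounding).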
As you can see, this gets messy and hard to deal with by hand, so I am looking for a pythonic, elegant and fast solution to the problem.
I have to do this for many lists, many times, so it would be nice to take run time into consideration.
Not sure if you have any ideas, but help would be greatly appreciated.
Thanks.