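Use itertools.groupby() to group together the entries for a month, and reduce() to add up the numbers. Note that groupby() only groups consecutive entries with the same key, so the data must already be ordered by month, as it is here. For example: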
import itertools
from functools import reduce  # reduce() is a builtin in Python 2; Python 3 needs this import
ddat = [['2012-01', 1, 5.4], ['2012-01', 2, 8.1], ['2012-01', 3, 10.8],
        ['2012-01', 4, 13.5], ['2012-02', 1, 8.1], ['2012-02', 2, 10.8],
        ['2012-02', 3, 13.5], ['2012-02', 4, 16.2], ['2012-03', 1, 10.8],
        ['2012-03', 2, 13.5], ['2012-03', 3, 16.2], ['2012-03', 4, 18.9],
        ['2012-04', 1, 13.5], ['2012-04', 2, 16.2], ['2012-04', 3, 18.9]]
[[w[0], reduce(lambda x, y: x + y[1]*y[2], w[1], 0)] for w in itertools.groupby(ddat, key=lambda x: x[0])]
produces
[['2012-01', 108.0],
['2012-02', 135.0],
['2012-03', 162.0],
['2012-04', 102.6]]
Edit: The above only gets the numerator of the desired value. The code shown below computes both the numerator and the denominator. As demo code, it produces a list containing both the values and their ratio.
Note the apparently extra for in the following code (that is, the portion ... for w,v in [[w, list(v)] for w,v in itertools ... in the third line of code). The extra layer of for makes a copy of each group v as a list. Because the v returned by itertools.groupby() is a one-shot iterator rather than an actual list, the first call that sums over it (say numer_sum(v)) would exhaust v, so a later call such as denom_sum(v) would get a value of 0. Another approach would be to use itertools.tee, but an answer to another question says the list approach may be faster. A third possibility is to combine numer_sum and denom_sum into a single function that returns a tuple, and add an outer for to compute the ratio.
def numer_sum(w): return reduce(lambda x,y: x+y[1]*y[2], w, 0)
def denom_sum(w): return reduce(lambda x,y: x+y[2], w, 0)
[[w, round(denom_sum(v),3), numer_sum(v), numer_sum(v)/denom_sum(v)] for w,v in [[w, list(v)] for w,v in itertools.groupby(ddat, key=lambda x:x[0])]]
produces
[['2012-01', 37.8, 108.0, 2.857142857142857],
['2012-02', 48.6, 135.0, 2.777777777777778],
['2012-03', 59.4, 162.0, 2.7272727272727275],
['2012-04', 48.6, 102.6, 2.111111111111111]]
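For what it's worth, the third possibility mentioned above (one function returning both sums as a tuple) could be sketched roughly as follows. This is only an illustration, not tested against anything beyond the sample data: the names both_sums and result are made up here, the columns come out as numerator, denominator, ratio (without the rounding used above), and it assumes ddat is ordered by month.

```python
import itertools
from functools import reduce

# Same sample data as in the answer: [month, multiplier, value] rows.
ddat = [['2012-01', 1, 5.4], ['2012-01', 2, 8.1], ['2012-01', 3, 10.8],
        ['2012-01', 4, 13.5], ['2012-02', 1, 8.1], ['2012-02', 2, 10.8],
        ['2012-02', 3, 13.5], ['2012-02', 4, 16.2], ['2012-03', 1, 10.8],
        ['2012-03', 2, 13.5], ['2012-03', 3, 16.2], ['2012-03', 4, 18.9],
        ['2012-04', 1, 13.5], ['2012-04', 2, 16.2], ['2012-04', 3, 18.9]]

def both_sums(rows):
    """Hypothetical helper: return (numerator, denominator) in one pass."""
    return reduce(
        lambda acc, row: (acc[0] + row[1] * row[2], acc[1] + row[2]),
        rows,
        (0, 0),
    )

# The inner generator consumes each group exactly once, so no list() copy
# is needed; the outer comprehension unpacks the tuple and adds the ratio.
result = [
    [month, numer, denom, numer / denom]
    for month, (numer, denom) in (
        (month, both_sums(rows))
        for month, rows in itertools.groupby(ddat, key=lambda row: row[0])
    )
]
```

Since each group is only iterated once, the exhaustion problem described above does not come up. The ratios should agree with the ones listed above, up to floating-point rounding.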