
I have a list of lists like this:

list = [[year1-month1,int1,float1],[year1-month1,int2,float2],[year1-month2,int3,float3]....

I need to define a function that goes through it and returns a result like so:

newList = [[((int1*float1)+(int2*float2))/(float1+float2),year1-month1],...

My problem is that the list has over 2000 sublists. The first item of each sublist is a date in year-month format and the rest are values for individual days, and I need to get the monthly average. I tried a few things but couldn't get any of them to work. I would be grateful for some suggestions.

What I've tried is something like:

    def avPrice(mylist):
        month=[]
        i = 0
        for i in mylist:
            if mylist[i][0] not in month:
                month = mylist[i][0],mylist[i][1]*mylist[i][2],mylist[i][2]
            else:
                month = month[0],month[1]+(mylist[i][1]*mylist[line][2]),month[2]+mylist[i][2]
                i = i + 1
            return month
        monthAvPrice.append(month)
3m3k
    Can you edit your answer to include some of the things you've tried? Perhaps we can help you find a problem in your code. – Michael0x2a Dec 17 '12 at 01:15

3 Answers

1

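Use itertools.groupby() to group together the entries for a month, and reduce() to add up the numbers. Note that groupby() only groups consecutive entries with the same key, so the list must already be ordered by month (sort it first if it is not). For example: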

import itertools
from functools import reduce  # a builtin in Python 2, but must be imported in Python 3
ddat= [['2012-01', 1, 5.4], ['2012-01', 2, 8.1], ['2012-01', 3, 10.8],
['2012-01', 4, 13.5], ['2012-02', 1, 8.1], ['2012-02', 2,10.8],
['2012-02', 3, 13.5], ['2012-02', 4, 16.2], ['2012-03', 1, 10.8],
['2012-03', 2, 13.5], ['2012-03', 3, 16.2], ['2012-03', 4, 18.9],
['2012-04', 1, 13.5], ['2012-04', 2, 16.2], ['2012-04', 3,18.9]]

[[w[0], reduce(lambda x, y: x+y[1]*y[2], list(w[1]), 0)] for w in itertools.groupby(ddat, key=lambda x:x[0])]

produces

[['2012-01', 108.0],
 ['2012-02', 135.0],
 ['2012-03', 162.0],
 ['2012-04', 102.6]]
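
As a hedged illustration of the grouping step: itertools.groupby() only merges adjacent rows with equal keys, so input that is not ordered by month needs sorting first. The sketch below assumes zero-padded year-month strings (which sort chronologically as plain text); the `rows` sample and the `month_totals` name are made up for this example and are not part of the answer.

```python
import itertools
from operator import itemgetter

# Hypothetical sample with the months interleaved.
rows = [['2012-02', 1, 8.0], ['2012-01', 1, 5.0],
        ['2012-02', 2, 10.0], ['2012-01', 2, 8.0]]

def month_totals(data):
    # Sort by the year-month string so equal keys become adjacent,
    # then sum int*float within each group.
    ordered = sorted(data, key=itemgetter(0))
    return [[month, sum(n * x for _, n, x in group)]
            for month, group in itertools.groupby(ordered, key=itemgetter(0))]

totals = month_totals(rows)
```

Without the sorted() call, the same input would produce four groups here, one per run of equal keys.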

Edit: The above only gets the numerator of the desired value. The code shown below computes both the numerator and the denominator. As demo code, it produces a list containing both the values and their ratio.

Note the apparently extra for in the following code, that is, the portion
... for w,v in [[w, list(v)] for w,v in itertools ...
in the third line of code. The extra layer of for makes a copy of the iterable v as a list. Because the v returned by itertools.groupby() is an iterator rather than an actual list, numer_sum(v) would exhaust it, and denom_sum(v) would then get a value of 0. Another approach would be to use itertools.tee, but an answer to another question says the list approach may be faster. A third possibility is to combine numer_sum and denom_sum into a single function that returns a tuple, and add an outer for to compute the ratio.

def numer_sum(w): return reduce(lambda x,y: x+y[1]*y[2], w, 0)
def denom_sum(w): return reduce(lambda x,y: x+y[2], w, 0)
[[w, round(denom_sum(v),3), numer_sum(v), numer_sum(v)/denom_sum(v)] for w,v in [[w, list(v)] for w,v in itertools.groupby(ddat, key=lambda x:x[0])]]

produces

[['2012-01', 37.8, 108.0, 2.857142857142857],
 ['2012-02', 48.6, 135.0, 2.777777777777778],
 ['2012-03', 59.4, 162.0, 2.7272727272727275],
 ['2012-04', 48.6, 102.6, 2.111111111111111]]
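
The third possibility mentioned above (one function returning both sums as a tuple) might look roughly like the following. This is only a sketch: `both_sums` and `averages` are names invented here, the data is a shortened sample in the same shape as ddat, and the output order [average, month] is chosen to match the question's newList.

```python
import itertools
from functools import reduce

# Shortened sample in the same shape as ddat.
ddat = [['2012-01', 1, 5.4], ['2012-01', 2, 8.1],
        ['2012-02', 1, 8.1], ['2012-02', 2, 10.8]]

def both_sums(group):
    # Fold the group once, carrying (numerator, denominator) together,
    # so the groupby iterator never has to be copied into a list.
    return reduce(lambda acc, row: (acc[0] + row[1] * row[2], acc[1] + row[2]),
                  group, (0.0, 0.0))

averages = []
for month, group in itertools.groupby(ddat, key=lambda row: row[0]):
    numer, denom = both_sums(group)
    averages.append([numer / denom, month])
```

Because each group is consumed exactly once, neither the list(v) copy nor itertools.tee is needed in this variant.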
James Waldby - jwpat7
0

Here's what I've come up with.

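from functools import reduce  # a builtin in Python 2, but must be imported in Python 3

def appendDateNumbers(d, item):
    # Accumulate integer*floating per date in the dict d.
    def sumItem(date, integer, floating, *junk):
        if date in d:
            d[date]+=integer*floating
        else:
            d[date]=integer*floating
        return d
    return sumItem(*item)

def _averageListWith(dn, datesList):
    # Divide each date's total by the number of entries for that date.
    def averageItem(i):
        return (i, dn[i]/datesList.count(i))
    return dict(map(averageItem, dn.keys()))

def averageLst(lst):
    # Build the date list eagerly so .count() also works in Python 3.
    return _averageListWith(reduce(appendDateNumbers, lst, {}),
                            [x[0] for x in lst])

print(averageLst([["12-12", 1, 1.0],["12-12", 2, 2.2],["13-1", 3, 3.3]]))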

The averageLst() function should serve you plus or minus rounding errors.
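
For comparison, here is a rough sketch of the same dictionary-accumulation idea that also tracks the sum of the floats, so the result follows the formula in the question (sum of int*float divided by sum of float) rather than dividing by the number of entries. The name `weighted_monthly_average` is hypothetical, and the sketch assumes every sublist has exactly three items.

```python
from collections import defaultdict

def weighted_monthly_average(rows):
    # totals[month] = [sum of int*float, sum of float]
    totals = defaultdict(lambda: [0.0, 0.0])
    for month, integer, floating in rows:
        totals[month][0] += integer * floating
        totals[month][1] += floating
    return {month: numer / denom for month, (numer, denom) in totals.items()}

result = weighted_monthly_average(
    [["12-12", 1, 1.0], ["12-12", 2, 2.2], ["13-1", 3, 3.3]])
```

Unlike groupby(), this does not require the input to be ordered by month.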

Ishpeck
-1

I know there are probably better ways, but have you tried using a for loop?

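def monthly_average(rows):
    newList=[]
    # Step through the sublists two at a time (exactly two per month assumed).
    for i in range(0, len(rows)-1, 2):
        avg=((rows[i][1]*rows[i][2])+(rows[i+1][1]*rows[i+1][2]))
        avg=avg/(rows[i][2]+rows[i+1][2])
        # Store each result as [average, year-month], as in the question.
        newList.append([avg, rows[i][0]])
    return newList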

That should work assuming you have two sublists for every month. If you have more, then you might have to add a function to check for all the sublists whose 'zeroth' index is equal to a certain string. For example:

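newList=[]
tempList=[]
for i in list:
    if i[0]=='year1-month1':
        tempList.append(i)
# Weighted average over every sublist collected for this month.
numerator=sum(i[1]*i[2] for i in tempList)
denominator=sum(i[2] for i in tempList)
newList.append([numerator/denominator,'year1-month1'])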

Then just iterate that for every month, changing the string value.

Again, it's probably not the most efficient method, but it works.
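
As a rough sketch of "iterate that for every month", the month string can be made a loop variable instead of being edited by hand. The function name `monthly_averages` and the sample data are invented for this example, and the sums are computed directly rather than by calling monthly_average().

```python
def monthly_averages(rows):
    # Collect the distinct month strings in first-seen order.
    months = []
    for row in rows:
        if row[0] not in months:
            months.append(row[0])

    result = []
    for month in months:
        # Same filter as above, but with the month string as a variable.
        temp = [row for row in rows if row[0] == month]
        numer = sum(row[1] * row[2] for row in temp)
        denom = sum(row[2] for row in temp)
        result.append([numer / denom, month])
    return result

sample = [['2012-01', 1, 2.0], ['2012-01', 3, 2.0], ['2012-02', 5, 4.0]]
out = monthly_averages(sample)
```

This rescans the whole list once per month, which is wasteful but should be fine for a couple of thousand sublists.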

Volatility