1

I load a list into my script, search it for rows matching '2011', and print those rows using the following code

for row in dL:
    if "2011" in row:
        print row

and get the following output

['2011', 'randome', '6200']
['2011', 'marks', '6020']
['2011', 'man', '6430']
['2011', 'is', '6040']
['2011', 'good', '6230']

What I am trying to do is take all the values from the 3rd column and sum them to get 30920, then calculate and print the average, which is 6184. So far I have the following code.

   total = int(row[2])
   total2 = sum(total)
   print total2

however I get the following error

total2 = sum(total)
TypeError: 'int' object is not iterable

How can I fix this error and compute the total and the average?

smci
  • Note: when you say 'row[2]' you mean *'the third column'* ;-) Not *'third row'* – smci Nov 17 '17 at 19:57
  • Also if you're doing any non-trivial data-mungeing, learn pandas package, it makes stuff like this easy. – smci Nov 17 '17 at 19:57

3 Answers

2

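You want the sum across all matching rows, not the value from a single row (which is what your code computes). sum() expects an iterable, and int(row[2]) is a single integer, hence the TypeError.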

Use a comprehension inside sum() instead of a for-loop:

total2 = sum(int(i[2]) for i in dL if '2011' in i)

To get the average:

average = total2 / float(len([int(i[2]) for i in dL if '2011' in i])) # In python 3, the float() is not needed
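As a minimal sketch of the two lines above put together, the filtered list can be built once and reused for both the sum and the count. The '2011' rows are copied from the question's output; the '2012' row and the name `values` are assumptions added for illustration. The code is written to run under both Python 2 and Python 3.

```python
# Rows copied from the question's output, plus one non-matching row
# (the '2012' row is an illustrative assumption).
dL = [
    ['2011', 'randome', '6200'],
    ['2011', 'marks', '6020'],
    ['2011', 'man', '6430'],
    ['2011', 'is', '6040'],
    ['2011', 'good', '6230'],
    ['2012', 'other', '1000'],
]

# Build the filtered list once so the sum and the count use the same data.
values = [int(row[2]) for row in dL if '2011' in row]

total2 = sum(values)

# float() keeps Python 2 from truncating the result; the conditional
# avoids a ZeroDivisionError when no row matches.
average = total2 / float(len(values)) if values else None

print(total2)
print(average)
```

Returning None for an empty match is only one possible choice; raising an error or returning 0 may suit other scripts better.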

A list comprehension is a quick way to make a list. Take for example this:

result = []
for i in range(1, 4):
    result.append(i**2)

Result will contain:

[1, 4, 9]

However, this can be shortened to a list comprehension:

[i**2 for i in range(1,4)]

Which returns the same thing.

I don't put brackets around the comprehension inside sum() because they aren't needed: Python interprets it as a generator expression. You can read more about it here
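The difference between the two forms can be sketched as follows. The `rows` data and all variable names here are hypothetical. The sketch shows that both forms give the same sum, but a generator expression has no length and can only be consumed once, which is why the average needs a real list (or a separate count).

```python
# Hypothetical sample data, not taken from the question.
rows = [['2011', 'a', '10'], ['2012', 'b', '20'], ['2011', 'c', '30']]

# With brackets: a list is built first, then summed.
from_list = sum([int(r[2]) for r in rows if '2011' in r])

# Without brackets: sum() consumes a generator expression lazily.
from_gen = sum(int(r[2]) for r in rows if '2011' in r)

# A generator has no len(), so it cannot be used directly for the count.
gen = (int(r[2]) for r in rows if '2011' in r)
try:
    len(gen)
    len_failed = False
except TypeError:
    len_failed = True

# A generator is exhausted after one pass; summing it again yields 0.
first_pass = sum(gen)
second_pass = sum(gen)

print(from_list)
print(from_gen)
```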

TerryA
  • Hi Haidro, I've replaced the total section of my code with yours, but I get the following error: total2 = sum(i[2] for i in dL if '2011' in i) TypeError: unsupported operand type(s) for +: 'int' and 'str'. Would you know why that is? – user2603519 Jul 21 '13 at 06:42
  • @user2603519 I just fixed that as you commented :) – TerryA Jul 21 '13 at 06:43
  • Thanks, this works beautifully. To calculate the average, how would I use the len of the list? It's not really a list, is it? – user2603519 Jul 21 '13 at 06:47
  • Haidro, I am trying to use the code you supplied in a loop, but I am getting a value division error. Can you help with this? Do I need to make a new question? – user2603519 Jul 21 '13 at 07:02
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/33842/discussion-between-haidro-and-user2603519) – TerryA Jul 21 '13 at 07:03
  • @user2603519 Make sure you're not dividing by zero. – Ashwini Chaudhary Jul 21 '13 at 07:03
0

total should be a list.

total = [int(row[2]) for row in dL if '2011' in row]    # matching values as a list
total2 = sum(total)                                     # sum of the values
print total2                                            # print the total
average = float(total2) / len(total)                    # float() avoids Python 2 integer division
print average                                           # print the average
rnbguy
0

Since you also want the average, you need the length of the filtered list as well. You can modify any of the above code accordingly; I will go with @haidro's answer.

l = [int(i[2]) for i in dL if '2011' in i]   # filtered list of values
total2 = sum(l)                              # total of list elements
avg = float(total2) / len(l)                 # average (float() avoids Python 2 integer division)
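For comparison, the same result can be reached by keeping an explicit loop like the one in the question and accumulating a running total and a count. This is only a sketch: the rows are copied from the question's output, while the names `total`, `count`, and `average` are chosen here for illustration. It runs under both Python 2 and Python 3.

```python
# Rows copied from the question's output.
dL = [
    ['2011', 'randome', '6200'],
    ['2011', 'marks', '6020'],
    ['2011', 'man', '6430'],
    ['2011', 'is', '6040'],
    ['2011', 'good', '6230'],
]

total = 0   # running sum of the third column
count = 0   # number of matching rows

for row in dL:
    if '2011' in row:
        total += int(row[2])   # add this row's value instead of calling sum() on it
        count += 1

# Guard against dividing by zero when nothing matched.
if count:
    average = float(total) / count
else:
    average = None

print(total)
print(average)
```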
tailor_raj