calculation of conditional statement within pandas

Question

In the past I evaluated conditional statements through a pandas DataFrame, converting each Y/N result to 1/0 and then summing to get a score. However, I want to learn a more advanced method that applies the calculation to a larger dataset held in lists.

Here is my code:

import pandas as pd

a=[234,45,57]
b=[26,51,59]
c=[87,23,56]


avrg_score = [['A',a[0]>0],
               ['B',b[0]>0],
               ['C',b[0]>0],
              ['C',a[0]*b[0]>c[0]],]


avrg_score = pd.DataFrame(avrg_score, columns=['Figure','Pass'],dtype=float).round(3)                                            
x=avrg_score.Figure.count()
y=avrg_score.Pass.sum()
avrg_score_result=(x/y)*100

output:

But this only covers index [0] of the three lists (a, b, c), so I would have to repeat it manually for the remaining indexes.

How can I do this automatically for all indexes of the given lists?

When I use this format for the full lists:

avrg_score = [['A',a>0],
               ['B',b>0],
               ['C',b>0],
              ['C',a*b>c]]

I get this error:

'>' not supported between instances of 'list' and 'int'

Would appreciate any help.

What will the columns in your DataFrame be for all indices? I am assuming the list ids A, B, C are going to be the DataFrame index — davidbilla, Mar 05 '20 at 18:58
I believe that the marked question covers the principles you're trying to implement. This is included in many PANDAS tutorials, but this one is farther down the sequence than you might want to go at the moment. — Prune, Mar 16 '20 at 18:52

score 0 · Answer 1 · answered Mar 05 '20 at 19:35

Is this what you are expecting? I am assuming the list ids A, B, C are going to be the DataFrame index, with one column for each of the 3 list items: 'Pass_1', 'Pass_2' and 'Pass_3'.

import pandas as pd
import inspect

a=[234,45,57]
b=[26,51,59]
c=[87,23,56]

def getname(var):
    callers_local_vars = inspect.currentframe().f_back.f_locals.items()
    return (str([k for k, v in callers_local_vars if v is var][0]))

lists = [a, b, c]
fig_list = ['A', 'B', 'C']
cols = ['Pass_1','Pass_2','Pass_3']
df_result = pd.DataFrame(columns=cols, dtype=float)
for item in lists:
    df_result.loc[getname(item)] = [i>0 for i in item]
print(df_result)

Output:

   Pass_1  Pass_2  Pass_3
a     1.0     1.0     1.0
b     1.0     1.0     1.0
c     1.0     1.0     1.0

But remember, the getname() function inspects the caller's local variables, which could be costly when applied to large lists. Instead, I would keep the list names/DataFrame index names as a list of strings and use that in the loop.
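A minimal sketch of that alternative, assuming the names are simply written out as strings next to the data. The variables `names` and `rows` are illustrative and not part of the code above:

```python
import pandas as pd

a = [234, 45, 57]
b = [26, 51, 59]
c = [87, 23, 56]

# Keep the row labels as plain strings, so no frame inspection is needed.
names = ['a', 'b', 'c']      # illustrative labels, one per list
lists = [a, b, c]
cols = ['Pass_1', 'Pass_2', 'Pass_3']

# One row per list: 1.0 where the element is greater than 0, else 0.0.
rows = {name: [float(value > 0) for value in values]
        for name, values in zip(names, lists)}

df_result = pd.DataFrame.from_dict(rows, orient='index', columns=cols)
print(df_result)
```

Pairing the names and lists with zip() keeps the label and the data together, and it avoids the ambiguity of looking a variable up by identity.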

Your code works with i>0, but I cannot give each list a different condition, such as all indexes in list a greater than 20, all in b greater than 30, and a>b for all indexes in both a and c. How can I do that? — , Mar 06 '20 at 10:02
@Nina I am not sure I understood what you are asking. Based on the original question, I assumed that for the list a with 3 elements, you want to find out whether each element is greater than 0 and store the result in a df column. So, if your list has 3 elements, you have 3 columns (one for each element) in your data frame. Now, how do you want to store the result for the 3 elements consolidated in one column? Do you want to check whether each element in the list is greater than 0 and store the result in one column, or count all the elements in the list that are greater than 0 and store the sum in the result df column? — davidbilla, Mar 06 '20 at 11:02
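One possible reading of the follow-up, sketched under assumptions: plain Python lists do not support element-wise comparison (hence the error in the question), but NumPy arrays do, so each list can get its own condition. The thresholds 20 and 30 and the a > b check are taken from the comment above, a * b > c from the question. Treating the score as the percentage of checks passed at each index is an assumption, and the names `checks` and `pass_pct` are illustrative.

```python
import numpy as np
import pandas as pd

# Arrays instead of lists, so comparisons apply to every element at once.
a = np.array([234, 45, 57])
b = np.array([26, 51, 59])
c = np.array([87, 23, 56])

# One column per condition, one row per index of the original lists.
checks = pd.DataFrame({
    'a > 20': a > 20,
    'b > 30': b > 30,
    'a > b': a > b,
    'a * b > c': a * b > c,
})

# Assumed scoring: share of conditions passed at each index, in percent.
pass_pct = checks.mean(axis=1) * 100

print(checks)
print(pass_pct)
```

Each boolean column counts as 1/0 when averaged, which mirrors the Y/N to 1/0 conversion described in the question without a manual step per index.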
