
My dataframe contains 10,000+ rows and 2 columns: y (the value) and its probability (as a decimal). I have to apply the formula:

ypred=[0 if y_score < 0.5 else 1]

and add a column that shows either 0 or 1 as the output corresponding to each value.

Please provide the syntax, as I am new to Python.

I am trying to pass the whole dataframe to a function, but I am not getting a result.

def class_label(df):
    if df['proba'] > 0.5:
        df['Class'] == 1
    else:
        df['Class'] == 0  # function definition

and

df['class'] = class_label(df)  # calling the function
ForceBru
user2986845
  • [Here's the syntax](https://docs.python.org/2.0/ref/function.html) – ForceBru Aug 20 '19 at 15:53
  • `df['proba']` is a Series (a list-like or array-like container, but different in its own way). You need to look at how to work with an array or Series of values; a simple `if` statement won't do. – Paritosh Singh Aug 20 '19 at 15:57
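
To illustrate the comment above: comparing a Series with a number gives one boolean per row, and a plain `if` cannot reduce that to a single truth value, so pandas raises a `ValueError`. A minimal sketch, assuming a small made-up dataframe whose `proba` column name follows the question:

```python
import pandas as pd

# Hypothetical sample data; only the column name 'proba' comes from the question.
df = pd.DataFrame({"proba": [0.2, 0.5, 0.9]})

mask = df["proba"] > 0.5   # element-wise comparison -> boolean Series
print(mask.tolist())       # [False, False, True]

try:
    if mask:               # ambiguous: one truth value requested for many rows
        pass
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```

This is why the `if df['proba'] > 0.5:` line in the question fails before any label is assigned.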

1 Answer


You want to do:

def class_label(df):
    df.loc[df['proba']<0.5, 'Class'] = 0
    df.loc[df['proba']>=0.5, 'Class'] = 1
    return df
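
As an alternative sketch (not part of the original answer), the same labelling can be written in one step with `numpy.where`, which follows the question's formula of 0 below 0.5 and 1 otherwise. The sample data and the `threshold` parameter are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Made-up probabilities; the column name 'proba' follows the question.
df = pd.DataFrame({"proba": [0.10, 0.50, 0.73, 0.49]})

def class_label(proba, threshold=0.5):
    """Return 0 where proba < threshold, else 1 (element-wise)."""
    return np.where(proba < threshold, 0, 1)

# Pass the column, get back an array of labels, and assign it as a new column.
df["Class"] = class_label(df["proba"])
print(df)
```

For non-missing values, `(df["proba"] >= 0.5).astype(int)` should give the same labels without NumPy.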
Benoit Drogou
  • I am trying to call this function with the following code: – user2986845 Aug 21 '19 at 08:36
  • `df['Class'] = y_dash(df['proba'])`, where `Class` is the new column I want to add to my dataframe. I am getting the following error: `1 def y_dash(df): ----> 2 df.loc[df['proba']<0.5, 'label'] = 0 3 df.loc[df['proba']<0.5, 'label'] = 1 4 return df` `pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()` `KeyError: 'proba'`. Please check. – user2986845 Aug 21 '19 at 08:38
  • This just means that you do not have any column named "proba" in your dataframe. – Benoit Drogou Aug 21 '19 at 12:54
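
One plausible reading of that traceback, given the call quoted in the comment: `y_dash(df['proba'])` passes a single column (a Series) rather than the dataframe, so `df['proba']` inside the function looks for the label `'proba'` in the Series' integer index and fails. A minimal sketch under that assumption, with made-up data and the answer's function under the commenter's name `y_dash`:

```python
import pandas as pd

df = pd.DataFrame({"proba": [0.2, 0.8]})  # made-up data

def y_dash(df):
    # Same logic as the answer: label rows below 0.5 as 0, the rest as 1.
    df.loc[df["proba"] < 0.5, "label"] = 0
    df.loc[df["proba"] >= 0.5, "label"] = 1
    return df

try:
    y_dash(df["proba"])        # a Series is passed, not the dataframe
except KeyError as err:
    caught = repr(err)
    print("KeyError:", err)

# Passing the whole dataframe and keeping the returned dataframe works.
df = y_dash(df)
print(df)
```

Because the function returns the whole dataframe, the result is assigned back to `df` rather than to a single column such as `df['Class']`.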