I'm trying to build an entropy function from scratch, as asked by my leader. I have a dataset, Ttrain, with many variables, sex being one of them. I need to extract its categories (male and female) and then calculate the probabilities and the entropy for each category in a loop, using the following code:

    def entropy3(c):
        import math
        u = c.unique()
        a = []
        b = []
        z = []
        for i in range(len(u)):
            a = Ttrain[(c == u[i]) & (Ttrain.survived == 1)].survived.count()
            b = Ttrain[(c == u[i]) & (Ttrain.survived == 0)].survived.count()
            p = a / (a + b)
            q = b / (a + b)
            z = -(p) * math.log(p, 2) - (q) * math.log(q, 2)
            return z

Now, when I run print(entropy3(Ttrain.sex)), I get 0.85, which is the entropy for the category female only. It seems the loop never reaches the other category. I would be grateful if somebody could point out where I am going wrong. I'm very new to programming, so please excuse any conceptual errors.
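For comparison, here is a minimal sketch of one way to get a value for every category. It rests on a guess about the cause: if the return sits inside the for loop, or if z is simply overwritten on each pass, only a single category's entropy can ever come back. The function name entropy_by_category and the toy data are hypothetical and are not taken from Ttrain. The function accepts any two equal-length sequences, so passing Ttrain.sex and Ttrain.survived should work too.

```python
import math
from collections import Counter, defaultdict


def entropy_by_category(categories, outcomes):
    """Return {category: entropy in bits of the outcomes within that category}."""
    # Count how often each outcome occurs inside each category.
    counts = defaultdict(Counter)
    for category, outcome in zip(categories, outcomes):
        counts[category][outcome] += 1

    result = {}
    for category, outcome_counts in counts.items():
        total = sum(outcome_counts.values())
        h = 0.0
        for n in outcome_counts.values():
            p = n / total
            # Outcomes that never occur are absent from the Counter,
            # so math.log2 is never called with 0.
            h -= p * math.log2(p)
        # Store the value instead of returning it, so the loop keeps going.
        result[category] = h
    # Return only after every category has been processed.
    return result


# Toy data, made up for illustration (not the real Ttrain values).
sex = ["female", "female", "female", "female", "male", "male", "male", "male"]
survived = [1, 1, 1, 0, 0, 0, 1, 1]
per_category = entropy_by_category(sex, survived)
print(per_category)
```

Collecting the results in a dictionary keeps each entropy attached to its category name. The sketch also sidesteps the error that math.log(p, 2) would raise in the original code if some category had only survivors or only non-survivors, since p or q would then be 0.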