Here is my DataFrame:
Tipo  Número  renal  dialisis
CC    260037  NULL   NULL
CC    260037  NULL   AAB
CC    165182  NULL   NULL
CC    165182  NULL   CCDE
CC    260039  NULL   NULL
CC    49740   XYZ    NULL
CC    260041  NULL   NULL
CC    259653  NULL   NULL
I want to determine whether the values in renal and dialisis are NULL or not for each row in the DataFrame. Rows where they are not NULL should get 1 in the survived list; rows where both are NULL should get 0.
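To make the expected result concrete, here is a minimal sketch of the output I am after. It assumes the NULL entries are read as NaN, that a patient counts as 1 when any of their rows has a value in either column (which is what the loop below groups by), and it uses a hand-built stand-in for the file; `row_flag` and `expected` are names made up for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the file contents, with NULL already parsed as NaN.
survival = pd.DataFrame({
    "Tipo": ["CC"] * 8,
    "Número": [260037, 260037, 165182, 165182, 260039, 49740, 260041, 259653],
    "renal": [np.nan, np.nan, np.nan, np.nan, np.nan, "XYZ", np.nan, np.nan],
    "dialisis": [np.nan, "AAB", np.nan, "CCDE", np.nan, np.nan, np.nan, np.nan],
})

# Per row: 1 if either column holds a value, 0 if both are missing.
row_flag = survival[["renal", "dialisis"]].notna().any(axis=1).astype(int)

# Per patient: 1 if any of that patient's rows has a value (assumed rule).
expected = (
    row_flag.groupby(survival["Número"]).max()
    .rename("survival")
    .reset_index()
    .rename(columns={"Número": "numero"})
)
print(expected)
```

With the sample rows above, patients 260037, 165182 and 49740 would be 1 and the remaining three would be 0.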
My code is:
import pandas as pd

survival = pd.read_table('Sophia_Personalizado bien.txt', encoding='utf-16')
survived = []
numero_paciente = []
lista_pacienytes = survival['Número'].values.tolist()
lista_pacienytes = sorted(set(lista_pacienytes))
for e in lista_pacienytes:
    # All rows belonging to the current patient
    survival_i = survival.loc[survival['Número'] == e]
    renal = set(survival_i['renal'].values.tolist())
    dialisis = set(survival_i["dialisis"].values.tolist())
    print('dialisis', dialisis)
    print('renal', renal)
    if renal == 'nan' or dialisis == 'nan':
        survived.append(0)
        numero_paciente.append(e)
    else:
        survived.append(1)
        numero_paciente.append(e)
e = pd.DataFrame({'numero': numero_paciente,
                  'survival': survived})
Surprisingly, every entry in survived ends up as 1, which the DataFrame above shows is wrong. Also, the output of
print('dialisis',dialisis)
print('renal',renal)
is:
dialisis {nan, nan}
renal {nan}
I expected dialisis to be {nan} as well, since I use set().
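The set behaviour can be reproduced without the file. This is only a minimal sketch, assuming the missing values are ordinary float NaN objects; `a`, `b`, `as_set` and `col` are names made up for this reproduction.

```python
import numpy as np
import pandas as pd

# Two separately created NaN floats: NaN never compares equal, even to another NaN.
a = float("nan")
b = float("nan")
as_set = {a, b}

print(a == b)            # False
print(len(as_set))       # 2, the set keeps both objects
print(as_set == "nan")   # False, a set never equals a string

# Checking for missing values on the column itself, without going through set().
col = pd.Series([np.nan, np.nan])
print(col.isna().all())  # True
```

So a set can hold more than one nan, and comparing a set with the string 'nan' is False regardless of its contents, which would explain why the else branch always runs.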
What am I missing? Thanks