I have a dataframe with two columns, type and value, where each type usually has the same value in every row. In some rows, however, the value is missing and we just have NaN. I want to fill in the appropriate value for each row, based on the row's type. I've made a sample dataframe and written code that does this and works. That said, I'm new to pandas, and to Python in general, so I'm pretty sure it could be much better. Is there a more elegant way to do this, using fillna or similar functions? Here's the code that I have (A corresponds to D, B to E, C to F, and NaN to N):
import pandas as pd
import numpy as np
df = pd.DataFrame({"type": ["A", "B", "C", "A", "B", "C", "A", "B", "C", np.nan, np.nan, np.nan],
                   "value": ["D", "E", "F", "D", "E", "F", np.nan, np.nan, np.nan, np.nan, "N", "N"]
                   })
print(df)
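def valuemode(stype):
    # Return the most common value among the rows whose type matches stype
    if isinstance(stype, str):  # a real type label
        y = df.loc[df['type'] == stype]
    else:  # NaN type: select the rows whose type is missing
        y = df.loc[df['type'].isnull()]
    # mode() ignores NaN by default; take the first mode of the value column
    return y['value'].mode().iloc[0]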
for index, row in df.iterrows():
    rowtype = row['type']
    x = valuemode(rowtype)
    if pd.isnull(row['value']):
        print("Type " + str(rowtype) + " will now have value " + str(x))
        # Write through df.at: the row yielded by iterrows() may be a copy,
        # so assigning to row['value'] is not guaranteed to change df
        df.at[index, 'value'] = x
for index, row in df.iterrows():
    print(row['type'], row['value'])
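
For comparison, here is a rough sketch of the kind of loop-free version I imagine might be possible with groupby and fillna. I'm not sure it is the idiomatic way. It assumes that the most frequent non-missing value within each type is the right fill, and it uses a made-up placeholder label ("__missing__", my own invention) so that rows with a NaN type form their own group, since groupby drops NaN keys by default. The helper name first_mode is also just mine.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "type": ["A", "B", "C", "A", "B", "C", "A", "B", "C", np.nan, np.nan, np.nan],
    "value": ["D", "E", "F", "D", "E", "F", np.nan, np.nan, np.nan, np.nan, "N", "N"],
})

# groupby drops NaN keys by default, so give missing types a placeholder label
# ("__missing__" is an arbitrary, hypothetical name)
key = df["type"].fillna("__missing__")

def first_mode(s):
    # mode() ignores NaN by default; fall back to NaN if the group has no values
    m = s.mode()
    return m.iloc[0] if not m.empty else np.nan

# One fill candidate per row: the most common value within that row's group
fill = df.groupby(key)["value"].transform(first_mode)

# Only the missing entries are replaced; existing values are left untouched
filled = df.assign(value=df["value"].fillna(fill))
print(filled)
```

The type column itself is left as it was, NaN included; only value is filled.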