I have a pandas DataFrame with columns that sometimes contain NaN values. I know that I could remove these en masse using pandas' own dropna
function, but in this case I wanted to write my own function that I can call on each column individually, so I wrote:
import pandas as pd

data = pd.read_csv('data.csv')

def remove_nans_from_column(column_name):
    data = data[~data[column_name].isna()]

remove_nans_from_column('bmi')
But running this produces this error:
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-28-3be0ddcb71ff> in <module>()
----> 1 remove_nans_from_column('bmi')

<ipython-input-26-911d33fb618e> in remove_nans_from_column(column_name)
      1 def remove_nans_from_column(column_name):
----> 2     data = data[~data[column_name].isna()]

UnboundLocalError: local variable 'data' referenced before assignment
I understand that the variable data is not defined inside the function, but I expected the function to be able to pick it up from the surrounding code.
data = data[~data['bmi'].isna()]
works when it is not inside a function, so why does it not work inside one?
I am more curious about the reason for this error than about how to fix it, which I can already do.
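For what it's worth, the same behaviour seems to be reproducible without pandas. Below is a minimal sketch; the names counter, read_only and read_then_assign are hypothetical and not part of my actual code. As far as I can tell, the only difference between the two functions is that the second one also assigns to the name it reads:

```python
counter = 10  # module-level variable, standing in for `data`


def read_only():
    # Only reads `counter`; this works and uses the module-level value.
    return counter + 1


def read_then_assign():
    # Reads `counter` and assigns to it in the same function body,
    # mirroring `data = data[...]` in my function.
    counter = counter + 1
    return counter


print(read_only())  # prints 11

try:
    read_then_assign()
except UnboundLocalError as exc:
    # The exact message wording depends on the Python version.
    print("UnboundLocalError:", exc)
```

So reading the outer variable from inside a function is fine on its own; the error only shows up once the function also assigns to that name.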