I want to create a function that takes a list of values and removes every row whose value in a given column appears in that list. I will use the following data frame as an example:
import pandas as pd

data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}
sample = pd.DataFrame(data)
I would like to remove any rows containing the following values in the 'Age' column:
remove_these = [20, 21]
This is what I have so far:
def rem_out(df, column, x):
    df.drop(df[df['column'] == x].index, inplace=True)
    return df
In my function, 'df' refers to the data frame, 'column' is the name of the column that should be checked for the values, and 'x' is the list of values. It is very important that the function accepts a list of values, because I will be removing hundreds of values from my data.
When I run my function like this:
rem_out(sample, Age, remove_these)
I get a NameError saying that the name 'Age' is not defined. How can I specify the column of interest so that the rows matching the values in my list are removed from the data frame?
Ideally, my function would have removed the first and second rows (Tom and nick), leaving only krish and jack.
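For reference, the behaviour I am after could be sketched roughly as follows. This is only a sketch under assumptions that are not in my attempt above: the column name is passed as a string ('Age' in quotes), the list is matched with pandas' Series.isin, and a filtered copy is returned instead of dropping rows in place. The parameter name `values` is hypothetical.

```python
import pandas as pd


def rem_out(df, column, values):
    """Return a copy of df without the rows whose `column` value is in `values`."""
    # isin() builds a boolean mask of matching rows; ~ inverts it,
    # so only the rows that do NOT match are kept.
    return df[~df[column].isin(values)].copy()


data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}
sample = pd.DataFrame(data)
remove_these = [20, 21]

# Assumption: the column is named with a string, not a bare identifier.
result = rem_out(sample, 'Age', remove_these)
print(result)
```

With the sample data, `result` should hold only the krish and jack rows, while `sample` itself is left unchanged.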