I have a dataframe (raw_file):

NAME  ABCD            XYZ
abc   1               111
def   254             121
ghi   8976541         254
jkl   000000111/1215  111
mno   15614987        117

I am writing a function that creates a new column from some calculations and filters the data based on this list of allowed lengths:

Len_Filter = [1,2,3,4,14]

The function so far is:

def Acc(df,AN,TR,LF):
    df[AN]=df[AN].astype(str)
    df[TR]=df[TR].astype(str)
    df['NEW'] = df[AN].str.len()
    df = df[df['NEW'].isin(LF)]     #ERROR
    df[AN] = "0000" + df[TR] + "/" + df[AN]

The function call is:

Acc(raw_file,'ABCD','XYZ',Len_Filter)

While the following code works outside the function,

raw_file = raw_file[raw_file['NEW'].isin(Len_Filter)] 

I am getting the following warning while using it inside the function:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

Someone please help me out with this issue.

Aditya

1 Answer

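The problem is that the filter df[df['NEW'].isin(LF)] returns a new dataframe that pandas flags as a possible copy of a slice of the original. When the next line assigns to df[AN], pandas cannot tell whether you meant to change the original or only the filtered result, so it emits SettingWithCopyWarning. Note that this is a warning, not an error, and it is most likely triggered by the assignment after the line marked #ERROR rather than by the filter itself.
You can resolve this by making an explicit copy of the filtered dataframe and then pointing the df variable at it.
For example: df = df[df['NEW'].isin(LF)].copy()
Also, df = ... inside the function only rebinds the local name, so the function has to return df for the caller to see the filtered result.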
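Putting that together, here is a minimal sketch of how the function could look. The sample data is rebuilt from the question. The initial df.copy() and the return statement are my own additions (assuming you do not want the caller's dataframe modified in place), not something the original code had.

```python
import pandas as pd

def Acc(df, AN, TR, LF):
    # Assumption: the caller's frame should stay untouched, so start from a copy.
    df = df.copy()
    df[AN] = df[AN].astype(str)
    df[TR] = df[TR].astype(str)
    df['NEW'] = df[AN].str.len()
    # Explicit copy of the filtered rows, so the assignment below
    # clearly targets this new frame and not a slice of another one.
    df = df[df['NEW'].isin(LF)].copy()
    df[AN] = "0000" + df[TR] + "/" + df[AN]
    # Return the result; rebinding df inside the function is invisible to the caller.
    return df

# Sample data rebuilt from the question.
raw_file = pd.DataFrame({
    "NAME": ["abc", "def", "ghi", "jkl", "mno"],
    "ABCD": ["1", "254", "8976541", "000000111/1215", "15614987"],
    "XYZ": [111, 121, 254, 111, 117],
})
Len_Filter = [1, 2, 3, 4, 14]

result = Acc(raw_file, 'ABCD', 'XYZ', Len_Filter)
print(result)
```

With this version you assign the return value (result = Acc(...)) instead of relying on the function to change raw_file.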

N.Hung