Please be patient, I am new to Python and pandas. I have a lot of pandas dataframes, but some are duplicates. So I wrote a function that checks whether two dataframes are equal and, if they are, deletes one of them:

def check_eq(df1, df2):
    if df1.equals(df2):
        del[df2]
        print("Deleted %s" % df_name)

The function works, but I would like the variable "df_name" to hold the name of the dataframe as a string. The parameters df1 and df2 are dataframe objects, so how can I get their names at run time in order to print them? Thanks in advance.

Jamiu S.
Paxi
  • Is this what you are looking for? https://stackoverflow.com/questions/31727333/get-the-name-of-a-pandas-dataframe – HunkBenny Nov 25 '22 at 21:34

1 Answer

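What you are looking for is an f-string. Note that a DataFrame has no built-in name, so df2.name only works if you have set that attribute yourself beforehand.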

def check_eq(df1, df2):
    if df1.equals(df2):
        del[df2]
        print(f"Deleted {df2.name}")

This version fails, though: 'del' unbinds df2 right before the print call reads its name attribute, so df2 is unbound at that point and Python raises an UnboundLocalError.

Instead try this:

def check_eq(df1, df2):
    if df1.equals(df2):
        print(f"Deleted {df2.name}")
        del df2
        

Note also that your use of 'del' does not do what you expect. I assume you want to delete the second dataframe in the calling code. However, 'del df2' only removes the local name df2 inside the check_eq function; the variable the caller passed in still refers to the dataframe. You should familiarize yourself with the concept of scope first: https://www.w3schools.com/python/python_scope.asp
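Since a function cannot delete a variable that belongs to its caller, one possible alternative is to keep the dataframes in a dict keyed by name and build a new dict without the duplicates. This is only a minimal sketch: the helper drop_duplicate_frames, the dict layout, and the names 'dataframe1' to 'dataframe3' are hypothetical and chosen for illustration; only DataFrame.equals comes from the question.

```python
import pandas as pd


def drop_duplicate_frames(frames):
    """Return a new dict without dataframes that equal an earlier one.

    Hypothetical helper: `frames` maps a name (str) to a DataFrame.
    """
    unique = {}
    for name, df in frames.items():
        # Keep df only if no already-kept dataframe has the same content.
        if any(df.equals(kept) for kept in unique.values()):
            print(f"Deleted {name}")
        else:
            unique[name] = df
    return unique


d = {'col1': [1, 2], 'col2': [3, 4]}
frames = {
    'dataframe1': pd.DataFrame(data=d),
    'dataframe2': pd.DataFrame(data=d),  # same content as dataframe1
    'dataframe3': pd.DataFrame(data={'col1': [5, 6]}),
}

frames = drop_duplicate_frames(frames)
print(list(frames))
```

With this layout the name is simply the dict key, so nothing has to be stored on the dataframe itself, and the duplicate really disappears once the old dict is replaced by the returned one.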

The code I used:

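import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df1 = pd.DataFrame(data=d)
df2 = pd.DataFrame(data=d)

# A DataFrame has no built-in name, so attach one as a plain attribute.
df1.name = 'dataframe1'
df2.name = 'dataframe2'

def check_eq(df1, df2):
    if df1.equals(df2):
        print(f"Deleted {df2.name}")

check_eq(df1, df2)  # prints: Deleted dataframe2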
HunkBenny