I'm trying to create a function that returns a dynamically-named list of columns. Usually I can manually name the list, but I now have 100+ csv files to work with.

My goal:

  1. Function creates a list, and names it based on dataframe name
  2. Created list is callable outside of the function

I've done my research, and this answer from an earlier post came very close to helping me.

Here is what I've adapted:

def test1(dataframe):
    # Using globals() to get dataframe name
    df_name = [x for x in globals() if globals()[x] is dataframe][0]
    
    # Creating local dictionary to use exec function
    local_dict = {}
    
    # Trying to generate a name for the list, based on input dataframe name 
    name = 'col_list_' + df_name
    exec(name + "=[]", globals(), local_dict)

    # So I can call this list outside the function
    name = local_dict[name]
    
    for feature in dataframe.columns:
        # Append feature/column if >90% of values are missing
        if dataframe[feature].isnull().mean() >= 0.9:
            name.append(feature)
            
    return name

To ensure the list name changes based on the DataFrame supplied to the function, I named the list using:
name = 'col_list_' + df_name

The problem comes when I try to make this list accessible outside the function:
name = local_dict[name].

I cannot find a way to assign a dynamic list name to the local dictionary, so I am forced to always call name outside the function to return the list. I want the list to be named based on the dataframe input (e.g. col_list_df1, col_list_df2, col_list_df99).
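
For comparison, one way to get a per-dataframe name without creating variables dynamically is to keep the lists in a dictionary keyed by name. This is only a sketch under assumptions: `sparse_columns`, `frames`, and `col_lists` are hypothetical names, and the dataframes are assumed to be stored in a dictionary under their names rather than as separate variables.

```python
import pandas as pd

def sparse_columns(dataframe, threshold=0.9):
    """Return the columns whose share of missing values is >= threshold."""
    missing_share = dataframe.isnull().mean()
    return missing_share[missing_share >= threshold].index.tolist()

nan = float("nan")

# Hypothetical input: dataframes stored under their names
frames = {
    "df1": pd.DataFrame({"a": [nan] * 10, "b": list(range(10))}),
    "df2": pd.DataFrame({"c": [nan] * 10, "d": [nan] * 5 + [1.0] * 5}),
}

# One list per dataframe, keyed by the desired list name
col_lists = {"col_list_" + name: sparse_columns(df) for name, df in frames.items()}
print(col_lists)
```

Each list is then reachable outside the function as `col_lists["col_list_df1"]`, and the name lookup through `globals()` is no longer needed.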

This answer was very helpful, but it seems specific to variables.
global 'col_list_' + df_name returns a syntax error.

Any help would be greatly appreciated!

orangecat94
  • Why not use pandas to read each csv file into its own data frame? This will define the column headings based on the csv file. You can then keep a list of data frames and process them as you see fit. – itprorh66 Dec 04 '20 at 15:02
  • Actually, I've already read each csv file into its own dataframe. From there, I'm trying to make a function that takes in these dataframes and returns a list of column names. Additionally, I was hoping the function could name the list based on the dataframe name. – orangecat94 Dec 07 '20 at 01:51
