
I have a folder that has 50 .csv files.

They are named UselessInfo.Needed_info.csv or UselessInfo.Needed.csv

I build a list of the names I want to use as variable names for the corresponding DataFrames:

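import os
import pandas as pd

files = []
for filename in os.listdir():
    # Skip notebooks; every other file is assumed to be one of the CSVs.
    if not filename.endswith('.ipynb'):
        if '_' in filename or len(filename.split('.')) > 2:
            # 'UselessInfo.Needed_info.csv' -> 'Needed_info'
            files.append(filename.split('.')[1])
        else:
            files.append(filename.split('.')[0])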

Then I iterate over the files and create a df for each file:

for filename in files:
    filename = pd.read_csv(f'UselessInfo.{filename}.csv', delimiter=';')

But none of the names in files is created as a df; only filename ends up holding the last file's data. Why doesn't iterating over files create a df under each name that filename takes from the list?

I've seen this answer:

import os
from pathlib import Path
import pandas as pd

path = Path(os.getcwd())

dict_of_df = {}
for file in path.glob('*.csv'):
    dict_of_df[file.stem] = pd.read_csv(file, delimiter=';')

This creates a dict of DataFrames, but how can I create a separate variable for each file in the folder instead?

Jonas Palačionis

1 Answer


Updated to address the OP's goal: get variables named after each file, each pointing to its df.

This is a bad idea compared to putting them in a dict, but if you insist, here is how to do it:

for filename in files:
    globals()[filename] = pd.read_csv(f'UselessInfo.{filename}.csv', delimiter=';')

This creates a global variable whose name is the string held in filename.
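To see why the question's loop leaves only `filename` populated while the `globals()` version creates one name per file, here is a minimal sketch. It uses plain strings as stand-ins for the DataFrames so that it runs without any CSV files. The names `Needed_info` and `Needed` follow the question's naming pattern, and `load` is a hypothetical stand-in for `pd.read_csv`, not part of the original code.

```python
files = ["Needed_info", "Needed"]


def load(name):
    """Hypothetical stand-in for pd.read_csv; returns a label, not a DataFrame."""
    return f"data from UselessInfo.{name}.csv"


# Assigning to the loop variable only rebinds the name `filename`;
# it never creates a variable called Needed_info or Needed.
for filename in files:
    filename = load(filename)

rebinding_created_names = "Needed_info" in globals()

# Writing into the module namespace creates one global name per entry.
for name in files:
    globals()[name] = load(name)

globals_created_names = "Needed_info" in globals() and "Needed" in globals()
```

After the first loop, `filename` holds only the value loaded for the last entry. Note that `globals()` always refers to the module namespace, so the same trick used inside a function would still create module-level names rather than local ones.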

Reference: Using a string variable as a variable name [duplicate]

Z Li
  • I know that the example with `dict` works, but I want to use a variable to get the `df` back, not `dict['df_name']`. My goal is for each file's name to become a variable; in your example I will not get 50 variables, each bound to its own `df`. – Jonas Palačionis Dec 09 '20 at 20:48
  • @itaishz my goal is not to have a single `df` so no concat is needed. Could you explain why this approach would be a bad idea? – Jonas Palačionis Dec 09 '20 at 20:55
  • You can check the reference for more details. In general, you are trying to create variables whose names change depending on what is in that folder. This is not safe (the filenames have to follow Python naming syntax) and it is hard to manage (how do you plan to use them if you do not know for sure what is going to be in that folder?). – Z Li Dec 09 '20 at 20:57
  • I know the folder. I need to work with those 50 .csv files as separate files and would like to address each one as a single variable instead of as a key in a dict. – Jonas Palačionis Dec 09 '20 at 20:58
  • The point is using `df[filename]` to reference the df is just as effective as using `filename`. And it is safe in case the folder changes. – Z Li Dec 09 '20 at 20:59
  • In your previous example I would not get the `df` as I would need to get the index of the `df` not the name, because you `.append()` it to list. – Jonas Palačionis Dec 09 '20 at 21:08
  • @JonasPalačionis yes. In my opinion the best solution here is to use a dict. – Z Li Dec 09 '20 at 21:09
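
The safety concern raised in the comments can be made concrete. The sketch below checks whether a name derived from a file would be a legal Python variable name, and shows attribute-style access through `types.SimpleNamespace` as one possible middle ground between a dict and separate global variables. The sample names and the helper `is_safe_variable_name` are illustrative assumptions, and strings stand in for the DataFrames.

```python
import keyword
from types import SimpleNamespace


def is_safe_variable_name(name):
    """Return True if `name` could be used as a Python variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Hypothetical names as they might be extracted from the file names.
candidates = ["Needed_info", "Needed", "2020-report", "class"]
safe = [name for name in candidates if is_safe_variable_name(name)]

# A dict accepts any string as a key, so every name can be stored.
dict_of_df = {name: f"frame for {name}" for name in candidates}

# A namespace gives attribute access for the names that are valid identifiers.
frames = SimpleNamespace(**{name: dict_of_df[name] for name in safe})
```

With this layout, `frames.Needed_info` reads almost like a plain variable, while names such as `2020-report` remain reachable through `dict_of_df['2020-report']` instead of breaking the script.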