In R there is a function called assign which assigns a value to a name in the environment.

For example:

> assign("Hello", 2)
> Hello
[1] 2

In Python I can't seem to do the same. I initially tried:

import numpy as np
import pandas as pd
import os

for file in os.listdir('C:\\Users\\Olivia\\Documents'):
    if file.endswith(".csv"):
        os.path.splitext(file)[0] = pd.read_csv('C:\\Users\\Olivia\\Documents\\' + file)

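But I can see this tries to assign the data frame to a string (the first element of the tuple returned by os.path.splitext) instead of creating a variable with that name, which doesn't work.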
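
A minimal sketch of why that line fails, using a hypothetical file name and a placeholder value instead of a data frame: os.path.splitext returns a tuple, and the left-hand side asks Python to overwrite the tuple's first element rather than to create a new variable.

```python
import os

file = "sales.csv"  # hypothetical file name, for illustration only

error = None
try:
    # The target of this assignment is element 0 of the tuple returned by
    # os.path.splitext, not a variable called "sales".
    os.path.splitext(file)[0] = "placeholder"
except TypeError as exc:
    # Tuples are immutable, so the item assignment is rejected at runtime.
    error = exc

print(type(error).__name__)
```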

I managed to get all the files in a list by doing:

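import glob
import os

import pandas as pd

# Full paths of every CSV file in the folder
dl = glob.glob(r'C:\Users\Olivia\Documents\*.csv')

# File names without the directory or the .csv extension
nl = []
for i in dl:
    name = os.path.splitext(os.path.basename(i))[0]
    nl.append(name)

# Map each name to its path
ddict = {}
for k, v in zip(nl, dl):
    ddict[k] = ddict.get(k, "") + v

# Read every file into a list of data frames
dfl = []
for k, v in ddict.items():
    dfl.append(pd.read_csv(v))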

But now how do I get each data frame out of the list and name it after its file, without the extension? There must be a way to assign each data frame in the list to a name from the file list.

Olivia
  • I got the list from https://stackoverflow.com/questions/30246620/how-to-create-separate-pandas-dataframes-for-each-csv-file-and-give-them-meaning – Olivia Oct 26 '17 at 09:09

1 Answer

You were on the right track with your first method. Python does not offer a clean way to create a "variable number of variables" dynamically, as you have already found. What you can do is create a dictionary and store each data frame under a string key of your choice. Here's how:

import os
import pandas as pd

root = 'C:\\Users\\Olivia\\Documents'

# Map each file name (without its extension) to its data frame
ddict = {}
for file in os.listdir(root):
    if file.endswith(".csv"):
        name = os.path.splitext(file)[0]
        ddict[name] = pd.read_csv(os.path.join(root, file))

Another way of building this dictionary is using a dict comprehension:

ddict = {os.path.splitext(file)[0]: pd.read_csv(os.path.join(root, file))
         for file in os.listdir(root) if file.endswith('.csv')}

Now, referring to a single dataframe is as easy as

ddict['your_file_name']
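
As a self-contained sketch of the same idea, the snippet below builds the dictionary from a throwaway folder and then looks frames up by name. The helper `load_csv_folder`, the file names, and the CSV contents are all made up for illustration, and it uses `pathlib` in place of `os.listdir`, which is a substitution rather than the approach shown above.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_folder(folder):
    """Return {file name without extension: DataFrame} for each CSV in folder."""
    return {path.stem: pd.read_csv(path)
            for path in sorted(Path(folder).glob("*.csv"))}


# Demo with a temporary folder holding two small, invented CSV files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "sales.csv").write_text("region,amount\nnorth,10\nsouth,20\n")
    Path(tmp, "costs.csv").write_text("region,amount\nnorth,4\n")
    Path(tmp, "notes.txt").write_text("not a csv\n")  # skipped by the glob
    frames = load_csv_folder(tmp)

# Each data frame is reachable through its file name.
names = sorted(frames)
total_sales = int(frames["sales"]["amount"].sum())

# Looping over the dictionary gives every name together with its frame.
for name, df in frames.items():
    print(name, df.shape)
```

Keeping the frames in one dictionary also makes it easy to apply the same step to every file, which separate variables would not.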

Also note that os.path.join is the safer way to build file paths. Unlike a plain +, it inserts the path separator only where one is missing, and it uses the separator of the platform the code runs on.
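
A small sketch of the difference, with a hypothetical file name. It calls `ntpath` directly, which is the module behind `os.path` on Windows, so the result is the same whichever platform runs it.

```python
import ntpath  # Windows implementation of os.path, importable on any platform

folder = "C:\\Users\\Olivia\\Documents"
file = "sales.csv"  # hypothetical file name

# Plain concatenation silently drops the separator.
concatenated = folder + file

# join adds the separator when it is missing...
joined = ntpath.join(folder, file)

# ...and does not double it when the folder already ends with one.
joined_trailing = ntpath.join(folder + "\\", file)

print(concatenated)
print(joined)
print(joined_trailing)
```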


cs95
  • Brilliant, and then with the final dictionary is there a way to take everything out as individual objects? I know this would be less efficient and unnecessary, but I'm curious if it's easy to do. – Olivia Oct 26 '17 at 09:22
  • @Olivia It's really not recommended, but it is possible. You can use `globals().update(ddict)`, but this is a code smell; it's better to just leave everything inside a dictionary. – cs95 Oct 26 '17 at 09:23
  • @Olivia `os.path.join` will automatically take care of adding the separator if it does not exist. If you really want to concatenate two strings (as opposed to joining file subpaths), then use `+`. Otherwise, to join subpaths, it's considered good practice to use `os.path.join`, and it is portable too. – cs95 Oct 26 '17 at 09:37
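
For the curious, a minimal sketch of the `globals().update(ddict)` route mentioned in the comments, with plain lists standing in for data frames and invented key names. It is meant to be run at module level, and it is shown to illustrate the behaviour rather than to recommend it.

```python
# Stand-ins for data frames; the keys play the role of file names.
ddict = {"sales": [10, 20], "costs": [4]}

# Copy every key into the module's global namespace as a variable.
globals().update(ddict)

# The names now resolve like ordinary variables, although nothing in the
# source code visibly defines them, which is why this counts as a code smell.
total = sum(sales) + sum(costs)
print(total)
```

This only gives usable variables when every key is a valid Python identifier. A file called `2017 data.csv` would produce a key that cannot be typed as a name, whereas `ddict['2017 data']` keeps working.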