
This is my first post. I want to loop over the files in a directory and import each one as a separate DataFrame whose name resembles the file name (at least the numbering). After a lot of research I still do not know how to do it. Obviously I am a complete beginner :-)

My code is:

import os
import pandas as pd

Main_folder = os.getcwd()
Folders = os.listdir('.')

for file in Folders:
    data = pd.read_csv(file, sep="\t", header=0)
    data.columns = data.columns.str.strip()

where, for instance, Folders is a list of file names including the file extension, e.g.:

Folders=['01_load.TXT', '02_load.TXT', '03_load.TXT']

What I need is just to import all the files into my workspace like this:

Load_01=pd.read_csv('01_load.TXT', sep="\t", header=0)
Load_02=pd.read_csv('02_load.TXT', sep="\t", header=0)

but in a loop since I have many files.

Mort

2 Answers


You can use a dictionary:

import os
import pandas as pd

data = {}  # maps each file name to its DataFrame
for file in os.listdir('.'):
    data[file] = pd.read_csv(file, sep="\t", header=0)
    data[file].columns = data[file].columns.str.strip()

Then you access each dataframe as a key of the dictionary, for example: data['01_load.TXT']
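As a rough illustration of working with such a dictionary, the sketch below loops over every file name and DataFrame pair and then pulls out a single frame by key. The two small frames and the column name `load` are made-up stand-ins, not data from the question.

```python
import pandas as pd

# Stand-in for the dictionary built by the loop; the real values would come
# from pd.read_csv. The "load" column is hypothetical.
data = {
    "01_load.TXT": pd.DataFrame({"load": [1.0, 2.0]}),
    "02_load.TXT": pd.DataFrame({"load": [3.0]}),
}

# Visit every (file name, DataFrame) pair, here just counting rows per file.
row_counts = {name: len(df) for name, df in data.items()}

# Fetch one DataFrame by its file name.
first = data["01_load.TXT"]
```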

It is possible to create variable variable names and access them, but it is not advised and is not good practice.
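If the goal is names like Load_01 rather than the raw file names, the dictionary keys can be derived from the file names instead. The following is only a sketch: `frame_key` and `load_frames` are hypothetical helpers, and the naming rule (digits, underscore, label) is assumed from the examples in the question.

```python
import re
from pathlib import Path

import pandas as pd


def frame_key(filename):
    """Turn '01_load.TXT' into 'Load_01' (naming rule assumed from the question)."""
    stem = Path(filename).stem                   # '01_load'
    match = re.fullmatch(r"(\d+)_(.+)", stem)
    if match is None:
        return stem                              # leave unexpected names unchanged
    number, label = match.groups()
    return f"{label.capitalize()}_{number}"


def load_frames(directory="."):
    """Read every *.TXT file in `directory` into a dict of DataFrames."""
    frames = {}
    # Globbing for *.TXT skips subfolders and unrelated files, which a bare
    # os.listdir('.') would also return.
    for path in sorted(Path(directory).glob("*.TXT")):
        df = pd.read_csv(path, sep="\t", header=0)
        df.columns = df.columns.str.strip()
        frames[frame_key(path.name)] = df
    return frames
```

With the files from the question in the current directory, `load_frames()["Load_01"]` would then hold the contents of 01_load.TXT.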

dataista

When you create objects in a loop, you can't give them explicit names. But you can add them to a data structure that associates each one with a relevant name. I would recommend a dictionary here. So, for example, you can do this:

import pandas as pd

Folders = ['01_load.TXT', '02_load.TXT', '03_load.TXT']  # These are really file names, not folders.

data_frames = {}    # Initialise a dictionary

for filename in Folders:
    df = pd.read_csv(filename, sep='\t', header=0)  # header=0: the first row holds the column names
    data_frames[filename] = df

# Now you can access any of the dataframes by the filename by using the dictionary:
# Let's say you want the df associated with 02_load.TXT

df = data_frames['02_load.TXT']
print(df.head())
Neil
  • Thank you very much. So my mistake was trying to store different variables in a loop. Instead, I need to create an empty dictionary and store each file in it, with the file name as the key and the file's content, imported as a DataFrame, as the value, right? – Alejandro BM Nov 20 '18 at 09:22
  • Pretty much, yeah, you have the idea. Technically, it is possible to create variable variable names, but you don't do that unless you want to end up throwing your computer out the window (debugging would be a total nightmare). Remember that, behind the scenes, Python pretty much stores everything in dictionaries anyway. So using a dictionary basically achieves the same thing as variable variable names. – Neil Nov 20 '18 at 11:23
  • Also, it is technically possible to achieve this without first making an empty dictionary. You can use dictionary comprehension to create and populate the dictionary in one go. But don't worry about that yet if you don't know what it is. – Neil Nov 20 '18 at 11:24
  • Top level, yes, the key is the name of the file and the value is the dataframe and this is a good way of creating names with values in a loop. – Neil Nov 20 '18 at 11:24
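For reference, the dictionary comprehension mentioned in the comments could look roughly like the sketch below. `read_tab_file` and `load_all` are hypothetical helper names. The column stripping is borrowed from the question's code.

```python
import pandas as pd


def read_tab_file(path):
    """Read one tab-separated file and strip whitespace from its column names."""
    df = pd.read_csv(path, sep="\t", header=0)
    df.columns = df.columns.str.strip()
    return df


def load_all(filenames):
    """Build {file name: DataFrame} in one go, with no empty dictionary first."""
    return {name: read_tab_file(name) for name in filenames}


# Usage, assuming the files from the question exist in the working directory:
# data_frames = load_all(['01_load.TXT', '02_load.TXT', '03_load.TXT'])
# print(data_frames['02_load.TXT'].head())
```

The result is the same dictionary as the explicit loop produces. The comprehension just folds the initialisation and the loop into a single expression.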