Dear Stack Overflow users,

I'm struggling with figuring out the following problem:

I have a directory with multiple files such as

datasets/
    dataset1.txt
    dataset2.txt
    dataset3.txt
    dataset4.txt
    dataset5.txt

and I want to read the files and assign the content of each one to a variable named after the file, without the file type extension. To be explicit: the content of dataset1.txt should be saved to a variable dataset1, the content of dataset2.txt should be saved to a variable dataset2, and so on.

I know that I can iterate over the content of my folder with the following function:

import os

for root, dirs, files in os.walk('.'):
    print(files)

but in the end it should do something like the following:

for root, dirs, files in os.walk('.'):
    for file in files:
        file.split('.')[0] = numpy.loadtxt(file)  # not valid Python; here it should create e.g. a variable dataset1 and read the content of dataset1.txt into it

How is this possible?

Regards,

Jakob

JakobJakobson13
  • What type of variable do you want them to be? A list with each line in it? – Zeke Egherman Feb 21 '18 at 19:13
  • Each file is a tab-separated text file where the first column is my variable for the x-axis and the second column is my variable for the y-axis. – JakobJakobson13 Feb 21 '18 at 19:17
  • Possible duplicate of [How do I create a variable number of variables?](https://stackoverflow.com/questions/1373164/how-do-i-create-a-variable-number-of-variables) – jpp Feb 21 '18 at 20:10

2 Answers

I would use a dictionary for this situation:

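import os
import numpy

fileSet = {}

for root, dirs, files in os.walk('.'):
    for file in files:
        # Join with root so files in subfolders are found; splitext drops the extension
        fileSet[os.path.splitext(file)[0]] = numpy.loadtxt(os.path.join(root, file))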

Then you can access the content with an expression such as

dataset1Val = fileSet['dataset1']
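
A slightly more defensive sketch of the same idea, assuming the files sit directly in one folder and only `.txt` files should be loaded. The helper name `load_datasets` and the use of `pathlib` are my own choices for illustration, not something the question requires; `Path.stem` strips the extension, and passing the full path avoids depending on the current working directory.

```python
from pathlib import Path

import numpy


def load_datasets(directory):
    """Map each .txt file's stem (name without extension) to its loaded array."""
    datasets = {}
    for path in sorted(Path(directory).glob("*.txt")):
        # path.stem turns "dataset1.txt" into "dataset1"
        datasets[path.stem] = numpy.loadtxt(path)
    return datasets
```

With the folder from the question this would be called as `load_datasets("datasets")`, and the result indexed by name, e.g. `["dataset1"]`.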
Chun-Yen Wang
  • That looks like a good solution. Just out of interest: is there a way to achieve this without a dictionary (even if it wouldn't be as elegant)? – JakobJakobson13 Feb 21 '18 at 19:24
  • There might be. But from searching online, most people prefer using a dictionary over creating variables dynamically. – Chun-Yen Wang Feb 21 '18 at 19:27
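
For completeness, a sketch of the dynamic-variable route asked about in the comments. It relies on `globals()`, which is itself a dictionary, so it mostly hides the dictionary rather than avoiding it. The `loaded` mapping holds made-up stand-in values for illustration; in practice it would hold the arrays read from the files.

```python
# Stand-in for data loaded from dataset1.txt and dataset2.txt (hypothetical values).
loaded = {
    "dataset1": [[0.0, 1.0], [1.0, 2.0]],
    "dataset2": [[0.0, 5.0], [1.0, 6.0]],
}

# Inject each entry into the module namespace as a real variable.
# This only works when every key is a valid Python identifier,
# and it can silently overwrite names that already exist.
for name, data in loaded.items():
    if not name.isidentifier():
        raise ValueError(f"{name!r} is not a valid variable name")
    globals()[name] = data

print(dataset1)  # the variable now exists at module level
```

Editors and linters cannot see variables created this way, which is one reason the dictionary is usually preferred.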
You should use a dictionary here. A dictionary gives you the following advantages:

  1. You can quickly access the data from any file using its key.
  2. To loop through the data from all files, you don't need to do a lot of hard work; you can just do for key in my_file_dict.keys()
  3. If you need to search for a value in a particular file, the dictionary lets you go straight to that file's data.
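
The first two points can be sketched as follows. The contents of `my_file_dict` are made-up stand-in values (plain lists rather than loaded files), used only to show key lookup and iteration.

```python
# Hypothetical stand-in for data already loaded from the files.
my_file_dict = {
    "dataset1": [[0.0, 1.0], [1.0, 2.0]],
    "dataset2": [[0.0, 5.0], [1.0, 6.0], [2.0, 7.0]],
}

# 1. Direct access to one file's data by its name.
first = my_file_dict["dataset1"]

# 2. Loop over every file without knowing the names in advance.
row_counts = {}
for key, rows in my_file_dict.items():
    row_counts[key] = len(rows)

# Checking whether a file was loaded is a simple membership test.
has_dataset3 = "dataset3" in my_file_dict
```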

The code below reads the data from each file, converts it into a list, and stores it in a dictionary whose key is the file name without its extension.

For a more compact version, you can use a dictionary comprehension instead of the traditional nested for loops (any speed gain is likely small, since reading the files dominates the run time).

output_dict = {file.split('.')[0]: numpy.loadtxt(os.path.join(root, file)).tolist() for root, dirs, files in os.walk(directory_path) for file in files}

The traditional way:

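output_dict = {}
for root, dirs, files in os.walk(directory_path):
    for file in files:
        # os.path.join builds a portable path; root is the folder that actually contains the file
        output_dict[file.split('.')[0]] = numpy.loadtxt(os.path.join(root, file)).tolist()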
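
Since the files were described in the comments as two tab-separated columns (x and y), a possible follow-up is to split each loaded table into its columns. This is only a sketch under that assumption; the helper name `split_xy` is hypothetical, and the sample rows are invented stand-ins for one entry of the dictionary.

```python
import numpy


def split_xy(table):
    """Return the first two columns of a 2-D table as separate arrays."""
    # atleast_2d keeps this working for a file that has only a single row
    array = numpy.atleast_2d(numpy.asarray(table, dtype=float))
    return array[:, 0], array[:, 1]


# Hypothetical stand-in for one entry of output_dict.
x, y = split_xy([[0.0, 1.5], [1.0, 2.5], [2.0, 3.5]])
```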
iam.Carrot