How can I iterate over a list of .txt files using numpy?

Question

I'm trying to iterate over a list of .txt files in Python. I would like to load each file individually, create an array, find the maximum value in a certain column of each array, and append it to an empty list. Each file has three columns and no headers or anything apart from numbers.

My problem is starting the iteration. I get error messages such as "No such file or directory", followed by the name of the first .txt file in my list.

I used os.listdir() to display each file in the directory that I'm working with. I assigned this to the variable filenamelist, which I'm trying to iterate over.

Here is one of my attempts to iterate:

for f in filenamelist:
    x, y, z = np.array(f)
    currentlist.append(max(z))

I expect it to make an array of each file, find the maximum value of the third column (which I have assigned to z) and then append that to an empty list, then move onto the next file.

Edit: Here is the code that I have written so far:

import os
import numpy as np
from glob import glob

path = 'C://Users//chand//06072019'
filenamelist = os.listdir(path)
currentlist = []
for f in filenamelist:
    file_array = np.fromfile(f, sep=",")
    z_column = file_array[:,2]
    max_z = z_column.max()
    currentlist.append(max_z)

Edit 2: Here is a snippet of one file that I'm trying to extract a value from:

0,           0.996,    0.031719
5.00E-08,    0.996,    0.018125
0.0000001,   0.996,    0.028125
1.50E-07,    0.996,    0.024063
0.0000002,   0.996,    0.023906
2.50E-07,    0.996,    0.02375
0.0000003,   0.996,    0.026406

Each column is of length 1000. I'm trying to extract the maximum value of the third column and append it to an empty list.

So this would make an array of the filename string. Try using glob to get the file. https://stackoverflow.com/questions/419163/what-does-if-name-main-do — BenT, Jun 17 '19 at 16:31
Sorry wrong link: https://stackoverflow.com/questions/35672809/how-to-read-a-list-of-txt-files-in-a-folder-in-python — BenT, Jun 17 '19 at 16:38
For a start, don't focus on the iteration over files. Get code working for just one file. Pick a filename in the directory, and figure out how to load it as an array. The details will depend on the format of the file. Being a `.txt`, it's likely to be a `csv`, which `np.genfromtxt` can handle if you use the right parameters. — hpaulj, Jun 17 '19 at 16:38
Please post the complete code that goes with the error you're trying to solve. Yes, it's frustrating trying to get code that does what you want, but posting incomplete code that is unrelated to the error described doesn't help anyone. — user2699, Jun 17 '19 at 16:51
The "No such file or directory" error is because `os.listdir` returns a list of files in the directory, not paths. To get a path that can be used to load the file use `os.path.join(, f)`. Then follow the advice given by others here about how to load data into numpy arrays to fix the errors that will come up once the file path is correct. — user2699, Jun 17 '19 at 16:53
@chandler22 Did you get everything working? I've made some changes to my answer to help solve the `os.listdir()` issue. Hope it helps! — Jack, Jun 17 '19 at 17:45

Jack · Accepted Answer · 2019-06-17T17:43:01.970

The main issue is that np.array(filename) does not load the file for you; it only creates an array containing the filename string. Depending on the format of your file, something like np.loadtxt() will do the trick (see the docs).

Edit: As others have mentioned, there is another issue with your implementation. os.listdir() returns a list of file names, but you need file paths. You could use os.path.join() to get the path that you need.
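To make the difference between names and paths concrete, here is a minimal sketch. It builds a throwaway directory with tempfile (the file name sample.txt is made up for illustration) and shows that os.listdir() returns bare names, which only become usable paths once they are joined with the directory.

```python
import os
import tempfile

# Hypothetical scratch directory with one file, only for illustration.
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "sample.txt"), "w") as handle:
        handle.write("1,2,3\n")

    # os.listdir() returns bare file names without the directory part.
    names = os.listdir(directory)
    # Joining each name with the directory gives a path that can be opened.
    paths = [os.path.join(directory, name) for name in names]

    # A bare name is resolved against the current working directory,
    # which is usually not the directory that was listed.
    name_is_absolute = os.path.isabs(names[0])
    joined_exists = os.path.exists(paths[0])
```

Passing one of the bare names straight to a loader is what typically produces the "No such file or directory" error, unless the script happens to run from inside that directory.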

Below is an example of how you might do what you want, but it really depends on the file format. In this example I'm assuming a CSV (comma separated) file.

Example input file:

1,2,3
4,5,6

Example code:

import os

import numpy as np

path = 'C://Users//chand//06072019'
filenames = os.listdir(path)
currentlist = []

for f in filenames:
    # get the full path of the filename
    filepath = os.path.join(path, f)
    # load the file
    file_array = np.loadtxt(filepath, delimiter=',')
    # get the whole third column
    z_column = file_array[:,2]
    # get the max of that column
    max_z = z_column.max()
    # add the max to our list
    currentlist.append(max_z)
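A slightly more defensive variant of the same loop, offered as a sketch rather than a drop-in replacement. The helper name max_of_third_column, the "*.txt" pattern, and the scratch files are assumptions made for illustration. glob returns paths that already include the directory, and usecols=2 asks np.loadtxt to read only the third column.

```python
import glob
import os
import tempfile

import numpy as np


def max_of_third_column(directory, delimiter=","):
    """Return the maximum of the third column for each .txt file in directory."""
    maxima = []
    # glob returns paths that include the directory; sorted() fixes the order.
    for filepath in sorted(glob.glob(os.path.join(directory, "*.txt"))):
        # usecols=2 reads only the third column (zero-based index 2).
        z_column = np.loadtxt(filepath, delimiter=delimiter, usecols=2)
        maxima.append(float(z_column.max()))
    return maxima


# Usage with two made-up files in a scratch directory.
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "a.txt"), "w") as handle:
        handle.write("1,2,3\n4,5,6\n")
    with open(os.path.join(directory, "b.txt"), "w") as handle:
        handle.write("0, 0.996, 0.031719\n5.00E-08, 0.996, 0.018125\n")
    result = max_of_third_column(directory)
```

Filtering on "*.txt" also skips any other files that happen to sit in the same folder, which a plain os.listdir() loop would try to load.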

edited Jun 17 '19 at 17:43

answered Jun 17 '19 at 16:45

Jack


Hey Jack, thank you for the responses. Where would the os.path.join go? In place of os.listdir? – chandler22 Jun 17 '19 at 18:05
I've edited the code example. The call to `os.path.join()` is on the first line of the for loop. Sorry that it wasn't clear. – Jack Jun 17 '19 at 18:07
Thank you for the edit. It did fix the iteration issue, but with the file_array line, I'm getting an error saying "Could not convert string to float". Maybe the numbers in each file are being stored as strings? Is there a way to convert them within that line? – chandler22 Jun 17 '19 at 18:16
Can you edit your question to show an example of the file format? That would be a big help. – Jack Jun 17 '19 at 18:21
Ah! Are you using `np.loadtxt()` or `np.fromfile()`? I suggested `np.fromfile()` at first, but changed it because `np.loadtxt()` seems to be more what you need. – Jack Jun 17 '19 at 18:28
I am using np.loadtxt() at the moment. – chandler22 Jun 17 '19 at 18:29
Have you set the delimiter property so that it uses commas? Eg: `np.loadtxt(filepath, delimiter=',')` – Jack Jun 17 '19 at 18:39
Yes I have. But I just realized that the files that I'm trying to work with are saved as tab delimited and not comma delimited. Gonna resave them and see if that works. – chandler22 Jun 17 '19 at 18:51
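For tab-delimited files like the ones described in the last comment, re-saving is not the only option. The following is a minimal sketch using made-up in-memory data in place of a real file: np.loadtxt splits on any whitespace when no delimiter is given, so tabs work with the default, and delimiter="\t" makes the choice explicit. Asking for commas on tab-delimited text raises a ValueError, which is consistent with the "Could not convert string to float" message reported above.

```python
import io

import numpy as np

# Made-up tab-delimited sample standing in for one of the files.
tab_text = "0\t0.996\t0.031719\n5.00E-08\t0.996\t0.018125\n"

# With no delimiter, np.loadtxt splits on any whitespace, tabs included.
default_array = np.loadtxt(io.StringIO(tab_text))
# Naming the tab explicitly gives the same result.
tab_array = np.loadtxt(io.StringIO(tab_text), delimiter="\t")

# Asking for commas on tab-delimited text leaves each line as one field,
# which cannot be converted to a float.
try:
    np.loadtxt(io.StringIO(tab_text), delimiter=",")
    comma_failed = False
except ValueError:
    comma_failed = True

# Maximum of the third column, as in the loop above.
max_z = tab_array[:, 2].max()
```

In the loop, the same idea would mean passing delimiter="\t" (or no delimiter at all) to np.loadtxt instead of delimiter=",".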
