Reading files from a folder using the os module

Question

For a pattern recognition application, I want to read and operate on JPEG files from another folder using the os module.

I tried to use `str(file)` and `file.encode('latin-1')`, but both give me errors.

I tried:

allLines = []

path = 'results/'
fileList = os.listdir(path)
for file in fileList:
   file = open(os.path.join('results/'+ str(file.encode('latin-1'))), 'r')
   allLines.append(file.read())
print(allLines)

But I get an error saying: No such file or directory "results/b'thefilename"

whereas I expected a list of the desired files that I can then access.

Your use of `os.path.join` is not how it's intended. You're doing the join yourself with string concatenation rather than passing it a relative path — roganjosh, Mar 24 '19 at 13:30
Try `file = open('results/{}'.format(file))` in your `for` loop — roganjosh, Mar 24 '19 at 13:32
You can’t treat jpeg files as text files that have lines. (And please `close()` your files!) — Jens, Mar 24 '19 at 13:33
You can read from this [documentation](https://docs.python.org/3/library/os.path.html) — YusufUMS, Mar 24 '19 at 13:34
`file = open('results/{}'.format(file))` still gives me a UTF-8 error – Mr. Johnny Doe, Mar 24 '19 at 13:38
Something else is going on here. It makes no sense that python returns a string in a format it then can't use immediately afterwards from `os.listdir(path)`. — roganjosh, Mar 24 '19 at 13:39
Please update the question to show the approach I suggested and include the full traceback — roganjosh, Mar 24 '19 at 13:40
Why inside the "for" loop do you change the contents of the loop variable `file`? — s3n0, Mar 24 '19 at 13:58
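
The comments about `os.path.join` and the `b'thefilename'` error can be illustrated with a minimal sketch. It assumes the folder name is passed in by the caller; the helper `list_file_paths` and the sample name `photo.jpg` are hypothetical and only serve the illustration. `os.listdir` already returns `str` names, so no encoding step is needed, and calling `str()` on a `bytes` object yields its printed representation, which is where the stray `b'` in the path comes from.

```python
import os


def list_file_paths(folder):
    """Return the full paths of the regular files directly inside folder."""
    paths = []
    for name in sorted(os.listdir(folder)):
        # Pass the parts separately; os.path.join inserts the separator.
        full_path = os.path.join(folder, name)
        if os.path.isfile(full_path):
            paths.append(full_path)
    return paths


# Why the original attempt failed: str() of a bytes object is its repr,
# so the "file name" handed to open() starts with b' and ends with '.
encoded = "photo.jpg".encode("latin-1")
print(str(encoded))  # b'photo.jpg'
```

Calling `list_file_paths('results')` would then give paths that can be passed straight to `open()`.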

score 1 · Answer 1 · answered Mar 24 '19 at 13:52

If you can use Python 3.4 or newer, you can use the pathlib module to handle the paths.

from pathlib import Path

all_lines = []
path = Path('results/')
for file in path.iterdir():
    with file.open() as f:
        all_lines.append(f.read())
print(all_lines)

By using the `with` statement, you don't have to close the file by hand (which your code currently never does); the file is closed even if an exception is raised at some point.

answered Mar 24 '19 at 13:52

Querenker


Still getting an error at `all_lines.append(f.read())`: `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte` – Mr. Johnny Doe Mar 24 '19 at 14:12
Yes, because, as someone already told you, you are working with binary data (.jpeg), which has nothing to do with UTF-8 or any other text encoding. These are bytes; don't treat them as lines, or as text at all. First, you have to open the file as binary data, `f = open("myfile", "rb")`, and second, you have to work with a buffer if you want to read the data sequentially. For example: https://stackoverflow.com/questions/1035340/reading-binary-file-and-looping-over-each-byte – s3n0 Mar 24 '19 at 14:18
If you actually want to read binary data, you can open the file in 'rb' mode: `with file.open('rb') as f:` – Querenker Mar 24 '19 at 14:19
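
The binary-mode suggestion from the comments can be sketched as follows. This is only an illustration, assuming the files of interest end in `.jpg` or `.jpeg`; the helper name `read_jpeg_bytes` is hypothetical. It returns raw bytes and does not decode any pixels.

```python
from pathlib import Path


def read_jpeg_bytes(folder):
    """Map each JPEG file name in folder to its raw bytes."""
    images = {}
    for file in sorted(Path(folder).iterdir()):
        # Skip subfolders and anything without a JPEG extension (assumption).
        if file.is_file() and file.suffix.lower() in {".jpg", ".jpeg"}:
            # 'rb' returns bytes, so no text decoding is attempted.
            with file.open("rb") as f:
                images[file.name] = f.read()
    return images
```

With `images = read_jpeg_bytes('results')`, each value is a `bytes` object. JPEG data begins with the bytes `0xff 0xd8`, which is why text mode failed on byte `0xff` at position 0.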
