I want to open and read several text files. The plan is to find a string in the text files and print the whole line containing that string. The problem is that I can't open the paths from the array. I hope it's understandable what I'm trying to do.

import os
from os import listdir
from os.path import join
from config import cred

path = (r"E:\Utorrent\Leaked_txt")
for filename in os.listdir(path):
    list = [os.path.join(path, filename)]
    print(list)

for i in range(len(list)-1):
    with open(str(list[i], "r")) as f:
        for line in f:
            if cred in line:
                print(line)

Thanks :D

2 Answers

I prefer to use glob when reading several files in a directory:

import glob
from config import cred # the search string, as in the question

files = glob.glob(r"E:\Utorrent\Leaked_txt\*.txt") # collect all .txt files in the folder

for file in files: # iterate over files
    with open(file, 'r') as f: # open file for reading
        for line in f: # iterate over lines (f.read() would iterate over single characters)
            if cred in line: # if the search string is in the line
                print(line) # print the line
It_is_Chris
  • It worked so far, but now I get this error: Exception has occurred: UnicodeDecodeError 'charmap' codec can't decode byte 0x9d in position 7656: character maps to `<undefined>` –  Jul 20 '21 at 13:13
  • If all your files are `UTF-8` then you can try adding the encoding: `with open(file , 'r', encoding='utf8') as f:` – It_is_Chris Jul 20 '21 at 13:16
  • nah, suddenly got this: Exception has occurred: UnicodeDecodeError 'utf-8' codec can't decode byte 0xfb in position 7685: invalid start byte –  Jul 20 '21 at 13:25
  • How are your files encoded, ASCII? You can also try adding `read()` – It_is_Chris Jul 20 '21 at 13:27
  • When I open them it says utf8. How can I check them? –  Jul 20 '21 at 13:28
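
On the question of how to check the files: one rough way is to read each file's raw bytes and see whether they decode as UTF-8. The following is only a minimal sketch, and `is_valid_utf8` is a hypothetical helper name, not something from the answers above. A file that passes is not guaranteed to have been written as UTF-8, but a file that fails definitely is not valid UTF-8.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file's raw bytes decode cleanly as UTF-8."""
    try:
        # Read bytes so that no implicit platform encoding is applied.
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

It could be used as `bad = [f for f in files if not is_valid_utf8(f)]` to list the files that would raise `UnicodeDecodeError` when opened with `encoding='utf8'`.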

With os, you can do something like this:

import os
from config import cred 

path = "E:/Utorrent/Leaked_txt"
files = [os.path.join(path, file) for file in os.listdir(path) if file.endswith(".txt")]

for file in files:
    with open(file, "r") as f:
        for line in f.readlines():
            if cred in line:
                print(line)

Edit

os.listdir only includes files from the parent directory (specified by path). To get the .txt files from all sub-directories, use the following:

files = list()
for root, _, f in os.walk(path):
    files += [os.path.join(root, file) for file in f if file.endswith(".txt")]
not_speshal
  • Thanks for your reply. It worked, but I also got this error: 'charmap' codec can't decode byte 0x9d in position 7656: character maps to `<undefined>` –  Jul 20 '21 at 13:39
  • @t0xic - That's a completely different issue. See [here](https://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character). You might need a `try/except` block where you try reading the file with multiple encodings. – not_speshal Jul 20 '21 at 13:42
  • Okay got it. One question left, how can I remove the "[]" because those got printed for every line? –  Jul 20 '21 at 13:52
  • Still have an issue :( It only searches about 1/16th of the dictionary –  Jul 20 '21 at 15:57
  • What dictionary? – not_speshal Jul 20 '21 at 16:00
  • In "files =" I collect the paths of all the text files and then open and search them (around 12.5k text files). If I print "files" I only get around 250 txt files. For example, if I search for a string that is in the 250th and 251st text files, it only finds the string in the 250th text file –  Jul 20 '21 at 16:10
  • `os.listdir` shows all the files in path (not including sub-directories). If you're trying to access all files within all sub-folders, use `os.walk` – not_speshal Jul 20 '21 at 16:12
  • That's the problem, all the files are in the same folder with no sub-folders :( –  Jul 20 '21 at 18:41
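
The `try/except` suggestion from the comments above could look roughly like the sketch below. The helper names `read_lines_any_encoding` and `find_lines`, and the choice of `utf-8` followed by `cp1252`, are assumptions made for illustration; the thread never establishes the files' real encoding. If every listed encoding fails, the sketch falls back to UTF-8 with `errors="replace"`, which substitutes undecodable bytes instead of raising.

```python
def read_lines_any_encoding(path, encodings=("utf-8", "cp1252")):
    """Return the lines of a text file, trying each encoding in turn."""
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                return f.readlines()
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    # Last resort: decode as UTF-8 and replace undecodable bytes with U+FFFD.
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return f.readlines()


def find_lines(paths, needle):
    """Collect (path, line) pairs for every line that contains needle."""
    matches = []
    for path in paths:
        for line in read_lines_any_encoding(path):
            if needle in line:
                matches.append((path, line.rstrip("\n")))
    return matches
```

With the names from the answer above, this could be called as `for path, line in find_lines(files, cred): print(line)`. Printing the line itself rather than a list containing it also avoids brackets in the output.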