I am trying to search a large set of text files (160K) for a specific string that is different for each file. I have a text file that lists every file in the directory along with the string I want to search for in it. Basically, I want to use Python to create a new text file that gives the file name, the string, and a 1 if the string is present or a 0 if it is not.
My approach so far is to build a dictionary from that text file, but from there I am stuck. Here is what I have in mind, in pseudo-code:
**assign dictionary**
d = {}
with open('file.txt') as f:
    d = dict(x.rstrip().split(None, 1) for x in f)
**loop through directory**
for filename in os.listdir(os.getcwd()):
***here is where I get lost***
    match file name to dictionary
    look for string
    write filename, string, 1 if found
    write filename, string, 0 if not found
Thank you. It needs to be somewhat efficient, since it's a large amount of text to go through.
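To make the dictionary step concrete, here is a minimal sketch of what that one line does to a single entry. It assumes each line of the list file is the file name, then whitespace, then the search string; the sample line and the helper name `parse_mapping_line` are made up for illustration.

```python
def parse_mapping_line(line):
    """Split one line of the list file into (filename, search_string)."""
    # split(None, 1) splits on the first run of whitespace only,
    # so the search string may itself contain spaces.
    filename, string = line.rstrip().split(None, 1)
    return filename, string


# Hypothetical sample entry, not taken from the real list file.
print(parse_mapping_line("report_001.txt net income\n"))
# → ('report_001.txt', 'net income')
```

A line that holds only a file name and no string would raise a `ValueError` here, so this only works if every line really has both parts.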
Here is what I ended up with:
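import os

# Map each file name to the string to look for in that file.
with open('ibes.txt') as f:
    d = dict(x.rstrip().split(None, 1) for x in f)

# Open the output once instead of reopening it for every file.
with open('ibes_in.txt', 'w') as out:
    for filename in os.listdir(os.getcwd()):
        # Skip anything not listed in ibes.txt (this also skips the output file itself).
        if filename not in d:
            continue
        string = d[filename]
        # Close each input file as soon as it has been read.
        with open(filename, 'r') as infile:
            found = string in infile.read()
        out.write("{} {} {}\n".format(filename, string, int(found)))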
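Since efficiency matters with this many files, one possible variation is to loop over the dictionary instead of the directory listing, so that only files named in the list are ever opened. This is only a sketch under assumptions: the function name `check_files` and its parameters are hypothetical, malformed lines in the list file are silently skipped, and each file is assumed small enough to read into memory in one go.

```python
import os


def check_files(mapping_path, directory, out_path):
    """Write 'filename string flag' lines; flag is 1 if the string occurs in the file."""
    targets = {}
    with open(mapping_path) as f:
        for line in f:
            parts = line.rstrip().split(None, 1)
            if len(parts) == 2:  # skip blank or malformed lines (an assumption)
                targets[parts[0]] = parts[1]

    results = []
    # Open the output once, not once per input file.
    with open(out_path, "w") as out:
        for filename, string in targets.items():
            path = os.path.join(directory, filename)
            if not os.path.isfile(path):
                continue  # listed in the mapping but missing on disk
            with open(path, "r", errors="replace") as infile:
                found = int(string in infile.read())
            out.write("{} {} {}\n".format(filename, string, found))
            results.append((filename, string, found))
    return results
```

With the files from the code above it could be called as `check_files('ibes.txt', os.getcwd(), 'ibes_in.txt')`. Files that are listed but missing on disk are skipped rather than written with a 0, which may or may not be what is wanted.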