Hoi!
I am indexing files on a file server using os.walk() (checking certain files for their content and then adding them to a list to be reviewed later). It works quite nicely for smaller folders, but indexing the whole server takes quite some time (>1 h), and data is constantly being added to the server.
How does this impact my indexing? Which files will be included? The ones that happen to be in a folder at the moment it is scanned? I am mainly interested in a "snapshot" of the server at a certain moment in time.
Thank you!
Kind regards,
Sebastian
This is the code I am currently using. Is there maybe a better/faster way to do this?
import os

file_list = []
for current_folder, sub_folders, file_names in os.walk(my_path):
    for file_name in file_names:
        if file_name == "title":  # only index files literally named "title"
            with open(os.path.join(current_folder, file_name), "r") as f:
                title = f.read()
            file_list.append([file_name, current_folder, title])