
I am working on a project where I have raw data that I need to extract from each txt file (around 300 000 of them) only once, and then move the file to another batch of 300 000 files. Therefore I need to open the txt files one after the other in the most efficient way possible to minimize the time this process takes. I'm using open(), but it can take up to 10-15 min for roughly 160 000 txt files of no more than 600 bytes each.

Thank you for your time :)

import os
import re

for filename in os.listdir("folder1"):

    with open(os.path.join("folder1", filename), 'r') as f:
        # Read the whole file and split it into word tokens.
        text = f.read()
        text = re.findall(r'\w+', text)
        index = 0

        # Walk the tokens and dispatch on the marker words.
        while index < len(text):
            if text[index] == "P1":
                function1(text)

            elif text[index] == "T1":
                function2(text)

            index += 1
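
For comparison, here is a minimal sketch of the same loop using os.scandir() (which avoids an extra stat call per file compared with os.listdir() plus os.path.join()) and shutil.move() to relocate each file once it has been read, since the question mentions moving the processed files to another batch. The destination folder name "folder2" and the function1/function2 callables are assumptions taken from the question, not tested code.

    import os
    import re
    import shutil

    SOURCE = "folder1"
    DEST = "folder2"   # assumed destination for the processed batch

    def process_all():
        os.makedirs(DEST, exist_ok=True)
        with os.scandir(SOURCE) as entries:
            for entry in entries:
                if not entry.is_file():
                    continue
                # Read the whole file in one call; the files are tiny (< 600 bytes).
                with open(entry.path, "r") as f:
                    words = re.findall(r"\w+", f.read())
                # Dispatch on the marker tokens found in the file.
                for word in words:
                    if word == "P1":
                        function1(words)
                    elif word == "T1":
                        function2(words)
                # Move the file so it is only ever processed once.
                shutil.move(entry.path, os.path.join(DEST, entry.name))
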
  • You can use the multiprocessing library in python to split the work up between CPU cores. [Multiprocessing Library](https://docs.python.org/3/library/multiprocessing.html). If you need to maintain concurrency, you can also explore [Threading](https://docs.python.org/3/library/threading.html) – Lateralus Oct 12 '22 at 12:22
  • @Lateralus: chances are that this will slow down the whole process due to simultaneous disk accesses. –  Oct 12 '22 at 12:23
  • @YvesDaoust Hi, thank you for your answer. I am using an SSD already and an Intel Core i5 – Youssef Sabaa Oct 12 '22 at 12:33
  • Here is a link to similar question resolved using multiprocessing: https://stackoverflow.com/a/36590187/10151980 – Lateralus Oct 12 '22 at 12:59
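
In the spirit of the multiprocessing suggestion in the comments above, a rough sketch of splitting the per-file work across a process pool might look like the following. Whether it actually helps depends on how much CPU time function1/function2 spend relative to disk I/O, as the second comment warns; the chunksize value is a guess, not a measured setting, and function1/function2 are assumed to be defined at module level so the worker processes can see them.

    import os
    import re
    from multiprocessing import Pool

    SOURCE = "folder1"

    def handle_file(path):
        # Each worker process reads and tokenises one file, then dispatches.
        with open(path, "r") as f:
            words = re.findall(r"\w+", f.read())
        for word in words:
            if word == "P1":
                function1(words)
            elif word == "T1":
                function2(words)

    if __name__ == "__main__":
        paths = [os.path.join(SOURCE, name) for name in os.listdir(SOURCE)]
        with Pool() as pool:   # defaults to os.cpu_count() worker processes
            # chunksize batches several paths per task to cut inter-process overhead.
            pool.map(handle_file, paths, chunksize=256)
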
