I'm very new to Python, and I suspect this is easier than I think, but I have many (more than 200) .txt files in a folder that I would like to concatenate into a single file.

The problem: I want each .txt file to be separated by a new line in this new file.

I'm on Mac, by the way.

I saw some options online that require a list of all the file names. I have too many files for that, and would like to save some time by doing it differently. Any ideas?

Peter Mortensen
Tess
  • Look into the glob module. It can make a list of file names for you. – Tom G Feb 27 '23 at 20:55
  • What have you tried so far? – Louis Lac Feb 27 '23 at 20:56
  • See https://stackoverflow.com/questions/2512386/how-to-merge-200-csv-files-in-python, particularly this answer: https://stackoverflow.com/a/17947216/218663 – JonSG Feb 27 '23 at 20:58
  • There should be plenty of duplicates among the existing [2,118,763 Python questions](https://stackoverflow.com/questions/tagged/python). – Peter Mortensen Mar 20 '23 at 12:26
  • A starting point: *[How do I concatenate text files in Python?](https://stackoverflow.com/questions/13613336/how-do-i-concatenate-text-files-in-python)*. See also [its linked questions](https://stackoverflow.com/questions/linked/13613336?sort=votes). – Peter Mortensen Mar 20 '23 at 12:28
  • A candidate (some answers talk about adding empty lines; I haven't checked further): *[How to join all the txt files that are inside a directory? (Respecting that all lines are one below the other)](https://stackoverflow.com/questions/69062589/how-to-join-all-the-txt-files-that-are-inside-a-directory-respecting-that-all)* – Peter Mortensen Mar 20 '23 at 12:31

3 Answers

Since you want to concatenate whole files, you can read the contents of each file in a single statement and write them out in a single statement, like this:

import glob
import os


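def main():
    # Folder that holds the .txt files. 'H:\\' is a Windows drive; on macOS,
    # pass the folder's own path instead (for example a hypothetical
    # '/Users/yourname/texts').
    os.chdir('H:\\')

    with open('out.txt','w') as outfile:
        for fname in glob.glob('*.txt'):
            # Skip the output file itself, which also matches '*.txt'
            if fname == 'out.txt':
                continue
            # print(fname)
            with open(fname, 'r') as infile:
                txt = infile.readlines()  # all lines of this file
                outfile.writelines(txt)
                outfile.write('\n') # separator line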


if __name__ == '__main__':
    main()

Using the with statement takes care of closing the file properly once the with-block is left, even if an exception is raised inside it.
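
To illustrate that last point, here is a minimal sketch that raises an exception inside a with-block and then checks whether the file was closed. The function name `write_and_fail` and the file name `demo.txt` are made up for this demonstration.

```python
import os
import tempfile


def write_and_fail(path):
    """Open `path` in a with-block, raise inside it, and return the file object."""
    try:
        with open(path, "w") as handle:
            handle.write("partial\n")
            raise RuntimeError("simulated failure inside the with-block")
    except RuntimeError:
        pass  # the with-block has already closed the file at this point
    return handle


with tempfile.TemporaryDirectory() as tmp_dir:
    leaked = write_and_fail(os.path.join(tmp_dir, "demo.txt"))
    print(leaked.closed)  # True
```

The file object is still reachable after the block, but its `closed` attribute shows that the with statement released it despite the exception.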

user1016274

The OS only answer

It's as simple as:

import os  # os is the only module you'll need

l = os.listdir('text folder')  # Names of all entries in the folder

text = ""  # Main string
for file in l:
  if file.endswith(".txt"):  # Check file extension
    with open(f'text folder/{file}', "r") as t:
      text += t.read() + "\n"  # Read the file, then add a separating newline

with open('out.txt', "w") as out:  # Output goes outside 'text folder'
  out.write(text)

l is the list of entry names in the folder. text is the full text from all the .txt files, with a newline added after each file. We use os.listdir() to retrieve the names of all items in a folder.
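
Two details of os.listdir() are easy to miss: it returns bare names rather than full paths, and the order of the names is arbitrary. The sketch below wraps it in a small helper, `list_txt_files`, which is a hypothetical name and not part of the answer above. It sorts the names, joins them to the folder path, and skips directories that happen to end in .txt.

```python
import os


def list_txt_files(folder):
    """Return the full paths of the .txt files in `folder`, sorted by name."""
    paths = []
    for name in sorted(os.listdir(folder)):  # listdir order is arbitrary
        full_path = os.path.join(folder, name)  # listdir returns bare names
        if name.endswith(".txt") and os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

Sorting makes the order of the concatenated files reproducible between runs, which matters if the file names carry a sequence.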

Blue Robin

You can use the glob module to get all the text files in a given folder. Then iterate over the files and gather their contents into a list. Finally, write the contents to the output file, separated by newlines, using the str.join() method.

Here is an example:

from glob import glob


def main():
    txt_files = glob("folder/*.txt")

    contents = []
    for file in txt_files:
        with open(file) as f_in:
            contents.append(f_in.read())

    with open("out_file.txt", mode="w") as f_out:
        f_out.write("\n".join(contents))


if __name__ == "__main__":
    main()
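
To see what the join in the example above produces, here is a small sketch with made-up strings standing in for file contents. It shows that the separator goes only between items, so a file that already ends with a newline is followed by an empty line, while one that does not is simply followed by a line break.

```python
# Stand-ins for the contents of three files (illustrative strings only)
contents = ["first file\n", "second file", "third file\n"]

merged = "\n".join(contents)
print(repr(merged))
# 'first file\n\nsecond file\nthird file\n'
```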

If you have lots of files and/or the files are huge, consider using a lazy version to avoid exhausting memory:

from glob import glob


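def gen_contents(txt_files: list[str]):
    for file in txt_files:
        with open(file) as f_in:
            # Iterate line by line instead of loading the whole file
            yield from f_in
        yield "\n"


def main():
    # Skip the output file in case it is left over from a previous run
    txt_files = [f for f in glob("*.txt") if f != "result.txt"]

    with open("result.txt", mode="w") as f_out:
        contents = gen_contents(txt_files)
        f_out.writelines(contents)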


if __name__ == "__main__":
    main()
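
As a variation on the lazy approach, the sketch below streams each file line by line, skips the output file if it sits in the same folder, and makes sure every file ends with a line break before the blank separator line is written. The function name `concatenate` and its parameters `folder` and `out_path` are assumptions for illustration, and treating "separated by a new line" as one blank line is one possible reading of the question.

```python
import glob
import os


def concatenate(folder, out_path):
    """Stream every .txt file in `folder` into `out_path`, one blank line apart."""
    out_abs = os.path.abspath(out_path)
    paths = sorted(glob.glob(os.path.join(folder, "*.txt")))

    with open(out_path, "w") as f_out:
        for path in paths:
            if os.path.abspath(path) == out_abs:
                continue  # never read the file that is being written
            last_line = "\n"
            with open(path) as f_in:
                for line in f_in:  # one line in memory at a time
                    f_out.write(line)
                    last_line = line
            if not last_line.endswith("\n"):
                f_out.write("\n")  # finish an unterminated last line
            f_out.write("\n")  # blank separator line
```

A call such as `concatenate("folder", "out_file.txt")` would then write the merged text; running it twice gives the same result because the output file is excluded from its own input.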
Louis Lac