Is it possible to dump the contents of a text file into a Python list?

Question

I have a directory of 50 txt files. I want to combine the contents of each file into a Python list.

Each file looks like this:

line1
line2
line3

I am putting the files / file path into a list with this code. I just need to loop through file_list and append the content of each txt file to a list.

from pathlib import Path


def searching_all_files(dirpath=Path(r'C:\num')):
    # Recursively collect every file under dirpath
    assert dirpath.is_dir()
    file_list = []
    for x in dirpath.iterdir():
        if x.is_file():
            file_list.append(x)
        elif x.is_dir():
            # Recurse into the subdirectory (needs the dirpath parameter)
            file_list.extend(searching_all_files(x))
    return file_list

But I am unsure of the best method.

Maybe loop something close to this?

NOTE: NOT REAL CODE!!!! JUST A THOUGHT PULLED FROM THE AIR. THE QUESTION ISNT HOW TO FIX THIS. I AM JUST SHOWING THIS AS A THOUGHT. ALL METHODS WELCOME.

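file_path = Path(r'.....')

# Hypothetical wrapper name, added only so the return statement is valid
def read_stripped_lines(file_path):
    with open(file_path) as f:
        source_path = f.read().splitlines()
    source_nospaces = [x.strip(' ') for x in source_path]
    return source_nospaces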

As a slight shortcut, you can use `.readlines()` instead of `.read().splitlines()`. — John Gordon, Feb 17 '23 at 23:08
Are you trying to build a search engine for your files? Are you only looking for text? Using getChunk will load in blocks of data which can be searched. — Golden Lion, Feb 17 '23 at 23:08
I just need to put the contents of these files into a list to save time. That's it. — uncrayon, Feb 17 '23 at 23:11
See https://stackoverflow.com/a/45172387 for how you can use `iglob(.., recursive=True)` to get all files returned automagically without having to handle them yourself. Also, `.readlines()` and `.read().splitlines()` will behave differently; the first will include the newline at the end of each line, while the latter won't. You can use `list.extend` with the returned value from `f.read().splitlines()` to append the content of each file to your main list. — MatsLindh, Feb 17 '23 at 23:11
"As a slight shortcut, you can use .readlines() instead of .read().splitlines()" A shortcut to what? That snippet was me spitballing. — uncrayon, Feb 17 '23 at 23:12
If you're just trying to combine a set of files by concatenation, you can do that from a command line with no programming required. Is that your task? — Tim Roberts, Feb 17 '23 at 23:16
@Tim Roberts no I don't want to do `copy *.txt newfile.txt` see op. — uncrayon, Feb 17 '23 at 23:23
If you're on Windows, you can do `copy a.txt+b.txt+c.txt+d.txt out.txt`. On Linux, you can use `cat` to do the same function. Why wouldn't you want the easiest method that solves your problem? — Tim Roberts, Feb 17 '23 at 23:41
Tim I literally said I didn't want to do that. See my code snippet for a more succinct way btw. I solved the problem in OP... — uncrayon, Feb 28 '23 at 19:18

score 3 · Answer 1 · answered Feb 17 '23 at 23:48

You could use `Path.rglob` to search a directory recursively for all files and `readlines()` to append their contents to a list:

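from pathlib import Path

files = Path('/tmp/text').rglob('*.txt')
res = []
for file in files:
    # A context manager closes each file once it has been read
    with open(file) as f:
        res += f.readlines()  # each line keeps its trailing '\n'
print(res)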

Out:

['file_content2\n', 'file_content3\n', 'file_content1\n']
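If the trailing newlines are unwanted, the `read().splitlines()` plus `list.extend` approach suggested in the comments can be sketched as follows. This is a minimal sketch, not the answer's original code: the function name `collect_lines` and its `pattern` parameter are hypothetical, and the files are assumed to be readable with the platform's default text encoding.

```python
from pathlib import Path


def collect_lines(root, pattern='*.txt'):
    """Return the lines of every matching file under root as one flat list.

    collect_lines and pattern are illustrative names, not part of pathlib.
    """
    lines = []
    # sorted() gives a repeatable file order; rglob's own order depends on the file system
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            # read_text() opens and closes the file; splitlines() drops the '\n'
            lines.extend(path.read_text().splitlines())
    return lines
```

Calling `collect_lines('/tmp/text')` would then return one list with a string per line and no newline characters.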

Maurice Meyer

You forgot the `r`. This code will result in the path having too many `\`. For anyone reading, pathlib also gives a horribly misleading error that makes it sound like something else is happening. Always use `Path(r'path...')` – uncrayon Feb 28 '23 at 19:14
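For anyone weighing this comment, here is a small sketch of what the `r` prefix actually changes. It only matters when the string literal contains backslashes, so a forward-slash path such as `'/tmp/text'` is identical with or without it. The `C:\num` path is the one from the question; the variable names are illustrative.

```python
plain = 'C:\num'   # without r, '\n' is parsed as a single newline character
raw = r'C:\num'    # raw string: the backslash stays a literal backslash

print(len(plain), len(raw))          # 5 6
print('\n' in plain, '\n' in raw)    # True False
print(r'/tmp/text' == '/tmp/text')   # True: no backslashes, so r changes nothing
```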
