
I'm trying to iterate over all directories in a directory and find all the .html files there. So far I have this code:

import os

from bs4 import BeautifulSoup


def find_path():
    """

    :return: List
    """
    paths = []
    for filename in os.listdir(DIRECTORY):
        if filename.endswith('.html'):
            fname = os.path.join(DIRECTORY, filename)
            with open(fname, 'r') as f:
                soup = BeautifulSoup(f.read(), 'html.parser')
                path = soup.select_one('#tree > li > span').contents[-1]
                paths.append(path)
    return paths

But it only works if all the .html files are in one directory. What I need is to iterate over all the .html files in this directory and save what I extract, but every directory inside that directory also contains .html files that I need access to. So ideally, I need to open all of these subdirectories of my parent directory as well and save whatever I need from their .html files. Is there a way to do it?

Thanks!


2 Answers


os.walk() can help you here:

import os


def find_path(dir_):
    for root, folders, names in os.walk(dir_):
        for name in names:
            if name.endswith(".html"):
                # The full path to the file is os.path.join(root, name)
                # Your code
                pass

  • thank you. When I try to do it like this, the fname variable I'm creating gives me the full path, and I get an error: No such file or directory: .... But the files are there. Would you know how to resolve this? – acolyter11 May 27 '22 at 08:22
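
A likely cause of that error is building fname by joining the file name with the top-level directory instead of root, so files sitting in subdirectories end up pointing at paths that don't exist. As a rough sketch (not part of the original answer), this combines os.walk() with the parsing code from the question (the #tree > li > span selector comes from there; the directory argument stands in for the question's DIRECTORY constant):

import os

from bs4 import BeautifulSoup


def find_path(directory):
    paths = []
    for root, _folders, names in os.walk(directory):
        for name in names:
            if name.endswith('.html'):
                # Join with root (the folder currently being walked),
                # not the top-level directory, so nested files resolve.
                fname = os.path.join(root, name)
                with open(fname, 'r') as f:
                    soup = BeautifulSoup(f.read(), 'html.parser')
                    node = soup.select_one('#tree > li > span')
                    if node is not None:
                        paths.append(node.contents[-1])
    return paths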

You can use the sample snippet below; both #1 and #2 work:

import os

path = "."
for (root, dirs, files) in os.walk(path, topdown=True):
    for file in files:
        if file.endswith(".html"):
            print(root + "/" + file)          #1 plain string concatenation
            print(os.path.join(root, file))   #2 portable path join
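
As a side note (not from the original answers), pathlib can do the same recursive search with less code; a minimal sketch assuming the same current-directory starting point:

from pathlib import Path

# rglob("*.html") descends into every subdirectory below the starting path
for html_file in Path(".").rglob("*.html"):
    print(html_file)  # a pathlib.Path object; str(html_file) gives the path string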