os.walk searching improvement

Question

I'm using os.walk to search for an HTML file, but it takes 2 minutes to return the file. Any suggestions to improve the performance?

path1 = "//ggco/kk"
shp_list = []
for dirpath, dirnames, y in os.walk(path1):
    for k in y:
        k.startswith(Lot_Operation_combine) and k.endswith(".html")
        fullpath = os.path.join(dirpath, k)
        shp_list.append(fullpath)

        if os.path.isfile(fullpath):
            path = "//ggco/kk"
            shp_list = []
            for dirpath, dirnames, x in os.walk(path):
                for f in x:
                    if f.startswith(Lot_Operation_combine1) and f.endswith(".html"):
                        fullpath = os.path.join(dirpath, f)
                        shp_list.append(fullpath)

                        if os.path.isfile(fullpath):
                            with open(fullpath, 'r')as f:
                                f_contens = f.read()
                                print(f_contens)

                            kau = f_contens
                            context = {"Final": kau}
                            return render(request, 'Output.html', context)

        else:

            path = "//ggco/kk"
            shp_list = []
            for dirpath, dirnames, x in os.walk(path):

                for f in x:
                    if f.startswith(Lot_Operation_1A) and f.endswith(".html"):
                        fullpath = os.path.join(dirpath, f)
                        shp_list.append(fullpath)

                        if os.path.isfile(fullpath):
                            with open(fullpath, 'r')as f:
                                f_contens = f.read()

                                print(f_contens)

                            kau = f_contens
                            context = {

                                "Final": kau
                            }
                            return render(request, 'Output.html', context)

I'm new to the Python programming language.

Do you have any idea how to use os.walk to search for one specific file with better performance?

I hope you can share some ideas for this problem.

Thank you.

So you mean recursively find the specific file? Check this: [find files recursively in Python?](https://stackoverflow.com/a/2186565/7428855) — Ian, May 23 '17 at 07:24
This code is slow because, for each file found, you walk through all the paths again. That results in O(N^2) complexity, where N is the number of files under the path, and it also prints the same data many times. — Pavan, May 23 '17 at 07:34
I think you can optimize your code a lot; you have multiple checks. Try to get rid of all of them. — NMN, May 23 '17 at 07:36
You have multiple walks; that's the problem. Do you need all those checks? Give an overview of where you need to check whether a file exists and then search the whole path for another. — Pavan, May 23 '17 at 07:38
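
To illustrate the point in the comments above, here is a minimal sketch of a single walk that handles several prefixes at once instead of starting a new walk inside the loop. It assumes the intent is "prefer one prefix and fall back to another", which the question does not state explicitly. The function name `find_by_priority` and its parameters are hypothetical, not part of the original code.

```python
import os


def find_by_priority(root, prefixes, suffix=".html"):
    """Walk `root` once; return the path whose name matches the
    earliest prefix in `prefixes`, or None if nothing matches."""
    best = {}  # rank of prefix -> first matching path seen for it
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            for rank, prefix in enumerate(prefixes):
                if name.startswith(prefix):
                    # Remember only the first hit for each prefix
                    best.setdefault(rank, os.path.join(dirpath, name))
                    break
        if 0 in best:
            # The most preferred prefix was found; stop walking early
            break
    return best[min(best)] if best else None
```

With this shape the tree is visited at most once, so the cost grows with the number of files rather than with its square. A call might look like `find_by_priority("//ggco/kk", [Lot_Operation_combine1, Lot_Operation_1A])`, using the variables from the question.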

NMN · Answer 1 · 2017-05-23T08:30:13.080

I think you can optimize your code a lot; you have multiple redundant checks. Try to get rid of all of them.

Just keep this part of the code; it will work for you:

import os

path = "//ggco/kk"
shp_list = []
context = {}
for dirpath, dirnames, x in os.walk(path):
    for f in x:
        # Keep only HTML files whose name starts with the wanted prefix
        if f.startswith(Lot_Operation_combine1) and f.endswith(".html"):
            fullpath = os.path.join(dirpath, f)
            shp_list.append(fullpath)
            # Use a separate name for the file handle so it does not shadow the loop variable f
            with open(fullpath, 'r') as fh:
                f_contens = fh.read()
            print(f_contens)
            kau = f_contens
            context = {"Final": kau}

You can add the last line wherever you want to return your data from your function.

The last line is nothing but your return statement, **return render(request, 'Output.html', context)**. You should place it at the right indentation level. As you haven't shown the full function, I didn't include the return statement. — NMN, May 23 '17 at 08:09
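
If only one file is needed, a further saving is to stop walking as soon as it is found. The following is a rough sketch under that assumption; `read_first_match` is a hypothetical helper name, and the search root and prefix are passed in rather than taken from the question's variables.

```python
import os


def read_first_match(root, prefix, suffix=".html"):
    """Return the contents of the first file under `root` whose name
    starts with `prefix` and ends with `suffix`, or None if none exists."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(prefix) and name.endswith(suffix):
                with open(os.path.join(dirpath, name), "r") as fh:
                    # Returning here ends the walk immediately
                    return fh.read()
    return None
```

Inside the view, the result could then be passed on as in the question, for example `return render(request, 'Output.html', {"Final": contents})` when `contents` is not None. On a network share such as `//ggco/kk` the walk itself may still dominate the time, so narrowing the starting directory, if that is possible, is likely to help as well.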
