The following lists all files with the .txt extension in the current directory and its subdirectories:

import os
from glob import glob
L = [txt for f in os.walk('.')
            for txt in glob(os.path.join(f[0], '*.txt'))]

I want to skip one specific directory and all of its subdirectories. Say I do not want to descend into folder3 or any of its subdirectories when collecting the .txt files. I tried the following:

d = list(filter(lambda x : x != 'folder3', next(os.walk('.'))[1]))

but I cannot figure out the next steps. How can I make the two work together?

EDIT:

I tried following the link suggested as a duplicate, but I cannot get the desired output with the code below. Surprisingly, a comes out as an empty list:

a=[]
for root, dirs, files in os.walk('.'):
  dirs[:] = list(filter(lambda x : x != 'folder3', dirs)) 
  for txt in glob(os.path.join(file[0], '*.txt')): 
      a.append(txt)
RonyA
  • Possible duplicate of [Excluding directories in os.walk](https://stackoverflow.com/questions/19859840/excluding-directories-in-os-walk) – Isma Aug 21 '18 at 18:27
  • It is kind of a duplicate, but it does not seem to serve my purpose given my requirement and the code above. Let me try it, though. – RonyA Aug 21 '18 at 18:31
  • @Isma The link you referred to is not helping me with my code. I need help. – RonyA Aug 21 '18 at 18:44
  • `a=[] for root, dirs, files in os.walk('.'): dirs[:] = list(filter(lambda x : x != 'folder3', dirs)) for pdf in glob(os.path.join(file[0], '*.txt')): a.append(txt)` `a` is an empty list, even though other folders are available. @Isma – RonyA Aug 21 '18 at 18:47
  • @Isma I think `for pdf in glob(os.path.join(file[0], '*.txt')` needs to be fixed so that it avoids descending into `folder3` – RonyA Aug 21 '18 at 19:00
  • @Isma Can you help me draft it? I am not sure how to adapt the code in my question to the approach in the linked answer. – RonyA Aug 22 '18 at 00:42
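
For reference, the attempt in the question can probably be repaired by globbing in `root`, the directory `os.walk` is currently visiting, instead of `file[0]`, which the loop never defines. The sketch below assumes that reading of the problem; the function name `txt_files_skipping` and its parameters are made up for illustration.

```python
import os
from glob import glob


def txt_files_skipping(top, skip_dir):
    """Collect *.txt paths under `top`, never descending into `skip_dir`."""
    found = []
    for root, dirs, files in os.walk(top):
        # Slice assignment edits the very list os.walk holds, so the
        # skipped directory (and everything below it) is never visited.
        dirs[:] = [d for d in dirs if d != skip_dir]
        # Glob inside the directory currently being visited.
        found.extend(glob(os.path.join(root, '*.txt')))
    return found
```

Calling `txt_files_skipping('.', 'folder3')` would then return the paths that the original list comprehension produced, minus anything under a directory named `folder3`.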

1 Answer

The following solution seems to work: any directory named in the exclude set is skipped, and only files whose extension is in the extensions set are listed.

import os

exclude = {'folder3'}
extensions = {'.txt', '.dat'}
for root, dirs, files in os.walk('c:/temp/folder', topdown=True):
    dirs[:] = [d for d in dirs if d not in exclude]
    files = [file for file in files if os.path.splitext(file)[1] in extensions]
    for fname in files:
        print(fname)

This code relies on topdown=True (the default) so that the list of directory names can be modified in place, as specified in the docs:

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search
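
The loop above prints bare file names only. If full paths are needed, the same pruning can be wrapped in a generator. This is a minimal sketch; `find_files` and its parameter names are illustrative and not part of `os`.

```python
import os


def find_files(top, extensions, exclude_dirs):
    """Yield full paths of files under `top` whose extension is in `extensions`.

    Directories whose name appears in `exclude_dirs` are pruned, so nothing
    below them is visited.
    """
    for root, dirs, files in os.walk(top, topdown=True):
        # In-place slice assignment is what makes os.walk skip these dirs.
        dirs[:] = [d for d in dirs if d not in exclude_dirs]
        for fname in files:
            if os.path.splitext(fname)[1] in extensions:
                yield os.path.join(root, fname)
```

For the question's case this could be called as `list(find_files('.', {'.txt'}, {'folder3'}))`. Note that matching is by directory name, so a `folder3` at any depth is skipped.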

Isma
  • What is `splitext`? Can it be used for other file types? – RonyA Aug 22 '18 at 09:16
  • That's to get the extension. Check the last edit, you can now specify multiple extensions. – Isma Aug 22 '18 at 09:35
  • The documentation for [`os.walk`](https://docs.python.org/3/library/os.html#os.walk) explicitly says that if `topdown=True` (which is the default), then modifying the `dirs` list achieves what the OP wants. You could probably add a quote from the docs to clarify that this is not an accident... – Giacomo Alzetta Aug 22 '18 at 09:40
  • Thanks for the elaborate answer. I also found this working, from a recent Stack Overflow post: `for root, dirs, files in os.walk('.'): if "folder3" not in root: for file in files: if file.endswith(".txt"): print (os.path.join(root, file))`. The answer you provided is very helpful if we have to ignore multiple directories. – RonyA Aug 22 '18 at 09:42
  • Yes, that is the intention: my code is in `Desktop`, and inside it I have multiple folders with their respective subfolders. Let me give that a try. – RonyA Aug 22 '18 at 09:46
  • @Isma It traverses the folders and subfolders inside the current directory. It can be wrapped in a function so that the file extension can be passed in as needed. I found it easy to use without much code; please test it on your end as well. Your time and answer are much appreciated, and I am marking it as the solution. – RonyA Aug 22 '18 at 09:49
  • For the record, this solution only matches directories by name. It doesn't work if the path you want to exclude is e.g. `folder4/folder41` while you still want to include the rest of `folder4`. This ofc wasn't required by the original asker. – Carolus Mar 30 '21 at 09:41
  • For my purposes, one could replace the sixth line with `dirs[:] = [d for d in dirs if os.path.join(root, d) not in exclude]`; `exclude` should then contain absolute paths instead of relative ones. – Carolus Mar 30 '21 at 10:08
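
The path-based exclusion described in the last two comments could look roughly like this. It is a variant of that suggestion, not the same code: it assumes the excluded paths are given relative to the top directory and normalises them before comparing. `walk_excluding_paths` is a hypothetical helper name.

```python
import os


def walk_excluding_paths(top, exclude_paths):
    """Yield (root, files) under `top`, pruning directories listed by path.

    `exclude_paths` holds paths relative to `top`, e.g. 'folder4/folder41'.
    """
    # Normalise separators so 'folder4/folder41' also matches on Windows.
    excluded = {os.path.normpath(p) for p in exclude_paths}
    for root, dirs, files in os.walk(top):
        rel_root = os.path.relpath(root, top)
        # Compare each child's path relative to `top`, not just its name.
        dirs[:] = [
            d for d in dirs
            if os.path.normpath(os.path.join(rel_root, d)) not in excluded
        ]
        yield root, files
```

With `exclude_paths=['folder4/folder41']`, the rest of `folder4` is still walked, and a `folder41` located under some other parent is unaffected.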