I have the path to one of a couple of data files, let us say data01.txt, data02.txt, and so on. During processing, the user will provide mask files for the data (potentially also via an external tool). Mask file names will contain the string 'mask', e.g., data01-mask.txt.

from pathlib import Path
p = Path(r'C:\Windowns\test\data01.txt')
dircontent = list(p.parent.glob('*'))

This gives me a list of all the file paths as Path objects, including potential masks. Now I want a list of the directory content that excludes any file whose name contains 'mask'. I tried a fancy regex-like pattern, *![mask]*, but I could not get it to work.

Using,

dircontentstr = [str(elem) for elem in dircontent]
filtereddir = [elem for elem in dircontentstr if elem.find('mask') == -1]

I can get the desired result, but it seems silly to then convert back to Path elements. Is there a straightforward way to exclude the mask files from the directory list?

Matthias Arras

1 Answer

There is no need to convert anything to strings here, as Path objects have helpful attributes that you can use to filter on. Take a look at the .name and .stem attributes; these let you filter path objects on the base filename (where .stem is the base name without extension):

dircontent = [path for path in p.parent.glob('*') if 'mask' not in path.stem]
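To see the filter in action without touching real data, here is a minimal sketch that builds a throwaway directory and applies the same `'mask' not in path.stem` test. The helper name `non_mask_files` and the temporary files are assumptions made for illustration only; they are not part of the original question.

```python
import tempfile
from pathlib import Path


def non_mask_files(directory):
    """Return the sorted entries of `directory` whose stem lacks 'mask'.

    Hypothetical helper, named here only for the example.
    """
    return sorted(
        path for path in Path(directory).glob('*')
        if 'mask' not in path.stem
    )


# Build a scratch directory that mimics the layout from the question.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('data01.txt', 'data01-mask.txt', 'data02.txt'):
        (Path(tmp) / name).touch()

    # The results are still Path objects; .name is used only for display.
    kept = [path.name for path in non_mask_files(tmp)]
    print(kept)  # ['data01.txt', 'data02.txt']

# .stem drops only the final extension, .name keeps it.
print(Path('data01-mask.txt').stem)  # data01-mask
print(Path('data01-mask.txt').name)  # data01-mask.txt
```

Note that the test matches 'mask' anywhere in the stem, so a data file named, say, `unmasked.txt` would also be excluded; a stricter check such as `path.stem.endswith('-mask')` may suit some naming schemes better.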
Martijn Pieters
  • Given the updated question, dircontent = [path for path in x if path.name.find('mask') == -1] works. Thanks. – Matthias Arras Feb 14 '19 at 15:35
  • @MatthiasArras: I've updated my answer to fit your question better. Don't use `.find()`, use `'mask' in ...`. – Martijn Pieters Feb 14 '19 at 16:15
  • @MartijnPieters: Is it possible to add more conditions to the `if`? For example, I want to exclude 'MASK' and also BAK, but include TXT, etc. – FMFF Apr 19 '22 at 21:52
  • @FMFF: it's just a boolean expression. `BAK` and `TXT` are extensions, presumably, so you want to use the [`.suffix` attribute](https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix). E.g. `if 'MASK' not in path.stem and path.suffix == '.TXT'` would include files ending in `.TXT` that do not have `MASK` in their file stem, etc. (note that files that end in `.TXT` can't also end in `.BAK`). – Martijn Pieters Apr 20 '22 at 12:19
  • @FMFF: note that if you are looking for files with a specific, single extension, then it is easier to let the `glob()` call limit the files, e.g. `p.parent.glob('*.TXT')` would only list files ending with `.TXT`. – Martijn Pieters Apr 20 '22 at 12:20
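The combined conditions discussed in the comments above can be sketched as a small predicate. The function name `keep`, its parameters, and the sample file names are hypothetical; the case-insensitive comparison is also an assumption that goes beyond the comments, added because the examples mix `mask`/`MASK` and `.txt`/`.TXT`.

```python
from pathlib import Path


def keep(path, banned_word='mask', allowed_suffixes=('.txt',)):
    """Hypothetical predicate: True if `path` should stay in the listing.

    Assumption: matching ignores case, so 'MASK' and '.TXT' are treated
    like 'mask' and '.txt'.
    """
    return (
        banned_word.lower() not in path.stem.lower()
        and path.suffix.lower() in allowed_suffixes
    )


# Sample names only; no files need to exist, since .stem and .suffix
# are computed from the path string alone.
names = ['data01.TXT', 'data01-MASK.TXT', 'data02.BAK', 'data03.txt']
selected = [name for name in names if keep(Path(name))]
print(selected)  # ['data01.TXT', 'data03.txt']
```

With a real directory, the same predicate would slot into the comprehension as `[path for path in p.parent.glob('*') if keep(path)]`. Allowing several extensions is then a matter of passing, for example, `allowed_suffixes=('.txt', '.csv')`.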