
How would I go about filtering out files whose names include a certain string?

Say I'm filtering file paths with the following regex:

"/.xml$|.json$/i"

I'd like to ignore files that are xml but include _unwanted_string in the name.

For example:

file_name_unwanted_string.xml

SOMEOTHERNAME_unwanted_string.xml

NAME3_unwanted_string.xml

NAME3_unwanted_string_MORE_TEXT.xml

None of these should be matched by the regex.

Is this possible with regex?

Valu3
  • But `/^(?!file_name_unwanted_string\.xml$).+\.(?:xml|json)$/i` is specific for `file_name_unwanted_string.xml` and not any other file ending with `_unwanted_string.xml`, so no – Valu3 Mar 03 '21 at 14:25
  • @trincot no it does not – Valu3 Mar 03 '21 at 14:28
  • Why not? Did you try using the negative lookahead from that reference? I don't see it in your attempt, yet that is the solution. The comment by anubhava has the right idea; just leave out `file`, as that is not part of the string you want to exclude. – trincot Mar 03 '21 at 14:36
  • Use `/^(?!.*_unwanted_string\.xml$).+\.(?:xml|json)$/i` – anubhava Mar 03 '21 at 14:51
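The negative lookahead suggested in the comments can be tried out with a short Python sketch. The first pattern is the one from the comment, written for Python's `re` module. The second is an assumed variant, not taken from the thread, that adds `.*` after `_unwanted_string` so names like `NAME3_unwanted_string_MORE_TEXT.xml` are rejected too. The names `good_file.xml`, `data_unwanted_string.json` and `notes.txt` are hypothetical test inputs.

```python
import re

# Pattern from the comment: rejects names that END in "_unwanted_string.xml".
ENDS_WITH = re.compile(
    r"^(?!.*_unwanted_string\.xml$).+\.(?:xml|json)$", re.IGNORECASE
)

# Assumed variant: the extra ".*" also rejects xml names that have more text
# between "_unwanted_string" and the ".xml" extension.
ANYWHERE = re.compile(
    r"^(?!.*_unwanted_string.*\.xml$).+\.(?:xml|json)$", re.IGNORECASE
)

names = [
    "file_name_unwanted_string.xml",
    "SOMEOTHERNAME_unwanted_string.xml",
    "NAME3_unwanted_string.xml",
    "NAME3_unwanted_string_MORE_TEXT.xml",
    "good_file.xml",              # hypothetical: should be kept
    "data_unwanted_string.json",  # hypothetical: json is not excluded
    "notes.txt",                  # hypothetical: wrong extension
]

kept_ends_with = [n for n in names if ENDS_WITH.match(n)]
kept_anywhere = [n for n in names if ANYWHERE.match(n)]

print(kept_ends_with)
print(kept_anywhere)
```

With these inputs, only the second pattern drops `NAME3_unwanted_string_MORE_TEXT.xml`, which is the behaviour the question asks for.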

1 Answer


The question does not say which environment or language you are using.
Assuming you have access to a Python console, you could run something like this:

import re
import os

l = ( "file_name_unwanted_string.xml",
       "SOMEOTHERNAME_unwanted_string.xml",
       "NAME3_unwanted_string.xml",
       "NAME3_unwanted_string_MORE_TEXT.xml",
       "good_file.xml",
       "bad_unwanted_string_good.xml",
)
# or, assuming the current directory has the files you are interested in
# l = os.listdir()

# keep only the names that do NOT contain "_unwanted_string" before a final ".xml"
kept = [i for i in l if not re.search(r"_unwanted_string.*[.]xml$", i)]
print(kept)
# ['good_file.xml']

You could of course replace l with the result of os.listdir().
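The filter above only drops the unwanted names; it does not also restrict the result to xml and json files as the question's regex does. A minimal sketch that does both without a regex, assuming plain suffix and substring checks are acceptable, might look like this. The helper names `wanted` and `list_wanted` are hypothetical.

```python
from pathlib import Path


def wanted(name, unwanted="_unwanted_string"):
    """Return True for json files and for xml files without `unwanted` in the name."""
    suffix = Path(name).suffix.lower()  # case-insensitive, like the /i flag
    if suffix == ".json":
        return True
    if suffix == ".xml":
        return unwanted.lower() not in name.lower()
    return False


def list_wanted(directory="."):
    """List the file names in `directory` that pass the `wanted` check."""
    return sorted(
        p.name for p in Path(directory).iterdir() if p.is_file() and wanted(p.name)
    )


if __name__ == "__main__":
    print(list_wanted())
```

Splitting the check into two plain conditions trades the single pattern for readability; the lookahead regex from the comments remains the option when only one expression can be supplied.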

Leonardo Maffei