
Update: How do I return a string from a regex match in python? - this question seems similar and I've updated accordingly.


I'm trying to open a file to write to it, but I don't know the case of its name, so I'm trying to match its filename case-insensitively from its path.

Currently, with print(my_file), I see all the files in folder_path, but file_match returns None. I also tried re.search instead of re.match, and it returns None as well.

import re
from pathlib import Path as p
cwd = p.cwd()

folder_path = p(cwd / 'somefolder')

for my_file in folder_path.iterdir():
    print(my_file)
    file_match = re.search('.+/test.txt', str(my_file), re.IGNORECASE)
    print(file_match) # Gives me <re.Match object; span=(0, 97), match='/the/path>

print(file_match) # gives me None

I'm hoping to match any of the following, for example:

  • /cwd/somefolder/test.txt
  • /cwd/somefolder/TEst.Txt
  • /cwd/SOMEFolder/teSt.TXT

but not this:

  • /cwd/somefolder/NotTest.txt
  • /cwd/somefolder/Other.mp3

Ultimately, I want a variable that holds the path to /cwd/somefolder/test.txt, in whatever case it happens to exist. That way I can open and write to that file.

Here's the regular expression I'm using, which appears to be working: https://regex101.com/r/hz5qNe/2

Note: there is only one test.txt in each folder I'm searching, but I don't know its case. The filename is always the same, but the case can differ depending on the current folder_path iteration. (I'm looping to find each folder's test.txt path, but I left that out of the code above to pare it down to its simplest form.)

Any help would be appreciated, thanks!

Natetronn
  • Your example looks correct, so your issue is probably somewhere else. I just used your code with your regex and it returned the expected output. – exhuma Sep 04 '20 at 06:10
  • @exhuma thanks! Yeah, I'm not sure why it's not working. You may be right, could be something else affecting it. I'll keep digging. – Natetronn Sep 04 '20 at 06:22
  • @exhuma this looks like a duplicate, but I haven't been able to get file_match.group(0) to work: https://stackoverflow.com/questions/18493677/how-do-i-return-a-string-from-a-regex-match-in-python - Ultimately, I want a variable that has the path to /cwd/somefolder/test.txt (any case of test.txt, depending on whatever case it exists as.) – Natetronn Sep 04 '20 at 14:46
  • Look at my solution below. Using `my_file.parent` will give you that path. If you really, absolutely, positively, must use regexes, I can try to update the solution. But for this case, you are *already* using `pathlib` so I would use that for more maintainable code. – exhuma Sep 04 '20 at 14:49
  • @exhuma Thanks, I think I got it working. `if file_match: print(file_match.group())` - In regards to your answer, does it account for any case of test.txt? I'll try it... – Natetronn Sep 04 '20 at 15:01
  • OT: `cd` usually stands for “*change* dir”. Making it stand for the working dir is quite confusing. Maybe `wd` or `cwd` would be better here. – Konrad Rudolph Sep 04 '20 at 15:26
  • It's really not that confusing lol. But I see your point. Mostly it's demoing the fact that I was using pathlib. I'll update it so it's more clear. – Natetronn Sep 04 '20 at 15:35
  • @Natetronn it actually *is* quite confusing. Now that you have posted your own answer, I finally understand what you actually *wanted* to do. Your code & variable names were misleading. It is always good to post the "expected result" in a question, which you didn't. That surely explains why it took so long to get an answer. – exhuma Sep 04 '20 at 19:45

2 Answers


I would suggest dropping re and using the following test:

if my_file.name.lower() == 'test.txt':
    ...

This should also make the code easier to understand and maintain.

If you then need the containing folder, you can just use the my_file.parent property.
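As a minimal sketch of that test in context: the helper name find_case_insensitive is made up for illustration, and the temporary directory only stands in for somefolder so the snippet can run anywhere.

```python
import tempfile
from pathlib import Path


def find_case_insensitive(folder, filename):
    """Return the first entry in folder whose name equals filename, ignoring case, else None."""
    wanted = filename.lower()
    for entry in folder.iterdir():
        if entry.name.lower() == wanted:
            return entry
    return None


# Throwaway directory standing in for 'somefolder' (an assumption for the demo).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "TEst.Txt").write_text("hello")
    (folder / "NotTest.txt").write_text("nope")

    found = find_case_insensitive(folder, "test.txt")
    found_name = found.name               # the name exactly as stored on disk
    same_parent = found.parent == folder  # .parent is the containing folder
    missing = find_case_insensitive(folder, "other.mp3")
```

The returned Path keeps the on-disk spelling, so it can be passed straight to open() for writing. If no file matches, the helper returns None, so check for that before using the result.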

exhuma

You need to pull the matched string out of the regex Match object using .group(), like this:

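import re
from pathlib import Path as p
cwd = p.cwd()

folder_path = cwd / 'somefolder'

match = None  # stays None if nothing matches, which avoids a NameError below
for my_file in folder_path.iterdir():
    # Escape the dot and anchor the end so only a file named test.txt matches
    file_match = re.search(r'.+/test\.txt$', str(my_file), re.IGNORECASE)
    if file_match:
        match = file_match.group()
        break  # there is only one test.txt per folder

print(match) # prints: /cwd/somefolder/test.txt (or whichever case of test.txt is present)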
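If a regex is still preferred, one possible variant (a sketch, not necessarily what the original code intended) is to match only the filename with re.fullmatch, so the pattern no longer depends on the path separator. The sample paths are the ones from the question, plus one made-up name (testXtxt) to show the effect of escaping the dot.

```python
import re
from pathlib import PurePosixPath

# re.escape makes the dot literal; fullmatch requires the whole name to match,
# so names such as 'NotTest.txt' are rejected.
pattern = re.compile(re.escape("test.txt"), re.IGNORECASE)

candidates = [
    PurePosixPath("/cwd/somefolder/TEst.Txt"),
    PurePosixPath("/cwd/somefolder/NotTest.txt"),
    PurePosixPath("/cwd/somefolder/Other.mp3"),
    PurePosixPath("/cwd/somefolder/testXtxt"),  # hypothetical name, not from the question
]

# Keep only the paths whose final component is test.txt in any case.
matches = [str(path) for path in candidates if pattern.fullmatch(path.name)]
print(matches)
```

With real files, the same check can be applied to my_file.name inside the iterdir() loop.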
Natetronn