I am pretty weak in regex. I'm looking for some help with how to extract the .sav file name from the following string:

C:\Users...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav

Currently I am using this code:

re.findall(r'\\(.+).sav',txt)

but it only finds

['Users\\...\\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav was']

I'm trying to find "AutumnHi-20180531-183047-34-SystemNormal.sav"

I am using Python 3.7.

TrebledJ
Rick Zhang
  • Does this answer your question? [Extract file name from path, no matter what the os/path format](https://stackoverflow.com/questions/8384737/extract-file-name-from-path-no-matter-what-the-os-path-format) – Tomerikoo Mar 05 '21 at 13:50

5 Answers

1

You could match a backslash and then capture one or more characters that are not a backslash, using a negated character class, followed by a literal dot and sav.

A negative lookahead, (?!\S), then asserts that what is directly to the right is not a non-whitespace character, so the match has to end at whitespace or at the end of the string.

\\([^\\]+\.sav)(?!\S)

Regex demo
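As a rough sketch of how this pattern might be used from Python, the snippet below applies it with re.search to the path from the question. The variable names (txt, txt_trailing, pattern) and the trailing word " was" in the second sample are illustrative assumptions, not part of the original post.

```python
import re

# Path from the question; the "..." is an elided part of the original path.
txt = r"C:\Users...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav"

# Backslash, then a captured run of non-backslash characters ending in ".sav",
# not followed by a non-whitespace character.
pattern = re.compile(r"\\([^\\]+\.sav)(?!\S)")

match = pattern.search(txt)
filename = match.group(1) if match else None
print(filename)  # AutumnHi-20180531-183047-34-SystemNormal.sav

# Hypothetical variant: extra text after the path does not change the result.
txt_trailing = txt + " was"
match_trailing = pattern.search(txt_trailing)
filename_trailing = match_trailing.group(1) if match_trailing else None
print(filename_trailing)  # AutumnHi-20180531-183047-34-SystemNormal.sav
```

Using search with a capture group, rather than findall, makes it easy to handle the no-match case explicitly.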

The fourth bird
0

Regex101 (link):

txt = r'''C:\Users\...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\WinterLo-20180729-043047-34-SystemNormal\WinterLo-20180729-043047-34-SystemNormal.sav'''

import re

print(re.findall(r'(?<=\\)[^\\]+\.sav', txt)[0])

Prints:

WinterLo-20180729-043047-34-SystemNormal.sav

You could achieve the same without the re module:

print(txt.split('\\')[-1])
Andrej Kesely
0

The following pattern should match the filename:
(?=[^\\]*$).*\.sav

Regex101 Demo

The positive lookahead (?=[^\\]*$) asserts that no backslash occurs between the current position and the end of the string, so a match can only start after the last backslash. The rest of the pattern, .*\.sav, then matches the file name. For other details, see "EXPLANATION" on the right side of the regex101 demo.
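A minimal sketch of this pattern in Python, assuming the path from the question is held in a variable named txt (the variable names are illustrative only):

```python
import re

# Path from the question; the "..." is an elided part of the original path.
txt = r"C:\Users...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav"

# The lookahead only succeeds once no backslash remains before the end of
# the string, so the match starts right after the last backslash.
match = re.search(r"(?=[^\\]*$).*\.sav", txt)
filename = match.group(0) if match else None
print(filename)  # AutumnHi-20180531-183047-34-SystemNormal.sav
```

Because the lookahead is anchored to the end of the string, this variant assumes nothing containing a backslash follows the path.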

SanV
0

I am assuming you are not learning about regex but want to know how to handle parsing filenames.

I would use the pathlib module to handle parsing the filename.

C:\Users\barry>py -3.7
Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathlib
>>> filename = r'C:\Users\...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\WinterLo-20180729-043047-34-SystemNormal\WinterLo-20180729-043047-34-SystemNormal.sav'
>>> path = pathlib.Path(filename)
>>> path.name
'WinterLo-20180729-043047-34-SystemNormal.sav'
>>> path.parent
WindowsPath('C:/Users/.../Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20/WinterLo-20180729-043047-34-SystemNormal')
>>>
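One caveat: pathlib.Path picks its path flavour from the operating system it runs on, so the session above treats backslashes as separators because it runs on Windows. If the same string had to be parsed on Linux or macOS, a sketch along these lines using pathlib.PureWindowsPath may help; it parses Windows-style paths on any platform without touching the filesystem. The variable names are illustrative.

```python
from pathlib import PureWindowsPath

# Same path as in the session above; the "..." is an elided part of the path.
filename = r"C:\Users\...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\WinterLo-20180729-043047-34-SystemNormal\WinterLo-20180729-043047-34-SystemNormal.sav"

# PureWindowsPath always uses Windows path rules, whatever the host OS is.
path = PureWindowsPath(filename)

print(path.name)    # WinterLo-20180729-043047-34-SystemNormal.sav
print(path.stem)    # WinterLo-20180729-043047-34-SystemNormal
print(path.suffix)  # .sav
```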
Barry Scott
0

I'm guessing that one of these expressions:

[^\\]+\.sav
([^\\]+\.sav)

or a close variant of them, would extract the file name here.

Test

import re

print(re.findall(r"([^\\]+\.sav)", "C:\\Users...\\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\\AutumnHi-20180531-183047-34-SystemNormal\\AutumnHi-20180531-183047-34-SystemNormal.sav"))

Output

['AutumnHi-20180531-183047-34-SystemNormal.sav']

Demo

Emma