-2

I've been at it for ~4 hours now and have failed completely, so I humbly ask for help.

I've got a string with the following structure:

a197 8101 aaa/bbb/ccc/ddd.doc

I need a regex that will give me ddd.doc. Obviously ddd is not always ddd; it might be 'potato', might contain numerals, etc. Basically, I want a regex that will give me everything after the last '/' up to and including .doc.

Edit: \/(.*\.doc) is the closest I've got, but it returns /bbb/ccc/ddd.doc

Edit2: I'm not looking to split; maybe I misspoke. I just want to match.

3 Answers

2
import re
# Capture the run of non-slash characters after a "/" that ends in a literal ".doc".
pattern = re.compile(r"/([^/\\]+\.doc)")
print(pattern.search("a197 8101 aaa/bbb/ccc/ddd.doc").group(1))
print(pattern.search("a197 8101 aaa/bbb/ccc/potato.doc").group(1))
print(pattern.search("a197 8101 aaa/bbb/ccc/01_-2,,.3.doc").group(1))

output:

ddd.doc
potato.doc
01_-2,,.3.doc
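
One detail worth checking in a pattern of this shape is the dot before doc: left unescaped, it matches any character, so a name with no real extension could slip through. A minimal sketch of the difference, where the names loose and strict and the sample input are made up for illustration:

```python
import re

# Unescaped dot: "." accepts any single character before "doc".
loose = re.compile(r"/([^/\\]+.doc)")
# Escaped dot: only a literal "." is accepted before "doc".
strict = re.compile(r"/([^/\\]+\.doc)")

sample = "a197 8101 aaa/bbb/ccc/dddXdoc"  # made-up input with no real extension

loose_match = loose.search(sample)
strict_match = strict.search(sample)

print(loose_match.group(1) if loose_match else None)  # the loose pattern still matches
print(strict_match)                                   # the strict pattern does not
```

Both patterns behave the same on the inputs from the question; they only diverge on names like the one above.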
nulladdr
0

This should work, provided the file name consists only of word characters (letters, digits, underscore):

import re

string="a197 8101 aaa/bbb/ccc/ddd.doc"
result = re.findall(r'\w+\.\w+$', string)
print(result)
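
Two things to keep in mind with this approach, sketched below with a made-up second input: re.findall returns a list rather than a string, and \w does not match characters such as "-", so a name containing one is only partly captured.

```python
import re

pattern = r'\w+\.\w+$'

# findall returns a list of all matches, here a single-element list.
plain = re.findall(pattern, "a197 8101 aaa/bbb/ccc/ddd.doc")
print(plain)  # ['ddd.doc']

# "-" is not a word character, so only the part after it is captured.
partial = re.findall(pattern, "a197 8101 aaa/bbb/ccc/my-file.doc")
print(partial)  # ['file.doc']
```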
Eraklon
0
import re

string = "a197 8101 aaa/bbb/ccc/ddd.doc"
# parentheses form regex groups
# group 1 matches from the start of the line up to and including the last /, greedily: (^.*\/)
# group 2 matches any characters, non-greedily, until the end of the line: (.*?$)
result = re.search(r'(^.*\/)(.*?$)', string)
print(result.group(0))
print(result.group(1))
print(result.group(2))

output = result.group(2)

will yield:

a197 8101 aaa/bbb/ccc/ddd.doc
a197 8101 aaa/bbb/ccc/
ddd.doc

Note that group 0 will always be the whole match.
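
If only the file name is needed, as the question's second edit suggests, the same idea can be written without capture groups. A minimal sketch, assuming the name never contains a / and always ends in .doc (the variable names are illustrative):

```python
import re

string = "a197 8101 aaa/bbb/ccc/ddd.doc"

# [^/]+ cannot cross a "/", and $ pins the match to the end of the string,
# so the whole match is the last path component.
match = re.search(r'[^/]+\.doc$', string)
filename = match.group(0) if match else None
print(filename)  # ddd.doc
```

Here group 0 alone is the result, so no second group is required.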

WGriffing