
I am trying to extract the file name from a path that contains EOB_FILE.

For example, I have something like:

s = "path Omega/CC/Pune/SYNTT/EOB_PROCESSED_BY_OCR/EOB_FILE/0A225618045646F2AEEFC23E74CAC253/0A225618045646F2AEEFC23E74CAC253_page1.json"

How can I get only the file name, which is 0A225618045646F2AEEFC23E74CAC253_page1.json?

Code I tried:

val = re.findall(r'([^.]*EOB_FILE[^.]*)', s)
val
['path Omega/CC/Pune/SYNTT/EOB_PROCESSED_BY_OCR/EOB_FILE/0A225618045646F2AEEFC23E74CAC253/0A225618045646F2AEEFC23E74CAC253_page1']

Expected output:

0A225618045646F2AEEFC23E74CAC253_page1.json

pylearner

2 Answers

import os
s = "path Omega/CC/Pune/SYNTT/EOB_PROCESSED_BY_OCR/EOB_FILE/0A225618045646F2AEEFC23E74CAC253/0A225618045646F2AEEFC23E74CAC253_page1.json"

os.path.basename(s)

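os is Python's module for miscellaneous operating system interfaces. os.path.basename returns the final component of a path, which here is the file name including its extension. See the os.path documentation for details.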
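If the file name should only be returned when EOB_FILE is one of the directories in the path, the call above can be wrapped in a small check. This is a minimal sketch, not part of the original answer: the helper name eob_file_name is hypothetical, and it assumes the path uses "/" as its separator, as in the question.

```python
import os

s = "path Omega/CC/Pune/SYNTT/EOB_PROCESSED_BY_OCR/EOB_FILE/0A225618045646F2AEEFC23E74CAC253/0A225618045646F2AEEFC23E74CAC253_page1.json"


def eob_file_name(path):
    """Hypothetical helper: return the file name if 'EOB_FILE' is a directory in path, else None."""
    # Assumption: directories are separated by "/", as in the question's example.
    directories = os.path.dirname(path).split("/")
    if "EOB_FILE" in directories:
        return os.path.basename(path)
    return None


print(eob_file_name(s))
```

Comparing whole directory names avoids matching a longer name that merely contains the text EOB_FILE.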

Tom Ron

You can use pathlib.Path:

from pathlib import Path

Path(s).name

Output:

'0A225618045646F2AEEFC23E74CAC253_page1.json'

To check whether EOB_FILE is one of the directories in the path, you could use:

'EOB_FILE' in Path(s).parts

Or, as a plain substring check (this also matches names that merely contain EOB_FILE):

'EOB_FILE' in s

if 'EOB_FILE' in s:
    val = Path(s).name
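
For comparison, the regex in the question drops the extension because [^.]* cannot match past the first dot, so the match stops just before ".json". If a regex is still preferred, one possible pattern is sketched below. It is only an illustration: the function name file_name_after_eob is hypothetical, and the pattern assumes "/" separators and that EOB_FILE appears as a complete directory name.

```python
import re

s = "path Omega/CC/Pune/SYNTT/EOB_PROCESSED_BY_OCR/EOB_FILE/0A225618045646F2AEEFC23E74CAC253/0A225618045646F2AEEFC23E74CAC253_page1.json"

# (?:^|/)EOB_FILE/  the EOB_FILE directory as a whole path component
# (?:[^/]+/)*       any number of intermediate directories
# ([^/]+)$          the last component, captured with its extension
PATTERN = re.compile(r"(?:^|/)EOB_FILE/(?:[^/]+/)*([^/]+)$")


def file_name_after_eob(path):
    """Hypothetical helper: return the last path component below EOB_FILE, or None."""
    match = PATTERN.search(path)
    return match.group(1) if match else None


print(file_name_after_eob(s))
```

The pathlib version above is usually easier to read and maintain than this pattern.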
kederrac