
I need to split a string on the delimiter "\". The string can be in either of the following formats:

  1. file://C:\Users\xyz\filename.txt
  2. C:\Users\xyz\filename.txt

I need my script to output "filename.txt". I tried split('\\\\'), but it does not work. Which function is better to use?

Martijn Pieters
Geethanjali

3 Answers


Two issues here.

Path splitting

You'd normally use os.path.split to work with paths:

>>> import os.path
>>> p=r'C:\Users\xyz\filename.txt'
>>> head, tail = os.path.split(p)
>>> head
'C:\\Users\\xyz'
>>> tail
'filename.txt'

Caveat: os.path works with the path format of the operating system it's used on. If you know you specifically want to work with Windows paths (even when your program is run on Linux or OS X), use the ntpath module instead of os.path. See the note in the os.path documentation:

Note Since different operating systems have different path name conventions, there are several versions of this module in the standard library. The os.path module is always the path module suitable for the operating system Python is running on, and therefore usable for local paths. However, you can also import and use the individual modules if you want to manipulate a path that is always in one of the different formats. They all have the same interface:

  • posixpath for UNIX-style paths
  • ntpath for Windows paths
  • macpath for old-style MacOS paths
  • os2emxpath for OS/2 EMX paths
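As a minimal sketch of the caveat above, assuming the input is always a Windows-style path: ntpath.basename returns the last path component no matter which operating system runs the script. The helper name windows_basename is made up for illustration.

```python
import ntpath


def windows_basename(path):
    """Return the final component of a Windows-style path.

    Hypothetical helper: ntpath always applies Windows path rules,
    so this behaves the same on Linux, macOS, and Windows.
    """
    return ntpath.basename(path)


print(windows_basename(r'C:\Users\xyz\filename.txt'))  # filename.txt
```

ntpath treats both "\" and "/" as separators, so a path written with forward slashes is handled the same way.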

Format support

You have 2 formats to support:

  1. file://C:\Users\xyz\filename.txt
  2. C:\Users\xyz\filename.txt

2 is a normal Windows path, and 1 is... Frankly, I have no idea what that is. It kind of looks like a file URI, but uses Windows-style delimiters (backslashes). This is strange. When I open a PDF in Chrome on Windows the URI looks different:

file:///C:/Users/kos/Downloads/something.pdf

and I'll assume that's the format you're interested in. If not, then I can't vouch for what you're dealing with, and you'll have to make an educated guess about how to interpret it (drop the file:// prefix and treat the rest as a Windows path?).
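If that guess is right for your data, it could be sketched as follows. This is only an assumption about what the odd format means; the helper name split_odd_file_path and the PREFIX constant are made up for illustration.

```python
import ntpath

PREFIX = 'file://'  # assumption: the prefix is decoration in front of a Windows path


def split_odd_file_path(text):
    """Drop a leading 'file://' (if present) and split the rest as a Windows path."""
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    # ntpath applies Windows rules regardless of the OS running the script
    return ntpath.split(text)


head, tail = split_odd_file_path(r'file://C:\Users\xyz\filename.txt')
print(head)  # C:\Users\xyz
print(tail)  # filename.txt
```

A plain Windows path without the prefix goes through the same function unchanged.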

You can split a URI into meaningful parts using the urlparse module (urllib.parse in Python 3). Once you've extracted the path part of the URI, you can just .split('/') it (the URI grammar is simple enough to allow that). Here's what happens if you use this module on a file:// URI:

>>> import urlparse  # Python 2; on Python 3 use: from urllib import parse as urlparse
>>> r = urlparse.urlparse(r'file:///C:/Users/xyz/filename.txt')
>>> r
ParseResult(scheme='file', netloc='', path='/C:/Users/xyz/filename.txt', params='', query='', fragment='')
>>> r.path
'/C:/Users/xyz/filename.txt'
>>> r.path.lstrip('/').split('/')
['C:', 'Users', 'xyz', 'filename.txt']

Please read this URI scheme description to get a better idea of what this format looks like and why there are three slashes after file:.
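For Python 3, the same idea might look like the sketch below. It assumes a well-formed file:/// URI with forward slashes, and it additionally decodes percent-escapes such as %20, which the session above does not do. The function name filename_from_file_uri is hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_file_uri(uri):
    """Return the last path segment of a file:/// URI (hypothetical helper)."""
    parsed = urlparse(uri)
    # Decode percent-escapes, e.g. '%20' becomes a space
    path = unquote(parsed.path)
    # URI paths always use '/', so posixpath rules apply on every OS
    return posixpath.basename(path)


print(filename_from_file_uri('file:///C:/Users/xyz/filename.txt'))  # filename.txt
```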

Kos

Suppose your string is pathName; then you can use fileName = pathName.split('\\')[-1]. In a normal string literal, '\\' is a single backslash character.
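A short sketch of why the split('\\\\') from the question did not work, assuming the path contains single backslashes: the literal '\\\\' is two backslash characters, which never occur side by side in the path, so split returns the whole string as one element. The variable names are illustrative only.

```python
path = r'C:\Users\xyz\filename.txt'  # raw string: backslashes taken literally

# '\\' is one backslash character; '\\\\' is two
assert len('\\') == 1
assert len('\\\\') == 2

# Splitting on two backslashes finds no match and returns the whole string
parts_wrong = path.split('\\\\')
print(len(parts_wrong))  # 1

# Splitting on one backslash works; rsplit with maxsplit=1 only cuts once
parts = path.split('\\')
file_name = path.rsplit('\\', 1)[-1]
print(parts)      # ['C:', 'Users', 'xyz', 'filename.txt']
print(file_name)  # filename.txt
```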

barak manos

Try the following. Note the escaped backslashes (\\) inside the string literals: that is the valid way to write \ in a normal string, and it avoids the invalid \x escape error that \xyz would otherwise trigger.

>>> file = 'file://C:\\Users\\xyz\\filename.txt'
>>> file.split('\\')[-1]
'filename.txt'

>>> file = 'C:\\Users\\xyz\\filename.txt'
>>> file.split('\\')[-1]
'filename.txt'

softvar