1

I'm having an issue with the following code:

name = "epubtxt\\ursita.txt"

I want to remove the directory so that the output is ursita.txt

I am doing this:

name.lstrip('epubtxt\\')

The main problem is that I get this output:

rsita.txt

What is going wrong here?

PrinceOfCreation
Adrian
  • Possible duplicate of [Extract file name from path, no matter what the os/path format](https://stackoverflow.com/questions/8384737/extract-file-name-from-path-no-matter-what-the-os-path-format) – TemporalWolf May 20 '19 at 19:32

4 Answers

6

s1.lstrip(s2) does not strip the whole s2 from the left of s1. What it does is strip all the characters contained in s2 from the left of s1.

Examples:

'aaabbbccc'.lstrip('a') == 'bbbccc'
'aaabbbccc'.lstrip('ac') == 'bbbccc'
'aaabbbccc'.lstrip('ab') == 'ccc'

In your example, 'epubtxt\\' contains the character u, so the u after the backslash is stripped.

What you probably need is:

if name.startswith('epubtxt\\'):
    name = name[len('epubtxt\\'):]
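
On Python 3.9 or later, the same check-and-slice can probably be written with `str.removeprefix`, which removes the prefix only when the string actually starts with it. A minimal sketch, assuming Python 3.9+; the `prefix`, `stripped`, and `untouched` names are only for illustration:

```python
# Assumes Python 3.9+, where str.removeprefix was added.
name = "epubtxt\\ursita.txt"
prefix = "epubtxt\\"

# removeprefix removes the exact prefix once if it is present.
stripped = name.removeprefix(prefix)
print(stripped)  # ursita.txt

# A string that does not start with the prefix is returned unchanged.
untouched = "other\\ursita.txt".removeprefix(prefix)
print(untouched)  # other\ursita.txt
```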
Mikhail Burshteyn
  • Might want to put it in a conditional `if name.startswith(...):`. +1 regardless. – Mad Physicist May 20 '19 at 19:19
  • Python has a proper way to separate file names from paths: [`os.path.basename(path)`](https://docs.python.org/3/library/os.path.html#os.path.basename). Bonus is it will work regardless of os. – TemporalWolf May 20 '19 at 19:25
  • @TemporalWolf even better is to use `pathlib.Path` objects IMO. The `os.path` API is wordy and clunky – juanpa.arrivillaga May 21 '19 at 00:51
2

The reason the u gets stripped is not related to \u or Unicode. The problem is that the strip functions take a set of characters to strip, regardless of the order they appear in (either in the string or in the function call). In other words, in your example, the function strips any leading characters that match any of 'e', 'p', 'u', etc., and stops at the first character that doesn't match.
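
To illustrate the point about order, here is a small sketch using the string from the question. The variable names `a` and `b` are made up for the example:

```python
name = "epubtxt\\ursita.txt"

# The order and repetition of characters in the argument do not matter:
# both calls strip the same set of characters {e, p, u, b, t, x, \}.
a = name.lstrip("epubtxt\\")
b = name.lstrip("\\xtbupe")
print(a, b)  # rsita.txt rsita.txt

# Stripping stops at the first character outside the set ('r' here),
# so the later 't' and 'x' in "rsita.txt" are left alone.
```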

vlsd
2

What may be easier to do is to split the string on \. That way you won't get any false matches (as a filename can't contain a \ on Windows).

This could be done with the following code:

name = "epubtxt\\ursita.txt"
name = name.split("\\")
name = name[-1]  # use the last element of the list, which is the file name
print(name)  # ursita.txt

I've used a double backslash in the string literals because \ is an escape character, but we don't want it to escape anything. Therefore we escape the escape.

PrinceOfCreation
1

You've already been told by others why lstrip does not work. A suitable solution would be to split and take the last component:

 name.split('\\')[-1]
 # "ursita.txt"
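
As the comments on the first answer suggest, the standard library can also do this split. A minimal sketch, assuming the path always uses Windows-style separators: `pathlib.PureWindowsPath` and `ntpath.basename` apply Windows path rules on any OS, whereas plain `os.path.basename` would not split on a backslash when run on Linux or macOS. The names `from_pathlib` and `from_ntpath` are only for illustration.

```python
import ntpath
from pathlib import PureWindowsPath

name = "epubtxt\\ursita.txt"

# PureWindowsPath parses backslash-separated paths on any platform.
from_pathlib = PureWindowsPath(name).name
print(from_pathlib)  # ursita.txt

# ntpath is the Windows flavour of os.path and is importable everywhere.
from_ntpath = ntpath.basename(name)
print(from_ntpath)  # ursita.txt
```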
DYZ