7

I love Python because it comes batteries included, and I use the built-in functions a lot to do the dirty work for me.

I have always happily used the os.path module to deal with file paths, but recently I got unexpected results on Python 2.5 under Ubuntu Linux while dealing with strings that represent Windows file paths:

>>> import os
>>> filepath = r"c:\ttemp\FILEPA~1.EXE"
>>> print os.path.basename(filepath)
c:\ttemp\FILEPA~1.EXE
>>> print os.path.splitdrive(filepath)
('', 'c:\\ttemp\\FILEPA~1.EXE')

WTF ?

It ends up the same way with filepath = u"c:\ttemp\FILEPA~1.EXE" and filepath = "c:\ttemp\FILEPA~1.EXE".

Do you have a clue? Ubuntu uses UTF-8, but I don't feel like that has anything to do with it. Maybe my Python install is messed up, but I did not perform any particular tweak on it that I can remember.

Bite code

3 Answers

26

If you want to manipulate Windows paths on Linux, you should use the ntpath module. This is the module that is imported as os.path on Windows; posixpath is imported as os.path on Linux.

>>> import ntpath
>>> filepath = r"c:\ttemp\FILEPA~1.EXE"
>>> print ntpath.basename(filepath)
FILEPA~1.EXE
>>> print ntpath.splitdrive(filepath)
('c:', '\\ttemp\\FILEPA~1.EXE')
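
To make the idea a bit more concrete, here is a minimal sketch that combines ntpath.splitdrive and ntpath.split. The helper name split_windows_path is hypothetical (it is not part of the standard library); the point is only that ntpath applies Windows rules regardless of the OS the script runs on.

```python
import ntpath


def split_windows_path(path):
    """Split a Windows-style path into (drive, directory, filename).

    Hypothetical helper for illustration; not a standard library function.
    """
    # ntpath always uses Windows rules, whatever the host OS is.
    drive, tail = ntpath.splitdrive(path)
    # Split the remainder into its directory part and its last component.
    directory, filename = ntpath.split(tail)
    return drive, directory, filename


parts = split_windows_path(r"c:\ttemp\FILEPA~1.EXE")
# parts == ('c:', '\\ttemp', 'FILEPA~1.EXE')
```

The same call should give the same result on Linux and on Windows, since it never goes through os.path.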
Moe
  • 2
    This is an excellent answer that deserves to be the accepted one, IMHO. It is much better to use ready-made tools than craft your own regexps. – Eli Bendersky Oct 08 '08 at 13:29
4

From the os.path documentation:

os.path.splitdrive(path)
Split the pathname path into a pair (drive, tail) where drive is either a drive specification or the empty string. On systems which do not use drive specifications, drive will always be the empty string. In all cases, drive + tail will be the same as path.

If you are running this on Unix, which doesn't use drive specifications, the drive will always be the empty string.
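
As a rough illustration of the quoted behaviour, the sketch below calls both implementations explicitly instead of going through os.path. The variable names are only for the example.

```python
import ntpath
import posixpath

path = r"c:\ttemp\FILEPA~1.EXE"

# POSIX rules: there is no drive concept, so the drive is always empty
# and the whole string comes back as the tail.
posix_drive, posix_tail = posixpath.splitdrive(path)

# Windows rules: the leading "c:" is recognised as the drive.
nt_drive, nt_tail = ntpath.splitdrive(path)

# The documented invariant (drive + tail == path) holds in both cases.
assert posix_drive + posix_tail == path
assert nt_drive + nt_tail == path
```

On Linux, os.path is posixpath, so the first call is what the question was effectively running.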

If you want to split Windows paths on any platform, you can just use a simple regexp:

import re
# optional drive letter followed by a colon, then the rest of the path
(drive, tail) = re.match(r'([a-zA-Z]:)?(.*)', filepath).groups()

drive will be a drive letter followed by a colon (e.g. c:, u:) or None, and tail will be the rest of the path :)
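
A small sketch of both cases, wrapping the regexp above in a helper. The name split_drive_regex and the sample relative path are made up for the example.

```python
import re

# Optional drive letter plus colon, then everything else.
_DRIVE_RE = re.compile(r'([a-zA-Z]:)?(.*)')


def split_drive_regex(path):
    """Return (drive, tail); drive is None when the path has no drive.

    Hypothetical helper for illustration only.
    """
    # Both groups are optional or may be empty, so match() always succeeds
    # for a single-line path.
    drive, tail = _DRIVE_RE.match(path).groups()
    return drive, tail


with_drive = split_drive_regex(r"c:\ttemp\FILEPA~1.EXE")
# with_drive == ('c:', '\\ttemp\\FILEPA~1.EXE')

without_drive = split_drive_regex(r"relative\file.txt")
# without_drive == (None, 'relative\\file.txt')
```

Note that the missing drive comes back as None here, whereas the standard splitdrive functions return an empty string, so drive + tail only works after checking for None.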

kender
  • Yep, I just realized that: the string processing is based on the OS, not on the syntax. It does not distinguish between Windows and Unix paths; it just applies a different algorithm according to your platform. Crap. – Bite code Oct 08 '08 at 11:40
  • Don't use regular expressions, use the ntpath module instead - see my answer. – Moe Oct 08 '08 at 12:13
  • I agree :) I never had to deal with Windows-style paths, and I just had no idea that module existed :) – kender Oct 09 '08 at 04:48
1

See the documentation here, specifically:

splitdrive(p) Split a pathname into drive and path. On Posix, drive is always empty.

So this won't work on a Linux box.

Adam Bellaire