I'm trying to create a small script that copies files with partially unicode names in variables, but I just can't get it to work.

The code looks like this:

    fileextension = filename.split(".")[len(filename.split(".")) - 1]
    if not os.path.exists(artistdir + "\\" + songname + "." + fileextension):
        print basedir + filename, artistdir + "\\" + songname + "." + fileextension
        shutil.copy(basedir + filename, artistdir + "\\" + songname + "." + fileextension)

I get the following output:

    E:\music\_collections\Adrian von Ziegler\2012 Starchaser\01. Adrian von Ziegler - Nidh├Âggr.mp3 C:\Temp\Adrian von Ziegler\Nidh├Âggr.mp3
    Traceback (most recent call last):
      File "E:\main\Coding\Python\WinampPlaylistExport\winampplaylistexport.py", line 72, in <module>
        iteratePlaylists()
      File "E:\main\Coding\Python\WinampPlaylistExport\winampplaylistexport.py", line 20, in iteratePlaylists
        iteratePlaylist(playlist.get("title"), playlist.get("filename"))
      File "E:\main\Coding\Python\WinampPlaylistExport\winampplaylistexport.py", line 69, in iteratePlaylist
        shutil.copy(basedir + filename, artistdir + "\\" + songname + "." + fileextension)
      File "C:\Python27\lib\shutil.py", line 119, in copy
        copyfile(src, dst)
      File "C:\Python27\lib\shutil.py", line 82, in copyfile
        with open(src, 'rb') as fsrc:
    IOError: [Errno 2] No such file or directory: 'E:\\music\\_collections\\Adrian von Ziegler\\2012 Starchaser\\01. Adrian von Ziegler - Nidh\xc3\xb6ggr.mp3'

The first line is the output of the print statement: the source path and the target path of the file for which the copy fails.

Thanks in advance.

Lukas Bach
  • Possible duplicate of: http://stackoverflow.com/questions/4173477/copying-files-with-unicode-names ? – amito Sep 06 '15 at 17:49
  • I already tried lots of answers such as using .encode("utf-8") and prepending u"" + to the path strings, but that threw the following error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 84: ordinal not in range(128) – Lukas Bach Sep 06 '15 at 18:01
  • 2 more suggestions: if possible, define your variables as `unicode` (sequence of characters) objects rather than `str` (sequence of bytes). Then concatenation will just work. And the conventional way of concatenating path components is to use `os.path.join(...)`. – roeland Sep 06 '15 at 22:51
  • unrelated: use `os.path.join()` to create a full path instead of concatenating the parts manually. – jfs Sep 12 '15 at 14:50

2 Answers

Try changing it to:

    fileextension = filename.split(".")[len(filename.split(".")) - 1]
    if not os.path.exists(artistdir + "\\" + songname + "." + fileextension):
        print basedir + filename, artistdir + "\\" + songname + "." + fileextension
        shutil.copy(basedir + filename.decode('utf8'), artistdir + "\\" + songname.decode('utf8') + "." + fileextension)

Note that it's `decode`, not `encode`. You commented that you already tried `encode`, but it doesn't make sense to encode a string which is already in UTF-8.
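
To illustrate the difference, here is a minimal sketch. It assumes the name is stored as UTF-8 bytes, which the `\xc3\xb6` in the traceback suggests; the variable names are made up for this example. In Python 2, combining a bytestring with a `unicode` object (or calling `.encode()` on a bytestring) triggers an implicit ASCII decode first, which is where the `UnicodeDecodeError` from the comments comes from.

```python
# Hypothetical sample value: the song name as UTF-8 bytes, as seen in the traceback.
raw_name = b"Nidh\xc3\xb6ggr.mp3"

# Explicit decode: UTF-8 bytes -> Unicode text.
text_name = raw_name.decode("utf-8")

# Python 2 performs the equivalent of this ASCII decode implicitly whenever
# a bytestring is mixed with a unicode string; byte 0xc3 is not valid ASCII.
try:
    raw_name.decode("ascii")
    ascii_decode_ok = True
except UnicodeDecodeError:
    ascii_decode_ok = False
```

After decoding, `text_name` holds 12 characters instead of 13 bytes, and it can be concatenated safely with other Unicode strings.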

DorElias
  • Thanks, but that doesn't work either, I get the same error message: `UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 84: ordinal not in range(128)` – Lukas Bach Sep 06 '15 at 18:29
  • Do `basedir` or `artistdir` contain UTF-8 characters too? If so, add `.decode('utf8')` to them as well. – DorElias Sep 06 '15 at 18:31
  • Argh yeah, I forgot about artistdir, now it works.. Thanks! – Lukas Bach Sep 06 '15 at 18:52

You should pass Unicode strings to shutil.copy(). Don't mix bytestrings and Unicode strings.

All variables (filename, artistdir, songname, fileextension, basedir) should be Unicode strings here (assert isinstance(s, unicode)).

Sprinkling your code with .decode('utf-8') in various places is error-prone. Use the Unicode sandwich instead:

  1. convert input bytestring to Unicode text as soon as possible
  2. use Unicode internally to work with text
  3. convert Unicode text to bytes as late as possible on output (if it is necessary at all)
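
The three steps above could be sketched roughly as follows. This is only an illustration, not the asker's actual script: `to_text`, `target_path` and `copy_song` are hypothetical helpers, and UTF-8 is assumed as the encoding of the incoming bytestrings. It also uses `os.path.join()` and `os.path.splitext()` in place of manual concatenation and `split(".")`.

```python
import os
import shutil


def to_text(value, encoding="utf-8"):
    """Decode bytes to text; return values that are already text unchanged."""
    if isinstance(value, bytes):  # on Python 2, bytes is an alias for str
        return value.decode(encoding)
    return value


def target_path(artistdir, songname, filename):
    """Build the target path from text-only components."""
    extension = os.path.splitext(filename)[1]  # includes the leading dot
    return os.path.join(artistdir, songname + extension)


def copy_song(basedir, filename, artistdir, songname):
    # 1. Decode every input once, at the boundary (assumed UTF-8).
    basedir, filename, artistdir, songname = [
        to_text(part) for part in (basedir, filename, artistdir, songname)
    ]
    # 2. Work with Unicode text only from here on.
    source = os.path.join(basedir, filename)
    target = target_path(artistdir, songname, filename)
    # 3. Hand the text paths to the OS-facing call; no manual encoding.
    if not os.path.exists(target):
        shutil.copy(source, target)
    return target
```

Because all decoding happens in one place, the rest of the code never mixes bytestrings and Unicode strings, so no implicit ASCII conversion can occur.
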
jfs