I have the following path in memory:

video_path = u'C:\\Documents and Settings\\user\\My Documents\\Downloads\\\xf5iv - Neon Phoenix [Free DL].mp3'

I'm trying to use it as a parameter in cmd, so I have to encode it.

import subprocess
import sys

video_path = video_path.encode(sys.getfilesystemencoding())
cmd = 'ffmpeg -y -i "%s" -vn -ac 2 -f mp3 audio.mp3' % video_path
subprocess.Popen(cmd)

However, the string is not encoded correctly: \xf5 is converted to ? instead of õ, so the file cannot be found.

How can this happen? I'm using the default filesystem encoding (which is mbcs).

iTayb
  • Windows uses Unicode paths. Why are you encoding your Unicode string? – André Caron Apr 23 '12 at 21:51
    @André: It uses "Unicode", not Unicode. – Ignacio Vazquez-Abrams Apr 23 '12 at 21:54
  • I am speculating now, but what happens if you leave `video_path` as a unicode object (without encoding it), construct `cmd = u'..' % videopath` as a Unicode, too, and then encode at the end: `os.system(cmd.encode(sys.getfilesystemencoding()))`? On Linux and Python 2.7 it makes no difference, but it may be worth a try on your platform. – jogojapan Apr 25 '12 at 06:14
  • It's the same thing. The problem is that 'mbcs' doesn't convert the `\xf5` char as it should, even though it is the default system encoding of Windows XP (and probably 7 as well). It looks like an implementation bug, but I'm not sure. – iTayb Apr 25 '12 at 06:25
  • Related: http://stackoverflow.com/q/1910275/777186 – jogojapan Apr 25 '12 at 06:36

3 Answers

From an answer here:

In Py3K - at least from "Python" 3.2 - subprocess.Popen and sys.argv work consistently with (default unicode) str's on Windows. CreateProcessW and GetCommandLineW are used obviously.

In Python - up to v2.7.2 at least - subprocess.Popen is buggy with unicode arguments. It sticks to CreateProcessA (while os.* are consistent with unicode). And shlex.split creates additional nonsense. Pywin32's win32process.CreateProcess also doesn't auto-switch to the W version, nor is there a win32process.CreateProcessW. Same with GetCommandLine. Thus ctypes.windll.kernel32.CreateProcessW... needs to be used. The subprocess module perhaps should be fixed regarding this issue.

Therefore, subprocess.Popen cannot handle unicode arguments correctly in Python 2.x on Windows.
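To illustrate why the "A" (ANSI) code path loses the character, here is a small sketch. It assumes the machine's ANSI code page is one that does not contain õ; cp1255 is used purely as an example, since the question does not say which code page is active. The helper name encode_like_ansi is hypothetical, and the 'replace' error handler only approximates the substitution that the mbcs codec performs on Windows.

```python
# -*- coding: utf-8 -*-

def encode_like_ansi(text, code_page):
    """Encode text for an ANSI code page, substituting '?' for
    characters the code page cannot represent (an approximation of
    what 'mbcs' does on Windows; hypothetical helper)."""
    return text.encode(code_page, 'replace')


name = u'\xf5iv'

# cp1252 (Western European) contains U+00F5, so the byte survives.
western = encode_like_ansi(name, 'cp1252')

# cp1255 (assumed here for illustration) has no U+00F5, so the
# character is replaced and the resulting path no longer matches the file.
lossy = encode_like_ansi(name, 'cp1255')
```

If this is what happens, no choice of byte encoding passed to subprocess.Popen can help, because the character simply does not exist in the code page that CreateProcessA works with.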

My solution was to rename the input file to something random (with os.rename, which supports unicode), convert it with ffmpeg launched through subprocess.Popen, and then rename it back.
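The rename workaround described above could be sketched roughly as follows. The helper name with_temporary_ascii_name and the use of uuid for the random name are assumptions for illustration, not the code actually used. Only the file name is replaced, so this sketch assumes the directory part of the path and the file extension are already representable in the system code page.

```python
import os
import uuid


def with_temporary_ascii_name(path, action):
    """Rename path to a random ASCII name in the same directory,
    call action(temp_path), then restore the original name.
    Hypothetical helper; returns whatever action returns."""
    directory, name = os.path.split(path)
    extension = os.path.splitext(name)[1]
    # uuid4().hex is 32 hexadecimal characters, so the new name is ASCII
    # as long as the extension is.
    temp_path = os.path.join(directory, uuid.uuid4().hex + extension)
    os.rename(path, temp_path)
    try:
        return action(temp_path)
    finally:
        # Restore the original name even if the action raises.
        os.rename(temp_path, path)
```

The action would be a function that runs ffmpeg on the temporary path and waits for it to finish, for example one that calls subprocess.call with the same ffmpeg arguments as in the question. Waiting matters here: with a bare subprocess.Popen the file would be renamed back while ffmpeg is still reading it.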

iTayb

Try to encode using UTF-8:

video_path = video_path.encode("utf-8")
Silviu

Unless I am totally mistaken, the double backslash in

video_path = u'C:...\\xf5iv...'

causes the problem. There should be only one:

video_path = u'C:...\xf5iv...'

Otherwise the backslash is preserved as a backslash and left for os.system(), rather than .encode(), to deal with.

jogojapan
  • I don't know why Stack Overflow shows it as two slashes. If you look at the code there are three backslashes: one escapes the first backslash, and the third escapes the unicode char. – iTayb Apr 25 '12 at 05:45
  • @iTayb Interesting. But anyway, shouldn't _one_ backslash be sufficient? – jogojapan Apr 25 '12 at 05:55
  • @iTayb. Ah sorry. No. I get it now. – jogojapan Apr 25 '12 at 05:55
  • @iTayb I have taken the liberty of editing the question. It uses verbatim now instead of blockquote. At least the backslashes are correctly displayed now. – jogojapan Apr 25 '12 at 06:05