I'm working to export a small subset of music from my iTunes library to an external drive, for use with a Sonos speaker (via Media Library on Sonos). All was going fine until I came across some unicode text in track, album and artist names.

I'm moving from iTunes on Mac to a folder structure on Linux (Ubuntu), and the file paths all contain the original Unicode names and these are displayed and play fine from Sonos in the Artist / Album view. The only problem is playlists, which I'm generating via a bit of Python3 code.

Sonos does not appear to support UTF-8 encoding in .m3u / .m3u8 playlists. The character ÷ was interpreted by Sonos as Ã·, which after a bit of Googling looked like the UTF-8 bytes being read as one character per byte (÷ is 0xC3 0xB7 in UTF-8, whilst Ã is U+00C3 and · is U+00B7). I've tried many different ways of encoding it, and just can't get it to recognise tracks with non-standard (non-ASCII?) characters in their names.
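
The garbling can be reproduced in a couple of lines of Python. This is only a sketch, and it assumes the player reads each byte of the playlist as one character (Latin-1 is used here to model that); how Sonos actually decodes the file is not confirmed.

```python
# Encode as UTF-8, then decode those bytes as Latin-1.
# Assumption: the player treats every byte as a separate character.
original = '÷'
utf8_bytes = original.encode('utf-8')     # the two bytes 0xC3 0xB7
garbled = utf8_bytes.decode('latin-1')    # U+00C3 followed by U+00B7

print(utf8_bytes)
print(garbled)
```

If the garbled text printed here matches what the player shows, the playlist bytes are fine and the problem is on the reading side.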

I then tried .wpl playlists, and thought I'd solved it. Tracks with characters such as ÷ and • in their path now work perfectly, just using those characters in their unicode / UTF-8 form in the playlist file itself.

However, just as I was starting to tidy up and finish off the code, I found some other characters that weren't being handled correctly: ö, å, á and a couple of others. I've tried using these both as their original Unicode characters and as XML numeric character references, e.g. &#769;. Using this format doesn't make a difference to what works and what doesn't: ÷ (&#247;) and • (&#8226;) are fine, whilst ö (o&#776;), å (a&#778;) and á (a&#769;) are not.

I've never really worked with unicode / UTF-8 before, but having read various guides and how-to's I feel like I'm getting close but probably just missing something simple. The fact that some unicode characters work now, and others don't, makes me think it's got to be something obvious! I'm guessing the difference is that accents modify the previous character, rather than being a character in itself, but tried removing the previous letter and that didn't work!
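
One way to test the guess about accents is to list the code points in a string. The sketch below uses a hypothetical helper, `describe`, and two hand-built versions of é; it shows that the same visible character can be one code point or two.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' string per code point (hypothetical helper)."""
    return ['U+%04X %s' % (ord(ch), unicodedata.name(ch, '<unnamed>'))
            for ch in text]

composed = '\u00e9'      # é stored as a single code point
decomposed = 'e\u0301'   # e followed by a combining acute accent

print(describe(composed))
print(describe(decomposed))
print(composed == decomposed)   # the two strings are not equal
```

Running `describe()` on a track name that fails in the playlist should show whether it contains combining accents.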

Within Python itself I'm not doing anything particularly clever. I read in the data from iTunes' XML file using:

    with open(settings['itunes_path'], 'rb') as itunes_handle:
        itunes_library = plistlib.load(itunes_handle)

For export I've tried dozens of different options, but generally something like the below (sometimes with encoding='utf-8' and various other options):

    with open(dest_path, 'w') as playlist_file:
        playlist_file.write(generated_playlist)

Where generated_playlist is the result of extracting and filtering data from itunes_library, having run urllib.parse.unquote() on any iTunes XML data.
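
When no `encoding` is given, `open()` in text mode uses the platform's preferred encoding, so the bytes written can differ between machines. The sketch below passes the encoding explicitly and reads the file back in binary to see what was written. The temporary path and the one-line playlist are made up for illustration.

```python
import locale
import os
import tempfile

# The encoding open() falls back to when none is passed
print(locale.getpreferredencoding(False))

dest_path = os.path.join(tempfile.mkdtemp(), 'example.m3u8')  # hypothetical path
generated_playlist = 'Media/÷/track.mp3'                      # made-up entry

# Being explicit removes the dependence on the platform default
with open(dest_path, 'w', encoding='utf-8') as playlist_file:
    playlist_file.write(generated_playlist)

# Read the raw bytes back to confirm what ended up on disk
with open(dest_path, 'rb') as check:
    raw = check.read()
print(raw)
```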

Any thoughts or tips on where to look would be very much appreciated! I'm hoping that to someone who understands Unicode better the answer will be really really obvious! Thanks!

Current version of the code available here: https://github.com/dwalker-uk/iTunesToSonos

DaveWalker
  • Filename encoding is a problematic topic when it comes to portability. Have you considered replacing all non-ASCII characters, eg. using the `unidecode` Python module? I don't usually suggest this approach, but for having portable filenames across OSes this seems to me like the least pain. – lenz Feb 27 '19 at 21:27
  • So I've made a useful extra discovery... for the é character, the default encoding (using `.encode('ascii', 'xmlcharrefreplace')`) turns it into `e&#769;` which does not work. However, I found if I manually change it to `&#xE9;` (as a single code for the é character) it does work. Therefore I think the key question is: how do I get Python to encode unicode characters in the latter format? Indeed, what is that form even called? – DaveWalker Feb 27 '19 at 21:45
  • Thanks @lenz, that's on my list as backup option, but as I've been playing around with this for three evenings now and it feels so close to being fixable... I'd like to see if I can get it resolved. I'll then share the whole code on GitHub, as I've seen lots of people struggling with this in the Sonos forums! – DaveWalker Feb 27 '19 at 21:46
  • That last thing is called Unicode normalization. There are NFD ("decomposed", i.e. base character plus combining diacritic) vs. NFC ("composed", a single accented character), plus the compatibility versions NFKD/NFKC. Use Python's std-lib `unicodedata.normalize()` to convert into NFC form before encoding with "xmlcharrefreplace". – lenz Feb 27 '19 at 21:49
  • Thanks, that sounds like the right thing. I'll give it a go tomorrow night and report back! – DaveWalker Feb 27 '19 at 22:06

1 Answer

With thanks to @lenz for the suggestions above, I do now have unicode playlists fully working with Sonos.

A couple of critical points that should save someone else a lot of time:

  • Only .wpl playlists seem to work. Unicode will not work with .m3u or .m3u8 playlists on Sonos.
  • Sonos needs any unicode text to be normalised into NFC form - I'd never heard of this before, but it essentially means that any accented character has to be represented by a single character, not as a normal character with a separate accent.
  • The .wpl playlist, which is an XML format, needs to have non-ASCII characters written as XML character references, i.e. é is represented in the .wpl file as &#233;.
  • The .wpl file also needs the XML reserved characters (& < > ' ") in their escaped form, i.e. & is &amp;.
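
The effect of normalising before encoding can be seen on a single character. This is a minimal sketch using a hand-built decomposed é; it assumes that file names coming from the Mac arrive in decomposed form, which is the usual explanation but is not checked here.

```python
import unicodedata

decomposed = 'e\u0301'   # e + combining acute accent (assumed form of the input)
composed = unicodedata.normalize('NFC', decomposed)   # single code point U+00E9

# Without normalising, the accent becomes its own character reference
print(decomposed.encode('ascii', 'xmlcharrefreplace'))   # b'e&#769;'
# After NFC, the whole character becomes one reference
print(composed.encode('ascii', 'xmlcharrefreplace'))     # b'&#233;'
```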

In Python 3, converting a path from iTunes XML format into something suitable for a .wpl playlist on Sonos needs the following key steps:

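    import unicodedata
    import urllib.parse

    # Strip the library's 'Music Folder' prefix and re-root the path under 'Media/'
    left = len(itunes_library['Music Folder'])
    path_relative = 'Media/' + itunes_library['Tracks'][track_id]['Location'][left:]
    # Decode the %XX escapes used in the iTunes XML
    path_unquoted = urllib.parse.unquote(path_relative)
    # Compose base letter + combining accent into single characters
    path_norm = unicodedata.normalize('NFC', path_unquoted)
    # Escape XML reserved characters ('&' first, so the other entities are not double-escaped)
    path = path_norm.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;').replace('"', '&quot;')

    playlist_wpl += '<media src="%s"/>\n' % path

    # Write as ASCII; every remaining non-ASCII character becomes a &#NNN; reference
    with open(pl_path, 'wb') as pl_file:
        pl_file.write(playlist_wpl.encode('ascii', 'xmlcharrefreplace'))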
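
For anyone who wants to try the steps in isolation, here is a rough self-contained sketch. The function name `wpl_media_line`, the `prefix` parameter and the example library paths are all made up for illustration, and `xml.sax.saxutils.escape` stands in for the chained `replace()` calls.

```python
import unicodedata
import urllib.parse
from xml.sax.saxutils import escape

def wpl_media_line(location, music_folder, prefix='Media/'):
    """Build one <media> element from an iTunes 'Location' URL (hypothetical helper)."""
    relative = prefix + location[len(music_folder):]      # drop the library prefix
    unquoted = urllib.parse.unquote(relative)             # decode %XX escapes
    normalised = unicodedata.normalize('NFC', unquoted)   # compose accented characters
    escaped = escape(normalised, {'"': '&quot;'})         # escapes & < > and "
    return '<media src="%s"/>\n' % escaped

# Made-up example values; the ö is stored decomposed (o + %CC%88)
music_folder = 'file:///Users/me/Music/iTunes/iTunes%20Media/'
location = music_folder + 'Music/Bjo%CC%88rk/Post/01%20Army%20of%20Me%20%26%20More.m4a'

line = wpl_media_line(location, music_folder)
data = line.encode('ascii', 'xmlcharrefreplace')
print(data)
```

The printed bytes should contain `Bj&#246;rk` and `&amp;`, i.e. one reference for the composed ö and an escaped ampersand.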

A full working demo for exporting from iTunes for use in Sonos (or anything else) as .wpl is available here: https://github.com/dwalker-uk/iTunesToSonos

Hope that helps someone!

DaveWalker