I've recently bought a Nook Simple Touch. I use Calibre to manage my ebooks, and to transfer them to the Nook.
Due to a non-standard implementation of the epub specification on B&N's part, the Nook ST does not display cover images for epubs that come from many sources. The issue is described here: http://john.nachtimwald.com/2011/08/21/nook-covers-not-showing-up/ Basically, the Nook ST requires the attributes of the cover meta element to appear in this order:
<meta name="cover" content="id5" />
But many epub creation tools write them the other way around:
<meta content="id5" name="cover" />
And the Nook ST then ignores the cover image entirely.
I have been manually editing the content.opf file in my epub files. So far they have all had the image meta, but it was always around the "wrong" way (wrong, according to the Nook).
Recently I've been playing around with REGEX, mostly to try to automate the cleaning up of epubs converted by Calibre from PDF files. I'm still very much a beginner with REGEX.
What I was wondering is how I might go about automating the swapping of the 'name' and 'content' attributes? I figure it can be done with a combination of REGEX and scripting. I know some of the other epub related scripts I have are in Python. I am on a Mac (OS X) and they seem to run fine. AppleScript might be a good option too, although I'd like something that people can run on any platform, as I am sure other folk will find this useful.
Here are the steps I foresee:
~ Extract epub file
~ Use REGEX to look for:
<meta content="???" name="cover">
~ If found, use REGEX to change it around to:
<meta name="cover" content="???">
~ Zip extracted files back into an epub using the correct zipping process.
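The search-and-swap steps above could be sketched in Python roughly like this. It is only a minimal sketch: the names COVER_META and fix_cover_meta are made up for illustration, and the pattern assumes double-quoted attributes with 'content' directly followed by 'name', as in the examples above. Other layouts (single quotes, extra attributes in between) would need a looser pattern or a real XML parser.

```python
import re

# Hypothetical pattern: a meta element with content="..." placed before name="cover".
# ([^"]*) captures the content value; (/?) remembers whether the tag self-closes.
COVER_META = re.compile(r'<meta\s+content="([^"]*)"\s+name="cover"\s*(/?)>')


def _swap(match):
    # Rebuild the tag with name first, keeping the original closing style.
    closing = " />" if match.group(2) else ">"
    return '<meta name="cover" content="%s"%s' % (match.group(1), closing)


def fix_cover_meta(opf_text):
    """Return opf_text with name="cover" moved in front of content="..."."""
    return COVER_META.sub(_swap, opf_text)
```

Text that is already in the order the Nook wants does not match the pattern, so it passes through unchanged.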
I found info here: http://www.mobileread.com/forums/showthread.php?t=55681 explaining how to zip up an epub file correctly. Basically it requires these two commands:
zip -X0 "full path to new epub file" mimetype
zip -rDX9 "full path to new epub file" * -x "*.DS_Store" -x mimetype
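For a cross-platform script, the same two-step zipping can probably be approximated with Python's standard zipfile module instead of the zip command. This is a sketch under assumptions: zip_epub is a hypothetical name, src_dir is the folder holding the extracted epub, and epub_path lies outside that folder. I have not checked that the result is byte-for-byte equivalent to what the -X flag produces.

```python
import os
import zipfile


def zip_epub(src_dir, epub_path):
    """Pack the extracted epub in src_dir into a new file at epub_path."""
    with zipfile.ZipFile(epub_path, "w") as zf:
        # mimetype goes in first and uncompressed (like: zip -X0 ... mimetype)
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                if name == ".DS_Store":
                    continue  # skip Finder metadata (like: -x "*.DS_Store")
                full = os.path.join(root, name)
                rel = os.path.relpath(full, src_dir)
                if rel == "mimetype":
                    continue  # already written above
                # only files are added, so no directory entries (like: -D)
                zf.write(full, rel, compress_type=zipfile.ZIP_DEFLATED)
```

A call would look like zip_epub("epub_one_expanded", "epub_one.epub").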
I'd like to post the resulting script online wherever it might be found and made use of (until B&N resolve their poor epub/XML implementation). Posting it on the Calibre forums and the mobileread forums comes to mind (since they are two I am familiar with, and I have seen people there discussing manual fixes to this issue).
Is there someone who can walk me through how to create such a script? Ideally, I'd love to actually know how to create the script, so that over time I can start to figure out these sorts of things myself (especially the REGEX part, as I see more and more how useful it is).
Thank you.
Jonathan
@Haldean: ADDED to illustrate what I meant in my comment about making your script work through all content.opf files in all subfolders recursively.
> My_expanded_epubs
- -> epub_one_expanded
- - - -> content.opf
- -> epub_two_expanded
- - - -> content.opf
- -> epub_three_expanded
- - - -> content.opf
etc.
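For a layout like the one above, the recursive part might be handled with os.walk. This is a rough sketch: fix_all_opf is a made-up name, the fix argument stands for whatever function does the actual attribute swap on the file's text, and I am assuming Python 3 and that the content.opf files are UTF-8.

```python
import os


def fix_all_opf(top_dir, fix):
    """Apply fix (a str -> str function) to every content.opf under top_dir.

    Returns the paths of the files that were actually rewritten.
    """
    changed = []
    for root, dirs, files in os.walk(top_dir):
        if "content.opf" not in files:
            continue
        path = os.path.join(root, "content.opf")
        # newline="" keeps the file's original line endings untouched
        with open(path, encoding="utf-8", newline="") as f:
            text = f.read()
        new_text = fix(text)
        if new_text != text:
            with open(path, "w", encoding="utf-8", newline="") as f:
                f.write(new_text)
            changed.append(path)
    return changed
```

Calling fix_all_opf("My_expanded_epubs", some_fix_function) would then visit epub_one_expanded, epub_two_expanded and so on, and leave files alone when the fix changes nothing.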