I want to do a very simple bit of manipulation of a LibreOffice Writer document... then save again as the ODT file...
What might be wrong with the code below? When I run it I get two content.xml entries in the zip file (the ODT file). Strangely, both of them (if unzipped as "content.xml" and "content_1.xml", for example) seem to contain the modified content.
from zipfile import ZipFile
from xml.dom.minidom import parseString

zipfile = ZipFile( file_path, "a" )
for zip_info in zipfile.infolist():
    contents = zipfile.read( zip_info.filename )
    if ( zip_info.filename == "content.xml" ):
        document_root = parseString( contents )
        # ... mess around with the contents DOM document...
        zipfile.writestr( zip_info, document_root.toxml() )
zipfile.close()
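For reference, the duplicate entry can be reproduced in isolation. The following is only a minimal sketch using an in-memory archive (the `<old/>` and `<new/>` payloads are made-up placeholders, not real ODT content). It suggests that opening an archive in "a" mode and writing an existing name appends a second entry rather than replacing the first:

```python
import io
import warnings
import zipfile

# Build a small in-memory archive with a single content.xml entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("content.xml", "<old/>")  # placeholder payload

# Reopen in append mode and write the same name again, as in the loop above.
with warnings.catch_warnings():
    # zipfile emits a UserWarning about the duplicate name; silence it here.
    warnings.simplefilter("ignore", UserWarning)
    with zipfile.ZipFile(buf, "a") as zf:
        zf.writestr("content.xml", "<new/>")  # placeholder payload

with zipfile.ZipFile(buf, "r") as zf:
    names = zf.namelist()             # lists every entry, duplicates included
    latest = zf.read("content.xml")   # lookup by name resolves to the entry written last

print(names)
print(latest)
```

Listing the archive shows the name twice, while reading by name returns only the most recently written data.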
I'm aware that there are various add-ins and things you can use (UNO)... but I want to keep it as simple as possible...
Later:
My solution: having found that Python's zipfile module offers no way to delete an entry from a zip file, I initially decided to take the "make a new zip" approach described in: Delete file from zipfile with the ZipFile Module
However, although I was able to open the resulting ODT file and extract all the files from it, 7-Zip complained about a CRC failure, saying content.xml was now "broken". Presumably this was due to the brutal substitution of one "content.xml" for another.
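For comparison, here is a minimal sketch of the "make a new zip" approach using only the standard zipfile module. The helper name `replace_member` and its parameters are hypothetical, and this is not necessarily the code that produced the CRC complaint described above. Every entry is copied through `writestr`, which computes the CRC for whatever data it is given, and the original entry order and per-entry compression are kept (ODF packages expect `mimetype` to be the first entry and stored uncompressed):

```python
import zipfile

def replace_member(src_path, dst_path, member, new_data):
    """Copy the archive at src_path to dst_path, substituting new_data for one member.

    Hypothetical helper: the name and signature are illustrative only.
    """
    with zipfile.ZipFile(src_path, "r") as src, zipfile.ZipFile(dst_path, "w") as dst:
        for item in src.infolist():
            if item.filename == member:
                # Write the replacement under the same name, compressed with deflate.
                dst.writestr(member, new_data, compress_type=zipfile.ZIP_DEFLATED)
            else:
                # Passing the ZipInfo keeps the entry's name, timestamp and
                # compression method (so a stored "mimetype" stays stored).
                dst.writestr(item, src.read(item.filename))
```

A call might look like `replace_member("temp.odt", "temp_new.odt", "content.xml", document_root.toxml())`, followed by renaming the new file over the old one once it has been checked.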
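Final answer:
1) Write the modified DOM structure to a plain file named "content.xml" in the same directory as the ODT file:
import os
# Write as UTF-8; "with" closes the file even if the write fails
with open( os.path.join( file_dir, "content.xml" ), "w", encoding="utf-8" ) as f:
    f.write( document_root.toxml() )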
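2) Run the 7-Zip CLI once the ODT file has been closed programmatically:
import subprocess
# "u" updates the archive in place; check_call waits for 7z to finish
# and raises CalledProcessError on a non-zero exit code
subprocess.check_call( [ "7z", "u", "temp.odt", "content.xml" ], cwd=file_dir )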