0

I have a Python function that writes xml string to a file. The code is as follows:

My file_contents is a string as follows:

<A>Some Text</A>
<B>Some other Text</B>

I would like to enclose this string inside and prepend the xml header : to the file. I can't seem to figure this out.

from lxml import etree

def create_or_overwrite_file(file_name, file_contents):
    try:        
        print('file contents to be written: %s' % file_contents)

        '''
        #root = etree.fromstring(file_contents) #this line definitely fails because file_contents is NOT a valid XML

        #print('root: %s' % str(root))
        #newroot = etree.Element('RootElement')
        #newroot.insert(0, root)
        #print('newroot: %s' % newroot)
        #new_file_contents = etree.tostring(newroot)
        #print('NewFileContents: %s' % str(new_file_contents))
         '''

        root = etree.fromstring(file_contents).getroot()
        et = etree.ElementTree(root)
        with open(str(file_name), "wb") as f:
            et.write(f, encoding="utf-8", xml_declaration=True, pretty_print=True)
            print('wrote file_contents to %s' % str(file_name))
    except Exception as f:
        print(str(f))

I can't seem to get this code working. Any help is appreciated.

SoftwareDveloper
  • 559
  • 1
  • 5
  • 18
  • What does it do or not do? – Tim Roberts Feb 17 '22 at 19:24
  • `file_name` is already a string. There is no point in calling `str(file_name)`. – Tim Roberts Feb 17 '22 at 19:30
  • "Not working" is not a helpful description of the problem. Please precisely clarify the issue. Errors? Undesired result? Also, the markup shown in `file_contents` is *not* XML since it does not have a root. So, `lxml` should error out when attempting to parse. – Parfait Feb 17 '22 at 21:28
  • @Parfait: From what you are saying, there's no way to enclose my xml inside a root element using lxml library. I would have to prepend and append the tag element myself like '' + file_contents + '' – SoftwareDveloper Feb 17 '22 at 21:30
  • Again, that markup is _not_ XML. By definition, XML is well-formed (i.e., has a root among other rules). Check how that markup is generated. It may be built by string. See: [What's so bad about building XML with string concatenation?](https://stackoverflow.com/q/3034611/1422451) Consider using DOM libraries to build XML like Python's `lxml`. – Parfait Feb 17 '22 at 23:43

1 Answers1

1

If you are parsing from a string (as opposed to reading from a file). then fromstring already returns the root. You do not need the .getroot() call:

    root = etree.fromstring(file_contents)

With that change, your code works for me.

Now, it's not clear why you are doing this. If file_contents is already XML, as it must be for this to work, then why not just open(file_name).write(file_contents)? Why decode the XML and then re-encode it again?


Followup

If all you want is a wrapper, then this would work:

def create_or_overwrite_file(file_name, file_contents):
    f = open(file_name,'w')
    print( '<?xml version="1.0"?>\n<root>', file=f )
    f.write(file_contents)
    print( '</root>', file=f )

It's probably more efficient not to involve lxml. Where it makes more sense is up to you.

Tim Roberts
  • 48,973
  • 4
  • 21
  • 30
  • I want to write the XML Header to the file. The file_contents does not have the header in it. In fact, it doesn't have the root element. I haven't figured out how to enclose the "file_contents" inside a root element and add the header at the top. I'll modify my code to describe this better. – SoftwareDveloper Feb 17 '22 at 21:21
  • The only thing your code is that if the write fails, the user would have a malformed xml file. – SoftwareDveloper Feb 17 '22 at 21:42
  • Technically true, but in practical terms, writes like that don't fail. If the first `print` succeeded, all future ones will. – Tim Roberts Feb 17 '22 at 21:43