Is it possible in Python to pretty-print the root element's attributes?

I used etree to extend the attributes of the child tags and then overwrote the existing file with the new content. However, when the XML was first generated, we used a template in which the attributes of the root tag were listed one per line, and with etree I can't manage to reproduce that layout.

I found similar questions, but they all referred to the etree tutorial, which I find incomplete.

Hopefully someone has found a solution for this using etree.

EDIT: This is for custom XML, so HTML Tidy (which was proposed in the comments) doesn't work here.

Thanks!

import os

from lxml import etree

# list_generated_files, generated_descriptors_folder, descriptor_attributes
# and parser are defined elsewhere in the script.
generated_descriptors = list_generated_files(generated_descriptors_folder)
counter = 0
for g in generated_descriptors:
    if counter % 20 == 0:
        print("Extending Descriptor # %s out of %s" % (counter, len(descriptor_attributes)))

    with open(os.path.join(generated_descriptors_folder, g), 'r+b') as descriptor:
        root = etree.XML(descriptor.read(), parser=parser)

        # Go through every ContextObject to check if the block is mandatory
        for context_object in root.findall('ContextObject'):
            for attribs in descriptor_attributes:
                if attribs['descriptor_name'] != g[:-11]:
                    continue
                # Mandatory blocks must not be null; all others may be
                is_mandatory = context_object.attrib['name'] in attribs['attributes']['mandatoryobjects']
                context_object.set('allow-null', 'false' if is_mandatory else 'true')

        # Sort the ContextObjects based on allow-null and their name
        context_objects = root.findall('ContextObject')
        context_objects_sorted = sorted(context_objects, key=lambda c: (c.attrib['allow-null'], c.attrib['name']))

        root[:] = context_objects_sorted

        # Remove mandatoryobjects from Descriptor attributes and pretty print
        root.attrib.pop("mandatoryobjects", None)
        # paste new line here


        # Convert to string in order to write the enhanced descriptor
        xml = etree.tostring(root, pretty_print=True, encoding="UTF-8", xml_declaration=True)

        # Write the enhanced descriptor
        descriptor.seek(0)  # Set cursor at beginning of the file
        descriptor.truncate(0)  # Make sure that file is empty
        descriptor.write(xml)
        # No explicit close() needed: the with block closes the file

    counter += 1
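
As far as I can tell, `pretty_print=True` only indents elements and leaves all attributes on the start tag's single line, so the layout would have to be produced after serialization. Below is a minimal sketch of that idea, not a tested solution: it rewrites only the first start tag of the root element in the serialized text. The helper name `attributes_one_per_line` and its `root_tag` and `indent` parameters are hypothetical (nothing here is part of lxml), and it assumes the tag name has no namespace in Clark notation and that the input is text, not bytes.

```python
import re

# Quoted attribute value, either "..." or '...'
_VALUE = r'''(?:"[^"]*"|'[^']*')'''
# One attribute: leading whitespace, name, '=', quoted value
_ATTR = re.compile(r'\s+([^\s=]+)\s*=\s*(' + _VALUE + ')')


def attributes_one_per_line(xml_text, root_tag, indent="    "):
    """Hypothetical helper: put each attribute of the first <root_tag ...>
    start tag on its own line; the rest of the text is left untouched."""
    start_tag = re.compile(
        r'<' + re.escape(root_tag)
        + r'((?:\s+[^\s=]+\s*=\s*' + _VALUE + r')+)\s*(/?)>'
    )

    def reflow(match):
        attrs = _ATTR.findall(match.group(1))
        lines = ["<" + root_tag]
        lines += ["%s%s=%s" % (indent, name, value) for name, value in attrs]
        # Keep a trailing "/" if the root element was self-closing
        return "\n".join(lines) + match.group(2) + ">"

    # count=1: only the first matching start tag is rewritten
    return start_tag.sub(reflow, xml_text, count=1)


sample = (
    '<Descriptor name="demo" version="2">\n'
    '  <ContextObject name="a" allow-null="false"/>\n'
    '</Descriptor>\n'
)
print(attributes_one_per_line(sample, "Descriptor"))
```

In the loop above this could be applied right after the `etree.tostring` call, for example `xml = attributes_one_per_line(xml.decode("UTF-8"), root.tag).encode("UTF-8")`, since `tostring` returns bytes when an encoding is given. A regex over serialized XML is a workaround rather than a clean fix, which is why I would still prefer something built into etree.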
Arne Uten
  • I'm using lxml and from that library I'm using etree. The tutorial I'm referring to comes from another thread and the link is https://lxml.de/tutorial.html – Arne Uten Nov 14 '18 at 15:44
  • Perhaps this helps: https://stackoverflow.com/q/40410923/407651 – mzjn Nov 14 '18 at 15:56
  • Thanks @mzjn! I took a first glance at the 'fix' that was found, but as he also mentions after trying it out, it is not an optimal solution, and I would like to keep it as clean as possible. – Arne Uten Nov 14 '18 at 16:14
  • It's tricky. There have been many questions about pretty-printing XML. This one has 19 answers, https://stackoverflow.com/q/749796/407651, but I'm not sure if they are of any help in this case (arranging attributes). – mzjn Nov 14 '18 at 16:34
  • You might want to try the Tidy utility (http://www.html-tidy.org/). It has an `indent-attributes` option: http://api.html-tidy.org/tidy/quickref_5.6.0.html#indent-attributes. – mzjn Nov 14 '18 at 16:42
  • Can it be used together with lxml? – Arne Uten Nov 15 '18 at 18:01
  • I was thinking that you could use Tidy to post-process the XML created by lxml. – mzjn Nov 15 '18 at 18:22
  • Perhaps you can use this: https://pythonhosted.org/pytidylib/ – mzjn Nov 15 '18 at 18:29
  • @mzjn, any idea if html-tidy is importable in pycharm? – Arne Uten Nov 16 '18 at 10:29
  • Not sure if it is significant that you use PyCharm. pytidylib is a regular Python library as far as I can tell, installed using "pip install pytidylib". Why not just try it? I haven't used it myself. – mzjn Nov 16 '18 at 11:04
  • I'm able to install pytidylib in PyCharm, but the library's documentation says that I need to put the Tidy DLL in a directory on the system path, and whatever I try, I'm not able to access the options specified by Tidy. – Arne Uten Nov 16 '18 at 12:06
  • If you want help with pytidylib specifically, I think you should post a new question about that. As I said, I have not used pytidylib myself. – mzjn Nov 16 '18 at 12:17

0 Answers