I am adding elements to some nodes in a GraphML file using Python and etree. I have two lists of strings that I want to write to my .graphml file. This works, but when I use .append(), the two new elements end up on the same line. Is there a good way to get a line break between the new elements while still writing them in the same loop?

I have the following dataset:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <node id="node1">
    <data key="label">node1</data>
    <data key="degree">6</data>
  </node>
  <node id="node2">
    <data key="label">node2</data>
    <data key="degree">32</data>
  </node>
  <node id="node3">
    <data key="label">node3</data>
    <data key="degree">25</data>
  </node>
</graphml>

and two lists containing years:

lastActiveYears = ["2013", "2014", "2015"]
lastRelatedYears = ["2012", "2014", "2011"]

I use the following code to append the list values as elements in the dataset:

# imports are assumed; they were not shown in the original post
import xml.etree.ElementTree as etree
from xml.etree.ElementTree import Element

# root is the parsed <graphml> element and nameOfNode is a list of
# node labels; both are defined elsewhere
for node in root:

    #checks if correct node
    for index, i in enumerate(nameOfNode):
        if i == node[0].text:

            #create and add lastRelated element
            lastRelated = Element('data')
            lastRelated.set('key', 'lastRelated')
            node.append(lastRelated)
            lastRelated.text = lastRelatedYears[index]

            #create and add lastActive element
            lastActive = Element('data')
            lastActive.set('key', 'lastActive')
            node.append(lastActive)
            lastActive.text = lastActiveYears[index]

            updatedText = etree.tostring(node)

            #write to file (this overwrites the file with only the current node)
            with open('dataset.graphml', 'wb') as file:
                file.write(updatedText)

This produces the following result:

  <node id="node1">
  <data key="label">node1</data>
  <data key="degree">6</data>
  <data key="lastActive">2015</data><data key="lastRelated">2011</data></node>

I would like it to be structured as:

  <node id="node1">
  <data key="label">node1</data>
  <data key="degree">6</data>
  <data key="lastActive">2015</data>
  <data key="lastRelated">2011</data>
  </node>

Anyone got a solution for this?

Arefo
  • I'm assuming `\n` doesn't work? – Eugene Oct 17 '16 at 11:53
  • I have tried using \n with: lastActive.set('\n key', 'lastActive') though this of course results in the new line starting at " – Arefo Oct 17 '16 at 11:59
  • It would go in `updatedText = etree.tostring(node)`, because `file.write(updatedText)` is where the `\n` should be. So you're going to have to append each element, and then a new line. Or see http://stackoverflow.com/questions/3095434/inserting-newlines-in-xml-file-generated-via-xml-etree-elementtree-in-python, or http://stackoverflow.com/questions/34608740/how-to-get-xml-output-in-a-file-with-new-line-using-python-xml-etree, or http://stackoverflow.com/questions/17402323/use-xml-etree-elementtree-to-write-out-nicely-formatted-xml-files – Eugene Oct 17 '16 at 12:03
  • It seems that lxml might have the solution I'm looking for, but I am having problems installing the lxml package. When installing via PyCharm, the following error occurs: ERROR: b"'xslt-config' is not recognized as an internal or external command,\r\noperable program or batch file.\r\n" I have tried installing all the different versions of lxml from [link](http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml) and checked that pip is up to date, but I get the error: "...whl is not a supported wheel on this platform." Any experience with this? – Arefo Oct 17 '16 at 12:56
  • Looks like a Windows problem. Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml and download the 32-bit binary for your version of Python (the number after cp). Use `pip` to install it (if you don't have `pip` at all, get it from https://pip.pypa.io/en/stable/installing/): run `pip install lxml-3.6.4-cpXX-cpXXm-win32.whl` in the directory containing the downloaded file, using cmd, PowerShell, ming etc. – Eugene Oct 17 '16 at 14:10

1 Answer

You should be able to get the desired output by setting a suitable value for the tail property on the new elements. The tail is the text that comes directly after an element's end tag and before the next tag (the following sibling's start tag or, for the last child, the parent's end tag).

...

thetail = "\n  "
lastRelated.tail = thetail
lastActive.tail = thetail

updatedText = etree.tostring(node)

...
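
For illustration, here is a minimal self-contained sketch of the tail approach. It assumes the standard-library xml.etree.ElementTree and a hard-coded copy of one node with the namespace omitted. The helper name append_on_new_line and the four-space child indent are hypothetical choices, not part of the original question or answer. Unlike the snippet above, it hands the previous last child's tail to the new element, so the closing tag keeps its original indentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one <node> of the dataset (namespace omitted).
node = ET.fromstring(
    '<node id="node1">\n'
    '    <data key="label">node1</data>\n'
    '    <data key="degree">6</data>\n'
    '  </node>'
)


def append_on_new_line(parent, key, text, child_indent="\n    "):
    """Append <data key=...>text</data> so that it sits on its own line.

    Assumes parent already has at least one child element.
    """
    last = parent[-1]  # current last child, fetched before appending
    new = ET.SubElement(parent, "data", {"key": key})
    new.text = text
    # The new element inherits the whitespace that preceded the closing tag.
    new.tail = last.tail
    # The former last child is now followed by a sibling, so indent for it.
    last.tail = child_indent
    return new


append_on_new_line(node, "lastRelated", "2012")
append_on_new_line(node, "lastActive", "2013")

print(ET.tostring(node, encoding="unicode"))
```

Each appended element ends up on its own line because the whitespace lives in the tail values, which are serialized verbatim.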
mzjn
  • Worked like a charm! Thanks! – Arefo Oct 18 '16 at 09:56
  • See also the `indent()` function, which was added in Python 3.9. It is easier to use than fiddling with `tail`. https://stackoverflow.com/a/68618047/407651 – mzjn Nov 22 '22 at 10:06
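
The `indent()` route mentioned in the last comment could look roughly like this. It is a sketch that assumes Python 3.9 or later and the standard-library xml.etree.ElementTree, again with a hard-coded, namespace-free copy of one node as a stand-in for the real file. Note that `indent()` re-indents the whole subtree rather than only the new elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one <node> of the dataset (namespace omitted).
node = ET.fromstring(
    '<node id="node1">'
    '<data key="label">node1</data>'
    '<data key="degree">6</data>'
    '</node>'
)

# Append the two new elements without worrying about whitespace.
for key, year in (("lastRelated", "2012"), ("lastActive", "2013")):
    ET.SubElement(node, "data", {"key": key}).text = year

# Rewrites text/tail whitespace in place; available from Python 3.9.
ET.indent(node, space="  ")

print(ET.tostring(node, encoding="unicode"))
```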