I'm new to XML and to processing it in R.

I've been able to read and retrieve information from XML files using the xml2 package, but creating XML files from R objects has proven more challenging.

In particular, I'd like to generate an XML file from an R list. Consider the example below:

library(reprex)
library(xml2)

r_list <- list(person1 = list(starts = letters[1:3], ends = letters[4:6]), person2 = list(starts = LETTERS[1:4], ends = LETTERS[5:8]))
str(r_list)
#> List of 2
#>  $ person1:List of 2
#>   ..$ starts: chr [1:3] "a" "b" "c"
#>   ..$ ends  : chr [1:3] "d" "e" "f"
#>  $ person2:List of 2
#>   ..$ starts: chr [1:4] "A" "B" "C" "D"
#>   ..$ ends  : chr [1:4] "E" "F" "G" "H"

test1 <- xml2::as_xml_document(r_list)
#> Error: Root nodes must be of length 1

new_xml <- xml_new_root(.value = "category", name = "personList")

for(person in names(r_list)){
  xml_add_child(new_xml, as_xml_document(r_list[person]))
}

new_xml
#> {xml_document}
#> <category name="personList">
#> [1] <person1>ad</person1>
#> [2] <person2>AE</person2>

Created on 2021-11-25 by the reprex package (v2.0.1)

I tried to coerce the list directly to XML using the `as_xml_document` function, but I get the error `Root nodes must be of length 1`.

Following the idea in this question, I tried creating the XML document with a root node and adding children to it with `xml_add_child()`, but I did not get the expected result (see the code output above). That question converts an R data frame, not a list.

I'd also like to use custom tag names and add attributes to these tags. The desired output would be:

<category name="personList">
    <pers name="person1">
        <starts>
            <value>a</value>
            <value>b</value>
            <value>c</value>
        </starts>
        <ends>
            <value>d</value>
            <value>e</value>
            <value>f</value>
        </ends>
    </pers>
    <pers name="person2">
        <starts>
            <value>A</value>
            <value>B</value>
            <value>C</value>
            <value>D</value>
        </starts>
        <ends>
            <value>E</value>
            <value>F</value>
            <value>G</value>
            <value>H</value>
        </ends>
    </pers>
</category>

Thanks for your help and have a nice day

symduk
  • Does this answer your question? [How to create xml from R objects, e.g., is there a 'listToXml' function?](https://stackoverflow.com/questions/6256064/how-to-create-xml-from-r-objects-e-g-is-there-a-listtoxml-function) – Limey Nov 25 '21 at 12:31
  • Hi @Limey, thanks for your quick response. The link was very useful: I managed to generate the desired output with the `newXMLNode` function. However, it needed nested for loops to reach each element of the list. As an open question, do you know a better solution, or an equivalent in the `xml2` package? – symduk Nov 25 '21 at 14:59

2 Answers

R list attributes can be mapped to XML attributes:

library(xml2)
library(purrr)  # provides map() and %>%; loading the whole tidyverse is not needed

r_list <- list(person1 = list(starts = letters[1:3], ends = letters[4:6]), person2 = list(starts = LETTERS[1:4], ends = LETTERS[5:8]))
r_list

new_xml <- xml_new_root(.value = "category", name = "personList")

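for (person in names(r_list)) {
  # The list name ("pers") becomes the tag name; each vector element is
  # wrapped as list(value = list(x)) so that it gets its own <value> node
  p <- list()
  p[["pers"]] <- list(
    starts = r_list[[person]]$starts %>% map(~list(value = list(.x))),
    ends = r_list[[person]]$ends %>% map(~list(value = list(.x)))
  )
  # R attributes on the list become XML attributes on <pers>
  attr(p[["pers"]], "name") <- person

  xml_add_child(new_xml, as_xml_document(p))
}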

write_xml(new_xml, "foo.xml")

Output (contents of foo.xml):

<?xml version="1.0" encoding="UTF-8"?>
<category name="personList">
  <pers name="person1">
    <starts>
      <value>a</value>
      <value>b</value>
      <value>c</value>
    </starts>
    <ends>
      <value>d</value>
      <value>e</value>
      <value>f</value>
    </ends>
  </pers>
  <pers name="person2">
    <starts>
      <value>A</value>
      <value>B</value>
      <value>C</value>
      <value>D</value>
    </starts>
    <ends>
      <value>E</value>
      <value>F</value>
      <value>G</value>
      <value>H</value>
    </ends>
  </pers>
</category>
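
For what it's worth, the same nested-list shape can be built with base R only. The following is a minimal sketch rather than part of the code above: `as_value_nodes` is a hypothetical helper name, and the loop handles whatever fields each person has (here `starts` and `ends`) instead of naming them one by one.

```r
library(xml2)

r_list <- list(
  person1 = list(starts = letters[1:3], ends = letters[4:6]),
  person2 = list(starts = LETTERS[1:4], ends = LETTERS[5:8])
)

# Hypothetical helper: turn c("a", "b") into
# list(value = list("a"), value = list("b"))
as_value_nodes <- function(x) {
  setNames(lapply(x, list), rep("value", length(x)))
}

new_xml <- xml_new_root("category", name = "personList")

for (person in names(r_list)) {
  # lapply() keeps the field names, which become the child tag names
  pers <- lapply(r_list[[person]], as_value_nodes)
  # R attribute -> XML attribute on <pers>
  attr(pers, "name") <- person
  xml_add_child(new_xml, as_xml_document(list(pers = pers)))
}

cat(as.character(new_xml))
```

The only change from the loop above is how the inner lists are built; the attribute mapping and `xml_add_child()` call are the same.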
danlooo
  • Hi @danlooo, thanks for your answer. I've been trying your suggestion, but I can't get the a ... correctly. If I understand correctly, the list name (defined with `p[["pers"]] <- list()`) defines the XML tag, and the attribute `name` sets the tag attribute in the XML, but I can't get it to work in the loop – symduk Nov 25 '21 at 14:17
  • @symduk I revised my answer. However, this type of XML serialization is highly inefficient and could be done with much less nesting. – danlooo Nov 25 '21 at 14:47
  • Thanks for your update, your code produces the desired output (I'll need some time to understand each line). I'll add an answer with the method suggested by Limey for completeness, but I'll mark yours as the accepted answer. Have a nice day! – symduk Nov 25 '21 at 15:08

Following the comment by @Limey (see the linked question), I was able to generate the desired output with the code below, which uses the XML package. I'm posting it as an answer for completeness, since @danlooo's answer produces the same output.

library(XML)

r_list <- list(person1 = list(starts = letters[1:3], ends = letters[4:6]), person2 = list(starts = LETTERS[1:4], ends = LETTERS[5:8]))
str(r_list)

# Root node with its attribute
category <- newXMLNode("category", attrs = c(name = "personList"))

for (person in names(r_list)) {
  # One <pers name="..."> node per list entry
  pers <- newXMLNode("pers", attrs = c(name = person), parent = category)
  # One <value> node per element of the "starts" vector
  startsn <- newXMLNode("starts", parent = pers)
  for (i in seq_along(r_list[[person]][["starts"]])) {
    newXMLNode("value", r_list[[person]][["starts"]][[i]], parent = startsn)
  }
  # Same for the "ends" vector
  endsn <- newXMLNode("ends", parent = pers)
  for (i in seq_along(r_list[[person]][["ends"]])) {
    newXMLNode("value", r_list[[person]][["ends"]][[i]], parent = endsn)
  }
}
category
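
On the open question of an `xml2` equivalent, here is a sketch that builds the nodes directly with `xml_add_child()`, assuming the same `r_list` as above. It relies on `xml_add_child()` returning the newly created node, and on named arguments in `...` becoming attributes while unnamed ones become the node text. The loops are still nested, but the field names (`starts`, `ends`) are taken from the list rather than written out by hand.

```r
library(xml2)

r_list <- list(
  person1 = list(starts = letters[1:3], ends = letters[4:6]),
  person2 = list(starts = LETTERS[1:4], ends = LETTERS[5:8])
)

# Root node <category name="personList">
doc <- xml_new_root("category", name = "personList")

for (person in names(r_list)) {
  # <pers name="..."> under the root
  pers <- xml_add_child(doc, "pers", name = person)
  for (field in names(r_list[[person]])) {
    # <starts> or <ends> under <pers>
    field_node <- xml_add_child(pers, field)
    for (v in r_list[[person]][[field]]) {
      # <value>text</value>; the unnamed argument is the node text
      xml_add_child(field_node, "value", v)
    }
  }
}

cat(as.character(doc))
```

`write_xml(doc, "foo.xml")` would save the result to a file, as in the other answer.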
symduk