
I am trying to use R to read an XML file, select a few nodes, and write them to another XML file. I am learning how to handle XML files in R and followed the example at "http://www.r-bloggers.com/r-and-the-web-for-beginners-part-ii-xml-in-r/", which explains how to read the XML and print selected nodes. I want to extend that example: select a range of "plant" nodes (for instance, 1 through 5) and store them in another XML file.

The input XML file looks like this

<?xml version="1.0"?>
<CATALOG>
 <PLANT>
  <COMMON>Bloodroot</COMMON>
  <BOTANICAL>Sanguinaria canadensis</BOTANICAL>
  <ZONE>4</ZONE>
  <LIGHT>Mostly Shady</LIGHT>
  <PRICE>$2.44</PRICE>
  <AVAILABILITY>031599</AVAILABILITY>
 </PLANT>
 <PLANT>
  <COMMON>Columbine</COMMON>
  <BOTANICAL>Aquilegia canadensis</BOTANICAL>
  <ZONE>3</ZONE>
  <LIGHT>Mostly Shady</LIGHT>
  <PRICE>$9.37</PRICE>
  <AVAILABILITY>030699</AVAILABILITY>
 </PLANT>
 .
 .
</CATALOG>

I have the following code

library(XML)
xml.url <- "http://www.w3schools.com/xml/plant_catalog.xml"
xmlfile <- xmlTreeParse(xml.url)
xmltop <- xmlRoot(xmlfile)
saveXML(xmltop[1:5],file="out.xml")

But R gives the error message "Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘saveXML’ for signature ‘"XMLNodeList"’".
Note: Writing the complete XML (using "saveXML(xmltop,file="out.xml")") works fine; it fails only when I try to write the subset.

James

1 Answer


Try something like

top <- xmlNode(xmlName(xmltop))
for(i in 1:5) top <- addChildren(top, xmltop[[i]])
saveXML(top, file="out.xml")
file.show("out.xml")

This creates a new `xmlNode` named `top`, with the same name as the original root, and adds the selected children to it before saving. It is probably not the most elegant way to do it, but it works.
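To try the approach above without network access, here is a minimal self-contained sketch. The inline `xml_text` document (trimmed to three plants) and the `out_file` temporary path are assumptions made for illustration; the calls themselves are the ones already used in the question and in this answer.

```r
library(XML)

# Inline stand-in for the catalog (assumption: same structure as the
# question's file, trimmed to three plants so the sketch needs no download).
xml_text <- '<?xml version="1.0"?>
<CATALOG>
 <PLANT><COMMON>Bloodroot</COMMON><ZONE>4</ZONE></PLANT>
 <PLANT><COMMON>Columbine</COMMON><ZONE>3</ZONE></PLANT>
 <PLANT><COMMON>Marsh Marigold</COMMON><ZONE>4</ZONE></PLANT>
</CATALOG>'

xmlfile <- xmlTreeParse(xml_text, asText = TRUE)
xmltop  <- xmlRoot(xmlfile)

# xmltop[1:2] is a list of nodes (class XMLNodeList, as the error message
# in the question reports), so the nodes are re-attached to a single new
# root node, which saveXML() can write.
top <- xmlNode(xmlName(xmltop))
for (i in 1:2) top <- addChildren(top, xmltop[[i]])

# Hypothetical output location; any writable path would do.
out_file <- tempfile(fileext = ".xml")
saveXML(top, file = out_file)
```

Reading `out_file` back with `xmlParse()` is a quick way to confirm that the result is well-formed XML with only the selected nodes.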

Hope it helps,

Alex

alko989
  • Thanks Alex. It does write this time, but not as XML. It appears as " – James May 07 '14 at 06:52
  • In `saveXML` you are using the argument `file` to specify the output file. If this argument is `NULL` then you get the output in your R console. Are you sure you used a filename there? – alko989 May 07 '14 at 09:23
  • @ alko: It does write the output to the file out.xml. But the issue is the contents do not appear as XML, instead as mentioned above. – James May 07 '14 at 10:51
  • Of course you are right, sorry I did not actually check the output file carefully. I edited my answer. – alko989 May 07 '14 at 12:58
  • Thanks Alex. It works :-) Could you guide me to some manuals/websites where I can learn more about XML handling in R? I would like to do similar processing for large XML files (> 2GB), for which I read from http://www.omegahat.org/RSXML/shortIntro.pdf that I should use xmlEventParse(). But I find it difficult to understand the manual: there are many functions in this package, and I just don't know where to start. Any guidance will be greatly appreciated. – James May 07 '14 at 13:38
  • XML can be really slow; if you need speed and have a predictable source, some grep-like magic can be much faster. As a compromise, use XML and cache the results for further processing. – Dieter Menne May 07 '14 at 13:42
  • The manual of the XML package can be confusing but it has all the information about the available functions. I searched a bit around for a nice tutorial but without success. SO is always a good source for specific questions. – alko989 May 07 '14 at 13:48
  • @ Alex: Learnt from the manual that we could use append.xmlNode function, so if I replace the second line with this "top = append.xmlNode (top, xmltop[1:5])" it gives the same result. Thanks for motivating me to still read the manual :-) A steep learning curve ahead for me, but in the end it will really help I think. – James May 07 '14 at 17:39
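
On the large-file question raised in the comments, a possible starting point with `xmlEventParse()` is sketched below. It is only an illustration under stated assumptions: the inline `xml_text` stands in for a real file, and the `state` environment and handler bodies are hypothetical names chosen for this example. The handlers are called as the parser encounters each element, so no full tree is built in memory.

```r
library(XML)

# Small inline document standing in for a large file (assumption).
xml_text <- '<?xml version="1.0"?>
<CATALOG>
 <PLANT><COMMON>Bloodroot</COMMON><ZONE>4</ZONE></PLANT>
 <PLANT><COMMON>Columbine</COMMON><ZONE>3</ZONE></PLANT>
 <PLANT><COMMON>Marsh Marigold</COMMON><ZONE>4</ZONE></PLANT>
</CATALOG>'

# Mutable state shared by the handlers (hypothetical helper environment).
state <- new.env()
state$names     <- character()
state$in_common <- FALSE
state$buffer    <- ""

handlers <- list(
  # Called at every opening tag; start collecting text inside <COMMON>.
  startElement = function(name, attrs, ...) {
    if (name == "COMMON") {
      state$in_common <- TRUE
      state$buffer    <- ""
    }
  },
  # Text may arrive in several pieces, so it is accumulated in a buffer.
  text = function(content, ...) {
    if (state$in_common) state$buffer <- paste0(state$buffer, content)
  },
  # Called at every closing tag; store the finished <COMMON> value.
  endElement = function(name, ...) {
    if (name == "COMMON") {
      state$names     <- c(state$names, state$buffer)
      state$in_common <- FALSE
    }
  }
)

# For a real file, pass its path and drop asText = TRUE.
xmlEventParse(xml_text, handlers = handlers, asText = TRUE, addContext = FALSE)
common_names <- state$names
```

Writing a subset back out this way takes more work than the tree-based version, since the output has to be assembled inside the handlers.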