As part of my job, I have to analyse data that an external organisation shares with me through a web portal. Several reports are available there, which I can view and download in many formats. Two of these formats are particularly useful: MS Excel and 'XML file with report data'. The Excel files are usually heavily formatted (sub-totals, merged cells, etc.) to suit Excel users, so converting them to a data frame/table is a big hassle. I therefore prefer to download the XML file, parse it, save the result as CSV, and then carry out my analysis in R.

However, whenever I try to parse the XML file directly in R (to avoid the intermediate conversion to CSV), I never succeed. So far I have tried the XML and xml2 packages in R, but to no avail.

Recently I tried this code.

library("XML")
library("methods")
setwd("C:\\Users\\Administrator\\Desktop\\")
res <- xmlParse("Skil.xml")

> res <- xmlParse("Skil.xml")
xmlns: URI RptSancDig_VoucherCompilationSheet is not absolute

rootnode <- xmlRoot(res)
rootsize <- xmlSize(rootnode)

> rootsize
[1] 2

xmldataframe <- xmlToDataFrame("Skil.xml")

> xmldataframe <- xmlToDataFrame("Skil.xml")
xmlns: URI RptSancDig_VoucherCompilationSheet is not absolute

> xmldataframe 
  Textbox24 Textbox63 DDOName_Collection
1      <NA>      <NA>               <NA>
2                                       

For reference, Skil.xml is about 12.1 MB, and Excel parses it successfully.

I have also tried the read_xml() function from xml2, but to no avail.
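
A possible explanation, offered as a sketch under assumptions rather than a confirmed diagnosis: the "URI ... is not absolute" message looks like a namespace warning rather than a parse failure, since xmlRoot() and xmlSize() still return results afterwards. The mostly empty data frame would be consistent with a report export that stores its values in attributes of nested elements, because xmlToDataFrame() reads element text only. The sample below is entirely made up: the element names Details_Collection and Details, the attribute names VoucherNo and Amount, and the namespace urn:example:report are hypothetical stand-ins, since the real file cannot be shared. It uses xml2 to drop the default namespace and collect attributes into a data frame.

```r
library(xml2)

# Hypothetical stand-in for the real export: every element name, attribute
# name, and the namespace URI below are assumptions made for illustration.
sample_xml <- '<?xml version="1.0" encoding="utf-8"?>
<Report xmlns="urn:example:report" Name="Example">
  <DDOName_Collection>
    <DDOName DDOName="Office A">
      <Details_Collection>
        <Details VoucherNo="101" Amount="2500.00" />
        <Details VoucherNo="102" Amount="1200.50" />
      </Details_Collection>
    </DDOName>
    <DDOName DDOName="Office B">
      <Details_Collection>
        <Details VoucherNo="201" Amount="990.00" />
      </Details_Collection>
    </DDOName>
  </DDOName_Collection>
</Report>'

doc <- read_xml(sample_xml)

# A default namespace means un-prefixed XPath names match nothing,
# so strip it (modifies doc in place) before searching.
xml_ns_strip(doc)

# Each Details element is one row; its values live in attributes.
rows <- xml_find_all(doc, ".//Details")

# xml_attrs() returns one named character vector per node; bind them
# into a character matrix and convert to a data frame.
df <- as.data.frame(do.call(rbind, xml_attrs(rows)), stringsAsFactors = FALSE)

# Carry the grouping value down from the enclosing DDOName element.
df$DDOName <- xml_attr(xml_find_first(rows, "ancestor::DDOName"), "DDOName")

# Attributes arrive as text, so convert numeric columns explicitly.
df$Amount <- as.numeric(df$Amount)
```

For the real file, the XPath would need to be adapted to whatever element actually holds the detail rows, which can be checked with xml_name(xml_children(xml_root(doc))) and xml_attrs() on a few nodes. The rbind step also assumes every row element carries the same set of attributes.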

I would happily have shared a sample file, but I am unable to do so. I am also unable to generate a sample file in that XML format.

Can someone help?

AnilGoyal
  • *"but to no avail"* doesn't really give us an idea of what went wrong. While I don't recommend posting the whole 12MB file, perhaps you can repeat your process for a much smaller table (say, 3x5?) and include the verbatim XML in your question. Also, for that sample data, please show the literal error text. Thanks! – r2evans Oct 15 '20 at 05:33
  • @r2evans I'll post screenshots of what I got. As for a sample, I'm not aware of how to generate such a file. The sample files available on the net can be parsed. May I show you a screenshot of that file opened in html? – AnilGoyal Oct 15 '20 at 08:05
