I have received a set of XML files and I'm trying to convert them to data frames in R. The problem is that their structure seems different from the examples in other questions I have found online, so I have no idea how to solve this. I used the XML library.
library(XML)
doc <- xmlParse("ULS_rows.xml")
xmltop <- xmlRoot(doc)
class(xmltop)
#> [1] "XMLInternalElementNode" "XMLInternalNode" "XMLAbstractNode"
xmlName(xmltop)
#> [1] "Workbook"
xmlName(xmltop[[1]])  # name of the root's first child
#> [1] "Styles"
When I print the second child, xmltop[[2]], I get something like this, which corresponds more or less to the content of the first sheet in the XML file:
<Worksheet ss:Name="TOC">
<Names>
<NamedRange ss:Name="Print_Titles" ss:RefersTo="=TOC!R1"/>
</Names>
<Table>
<Row ss:StyleID="HeaderStyle">
<Cell>
<Data ss:Type="String">Sheet Name</Data>
</Cell>
<Cell>
<Data ss:Type="String">Description</Data>
</Cell>
</Row>
<Row>
<Cell ss:StyleID="HyperlinkStyle" ss:HRef="#MemCheROWSru1MemResBri!A1">
<Data ss:Type="String">MemCheROWSru1MemResBri</Data>
</Cell>
<Cell>
<Data ss:Type="String">Member_Check_ROWS.run(1) : Member Result Brief</Data>
</Cell>
</Row>
<Row>
<Cell ss:StyleID="HyperlinkStyle" ss:HRef="#MeChROWSr1NoMeRe201!A1">
<Data ss:Type="String">MeChROWSr1NoMeRe201</Data>
</Cell>
<Cell>
<Data ss:Type="String">Member_Check_ROWS.run(1) : Norsok Member Result 2013</Data>
</Cell>
</Row>
<Row>
<Cell ss:StyleID="HyperlinkStyle" ss:HRef="#MeChROWSr1NoCoRe201!A1">
<Data ss:Type="String">MeChROWSr1NoCoRe201</Data>
</Cell>
<Cell>
<Data ss:Type="String">Member_Check_ROWS.run(1) : Norsok Cone Result 2013</Data>
</Cell>
</Row>
<Row>
<Cell ss:StyleID="HyperlinkStyle" ss:HRef="#MemCheROWSru1JoiResBri!A1">
<Data ss:Type="String">MemCheROWSru1JoiResBri</Data>
</Cell>
<Cell>
<Data ss:Type="String">Member_Check_ROWS.run(1) : Joint Result Brief</Data>
</Cell>
</Row>
<Row>
<Cell ss:StyleID="HyperlinkStyle" ss:HRef="#MeChROWSr1NoJoRe201!A1">
<Data ss:Type="String">MeChROWSr1NoJoRe201</Data>
</Cell>
<Cell>
<Data ss:Type="String">Member_Check_ROWS.run(1) : Norsok Joint Result 2013</Data>
</Cell>
</Row>
</Table>
</Worksheet>
Not knowing where to start, I more or less blindly followed instructions I found online, but code such as:
ffgg <- xmlSApply(xmltop, function(x) xmlSApply(x, xmlValue))
ffgg_df <- data.frame(t(ffgg),row.names=NULL)
gives nothing close to a data frame. Any advice on what the problem is? Am I dealing with a regular XML file, or am I missing something? Thanks.
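For reference, the Workbook / Worksheet / Table / Row / Cell / Data layout shown above appears to match the Excel 2003 "XML Spreadsheet" format, so the file is well-formed XML but every element sits in a namespace. The nested xmlSApply call only reaches the grandchildren of the root (Names and Table) and concatenates all their text, which is why the result does not look like a table. Below is a minimal sketch of walking the rows and cells explicitly with XPath instead. It rests on several assumptions: the namespace URI is the usual one for this format (check the xmlns declarations on the real Workbook element), a small inline document stands in for "ULS_rows.xml", the first Row of each sheet is the header, and worksheet_to_df is a hypothetical helper name, not part of the XML package.

```r
library(XML)

# Assumption: a small inline document stands in for "ULS_rows.xml".
xml_text <- '<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="TOC">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">Sheet Name</Data></Cell>
    <Cell><Data ss:Type="String">Description</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">MemCheROWSru1MemResBri</Data></Cell>
    <Cell><Data ss:Type="String">Member_Check_ROWS.run(1) : Member Result Brief</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">MemCheROWSru1JoiResBri</Data></Cell>
    <Cell><Data ss:Type="String">Member_Check_ROWS.run(1) : Joint Result Brief</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>'

doc <- xmlParse(xml_text, asText = TRUE)

# Assumption: the namespace URI commonly used by this format.
# Elements in a default namespace still need a prefix in XPath.
ns <- c(ss = "urn:schemas-microsoft-com:office:spreadsheet")

# Hypothetical helper: turn one Worksheet node into a data frame,
# treating the first Row as the header.
worksheet_to_df <- function(sheet) {
  rows <- getNodeSet(sheet, "./ss:Table/ss:Row", namespaces = ns)
  # One character vector of cell texts per row
  cells <- lapply(rows, function(row) {
    vapply(getNodeSet(row, "./ss:Cell", namespaces = ns),
           function(cell) trimws(xmlValue(cell)),
           character(1))
  })
  body <- do.call(rbind, cells[-1])
  df <- as.data.frame(body, stringsAsFactors = FALSE)
  names(df) <- cells[[1]]
  df
}

# Select the sheet by its ss:Name attribute
toc_node <- getNodeSet(doc, "//ss:Worksheet[@ss:Name='TOC']", namespaces = ns)[[1]]
toc <- worksheet_to_df(toc_node)
print(toc)
```

This sketch assumes every row has the same number of cells. Sheets with skipped cells (the format allows an ss:Index attribute on Cell) or merged cells would need extra handling before the rows can be bound together.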