Trying to extract two attributes from the XML file extract (from a large XML file) namely 'nmRegime' and 'CalendarSystemT' (this is the date). Once extract those two records need to be saved as two columns in a data frame in R along with the filename.
There are several 'event' nodes within one given XML file and there are nearly 100 individual XML files.
<Event tEV="FirA" clearEV="false" onEV="true" dateOriginEV="Calendar" nYrsFromStEV="" nDaysFromStEV="" tFaqEV="Blank" tAaqEV="Blank" aqStYrEV="0" aqEnYrEV="0" nmEV="Fire_Cool" categoryEV="CatUndef" tEvent="Doc" idSP="105" nmRegime="Wheat, Tilled, stubble cool burn" regimeInstance="1">
<notesEV></notesEV>
<dateEV CalendarSystemT="FixedLength">19710331</dateEV>
<FirA fracAfctFirA="0.6" fracGbfrToAtmsFirA="0.98" fracStlkToAtmsFirA="0.98" fracLeafToAtmsFirA="0.98" fracGbfrToGlitFirA="0.02" fracStlkToSlitFirA="0.02" fracLeafToLlitFirA="0.02" fracCortToCodrFirA="1.0" fracFirtToFidrFirA="1.0" fracDGlitToAtmsFirA="0.931" fracRGlitToAtmsFirA="0.931" fracDSlitToAtmsFirA="0.931" fracRSlitToAtmsFirA="0.931" fracDLlitToAtmsFirA="0.931" fracRLlitToAtmsFirA="0.931" fracDCodrToAtmsFirA="0.0" fracRCodrToAtmsFirA="0.0" fracDFidrToAtmsFirA="0.0" fracRFidrToAtmsFirA="0.0" fracDGlitToInrtFirA="0.019" fracRGlitToInrtFirA="0.019" fracDSlitToInrtFirA="0.019" fracRSlitToInrtFirA="0.019" fracDLlitToInrtFirA="0.019" fracRLlitToInrtFirA="0.019" fracDCodrToInrtFirA="0.0" fracRCodrToInrtFirA="0.0" fracDFidrToInrtFirA="0.0" fracRFidrToInrtFirA="0.0" fracSopmToAtmsFirA="" fracLrpmToAtmsFirA="" fracMrpmToAtmsFirA="" fracSommToAtmsFirA="" fracLrmmToAtmsFirA="" fracMrmmToAtmsFirA="" fracMicrToAtmsFirA="" fracSopmToInrtFirA="" fracLrpmToInrtFirA="" fracMrpmToInrtFirA="" fracSommToInrtFirA="" fracLrmmToInrtFirA="" fracMrmmToInrtFirA="" fracMicrToInrtFirA="" fracMnamNToAtmsFirA="" fracSAmmNToAtmsFirA="" fracSNtrNToAtmsFirA="" fracDAmmNToAtmsFirA="" fracDNtrNToAtmsFirA="" fixFirA="" phaFirA="" />
</Event>
Had some success in extracting 'nmRegime' but no success with 'CalendarSystemT'. Used below code for data extraction.
The second question, is there a way to loop the list of XML files and do this operation?
# get records
library(xml2)
recs <- xml_find_all(xml, "//Event")
#extract the names
labs <- trimws(xml_attr(recs, "nmRegime"))
names <- labs[!is.na(labs)]
# Extract the date
recs_t <- xml_find_all(xml, "//Event/dateEV")
time <- trimws(xml_attr(recs_t, "CalendarSystemT"))