I am trying to use the XML package with either the xmlToList or the xmlToDataFrame function. My input data is on the internet (first two lines below), and I only need to work with a certain part of the XML (see the getNodeSet call on the third line).
library(XML)
url <- 'http://ClinicalTrials.gov/show/NCT00191100?resultsxml=true'
xml <- xmlTreeParse(url, useInternalNodes = TRUE)
ns <- getNodeSet(xml, '/clinical_study/clinical_results/reported_events/serious_events/category_list')
It is a list of categories; each category contains “events”, and each event has counts that are specific to a clinical trial arm (e.g., drug vs. placebo).
I only need the events. The clearest listing I have found uses xmlToList, shown here for cardio-respiratory arrest:
xl<-xmlToList(url)
set2<-xl$clinical_results$reported_events$serious_events$category_list
set2[[3]]
> set2[[3]]
$title
[1] "Cardiac disorders"
$event_list
$event_list$event
$event_list$event$sub_title
[1] "Cardio-respiratory arrest"
$event_list$event$counts
group_id events subjects_affected subjects_at_risk
"E1" "1" "1" "260"
$event_list$event$counts
group_id events subjects_affected subjects_at_risk
"E2" "0" "0" "255"
I am not able to use xmlToDataFrame because of the error below. (The second node set has all of its data in XML attributes, and I think xmlToDataFrame may not like this.)
hopefulyDF <- getNodeSet(xml, '/clinical_study/clinical_results/reported_events/serious_events/category_list/category/event_list/event/counts')
xmlToDataFrame(nodes = hopefulyDF)
Error in matrix(vals, length(nfields), byrow = TRUE) :
'data' must be of a vector type, was 'NULL'
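To check my guess that the counts data sits only in attributes, here is a minimal offline sketch. The XML fragment is hand-written by me to mimic the structure shown in the xmlToList output above (it is an assumption about the layout, not a copy of the real file), and the names txt, doc, counts and att are just placeholders.

```r
library(XML)

# Hand-written fragment imitating the structure seen in the xmlToList output;
# the real document is much larger. Values are taken from the output above.
txt <- '<clinical_study><clinical_results><reported_events><serious_events>
<category_list>
  <category>
    <title>Cardiac disorders</title>
    <event_list>
      <event>
        <sub_title>Cardio-respiratory arrest</sub_title>
        <counts group_id="E1" events="1" subjects_affected="1" subjects_at_risk="260"/>
        <counts group_id="E2" events="0" subjects_affected="0" subjects_at_risk="255"/>
      </event>
    </event_list>
  </category>
</category_list>
</serious_events></reported_events></clinical_results></clinical_study>'

doc <- xmlParse(txt, asText = TRUE)
counts <- getNodeSet(doc, "//category/event_list/event/counts")

# Number of child nodes under each counts element
sapply(counts, xmlSize)

# Attributes of each counts element, one row per element
att <- t(sapply(counts, xmlAttrs))
att
```

If the real file is laid out like this fragment, the counts elements have no child elements at all, which would explain why xmlToDataFrame finds nothing to put into columns.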
What is the best way to extract the counts data? I tried unlist, but I am probably not advanced enough in R. I would like to avoid a loop and manual xmlGetAttr calls, but in the worst case any solution is accepted. I find the XML package very dense, with two representations of XML data: as lists and as node sets... :-(
The ideal output would look like this (for all events, not just element 3):
event                      group_ID  numerator  denominator
Cardio-respiratory arrest  E1        1          260
Cardio-respiratory arrest  E2        0          255
(Having a category column as well, e.g. "Cardiac disorders", would be even better.)
P.S. I tried the approaches from the questions "How to transform XML data into a data.frame?" and "R list to data frame", but with no luck. :-(