I'm trying to convert a .dat file into a data frame using the code below, but it gives the mentioned error. Any help will be highly appreciated. The data file is attached here: https://drive.google.com/file/d/1y7IMpsnrCXSZXXFU4F6SUvDUFGeWPAnt/view?usp=sharing
Code So Far:
library(XML)
require(plyr)
library(stringr)
dat <- readLines("NTISDATD-Events-2020-05-10-Day8.dat")
datDF <- data.frame(
  tags = unlist(str_extract_all(dat, "<([^>]*)>(?=[^>]*</\\1>)")),
  values = unlist(str_extract_all(dat, "(?<=<([^>]{1,100})>).*(?=</\\1>)"))
)
datDF
Desired Output:
  tags                       values
1 <d2lm:country>             gb
2 <d2lm:nationalIdentifier>  NTIS
3 <d2lm:feedType>            Event Data
4 <d2lm:publicationTime>     2020-05-10T00:00:44.778+01:00
5 <d2lm:country>             gb
6 <d2lm:nationalIdentifier>  NTIS
7 <d2lm:areaOfInterest>      national
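For reference, here is a minimal sketch of one way the tag/value layout above might be produced. It assumes (the error text is not reproduced here) that the failure is the row-count mismatch `data.frame()` raises when the two `str_extract_all()` calls return different numbers of matches, which can happen when the greedy `.*` in the second pattern spans several elements on one line. The sample lines in `dat` are hypothetical stand-ins for the real file, and the single pattern is my own suggestion, not something confirmed against the attached data.

```r
library(stringr)

# Hypothetical sample lines standing in for readLines() on the real .dat file
dat <- c(
  "<d2lm:exchange>",
  "<d2lm:country>gb</d2lm:country><d2lm:nationalIdentifier>NTIS</d2lm:nationalIdentifier>",
  "<d2lm:feedType>Event Data</d2lm:feedType>"
)

# One pattern captures the tag name (group 1) and its text (group 2) together,
# so every tag is paired with exactly one value. Only leaf elements match,
# because the value part may not contain "<". Attributes, if any, are skipped.
pattern <- "<([^/>\\s]+)[^>]*>([^<]*)</\\1>"

# str_match_all() returns one matrix per input line; stack them row-wise
m <- do.call(rbind, str_match_all(dat, pattern))

datDF <- data.frame(
  tags = sprintf("<%s>", m[, 2]),  # sprintf() yields character(0) when there are no matches
  values = m[, 3],
  stringsAsFactors = FALSE
)
datDF
```

Because both columns come from the same match matrix, they always have the same length, so `data.frame()` cannot fail on differing row counts. With the real file, `dat` would come from `readLines("NTISDATD-Events-2020-05-10-Day8.dat")` as in the code above.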
Many Thanks