TL;DR
The underlying problem is that the LibKML library is missing on Windows. My solution is to extract the data directly from the KML file with a function.
Problem
I ran into the same problem, and after some googling it appears to be related to LibKML on Windows. Executing the same code on my Ubuntu machine yielded different results: there, the ExtendedData was retrieved when loading the saved KML file.
library(rgdal)
library(dplyr)
poly_df <- data.frame(x = c(1, 1, 0, 0), y = c(1, 0, 0, 1))
poly <- poly_df %>%
  Polygon %>%
  list %>%
  Polygons(ID = "1") %>%
  list %>%
  SpatialPolygons(proj4string = CRS("+init=epsg:4326")) %>%
  SpatialPolygonsDataFrame(data = data.frame(test = "this is a test"))
writeOGR(poly, "test.kml", driver = "KML", layer = "poly")
poly2 <- readOGR("test.kml")
poly2@data
If one managed to build LibKML [1], one would be able to load KML files including the ExtendedData [2].
On Windows, LibKML needs to be built with Visual Studio 2005 [1]. This Visual Studio version is no longer supported [3]. In [3], user2889419 supplies a link to the 2005 version.
I downloaded and installed that version, but building LibKML eventually failed with a lot of errors and warnings (certain files do not exist). This is where I stopped, because I am way out of my comfort zone, but I wanted to share the results of my chase.
Solution in R
My solution is to read the KML directly and extract the ExtendedData, while loading the spatial object via rgdal's readOGR. My assumption is that readOGR reads the features from the top of the file downwards, as does my extraction routine, so that the rows of both results line up. Both are then merged and the output is a SpatialPolygonsDataFrame.
I had some trouble extracting the nodes from the KML files at first because I was not aware of the concept of namespaces [4]. (I edited the following function because I ran into trouble with KML files of other origins.)
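To make the namespace point concrete, here is a minimal sketch on an inline KML snippet that I made up for illustration. KML documents declare a default namespace, which xml2 exposes under the prefix d1, so an XPath query without that prefix finds nothing.

```r
library(xml2)

# Hypothetical minimal KML document, for illustration only
kml <- '<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark><name>a</name></Placemark>
  </Document>
</kml>'
doc <- read_xml(kml)

# List the namespaces of the document; the default one is exposed as d1
xml_ns(doc)

# Without the prefix the query matches no nodes ...
n_plain <- length(xml_find_all(doc, ".//Placemark"))
# ... with the d1 prefix it matches the Placemark
n_prefixed <- length(xml_find_all(doc, ".//d1:Placemark"))
```

The function below uses this d1 prefix in all of its XPath queries.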
library(rgdal)
library(xml2)
library(dplyr) # provides %>%
readKML <- function(file, keep_name_description = FALSE, layer, ...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns
  # in the resulting SpatialPolygonsDataFrame. Only works when there is
  # ExtendedData in the kml file.
  sp_obj <- readOGR(file, layer, ...)
  xml1 <- read_xml(file)
  if (!missing(layer)) {
    # restrict the search to the Folder whose name matches the layer
    different_layers <- xml_find_all(xml1, ".//d1:Folder")
    layer_names <- different_layers %>%
      xml_find_first(".//d1:name") %>%
      xml_contents() %>%
      xml_text()
    selected_layer <- layer_names == layer
    if (!any(selected_layer)) stop("Layer does not exist.")
    xml2 <- different_layers[selected_layer]
  } else {
    xml2 <- xml1
  }
  # extract the variable names from the first ExtendedData block
  variable_names1 <-
    xml_find_first(xml2, ".//d1:ExtendedData") %>%
    xml_children()
  # descend until the nodes carry a "name" attribute (or have no children left)
  while (any(is.na(xml_attr(variable_names1, "name"))) &&
         length(xml_children(variable_names1)) > 0) {
    variable_names1 <- xml_children(variable_names1)
  }
  variable_names <- variable_names1 %>%
    xml_attr("name") %>%
    unique()
  # return sp_obj if no ExtendedData is present; xml_attr() never returns NULL,
  # so test for an empty or all-NA result instead of is.null()
  if (length(variable_names) == 0 || all(is.na(variable_names))) return(sp_obj)
  # collect the leaf nodes of all ExtendedData blocks
  data1 <- xml_find_all(xml2, ".//d1:ExtendedData") %>%
    xml_children()
  while (length(xml_children(data1)) > 0) {
    data1 <- xml_children(data1)
  }
  # one row per feature, one column per variable
  data <- data1 %>%
    xml_text() %>%
    matrix(ncol = length(variable_names), byrow = TRUE) %>%
    as.data.frame()
  colnames(data) <- variable_names
  if (keep_name_description) {
    # keep the columns read by readOGR next to the ExtendedData
    try(sp_obj@data <- cbind(sp_obj@data, data), silent = TRUE)
  } else {
    sp_obj@data <- data
  }
  sp_obj
}
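The core extraction step can be tried in isolation. The following is a minimal sketch on an inline KML snippet (made up for illustration, and assuming the SchemaData/SimpleData layout); it skips readOGR and the layer handling and only shows how the field values are reshaped into one row per Placemark.

```r
library(xml2)

# Hypothetical KML with two Placemarks and two fields each (illustration only)
kml <- '<kml xmlns="http://www.opengis.net/kml/2.2"><Document><Folder>
  <name>poly</name>
  <Placemark><ExtendedData><SchemaData schemaUrl="#poly">
    <SimpleData name="id">1</SimpleData>
    <SimpleData name="test">first</SimpleData>
  </SchemaData></ExtendedData></Placemark>
  <Placemark><ExtendedData><SchemaData schemaUrl="#poly">
    <SimpleData name="id">2</SimpleData>
    <SimpleData name="test">second</SimpleData>
  </SchemaData></ExtendedData></Placemark>
</Folder></Document></kml>'
doc <- read_xml(kml)

# All field nodes, in document order
fields <- xml_find_all(doc, ".//d1:ExtendedData//d1:SimpleData")
field_names <- unique(xml_attr(fields, "name"))

# One row per Placemark: fill the matrix row by row
values <- matrix(xml_text(fields), ncol = length(field_names), byrow = TRUE)
attrs <- as.data.frame(values, stringsAsFactors = FALSE)
colnames(attrs) <- field_names
attrs
```

Note that the row-wise fill only works if every Placemark carries the same fields in the same order; otherwise the columns would be misaligned.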
Old: extracting via readLines
This earlier version follows the same idea, but parses the file line by line with readLines instead of treating it as XML. It assumes that each ExtendedData block opens and closes on its own line, with one field per line in between. As above, the result is merged with the output of readOGR into a SpatialPolygonsDataFrame.
library(tidyverse)
library(rgdal)
readKML <- function(file, keep_name_description = FALSE, ...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns
  # in the resulting SpatialPolygonsDataFrame. Only works when there is
  # ExtendedData in the kml file.
  if (!grepl("\\.kml$", file)) stop("File is not a KML file.")
  if (!file.exists(file)) stop("File does not exist.")
  map <- readOGR(file, ...)
  f1 <- readLines(file)
  # get positions of ExtendedData in document
  exdata_position <- grep("ExtendedData", f1) %>%
    matrix(ncol = 2, byrow = TRUE) %>%
    apply(1, function(x) {
      pos <- x[1]:x[2]
      pos[2:(length(pos) - 1)]
    }) %>%
    t %>%
    as.data.frame
  # if there is no ExtendedData return SpatialPolygonsDataFrame
  if (ncol(exdata_position) == 0) return(map)
  # Get Name of different columns
  extract1 <- f1[unlist(exdata_position[1, ])]
  names_of_data <- extract1 %>%
    strsplit("name=\"") %>%
    lapply(function(x) strsplit(x[[2]], split = "\"")) %>%
    unlist(recursive = FALSE) %>%
    lapply(function(x) return(x[1])) %>%
    unlist
  # Extract Extended Data
  dat <- lapply(seq(nrow(exdata_position)), function(x) {
    extract2 <- f1[unlist(exdata_position[x, ])]
    extract2 %>%
      strsplit(">") %>%
      lapply(function(x) strsplit(x[[2]], split = "<")) %>%
      unlist(recursive = FALSE) %>%
      lapply(function(x) return(x[1])) %>%
      unlist %>%
      matrix(nrow = 1) %>%
      as.data.frame
  }) %>%
    do.call(rbind, .)
  # Rename columns
  colnames(dat) <- names_of_data
  # Check if Name and Description should be dropped
  if (keep_name_description) {
    map@data <- cbind(map@data, dat)
  } else {
    map@data <- dat
  }
  map
}
[1] https://github.com/google/libkml/wiki/Building-and-installing-libkml
[2] https://github.com/r-spatial/sf/issues/499
[3] Where to download visual studio express 2005?
[4] Parsing XML in R: Incorrect namespaces