I created a path in Google Earth and then copied and pasted the KML file by following these instructions (https://developers.google.com/kml/faq#validation - How do I create KML files?).

Using R's XML package, I had no problems parsing the file with xmlInternalTreeParse:

doc2<-xmlInternalTreeParse("ROUTE_3.kml")

But this is what I got when I tried to use xpathApply:

xpathApply(doc2,"/kml//coordinates",xmlValue)
list()

After I stripped out the attributes of the kml tag, I got the following:

    xpathApply(doc2,"/kml//coordinates",xmlValue)
    [[1]]
    [1] "4.538678046760991,43.96218242485241,0 4.536099605055323,43.96220903572051,0              
    4.53771014982657,43.96415063050954,0 4.536106012183452,43.96535632643623,0  
    4.538664824256699,43.9660402294286,0 4.539486616025195,43.96777930035288,0 
    4.54165951159373,43.96623221715382,0 4.543909553814832,43.96588360581748,0 
    4.541906820403621,43.96447824521096,0 4.543519784610379,43.96288529313735,0 
    4.540449258644572,43.9633940089841,0 4.544185719673153,43.9516337999984,0 
    4.536212701406948,43.94157791460842,0 4.539125112498221,43.96125976359349,0"

I checked the original KML file using http://www.kmlvalidator.com/home.htm and it said the file was "valid and complies with best practices". I'm new to XPath (and XML in general), so any advice on how to handle this issue with the kml tag attributes would be appreciated.

Now that I have the coordinates as an element of a list, is there a clever way to make a three-column data frame with lon, lat and elv as column headers? I tried the following (thanks to: Split column at delimiter in data frame), but I'm sure there is a better way. Please let me know if you have a more straightforward solution. Thank you.

ll<-xpathApply(doc2,"/kml//coordinates",xmlValue)
s<-ll[[1]]
ss<-strsplit(s,split=" ")

df <- data.frame(do.call('rbind', strsplit(as.character(ss[[1]]),',',fixed=TRUE)))
colnames(df)<-c("lon", "lat", "elv")
df
                lon               lat elv
1  4.538678046760991 43.96218242485241   0
2  4.536099605055323 43.96220903572051   0
3   4.53771014982657 43.96415063050954   0
4  4.536106012183452 43.96535632643623   0
5  4.538664824256699  43.9660402294286   0
6  4.539486616025195 43.96777930035288   0
7   4.54165951159373 43.96623221715382   0
8  4.543909553814832 43.96588360581748   0
9  4.541906820403621 43.96447824521096   0
10 4.543519784610379 43.96288529313735   0
11 4.540449258644572  43.9633940089841   0
12 4.544185719673153  43.9516337999984   0
13 4.536212701406948 43.94157791460842   0
14 4.539125112498221 43.96125976359349   0
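One possible base-R alternative, offered as a sketch rather than a definitive answer: split the coordinate string on any run of whitespace and let read.table() handle the commas and the numeric conversion. The string `s` below is a shortened stand-in for the value returned by xmlValue, and the variable names are only illustrative.

```r
# Stand-in for the string returned by xmlValue(); whitespace is deliberately messy.
s <- "\n 4.538678046760991,43.96218242485241,0 4.536099605055323,43.96220903572051,0 \n4.53771014982657,43.96415063050954,0\n"

# Trim both ends, then split on any run of spaces, tabs or newlines.
tuples <- strsplit(trimws(s), "[[:space:]]+")[[1]]

# read.table() splits each tuple on commas and converts the fields to numeric.
df <- read.table(text = tuples, sep = ",",
                 col.names = c("lon", "lat", "elv"),
                 colClasses = "numeric")
str(df)
```

Unlike the rbind/strsplit approach, the columns come back as numeric rather than character or factor, so they can be plotted or used in calculations directly.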

Here is the original kml file:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
<Document>
  <name>KmlFile</name>
    <StyleMap id="m_ylw-pushpin">
    <Pair>
        <key>normal</key>
        <styleUrl>#s_ylw-pushpin</styleUrl>
    </Pair>
    <Pair>
        <key>highlight</key>
        <styleUrl>#s_ylw-pushpin_hl</styleUrl>
    </Pair>
</StyleMap>
<Style id="s_ylw-pushpin">
    <IconStyle>
        <scale>1.1</scale>
        <Icon>
            <href>http://maps.google.com/mapfiles/kml/pushpin/ylw-pushpin.png</href>
        </Icon>
        <hotSpot x="20" y="2" xunits="pixels" yunits="pixels"/>
    </IconStyle>
</Style>
<Style id="s_ylw-pushpin_hl">
    <IconStyle>
        <scale>1.3</scale>
        <Icon>
            <href>http://maps.google.com/mapfiles/kml/pushpin/ylw-pushpin.png</href>
        </Icon>
        <hotSpot x="20" y="2" xunits="pixels" yunits="pixels"/>
    </IconStyle>
</Style>
<Placemark>
    <name>ROUTE_3</name>
    <styleUrl>#m_ylw-pushpin</styleUrl>
    <LineString>
        <tessellate>1</tessellate>
        <coordinates>
4.538678046760991,43.96218242485241,0 
4.536099605055323,43.96220903572051,0 
4.53771014982657,43.96415063050954,0
4.536106012183452,43.96535632643623,0 
4.538664824256699,43.9660402294286,0 
4.539486616025195,43.96777930035288,0 
4.54165951159373,43.96623221715382,0 
4.543909553814832,43.96588360581748,0 
4.541906820403621,43.96447824521096,0 
4.543519784610379,43.96288529313735,0 
4.540449258644572,43.9633940089841,0 
4.544185719673153,43.9516337999984,0 
4.536212701406948,43.94157791460842,0      
4.539125112498221,43.96125976359349,0 
        </coordinates>
    </LineString>
</Placemark>
</Document>
</kml>

UPDATE: After a little more reading, specifically the Details section of the XML package documentation page titled "Find matching nodes in an internal XML tree/DOM", I now know that the kml tag attributes declare namespaces, so I corrected the xpathApply call to:

xpathApply(doc2,"/kml:kml//kml:coordinates",xmlValue)

Note that the path now includes the kml: namespace.
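A minimal sketch of why the prefix matters, assuming the XML package is installed. The small document below is a stand-in, not the original file, and the prefix `k` is an arbitrary name chosen for illustration; the `namespaces` argument of xpathApply binds it to the KML namespace URI, which is handy when the file does not declare a `kml:` prefix itself.

```r
library(XML)

# A tiny stand-in document that declares only a default namespace.
kml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document><Placemark><LineString>
    <coordinates>4.5,43.9,0 4.6,44.0,0</coordinates>
  </LineString></Placemark></Document>
</kml>'
doc <- xmlParse(kml_text, asText = TRUE)

# An unprefixed name in XPath matches only elements in no namespace,
# so this query finds nothing in a namespaced document.
no_ns <- xpathApply(doc, "//coordinates", xmlValue)

# Bind a prefix of our own choosing to the KML namespace URI and use it in the path.
ns <- c(k = "http://www.opengis.net/kml/2.2")
with_ns <- xpathApply(doc, "/k:kml//k:coordinates", xmlValue, namespaces = ns)

length(no_ns)
with_ns[[1]]
```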

Now I can use a kml file without modifications. Here is an example wrapped in a function:

library(XML)
KML_geo_path_coordinates_to_dataframe<-function(kml_file){
#this requires the XML library
doc2<-xmlInternalTreeParse(kml_file)
#the namespace issue (kml:) is explained in the getNodeSet(XML) R documentation under Details
ll<-xpathApply(doc2,"/kml:kml//kml:coordinates",xmlValue)
# ll is a list; take the first element, a long string of coordinate tuples separated by whitespace
s<-ll[[1]]
#trim the ends, then split on any run of spaces, tabs or newlines
#(deleting "\n" outright would glue two tuples together when no space precedes the line break)
ss<-strsplit(trimws(s),split="[[:space:]]+")

#split out the coordinate sets lon, lat, elv
df <- data.frame(do.call('rbind', strsplit(ss[[1]],',',fixed=TRUE)))
colnames(df)<-c("lon", "lat", "elv")

return(df)
}
user2461125

1 Answer

This just implements @Gavin's excellent suggestion (and assumes the file is named map.kml).

library(rgdal)

setwd("<directory containing kml file>")

system(paste("ogrinfo", "map.kml")) # diagnostic to identify the layers
# Had to open data source read-only.
# INFO: Open of `map.kml'
#       using driver `KML' successful.
# 1: KmlFile (Line String)          <- This is the layer name

map <- readOGR(dsn="map.kml",layer="KmlFile")
df  <- data.frame(map@lines[[1]]@Lines[[1]]@coords)
colnames(df) <- c("lon","lat")
df

#         lon      lat
# 1  4.538678 43.96218
# 2  4.536100 43.96221
# 3  4.537710 43.96415
# 4  4.536106 43.96536
# 5  4.538665 43.96604
# ...

Some notes:

  1. The KML driver for readOGR(...) expects the file name (optionally with path) as the dsn, and the text of the kml name tag as the layer. The system call at the beginning identifies the layer(s).

  2. readOGR(...) throws out the z-dimension. So if you need that, this approach will not work for you.

  3. The location of the coordinates will depend on the geometry and the number of elements. In your case you have just one path.

  4. There was actually an error in your file, on line 2 (missing closing quote in the xmlns:gx namespace declaration). You need to fix that or the file will not import.
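To illustrate note 3, here is a rough sketch of collecting the coordinates when there is more than one path. It assumes the sp package is installed and builds a small SpatialLines object by hand as a hypothetical stand-in for what readOGR(...) returns; `lines_to_df` is an illustrative helper name, not part of any package.

```r
library(sp)

# Hypothetical stand-in for the object readOGR() returns: two paths, one segment each.
path_a <- Lines(list(Line(cbind(c(4.53, 4.54), c(43.96, 43.97)))), ID = "a")
path_b <- Lines(list(Line(cbind(c(4.55, 4.56, 4.57), c(43.90, 43.91, 43.92)))), ID = "b")
map <- SpatialLines(list(path_a, path_b))

# Walk every segment of every path and stack the coordinates,
# tagging each row with the ID of the path it came from.
lines_to_df <- function(sl) {
  per_path <- lapply(sl@lines, function(p) {
    xy <- do.call(rbind, lapply(p@Lines, function(seg) seg@coords))
    data.frame(path = p@ID, lon = xy[, 1], lat = xy[, 2])
  })
  do.call(rbind, per_path)
}

df_all <- lines_to_df(map)
df_all
```

With a single path, as in the question, this reduces to the map@lines[[1]]@Lines[[1]]@coords lookup used above.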

jlhoward