4

As a follow up on a question I posted yesterday, is there package/function within R that can give me the country (and continent) in which a point defined by longitude and latitude is located? Something along the lines of what is done here in MatLab.

Dataframe looks like this...

Point_Name              Longitude   Latitude
University of Arkansas  36.067832   -94.173655
Lehigh University       40.601458   -75.360063
Harvard University      42.379393   -71.115897

And I would like to output the above dataframe with country and continent columns added to it. As an added bonus, a column with US states for those in the US (and "other" for those outside the US)?

Community
  • 1
  • 1
rg255
  • 4,119
  • 3
  • 22
  • 40
  • 1
    Have a look at [**this question**](http://stackoverflow.com/questions/14334970/convert-latitude-and-longitude-coordinates-to-country-name-in-r?rq=1), picked from the the list of questions under "Related" in the right margin on this page. – Henrik Feb 11 '14 at 17:19
  • [This answer](http://stackoverflow.com/a/8751965/980833) will get you the states. – Josh O'Brien Feb 11 '14 at 17:21
  • Thanks, I don't know how I missed those (I had looked at quite a number but didn't find those ones) - as soon as I get the script working based on the old answers I will delete my Q. – rg255 Feb 11 '14 at 17:35
  • (currently country and state are generally working - some coming up with NA in country, and trying to get continent sorted) – rg255 Feb 11 '14 at 17:55

2 Answers2

12

To get continents you can modify the last line of the coords2country function using rworldmap from this answer to create a coords2continent function as shown below. Choose whether you want the 6 or 7 continent model. I'll think about putting this code into rworldmap.

library(sp)
library(rworldmap)

# The single argument to this function, points, is a data.frame in which:
#   - column 1 contains the longitude in degrees
#   - column 2 contains the latitude in degrees
coords2continent = function(points)
{  
  countriesSP <- getMap(resolution='low')
  #countriesSP <- getMap(resolution='high') #you could use high res map from rworldxtra if you were concerned about detail

  # converting points to a SpatialPoints object
  # setting CRS directly to that from rworldmap
  pointsSP = SpatialPoints(points, proj4string=CRS(proj4string(countriesSP)))  


  # use 'over' to get indices of the Polygons object containing each point 
  indices = over(pointsSP, countriesSP)

  #indices$continent   # returns the continent (6 continent model)
  indices$REGION   # returns the continent (7 continent model)
  #indices$ADMIN  #returns country name
  #indices$ISO3 # returns the ISO3 code 
}

Here is a test.

points = data.frame(lon=c(0, 90, -45, -100, 130), lat=c(52, 40, -10, 45, -30 ))

coords2continent(points)
#[1] Europe        Asia          South America North America Australia  
coords2country(points)
#[1] United Kingdom  China   Brazil   United States of America  Australia
Community
  • 1
  • 1
Andy
  • 1,821
  • 13
  • 23
  • 1
    How come the continents seem to be a bit off? Like Mexico and Central America are part of South America (instead of North America) and all of Russia is part of Europe, whereas some of it should be in Asia? – Alex May 31 '20 at 18:58
  • Nice code, only problem I have is that points that I would like to assign points that fall into the ocean to the closest continent, because at the moment the result is NA. – Herman Toothrot Oct 13 '20 at 16:44
1

So this is an alternative that uses the reverse geocoding API on Google. This code is based in part on this reference.

Calling your dataframe above df,

reverseGeoCode <- function(latlng) {
  require("XML")
  require("httr")
  latlng    <- as.numeric(latlng)
  latlngStr <- gsub(' ','%20', paste(round(latlng,2), collapse=","))
  url   <- "http://maps.google.com"
  path  <- "/maps/api/geocode/xml"
  query <- list(sensor="false",latlng=latlngStr)
  response <- GET(url, path=path, query=query)
  if (response$status !=200) {
    print(paste("HTTP Error:",response$status),quote=F)
    return(c(NA,NA))
  }
  xml    <- xmlInternalTreeParse(content(response,type="text"))
  status <- xmlValue(getNodeSet(xml,"//status")[[1]])
  if (status != "OK"){
    print(paste("Query Failed:",status),quote=F)
    return(c(NA,NA))
  }
  xPath   <- '//result[1]/address_component[type="country"]/long_name[1]'
  country <- xmlValue(getNodeSet(xml,xPath)[[1]])
  xPath   <- '//result[1]/address_component[type="administrative_area_level_1"]/long_name[1]'
  state   <- xmlValue(getNodeSet(xml,xPath)[[1]])
  return(c(state=state,country=country))
}
st.cntry <- t(apply(df,1,function(x)reverseGeoCode(x[2:3])))
result   <- cbind(df,st.cntry)
result
#               Point_Name Longitude  Latitude         state       country
# 1 University of Arkansas  36.06783 -94.17365      Arkansas United States
# 2      Lehigh University  40.60146 -75.36006  Pennsylvania United States
# 3     Harvard University  42.37939 -71.11590 Massachusetts United States

In the API definition, "administrative_area_level_1" is the highest administrative area below country. In the US these are states. In other countries the definition varies (might be provinces, for example).

Incidentally, I'm fairly certain you have latitude and longitude reversed.

jlhoward
  • 58,004
  • 7
  • 97
  • 140