I have a large data set of longitudes and latitudes that I want to convert into UK postcodes. I first tried downloading all UK postcodes with their corresponding longitude/latitude and joining the two data sets on the coordinates. This worked for some rows, but the majority didn't match because each postcode's latitude and longitude is the centre of that postcode, whereas my data is more precise.

I've also tried a bit of code that converts latitude/longitude in America to the corresponding state (given by Josh O'Brien in "Latitude Longitude Coordinates to State Code in R"), but I couldn't find a way to adapt it to UK postcodes.

I've also tried running a calculation that finds the closest postcode to each long/lat pair, but this creates a file too large for R to handle.

I've also seen some code that uses Google Maps (geocoding), and this does seem to work, but I've read it only allows 2000 calculations a day and I have far more than that (around 5 million rows of data).

epo3
Naveed
  • By _convert_ you mean find the nearest? If so, and R cannot cope with this, then you could try dedicated GIS software, for example, this could be done in [PostGIS](https://postgis.net) and I would imagine [QGIS](https://www.qgis.org) too. – Dan Winchester Jan 08 '18 at 15:26
  • I hate the GIS crowd FUD abt R. It's plenty speedy (it uses the same underlying libraries most GIS products do), especially with `sf`. @Naveed: do you have a shapefile with postcode info? – hrbrmstr Jan 08 '18 at 15:28
  • Thank you for your comment @hrbrmstr, and I don't have a shapefile at the moment. The only postcode information I have is a list of all UK postcodes with their corresponding longitudes and latitudes – Naveed Jan 08 '18 at 15:35
  • Thanks @DanWinchester, yes I do mean find the nearest. I might try this as a last resort as I haven't used GIS software before. Would it be able to handle 5 million records? This could be broken down if not – Naveed Jan 08 '18 at 15:37
  • Can't you just geocode your postcodes, e.g. with https://postcodes.io/ – Phil Jan 08 '18 at 15:43
  • Hi @Phil , I could, but unfortunately this only allows 2,500 runs a day, I have way more data than this – Naveed Jan 08 '18 at 15:50
  • @Naveed [PostGIS](https://postgis.net) could easily handle 5 million records. I have found [PostGIS](https://postgis.net) better able to handle large datasets than [QGIS](https://www.qgis.org). – Dan Winchester Jan 08 '18 at 19:12

2 Answers

You might want to try my PostcodesioR package which includes reverse geocoding functions. However, there is a limit to the number of API calls to postcodes.io.

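# Install the development version from GitHub (requires devtools)
devtools::install_github("ropensci/PostcodesioR")
library(PostcodesioR)
# Arguments are longitude first, then latitude
reverse_geocoding(0.127, 51.507)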

Another option is to use this function for reverse geocoding more than one pair of geographical coordinates:

geolocations_list <- list(
  geolocations = data.frame(
    longitude = c(-3.15807731271522, -1.12935802905177),
    latitude = c(51.4799900627036, 50.7186356978817),
    limit = c(NA, 100L),
    radius = c(NA, 500L)
  )
)

bulk_rev_geo <- bulk_reverse_geocoding(geolocations_list)

bulk_rev_geo[[1]]$result[[1]]
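
For more than a handful of coordinates, the input has to be split into several request bodies of the same shape as `geolocations_list`. The sketch below only builds those batches. The helper name `make_batches`, the data frame `coords`, and the batch size of 100 pairs per request are assumptions made for illustration, so check the postcodes.io documentation for the actual per-request limit before relying on it.

```r
# Split a data frame of coordinates into request bodies for
# bulk_reverse_geocoding(). `coords` is a hypothetical data frame with
# `longitude` and `latitude` columns. `batch_size = 100` is an assumed
# per-request limit, not a value confirmed in this thread.
make_batches <- function(coords, batch_size = 100L) {
  # Batch number for every row: 1, 1, ..., 2, 2, ...
  groups <- ceiling(seq_len(nrow(coords)) / batch_size)
  lapply(split(coords, groups), function(chunk) {
    rownames(chunk) <- NULL
    list(geolocations = chunk[, c("longitude", "latitude")])
  })
}

# Toy input: the two coordinate pairs from above, repeated to 250 rows
coords <- data.frame(
  longitude = rep(c(-3.15807731271522, -1.12935802905177), times = 125),
  latitude  = rep(c(51.4799900627036, 50.7186356978817), times = 125)
)
batches <- make_batches(coords)
length(batches)  # 3 batches: 100, 100 and 50 rows

# Not run here: each batch would be one call to the API.
# results <- lapply(batches, PostcodesioR::bulk_reverse_geocoding)
```

Even with batching, 5 million rows means tens of thousands of requests, so this is probably only practical for a subset of the data.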

Given the size of your data set and the usual limits on API calls, you might want to download the latest database of UK geographical data and join it to your files locally, matching each point to its nearest postcode rather than on exact coordinates.
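
A minimal sketch of such a local match, assuming the downloaded file has been read into a data frame with `postcode`, `longitude` and `latitude` columns. The function name `nearest_postcode` and the column names are hypothetical. It uses `RANN::nn2()` (a k-d tree search) when the RANN package is installed, which avoids building a full distance matrix, and falls back to a slow chunked brute-force search otherwise. This is one possible approach, not something taken from the PostcodesioR package.

```r
# Convert longitude/latitude in degrees to 3-D unit-sphere coordinates.
# Straight-line (chord) distance between such points grows with
# great-circle distance, so the nearest neighbour is the same in both.
to_xyz <- function(lon, lat) {
  lon <- lon * pi / 180
  lat <- lat * pi / 180
  cbind(cos(lat) * cos(lon), cos(lat) * sin(lon), sin(lat))
}

# `points` and `postcodes` are hypothetical data frames with `longitude`
# and `latitude` columns; `postcodes` also has a `postcode` column.
nearest_postcode <- function(points, postcodes, chunk_size = 50L) {
  if (nrow(points) == 0L) {
    points$postcode <- character(0)
    points$distance_m <- numeric(0)
    return(points)
  }
  ref <- to_xyz(postcodes$longitude, postcodes$latitude)
  qry <- to_xyz(points$longitude, points$latitude)

  if (requireNamespace("RANN", quietly = TRUE)) {
    # k-d tree search: no full distance matrix is ever built
    idx <- RANN::nn2(ref, qry, k = 1)$nn.idx[, 1]
  } else {
    # Fallback: brute force in small chunks. Memory stays bounded, but
    # this is only practical for small inputs.
    idx <- integer(nrow(qry))
    for (s in seq(1L, nrow(qry), by = chunk_size)) {
      e <- min(s + chunk_size - 1L, nrow(qry))
      # For unit vectors, the largest dot product is the smallest distance
      dots <- qry[s:e, , drop = FALSE] %*% t(ref)
      idx[s:e] <- apply(dots, 1L, which.max)
    }
  }

  # Convert the chord length back to metres along the Earth's surface
  chord <- sqrt(rowSums((qry - ref[idx, , drop = FALSE])^2))
  points$postcode <- postcodes$postcode[idx]
  points$distance_m <- 2 * 6371000 * asin(pmin(chord / 2, 1))
  points
}

# Toy data with placeholder labels (not real postcodes)
postcodes <- data.frame(
  postcode  = c("PC_A", "PC_B"),
  longitude = c(-3.158, -1.129),
  latitude  = c(51.480, 50.719),
  stringsAsFactors = FALSE
)
points <- data.frame(
  longitude = c(-3.160, -1.130),
  latitude  = c(51.481, 50.718)
)
matched <- nearest_postcode(points, postcodes)
matched
```

The `distance_m` column makes it possible to discard matches that are implausibly far from any postcode centre.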

epo3
  • Thank you for this Epo3. Do you know if the bulk reverse geocode has a limit? For instance, would the code above be classed as one run or two (as you are changing 2 long/lats to postcodes)? If it's one then I could split my data down a bit and run them – Naveed Jan 09 '18 at 08:40
  • No idea. I never hit the limit. – epo3 Jan 09 '18 at 11:58
  • Is that because you've always worked with data lower than the limit (2500) or you've just not had a problem yet? – Naveed Jan 09 '18 at 14:38

I believe you want to do "reverse geocoding" with the Google Maps API. That is, you pass the latitude and longitude and get back the closest address. After that you can easily take just the postcode from the address. (It is an item in the list you receive as an address from the Google Maps API.)
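
As a rough illustration of pulling the postcode out of such an address, the sketch below walks a parsed response and returns the first component whose `types` include `"postal_code"`. It assumes the JSON has already been parsed into nested lists (for example with `jsonlite::fromJSON(txt, simplifyVector = FALSE)`). The function name `extract_postcode` and the example response are hand-written stand-ins, not real API output.

```r
# Return the first postal code found in a parsed reverse-geocoding
# response, or NA if there is none. `response` is assumed to be a nested
# list with `results`, each holding `address_components`.
extract_postcode <- function(response) {
  for (result in response$results) {
    for (component in result$address_components) {
      if ("postal_code" %in% unlist(component$types)) {
        return(component$long_name)
      }
    }
  }
  NA_character_
}

# Hand-written stand-in for a parsed response (placeholder values only)
example_response <- list(
  status = "OK",
  results = list(
    list(address_components = list(
      list(long_name = "Example Road", short_name = "Example Rd",
           types = list("route")),
      list(long_name = "ZZ1 1ZZ", short_name = "ZZ1 1ZZ",
           types = list("postal_code"))
    ))
  )
)
extract_postcode(example_response)
```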

The API (last time I checked) allows 2500 free calls per day, but there are several tricks (depending on your dataset) for matching more records:

  1. You can populate your dataset with 2400 records each day until it is complete or
  2. You can change your IP and API key a few times to get more records in a single day or
  3. You can always get a premium API key and pay for the number of requests you make

I did such geocoding in R a few years ago by following this popular tutorial: https://www.r-bloggers.com/using-google-maps-api-and-r/

Unfortunately the tutorial code is a bit out-of-date, so you will need to fix a few things to adapt it to your needs.

Borislav Aymaliev
  • Hi Borislav, thanks for this but I've dabbled in this a little bit already. I've got about 5million rows of data at the moment so this wouldn't really work (even if I broke it down into smaller pieces) – Naveed Jan 08 '18 at 15:48