What is the fastest way to get the local time zone name, as text, for every point in a large dataset of coordinates? My current method works, but the package I'm using, rundel/timezone (simple and great for small sets), is quite slow on large ones.

Is there a faster way to accomplish the task reproduced below?

library(data.table)

# Reproduce the data
data <- data.table(latitude = sample(seq(47, 52, by = 0.001), 1000000, replace = TRUE),
                   longitude = sample(seq(8, 23, by = 0.001), 1000000, replace = TRUE))

# Get the timezone package from rundel/timezone
if (!require("timezone")) devtools::install_github("rundel/timezone")
library(timezone)

# Current slow method
system.time(data[, timezone := find_tz(longitude, latitude)])
#   user  system elapsed
# 49.017  21.394  74.086
Neal Barsch

1 Answer

I came across the lutz package when I saw this question, and it appears to work for the OP, so I am leaving a note here. The package provides the function tz_lookup_coords(), whose method argument takes one of two values: method = "fast" or method = "accurate". Choose the first for speed and the second for accuracy. The timings below show a large difference: about 10 seconds elapsed for "fast" versus about 155 seconds for "accurate" on one million points.

library(data.table)  # needed for data.table() and the := assignment below
library(lutz)

set.seed(111)
data <- data.table(latitude = sample(seq(47, 52, by = 0.001), 1000000, replace = TRUE),
                   longitude = sample(seq(8, 23, by = 0.001), 1000000, replace = TRUE))

system.time(data[, timezone := tz_lookup_coords(lat = latitude, lon = longitude, method = "fast")])

#user  system elapsed 
#6.46    3.42    9.92 

#Warning message:
#Using 'fast' method. This can cause inaccuracies in timezones
#near boundaries away from populated ares. Use the 'accurate'
#method if accuracy is more important than speed. 

system.time(data[, timezone := tz_lookup_coords(lat = latitude, lon = longitude, method = "accurate")])

#  user  system elapsed 
#154.44    0.18  154.93 
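
If the "accurate" method is needed and the coordinates repeat (for example, many rows recorded at the same stations), one option is to look up each distinct pair only once and copy the result back to every row. The sketch below is only an illustration in base R: lookup_unique() is a hypothetical helper, not part of lutz, and it takes the lookup function as an argument. It will not help much on the simulated data above, where almost every pair is distinct.

```r
# Hypothetical helper (not part of lutz): call `lookup` once per distinct
# coordinate pair and copy the result back to every matching row.
lookup_unique <- function(lat, lon, lookup) {
  stopifnot(length(lat) == length(lon))
  key   <- paste(lat, lon, sep = "_")    # one text key per coordinate pair
  first <- !duplicated(key)              # TRUE at the first occurrence of each pair
  tz    <- lookup(lat[first], lon[first]) # the expensive call, on unique pairs only
  tz[match(key, key[first])]             # map results back to the original row order
}

# Usage with lutz, guarded so the snippet still runs if lutz is not installed.
if (requireNamespace("lutz", quietly = TRUE)) {
  lat <- c(48.1, 48.1, 50.5)
  lon <- c(11.5, 11.5, 14.2)
  lookup_unique(lat, lon, function(la, lo) {
    lutz::tz_lookup_coords(lat = la, lon = lo, method = "fast")
  })
}
```

The gain depends entirely on how many rows share a coordinate pair, so it is worth checking the number of distinct pairs before relying on it.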
jazzurro
  • From your comment this is exactly what I ended up with. It's about 10x faster than the previous. Didn't know lutz existed. – Neal Barsch Oct 20 '18 at 23:24
  • @NealBarsch I was surprised to see the difference. I am glad that this package helped you! – jazzurro Oct 21 '18 at 00:17
  • Thanks - I added both R libraries mentioned here to [the list](https://stackoverflow.com/questions/16086962/how-to-get-a-time-zone-from-a-location-using-latitude-and-longitude-coordinates). – Matt Johnson-Pint Oct 22 '18 at 01:58