The following should work. Suppose we generate a data frame of 2 million rows:
> N <- 2e6
> R <- data.frame(year = sample(2000:2009,N,TRUE),
+ dayofyear = sample(365,N,TRUE),
+ time = floor(runif(N,0,12))*100+floor(runif(N,0,60)),
+ humidity = 99,
+ temp = floor(runif(N,15,40)))
> R$date <- as.Date(with(R,strptime(paste(year,dayofyear),
+ "%Y %j", tz="GMT")))
> nrow(R)
[1] 2000000
> head(R)
  year dayofyear time humidity temp       date
1 2000       206  307       99   39 2000-07-24
2 2009       101 1019       99   16 2009-04-11
3 2004       307  547       99   21 2004-11-02
4 2003       270 1158       99   33 2003-09-27
5 2006        21  330       99   22 2006-01-21
6 2005       154  516       99   21 2005-06-03
>
In this case, date is already a Date column, but if yours is a character column, then:
> R$date <- as.Date(R$date)
should only take a few seconds.
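As a rough sketch of that conversion (the sample strings below are made up, and your column's format may well differ): as.Date parses ISO-style "YYYY-MM-DD" strings by default, while anything else needs an explicit format argument.

```r
# Hypothetical character dates; the real column's format may differ.
x <- c("2002-01-24", "2009-04-11")
d1 <- as.Date(x)                        # ISO format is parsed by default

y <- c("24/01/2002", "11/04/2009")
d2 <- as.Date(y, format = "%d/%m/%Y")   # non-ISO strings need a format

# Both routes should give the same Date values.
stopifnot(inherits(d1, "Date"), all(d1 == d2))
```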
Now, get a list of all the unique date values. This should be quite fast:
> dates <- unique(R$date)
> print(length(dates))
[1] 3650
>
Now, run getSunlightTimes on this vector. This only took a couple of seconds on my machine using suncalc version 0.4 and R version 3.4.4:
> times <- suncalc::getSunlightTimes(dates, lat=0, lon=0)
Now, generate an index vector giving the position of each date in R$date within the vector of unique dates, dates:
> i <- match(R$date, dates)
Now, select rows of the times data frame by this same index:
> solarNoons <- times[i,]
> nrow(solarNoons)
[1] 2000000
>
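If it helps to see the unique/match trick in isolation, here is a small self-contained sketch in base R. The slow_lookup function is a made-up stand-in for getSunlightTimes (it is not part of suncalc), and the data are random toy values.

```r
# Hypothetical stand-in for an expensive per-date computation.
slow_lookup <- function(d) data.frame(date = d, value = as.numeric(d) %% 7)

set.seed(1)
all_dates <- as.Date("2000-01-01") + sample(0:9, 50, replace = TRUE)

u   <- unique(all_dates)       # compute only once per distinct date
res <- slow_lookup(u)          # one result row per unique date
idx <- match(all_dates, u)     # position of each original date within u
expanded <- res[idx, ]         # one result row per original row

# Each expanded row lines up with the date it was looked up for.
stopifnot(nrow(expanded) == length(all_dates),
          all(expanded$date == all_dates))
```

The expensive call runs once per distinct date rather than once per row, which is where the speed-up comes from.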
If we pick a row of R:
> R[1234567,]
        year dayofyear time humidity temp       date
1234567 2002        24  535       99   17 2002-01-24
you'll see that the corresponding row of solarNoons is the result for that date:
> solarNoons[1234567,]
                        date lat lon           solarNoon               nadir
2616.352 2002-01-24 12:00:00   0   0 2002-01-24 12:13:14 2002-01-24 00:13:14
                     sunrise              sunset          sunriseEnd
2616.352 2002-01-24 06:09:42 2002-01-24 18:16:46 2002-01-24 06:11:58
                 sunsetStart                dawn                dusk
2616.352 2002-01-24 18:14:30 2002-01-24 05:47:49 2002-01-24 18:38:39
                nauticalDawn        nauticalDusk            nightEnd
2616.352 2002-01-24 05:22:22 2002-01-24 19:04:06 2002-01-24 04:56:50
                       night       goldenHourEnd          goldenHour
2616.352 2002-01-24 19:29:38 2002-01-24 06:38:39 2002-01-24 17:47:49
>
If you want, you can merge the two data frames together:
> R2 <- cbind(R, solarNoons)
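One thing to watch for: both data frames have a date column, and cbind keeps both under the same name. A minimal sketch of dropping the duplicate before binding, using two tiny made-up data frames (a and b are illustrative only, not the real R and solarNoons):

```r
# Toy data frames that share a "date" column (values are made up).
a <- data.frame(date = as.Date("2002-01-24") + 0:2, temp = c(17, 21, 33))
b <- data.frame(date = a$date, noon = c("12:13", "12:13", "12:14"))

both  <- cbind(a, b)   # keeps two columns that are both named "date"

# Drop the shared column from the second frame before binding.
clean <- cbind(a, b[, setdiff(names(b), "date"), drop = FALSE])

stopifnot(sum(names(both) == "date") == 2,
          identical(names(clean), c("date", "temp", "noon")))
```

The same idea should carry over to the real data by dropping date from solarNoons before the cbind.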
This all assumes that "1.65 MM" meant 1.65 million. If you meant 1.65 million million (i.e., an American trillion), then you're going to need a bigger computer.