I have a data frame called result that looks like this:

lat lng Night
41.60701 1.000831 2019-06-19
41.98151 1.973059 2020-04-11
... ... ...

Basically, I would like to add 4 columns: one for the time of sunset, a second for the time of sunrise, a third for the duration of the night in hours, and a fourth for the sampling effort (the duration of the night plus twice the time buffer). I managed to do this with a loop in the following code (using getSunlightTimes from the suncalc package).

library("plyr")
library("dplyr")
library("reshape")
library("data.table")
library("stringr")
library("tidyr")
library("ineq")
library("suncalc")

time_buff <- 0.30
# Start from empty lists; ls() would return the names of workspace objects
posta <- list()
sorti <- list()
night_hours <- list()
temp <- result
for (i in seq_len(nrow(temp))) {
  lat <- temp$lat[i]
  long <- temp$lng[i]
  sset <- as.Date(temp$Night[i])
  sris <- sset + 1
  Tsset <- getSunlightTimes(sset, lat, long,
    keep = c("sunrise", "sunset"), tz = "UTC"
  )$sunset
  Tsris <- getSunlightTimes(sris, lat, long,
    keep = c("sunrise", "sunset"), tz = "UTC"
  )$sunrise
  posta[i] <- Tsset
  sorti[i] <- Tsris
  # Explicit units, so the difference is always expressed in hours
  night_hours[i] <- round(as.numeric(difftime(Tsris, Tsset, units = "hours")), 2)
}


# fetch results
temp$sun_set <- as.POSIXct(as.numeric(unlist(posta)),
  origin = "1970-01-01", tz = "UTC"
)
temp$sun_rise <- as.POSIXct(as.numeric(unlist(sorti)),
  origin = "1970-01-01", tz = "UTC"
)
temp$night_hours <- as.numeric(unlist(night_hours))
temp$night_effort <- as.numeric(temp$night_hours) + (time_buff * 2)

result <- temp

But it takes a very long time to run. Is there a simpler way to do this, for example using the mutate function from the dplyr package instead of a loop?

lobarth
  • You need to provide a [minimal, reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of your data. – M-- Apr 13 '23 at 14:27
  • Like, provide a sample of my real data? – lobarth Apr 13 '23 at 14:30
  • yeah, only two rows would be sufficient. And what package is `getSunlightTimes` from? – M-- Apr 13 '23 at 14:31
  • It's from the suncalc package. OK, thanks, I will complete the data frame – lobarth Apr 13 '23 at 14:32
  • It's done. Is it OK, or is something missing? – lobarth Apr 13 '23 at 14:37
  • @barthdufau if your date is in `22-06-2000`, then the `as.Date` needs `format` argument – akrun Apr 13 '23 at 14:39
  • Sorry, I replaced the data with the right format @akrun – lobarth Apr 13 '23 at 14:41
  • @barthdufau with dplyr, you can use rowwise i.e. `data %>% mutate(sset = as.Date(Night), sris = sset + 1) %>% rowwise %>% mutate(Tsset = getSunlightTimes(sset, lat, lng, keep ="sunset", tz = "UTC")$sunset, Tsris = getSunlightTimes(sris, lat, lng, keep ="sunrise", tz = "UTC")$sunrise) %>% ungroup` (it may not be that fast though) – akrun Apr 13 '23 at 14:49

2 Answers

The basic calculation can be done in the tidyverse with rowwise: getSunlightTimes is not vectorized over lat and lon, so we have to pass a single pair of coordinates at a time. If there are duplicated 'lat', 'lng' pairs, it may be better to use group_by(lat, lng) instead of rowwise and then pass first(lat), first(lng) in the getSunlightTimes call.

library(dplyr)
library(suncalc)

data %>%
  # getSunlightTimes() takes a single lat/lon, so evaluate one row at a time
  rowwise() %>%
  mutate(sset = as.Date(Night), sris = sset + 1) %>%
  mutate(
    Tsset = getSunlightTimes(sset, lat, lng, keep = "sunset", tz = "UTC")$sunset,
    Tsris = getSunlightTimes(sris, lat, lng, keep = "sunrise", tz = "UTC")$sunrise
  ) %>%
  ungroup()

-output

# A tibble: 2 × 7
    lat   lng Night      sset       sris       Tsset               Tsris              
  <dbl> <dbl> <chr>      <date>     <date>     <dttm>              <dttm>             
1  41.6  1.00 2019-06-19 2019-06-19 2019-06-20 2019-06-19 19:34:19 2019-06-20 04:22:55
2  42.0  1.97 2020-04-11 2020-04-11 2020-04-12 2020-04-11 18:29:30 2020-04-12 05:17:10

data

data <- structure(list(lat = c(41.60701, 41.98151), lng = c(1.000831, 
1.973059), Night = c("2019-06-19", "2020-04-11")), class = "data.frame", row.names = c(NA, 
-2L))
akrun
  • When I run this code the console says: Error in .buildData(date = date, lat = lat, lon = lon, data = data) : 'lat' must be a unique element. Use 'data' for multiple 'lat' – lobarth Apr 13 '23 at 14:57
  • @barthdufau there was a typo. If you add `rowwise` as in the update, it should work – akrun Apr 13 '23 at 14:59

Update:

We don't need to use a group_by or rowwise. Reading ?getSunlightTimes tells us to use data as an alternative if we have multiple coordinates:

date : Date. Single or multiple Date. YYYY-MM-DD

lat : numeric. Single latitude

lon : numeric. Single longitude

data : data.frame. Alternative to use date, lat, lon for passing multiple coordinates

keep : character. Vector of variables to keep. See Details

tz : character. Timezone of results

So we can pass the whole data frame to the function, provided its columns are named date, lat and lon. See below (result and time_buff are defined in the sample data further down):

result %>% 
  mutate(night = as.Date(night)) %>% 
  mutate(sunset = getSunlightTimes(data = transmute(., 
                                          date = night, lat = lat, lon = long), 
                                   keep = "sunset")$sunset,
         sunrise = getSunlightTimes(data = transmute(., 
                                           date = night + 1, lat = lat, lon = long), 
                                    keep = "sunrise")$sunrise,
         night_hr = as.numeric(round(difftime(sunrise, sunset, units = "hour"), 2)),
         night_effort = night_hr + (time_buff * 2))

#> # A tibble: 2 x 7
#>     lat   long night      sunset              sunrise             night_hr night_effort
#>   <dbl>  <dbl> <date>     <dttm>              <dttm>                 <dbl>        <dbl>
#> 1  40.0  -75.2 2023-04-13 2023-04-13 23:37:13 2023-04-14 10:25:55     10.8         11.4
#> 2  34.1 -118.  2023-04-01 2023-04-02 02:14:19 2023-04-02 13:40:21     11.4         12.0
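
For the column names used in the question (lat, lng, Night), the same idea might look like the sketch below. It uses base R for the column assignments; the two sample rows and time_buff are taken from the question, the output column names follow the original loop, and evening and morning are just illustrative helper names. It assumes the rows returned by getSunlightTimes come back in the same order as the input.

```r
library(suncalc)

# Sample rows from the question
result <- data.frame(
  lat = c(41.60701, 41.98151),
  lng = c(1.000831, 1.973059),
  Night = c("2019-06-19", "2020-04-11")
)
time_buff <- 0.30

night <- as.Date(result$Night)

# getSunlightTimes(data = ...) expects columns named date, lat and lon
evening <- data.frame(date = night, lat = result$lat, lon = result$lng)
morning <- data.frame(date = night + 1, lat = result$lat, lon = result$lng)

# One vectorized call per event instead of one call per row
result$sun_set <- getSunlightTimes(data = evening, keep = "sunset", tz = "UTC")$sunset
result$sun_rise <- getSunlightTimes(data = morning, keep = "sunrise", tz = "UTC")$sunrise

# Night length in hours, then the effort with the buffer added twice
result$night_hours <- round(
  as.numeric(difftime(result$sun_rise, result$sun_set, units = "hours")), 2
)
result$night_effort <- result$night_hours + 2 * time_buff
```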


We can use rowwise instead of a loop. Or better, group_by(lat, long) and only pass the first lat and long for each group.

library(lubridate)
library(dplyr)
library(suncalc)

result <- data.frame(lat = c(39.9526,34.0522), 
                     long = c(-75.1652, -118.243), 
                     night = c(mdy("4/13/2023"),mdy("4/01/2023")))
time_buff <- 0.3

result %>% 
  group_by(lat, long) %>% 
  mutate(sunset = getSunlightTimes(as.Date(night), lat[1], long[1])$sunset,
         sunrise = getSunlightTimes(as.Date(night) + 1, lat[1], long[1])$sunrise,
         night_hr = as.numeric(round(difftime(sunrise, sunset, units = "hour"), 2)),
         night_effort = night_hr + (time_buff * 2)) %>% 
  ungroup()

#> # A tibble: 2 x 7
#>     lat   long night      sunset              sunrise             night_hr night_effort
#>   <dbl>  <dbl> <date>     <dttm>              <dttm>                 <dbl>        <dbl>
#> 1  40.0  -75.2 2023-04-13 2023-04-13 23:37:13 2023-04-14 10:25:55     10.8         11.4
#> 2  34.1 -118.  2023-04-01 2023-04-02 02:14:19 2023-04-02 13:40:21     11.4         12.0
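
A note on why difftime is called with explicit units here rather than subtracting the timestamps directly: subtraction picks the unit automatically from the size of the gap, so as.numeric() on the result can silently mean hours in one case and minutes in another. A small base R illustration, reusing the sunset and sunrise times from the first row of the other answer's output (the 19:50 timestamp is made up for the comparison):

```r
sunset  <- as.POSIXct("2019-06-19 19:34:19", tz = "UTC")
sunrise <- as.POSIXct("2019-06-20 04:22:55", tz = "UTC")
later   <- as.POSIXct("2019-06-19 19:50:00", tz = "UTC")  # made-up timestamp

# Plain subtraction chooses the unit from the magnitude of the difference
d_night <- sunrise - sunset
d_short <- later - sunset
units(d_night)  # "hours"
units(d_short)  # "mins"

# Explicit units always return hours, whatever the size of the gap
night_hr <- round(as.numeric(difftime(sunrise, sunset, units = "hours")), 2)
short_hr <- as.numeric(difftime(later, sunset, units = "hours"))
```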
M--
  • When I execute this code the console said : Error in .buildData(date = date, lat = lat, lon = lon, data = data) : 'lat' must be a unique element. Use 'data' for multiple 'lat' – lobarth Apr 13 '23 at 14:57
  • @barthdufau you see that I have `rowwise`, right? That is actually important for this to work. – M-- Apr 13 '23 at 16:16
  • @barthdufau I added an alternative which does not need looping and is considerably faster. – M-- Apr 13 '23 at 17:04