Using googledrive4 package, accessing files on google drive is wonderfully easy.
p_load(googledrive4, tidyverse)
dribble1 <- drive_ls()
> dribble1
# A dribble: 2000 × 3
name id drive_resource
<chr> <drv_id> <list>
1 somefile1.zip 1w76E2ze0p00jtxxxxxxxxxxxxxxxxxxx <named list [37]>
2 somefile2.zip 1Zau_jwYlDHFK4xxxxxxxxxxxxxxxxxxx <named list [37]>
...
however, I am struggling to filter the result, based on parameters nested inside the "drive_resource" named list. I wish for example, to filter by date-time to create a dribble that contains only files that are saved after certain date.
in desperation and after many try-and-error, I achieved what I wanted, in 2 steps like this:
p_load(tidyverse, lubridate)
# 1 - unnest to make a list of ID's that match my criteria
range <- interval(as_date("2021/1/1", now())
filtered_list <- dribble1 %>%
unnest_longer(col = drive_resource) %>%
filter(drive_resource_id == "modifiedTime") %>%
unnest_longer(drive_resource, values_to = "modtime") %>%
mutate(modtime = as_datetime(modtime)) %>%
filter(modtime %within% range)
# 2 - filter the original dribble with filtered list of ID's
result_dribble <- dribble1 %>%
filter(id %in% filtered_list$id)
This works, but I feel there must be a better way to handle nested lists more elegantly, without creating intermediate objects.
Could someone please shed some light on this?
(sorry for the lack of reprex. dribbles are constructed in a unique way I do not yet fully understand, and couldn't use datapasta to reconstruct a dribble containing nested named list)
above is a simplified example, the data I am processing is much larger, I hope the screenshot from RStudio makes sense to people not familiar with googledrive4.