I am trying to write an R script that takes the file names in a directory and compares them with the file names listed in a CSV file, so that I can tell which files have already been downloaded and which I still need to download. I have written code that lists the files in the directory as a data frame and reads in the CSV file. However, I am having trouble trimming each file path down to the string I want and matching it against the name column in the CSV file. Ideally, I would also like to create a new spreadsheet showing which files match, so I know what has been downloaded. This is what I have so far:
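# load the packages used below
library(dplyr)   # provides %>%
library(readxl)  # provides read_excel()
# read files from directory and list as df
file_names <- list.files(path = "peaches/",
                         pattern = "jpg",
                         all.files = TRUE,
                         full.names = TRUE,
                         recursive = TRUE) %>%
  # turn into a df with one column, file_name
  data.frame(file_name = .)
# read in the Excel file
name_data <- read_excel("peaches/all_data.xlsx")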
# change file_name from the full path, e.g. peaches//fruit/1234/12pink.jpg.txt, to just 12pink
# match the trimmed file name with the name column in name_data
# create a new spreadsheet that pulls the id and row for each file that has been downloaded
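
One possible way to fill in the three commented steps above, as a minimal sketch in base R rather than a definitive solution. The two small data frames are hypothetical stand-ins so the example runs on its own. The `id` and `name` columns are assumed from the description of the spreadsheet, and the `downloaded` column and output file name are made up for illustration.

```r
# Hypothetical stand-in for the data frame built from list.files().
file_names <- data.frame(
  file_name = c("peaches//fruit/1234/12pink.jpg.txt",
                "peaches//fruit/5678/34red.jpg"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the spreadsheet; column names are assumptions.
name_data <- data.frame(
  id   = c(1, 2, 3),
  name = c("12pink", "34red", "56gold"),
  stringsAsFactors = FALSE
)

# Step 1: drop the directories with basename(), then remove everything
# from the first dot onward, so "12pink.jpg.txt" becomes "12pink".
file_names$name <- sub("\\..*$", "", basename(file_names$file_name))

# Step 2: flag each spreadsheet row whose name has a matching file.
name_data$downloaded <- name_data$name %in% file_names$name

# Step 3: split the rows into already downloaded and still needed,
# then write the downloaded rows to a new file.
downloaded   <- name_data[name_data$downloaded, ]
still_needed <- name_data[!name_data$downloaded, ]
out_path <- file.path(tempdir(), "downloaded.csv")  # hypothetical output path
write.csv(downloaded, out_path, row.names = FALSE)
```

Note that the `sub()` pattern cuts at the first dot, so it only works if the wanted part of the name contains no dots itself. If an .xlsx file is preferred over a CSV, a package such as writexl (its `write_xlsx()` function) could replace the `write.csv()` call.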