
I have:

  • directories (let's say two: A and B) that contain files;
  • two character objects storing the directories (dir_A, dir_B);
  • a function that takes the directory as argument and returns the list of the names of the files found there (in a convenient way for me that is different from list.files()).
directories <- c(dir_A, dir_B)
read_names <- function(x) {foo}

Using a for-loop, I want to create objects that each contain the list of files of a different directory, as given by read_names(). Essentially, I want to use a for-loop to do the equivalent of:

files_A <- read_names(dir_A)
files_B <- read_names(dir_B)

I wrote the loop as follows:

for (i in directories) {
  assign(paste("files_", sub('.*\\_', '', deparse(substitute(i))), sep = ""), read_names(i))
}

However, although outside of the for-loop deparse(substitute(dir_A)) returns "dir_A" (and, consequently, the sub() call written as above would return "A"), it seems that inside the for-loop substitute(i) no longer refers to one of the directories and simply returns i.

It follows that deparse(substitute(i)) returns "i" and that the for-loop above produces only one object, called files_i, which contains the list of the files in the last directory of the iteration, because that is the last one written to files_i.

How can I make the for-loop read the name (or part of the name in my case, but it is the same) of the object that i is representing in that moment?

Matteo
  • After you write `directories <- c(dir_A, dir_B)` there is no tie to the variable `dir_A`, only the value of `dir_A` at the time that you executed the statement creating directories. – G5W Nov 18 '19 at 15:15

1 Answer


There are two issues here, I think:

  1. How to reference both the name (or index) and the value of each element within a list; and
  2. How to transfer data from a named list into the global (or any) environment.

1. Reference name/index with data

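Once you iterate with for (i in directories), the context of i within directories (its index and its name) is lost. Some alternatives follow; the name-based ones assume that directories actually has names, for example directories <- c(dir_A = dir_A, dir_B = dir_B), since otherwise names(directories) is NULL: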

for (ix in seq_along(directories)) {
   directories[[ix]]             # the *value*
   names(directories)[ix]        # the *name*
   ix                            # the *index*
   # ...
}

for (nm in names(directories)) {
   directories[[nm]]             # the *value*
   nm                            # the *name*
   match(nm, names(directories)) # the *index*
   # ...
}
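
Applied to the question, the name-based loop could look roughly like the sketch below. The paths are made up, and read_names() is only a stand-in that wraps list.files(), since the real function is not shown in the question.

```r
# Hypothetical paths; in the question these are the real directories
dir_A <- "path/to/A"
dir_B <- "path/to/B"

# Name the elements explicitly so that each name travels with its value
directories <- c(dir_A = dir_A, dir_B = dir_B)

# Stand-in for the question's read_names() (assumption: takes one directory)
read_names <- function(x) list.files(x)

for (nm in names(directories)) {
  # turn "dir_A" into "files_A"
  target <- paste0("files_", sub(".*_", "", nm))
  # look the value up by name and store the result under the new name
  assign(target, read_names(directories[[nm]]))
}
```

After the loop, files_A and files_B exist as separate objects, one per named element of directories.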

If you're amenable to Map-like functions (a more idiomatic way of dealing with lists of similar things), then

out <- Map(function(x, nm) {
  x                              # the *value*
  nm                             # the *name*
   # ...
}, directories, names(directories))

out <- purrr::imap(directories, function(x, nm) {
  x                              # the *value*
  nm                             # the *name*
   # ...
})
# there are other ways to identify the function in `purrr::` functions

Note: while it is quite easy to use match within these last two to get the index, it is a minor scope-breach that I prefer to avoid when reasonable. It works, I just prefer alternative methods. If you want the value, name, and index, then

out <- Map(function(x, nm, ix) {
  x                              # the *value*
  nm                             # the *name*
  ix                             # the *index*
   # ...
}, directories, names(directories), seq_along(directories))
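
As a small, concrete illustration of the three-argument form (with made-up paths), the function below just pastes the index, name, and value together so the result is easy to inspect:

```r
# Hypothetical paths, named so that Map() can carry the names through
directories <- c(dir_A = "path/to/A", dir_B = "path/to/B")

out <- Map(function(x, nm, ix) {
  # combine index, name, and value into one string
  sprintf("%d: %s -> %s", ix, nm, x)
}, directories, names(directories), seq_along(directories))

names(out)        # the result keeps the names of the first argument
out[["dir_B"]]    # "2: dir_B -> path/to/B"
```

Because the result is itself a named list, it can be passed straight on to lapply or list2env.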

2. Transfer list to env

In your question, you're doing this to assign variables within a list into another environment. Some thoughts on that effort:

  1. If they are all similar (the same structure, different data), then don't. Keep them in a list and work on them in toto using lapply or similar. (How do I make a list of data frames?)

  2. If you truly need to move them from a list to the global environment, then perhaps list2env is useful here.

    # create my fake data
    directories <- list(a=1, b=2)
    # this is your renaming step, rename before storing in the global env
    # ... not required unless you have no names or want/need different names
    names(directories) <- paste0("files_", names(directories))
    # here the bulk of the work; you can safely ignore the return value
    list2env(directories, envir = .GlobalEnv)
    # <environment: R_GlobalEnv>
    ls()
    # [1] "directories" "files_a"     "files_b"    
    files_a
    # [1] 1
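
Putting both parts together for the directories in the question, one possible end-to-end sketch is shown below. It assumes the directory variables all start with dir_, uses made-up paths, and again substitutes list.files() for the real read_names(). mget() collects the variables into a named list, so the names do not have to be typed by hand.

```r
# Hypothetical paths standing in for the real directories
dir_A <- "path/to/A"
dir_B <- "path/to/B"

# Stand-in for the question's read_names() (assumption: takes one directory)
read_names <- function(x) list.files(x)

# Collect every variable whose name starts with "dir_" into a named list
directories <- mget(ls(pattern = "^dir_"))

# One result per directory, kept together in a single list
files <- lapply(directories, read_names)
names(files) <- sub("^dir_", "files_", names(files))

names(files)      # "files_A" "files_B"

# Only if separate objects are really needed
list2env(files, envir = .GlobalEnv)
```

Keeping everything in files and indexing it with files[["files_A"]] avoids the last step entirely.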
    
r2evans
  • Thanks for the answer. I am new to coding so I'll need to take a bit of time to really understand everything that your reasoning implies. In the meantime I tried to solve it with the `names(directories)` trick, but the function kept returning `NULL` (also when I tried it with sample dummy objects). But often it is key just to get the job done, so I found a workaround doing `names_directories <- sort(grep("dir_", names(environment()), value = TRUE))` and then using `names_directories[i]` inside `paste()` in the loop. Maybe not the most efficient, but ok for now. Also @G5W helped me realise. – Matteo Nov 18 '19 at 17:27
  • `names(directories)` of `NULL` means it has no names. – r2evans Nov 18 '19 at 17:29
  • Yes, now I understand the point. However, to me this looks like it only moves the problem elsewhere. If I have only two directories, then I can go `directories <- c(dir_A = dir_A, dir_B = dir_B)`. However, if I want to automate the process then I fall back on the same issue. But I guess that this is less about the original question and more about me getting to grips with R and programming. Anyway, I was thinking again about this and realised that in some sense I could just not bother about creating these objects and just keep using `read_names()` when needed, which maybe is your last point. – Matteo Nov 18 '19 at 22:44