I'm trying to work out a way of extracting text files from multiple directories using `fs::dir_ls` and `vroom`.

The directory structure is essentially M:/instrument/project/experiment/measurements/time_stamp/raw_data/files*.txt.

Ideally, I want to be able to define the path to the experiment level then let the pattern take care of the rest, for example -

fs::dir_ls(path="M:/instrument/project/", glob = "experiment_*/2021-04-11*/raw_data/files*.txt", invert = TRUE, recurse = TRUE),

The aim is to read in all the .txt files across multiple experiment directories in one go. However, when I try this approach it returns all the files from the project level rather than only those from the specific folders described by the pattern.

I've looked through the other SO questions on the topic covered here: Pattern matching using a wildcard, R list files with multiple conditions, list.files pattern argument in R, extended regular expression use, and grep using a character vector with multiple patterns, but haven't been able to apply them to my particular problem.

Any help is appreciated. I realise the answer is likely staring me in the face; I just need help seeing it.

Thanks

  • If you want files from the `raw_data` folder, can you try `list.files('M:/instrument/project/experiment/measurements/time_stamp/raw_data/', pattern = 'files.*\\.txt')`? – Ronak Shah May 12 '21 at 01:16
  • Thanks, however, when I try this it returns an empty character vector. – DrBalticYaldie May 13 '21 at 10:54
  • What is the complete path of the files that you want to select? – Ronak Shah May 13 '21 at 10:56
  • The full path to one of the files is `M:/Operetta/LED_Wound/operetta_export/plate_variability[540]/robot_seed_wide_plate_1[1614]/2021-05-10T113438+0100[1764]/SC_data/arpe19_10x_hoescht_h2dcfda_ints_morpho_SC[47280].result.A1[46991].Population - vaid_nuclei[0].txt` – DrBalticYaldie May 13 '21 at 14:06
  • and it's the arpe19[...].txt file that I'm wanting to read. – DrBalticYaldie May 13 '21 at 14:13

1 Answer

You can try the following with `list.files`:

# `pattern` is a regular expression, not a glob: use `.*` (not `*`) for "any characters"
files <- list.files('M:/Operetta/LED_Wound/operetta_export/plate_variability[540]/robot_seed_wide_plate_1[1614]/2021-05-10T113438+0100[1764]/SC_data', pattern = '^arpe19.*\\.txt$')
Ronak Shah
  • Thank you for your reply. This approach works, but it only searches one directory, whereas ideally I want to read files from multiple directories, i.e. `robot_plate_1-3`. That way I don't have to redefine the path after `plate_variability[540]`, including the timestamps, for each batch of files. So I need a glob/regex pattern that captures the preceding directories, something like `~/robot_seed_wide_plate_*/2021-05-*/SC_Data/arpe19*.txt`. – DrBalticYaldie May 17 '21 at 08:59
  • @DrBalticYaldie Add `recursive = TRUE`, i.e. `files <- list.files(..., pattern = ..., recursive = TRUE)` – Ronak Shah May 17 '21 at 10:48
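
To illustrate the `recursive = TRUE` suggestion: the `pattern` argument of `list.files` is only matched against file names, so one possible approach is to list everything below a common parent and then filter the relative paths with `grepl`. The sketch below is not from the thread; it builds a throwaway directory tree under `tempdir()` whose folder and file names (`arpe19_demo[1].txt`, `other_demo[1].txt`, the plate and timestamp folders) are hypothetical stand-ins for the layout described in the question.

```r
# Build a small temporary tree that mimics the layout in the question.
# All directory and file names below are made up for illustration.
root <- file.path(tempdir(), "plate_variability[540]")
stamps <- c("robot_seed_wide_plate_1[1614]/2021-05-10T113438+0100[1764]",
            "robot_seed_wide_plate_2[1615]/2021-05-10T120000+0100[1765]",
            "robot_seed_wide_plate_3[1616]/2021-04-01T090000+0100[1766]")
for (s in stamps) {
  d <- file.path(root, s, "SC_data")
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  writeLines("x", file.path(d, "arpe19_demo[1].txt"))
  writeLines("x", file.path(d, "other_demo[1].txt"))
}

# Step 1: list every .txt file below the root, as paths relative to it.
rel <- list.files(root, pattern = "\\.txt$", recursive = TRUE)

# Step 2: filter on the whole relative path, not just the file name.
# [^/]* plays the role of a glob "*" that stays within one path component.
path_rx <- "^robot_seed_wide_plate_[^/]*/2021-05-[^/]*/SC_data/arpe19[^/]*\\.txt$"
keep <- grepl(path_rx, rel)

# Step 3: turn the surviving relative paths back into full paths.
files <- file.path(root, rel[keep])
print(files)
```

The resulting `files` vector should be usable directly with `vroom::vroom(files)`, which accepts a vector of paths. With `fs::dir_ls`, a similar filter can probably be expressed through its `regexp` argument together with `recurse = TRUE`; note that `invert = TRUE` returns the files that do *not* match, which may explain why the original call returned everything.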