This question is a continuation of a previous question I asked, "Combining data based on user input in R". There it was recommended that I store my data in lists rather than as individual objects in the global environment. I am therefore importing all files of a given type (e.g. .csv) into a list of lists, where each nested list holds the data from one of several time series .csv files. This currently works using the following code:

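library(data.table)  # provides fread()

ImportData <- function(mypattern,...)
{
  # Ask for the file type, e.g. .csv (this overrides any mypattern argument)
  mypattern <- readline(prompt = "Enter File Type:")
  temp <- list.files(".", pattern=mypattern)
  # Read each file, skipping its first line
  myfiles <- lapply(temp, fread, skip = 1)
  names(myfiles) <- gsub("-.*", "", temp)
  header <- c("index","DateTime", "Voltage")
  # Rename the columns and store the list in the global environment
  myfiles <<- lapply(myfiles, setNames, header)
}
ImportData()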

The next step is to combine the time series whose names show that they come from the same site. After running the function above, myfiles contains elements with names such as ASW1.csv, ASW1_10Sept.csv, ASW1_2017, as well as others with different abbreviations (e.g. CSW). The function below should prompt the user for a site name (e.g. ASW1), combine every element of myfiles whose name contains ASW1, sort the result by date, and store it as a new element called ASW1 within myfiles. The user can then retrieve all the data from a particular site.

CombineData <- function()
{
  Site <- invisible(readline(prompt = "Enter Site Name:"))
  myfiles[[Site]] <<- rbindlist(mget(apropos(Site), inherits = TRUE))
}
CombineData()

The issue is that running CombineData adds an empty list to myfiles rather than one containing the data from every element whose name matches the value provided in Site. I can combine the elements manually:

ASW1 <- myfiles[c(1,2,3,4)]
ASW1 <- rbindlist(ASW1)

but my goal is to automate this process as much as possible. In a previous version of my code I used do.call(rbind, mget(ls(pattern = "^ASW1"))) to grab everything containing a particular name, but clearly something is different when working with lists.

halfer
bmdanhof

2 Answers

Does the following solve your problem?

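library(dplyr)
# Pick the elements of myfiles whose names contain the site code.
# grepl() is case-sensitive, so the code must be written as in the names.
reqdNames <- names(myfiles)[grepl("ASW1", names(myfiles), fixed = TRUE)]

# Stack the selected tables row-wise into one data frame
finalDf <- bind_rows(myfiles[reqdNames])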

It is possible I have misunderstood the problem. Let me know if it works.
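
For context on why the original attempt comes back empty: apropos() and ls() look up objects on the search path by name, so they do not see the names of elements stored inside myfiles. mget() then typically receives an empty character vector and rbindlist() has nothing to bind. Selecting on names() of the list avoids this. Below is a minimal base-R sketch of that idea wrapped in a function. It is not the asker's code: combine_site and the toy myfiles are hypothetical stand-ins, do.call(rbind, ...) is used in place of bind_rows() or rbindlist() so that it runs without packages, and it assumes DateTime is stored in a sortable form such as POSIXct.

```r
# Hypothetical helper: combine every table in `files` whose name contains
# `site`, then sort the result by its DateTime column.
combine_site <- function(files, site) {
  hits <- grepl(site, names(files), fixed = TRUE)  # literal, case-sensitive
  if (!any(hits)) stop("no list element has a name containing '", site, "'")
  combined <- do.call(rbind, unname(files[hits]))
  combined <- combined[order(combined$DateTime), ]
  rownames(combined) <- NULL
  combined
}

# Toy stand-in for the imported list (assumed column layout from the question)
stamp <- function(x) as.POSIXct(x, tz = "UTC")
myfiles <- list(
  ASW1.csv        = data.frame(index = 1:2,
                               DateTime = stamp(c("2017-09-10", "2017-09-12")),
                               Voltage = c(1.1, 1.3)),
  ASW1_10Sept.csv = data.frame(index = 1L,
                               DateTime = stamp("2017-09-11"),
                               Voltage = 1.2),
  CSW1.csv        = data.frame(index = 1L,
                               DateTime = stamp("2017-09-10"),
                               Voltage = 9.9)
)

# In an interactive session: site <- readline(prompt = "Enter Site Name:")
site <- "ASW1"
myfiles[[site]] <- combine_site(myfiles, site)
```

Note that once the combined table is stored under the name ASW1 it will itself match the pattern, so running the function a second time on the same list would duplicate rows. Keeping the combined tables in a separate list is one way around that.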

I think I understand what you are trying to do. Try to work with functions that take the data as an argument and return the result, instead of functions that manipulate data in the global environment.

Here is an example:

# Write nfiles small test files named test_<abbr><i>.csv
MakeData <- function(abbr = 'ASW', nfiles = 10){
  for(i in seq_len(nfiles)){
    lst <- list(a = 1:10,
                b = rnorm(10) > 0,
                c = sample(LETTERS, 10))
    write.csv(lst, paste0("test_", abbr, i, ".csv"))
  }
}

ImportData <- function(mypattern)
{
  temp <- list.files(".", pattern=mypattern)

  myfiles <- lapply(temp, read.csv) 
  names(myfiles) <- gsub("-.*", "", temp)
  myfiles
}

CombineData <- function(datalst,get = 'a')
{
  lst <- lapply(datalst, function(x) x[[get]])
  do.call(rbind, lst)
}

set.seed(314)
MakeData("ASW")
impdat <- ImportData("ASW")
CombineData(impdat,'b')

Which returns:

                [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
test_ASW1.csv  FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE FALSE
test_ASW10.csv FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE FALSE
test_ASW2.csv   TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE
test_ASW3.csv  FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE
test_ASW4.csv  FALSE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
test_ASW5.csv   TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE
test_ASW6.csv   TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE
test_ASW7.csv   TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
test_ASW8.csv  FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
test_ASW9.csv  FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE

Alternatively, use the tidyverse to load and process the data in a pipeline, which is what I prefer:

library(tidyverse)
data.frame(filename = list.files(".", pattern = "ASW"), stringsAsFactors = FALSE) %>%
  mutate(lst = map(filename, ~read.csv(.x)),
         dat = map(lst, ~as.data.frame(t(.x[['b']])))) %>%
  select(filename, dat) %>%
  # Name the list-column explicitly; a bare unnest() is deprecated in recent tidyr
  unnest(dat)
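
If the aim is to automate the grouping further, one option is to derive the site from each name instead of prompting for it. The sketch below is base R and written in the same style of passing the data in and returning the result. It rests on an assumption that may not hold for every file: that the site code is everything before the first underscore or dot in the name (so ASW1.csv, ASW1_10Sept.csv and ASW1_2017 all map to ASW1). The function name combine_all_sites and the toy list are hypothetical.

```r
# Hypothetical helper: group the tables by a site code derived from their
# names and return one combined, date-sorted table per site.
# Assumption: the site code is the text before the first "_" or ".".
combine_all_sites <- function(files) {
  site <- sub("[_.].*$", "", names(files))
  index_by_site <- split(seq_along(files), site)
  lapply(index_by_site, function(i) {
    combined <- do.call(rbind, unname(files[i]))
    combined <- combined[order(combined$DateTime), ]
    rownames(combined) <- NULL
    combined
  })
}

# Toy stand-in for an imported list
stamp <- function(x) as.POSIXct(x, tz = "UTC")
imported <- list(
  ASW1.csv        = data.frame(DateTime = stamp("2017-09-12"), Voltage = 1.3),
  ASW1_10Sept.csv = data.frame(DateTime = stamp("2017-09-10"), Voltage = 1.1),
  ASW1_2017       = data.frame(DateTime = stamp("2017-09-11"), Voltage = 1.2),
  CSW.csv         = data.frame(DateTime = stamp("2017-09-10"), Voltage = 9.9)
)

by_site <- combine_all_sites(imported)
names(by_site)   # one element per derived site code
by_site[["ASW1"]]
```

Returning a separate list keeps the raw imports untouched, so the function can be rerun without the combined tables being picked up as inputs.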
Wietze314