I'm struggling with the following issue: I have many data frames with different names (for instance Beverages, Construction, Electronic, etc., each about 540x1000). I need to clean each of them, run some calculations, and save the result both as a zoo object and as an R data file. The cleaning is the same for all of them: deleting the empty columns and the columns with certain specific names.

For example:

Beverages <- Beverages[,colSums(is.na(Beverages))<nrow(Beverages)] #removing empty columns
Beverages_OK <- Beverages %>% select (-starts_with ("X.ERROR")) # dropping X.ERROR column
Beverages_OK[, 1] <- NULL #dropping the first column
Beverages_OK <- cbind(data[1], Beverages_OK) # adding a date column
Beverages_zoo <- read.zoo(Beverages_OK, header = FALSE, format = "%Y-%m-%d")
save (Beverages_OK, file = "StatisticsInRFormat/Beverages.RData")

I tried to use the `lapply` function like this:

list <- ls() # the list of all the dataframes
lapply(list, function(X) {
temp <- X
temp <- temp [,colSums(is.na(temp))< nrow(temp)] #removing empty columns
temp <- temp %>% select (-starts_with ("X.ERROR")) # dropping X.ERROR column
temp[, 1] <- NULL
temp <- cbind(data[1], temp)
X_zoo <- read.zoo(X, header = FALSE, format = "%Y-%m-%d") # I don't know how to give it the same name as X has.
save (X, file = "StatisticsInRFormat/X.RData")
})

but it doesn't work. Is there any way to do such a job? Is there an R package that facilitates it?

Thanks a lot.

Imran Ali
Rom

1 Answer

If you are sure that you have only the needed data frames in the environment, this should get you started:

df1 <- mtcars
df2 <- mtcars
df3 <- mtcars
list <- ls()
lapply(list, function(x) {
    tmp <- get(x)  # look up the data frame by its name
    # cleaning steps on tmp go here
})
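
Building on the `get()` idea, a minimal sketch of the whole per-data-frame pipeline might look like the following. It is not the answerer's code: the toy data frames, the `dates` object (standing in for the asker's `data[1]`), the helper `clean_one`, and the temporary output folder are all assumptions made for illustration. It uses `mget()` with an explicit vector of names rather than `ls()`, so unrelated objects in the workspace are not picked up, and base R subsetting in place of `dplyr::select()`.

```r
library(zoo)  # provides read.zoo()

# Toy stand-ins for the real data (hypothetical values).
dates <- data.frame(Date = as.character(as.Date("2016-01-01") + 0:2),
                    stringsAsFactors = FALSE)
make_raw <- function() {
  data.frame(id = 1:3, a = c(1, 2, 3), empty = NA,
             X.ERROR = c("e", "e", "e"), b = c(4, 5, 6))
}
Beverages <- make_raw()
Construction <- make_raw()

# Hypothetical output folder; a temporary directory keeps the sketch harmless.
out_dir <- file.path(tempdir(), "StatisticsInRFormat")
dir.create(out_dir, showWarnings = FALSE)

clean_one <- function(name, df) {
  df <- df[, colSums(is.na(df)) < nrow(df), drop = FALSE]      # drop all-NA columns
  df <- df[, !startsWith(names(df), "X.ERROR"), drop = FALSE]  # drop X.ERROR columns
  df <- df[, -1, drop = FALSE]                                 # drop the first column
  df <- cbind(dates, df)                                       # prepend the date column

  # Save the cleaned data frame under its original name, in a file named after it.
  e <- new.env()
  assign(name, df, envir = e)
  save(list = name, envir = e, file = file.path(out_dir, paste0(name, ".RData")))

  read.zoo(df, format = "%Y-%m-%d")                            # return the zoo object
}

df_names <- c("Beverages", "Construction")
raw_list <- mget(df_names)                        # named list of data frames
zoo_list <- Map(clean_one, names(raw_list), raw_list)
```

The result `zoo_list` is a named list, so each series is available as `zoo_list$Beverages`, `zoo_list[["Construction"]]`, and so on, which avoids having to create one `*_zoo` variable per data frame.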
Valter Beaković
  • You can just do `lapply(list, get)`, the anonymous function and `tmp` don't do anything special. Though it might be good to illustrate assigning the resulting list to an object. – Gregor Thomas Nov 21 '16 at 23:08
  • Even better as you write... My idea was that there should be some code to follow in the final solution... – Valter Beaković Nov 21 '16 at 23:22
  • Thank you for the messages. Yes, with `get` it works. It looks like it gets the data frames, but it wants to be in the folder with the initial files. Why is that? – Rom Nov 22 '16 at 09:54
  • If you remove lines one by one from the bottom, at which line does it fail? Please put the code inside `` when commenting; it will be much more readable. – Valter Beaković Nov 22 '16 at 10:00
  • At the `get(X)` line: ``lapply(list, function(X) { temp <- get(X); temp <- temp[, colSums(is.na(temp)) < nrow(temp)]; temp <- temp %>% select(-starts_with("X.ERROR")); temp[, 1] <- NULL; temp <- cbind(data[1], temp); X_zoo <- read.zoo(temp, header = FALSE, format = "%Y-%m-%d"); save(temp, file = "X.RData"); return(X_zoo) })`` Is it possible, instead of using the `temp` data frame as a proxy, to use the original data frames with the names listed in `list`? – Rom Nov 22 '16 at 10:19
  • Any chance you can provide me with some sample data? I could have time this evening (my TZ is UTC+1) to take a look... – Valter Beaković Nov 22 '16 at 10:45
  • OK, thank you. So, I have 28 CSV files. I import them into R like this: ``filenames <- list.files(path ="D:/Documents/R/Trump/StatisticsReady/", pattern = "\\.csv$"); numfiles <- length(filenames); for (i in c(1:numfiles)){ assign(gsub("[.]csv$","",filenames[i]),read.csv(filenames[i], header=TRUE)) }`` So the names of the CSV files and the data frames are the same, like PersonalGoods, Beverages, Travel, etc. Since they are raw, before calculation I need to clean them, i.e., delete empty columns, convert to zoo objects, etc. That's actually the code posted in my question above. – Rom Nov 22 '16 at 13:44
  • It would be good to overwrite the existing data frames with the new ones returned by the ``lapply`` function, for further manipulation. – Rom Nov 22 '16 at 13:47
  • I have a few questions; maybe we could switch to chat? – Valter Beaković Nov 22 '16 at 16:57
  • Yes, but could you do it, please, since my "reputation" is 12 and I cannot do it. – Rom Nov 22 '16 at 17:16
  • It doesn't allow me either... What is the `data` variable? `read.zoo` reads files, and you are passing it a data frame instead of a file name? – Valter Beaković Nov 22 '16 at 17:18
  • The `data` variable includes one column with the dates in the correct format, for easier conversion to zoo. So the final data frame is, for example, 540x800: the first column is the date, and the remaining columns are the data. – Rom Nov 22 '16 at 17:33
  • Right, the files are used only at the very beginning, while reading them into R. After that, only data frames are used. – Rom Nov 22 '16 at 17:52
  • I am referring to `read.zoo(temp, header = FALSE, format = "%Y-%m-%d")` inside the function. I think you should probably use as.zoo() to convert the data frame to a zoo object. Note the cbind with data[1] will produce a column with all dates the same... – Valter Beaković Nov 22 '16 at 18:00
  • Actually, it works without as.zoo(); done manually, data frame by data frame, it works. I'm struggling with the following: how to read and overwrite data frames with the same names in the `lapply` function... – Rom Nov 22 '16 at 18:33
  • Do you have a Gmail account? We could switch to Hangouts. – Valter Beaković Nov 22 '16 at 18:38
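
The comment thread leaves two points open: the import only works from the folder holding the CSV files, and the cleaned data frames should replace the originals under the same names. A possible sketch for both follows, under stated assumptions: the folder `csv_dir` and the two tiny CSV files are hypothetical stand-ins created just for the example, and only the "drop empty columns" step is shown. `list.files()` returns bare file names unless `full.names = TRUE` is given, so `read.csv(filenames[i])` resolves them against the working directory; reading into a named list also avoids the `assign()` loop.

```r
# Hypothetical input folder with two small CSV files, created for the example.
csv_dir <- file.path(tempdir(), "StatisticsReady")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2, empty = NA),
          file.path(csv_dir, "Beverages.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4, empty = NA),
          file.path(csv_dir, "Travel.csv"), row.names = FALSE)

# full.names = TRUE returns complete paths, so the working directory is irrelevant.
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read everything into one list, named after the files (without the extension).
raw <- lapply(files, read.csv, header = TRUE)
names(raw) <- tools::file_path_sans_ext(basename(files))

# Apply the same cleaning step to every element; names are preserved by lapply().
cleaned <- lapply(raw, function(df) {
  df[, colSums(is.na(df)) < nrow(df), drop = FALSE]  # drop all-NA columns
})

# Optional: create (or overwrite) one global variable per list element.
list2env(cleaned, envir = globalenv())
```

Keeping the data in the `cleaned` list is usually enough for further `lapply()` steps; `list2env()` is only needed when separately named objects such as `Beverages` are really required.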