I have a list of 28 data frames. Some, but not all, of them contain empty rows. How can I use `lapply` or a similar function to remove the empty rows from every data frame in the list?

Here is what I have tried, modified from this question. Unfortunately, it returned only the rows that were empty.

#Get list of all files that will be analyzed
filenames = list.files(pattern = ".csv")

#read in all files in filenames
mydata_run1 = lapply(filenames, read.csv, header = TRUE, quote = "")

#Remove empty rows
mydata_run1 = lapply(mydata_run1, function(x) sapply(mydata_run1, nrow)>0)

Thank you.

aminards
  • You need to make [a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). – alistaire Mar 02 '17 at 18:08
  • If no other data is missing, you could use `complete.cases()`. – K.Daisey Mar 02 '17 at 18:14
  • This will work when `NA`s occur only in the empty rows: `mydata_run1 = lapply(mydata_run1, function(x) na.omit(x))` – Edward Carney Mar 02 '17 at 18:20
  • @EdwardCarney, this approach removes all rows with even one NA value. – user5249203 Mar 02 '17 at 19:41
  • Correct. My comment was intended to indicate that it won't work the way @aminards wants it to when NAs occur in rows other than just the "empty" ones. Perhaps I should have been clearer. – Edward Carney Mar 02 '17 at 19:46
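
To illustrate the distinction raised in the comments above, here is a minimal sketch on a made-up data frame (`df` is hypothetical, not the asker's data). It assumes empty cells are stored as `NA`: `na.omit()` drops every row containing at least one `NA`, whereas counting `NA`s per row drops only the rows that are entirely `NA`.

```r
# Hypothetical example data: row 2 is entirely NA, row 3 is only partly NA
df <- data.frame(A = c(1, NA, 3), B = c(4, NA, NA))

# na.omit() removes any row with at least one NA (rows 2 and 3 here)
any_na_removed <- na.omit(df)

# Keep a row unless every column in it is NA (only row 2 is removed)
all_na_removed <- df[rowSums(is.na(df)) < ncol(df), , drop = FALSE]

nrow(any_na_removed)  # 1
nrow(all_na_removed)  # 2
```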

1 Answer

I assume you want to remove only the rows that are empty (`NA`) in every column. If so:

# Remove a row only if all of its columns are NA.
# drop = FALSE keeps the result a data frame even when there is a single column.
lapply(data, function(x) x[rowSums(is.na(x)) != ncol(x), , drop = FALSE])

output


$df1
  A B
1 1 4
3 3 6

$df2
  A  B
1 1 NA
3 3  6

data


data <- list(
            df1 = data.frame(A = c(1,NA,3), B = c(4, NA, 6)),
            df2 = data.frame(A = c(1,NA,3), B = c(NA, NA, 6)))
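
One caveat, offered as a sketch rather than a definitive fix: the approach above only detects `NA`. When files are read with `read.csv`, blank fields in character columns typically arrive as empty strings (`""`) rather than `NA`, so an "empty" row may not be all `NA`. The helper `is_empty_row` and the list `data2` below are hypothetical names introduced only for illustration. The helper treats a cell as empty if it is `NA` or `""`.

```r
# Hypothetical example: row 2 of df1 is blank, with "" in the text column
data2 <- list(
  df1 = data.frame(id = c(1, NA, 3), name = c("a", "", "c")),
  df2 = data.frame(id = c(1, 2), name = c("x", ""))
)

# Hypothetical helper: TRUE for rows in which every cell is NA or ""
is_empty_row <- function(x) {
  # is.na() catches NA cells; x == "" catches blank strings.
  # For NA cells x == "" is NA, but TRUE | NA is still TRUE.
  empty_cells <- is.na(x) | x == ""
  rowSums(empty_cells) == ncol(x)
}

# Drop the fully empty rows from every data frame in the list
cleaned <- lapply(data2, function(x) x[!is_empty_row(x), , drop = FALSE])
```

Here only row 2 of `df1` is removed; row 2 of `df2` is kept because its `id` is present. Applied to the list from the question, the same call would be `mydata_run1 <- lapply(mydata_run1, function(x) x[!is_empty_row(x), , drop = FALSE])`.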
user5249203