I have a list of 18 data frames that I read in using read.xlsx. Every data frame has the same number of columns, but some columns contain NA in some rows. In addition, some rows in the Abundance column contain non-numeric data. I suspect I need to remove these rows from each data frame, but I have not found a way to do so.
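A minimal sketch of one way the row removal could be done, assuming "non-numeric" means values that as.numeric cannot parse. The small list below is made up to stand in for the real mydata, and keep.numeric is a hypothetical helper name. If Abundance has already been read in as numeric, the conversion changes nothing and only the NA rows are dropped.

```r
# Made-up stand-in for the real list produced by read.xlsx (assumption)
mydata <- list(
  data.frame(Locus = c("A", "A", "CS"),
             Abundance = c("1479", "n/a", "180"),
             stringsAsFactors = FALSE),
  data.frame(Locus = c("CS", NA),
             Abundance = c("144", "12"),
             stringsAsFactors = FALSE)
)

# Hypothetical helper: convert Abundance to numeric and drop rows
# where the conversion produced NA
keep.numeric <- function(df) {
  # as.character() first, so factor columns are converted by label
  abundance <- suppressWarnings(as.numeric(as.character(df$Abundance)))
  df$Abundance <- abundance
  df[!is.na(abundance), , drop = FALSE]
}

cleaned <- lapply(mydata, keep.numeric)
```

Rows with NA in other columns, such as Locus, are kept; only Abundance decides which rows are dropped.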
Each data frame in the list has a structure like this (output of str):
$ :'data.frame': 118 obs. of 10 variables:
..$ Locus : Factor w/ 24 levels "A","CS",..: 14 14 14 14 22 22 NA 22 10 10 ...
..$ Target : Factor w/ 96 levels "[AAAGA]14","[AAAGA]15",..: 88 91 90 87 11 12 NA 9 65 67 ...
..$ Length : num [1:118] 60 76 72 56 24 39 NA 20 139 141 ...
..$ Abundance : num [1:118] 1479 1108 180 144 1786 ...
..$ Size : num [1:118] 15 19 18 14 6 9.3 NA 5 32 32.2 ...
..$ Call : Factor w/ 4 levels "Al","HAs",..: 1 1 3 3 1 1 NA 3 1 1 ...
..$ RAR : num [1:118] NA 74.92 12.17 9.74 NA ...
..$ Position : num [1:118] NA NA NA NA NA NA NA NA NA NA ...
..$ Al.1.s.percent: num [1:118] NA NA 12.17 9.74 NA ...
..$ Al.2.s.percent: num [1:118] NA NA 16.2 13 NA ...
I want to apply this function to each data frame in my list of data frames.
add.sum = function(df){
transform(df, Tot.count = ave(df[[Abundunce]], df[[Locus]], FUN = sum))
}
I tried calling it with lapply:
transformed.data = lapply(mydata, add.sum)
I also tried it this way:
transformed.data = lapply(mydata, function (x) add.sum(x))
But both attempts give me the following error:
Error in .subset2(x, i, exact = exact) : no such index at level 1
Any suggestions on how to get this working correctly?
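One possible direction, offered as a sketch and not a confirmed fix: inside [[ ]], bare names such as Locus are looked up as variables, not as column names, so passing the column names as quoted strings (and spelling "Abundance" as it appears in the data) may be what is missing. The example uses a small made-up list in place of the real mydata, and the na.rm = TRUE handling is an assumption about how NA abundances should be treated.

```r
# Made-up stand-in for the real list of data frames (assumption)
mydata <- list(
  data.frame(Locus = c("A", "A", "CS"), Abundance = c(10, 5, 7)),
  data.frame(Locus = c("CS", "CS", "A"), Abundance = c(1, 2, 3))
)

add.sum <- function(df) {
  # Column names are passed to [[ ]] as quoted strings
  df$Tot.count <- ave(df[["Abundance"]], df[["Locus"]],
                      FUN = function(x) sum(x, na.rm = TRUE))
  df
}

# Apply the function to every data frame in the list
transformed.data <- lapply(mydata, add.sum)
```

Each data frame in transformed.data then has a Tot.count column holding the per-Locus sum of Abundance, repeated on every row of that Locus.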