Use rbind() in a nested for loop with apply() in R

Question

How can I use rbind() in a for loop that runs over a list of data frames? I tried to follow "Looping through list of data frames in R" but received the following error:

Error in apply(dataFramesList, 2, function(x) { : dim(X) must have a positive length

I have two data frames, dfTraining and dfAccuracy (code to reproduce them is below), and need to add a row to each one for every crop type missing from either of two columns, CROP or CROP_LABEL. I believe my problem is in the last line of my code.

My code block is:

dataFramesList <- list(dfTraining, dfAccuracy)
apply(dataFramesList, 2, function(x){
  cropNumbers <- seq(1,23, by = 1)
  cropNumbers <- cropNumbers[-c(3)]
  cropNumbers <- append(cropNumbers, 34)

  listofCROPandCROP_LABELColumns <- list(dataFrameList$CROP, dataFrameList$CROP_LABEL)

  missingCROP <- NULL
  for (i in listofCROPandCROP_LABELColumns){
    for (j in cropNumbers){
      if (!j %in% i){
        # If crop number is missing from CROP_LABEL, add missingCROP observation (row)
        # Make row for missing crop type
        missingCrop <- list(FREQUENCY = 0, AA = 1, CROP = j, CROP_LABEL = j, ACRES = 0)
        dataFrameList <- rbind(dataFrameList, missingCrop)
      } 
    }
  }  
})

My dfAccuracy dataframe:

structure(list(FREQUENCY = c(4L, 2L, 1L, 1L, 1L, 1L, 65L, 1L, 
1L, 4L, 1L, 5L, 5L, 2L, 4L, 1L, 1L, 1L, 1L, 4L, 9L, 2L, 1L, 1L, 
1L, 2L, 4L, 1L, 2L, 18L, 1L, 10L, 3L, 1L, 7L, 1L, 1L, 1L, 3L, 
1L, 7L, 1L), AA = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), 
    CROP = c(1L, 4L, 12L, 13L, 14L, 18L, 1L, 1L, 1L, 1L, 1L, 
    4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 13L, 
    13L, 13L, 13L, 14L, 14L, 14L, 18L, 18L, 18L, 18L, 18L, 19L, 
    19L, 21L, 21L, 21L, 21L), CROP_LABEL = c(1L, 4L, 14L, 13L, 
    12L, 18L, 1L, 4L, 5L, 6L, 18L, 1L, 4L, 6L, 14L, 18L, 12L, 
    14L, 18L, 1L, 6L, 14L, 18L, 18L, 4L, 6L, 13L, 21L, 12L, 14L, 
    18L, 1L, 6L, 14L, 18L, 21L, 1L, 19L, 6L, 13L, 21L, 34L), 
    ACRES = c(331.737184484, 193.772138572, 26.48543619, 73.2696289437, 
    112.470306056, 66.6556450342, 3905.71121736, 24.9581079934, 
    39.9287379709, 259.662359273, 85.2786247851, 306.051491303, 
    368.342995232, 154.82030835, 265.754349805, 70.3722566979, 
    35.4066607701, 139.336463432, 58.4307705147, 251.070357093, 
    471.031628349, 150.965736858, 28.2780117926, 35.3426930108, 
    34.5730542194, 67.7383953308, 144.442123948, 33.2746560126, 
    69.4072817311, 1219.65459596, 92.4840910734, 582.983473317, 
    191.957841327, 35.708775262, 319.638682538, 60.6889287642, 
    82.6244195055, 36.2898952104, 267.422844756, 72.8352758659, 
    489.746546145, 65.5392893502)), row.names = c(25L, 26L, 27L, 
29L, 30L, 31L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 
70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 
83L, 84L, 85L, 86L, 87L, 88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L
), class = "data.frame")

and my dfTraining dataframe is:

structure(list(FREQUENCY = c(7L, 1L, 1L, 4L, 2L, 6L, 1L, 107L, 
1L, 21L, 1L, 1L, 1L, 2L, 1L, 19L, 3L, 1L, 1L, 12L, 1L, 2L, 32L, 
2L, 2L, 29L, 2L, 18L, 1L), AA = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), CROP = c(1L, 1L, 4L, 4L, 12L, 13L, 21L, 
1L, 1L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 7L, 12L, 13L, 14L, 14L, 
14L, 18L, 18L, 18L, 19L, 21L, 34L), CROP_LABEL = c(1L, 4L, 1L, 
4L, 12L, 13L, 21L, 1L, 6L, 4L, 6L, 1L, 5L, 14L, 18L, 6L, 14L, 
1L, 12L, 13L, 1L, 6L, 14L, 6L, 14L, 18L, 19L, 21L, 34L), ACRES = c(624.940370218, 
26.9188766351, 37.8773839813, 291.79294767, 140.949264214, 391.571023675, 
44.5217011939, 6806.02216989, 72.7500299887, 1676.12121152, 14.8739557721, 
67.0700291739, 59.7438207953, 82.6713019474, 75.62666152, 1370.78710769, 
145.215281276, 41.7380537313, 66.5236760194, 679.91208779, 70.9661875374, 
38.8514254734, 1749.63365551, 109.917242057, 79.7758083723, 1660.85759895, 
96.8771921798, 1428.71888481, 69.473161379)), row.names = c(18L, 
19L, 20L, 21L, 22L, 23L, 24L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 
45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 
58L, 59L), class = "data.frame")

Possible duplicate of [What's the difference between lapply and do.call?](https://stackoverflow.com/questions/10801750/whats-the-difference-between-lapply-and-do-call) — CPak, Aug 13 '18 at 20:00
You could simplify it and use `lapply` with `data.table::rbindlist` — Gautam, Aug 13 '18 at 20:27
Iracambi, the simplest approach is suggested by CPak: `do.call(rbind, lst_of_frames)` will do what you want. I recommend against `Reduce`, as that is an iterative call that will perform poorly with a large list. `apply` is not appropriate for `list`s, just `matrix`/`array`. (It can work on `data.frame`, but it converts to a `matrix` automatically, so really it just works on a `matrix` or `array`.) — r2evans, Aug 13 '18 at 21:08
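
Following the comments above, one possible fix is to swap `apply()` for `lapply()`, which iterates over list elements and does not need a `dim` attribute, and to return the modified data frame from the function. The sketch below is not from the original post: `addMissingCrops` is a hypothetical helper name, the two small data frames are toy stand-ins for the real dfTraining and dfAccuracy, and `AA = 1` is hard-coded as in the question. Unlike the nested loop in the question, it adds a single row when a crop is missing from both columns.

```r
# Crop codes expected in every frame: 1 to 23 without 3, plus 34 (as in the question).
cropNumbers <- c(setdiff(1:23, 3L), 34L)

# Hypothetical helper (not from the original post): append one placeholder row
# for each crop code that is absent from CROP or from CROP_LABEL.
addMissingCrops <- function(df, crops = cropNumbers) {
  missingCrops <- crops[!(crops %in% df$CROP) | !(crops %in% df$CROP_LABEL)]
  if (length(missingCrops) == 0) {
    return(df)
  }
  filler <- data.frame(
    FREQUENCY  = 0L,
    AA         = 1L,  # hard-coded as in the question; adjust per frame if needed
    CROP       = missingCrops,
    CROP_LABEL = missingCrops,
    ACRES      = 0
  )
  rbind(df, filler)
}

# Toy stand-ins for the real data frames (assumed values, for illustration only).
dfTraining <- data.frame(FREQUENCY = c(7L, 1L), AA = 1L,
                         CROP = c(1L, 4L), CROP_LABEL = c(1L, 34L),
                         ACRES = c(624.9, 26.9))
dfAccuracy <- data.frame(FREQUENCY = 4L, AA = 2L,
                         CROP = 1L, CROP_LABEL = 1L, ACRES = 331.7)

dataFramesList <- list(dfTraining = dfTraining, dfAccuracy = dfAccuracy)

# A plain list has no dim attribute, which is why apply() raised the error.
is.null(dim(dataFramesList))

# lapply() calls the helper once per data frame and returns a list of results.
completed <- lapply(dataFramesList, addMissingCrops)

# Optional: stack the completed frames into one, as suggested in the comments.
combined <- do.call(rbind, completed)
```

Drop the final `do.call(rbind, ...)` line if the two frames should stay separate; `completed$dfTraining` and `completed$dfAccuracy` hold the individual results.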
