
Here is an example data frame.

    x3 <- read.table(text = "  id1 id2 val1 val2
1   a   x    1    9
2   a   x    2    4
3   a   y    3    NA
4   a   y    4    NA
5   b   x    1    NA
6   b   y    4    NA
7   b   x    3    9
8   b   y    2    8", header = TRUE)

aggregate(. ~ id1+id2, data = x3, FUN = mean) returns:

  id1 id2 val1 val2
1   a   x  1.5  6.5
2   b   x  3.0  9.0
3   b   y  2.0  8.0

aggregate(x3[,3:4], by = list(x3$id1, x3$id2), FUN = mean, na.rm = TRUE) returns:

  Group.1 Group.2 val1 val2
1       a       x  1.5  6.5
2       b       x  2.0  9.0
3       a       y  3.5  NaN
4       b       y  3.0  8.0

The two aggregate syntaxes do not return the same number of rows. What is the reason?

Henrik
pinawa

1 Answer


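The formula method of aggregate defaults to na.action = na.omit, so it drops every row that contains an NA before aggregating. The list method keeps all rows and only passes na.rm = TRUE on to mean, column by column, which is why it returns the extra a/y group. To get the three-row result from the list method, which is probably what you are attempting, use with and complete.cases to exclude rows with missing values beforehand.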

with(x3[complete.cases(x3), ], aggregate(cbind(val1, val2), by=list(id1, id2), FUN=mean))
#   Group.1 Group.2 val1 val2
# 1       a       x  1.5  6.5
# 2       b       x  3.0  9.0
# 3       b       y  2.0  8.0
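
If the goal were instead to keep all four groups, the two calls can also be made to agree from the other direction. This is a minimal sketch, not part of the original answer: it assumes the difference comes from the formula method's default na.action = na.omit, and overrides it with na.action = na.pass so that na.rm = TRUE is handled by mean for each column. The names res_omit and res_pass are only illustrative.

```r
# Same data as in the question, without the row-name column.
x3 <- read.table(text = "id1 id2 val1 val2
a x 1 9
a x 2 4
a y 3 NA
a y 4 NA
b x 1 NA
b y 4 NA
b x 3 9
b y 2 8", header = TRUE)

# Default formula method: na.action = na.omit drops rows with any NA first.
res_omit <- aggregate(. ~ id1 + id2, data = x3, FUN = mean)

# Keep all rows (na.pass) and let mean() skip NAs per column (na.rm = TRUE).
res_pass <- aggregate(. ~ id1 + id2, data = x3, FUN = mean,
                      na.action = na.pass, na.rm = TRUE)

nrow(res_omit)  # 3 groups survive
nrow(res_pass)  # 4 groups; val2 for a/y is NaN because all its values are NA
```

Which variant to prefer depends on whether a row with a missing val2 should still contribute its val1 to the group mean.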
jay.sf