0
a=c(1,4,5,7,3,3,NA)
b=c(2,7,3,NA,2,5,2)
d=c(2,7,5,NA,9,0,1)
dat=data.frame(a,b,d)

How can I pick the maximum of each column, ignoring the NA values? The result should be a vector containing 7, 7 and 9. What code produces this result?

3 Answers

2

Use sapply to iterate over columns.

sapply(dat, max, na.rm = TRUE)
# a b d
# 7 7 9

Alternatively, lapply returns a list. sapply can be a little unpredictable, because it calls lapply and then tries to simplify the result. This means that if the input type changes (e.g. one of the columns is a list column), you may get a different kind of output (e.g. a list rather than a vector).
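To make the difference concrete, here is a minimal sketch. The data frame `dat_mixed` and its character column `g` are made up for illustration and are not part of the question. It shows lapply returning a list, and sapply quietly changing its output type once a non-numeric column is present.

```r
dat <- data.frame(
  a = c(1, 4, 5, 7, 3, 3, NA),
  b = c(2, 7, 3, NA, 2, 5, 2),
  d = c(2, 7, 5, NA, 9, 0, 1)
)

# lapply always returns a list with one element per column
res_list <- lapply(dat, max, na.rm = TRUE)

# sapply simplifies the same result to a named numeric vector
res_vec <- sapply(dat, max, na.rm = TRUE)

# Hypothetical input change: add a character column (illustration only)
dat_mixed <- data.frame(
  a = c(1, 7, NA),
  g = c("x", "y", NA),
  stringsAsFactors = FALSE
)

# No error, but the simplified result is now a character vector
res_mixed <- sapply(dat_mixed, max, na.rm = TRUE)
class(res_mixed)
```

The call still succeeds, so downstream code that expects numbers would only fail later.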

If you want to ensure that the output is a vector, and to throw an error if it is not, you can use vapply:

vapply(
    X = dat, 
    FUN = max, 
    FUN.VALUE = numeric(1), 
    na.rm = TRUE
)
# a b d
# 7 7 9
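As a rough illustration of the error behaviour, the sketch below runs vapply on a made-up data frame (`dat_mixed`, with a hypothetical character column `g`) and captures the error message with tryCatch instead of letting it stop the script.

```r
# Hypothetical data frame with one non-numeric column (illustration only)
dat_mixed <- data.frame(
  a = c(1, 7, NA),
  g = c("x", "y", NA),
  stringsAsFactors = FALSE
)

# vapply checks every result against FUN.VALUE; the character result
# from column g does not match numeric(1), so an error is raised
msg <- tryCatch(
  vapply(dat_mixed, max, FUN.VALUE = numeric(1), na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Failing loudly at this point is the design choice: the type mismatch is reported where it happens rather than somewhere further down the pipeline.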
SamR
2

Using apply:

apply(dat, 2, max, na.rm = 1)
a b d 
7 7 9 
Karthik S
  • The documentation for [`max`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Extremes.html) indicates that `na.rm` is a `logical`. Any reason you have it as `1` rather than `TRUE`? I find it confusing. – SamR Sep 21 '22 at 07:48
  • @SamR, in R, `TRUE` and `FALSE` get parsed to 1 and 0 respectively, so we can use them interchangeably – Karthik S Sep 21 '22 at 07:52
  • Sure you can... you could also write `na.rm = 2.7`, or `na.rm = is.finite(4)`. But why would you? – SamR Sep 21 '22 at 07:56
  • @SamR, faster to type 1 than TRUE isn't it? – Karthik S Sep 21 '22 at 08:00
  • Sure it is, but you can use `T` if that's the issue. In general, though, I think it is better to optimise your code for clarity rather than for the fewest keystrokes. – SamR Sep 21 '22 at 08:18
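
For anyone following the exchange above, here is a small sketch of the behaviours being discussed. The variable names are made up for illustration. It shows that a numeric `na.rm` gives the same result as `TRUE`, and that `T` is an ordinary variable that can be reassigned, which is one reason spelling out `TRUE` is the safer habit.

```r
x <- c(1, 4, 5, 7, 3, 3, NA)

# A numeric na.rm is coerced to logical, so 1 behaves like TRUE here
r_one  <- max(x, na.rm = 1)
r_true <- max(x, na.rm = TRUE)

# T is not a reserved word; inside this local scope it is rebound to FALSE
r_t <- local({
  T <- FALSE
  max(x, na.rm = T)  # na.rm is now FALSE, so the NA propagates
})

identical(r_one, r_true)
is.na(r_t)
```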
1
a=c(1,4,5,7,3,3,NA)
b=c(2,7,3,NA,2,5,2)
d=c(2,7,5,NA,9,0,1)
dat=data.frame(a,b,d)
sapply(dat, max, na.rm=TRUE)
a b d 
7 7 9 
zimia