
My code:

library(dplyr)
df <- tibble(
    a = rnorm(10),
    b = rnorm(10),
    c = rnorm(10),
    d = rnorm(10)
)

output <- vector("double", ncol(df)) # 1. output
for (i in seq_along(df)) {           # 2. sequence
    output[i] <- median(df[i])
}
output

If I put median(df[i]) in the for loop it shows:

Error in median.default(df[i]) : need numeric data

Why is that? Why do I have to use [[]] here? I thought that inside the median function I just have to pass the entire column, which can be done with df[i].

sam
  • `df[i]` is not the column, it's a data frame with one column. – user2554330 Sep 25 '22 at 10:12
  • A data frame is a list, and the `[[ ]]` notation is a way of selecting an element (i.e. a column) from that list. Alternatively, you can select a df column with `df[ ,i]`, which treats it like a 2-D array. – Andrew Gustar Sep 25 '22 at 10:13
  • Some more info that might be helpful: 1) tibbles behave differently from (base) data frames with regard to what type of object is returned, and 2) you can use the `drop` argument within the brackets to control the behavior, e.g., `df[, i, drop = TRUE]` to get a vector. – sashahafner Sep 25 '22 at 10:44
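
To make the comments concrete, here is a minimal sketch of what each subsetting form returns on a tibble, followed by the loop from the question with `[[ ]]` swapped in. The `set.seed(1)` call is my own addition for reproducibility and is not in the original code. Treat this as an illustration rather than the only way to do it.

```r
library(dplyr)  # tibble() is re-exported by dplyr, as in the question

set.seed(1)     # assumption: arbitrary seed, only so the sketch is reproducible
df <- tibble(
    a = rnorm(10),
    b = rnorm(10),
    c = rnorm(10),
    d = rnorm(10)
)

# Single brackets keep the container: the result is still a one-column tibble.
is.data.frame(df[1])              # TRUE
# Double brackets extract the list element itself: a plain numeric vector.
is.numeric(df[[1]])               # TRUE
# On a tibble, df[, 1] also stays a tibble unless drop = TRUE is given.
is.data.frame(df[, 1])            # TRUE
is.numeric(df[, 1, drop = TRUE])  # TRUE

output <- vector("double", ncol(df))
for (i in seq_along(df)) {
    # [[ ]] hands median() a numeric vector instead of a one-column tibble
    output[i] <- median(df[[i]])
}
output
```

With a base `data.frame`, `df[, i]` would already drop to a vector for a single column, which is probably why the two forms are often treated as interchangeable. Tibbles do not drop by default.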

0 Answers