4

Given a dataframe that looks like:

V1  V2  V3
5   8   12
4   9   5
7   3   9
...

How can I add columns to the dataframe for the min and median of these 3 columns, calculated for each row?

The resulting DF should look like:

V1  V2  V3  Min  Median
5   8   12  5    8
4   9   5   4    5
7   3   9   3    7
...

I tried using dplyr::mutate:

mutate(df, Min = min(V1,V2,V3)) 

but that takes the min of the entire dataframe and puts that value in every row. How can I get the min and median of just each row?

For Mean, I can use rowMeans in mutate, but there are no similar functions for min and median.

I also tried

lapply(df[1:3], median)

but it just produces the median of each column.

dd <- read.table(header = TRUE, text = 'V1  V2  V3
5   8   12
4   9   5
7   3   9')
rawr
brno792
  • have you tried `apply(df,1,median)` ? – Andrelrms Mar 09 '16 at 21:26
  • Possible duplicate of [R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) – A. Webb Mar 09 '16 at 21:53
  • You can google based on the naming convention `rowMeans` and find that like-named functions have been developed, `rowMins` and `rowMedians` – Frank Mar 09 '16 at 21:57
  • `df[c('min','median')] <- lapply(list(min, median), function(x) apply(df, 1, x))` – rawr Mar 09 '16 at 22:03
  • 1
    Since apparently 5+ answers are required for this `cbind(df,t(apply(df,1,quantile,c(0,0.5))))`. – A. Webb Mar 09 '16 at 22:12
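To make the behaviour described in the question concrete, here is a minimal base R sketch using the question's `dd` sample data. `min()` pools all of its arguments and returns a single number, which `mutate()` then recycles down the whole column, while `pmin()` compares its arguments element by element and so gives one minimum per row. The object names `overall_min`, `row_min` and `row_median` are made up for illustration.

```r
# Sample data from the question
dd <- read.table(header = TRUE, text = 'V1  V2  V3
5   8   12
4   9   5
7   3   9')

# min() pools every value from all three columns: one number for the whole frame
overall_min <- min(dd$V1, dd$V2, dd$V3)

# pmin() ("parallel minimum") works element-wise: one number per row
row_min <- pmin(dd$V1, dd$V2, dd$V3)

# Base R has no parallel median, so apply() over rows (MARGIN = 1) fills that gap
row_median <- apply(dd[c("V1", "V2", "V3")], 1, median)
```

Because `pmin()` is vectorised, it can be used inside `mutate()` without any row-wise grouping; the median still needs a row-by-row approach such as `apply()` or `rowwise()`.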

5 Answers

9

With dplyr, using the function rowwise

library(dplyr)
mutate(rowwise(df), min = min(V1, V2, V3), median = median(c(V1, V2, V3)))
# Using the pipe operator %>%
df %>% 
  rowwise() %>% 
  mutate(min = min(V1, V2, V3), median = median(c(V1, V2, V3)))

Output:

Source: local data frame [3 x 5]
Groups: <by row>

     V1    V2    V3   min median
  (int) (int) (int) (int)  (int)
1     5     8    12     5      8
2     4     9     5     4      5
3     7     3     9     3      7
mpalanco
5

You can use apply like this (the 1 means calculate by row, 2 would calculate by column):

the_min <- apply(df, 1, min)   
the_median <- apply(df, 1, median)
df$Min <- the_min
df$Median <- the_median
tcash21
  • 2
    I think it will not work as in the calculation of Median you are using the column "Min" as well – adaien Mar 09 '16 at 21:31
  • That's correct, so you should calculate separately first. I'll edit. – tcash21 Mar 09 '16 at 21:38
  • 2
    You don't really need to save to separate variables first; you can just go straight to the column: `df$Min <- apply(df, 1, min); df$Median <- apply(df[,1:3], 1, median)` – alistaire Mar 09 '16 at 21:49
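The pitfall raised in these comments can be reproduced with a short sketch. Once `Min` has been added, `apply(df, 1, median)` sees four values per row, so the median changes. Selecting the value columns by name avoids that regardless of the order of assignment. The names `value_cols` and `wrong_median` are illustrative only, and `df` is rebuilt here from the question's sample values.

```r
df <- data.frame(V1 = c(5, 4, 7), V2 = c(8, 9, 3), V3 = c(12, 5, 9))

# Fix the set of input columns up front so later additions cannot leak in
value_cols <- c("V1", "V2", "V3")

df$Min <- apply(df[value_cols], 1, min)

# Pitfall: df now has four columns, so Min is included in the median
wrong_median <- apply(df, 1, median)

# Restricting to the original columns gives the intended row medians
df$Median <- apply(df[value_cols], 1, median)
```

With the sample data, `wrong_median` differs from `df$Median` in every row, which is why the answer computes both statistics before assigning either one.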
0
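# Compute both statistics before adding columns, using names that
# do not shadow the base functions min() and median()
row_min <- apply(df, 1, min)
row_median <- apply(df, 1, median)
df$Min <- row_min
df$Median <- row_median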
adaien
0

You can do it with dplyr, but you need to group by a unique ID variable so that the functions are evaluated separately for each row. If, say, V1 is definitely unique, this is pretty easy:

dat %>% group_by(V1) %>% mutate(min = min(V1, V2, V3), median = median(c(V1, V2, V3)))

If you don't have a unique ID, you can make (and delete, if you like) one pretty easily:

dat %>% mutate(id = seq_len(n())) %>% group_by(id) %>% 
  mutate(min = min(V1, V2, V3), median = median(c(V1, V2, V3))) %>% 
  ungroup() %>% select(-id)

Either way, you get

Source: local data frame [3 x 5]

     V1    V2    V3   min median
  (int) (int) (int) (int)  (int)
1     5     8    12     5      8
2     4     9     5     4      5
3     7     3     9     3      7
alistaire
  • dplyr also has a `rowwise` function, like `dat %>% mutate(min = pmin(V1,V2,V3)) %>% rowwise() %>% mutate(med = median(c(V1,V2,V3)))` – Frank Mar 09 '16 at 21:56
  • 1
    @Frank Cool, that's useful! I knew there are a bunch of functions for working with rows in there, but I haven't used them that much. – alistaire Mar 09 '16 at 22:26
  • Why do you use `c(V1,V2,V3)` to combine the column names for `median` but not for `min`? – ok1more Oct 18 '21 at 14:50
  • @ok1more Sorry, late. But because the first parameter of `min()` is `...`, whereas for `median()` it's `x`, so in the latter the elements must be collected into a vector while `min(3, 1, 2)` and `min(c(3, 1, 2))` both return the same thing, so it doesn't matter. – alistaire Jan 10 '22 at 23:30
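The difference in signatures mentioned in the last comment can be checked directly. This is a small sketch that relies only on base R's `min()` and `median()`; the variable names are made up for illustration.

```r
# min() accepts any number of arguments through `...`, so both spellings agree
min_separate <- min(3, 1, 2)
min_vector   <- min(c(3, 1, 2))

# median() takes a single vector `x` as its first argument,
# so the values have to be combined with c() first
med_vector <- median(c(3, 1, 2))

# The first formal argument of median() is `x`
first_arg <- names(formals(median))[1]
```

Calling `median(V1, V2, V3)` would match only `V1` to `x`, so it would not compute the median of the three values.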
0
data <- data.frame(a = 1:3, b = 4:6, c = 7:9)
data
#   a b c
# 1 1 4 7
# 2 2 5 8
# 3 3 6 9

data$Min <- apply(data,1,min)
data
#   a b c Min
# 1 1 4 7   1
# 2 2 5 8   2
# 3 3 6 9   3

# Use only the original three columns so Min is not included in the median
data$Median <- apply(data[, 1:3], 1, median)
data
#   a b c Min Median
# 1 1 4 7   1      4
# 2 2 5 8   2      5
# 3 3 6 9   3      6

Hope this helped.

Sowmya S. Manian