2

How do I add a column to a data frame consisting of the minimum values from other columns? So in this case, to create a third column that will have the values 1, 2 and 2?

df = data.frame(A = 1:3, B = 4:2)

5 Answers

4

You can use the apply() function for this:

df$C <- apply(df, 1, min)

The second argument (MARGIN) selects the dimension over which min is applied. Here it is 1, meaning rows, so min is computed across all columns of each row separately.
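To make the effect of the second argument concrete, here is a small sketch using the df from the question. The names row_min and col_min are only illustrative.

```r
df <- data.frame(A = 1:3, B = 4:2)

# MARGIN = 1: min is evaluated once per row, across columns A and B
row_min <- apply(df, 1, min)

# MARGIN = 2: min is evaluated once per column, down the rows
col_min <- apply(df, 2, min)

print(unname(row_min))  # 1 2 2
print(col_min)          # A = 1, B = 2
```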

You can choose specific columns from the dataframe, as follows:

df$newCol <- apply(df[c('A','B')], 1, min)
  • Thanks, that does answer my question. But let's say there were three columns to begin with and I only wanted the minimum values from the first two. Can I use apply() and specify that I want the minimum values specifically of columns A and B? – Sergei Walankov Nov 30 '21 at 15:35
  • I added the answer to this, to my answer. Please see above. – Homayoun Hamedmoghadam Nov 30 '21 at 15:36
3

You can call the parallel minimum function with do.call to apply it on all your columns:

df$C <- do.call(pmin, df)
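As a rough sketch of what do.call is doing here: it passes every column of df to pmin as a separate argument, so for the question's data the call is equivalent to pmin(df$A, df$B). The extra column Z and the NA value below are made up for illustration, to show how to restrict the call to some columns and how to pass na.rm.

```r
df <- data.frame(A = c(1L, 2L, NA), B = 4:2, Z = 10:12)

# Restrict the call to chosen columns by subsetting first
min_ab <- do.call(pmin, df[c("A", "B")])

# Extra arguments such as na.rm can be appended to the argument list
min_ab_narm <- do.call(pmin, c(df[c("A", "B")], na.rm = TRUE))

print(min_ab)       # 1 2 NA
print(min_ab_narm)  # 1 2 2
```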
Peter Csala
Tony
2

You can simply do:

df$C <- apply(FUN=min,MARGIN=1,X=df)

Or:

df[, "C"] <- apply(FUN=min,MARGIN=1,X=df)

or:

df["C"] <- apply(FUN=min,MARGIN=1,X=df)

Instead of apply, you could also use sapply on data.frame(t(df)), where t transposes df. sapply traverses a data frame column-wise, applying the given function, so the rows must first be turned into columns. Since t always returns a matrix, you need to convert the result back with data.frame().

df$C <- sapply(data.frame(t(df)), min)

Or one could use the fact that ifelse is vectorized:

df$C <- with(df, ifelse(A<B,A,B))

Or:

df$C <- ifelse(df$A < df$B, df$A, df$B)
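One caveat worth checking with the ifelse approach is missing values: the comparison A < B is NA whenever either value is missing, so the result for that row is NA as well. A small sketch with made-up vectors A and B, compared against pmin with na.rm = TRUE:

```r
A <- c(1, NA, 3)
B <- c(4, 3, 2)

# The comparison A < B is NA in position 2, so ifelse returns NA there
via_ifelse <- ifelse(A < B, A, B)

# pmin can be told to ignore missing values instead
via_pmin <- pmin(A, B, na.rm = TRUE)

print(via_ifelse)  # 1 NA 2
print(via_pmin)    # 1 3 2
```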

matrixStats

# install.packages("matrixStats")

df$C <- matrixStats::rowMins(as.matrix(df))

According to this SO answer, this is the fastest option. apply-type functions call the R function once per row (or per element) and are therefore usually much slower than vectorized alternatives.
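A rough way to check the speed claim yourself is to time two of the approaches on a larger data frame. The sketch below uses only base R (so it compares apply with pmin rather than matrixStats), and the data frame big and its size are made up for illustration. Timings depend on the machine, so no expected numbers are shown.

```r
set.seed(1)
big <- data.frame(A = runif(1e5), B = runif(1e5))

# apply() calls min() once per row
t_apply <- system.time(r_apply <- apply(big, 1, min))

# pmin() handles all rows in a single vectorized call
t_pmin <- system.time(r_pmin <- do.call(pmin, big))

print(t_apply["elapsed"])
print(t_pmin["elapsed"])

# Both approaches should agree on the values
stopifnot(isTRUE(all.equal(unname(r_apply), r_pmin)))
```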

Gwang-Jin Kim
2
library(dplyr)

df %>%
  rowwise() %>%
  mutate(C = min(A, B))

# A tibble: 3 × 3
# Rowwise: 
      A     B     C
  <int> <int> <int>
1     1     4     1
2     2     3     2
3     3     2     2

Using input with equal values across rows:

df = data.frame(A = 1:10, B = 11:2)
df %>%
  rowwise() %>%
  mutate(C = min(A, B))

# A tibble: 10 × 3
# Rowwise: 
       A     B     C
   <int> <int> <int>
 1     1    11     1
 2     2    10     2
 3     3     9     3
 4     4     8     4
 5     5     7     5
 6     6     6     6
 7     7     5     5
 8     8     4     4
 9     9     3     3
10    10     2     2
cazman
  • Does this not fail if you have the same value in the row i.e. if A[1] = B[1] = 1, the output should be 1,1,2 in that case whereas you will get 1,2,2 – cgvoller Nov 30 '21 at 15:30
  • @cgvoller Does the edit answer what you are asking? – cazman Nov 30 '21 at 15:34
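On the question of ties raised in the comments: a row-wise minimum is still well defined when the values in a row are equal. A quick base R check, using a made-up data frame df_ties whose first row has A equal to B:

```r
# Row 1 has A == B
df_ties <- data.frame(A = c(1, 2, 3), B = c(1, 3, 2))

# Element-wise minimum of the two columns
df_ties$C <- pmin(df_ties$A, df_ties$B)

print(df_ties$C)  # 1 2 2
```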
1

You can use transform() to add the min column as the output of pmin(A, B). Inside transform() the columns of df can be referred to by name, without indexing:

df <- transform(df, min = pmin(A, B))

Or, in data.table:

library(data.table)

DT = data.table(a = 1:3, b = 4:2)
DT[,  min := pmin(a, b)]
Rfanatic