R: fill new columns in data.frame based on row values by condition?

Question

I want to create a new column in my data.frame based on values in its rows.

If "type" is not equal to "a", my "new.area" column should contain the "area" value of the type "a" row with the same "distance". This applies to multiple distances.

Example:

# create data frame
distance<-rep(seq(1,5, by = 1),2)
area<-c(11:20)
type<-rep(c("a","b"),each = 5)

# check data.frame
(my.df<-data.frame(distance, area, type))

   distance area type
1         1   11    a
2         2   12    a
3         3   13    a
4         4   14    a
5         5   15    a
6         1   16    b
7         2   17    b
8         3   18    b
9         4   19    b
10        5   20    b

I want to create a new column (my.df$new.area) where, for every "distance", each row holds the "area" value of type "a".

   distance area type new.area
1         1   11    a       11
2         2   12    a       12
3         3   13    a       13
4         4   14    a       14
5         5   15    a       15
6         1   16    b       11
7         2   17    b       12
8         3   18    b       13
9         4   19    b       14
10        5   20    b       15

I know how to do this manually for a single distance:

my.df$new.area[my.df$distance == 1 ] <- 11

But how can I do it automatically?

score 4 · Accepted Answer · answered Dec 09 '16 at 14:29

Here is a base R solution using index subsetting ([) and match:

my.df$new.area <- with(my.df, area[type == "a"][match(distance, distance[type == "a"])])

which returns

my.df
   distance area type new.area
1         1   11    a       11
2         2   12    a       12
3         3   13    a       13
4         4   14    a       14
5         5   15    a       15
6         1   16    b       11
7         2   17    b       12
8         3   18    b       13
9         4   19    b       14
10        5   20    b       15

area[type == "a"] supplies the vector of candidate values. match returns, for each row's distance, the position of that distance within distance[type == "a"], and those positions then index into the candidate vector. with is used to avoid repeating my.df$.
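
To make the lookup easier to follow, here is the same one-liner split into steps. This is only a sketch on the question's example data; the intermediate names a.distance, a.area and idx are made up for illustration.

```r
# Rebuild the example data from the question
my.df <- data.frame(
  distance = rep(1:5, 2),
  area     = 11:20,
  type     = rep(c("a", "b"), each = 5)
)

# Lookup table: distances and areas of the type "a" rows only
a.distance <- my.df$distance[my.df$type == "a"]
a.area     <- my.df$area[my.df$type == "a"]

# For every row, the position of its distance among the type "a" distances
idx <- match(my.df$distance, a.distance)

# Use those positions to pull the corresponding type "a" area
my.df$new.area <- a.area[idx]
my.df
```

Note that match returns NA for a distance that has no type "a" row, so new.area would be NA for such rows rather than raising an error.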

akrun · Answer 2 · 2016-12-09T15:53:40.283

We can use data.table, grouping by distance:

library(data.table)
setDT(my.df)[, new.area := area[type == "a"], by = distance]
my.df
#     distance area type new.area
# 1:        1   11    a       11
# 2:        2   12    a       12
# 3:        3   13    a       13
# 4:        4   14    a       14
# 5:        5   15    a       15
# 6:        1   16    b       11
# 7:        2   17    b       12
# 8:        3   18    b       13
# 9:        4   19    b       14
#10:        5   20    b       15

Or, because distance here is the sequence 1 to 5 and the type "a" rows appear in that same order, we can use distance directly as a numeric index:

with(my.df, area[type=="a"][distance])
#[1] 11 12 13 14 15 11 12 13 14 15
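
The positional shortcut relies on the distance values coinciding with positions in the type "a" subset. As a small sketch with made-up data (df2 is hypothetical, not from the question), here is what happens when distances are 10, 20, 30 instead, compared with the match-based lookup:

```r
# Hypothetical data where distance is NOT the sequence 1..n
df2 <- data.frame(
  distance = c(10, 20, 30, 10, 20, 30),
  area     = c(1, 2, 3, 4, 5, 6),
  type     = rep(c("a", "b"), each = 3)
)

# Positional indexing asks for elements 10, 20 and 30 of a
# length-3 vector, which do not exist, so every result is NA
by.position <- with(df2, area[type == "a"][distance])

# match looks the distance up by value, so it still works
by.match <- with(df2, area[type == "a"][match(distance, distance[type == "a"])])

by.position
by.match
```

So the shortcut is fine for the example as posted, but a lookup by value is probably the safer choice if the distances may be unsorted or non-consecutive.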
