I want to create a new column in my data.frame, based on values in other rows.

If "type" is not equal to "a", my "new.area" column should contain the "area" value from the type "a" row with the same "distance". This has to work for multiple distances.

Example:

# create data frame
distance<-rep(seq(1,5, by = 1),2)
area<-c(11:20)
type<-rep(c("a","b"),each = 5)

# check data.frame
(my.df<-data.frame(distance, area, type))

   distance area type
1         1   11    a
2         2   12    a
3         3   13    a
4         4   14    a
5         5   15    a
6         1   16    b
7         2   17    b
8         3   18    b
9         4   19    b
10        5   20    b

I want to create a new column (my.df$new.area) where, for every "distance", each row holds the "area" value of type "a" at that distance.

   distance area type new.area
1         1   11    a       11
2         2   12    a       12
3         3   13    a       13
4         4   14    a       14
5         5   15    a       15
6         1   16    b       11
7         2   17    b       12
8         3   18    b       13
9         4   19    b       14
10        5   20    b       15

I know how to do this manually for a single distance:

my.df$new.area[my.df$distance == 1 ] <- 11

But how can I do this automatically?

maycca

2 Answers

Here is a base R solution using index subsetting ([) and match:

my.df$new.area <- with(my.df, area[type == "a"][match(distance, distance[type == "a"])])

which returns

my.df
   distance area type new.area
1         1   11    a       11
2         2   12    a       12
3         3   13    a       13
4         4   14    a       14
5         5   15    a       15
6         1   16    b       11
7         2   17    b       12
8         3   18    b       13
9         4   19    b       14
10        5   20    b       15

area[type == "a"] supplies the vector of candidate values. match returns, for each row's distance, its position among the distances of the type "a" rows, and those positions are then used to index the candidate vector. with is used to avoid repeating my.df$.
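
To make the lookup easier to follow, the one-liner can be split into its steps. This is only a sketch on the example data from the question; the intermediate names is_a, lookup and pos are introduced here for illustration and are not part of the original answer.

```r
# Rebuild the example data from the question
my.df <- data.frame(
  distance = rep(1:5, 2),
  area     = 11:20,
  type     = rep(c("a", "b"), each = 5)
)

# Illustrative intermediate names (not from the original answer)
is_a   <- my.df$type == "a"            # which rows are type "a"
lookup <- my.df$area[is_a]             # candidate values: areas of the "a" rows

# For each row, the position of its distance among the "a" rows' distances
pos <- match(my.df$distance, my.df$distance[is_a])

# Index the candidate values by those positions
my.df$new.area <- lookup[pos]
my.df
```

If a distance has no type "a" row, match returns NA for it, so new.area is NA there rather than raising an error.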

lmo

We can use data.table, grouping by distance:

library(data.table)
setDT(my.df)[, new.area := area[type=="a"] , distance]
my.df
#     distance area type new.area
# 1:        1   11    a       11
# 2:        2   12    a       12
# 3:        3   13    a       13
# 4:        4   14    a       14
# 5:        5   15    a       15
# 6:        1   16    b       11
# 7:        2   17    b       12
# 8:        3   18    b       13
# 9:        4   19    b       14
#10:        5   20    b       15

Or, because distance here is the sequence 1 to 5 and so coincides with the positions of the type "a" rows, we can use distance directly as a numeric index:

with(my.df, area[type=="a"][distance])
#[1] 11 12 13 14 15 11 12 13 14 15
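
The positional shortcut depends on distance being 1, 2, 3, ... in the same order as the type "a" rows. A small sketch with made-up data (df2 and its values are hypothetical, not from the question) suggests what happens when that assumption does not hold, and how a match-based lookup behaves on the same data.

```r
# Hypothetical data: distances are 10, 20, 30 rather than 1, 2, 3
df2 <- data.frame(
  distance = rep(c(10, 20, 30), 2),
  area     = 1:6,
  type     = rep(c("a", "b"), each = 3)
)

# Positional indexing asks for elements 10, 20, 30 of a length-3 vector
by_position <- with(df2, area[type == "a"][distance])

# Looking the distance up with match does not rely on its values
by_match <- with(df2, area[type == "a"][match(distance, distance[type == "a"])])

by_position
by_match
```

Here by_position is all NA, while by_match recovers the type "a" areas for each distance.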
akrun