0

My df looks like this now.

A B C
1 3 .
1 6 .
1 9 .
2 1 .
2 2 . 
2 5 .
3 9 .
3 3 .
3 2 .

Below is the ideal dataframe I am trying to create:

  • Variable A refers to individuals (user-id). Each individual has three rows.
  • Each individual has different values for variable B...
  • whereas they need to share the same value of variable C.

I need to repeat the same value of variable C for each individual. How can I give each participant a single value of variable C so that it is repeated three times, once on each of that participant's rows?

A B C
1 3 1
1 6 1
1 9 1
2 1 3
2 2 3 
2 5 3
3 9 8
3 3 8
3 2 8
smci
ND2020
  • 2
    *"I need to repeat a same value of variable C for each individual."* But were you given the vector of C values `(1,3,8)` corresponding to the user-id A? Was that also an input parameter? And it is guaranteed that each user has 3 records. – smci Dec 31 '19 at 20:18
  • 1
    Where does that set of numbers come from? – camille Dec 31 '19 at 21:12
  • ND2020 could you please answer? It would be better to solve the question generally rather than hacking up something specific to your use-case which assumes each user has exactly three records, and breaks whenever that's not true, or whenever records aren't in-order of increasing user-id/A-value? Can we assume our input is a vector of C-values? named vector? dataframe which we can do [`merge/join`](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) on? – smci Jan 02 '20 at 15:18
  • They were dataframes. I used `merge()` and it worked perfectly well: variable C is now repeated for each participant. – ND2020 Jan 09 '20 at 21:38
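
The `merge()` approach mentioned in the last comment might look roughly like the sketch below. It assumes the C values live in a second data frame keyed by A; the name `lookup` and its contents are hypothetical, chosen to match the example in the question.

```r
# Example data from the question (without column C)
df <- data.frame(A = rep(1:3, each = 3), B = c(3, 6, 9, 1, 2, 5, 9, 3, 2))

# Hypothetical lookup table: one row per individual, holding that individual's C
lookup <- data.frame(A = 1:3, C = c(1, 3, 8))

# Left join on A: every row of df picks up the C value for its individual
merged <- merge(df, lookup, by = "A", all.x = TRUE)
merged
```

Note that `merge()` sorts the result by the key column by default, so the row order may differ from the input if A was not already sorted.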

3 Answers

4

We can just use `rep` in base R, since the number of repeats is already known to be 3:

df$C <- rep(c(1, 3, 8), each = 3)
df
#  A B C
#1 1 3 1
#2 1 6 1
#3 1 9 1
#4 2 1 3
#5 2 2 3
#6 2 5 3
#7 3 9 8
#8 3 3 8
#9 3 2 8

Another option is to use 'A' as an integer index, which also works when the groups have unequal lengths:

df$C <- c(1, 3, 8)[df$A]
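
As a small illustration of the unequal-lengths case, here is a sketch with a made-up data frame (`df2` and `vals` are hypothetical names) in which the individuals have 2, 1 and 3 rows. It assumes the values in A are the positions 1, 2, 3 of the value vector.

```r
# Hypothetical data: individuals 1, 2, 3 have 2, 1 and 3 rows respectively
df2 <- data.frame(A = c(1, 1, 2, 3, 3, 3), B = c(3, 6, 1, 9, 3, 2))

# One C value per individual, stored at the position given by the id
vals <- c(1, 3, 8)

# Indexing by A repeats each value once per row of that individual
df2$C <- vals[df2$A]
df2$C
# [1] 1 1 3 8 8 8
```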

If the values in 'A' are not consecutive integers starting at 1, or are not numeric at all, use a named vector as the lookup instead:

df$C <- setNames(c(1, 3, 8), unique(df$A))[as.character(df$A)]
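
A sketch of the named-vector lookup with non-numeric ids follows. The ids `"u7"`, `"u2"`, `"u9"` and the names `df3` and `lookup` are invented for illustration. Because the names come from `unique()`, the values are paired with the ids in order of first appearance, so this only gives the intended result if the value vector is listed in that same order.

```r
# Hypothetical data with character ids that are neither sorted nor numeric
df3 <- data.frame(A = c("u7", "u7", "u2", "u9", "u9"),
                  B = c(3, 6, 1, 9, 3),
                  stringsAsFactors = FALSE)

# Pair each value with an id in order of first appearance:
# u7 -> 1, u2 -> 3, u9 -> 8
lookup <- setNames(c(1, 3, 8), unique(df3$A))

# Look up by name; unname() drops the id names from the resulting column
df3$C <- unname(lookup[as.character(df3$A)])
df3$C
# [1] 1 1 3 8 8
```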

data

df <- data.frame(A = rep(1:3, each = 3), B = c(3, 6, 9, 1, 2, 5, 9, 3, 2))
akrun
1

You could use an assignment matrix and match its first column against your A column.

am <- matrix(c(1, 1,
               2, 3,
               3, 8), byrow=TRUE, ncol=2)

dat$C <- am[match(dat$A, am[,1]), 2]
dat
#   A B C
# 1 1 3 1
# 2 1 6 1
# 3 1 9 1
# 4 2 1 3
# 5 2 2 3
# 6 2 5 3
# 7 3 9 8
# 8 3 3 8
# 9 3 2 8

Data:

dat <- structure(list(A = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), B = c(3L, 
6L, 9L, 1L, 2L, 5L, 9L, 3L, 2L)), row.names = c(NA, -9L), class = "data.frame")
jay.sf
1

The solution by @akrun is the most efficient so far. Here is another base R solution, which also works when the groups defined by `df$A` have unequal sizes:

v <- c(1,3,8)
df <- do.call(rbind,lapply(seq_along(v), function(k) cbind(split(df,df$A)[[k]],C=v[k])))

such that

> df
  A B C
1 1 3 1
2 1 6 1
3 1 9 1
4 2 1 3
5 2 2 3
6 2 5 3
7 3 9 8
8 3 3 8
9 3 2 8
ThomasIsCoding