
I have a df of 16k+ items. I want to assign the values A, B and C to those items based on their row numbers.

Example: I have the following df with 10 unique items

df <- data.frame(n = 1:10)

Now I have three separate vectors (A, B, C) that contain row numbers of the df with values A, B or C.

A <- c(3, 9)
B <- c(2, 6, 8)
C <- c(1, 4, 5, 7, 10)

Now I want to add a new category column to the df and assign values A, B and C based on the row numbers that are in the three vectors that I have. For example, I would like to assign value C to rows 1, 4, 5, 7 and 10 of the df.

I tried to experiment with for loops and if statements to match the value of the vector with the row number of the df but I didn't succeed. Can anybody help out?

Jay
  • Hello @Jay. Welcome to SO! Please provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). It would be easier for other SO users to help you. Thank you in advance. Cheers. – lovalery Nov 06 '21 at 19:25
  • Is this better now? – Jay Nov 06 '21 at 21:09
  • Thank you for improving your question. – lovalery Nov 06 '21 at 22:14

2 Answers


Here is a way to assign the new column.

Create the data frame and a list of vectors:

df <- data.frame(n=1:10)
dat <- list( A=c(3, 9), B=c(2, 6, 8), C=c(1, 4, 5, 7, 10) )

Put the data in the desired rows:

df$new[unlist(dat)] <- sub("[0-9].*$","",names(unlist(dat)))

Result:

df
    n new
1   1   C
2   2   B
3   3   A
4   4   C
5   5   C
6   6   B
7   7   C
8   8   B
9   9   A
10 10   C
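
For readers wondering how the one-liner works, here is a rough breakdown using the same example data. The names `idx`, `labels_sub` and `labels_rep` are introduced here only for illustration and are not part of the answer. The sketch also shows an alternative that builds the labels with `rep()` and `lengths()` instead of a regular expression, which should behave the same here and would not be thrown off by list names that themselves contain digits. Creating the column as `NA` first is an extra precaution, assuming some rows might not appear in any vector.

```r
# Same example data as in the answer
df  <- data.frame(n = 1:10)
dat <- list(A = c(3, 9), B = c(2, 6, 8), C = c(1, 4, 5, 7, 10))

# unlist() flattens the list; each element is named after its list
# entry plus a running number
idx <- unlist(dat)
names(idx)
# "A1" "A2" "B1" "B2" "B3" "C1" "C2" "C3" "C4" "C5"

# Removing everything from the first digit onward leaves the category
labels_sub <- sub("[0-9].*$", "", names(idx))

# Alternative without a regex: repeat each list name once per element
labels_rep <- rep(names(dat), lengths(dat))

identical(labels_sub, labels_rep)   # both give A A B B B C C C C C

df$new <- NA_character_             # start with an all-NA column
df$new[idx] <- labels_rep           # fill in the listed rows
df
```

Either set of labels can be used in the final assignment; which one reads better is largely a matter of taste.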
Andre Wildberg
  • Thanks Andre, When I tried to apply your solution, the unlist function results in values where the category name (A, B, C) is linked to numbers, so it results in values like A1, B7, up until C1001. The letters are correct though! So it partially works – Jay Nov 06 '21 at 22:19
  • Right, my bad. Forgot you had a big data set. I adjusted the substitution. Should work now! – Andre Wildberg Nov 06 '21 at 22:25
  • You are a magician. I'm not yet sure how it works, but it works. Thanks a lot! – Jay Nov 07 '21 at 12:41

You could iterate over the names of a list and assign those names to the positions indexed by the successive sets of numeric values:

dat <- list(A=A, B=B, C=C)
df$new <- NA_character_  # create the column first so it has one slot per row
for(i in names(dat)){ df$new[ dat[[i]] ] <- i }
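
With 16k+ rows it may be worth checking the index vectors before looping. The following is a minimal sketch under some assumptions: row 10 is deliberately left out of `C` to show what happens to unlisted rows, and the sanity checks (no duplicated row numbers, no index outside the data frame) are additions of this example rather than something the question asked for.

```r
df <- data.frame(n = 1:10)
# Row 10 is deliberately missing here (an assumption for illustration)
dat <- list(A = c(3, 9), B = c(2, 6, 8), C = c(1, 4, 5, 7))

idx <- unlist(dat)
# Sanity checks: no row listed twice, no index outside the data frame
stopifnot(anyDuplicated(idx) == 0, all(idx >= 1), all(idx <= nrow(df)))

df$new <- NA_character_        # every row starts out as NA
for (i in names(dat)) {
  df$new[dat[[i]]] <- i        # overwrite only the rows listed for i
}

df$new
which(is.na(df$new))           # rows that received no category
```

Rows that appear in none of the vectors simply stay `NA`, which makes gaps in the assignment easy to spot afterwards.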
IRTFM