
I'm quite new to R and am currently stuck with my data.frame. I have a character column containing groups of varying sizes: for example, the first seven rows are "A", the next five rows are "B", and so on. I also have a vector whose length equals the total number of groups. My goal is to create a new column in which all "A" rows get the first vector value, all "B" rows the second value, and so on.

I already tried:

values <- c("G", "H", "J", "K")
dat$col2 <- values[dat$col1]

from an earlier entry (Create new column based on 4 values in another column) and it worked. But after updating R it no longer works. It still creates the new column "col2", but the values are now all NA instead of the corresponding vector values.

Can anyone help me out with that?

edit: example as reproducible code:

first_column <- c(rep("value_1", 6),rep("value_2",7))
df <- data.frame(first_column)
df$second_column <- c("A","B")[df$first_column]
nuno
  • Are you using the same dataframe and code from that link and getting `NA`s, or are you applying this to some other dataframe? – Ronak Shah Dec 08 '20 at 10:16
  • I'm applying it to another dataframe, which is quite large, so it's too complicated to do it manually every time I get a new set of data of a similar type. I noticed that even after the update it worked if "col1" was a numeric column, as in the link. By now, though, it no longer works in that case either. – nuno Dec 08 '20 at 10:39
  • Can you provide a sample of the dataset that you are applying this to, so that we can reproduce the issue that you are facing? – Ronak Shah Dec 08 '20 at 11:17
  • I added an edit with a reproducible example – nuno Dec 08 '20 at 11:24

2 Answers


I think that you are simply looking for an ifelse.

# One value per group, named after the group it belongs to
group.sizes <- c(10, 20, 30, 40)
names(group.sizes) <- c("G", "H", "J", "K")

# Nested ifelse: test each group in turn, fall back to NA if none matches
df$new.column <- ifelse(df$column == "G",
                        group.sizes["G"],
                        ifelse(df$column == "H",
                               group.sizes["H"],
                               ifelse(df$column == "J",
                                      group.sizes["J"],
                                      ifelse(df$column == "K",
                                             group.sizes["K"],
                                             NA))))
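
Because group.sizes already carries names, the nested ifelse can probably be collapsed into a single lookup by name. The following is only a sketch of that idea, not part of the original answer; the contents of `df` are made-up example data, and `df` and `column` simply mirror the placeholder names used above.

```r
# Hypothetical example data; `df` and `column` mirror the names used above.
df <- data.frame(column = c("G", "G", "H", "J", "K", "K"),
                 stringsAsFactors = FALSE)

group.sizes <- c(G = 10, H = 20, J = 30, K = 40)

# Indexing a named vector with a character vector looks each value up by name.
# unname() drops the names that the lookup carries over to the result.
df$new.column <- unname(group.sizes[df$column])
```

Any value in df$column that is not among names(group.sizes) comes back as NA, which matches the final NA branch of the nested ifelse.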
MacOS

You have character values in first_column. A character index looks elements up by name, and c("A","B") has no names, so every lookup returns NA. Use match to create an integer index instead.

df$second_column <- c("A","B")[match(df$first_column, unique(df$first_column))]
df

#   first_column second_column
#1       value_1             A
#2       value_1             A
#3       value_1             A
#4       value_1             A
#5       value_1             A
#6       value_1             A
#7       value_2             B
#8       value_2             B
#9       value_2             B
#10      value_2             B
#11      value_2             B
#12      value_2             B
#13      value_2             B
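
As for why the code stopped working after the R update: a likely explanation is that data.frame() has defaulted to stringsAsFactors = FALSE since R 4.0.0, so the column is now character rather than factor. The sketch below contrasts the two cases using the example from the question; the names df_chr, df_fct, values, chr_result, fct_result and explicit are introduced here purely for illustration.

```r
first_column <- c(rep("value_1", 6), rep("value_2", 7))

# Character column (the data.frame() default since R 4.0.0)
df_chr <- data.frame(first_column, stringsAsFactors = FALSE)
# Factor column (the default before R 4.0.0)
df_fct <- data.frame(first_column, stringsAsFactors = TRUE)

values <- c("A", "B")

# Character index on an unnamed vector: lookup by name finds nothing
chr_result <- values[df_chr$first_column]

# Factor index: the underlying integer codes (1 and 2) are used instead
fct_result <- values[df_fct$first_column]

# Making the conversion explicit reproduces the old behaviour
explicit <- values[as.integer(factor(df_chr$first_column))]
```

Here chr_result is all NA, while fct_result and explicit hold the expected "A" and "B" values. Note that factor() sorts its levels, whereas match(x, unique(x)) numbers the groups in order of first appearance; the two agree in this example but can differ when the groups do not appear in sorted order.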
Ronak Shah