I would like to mutate one column of a data frame depending on whether certain conditions are matched. I have looked around but have not found a neat solution so far; the closest I found was use-mutate-to-create-new-column-label-with-conditions.

Here is the simple data frame that I used:

gr = rep(seq(1,2),each=3)
clas=c("A_1","A_2","A_3","A_4","A_5","A_6")

df <- data.frame(gr,clas)

> df
  gr clas
1  1  A_1
2  1  A_2
3  1  A_3
4  2  A_4
5  2  A_5
6  2  A_6

I would like to change A_4, A_5 and A_6 to B_1, B_2 and B_3.

So I tried

match <- paste('_',seq(4,6),sep='')
 df%>%
  mutate(clas=ifelse(clas %in% match,paste('B',seq(1,3),sep='_'),clas))

       gr clas
    1  1    1
    2  1    2
    3  1    3
    4  2    4
    5  2    5
    6  2    6

and a second try with grepl:

df%>%
mutate(clas=ifelse(clas==grepl(paste(match,collapse='|'),clas),paste('B',seq(1,3),sep='_'),clas))

   gr clas
1  1    1
2  1    2
3  1    3
4  2    4
5  2    5
6  2    6

In both attempts the A's are gone as well :) The expected result is:

   gr clas
1  1  A_1
2  1  A_2
3  1  A_3
4  2  B_1
5  2  B_2
6  2  B_3

Thanks!

EDIT: I realized that this is easier to do when the clas column contains plain letters. But if we have data like this, with no gr column, how do we do that?

    clas
1   CD_1
2  X.2_2
3  K$2_3
4 12k3_4
5   .A_5
6   xy_6

The expected output is

    clas
1   CD_1
2  X.2_2
3  K$2_3
4 12kB_4
5   .B_5
6   xB_6

I guess I was looking for a solution like that.

Alexander

3 Answers

Here's a base R solution that relies on df$gr:

paste(LETTERS[df$gr], ave(df$gr, df$gr, FUN=seq_along), sep="_")
[1] "A_1" "A_2" "A_3" "B_1" "B_2" "B_3"

LETTERS is the built-in vector of Latin capital letters, so LETTERS[1] is "A". Indexing it with df$gr gives "A" for group 1 and "B" for group 2. seq_along builds a running count, and ave resets that count for each group. The two pieces are then pasted together with "_" as the separator.
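The one-liner can be unpacked into its separate steps, which may make it easier to see what each function contributes. This is only an illustrative sketch in base R; the intermediate names idx and prefix are made up for this example and do not appear in the answer.

```r
# Rebuild the gr column from the question
gr <- rep(seq(1, 2), each = 3)
df <- data.frame(gr)

# Step 1: running count within each group.
# ave() splits df$gr by the grouping vector (also df$gr) and applies seq_along() to each piece.
idx <- ave(df$gr, df$gr, FUN = seq_along)

# Step 2: map the group number to a capital letter (1 -> "A", 2 -> "B")
prefix <- LETTERS[df$gr]

# Step 3: glue letter and counter together with "_" as the separator
df$clas <- paste(prefix, idx, sep = "_")
df$clas
```

Because the letter comes from gr, this rebuilds the whole clas column rather than editing the existing strings, so it only fits data where the label is fully determined by the group.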

lmo

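Here is a dplyr solution:

library(dplyr)
# Number the rows within each gr group and prefix them with the group's letter
df %>% group_by(gr) %>% mutate(clas = paste0(toupper(letters[gr]), "_", row_number()))
# toupper(letters[gr]) can be replaced with LETTERS[gr]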

# A tibble: 6 x 2
# Groups:   gr [2]
     gr  clas
  <int> <chr>
1     1   A_1
2     1   A_2
3     1   A_3
4     2   B_1
5     2   B_2
6     2   B_3
BENY
  • May as well use `LETTERS` rather than `toupper(letters)`. – lmo Aug 16 '17 at 19:13
  • @Wen Hi Wen, thanks for the answer. However, the letters in clas were not put there on purpose; the real solution I am looking for is in the EDIT part. Could you check the EDIT part of the question? I guess your solution does not work on that? – Alexander Aug 17 '17 at 00:24
  • @Wen Sorry, I edited the expected output again. What I am looking for is just to change the character before the `_` to B and keep the rest of the string. – Alexander Aug 17 '17 at 01:23
  • @Alexander Seems like you do not need `dplyr`. Try this (it uses stringr): `paste0(str_sub(str_split_fixed(df$clas,"_",2)[,1],1,-2),LETTERS[df$gr],'_',str_split_fixed(df$clas,"_",2)[,2])` – BENY Aug 17 '17 at 01:37

Here is a base R approach written specifically for this problem.

First make sure the column is a character vector rather than a factor. I called the data frame from the EDIT above B, and the rows to change (4 to 6) are hard-coded:

  B[,1]=as.character(B[,1])
  B[4:6,1]=sapply(B$clas[4:6],function(i) {substr(i,nchar(i)-2,nchar(i)-2)<-"B";i})
  B
     clas
 1   CD_1
 2  X.2_2
 3  K$2_3
 4 12kB_4
 5   .B_5
 6   xB_6
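
As an alternative sketch of the same idea, a regular expression can replace the single character just before the underscore. This assumes, as above, that the rows to change (4 to 6) are known in advance and that each string contains one underscore; the name rows is introduced only for this example.

```r
# Data from the EDIT part of the question; stringsAsFactors = FALSE keeps clas as character
B <- data.frame(clas = c("CD_1", "X.2_2", "K$2_3", "12k3_4", ".A_5", "xy_6"),
                stringsAsFactors = FALSE)

# Rows to modify (assumption: known in advance, as in the code above)
rows <- 4:6

# "._" matches any single character followed by "_"; sub() replaces the first such match with "B_"
B$clas[rows] <- sub("._", "B_", B$clas[rows])
B
```

Unlike the substr version, this does not depend on the suffix after the underscore being exactly one character long.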
Onyambu