
I have a data frame like this:

df <- data.frame(name1 = c("a" , "a", "a", "a", "c", "c", "c", "c"),
                 name2 = c(NA,"a","a",NA, NA, "c", "c", NA),
                 name3 = c(NA, "b", "b", NA, NA, "d","d",NA))

Then I created a new column based on a condition:

library(tidyverse)
df %>% mutate(name4 = ifelse(!is.na(name3), name3, name1))

  name1 name2 name3 name4
1     a  <NA>  <NA>     a
2     a     a     b     b
3     a     a     b     b
4     a  <NA>  <NA>     a
5     c  <NA>  <NA>     c
6     c     c     d     d
7     c     c     d     d
8     c  <NA>  <NA>     c

I would like to replace a and c in name4 with b and d, respectively, without hard-coding the characters (i.e. without typing "a" or "b" explicitly). Would creating another column also be a good option?

Any suggestions for this?

Desired output

  name1 name2 name3 name4
1     a  <NA>  <NA>     b
2     a     a     b     b
3     a     a     b     b
4     a  <NA>  <NA>     b
5     c  <NA>  <NA>     d
6     c     c     d     d
7     c     c     d     d
8     c  <NA>  <NA>     d
zx8754
Anh
  • I do not follow. Could you clarify why a becomes b in the name4 column? – zx8754 Sep 28 '21 at 08:11
  • Let's say a in the name2 column is changed to b in the name3 column. Then the name4 column is a final column containing the old name and the new name. – Anh Sep 28 '21 at 08:15
  • Or: the character in the name3 column is my first priority, followed by the name1 column, but now I would like to standardize the old name `a` to the new name `b`. Is that clear? – Anh Sep 28 '21 at 08:17
  • I think you are looking for coalesce, see https://stackoverflow.com/q/19253820/680068 – zx8754 Sep 28 '21 at 08:21
  • I do not think it solves my problem, but thank you anyway :D – Anh Sep 28 '21 at 08:49
  • Did coalesce (see the linked post) solve your issue? – zx8754 Sep 28 '21 at 08:50
  • @zx8754 No sir, since my real data frame is very long, with different characters in the name1 column that I want to keep. Coalesce will remove those names from my data frame. – Anh Sep 28 '21 at 08:55
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/237582/discussion-between-anh-and-zx8754). – Anh Sep 28 '21 at 09:02

2 Answers


Here are two possible approaches. Both spell out the old-to-new mapping explicitly:

df <- data.frame(name1 = c("a" , "a", "a", "a", "c", "c", "c", "c"),
                 name2 = c(NA,"a","a",NA, NA, "c", "c", NA),
                 name3 = c(NA, "b", "b", NA, NA, "d","d",NA))
library(tidyverse)
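# Option 1: nested sub() calls rewrite 'c' to 'd', then 'a' to 'b'.
# Caveat: sub() replaces the first match anywhere in each string,
# so this is only safe for single-letter values like these.
df %>% mutate(name4 = ifelse(!is.na(name3), name3, name1), 
              name4 = sub('a', 'b', sub('c', 'd', name4)))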
#>   name1 name2 name3 name4
#> 1     a  <NA>  <NA>     b
#> 2     a     a     b     b
#> 3     a     a     b     b
#> 4     a  <NA>  <NA>     b
#> 5     c  <NA>  <NA>     d
#> 6     c     c     d     d
#> 7     c     c     d     d
#> 8     c  <NA>  <NA>     d
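# Option 2: look up each value in a named vector (names = old, values = new).
# Every value that can occur in name4 needs an entry, otherwise it becomes NA.
df %>% mutate(name4 = ifelse(!is.na(name3), name3, name1), 
              name4 = c('a'='b', 'c'='d', 'b'='b', 'd'='d')[name4])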
#>   name1 name2 name3 name4
#> 1     a  <NA>  <NA>     b
#> 2     a     a     b     b
#> 3     a     a     b     b
#> 4     a  <NA>  <NA>     b
#> 5     c  <NA>  <NA>     d
#> 6     c     c     d     d
#> 7     c     c     d     d
#> 8     c  <NA>  <NA>     d

Created on 2021-09-28 by the reprex package (v2.0.1)
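
Since the question asks to avoid typing the letters by hand, the named-vector lookup can also be built from the data itself. The following is only a sketch under one assumption: each name1 value maps to at most one name3 value. The helper names `known`, `known_pairs`, `lookup` and `new_name` are made up for illustration. It uses base R only.

```r
df <- data.frame(name1 = c("a", "a", "a", "a", "c", "c", "c", "c"),
                 name2 = c(NA, "a", "a", NA, NA, "c", "c", NA),
                 name3 = c(NA, "b", "b", NA, NA, "d", "d", NA))

# Rows where a new name is recorded
known <- !is.na(df$name3)

# One row per distinct (old name, new name) pair
# ASSUMPTION: each name1 value has at most one name3 value
known_pairs <- unique(df[known, c("name1", "name3")])

# Named vector: names are old names, values are new names
lookup <- setNames(as.character(known_pairs$name3),
                   as.character(known_pairs$name1))

# Look up every name1; names without an entry come back as NA
new_name <- unname(lookup[as.character(df$name1)])

# Use the new name where one exists, otherwise keep the old name
df$name4 <- ifelse(is.na(new_name), as.character(df$name1), new_name)

print(df)
```

Any name1 value that never appears next to a name3 value has no entry in `lookup`, so it is kept unchanged in name4.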

Bart

Fill the NAs within each name1 group, then use coalesce from right to left to get the latest name for the name4 column:

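library(tidyverse)

df %>% 
  group_by(name1) %>% 
  # copy the known name2/name3 values to the NA rows of the same group
  fill(name2, name3, .direction = "downup") %>% 
  # take the first non-NA value, checking name3, then name2, then name1
  mutate(name4 = coalesce(name3, name2, name1))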

## A tibble: 8 x 4
## Groups:   name1 [2]
#  name1 name2 name3 name4
#  <chr> <chr> <chr> <chr>
#1 a     a     b     b    
#2 a     a     b     b    
#3 a     a     b     b    
#4 a     a     b     b    
#5 c     c     d     d    
#6 c     c     d     d    
#7 c     c     d     d    
#8 c     c     d     d    
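
The comments raise the concern that coalesce might drop names that have no replacement. A small sketch of how the same pipeline could behave in that case, using a made-up data frame `df2` with an extra group `e` that has no name2 or name3 values (both the data and the `ungroup()` step are assumptions added for illustration):

```r
library(dplyr)
library(tidyr)

# Hypothetical data: group "e" has no new name recorded anywhere
df2 <- data.frame(
  name1 = c("a", "a", "c", "c", "e", "e"),
  name2 = c(NA, "a", "c", NA, NA, NA),
  name3 = c(NA, "b", "d", NA, NA, NA)
)

result <- df2 %>%
  group_by(name1) %>%
  # fill NAs within each group; an all-NA group stays NA
  fill(name2, name3, .direction = "downup") %>%
  ungroup() %>%
  # first non-NA of name3, name2, name1, so name1 is the fallback
  mutate(name4 = coalesce(name3, name2, name1))

print(result$name4)
```

Because name1 is the last argument to `coalesce`, rows in group `e` fall back to their original name rather than being lost.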
zx8754