3

Absolute newbie to R. I have a dataframe that has some common values in one column(C1), but only one of the corresponding column has a value(C2), so I want to paste that value to all of the empty/NA spaces in C2 based on same value in C1. This would make more sense:

df:

C1    C2  
A      NA  
A      val10  
A      NA  
B      val14  
B      NA  
B      NA  
B      NA  
C      NA  
C      val9  

What I wanted it to look like is

C1   C2  
A    val10  
A    val10  
A    val10  
B    val14  
B    val14  
B    val14  
B    val14  
C    val9  
C    val9 

(C2 and C1 don't have any particular pattern or sequence between each other)

I'm assuming I would do a Group_by for C1, but I'm bit confused how to copy the values. Using transmute/mutate or paste. I tried a few iterations but wasn't successful.

Umar.H
  • 22,559
  • 7
  • 39
  • 74
peebs
  • 33
  • 3
  • careful with the tags, try to target the language that you're using, spamming tags isn't seen as good behaviour here so you'll get downvoted or flagged by a mod – Umar.H Nov 19 '21 at 03:58
  • I'm sorry, new here. I only tagged it to R. And PANDAS, since there is some commonality. But now I know. Thanks!! – peebs Nov 19 '21 at 04:00
  • commonality there may have, but you could argue that python and java have similarities, you still wouldn't tag a python question with java and so forth :) - no worries we all live and learn! – Umar.H Nov 19 '21 at 04:01
  • For future questions, it's helpful to provide the data in a way that can easily be used in R. To provide part of your data, you can just use `dput(head(df))`, then paste the result in with your question. See also https://stackoverflow.com/a/5963610/15293191 – AndrewGB Nov 19 '21 at 04:56

2 Answers2

3

You can use the fill function from tidyr, which makes it really easy to take care of the NAs.

library(tidyr)
library(dplyr)

df %>%
  dplyr::group_by(C1) %>%
  tidyr::fill(C2) %>% #default direction down
  tidyr::fill(C2, .direction = "up")

Output

# A tibble: 9 × 2
# Groups:   C1 [3]
  C1    C2   
  <chr> <chr>
1 A     val10
2 A     val10
3 A     val10
4 B     val14
5 B     val14
6 B     val14
7 B     val14
8 C     val9 
9 C     val9

Data

df <- structure(list(C1 = c("A", "A", "A", "B", "B", "B", "B", "C", 
"C"), C2 = c(NA, "val10", NA, "val14", NA, NA, NA, NA, "val9"
)), class = "data.frame", row.names = c(NA, -9L))
AndrewGB
  • 16,126
  • 5
  • 18
  • 49
1

I doubt this is the most elegant solution, but a Tidyverse-style method could be:

df <- tibble::tribble(
                      ~C1,     ~C2,
                      "A",      NA,
                      "A", "val10",
                      "A",      NA,
                      "B", "val14",
                      "B",      NA,
                      "B",      NA,
                      "B",      NA,
                      "C",      NA,
                      "C",  "val9"
                      )

df %>%
    filter(!is.na(C2)) %>%
    rename(C3 = C2) %>%
    right_join(df) %>%
    select(-C2) %>%
    rename(C2 = C3)

Which gives you:

enter image description here