EDITED

I have a data frame whose identification variable (ID) contains duplicates. How can I create a new variable (VAR2) in which the NAs in VAR1 are filled in based on this identification variable?

    df <- data.frame(
      ID = c(1,2,3,4,4,4,7,8,9,10),
      VAR1 = c("a","b","c","d",NA,NA,"g","h","i","j")
    )

The data frame looks like this:

   ID VAR1 

    1   a        
    2   b      
    3   c       
    4   d       
    4   NA     
    4   NA    
    7   g      
    8   h      
    9   i    
   10   j    

The expected output:

   ID VAR1 

    1   a        
    2   b      
    3   c       
    4   d       
    4   d     
    4   d    
    7   g      
    8   h      
    9   i    
   10   j    
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Feb 08 '18 at 20:13
  • look at this example [here](https://stackoverflow.com/questions/48587639/how-to-propagate-value-of-a-cell-to-other-rows-based-on-criteria-in-r/48587754#48587754). yours will be `transform(dat,var2=zoo::na.locf(var1))` – Onyambu Feb 08 '18 at 20:55
  • @Onyambu Thanks for the example. I edited my post so you can reproduce the example. I think your code is not exactly what I am looking at. – ChubStewey Feb 09 '18 at 13:14

1 Answer

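    library(data.table)

    # Read the example data, then drop VAR1 so it can be rebuilt from VAR2
    df <- fread('ID VAR1 VAR2
        1   a    a
        2   b    b
        3   c    c
        4   d    d
        4   NA   d
        4   NA   d
        7   g    g
        8   h    h
        9   i    i
       10   j    j')[, -'VAR1']
    df

    # Within each ID, keep VAR2 on the first row and set the later rows to NA
    df[, VAR1 := replace(VAR2, seq_len(.N) > 1, NA), by = ID]
    df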
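
The edited question goes the other way: it starts from VAR1 with NAs and wants them filled within each ID. A minimal sketch of that direction follows. It assumes each ID has at most one distinct non-missing VAR1 value (if there are several, the first one wins), and the name `dt` is used only for illustration.

```r
library(data.table)

# Example data from the question (assumption: at most one distinct
# non-missing VAR1 value per ID)
dt <- data.table(
  ID   = c(1, 2, 3, 4, 4, 4, 7, 8, 9, 10),
  VAR1 = c("a", "b", "c", "d", NA, NA, "g", "h", "i", "j")
)

# Within each ID, take the first non-missing VAR1 and recycle it over
# every row of the group; a group with no non-missing value stays NA
dt[, VAR2 := VAR1[!is.na(VAR1)][1L], by = ID]
dt
```

Unlike `zoo::na.locf` applied to the whole column, this never carries a value across from a different ID, because the fill is computed separately for each group.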
IceCreamToucan