-1

My data

Type1   Type2   Type3       Expected_Output                 
Red     Orange  Pink        Pink                
Green   abc     na          abc             
Blue     na     na          Blue                
white    na     Green       Green               
na     Brown    purple      purple              
na     black     na         black               
grey    na       na         grey                

How to Achieve the following expected output:

  1. If type3 is filled than expected output should be type3's content
  2. If type2 is filled and type3 is blank(na) than expected output should be type2's content
  3. If type3 and type2 is blank(na) than expcetd output should be type1's content
David Arenburg
  • 91,361
  • 17
  • 137
  • 196

3 Answers3

2

A data.frame which consists of multiple columns of the same data type is an indication that the data might be reshaped from wide to long format. Therefore, melt() is used to get rid of the NAs and join to add the new column to the original data.frame:

library(data.table)
DT[melt(DT[, rn := .I], id.vars = "rn", na.rm = TRUE)[
    order(variable), .(New = last(value)), by = rn], on = .(rn)][, rn := NULL][]
   Type1  Type2  Type3 Expected_Output    New
1:   Red Orange   Pink            Pink   Pink
2: Green    abc     NA             abc    abc
3:  Blue     NA     NA            Blue   Blue
4: white     NA  Green           Green  Green
5:    NA  Brown purple          purple   grey
6:    NA  black     NA           black purple
7:  grey     NA     NA            grey  black

Data

fread() is used to read the sample data set. The na.strings parameter tells fread() to convert the "na" strings to NA:

library(data.table)
DT <- fread(
"Type1   Type2   Type3       Expected_Output                 
Red     Orange  Pink        Pink                
Green   abc     na          abc             
Blue     na     na          Blue                
white    na     Green       Green               
na     Brown    purple      purple              
na     black     na         black               
grey    na       na         grey    ",
na.strings = "na")

Edit

The OP has requested that also lines which are fully NA should appear in the output. This can be achieved by changing the order of data.table objects in the right join. In data.table syntax, X[Y] is a right join which takes all rows of Y. If all rows of X are required, the right join Y[X] has to be used

library(data.table)
# new data with 8th row
DT <- fread(
  "Type1   Type2   Type3       Expected_Output                 
  Red     Orange  Pink        Pink                
  Green   abc     na          abc             
  Blue     na     na          Blue                
  white    na     Green       Green               
  na     Brown    purple      purple              
  na     black     na         black               
  grey    na       na         grey    
  na      na       na         na",
  na.strings = "na")

melt(DT[, rn := .I], id.vars = "rn", na.rm = TRUE)[
  order(variable), .(New = last(value)), by = rn][DT, on = .(rn)][, rn := NULL][]
      New Type1  Type2  Type3 Expected_Output
1:   Pink   Red Orange   Pink            Pink
2:    abc Green    abc     NA             abc
3:   Blue  Blue     NA     NA            Blue
4:  Green white     NA  Green           Green
5: purple    NA  Brown purple          purple
6:  black    NA  black     NA           black
7:   grey  grey     NA     NA            grey
8:     NA    NA     NA     NA              NA
Uwe
  • 41,420
  • 11
  • 90
  • 134
  • if suppose i have 8th row as Type1 - NA , Type2 -NA , Type3- NA , I wnt the Exp_Output to be NA only , but the above code will erase the 8th record itself – Dinesh Kumar Jul 14 '17 at 07:39
1

You can do that with last from dplyr inside an apply call. Make sure your na are NA so that na.omit ignores them.

library(dplyr)
df[df=="na"] <- NA #change "na" to NA
df$expected2 <-apply(df[,1:3],1,function(x) last(na.omit(x)))

  Type1  Type2  Type3 Expected_Output expected2
1   Red Orange   Pink            Pink      Pink
2 Green    abc   <NA>             abc       abc
3  Blue   <NA>   <NA>            Blue      Blue
4 white   <NA>  Green           Green     Green
5  <NA>  Brown purple          purple    purple
6  <NA>  black   <NA>           black     black
7  grey   <NA>   <NA>            grey      grey

data

df <- read.table(text="Type1   Type2   Type3 Expected_Output                 
Red     Orange  Pink        Pink                
                 Green   abc     na          abc             
                 Blue     na     na          Blue                
                 white    na     Green       Green               
                 na     Brown    purple      purple              
                 na     black     na         black               
                 grey    na       na         grey ",header=TRUE,stringsAsFactors=FALSE)
Pierre Lapointe
  • 16,017
  • 2
  • 43
  • 56
  • hi @ P Lapointe am getting error message when i try the same code today ,Error message - Error: Don't know how to generate default for object of class character - how to resolve this ? – Dinesh Kumar Jul 14 '17 at 07:54
0

Here is another simple solution based on ifelse; hope it helps.

DT$new <- ifelse(is.na(DT$Type3), ifelse(is.na(DT$Type2),DT$Type1,DT$Type2),DT$Type3)

enter image description here

Sam S.
  • 627
  • 1
  • 7
  • 23