
I'm very new to the R programming language; please help me achieve the task below. Note: I need a dynamic solution, because in the future the data will have more than 1000 columns.

My data frame:

A_ID    Queues  COL1    COL2

 1     First      Z        Z
 1     First      T        Y
 2     Second     C        C
 1     Second     A        H
 1     First      B        C
 2     Second     N        F
 2     Second     I        K
 2     Second     M        A
 3     First      A        Y
 3     First      P        L
 3     First      L        H
 4     First      R        J
 4     First      U        J

dput output of the data frame:

structure(list(A_ID = c(1, 1, 2, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4
), Queues = c("First", "First", "Second", "Second", "First", 
"Second", "Second", "Second", "First", "First", "First", "First", 
"First"), COL1 = c("Z", "T", "C", "A", "B", "N", "I", "M", "A", 
"P", "L", "R", "U"), COL2 = c("Z", "Y", "C", "H", "C", "F", "K", 
"A", "Y", "L", "H", "J", "J")), .Names = c("A_ID", "Queues", 
"COL1", "COL2"), row.names = c(NA, -13L), class = "data.frame")

First, identify changed and unchanged values within each group (1 = changed from the previous row, 0 = unchanged):

A_ID    Queues   COL1      COL2   COL1Changedval   COL2changedval    
 1        First      Z        Z      0                    0
 1        First      T        Y      1                    1
 1        First      B        C      1                    1
 1        Second     A        H      0                    0  

 2        Second     C        C      0                    0
 2        Second     N        F      1                    1
 2        Second     I        K      1                    1
 2        Second     M        A      1                    1

 3        First      A        Y      0                    0
 3        First      P        L      1                    1 
 3        First      L        H      1                    1

 4        First      R        J      0                    0
 4        First      U        J      1                    0

The output data frame should be:

A_ID    Queues     COL1Changedval     COL2changedval

 1        First        2                  2              
 1        Second       0                  0  
 2        Second       3                  3
 3        First        2                  2
 4        First        1                  0
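For reference, the whole transformation can be sketched in base R with no packages, using the dput data above (the helper name count_changes is just illustrative):

```r
# The data frame from the dput output above
df <- structure(list(A_ID = c(1, 1, 2, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4),
                     Queues = c("First", "First", "Second", "Second", "First",
                                "Second", "Second", "Second", "First", "First",
                                "First", "First", "First"),
                     COL1 = c("Z", "T", "C", "A", "B", "N", "I", "M", "A",
                              "P", "L", "R", "U"),
                     COL2 = c("Z", "Y", "C", "H", "C", "F", "K", "A", "Y",
                              "L", "H", "J", "J")),
                row.names = c(NA, -13L), class = "data.frame")

# Number of positions where a value differs from the one before it
count_changes <- function(x) sum(x[-1] != x[-length(x)])

# aggregate() applies count_changes to every non-grouping column per group,
# so additional columns need no hand-listing
res <- aggregate(df[setdiff(names(df), c("A_ID", "Queues"))],
                 by = df[c("A_ID", "Queues")],
                 FUN = count_changes)
res
```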
  • Possible duplicate of [How to create a lag variable within each group?](https://stackoverflow.com/questions/26291988/how-to-create-a-lag-variable-within-each-group) – lebelinoz Sep 14 '17 at 04:43

2 Answers


We need to group by 'A_ID' and 'Queues', check whether each value in the "COL" columns differs from the adjacent one, and sum. Since .SD covers all non-grouping columns, this scales to any number of columns.

library(data.table)

# For each group, compare each value to the previous one (shift);
# filling with x[1] makes the first comparison FALSE, so sum() counts changes
res <- setDT(df1)[, lapply(.SD, function(x) sum(x != shift(x, fill = x[1]))),
                  by = .(A_ID, Queues)][order(A_ID)]

# Append "ChangedVal" to the value-column names
setnames(res, 3:ncol(res), paste0(names(res)[3:ncol(res)], "ChangedVal"))[]
#   A_ID Queues COL1ChangedVal COL2ChangedVal
#1:    1  First              2              2
#2:    1 Second              0              0
#3:    2 Second              3              3
#4:    3  First              2              2
#5:    4  First              1              0
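The shift() comparison underlying this answer can be illustrated on one group's values in base R: lag the vector by one, filling the first slot with the first value, so only genuine changes count.

```r
# COL1 values for the group A_ID = 1, Queues = "First"
x <- c("Z", "T", "B")

# Base-R equivalent of data.table's shift(x, fill = x[1]):
# each element is paired with its predecessor; the first with itself
lagged <- c(x[1], x[-length(x)])

sum(x != lagged)  # 2 changes: Z -> T and T -> B
```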
akrun

A dplyr-based solution:

As @akrun described the flow: group by A_ID and Queues, compare consecutive values within each column, report 1 when they change, and sum.

So I created myfun to do the same:

# Count how many times consecutive values in a vector change
myfun <- function(vec) {
  if (length(vec) == 1) {
    return(0)
  }
  # Compare each element to its successor and count the mismatches
  sum(vec[-length(vec)] != vec[-1])
}

Then using dplyr verbs:

library(dplyr)

unique(df %>%
         group_by(A_ID, Queues) %>%
         mutate(Col1changedval = myfun(COL1),
                Col2changedval = myfun(COL2)) %>%
         select(A_ID, Queues, Col1changedval, Col2changedval))

# A tibble: 5 x 4
# Groups:   A_ID, Queues [5]
#   A_ID Queues Col1changedval Col2changedval
#  <dbl>  <chr>          <dbl>          <dbl>
#1     1  First              2              2
#2     2 Second              3              3
#3     1 Second              0              0
#4     3  First              2              2
#5     4  First              1              0
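With current dplyr (1.0+), the same idea extends to any number of columns via summarise() and across(), so nothing has to be listed by hand. A sketch, using a minimal reconstruction of the data and helper from above:

```r
library(dplyr)

# Minimal reconstruction of the data and the myfun helper from the answer
df <- data.frame(A_ID = c(1, 1, 1, 2),
                 Queues = c("First", "First", "First", "Second"),
                 COL1 = c("Z", "T", "B", "C"),
                 COL2 = c("Z", "Y", "C", "C"))
myfun <- function(vec) sum(vec[-1] != vec[-length(vec)])

# across() applies myfun to every COL* column within each group;
# .names appends "changedval" to each original column name
res <- df %>%
  group_by(A_ID, Queues) %>%
  summarise(across(starts_with("COL"), myfun, .names = "{.col}changedval"),
            .groups = "drop")
res
```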
tushaR