1

The 4th column is my desired column. Video,Webinar,Meeting,Conference are the 4 type of activities that the different customers(names) can engage in. You can see,in a given row, all the column names with zero value are in the final column(NextStep) and the value there(character string separated by commas) excludes the column name with non-zero value. The character strings(column names) in the final column usually appear in the column order with two exceptions. Webinar always appears first if it has a zero value and video always appears last if it has a zero value.

    library(data.table)
     dt <- fread('
 Name     Video   Webinar Meeting Conference   NextStep
  John       1         0        0       0         Webinar,Meeting,Conference
  John       1         1        0       0         Meeting,Conference
  John       1         1        1       0         Conference      
  Tom        0         0        1       0         Webinar,Conference,Video
  Tom        0         0        1       1         Webinar,Video   
  Kyle       0         0        0       1         Webinar,Meeting,Video

                                    ')

My question is how to create the next step column. Thanks a lot for your help!

gibbz00
  • 1,947
  • 1
  • 19
  • 31

3 Answers3

4

In case you are looking for a way to do this without simply re-ordering the columns in the order you want (in fact I see no reason why not to do so, but anyway..) you could try the following approach. It melts and updates by reference in a join:

lvls <- c("Webinar", "Meeting", "Conference", "Video")  # make sure order is correct
dt[, row := .I]   # add a row-identifier
dtm <- melt(dt, id.vars = c("Name", "row"), measure.vars = lvls) # melt to long format
# summarise dtm by using factor, sorting it and converting to strin; then join to dt
dt[dtm[value == 0, list(NextStep2 = toString(sort(factor(variable, levels = lvls)))), 
    by = row], NextStep2 := NextStep2, on = "row"][, row := NULL]

#    Name Video Webinar Meeting Conference                   NextStep                    NextStep2
# 1: John     1       0       0          0 Webinar,Meeting,Conference Webinar, Meeting, Conference
# 2: John     1       1       0          0         Meeting,Conference          Meeting, Conference
# 3: John     1       1       1          0                 Conference                   Conference
# 4:  Tom     0       0       1          0   Webinar,Conference,Video   Webinar, Conference, Video
# 5:  Tom     0       0       1          1              Webinar,Video               Webinar, Video
# 6: Kyle     0       0       0          1      Webinar,Meeting,Video      Webinar, Meeting, Video

If you want to paste all column names as in the data for those cases where there's no activity, you can add the following line to your code:

dt[rowSums(dt[, mget(lvls)]) == 0, NextStep2 := toString(names(dt)[2:5])]
Jaap
  • 81,064
  • 34
  • 182
  • 193
talat
  • 68,970
  • 21
  • 126
  • 157
  • Thank you for the great solution. For someone who has not any activity, so all the column are zero, in that case I want to do video,webinar,meeting,conference just as the column order appears in the original dataframe. – gibbz00 Jun 09 '16 at 20:25
  • Thank you so much! – gibbz00 Jun 10 '16 at 13:53
3

A possible solution:

DT[, nextstep := paste0(names(.SD)[.SD==0], collapse = ','), 1:nrow(DT), .SDcols = 2:5][]

which gives:

   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Video,Webinar,Conference
5:  Tom     0       0       1          1              Video,Webinar
6: Kyle     0       0       0          1      Video,Webinar,Meeting

When you want to order the names as you specified in the comments, you can do:

lvls <- c('Webinar', 'Meeting', 'Conference', 'Video')
DT[, nextstep := paste0(lvls[lvls %in% names(.SD)[.SD==0]], collapse = ','), 
   1:nrow(DT), .SDcols = 2:5][]

which gives:

   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Webinar,Conference,Video
5:  Tom     0       0       1          1              Webinar,Video
6: Kyle     0       0       0          1      Webinar,Meeting,Video

Instead of using paste0 (with collapse = ',') you can also use toString.


Used data:

DT <- fread('Name     Video   Webinar  Meeting  Conference
             John       1         0        0        0
             John       1         1        0        0
             John       1         1        1        0
             Tom        0         0        1        0
             Tom        0         0        1        1
             Kyle       0         0        0        1')
Jaap
  • 81,064
  • 34
  • 182
  • 193
  • Thank you for the awesome solution. Is there any way to make Webinar always appear first and Video always appear last in the nextstep2 field? – gibbz00 Jun 09 '16 at 19:58
  • @gibbz00 and what about 'Meeting' and 'Conference'? – Jaap Jun 09 '16 at 20:04
  • In the case of meeting and conference, we can resort to the column order where meeting comes before conference – gibbz00 Jun 09 '16 at 20:20
1

Here you go:

setcolorder(dt, c("Name", "Webinar", "Meeting", "Conference", "Video", "NextStep"))
dt[, NextStepNew:=apply(dt, 1, function(x) paste0(names(x)[x==0], collapse=","))][]
   Name Webinar Meeting Conference Video                   NextStep                NextStepNew
1: John       0       0          0     1 Webinar,Meeting,Conference Webinar,Meeting,Conference
2: John       1       0          0     1         Meeting,Conference         Meeting,Conference
3: John       1       1          0     1                 Conference                 Conference
4:  Tom       0       1          0     0   Webinar,Conference,Video   Webinar,Conference,Video
5:  Tom       0       1          1     0              Webinar,Video              Webinar,Video
6: Kyle       0       0          1     0      Webinar,Meeting,Video      Webinar,Meeting,Video
ytk
  • 2,787
  • 4
  • 27
  • 42