R dplyr: Find a specific value in a column, then replace the adjacent cell in the subsequent columns to the right with that value

Question

I am trying to create a matrix of site and time-of-event. In my case, once the event has occurred ("1") it is permanent and cannot go back to "0". Once a cell in a column is "1", I want to populate the adjacent cells in all subsequent columns to the right with "1" (see the example below).

site <- c('A','B','C','D','E','F','G') #site
time <- c(0,1,4,0,3,2,0) # time at which the event occurred
event <- c(0,1,1,0,1,1,0) # did a event occur
data <- data.frame(site, time, event)

library(reshape) # for cast()
site.time.matrix <- cast(data, site~time)

# This is the output      # This is the desired output
#site   0  1  2  3  4     #site   0  1  2  3  4
#    A  0 NA NA NA NA     #    A  0  0  0  0  0
#    B NA  1 NA NA NA     #    B  0  1  1  1  1
#    C NA NA NA NA  1     #    C  0  0  0  0  1
#    D  0 NA NA NA NA     #    D  0  0  0  0  0
#    E NA NA NA  1 NA     #    E  0  0  0  1  1
#    F NA NA  1 NA NA     #    F  0  0  1  1  1
#    G  0 NA NA NA NA     #    G  0  0  0  0  0

I have found some promising code using dplyr (e.g. "Replacing more than one elements with replace function" and "Apply function to each column in a data frame observing each columns existing data type") which replaces values, but I am unsure how to target the adjacent cells in the subsequent columns.

My apologies if this question is unclear, this is my first post on StackOverflow.

Thank you.

score 3 · Accepted Answer · answered Oct 19 '16 at 11:08

It was a welcome surprise for a first post to be detailed, reproducible and interesting, +1!

With na.locf from the zoo package you could do:

library(reshape) # for cast function
library(zoo)     # for na.locf: fills each NA with the last observation carried forward, see ?na.locf

site <- c('A','B','C','D','E','F','G') #site
time <- c(0,1,4,0,3,2,0) # time at which the event occurred
event <- c(0,1,1,0,1,1,0) # did a event occur
data <- data.frame(site, time, event)

site.time.matrix <- reshape::cast(data, site~time)

site.time.matrix.fill <- site.time.matrix


# Transpose the matrix excluding the first column, carry the last observation forward,
# and transpose again to return to the original matrix structure.
# na.rm = FALSE keeps leading NAs, so the dimensions are preserved.

site.time.matrix.fill[,-1] <- t(na.locf(t(site.time.matrix.fill[,-1]), na.rm = FALSE))

site.time.matrix.fill[is.na( site.time.matrix.fill)] <- 0

site.time.matrix.fill

#  site 0 1 2 3 4
#1    A 0 0 0 0 0
#2    B 0 1 1 1 1
#3    C 0 0 0 0 1
#4    D 0 0 0 0 0
#5    E 0 0 0 1 1
#6    F 0 0 1 1 1
#7    G 0 0 0 0 0
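
To make the "carry forward" step concrete, here is a minimal base R sketch of what happens to a single row. The helper `locf` is a hypothetical hand-rolled stand-in written for illustration only; it is not zoo's implementation and covers only the simple vector case.

```r
# Hypothetical helper (not part of zoo): last observation carried forward
locf <- function(x) {
  # index of the most recent non-NA value seen so far (0 if none yet)
  idx <- cummax(seq_along(x) * !is.na(x))
  # prepend NA so that index 0 maps to NA (leading NAs stay NA)
  c(NA, x)[idx + 1]
}

row_B <- c(NA, 1, NA, NA, NA)   # site B: event at time 1
filled <- locf(row_B)           # NA 1 1 1 1
filled[is.na(filled)] <- 0      # 0 1 1 1 1

row_A <- c(0, NA, NA, NA, NA)   # site A: no event
locf(row_A)                     # 0 0 0 0 0
```

Leading NAs are left in place by the fill, which is why the answer sets the remaining NAs to 0 in a separate step.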

Thanks Osssan, I have never heard of the 'zoo' package, this is exactly what I needed. These are all really great answers, I love seeing all the different ways you can do the same thing :) — CarlaBirdy, Oct 19 '16 at 22:26

score 1 · Answer 2 · answered Oct 19 '16 at 11:32

A base R approach using apply.

Basically, for every row we find the first element equal to 1, then assign 0 to every element to its left and 1 to that element and every element to its right. Rows without a 1 become all 0.

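# drop the site column and use it as row names
m <- as.matrix(site.time.matrix[-1])
rownames(m) <- as.character(site.time.matrix$site)

t(apply(m, 1, function(x) {
       # index just before the first 1, or the last index if the row has no 1
       temp = if(any(x == 1, na.rm = TRUE)) which(x==1)[1]-1 else length(x)
       x[temp:length(x)] <- 1
       x[0:temp] <- 0
       x
}))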


#  0 1 2 3 4
#A 0 0 0 0 0
#B 0 1 1 1 1
#C 0 0 0 0 1
#D 0 0 0 0 0
#E 0 0 0 1 1
#F 0 0 1 1 1
#G 0 0 0 0 0
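
The row-wise rule described above can also be written as a small standalone function, which makes it easy to test on one row at a time. This is a sketch under the assumption that each row contains only 0, 1 and NA; `fill_right` is a hypothetical name, not a function from the answer or from any package.

```r
# Hypothetical helper: 0 before the first 1, 1 from the first 1 onwards
fill_right <- function(x) {
  first <- which(x == 1)[1]          # position of the first 1, NA if there is none
  out <- rep(0, length(x))           # start with all zeros
  if (!is.na(first)) out[first:length(x)] <- 1
  out
}

fill_right(c(NA, NA, 1, NA, NA))     # 0 0 1 1 1
fill_right(c(0, NA, NA, NA, NA))     # 0 0 0 0 0
```

Because the function builds a fresh vector, it drops any names on the input; inside `apply` you would need to restore the column names afterwards.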

Ronak Shah

Thank you for your answer Ronak. I love seeing all the different ways you can do the same thing. I don't have much experience using the apply function, it's something I am hoping to improve on so thank you for helping me with this :) – CarlaBirdy Oct 19 '16 at 22:29

lmo · Answer 3 · 2016-10-19T12:51:28.773

Here is a second base R method (excluding the reshaping). This uses apply and cummax (cumulative maximum). If only one event occurs for each site, then replacing cummax with cumsum would return the same result.

# reshape the data
temp <- cast(data, site~time)

# construct matrix of 0s and 1s
myMat <- as.matrix(temp[-1])
myMat[is.na(myMat)] <- 0

# expand 1s to the right when they appear
myMat <- t(apply(myMat, 1, cummax))

# add row and column names (site labels and the time values from the cast columns)
dimnames(myMat) <- list(as.character(temp$site), names(temp)[-1])

This returns

myMat
  0 1 2 3 4
A 0 0 0 0 0
B 0 1 1 1 1
C 0 0 0 0 1
D 0 0 0 0 0
E 0 0 0 1 1
F 0 0 1 1 1
G 0 0 0 0 0
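
The remark about cummax versus cumsum can be checked on single rows. The two vectors below are made-up illustrations, not data from the question.

```r
one_event  <- c(0, 0, 1, 0, 0)   # a single event at the third time point
two_events <- c(0, 1, 0, 1, 0)   # events at the second and fourth time points

cummax(one_event)    # 0 0 1 1 1
cumsum(one_event)    # 0 0 1 1 1  (same result when there is only one event)

cummax(two_events)   # 0 1 1 1 1
cumsum(two_events)   # 0 1 1 2 2  (counts events, so it is no longer a 0/1 indicator)
```

So cummax is the safer choice if a site could ever record more than one event.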

Note: The reshaping (with cast) can also be performed with the base R reshape function, but you have to also re-order the variables afterward. For example,

# reshape data
temp <- reshape(data, direction="wide", idvar="site")
# reorder variables
temp <- temp[c("site", sort(names(temp)[-1]))]

would produce the expected data frame.

  site event.0 event.1 event.2 event.3 event.4
1    A       0      NA      NA      NA      NA
2    B      NA       1      NA      NA      NA
3    C      NA      NA      NA      NA       1
4    D       0      NA      NA      NA      NA
5    E      NA      NA      NA       1      NA
6    F      NA      NA       1      NA      NA
7    G       0      NA      NA      NA      NA
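
One caveat on the re-ordering step: sort() orders the column names as text, which works here because every time value is a single digit. A sketch of a numeric ordering, using made-up column names that include a two-digit time:

```r
nms <- c("event.0", "event.10", "event.2")   # illustrative names only

sort(nms)                                    # "event.0" "event.10" "event.2"

# order by the numeric suffix instead of alphabetically
ord <- order(as.numeric(sub("event.", "", nms, fixed = TRUE)))
nms[ord]                                     # "event.0" "event.2" "event.10"
```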

@RonakShah Thanks. I didn't check the results of `reshape` closely enough. — lmo, Oct 19 '16 at 12:32
I love seeing all the different ways you can do the same thing. Thank you for your answer, I look forward to going over all the different types of code today :) — CarlaBirdy, Oct 19 '16 at 22:31
