I have a data frame with 627 observations and 16 variables. I am working with one column named "ZoneDivision", a factor with three levels: North Eastern, Eastern and South Eastern. I want to compare each row with the row before it and create a new column that is 1 if the two adjacent rows have the same zone and 0 if they differ.

I referred to the following questions to find a way out: "Matching two Columns in R" and "compare row values over multiple rows (R)".

library(dplyr)
a <- c(rep("Eastern", 3), rep("North Eastern", 6), rep("South Eastern", 3))
a <- data.frame(a)
colnames(a) <- "ZoneDivision"

# comparing the zones (a is the example data frame built above)
library(plyr)
ddply(a, .(ZoneDivision), summarize, ZoneMatching = Position(isTRUE, ZoneDivision))


Expected Result

    ZoneDivision ZoneMatching
1        Eastern           NA
2        Eastern            1
3        Eastern            1
4  North Eastern            0
5  North Eastern            1
6  North Eastern            1
7  North Eastern            1
8  North Eastern            1
9  North Eastern            1
10 South Eastern            0
11 South Eastern            1
12 South Eastern            1

Actual Result
    ZoneDivision ZoneMatching
1       Eastern           NA
2 North Eastern           NA
3 South Eastern           NA

How should I proceed? Please help!!

Ami

4 Answers

Using base R, we can do

as.numeric(c(NA, a$ZoneDivision[-1] == a$ZoneDivision[-nrow(a)]))
#[1] NA  1  1  0  1  1  1  1  1  0  1  1
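
The comparison above drops the first element from one copy of the column and the last element from the other, so position i in the result compares row i + 1 with row i; the leading NA stands for the first row, which has no predecessor. A minimal sketch of the same idea wrapped in a function follows. The name adjacent_match is hypothetical (it is not part of base R or of this answer), and converting to character first is an assumption made so that factor and string columns behave alike.

```r
# Hypothetical helper: flags whether each element equals the one before it.
adjacent_match <- function(x) {
  x <- as.character(x)             # compare labels, not factor codes
  n <- length(x)
  if (n == 0) return(numeric(0))   # nothing to compare in an empty vector
  # x[-1] drops the first element, x[-n] drops the last,
  # so the two vectors line up as (current row, previous row)
  c(NA, as.numeric(x[-1] == x[-n]))
}

zones <- c(rep("Eastern", 3), rep("North Eastern", 6), rep("South Eastern", 3))
a <- data.frame(ZoneDivision = zones)
a$ZoneMatching <- adjacent_match(a$ZoneDivision)
print(a)
```

The explicit length check is there because c(NA, ...) would otherwise return one value for an empty input.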
Ronak Shah

The data.table way:

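library(data.table)

a <- c(rep("Eastern", 3), rep("North Eastern", 6), rep("South Eastern", 3))
dt <- as.data.table(a)  # single column, named a

# compare each value with the previous one; the first row becomes NA
dt[, ZoneMatching := as.numeric(a == shift(a, 1))]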

This adds a new ZoneMatching column holding the numeric value of the logical comparison between the a column and its lagged values, which the shift() function generates.

Newl

You can use lag to get that:

library(dplyr)
a %>%
  mutate(ZoneMatching = as.numeric((ZoneDivision == lag(ZoneDivision, 1))))
    ZoneDivision ZoneMatching
1        Eastern           NA
2        Eastern            1
3        Eastern            1
4  North Eastern            0
5  North Eastern            1
6  North Eastern            1
7  North Eastern            1
8  North Eastern            1
9  North Eastern            1
10 South Eastern            0
11 South Eastern            1
12 South Eastern            1
Sonny

We can use base R

with(a, c(NA, +(head(ZoneDivision, -1) == tail(ZoneDivision, -1))))
#[1] NA  1  1  0  1  1  1  1  1  0  1  1
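
To unpack the one-liner above, here is a small sketch on a shortened, made-up vector (x, same and flags are illustrative names, not from the question). It shows that head(x, -1) and tail(x, -1) produce the two offset copies, and that the unary + coerces the logical result to integer 1/0, whereas as.numeric() would give doubles.

```r
x <- c("Eastern", "Eastern", "North Eastern", "North Eastern")

head(x, -1)   # all but the last element
tail(x, -1)   # all but the first element

# element-wise comparison of the two offset copies; one shorter than x
same <- head(x, -1) == tail(x, -1)

# unary + turns TRUE/FALSE into 1L/0L; NA fills the first position
flags <- c(NA, +same)
print(flags)
```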
akrun