I am trying to see (not delete) the rows that are duplicated in a data frame. The problem is that duplicated() treats the first row in each duplicate group as the original and does not flag it, so I only get the later copies.
I need to see every row that has a duplicate, including the first occurrence. I have searched Stack Overflow and Google and haven't found a solution. Does anyone know of a way to do this? Thanks in advance.
Data:
> dput(testx1)
structure(list(DISASTER_NUMBER = c(1921L, 1921L, 1921L, 1921L,
1921L, 1921L, 1921L, 1921L, 1922L, 1922L, 1922L, 1922L, 1922L,
1922L, 1922L, 1922L, 1922L, 1922L, 1922L, 1922L, 1922L, 1922L,
1922L, 1922L, 1922L, 1922L, 1922L, 1922L, 1922L, 1922L, 1922L
), PW_NUMBER = c(498L, 500L, 501L, 502L, 510L, 519L, 542L, 542L,
1L, 1L, 7L, 7L, 7L, 9L, 9L, 9L, 9L, 14L, 14L, 15L, 15L, 15L,
16L, 16L, 16L, 17L, 17L, 18L, 18L, 18L, 18L), VERSION_NUMBER = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L,
0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L), PROJECT_AMOUNT = c(53388,
84, 2912, 13555, 12921, 53068, 1738887, 23101, 12792, 3986, 14701,
13544, 18120, 20066, 525251, 0, 11976, 16016, 12025, 3363, 29894,
23845, 4120, 3550, 2261, 3327, 17521, 2670, 54467, 163913, 220707
), TOTAL_ELIGIBLE = c(53388, 84, 2912, 13555, 12921, 53068, 1738887,
23101, 12792, 3986, 14701, 13544, 18120, 20066, 525251, 0, 11976,
16016, 12025, 3363, 29894, 23845, 4120, 3550, 2261, 3327, 17521,
2670, 54467, 163913, 220707), TOTAL_OBLIGATED = c(40041, 63,
2184, 10167, 9690, 39801, 1304165, 17326, 9594, 2990, 11025,
13544, 13590, 15050, 525251, 0, 8982, 12012, 9019, 2522, 29894,
23845, 3090, 3550, 2261, 3327, 13141, 2670, 40850, 122935, 0),
MITIGATION_COST = c(0, 0, 0, 13555, 2250, 0, 1028338, 0,
3987, 0, 18120, 18120, 0, 97426, 97426, 0, 0, 9060, 0, 19129,
19129, 0, 3966, 3966, 0, 8712, 8712, 18327, 18327, -10768,
0)), .Names = c("DISASTER_NUMBER", "PW_NUMBER", "VERSION_NUMBER",
"PROJECT_AMOUNT", "TOTAL_ELIGIBLE", "TOTAL_OBLIGATED", "MITIGATION_COST"
), row.names = 77710:77740, class = "data.frame")
Code:
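library(magrittr)  # provides %>%
# Keep only the key columns that define a duplicate
testx2.0 <- testx1 %>% subset(select = DISASTER_NUMBER:VERSION_NUMBER)
# duplicated() flags the second and later occurrences only, not the first
testx2.1 <- which(duplicated(testx2.0))
testx2.2 <- testx1[testx2.1, ]  # so the first row of each duplicate group is missing here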
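
To show the behaviour on something smaller, here is a minimal sketch using a made-up data frame (`toy` and its values are hypothetical, not taken from my real data). It only illustrates the problem and is not a fix:

```r
# Hypothetical toy data: rows 1, 2 and 4 share the same key columns
toy <- data.frame(
  DISASTER_NUMBER = c(1922L, 1922L, 1922L, 1922L),
  PW_NUMBER       = c(7L, 7L, 9L, 7L),
  VERSION_NUMBER  = c(0L, 0L, 0L, 0L)
)

# duplicated() marks only the repeats, never the first occurrence
flags <- duplicated(toy)
print(flags)
# [1] FALSE  TRUE FALSE  TRUE

# What I get: only rows 2 and 4
got <- toy[flags, ]
print(got)

# What I want instead: rows 1, 2 and 4, i.e. every row whose key
# combination appears more than once
```

In this example row 1 is dropped even though it belongs to the same duplicate group as rows 2 and 4, which is exactly what happens with testx1.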