0

I am trying to extract values of X1 for which X2 and X3 are same pair of values.

  X1 X2  X3
1  1  1   8
2  2  4   8
3  3  4   8
4  4 90 199
5  5  3   9 
6  6  3   9

I can give a command such that

subset(df, df$X2==4 & df$X3==8)

that will return me

  X1 X2 X3
2  2  4  8
3  3  4  8

But how can I make it dynamic? So It returns

  X1 X2 X3
2  2  4  8
3  3  4  8
5  5  3  9 
6  6  3  9
hariszaman
  • 8,202
  • 2
  • 40
  • 59
  • 1
    None of the rows in your example data frame have the X2 == X3 condition. `df` is also a non-ideal name for a data frame. `xdf[(xdf$X2 == xdf$X3),]` – hrbrmstr Oct 21 '17 at 13:08
  • 1
    It is not quite clear to me what rule should be applied here. Can you please describe it in more detail? – DatamineR Oct 21 '17 at 13:08
  • It looks like you're going for rows where X2 and X3 are duplicated, in which case take a look at this QA: https://stackoverflow.com/questions/7854433/finding-all-duplicate-rows-including-elements-with-smaller-subscripts – Nick Nimchuk Oct 21 '17 at 13:13
  • 2
    Try `df[ave(df$X1,df[,2:3],FUN=length)>1,]`. – nicola Oct 21 '17 at 13:14
  • I want to extract subset for which X2 and X3 are same. For example for rows 2,3,5,6 pair of X2, X3 are same i-e 4,8 and 3,9 respectively – hariszaman Oct 21 '17 at 13:15
  • `df$pastedcolumn <- paste(df$X2, df$X3); df[duplicated(df$pastedcolumn) | duplicated(df$pastedcolumn, fromLast = TRUE),]`. You can remove the `pastedcolumn` if not needed. – Ronak Shah Oct 21 '17 at 13:23

2 Answers2

1

A way to do that with dplyr. Count the number of occurence of each pair (X2,X3), filter pairs occuring more than once, join them in the initial data.frame :

df <- tribble(~X1, ~X2,  ~X3,
  1,  1,   8,
  2,  4,   8,
  3,  4,   8,
  4, 90, 199,
 5,  3,   9,
  6,  3,   9)

df2 <- df %>% 
  group_by(X2, X3) %>% 
  count() %>% 
  ungroup()
df2 
# A tibble: 4 x 3
#     X2    X3     n
#  <dbl> <dbl> <int>
#1     1     8     1
#2     3     9     2
#3     4     8     2
#4    90   199     1

df2 %>%  
  filter(n>1) %>% 
  select(-n) %>% 
  left_join(df, by = c("X2","X3"))

# A tibble: 4 x 3
#     X2    X3    X1
#  <dbl> <dbl> <dbl>
#1     3     9     5
#2     3     9     6
#3     4     8     2
#4     4     8     3
Flo.P
  • 371
  • 2
  • 7
0

INPUT

df <- data.frame(x1 = c(1,2,3,5,4,6), x2 = c(3,4,5,3,6,6), x3 = c(4,4,2,4,2,2))
df
#   x1 x2 x3
# 1  1  3  4
# 2  2  4  4
# 3  3  5  2
# 4  5  3  4
# 5  4  6  2
# 6  6  6  2

Paired x2 and x3

Creating df1 with x1 and pair of x2, x3 values as one column

df1 <- data.frame(x1 = df$x1, pair = paste(df$x2,df$x3, sep = ","))
df1
#   x1 pair
# 1  1  3,4
# 2  2  4,4
# 3  3  5,2
# 4  5  3,4
# 5  4  6,2
# 6  6  6,2

Required Output

Extracting rows from df where the frequency of a pair is greater than 1, i.e. if any x2, x3 pair has frequency greater than 1 those pairs would be extracted from df with the help fo df1

df[df1$pair %in% names(which(table(df1$pair) > 1)),]
#   x1 x2 x3
# 1  1  3  4
# 4  5  3  4
# 5  4  6  2
# 6  6  6  2
Sowmya S. Manian
  • 3,723
  • 3
  • 18
  • 30