I'm trying to find unequal values in 2 different columns from the same dataframe.

I'm using

df %>%
  filter(column1 != column2)

However, it is returning some values that seem to be equal, and it does not return the same set of values as its counterpart:

df %>%
  filter(column1 == column2)

Both columns are the double data type.

Should I use a different method to compare them?

Example of equal values showing up in != query

Mako212
  • The numbers displayed may be equal, but only the first few decimal digits are displayed. The underlying values could differ in the very next digit, or there could be a very small difference many decimal places in, due to [floating point precision limits](https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal) – IceCreamToucan Oct 12 '21 at 18:57
  • It's always best to share your data with `dput` rather than an image, as that lets people copy the data to try to answer the question. – Tjn25 Oct 14 '21 at 07:11

1 Answer

The == and != operators do elementwise comparison, i.e. they compare the 1st value of column1 against the 1st value of column2, the 2nd against the 2nd, and so on. If the intention is to return the rows where the value in 'column1' matches any of the values in 'column2', use %in%

library(dplyr)
df %>%
    filter(column1 %in% column2)

To get the reverse, negate the condition with !

df %>%
   filter(!column1 %in% column2)
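
To make the difference between the two kinds of comparison concrete, here is a minimal sketch. The data frame is a hypothetical toy example (the OP did not share data), and the object names `elementwise` and `membership` are only for illustration.

```r
library(dplyr)

# Hypothetical toy data, not the OP's data frame
df <- tibble(column1 = c(1, 2, 3), column2 = c(3, 2, 5))

# Elementwise: a row is kept only if its own two values match (row 2 here)
elementwise <- df %>%
    filter(column1 == column2)

# Membership: a row is kept if its column1 value occurs anywhere in column2
# (rows 2 and 3 here, because both 2 and 3 appear somewhere in column2)
membership <- df %>%
    filter(column1 %in% column2)
```

Because `%in%` ignores row position, it answers a different question than `==`, so it is probably not what is needed when the goal is to compare the two columns row by row.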

Regarding the OP's statements that "it is returning some values that seem to be equal" and that "Both columns are the double data type": comparing double columns is tricky, because an elementwise comparison treats two values as equal only if they match exactly, so floating-point precision also has to be taken into account. As there is no reproducible example, this is based only on assumption. One option is to round the column values and then do the elementwise comparison:

df %>% 
    filter(round(column1, 1) == round(column2, 1))
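
As an alternative to rounding, a tolerance-based comparison may fit better. The sketch below assumes hypothetical toy data in which one pair of values prints identically but differs in its last bits. It uses `near()` from dplyr, which compares doubles within a small tolerance.

```r
library(dplyr)

# Hypothetical toy data: row 1 prints as 0.3 in both columns, but
# 0.1 + 0.2 is not stored as exactly the same double as 0.3
df <- tibble(column1 = c(0.1 + 0.2, 1.1), column2 = c(0.3, 1.1))

# Exact comparison reports row 1 as unequal
unequal_exact <- df %>%
    filter(column1 != column2)

# Printing more digits reveals the hidden difference
sprintf("%.17f", df$column1[1])
sprintf("%.17f", df$column2[1])

# near() treats values within a tolerance as equal
# (the default tolerance is .Machine$double.eps^0.5)
unequal_near <- df %>%
    filter(!near(column1, column2))
equal_near <- df %>%
    filter(near(column1, column2))
```

With this data, `unequal_exact` contains the first row, while `unequal_near` is empty. Rounding can still separate values that sit on either side of a rounding boundary (0.149 and 0.151 round to 0.1 and 0.2 at one decimal place), which is one reason a tolerance may be preferable.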
akrun