I was trying to do some conditional indexing on a data frame, but it did not behave as I expected. For example, I was looking for all rows where lat == -26.34.

> t.df[t.df$lat == -26.34,]   
<0 rows> (or 0-length row.names)

But I know for a fact that number is in the data frame...

> t.df$lat[4140]
[1] -26.34

And even more strange...

> t.df$lat[4140] == -26.34
[1] FALSE

Upon further investigation, it turns out that the stored value and the literal I typed have different floating-point representations, which is why a comparison that looks like -26.34 == -26.34 evaluates to FALSE.

> sprintf("%.30f", t.df$lat[4140])
[1] "-26.339999999999996305177774047479"
> sprintf("%.30f", -26.34)
[1] "-26.340000000000003410605131648481"

I understand why this can happen (see here), but what I want to know is the following:

  1. How can I avoid this?
  2. How can I do indexing such as t.df[t.df$lat == -26.34,] successfully when this issue does crop up?
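
For reference, one common way to approach the second point is to compare within a tolerance instead of testing exact equality. The following is only a minimal sketch in base R: the data frame `df`, its values, and the tolerance `tol` are hypothetical stand-ins for `t.df`, and a suitable tolerance depends on the resolution of the real data.

```r
# Hypothetical stand-in for t.df. The second value is built arithmetically,
# so it prints as -26.34 but may differ from the literal in its last bits.
df <- data.frame(lat = c(-26.30, -26.30 - 0.04, -26.38))

# Assumed tolerance: pick something well below the resolution of the data.
tol <- 1e-8

# Keep rows whose lat is within tol of the target rather than exactly equal.
hits <- df[abs(df$lat - (-26.34)) < tol, , drop = FALSE]
print(hits)

# Alternative when the values have a known fixed precision:
# round both sides to that precision before comparing.
hits_rounded <- df[round(df$lat, 2) == round(-26.34, 2), , drop = FALSE]
print(hits_rounded)
```

For a single pair of values, `isTRUE(all.equal(x, y))` performs a similar tolerance-based check, but `all.equal` is not vectorised element-wise, so it is less convenient for row indexing.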
Muon
  • See my post [here](https://stackoverflow.com/a/47222009/8245406). Or [this one](https://stackoverflow.com/a/50569202/8245406). – Rui Barradas Mar 02 '20 at 08:35
  • Thanks @Rui and also Henrik. Those are very useful posts. I will mark this question as a duplicate. – Muon Mar 02 '20 at 08:39
