R produces an unexpected result when I subset a data frame with the greater-than-or-equal-to (>=) operator.

Here are the results when using the == operator:

> head(sessions[sessions$datetime == "2016-06-25 13:29:43",],2)
   id   birdie            datetime side_speed end_speed full_coverage
15 65 CALAN197 2016-06-25 13:29:43      -0.34     -0.34             1

However, when using the >= operator, the row returned by the == comparison no longer appears.

> head(sessions[sessions$datetime >= "2016-06-25 13:29:43",],2)
  id   birdie            datetime side_speed end_speed full_coverage
1  2 CALAN190 2016-06-30 08:54:40      -0.34     -0.34             1
2  3 CALAN190 2016-06-30 09:55:05      -0.34      0.00             1

In fact, this result is identical to the one returned by the greater-than (>) operator.

How could this be?

Here is a minimal reproducible example:

d <- read.table(text = "1 | 2 | CALAN190 | 2016-06-30 08:54:40   |   -0.34   |  -0.34      |       1
                2 | 3 | CALAN190 | 2016-06-30 09:55:05  |    -0.34   |   0.00      |       1
                15 | 65 | CALAN197 | 2016-06-25 13:29:43  |    -0.34  |   -0.34       |      1", sep = "|")
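# Convert the datetime column (V4) from text to POSIXct
d$V4 <- as.POSIXct(d$V4)
# Exact match on the timestamp, showing at most two rows
head(d[d$V4 == "2016-06-25 13:29:43", ], 2)
# Greater than or equal to the timestamp, showing at most two rows
head(d[d$V4 >= "2016-06-25 13:29:43", ], 2)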
Jaap
Levi
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input that can be used to test and verify possible solutions. You are only doing `head(,2)`, are you sure it's not just in one of the rows that's being trimmed from display? – MrFlick Nov 29 '18 at 22:06
  • Are you definitely working with formatted dates? – camille Nov 29 '18 at 22:09
  • Good tip @MrFlick about the trimming but it's not the case. – Levi Nov 29 '18 at 22:09
  • @camille, they were pulled from a MySQL database that was formatted as datetime – Levi Nov 29 '18 at 22:10
  • @camille, to be extra sure the field is formatted as date, I used sessions$datetime <- as.POSIXct(sessions$datetime) with the same result – Levi Nov 30 '18 at 00:18
  • I thought perhaps the dates needed to be ordered, so I tried `d[order(as.POSIXct(d$V4)),]` on the simple working example, but that's not the issue. – Levi Nov 30 '18 at 04:54
  • *How could this be?* The third record (that for which `==` is TRUE) is cut by the `head(..., 2)` – jogo Nov 30 '18 at 18:26
  • Yeah, in your "reproducible example", if you just take off the `head()`, you'd see that it works just fine. The same record is returned both times. – MrFlick Nov 30 '18 at 18:53
  • But I need 2 records that match the criterion. – Levi Dec 02 '18 at 17:06
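
The point made in the comments can be checked on the sample data from the question. The sketch below is only an illustration, not a confirmed fix for the original `sessions` data: it assumes the UTC time zone, and the names `cutoff`, `matches` and `first_two` are made up for the example. It shows that `>=` does return the row matched by `==`, and that sorting by datetime before calling `head()` is one way to get the two earliest matching records.

```r
# Rebuild the sample data from the question.
# strip.white = TRUE trims the padding around the "|" separators.
d <- read.table(text = "1 | 2 | CALAN190 | 2016-06-30 08:54:40 | -0.34 | -0.34 | 1
2 | 3 | CALAN190 | 2016-06-30 09:55:05 | -0.34 | 0.00 | 1
15 | 65 | CALAN197 | 2016-06-25 13:29:43 | -0.34 | -0.34 | 1",
                sep = "|", strip.white = TRUE, stringsAsFactors = FALSE)

# Assumption: timestamps are treated as UTC so the comparison is unambiguous.
d$V4 <- as.POSIXct(d$V4, tz = "UTC")
cutoff <- as.POSIXct("2016-06-25 13:29:43", tz = "UTC")

# Without head(), every row with a datetime at or after the cutoff is kept,
# including the row that == matches.
matches <- d[d$V4 >= cutoff, ]
print(nrow(matches))

# head() keeps rows in their existing order, so sort by datetime first
# if the two earliest matching records are wanted.
first_two <- head(matches[order(matches$V4), ], 2)
print(first_two)
```

If the rows are wanted in their original order instead, dropping the `order()` call and keeping `head(matches, 2)` gives the behaviour shown in the question.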

0 Answers