I have a dataframe that has dates in the format YYYY/MM/DD. I tried subsetting it in two ways and got different results:

Method 1:

 a <- mydata[(mydata$Date > 2010-01-01),]

Result:

This gave me results that include dates in 2008, 2009, etc.

Method 2:

 a <- mydata[(mydata$Date > 2010/01/01),]

Result:

This gave me the correct results.

As you can see, the only difference is the separator I used in the date: "/" vs "-". Can someone explain why that changes the result? The dates in the dataframe itself are in the form YYYY-MM-DD, which is why I used the hyphen in Method 1.

pnuts
Trung Tran
  • Initially they were factors and I changed them to Date – Trung Tran Dec 22 '14 at 20:35
  • Neither of those methods should work with a proper date value (or at least not in the way you are expecting). It would be helpful if you included a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input so we can replicate the behavior. A proper comparison would look like `a <- mydata[mydata$Date > as.Date("2010-01-01"), ]` – MrFlick Dec 22 '14 at 20:52

1 Answer

If your dates are character values (and not factors or Dates, which unfortunately look just the same when printed to the console), then you can compare them with ">", "<" or "==", but the value you compare against needs to be quoted. Unquoted, it is evaluated as an arithmetic expression:

> 2010-01-01
[1] 2008

No error is thrown, because R lets you compare a numeric with a character vector (the number is coerced to a string first), but the results will not be to your liking:

> 2010-01-01 > "2007-01-01"
[1] TRUE
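
To see why the two methods in the question behave differently, here is a minimal sketch. It assumes the column holds character strings, and the `dates` vector is hypothetical sample data, not taken from the question.

```r
# Unquoted date-like literals are plain arithmetic, not dates.
hyphen_value <- 2010-01-01   # 2010 - 1 - 1, i.e. 2008
slash_value  <- 2010/01/01   # 2010 / 1 / 1, i.e. 2010

# Hypothetical character dates for illustration only.
dates <- c("2008-03-01", "2009-06-15", "2010-05-01", "2011-01-20")

# In each comparison the number is coerced to a string first,
# so these are really string comparisons against "2008" and "2010".
method1 <- dates > 2010-01-01   # same as dates > "2008"
method2 <- dates > 2010/01/01   # same as dates > "2010"

print(method1)  # every element is TRUE, so 2008 and 2009 rows are kept
print(method2)  # TRUE only for the 2010 and 2011 entries
```

Method 2 only appears to work: it compares against the string "2010", so any date within 2010, including "2010-01-01" itself, would also pass the filter.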

So to be safe and get meaningful results, try this:

 asub <- mydata[as.character(mydata$Date) > "2010-01-01", ]

The as.character() call converts a factor- or Date-classed vector to character, so the comparison is done on strings.
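
An alternative, along the lines of the comment on the question, is to compare real Date values on both sides. The sketch below uses a hypothetical `mydata` with a factor `Date` column and a made-up `value` column; adjust the names to your own data.

```r
# Hypothetical sample data; Date starts out as a factor, as in the question.
mydata <- data.frame(
  Date  = c("2008-03-01", "2009-06-15", "2010-05-01", "2011-01-20"),
  value = 1:4,
  stringsAsFactors = TRUE
)

# Convert to Date class (going through character is safe for factors).
mydata$Date <- as.Date(as.character(mydata$Date))

# Compare Date with Date, so the cutoff is a real calendar date.
asub <- mydata[mydata$Date > as.Date("2010-01-01"), ]
print(asub)
```

With Date on both sides the comparison is chronological rather than lexical, so it keeps working even if the stored format is not YYYY-MM-DD.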

IRTFM