-3

I have a data frame with two columns (both are dates) and a million rows. I need to compare the two dates and store the result in a third column, i.e. if the date in column A is greater than the date in column B, return 1 in column C.

Thanks in advance :)

MichaelChirico
stef1
  • What have you tried? This is very easy to achieve in R provided you have your data set up correctly. Can you show a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of your data? – Marius Jul 11 '17 at 05:49
  • `DF$C = DF$A > DF$B` ? – MichaelChirico Jul 11 '17 at 05:52
  • Yes, my dates are set correctly using the `as.Date` command. I have never done a comparison and returned a value based on it before. – stef1 Jul 11 '17 at 05:56
  • @MichaelChirico will this alone return 1 or 0? I would try it, but my RStudio is currently running a task on a large data frame. – stef1 Jul 11 '17 at 05:58
  • @stef1 you can start a new RStudio session and try it on example data. – Marius Jul 11 '17 at 06:02
  • Because you have a million rows, I have provided a `data.table` approach below for handling large data. – Peter Chen Jul 11 '17 at 06:07

2 Answers

1

In base:

DF$C <- as.numeric(DF$A > DF$B)

In dplyr:

library(dplyr)

DF <- DF %>%
  mutate(C = as.numeric(A > B))
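As a minimal sketch of the base approach on made-up data (the three example dates below are assumptions for illustration, not the asker's data): comparing two `Date` vectors is vectorised and yields `TRUE`/`FALSE`, which `as.numeric()` turns into 1/0.

```r
# Small made-up data frame standing in for the real one (hypothetical values)
DF <- data.frame(
  A = as.Date(c("2017-07-11", "2017-01-01", "2017-05-05")),
  B = as.Date(c("2017-07-10", "2017-06-01", "2017-05-05"))
)

# The comparison is element-wise and returns a logical vector;
# as.numeric() maps TRUE to 1 and FALSE to 0
DF$C <- as.numeric(DF$A > DF$B)

print(DF$C)
# 1 0 0  (equal dates give 0, because the comparison is strict)
```

If either date in a row is `NA`, the comparison is `NA` as well, so `C` will be `NA` for that row rather than 0.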
HNSKD
0
library(data.table)
dt <- as.data.table(DF)   # convert the data frame to a data.table
dt$A <- as.Date(dt$A)     # make sure both columns are of class Date
dt$B <- as.Date(dt$B)

Here are two ways you can try:

dt[, C := ifelse(A > B, 1, 0)]

or

dt[, C := 0][A > B, C := 1]

In the second approach, you can instead use dt[, C := 1][A <= B, C := 0], depending on which condition matches fewer rows.

It would also help if you provided a small reproducible example.
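To check that these variants agree, here is a hedged sketch on made-up data (the dates and the column names `C1`, `C2`, `C3` are assumptions for illustration). It also includes the direct `as.numeric(A > B)` form inside `data.table` for comparison.

```r
library(data.table)

# Hypothetical example data, not the asker's real table
dt <- data.table(
  A = as.Date(c("2017-07-11", "2017-01-01", "2017-05-05")),
  B = as.Date(c("2017-07-10", "2017-06-01", "2017-05-05"))
)

# Variant 1: ifelse on the logical comparison
dt[, C1 := ifelse(A > B, 1, 0)]

# Variant 2: fill with 0, then overwrite the rows where A > B
dt[, C2 := 0][A > B, C2 := 1]

# Variant 3: convert the logical result directly
dt[, C3 := as.numeric(A > B)]

print(dt)
```

On data without missing dates, all three columns hold the same values. With `NA` dates they can differ: variants 1 and 3 return `NA`, while variant 2 leaves the initial 0 in place.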

Peter Chen
  • These are very roundabout ways of achieving something that you can do very directly in R. – Marius Jul 11 '17 at 06:04
  • I know this is roundabout, but if the data only contains a million rows, `data.table` combined with this is probably OK. Are there other ways to handle very large data for this question? – Peter Chen Jul 11 '17 at 06:11
  • I say this is roundabout because the result of `DF$A > DF$B` is so close to the desired answer that it seems unnecessary to add extra steps on top of it, or send it through `ifelse`. `as.numeric(A > B)` is simple and clean. – Marius Jul 11 '17 at 06:14