1

I have a data frame with two variables, ID and arrival. Here is the head of my data frame:

head(sun_2)
Source: local data frame [6 x 2]

         ID  arrival
      (chr)   (date)
1 027506905 01.01.15     
2 042363988 01.01.15    
3 026050529 01.01.15    
4 028375072 01.01.15    
5 055384859 01.01.15     
6 026934233 01.01.15 

How could I subset the data by ID, keeping only the rows that arrive within 7 days?

Kara
zde
  • Within 7 days of what? The first observation for each ID? Also a `dput(head(sun_2))` would be a more helpful way to present your data. – Mike H. Jun 01 '16 at 12:48
  • Can you provide an example of your output? It would be great if you reproduced your question following the guidelines in this link: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – ArunK Jun 01 '16 at 12:50
  • Sorry for the bad post... Yes, within 7 days of the first observation. Some observations (same IDs) are duplicated, but I am interested only in those which arrive within 7 days of the first observation. – zde Jun 01 '16 at 12:54
  • Please make your question reproducible. – Roman Luštrik Jun 01 '16 at 13:12

2 Answers

0

As others have said, without more information (for example, what the original observation looks like), we can't pin down exactly what your issue is without making some assumptions.

I assumed that you have a column that holds the date of the original observation, and that both date columns are of class Date (created with as.Date).

# Generate data: 1001 IDs, each with a random arrival date in 2015
Data <- data.frame(ID = as.character(1394:2394),
                   arrival = sample(seq(as.Date('2015/01/01'), as.Date('2016/01/01'), by = 'day'), 1001, replace = TRUE))

# Make the "original observation" variable, 3 to 10 days before arrival
delta_times <- sample(3:10, 1001, replace = TRUE)
Data$First <- Data$arrival - delta_times

This gives me a data set that looks like this:

    ID    arrival      First
1 1394 2015-11-06 2015-10-28
2 1395 2015-08-04 2015-07-26
3 1396 2015-04-19 2015-04-16
4 1397 2015-05-13 2015-05-03
5 1398 2015-07-18 2015-07-11
6 1399 2015-01-08 2015-01-03

If that is the case, then the solution is to use difftime, like so:

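# Create the variable to subset on: days between the first observation and arrival
Data$diff_times <- difftime(Data$arrival, Data$First, units = "days")
Data$diff_times

# Keep only the rows that arrived within 7 days of the first observation
within_7 <- subset(Data, diff_times <= 7)

max(within_7$diff_times)
# Time difference of 7 days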
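
If there is no separate column for the original observation, and the first observation is simply the earliest arrival for each ID (as the asker's comment suggests), the same idea can be sketched in base R as below. The data frame `sun_toy` and its values are invented for illustration, and the `%d.%m.%y` format is an assumption based on the printed `01.01.15` values.

```r
# Hypothetical toy data (values invented for illustration)
sun_toy <- data.frame(
  ID = c("A", "A", "A", "B", "B"),
  arrival = as.Date(c("01.01.15", "05.01.15", "20.01.15", "03.02.15", "10.02.15"),
                    format = "%d.%m.%y"),  # assumed day.month.year format
  stringsAsFactors = FALSE
)

# Earliest arrival per ID (as a day number), repeated on every row of that ID
first_days <- ave(as.numeric(sun_toy$arrival), sun_toy$ID, FUN = min)

# Days elapsed since the first arrival of the same ID
sun_toy$days_since_first <- as.numeric(sun_toy$arrival) - first_days

# Keep rows that arrived at most 7 days after the first arrival of their ID
within_7 <- subset(sun_toy, days_since_first <= 7)
```

Here the third row of ID "A" (19 days after the first arrival) is dropped, while the second row of ID "B" (exactly 7 days) is kept because the comparison is inclusive.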
calder-ty
0

It's a bit difficult to be sure given the information you've provided, but I think you could do it like this:

library(dplyr)
dt %>% group_by(ID) %>% filter(arrival < min(arrival) + 7)
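
One detail worth checking is the boundary: with `<`, a row that arrives exactly 7 days after the first one is dropped, while `<=` keeps it. A minimal sketch on invented data (the values in `dt` below are hypothetical, and `arrival` is assumed to already be of class Date):

```r
library(dplyr)

# Hypothetical toy data (values invented for illustration)
dt <- data.frame(
  ID = c("A", "A", "A", "B", "B"),
  arrival = as.Date(c("2015-01-01", "2015-01-05", "2015-01-20",
                      "2015-02-03", "2015-02-10")),
  stringsAsFactors = FALSE
)

# Strict comparison: a row exactly 7 days after the first arrival is dropped
strict <- dt %>%
  group_by(ID) %>%
  filter(arrival < min(arrival) + 7) %>%
  ungroup()

# Inclusive comparison: a row exactly 7 days after the first arrival is kept
inclusive <- dt %>%
  group_by(ID) %>%
  filter(arrival <= min(arrival) + 7) %>%
  ungroup()
```

Which of the two is right depends on whether "within 7 days" is meant to include day 7, which the question does not specify.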
David_B