Say I have a data set with 3 columns, A, B and C, that contain dates for a large number of rows. How can I create a subset that omits the rows where the date in C is not within the range of the dates in A and B?
- Hi Jason, welcome to StackOverflow. Please have a look at [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and try to provide us with a minimal reproducible example. – Ronak Shah Sep 09 '16 at 04:03
- Also, this kind of operation is called "subsetting", which should help you google the answer easily. If you fail to find anything (unlikely), *then* ask a question here (and provide us with some data and what you have tried so far). – jakub Sep 09 '16 at 07:06
- Possible duplicate of [R - check if string contains dates within specific date range](http://stackoverflow.com/questions/31716187/r-check-if-string-contains-dates-within-specific-date-range) – Sotos Sep 09 '16 at 07:12
1 Answer
Are you asking something like the following?
Let's say your initial dataframe is df, which is the following:
    df
                A          B          C
    1  2016-02-16 2016-03-21 2016-01-01
    2  2016-07-07 2016-06-17 2016-01-31
    3  2016-05-19 2016-09-10 2016-03-01
    4  2016-01-14 2016-08-21 2016-04-01
    5  2016-09-02 2016-06-15 2016-05-01
    6  2016-05-09 2016-07-17 2016-05-31
    7  2016-06-13 2016-06-23 2016-07-01
    8  2016-09-17 2016-03-11 2016-07-31
    9  2016-03-09 2016-05-13 2016-08-30
    10 2016-01-20 2016-09-01 2016-09-30
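The following subset operation keeps the rows where C falls outside the range spanned by A and B (whichever of the two is earlier). To keep the rows where C lies inside the range instead, negate the condition with `!`: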
    # apply() hands each row over as a character vector; ISO dates
    # (YYYY-MM-DD) still order correctly when compared as strings.
    df.sub <- df[apply(df, 1, function(x) (x[3] < min(x[1], x[2])) |
                                          (x[3] > max(x[1], x[2]))), ]
    df.sub
                A          B          C
    1  2016-02-16 2016-03-21 2016-01-01
    2  2016-07-07 2016-06-17 2016-01-31
    3  2016-05-19 2016-09-10 2016-03-01
    5  2016-09-02 2016-06-15 2016-05-01
    7  2016-06-13 2016-06-23 2016-07-01
    9  2016-03-09 2016-05-13 2016-08-30
    10 2016-01-20 2016-09-01 2016-09-30
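If the goal is to keep only the rows where C lies between A and B, as the question's wording suggests, a vectorised version might look like the sketch below. It assumes the columns are stored as `Date` (the data is rebuilt here from the example above) and that the endpoints count as inside the range. The names `lower`, `upper`, `inside`, `df.in` and `df.out` are illustrative only and do not come from the original post.

```r
# Rebuild the example data, storing every column as Date (an assumption;
# the original post does not say how the columns are typed).
df <- data.frame(
  A = as.Date(c("2016-02-16", "2016-07-07", "2016-05-19", "2016-01-14",
                "2016-09-02", "2016-05-09", "2016-06-13", "2016-09-17",
                "2016-03-09", "2016-01-20")),
  B = as.Date(c("2016-03-21", "2016-06-17", "2016-09-10", "2016-08-21",
                "2016-06-15", "2016-07-17", "2016-06-23", "2016-03-11",
                "2016-05-13", "2016-09-01")),
  C = as.Date(c("2016-01-01", "2016-01-31", "2016-03-01", "2016-04-01",
                "2016-05-01", "2016-05-31", "2016-07-01", "2016-07-31",
                "2016-08-30", "2016-09-30"))
)

# pmin()/pmax() compare element by element, so A and B may be in either order.
lower <- pmin(df$A, df$B)
upper <- pmax(df$A, df$B)

# TRUE where C lies inside [lower, upper]; endpoints are treated as inside.
inside <- df$C >= lower & df$C <= upper

df.in  <- df[inside, ]   # rows where C is within the range
df.out <- df[!inside, ]  # rows where C is outside the range
```

On the example data, `df.in` holds rows 4, 6 and 8, which is the complement of `df.sub`.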
Hope it helps.

Sandipan Dey