I have two data tables in R; Table 1 is smaller and has about 4,000 rows and 4 cols. These cols are ID, Report_Date, Start_date, and End_date. The other table, Table 2, is a larger table which has thousands of rows and 8 cols. Both tables have the ID col but this ID col can have multiple rows in both tables. There is a date column in Table 2 as well and a col called "Value". The rest of the columns in Table 2 are not of use for this task.
For each row in Table 1, I need to calculate standard deviation of the col "Value" from Table 2 based on 2 conditions; 1. ID has to match, 2. The date range is within the interval defined by Start_Date and End_Date in Table 1. How can I do this in R? I have attached a small sample of both tables below. In the example below, I would need the standard deviation of Value for A1 for the date range in Start_Date and End_Date so STD_Value(A1, report date of 2008/10/02)= sd(10,11,11).
Sample Table 1 looks like this:
ID | Report_Date | Start_Date | End_Date |
---|---|---|---|
A1 | 2008/10/02 | 2008/09/27 | 2008/09/30 |
A1 | 2008/11/02 | 2008/10/27 | 2008/10/30 |
A2 | 2008/02/02 | 2008/01/15 | 2008/01/17 |
Sample Table 2 looks like this:
ID | Date | Value |
---|---|---|
A1 | 2008/09/27 | 10 |
A1 | 2008/09/28 | 11 |
A1 | 2008/09/30 | 11 |
A1 | 2008/10/01 | 12 |
A1 | 2008/10/02 | 13 |
A2 | 2008/01/14 | 5 |
A2 | 2008/01/15 | 4 |
A2 | 2008/01/16 | 3 |
A2 | 2008/01/17 | 5 |
A2 | 2008/01/18 | 5 |