I have data frame(df1) of start and end dates look like this.
Start Date End Date
1875-01-01 1877-09-30
1881-07-01 1886-03-31
1888-01-01 1889-06-30
1890-10-01 1890-12-31
.
.
.
2016-10-01 2018-12-31
I have a different set of data frame (df2) which consists of daily time series. For example,
Date Value
1875-01-01 7.21
1875-01-02 7.23
1875-01-03 7.22
1875-01-04 7.12
.
.
.
2018-12-31 3.12
I set dates as an index for df2.
I am trying to make stats table based on df2 using df1. First I created an empty data frame to add the values. For example,
outputtable = pd.DataFrame(columns = ('Max','Min','Ave'))
for i in df1.index:
try:
df3 = df2.loc[df1['Start Date'][i]:df1['End Date'][i]]
minimum = df3['Value'].min()
maximum = df3['Value'].max()
average = df3['Value'].mean()
outputtable[-1]= [minimum, maximum, average]
except:
pass
I used try because some of the dates in df1 are not in df2. In that case, I want the code to ignore and move onto the next set of dates.
I want the code to go through every row of df1 and do the stats (min, max and mean) and put them into the outputtable to fo further calculations. So far the code above is not working. Help would be much appreciated.
Desired output
Start Date End Date Min Max Ave
1875-01-01 1877-09-30 7 8 7.2
1881-07-01 1886-03-31 1 4 2.2
1888-01-01 1889-06-30 2 6.5 3
1890-10-01 1890-12-31 3 5 4.2
.
.
.
2016-10-01 2018-12-31 1 2 1.7