I have two sets of dataframes: datamax, datamax2015 and datamin, datamin2015.
Snippet of data:
print(datamax.head())
print(datamin.head())
print(datamax2015.head())
print(datamin2015.head())
Date ID Element Data_Value
0 2005-01-01 USW00094889 TMAX 156
1 2005-01-02 USW00094889 TMAX 139
2 2005-01-03 USW00094889 TMAX 133
3 2005-01-04 USW00094889 TMAX 39
4 2005-01-05 USW00094889 TMAX 33
Date ID Element Data_Value
0 2005-01-01 USC00200032 TMIN -56
1 2005-01-02 USC00200032 TMIN -56
2 2005-01-03 USC00200032 TMIN 0
3 2005-01-04 USC00200032 TMIN -39
4 2005-01-05 USC00200032 TMIN -94
Date ID Element Data_Value
0 2015-01-01 USW00094889 TMAX 11
1 2015-01-02 USW00094889 TMAX 39
2 2015-01-03 USW00014853 TMAX 39
3 2015-01-04 USW00094889 TMAX 44
4 2015-01-05 USW00094889 TMAX 28
Date ID Element Data_Value
0 2015-01-01 USC00200032 TMIN -133
1 2015-01-02 USC00200032 TMIN -122
2 2015-01-03 USC00200032 TMIN -67
3 2015-01-04 USC00200032 TMIN -88
4 2015-01-05 USC00200032 TMIN -155
For datamax, datamax2015, I want to compare their Data_Value
columns and create a dataframe of entries in datamax2015 whose Data_Value
is greater than all entries in datamax for the same day of the year. Thus, the expected output should be a dataframe with rows from 2015-01-01 to 2015-12-31 but with dates only where the values in the Data_Value
column are greater than those in the Data_Value
column of the datamax dataframe.
i.e 4 rows and anything from 1 to 364 columns depending on the condition above.
I want the converse (min) for the datamin and datamin2015 dataframes.
I have tried the following code:
upper = []
for row in datamax.iterrows():
for j in datamax2015["Data_Value"]:
if j > row["Data_Value"]:
upper.append(row)
lower = []
for row in datamin.iterrows():
for j in datamin2015["Data_Value"]:
if j < row["Data_Value"]:
lower.append(row)
Could anyone give me a helping hand as to where I am going wrong?