I have 3 data frames similar to these dataframes in this topic as follows:
A = pd.DataFrame({"Crop_type":["Wheat", "Wheat", "Wheat" ,"Barley", "Barley", "Barley"], "date": ["03/22/2014","04/15/2014","06/17/2014","03/22/2014","04/15/2014", "06/17/2014"], "impact": ["good", "good", "excellent", "fair", "fair", "fair"]})
B = pd.DataFrame({"Crop_type":["Barley", "Barley",], "date":["03/15/2014","06/26/2014"], "value": ["3","5"] })
C = pd.DataFrame({"Crop_type":["Wheat", "Wheat", "Wheat"], "date":["03/15/2014","04/16/2014","06/26/2014"], "value": ["4","8","1"] })
This is the only snippet. Actually, the first data frame has about 4600 rows and other tables about 250 rows. I need to create a new table, where values from data frames B and C come to A, but also with the dates (as second date). I would like to merge it on the nearest date and crop type. (Data frame samples should tell more). Manually in excel would be suicide.
The output should be something like this:
>>> Output_dataset
Crop_type Date Date_Nearest value impact
0 Wheat 03/22/2014 03/15/2014 4 good
1 Wheat 04/15/2014 04/16/2014 8 good
2 Wheat 06/17/2014 06/26/2014 1 excellent
3 Barley 03/22/2014 03/15/2014 3 fair
4 Barley 04/15/2014 NaN Nan fair
5 Barley 06/17/2014 06/26/2014 5 fair
I've already tried Reindex and Merge, however, It seems to me that Python/Pandas has a problem with different length of tables.
ValueError: cannot reindex a non-unique index with a method or limit