I am analyzing financial donations made to an NGO and how its social media engagement has affected those donations. To this end, I want to group the donations with the dates on which social media posts were made. The donation dates are stored in a dataframe under a 'Transaction Date' header, and the post dates have been scraped from Facebook and put in an array. The datatype for donation dates is datetime64[ns] and the datatype for social media post dates is datetime.date.
This is my code, roughly reproduced. What changes should I make?
import datetime as dt
import numpy as np
import pandas as pd

donation_timestamp = pd.DataFrame()
donation_timestamp['Dates'] = np.array(['2019-05-01', '2019-05-12', '2019-05-23'])
donation_timestamp['Dates'] = pd.to_datetime(donation_timestamp['Dates'])

# scraped_dates holds the post dates scraped from Facebook (not shown here)
post_dates = pd.to_datetime(scraped_dates)
post_timestamps = []
for i in post_dates:
    time = dt.datetime.strptime(str(i), "%Y-%m-%d %H:%M:%S").date()
    post_timestamps.append(time)
post_timestamps = np.array(post_timestamps)

post_donations = dict.fromkeys(post_timestamps)
for i in post_timestamps:
    for j in donation_timestamp:
        if j-i < dt.timedelta(days=2):
            post_donations[i] = np.append(post_donations[i], j)
I created a new dictionary with the social media post dates as keys, and tried to iterate over both arrays. Wherever a donation was made within two days of a social media post, I tried to file that donation date under the respective post date in my dictionary. I ran into problems with the loop logic and with my condition. For some reason, my dt.timedelta comparison for checking the difference between two dates is not working: every iteration satisfies the condition. I also don't understand how to convert the datatypes so that I can take the difference between the two dates directly.
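For reference, here is a minimal sketch of one way the grouping described above could be done. It rests on several assumptions that are not in my real data: the scraped_dates values are made-up stand-ins, "within two days" is taken to mean on or after the post and at most two days later, and both sides are converted to pandas Timestamps so they can be subtracted. A comparison of the form j - i < 2 days has no lower bound, so any donation made before a post gives a negative difference that also passes, which may be why every iteration satisfies the condition.

```python
import datetime as dt
import pandas as pd

# Donation dates as in the question (datetime64[ns] after conversion).
donations = pd.DataFrame(
    {"Dates": pd.to_datetime(["2019-05-01", "2019-05-12", "2019-05-23"])}
)

# Hypothetical stand-in for the scraped post dates (datetime.date objects).
scraped_dates = [dt.date(2019, 4, 30), dt.date(2019, 5, 11), dt.date(2019, 5, 30)]

# pd.to_datetime turns datetime.date objects into Timestamps, so both
# sides share one type and can be subtracted without manual parsing.
post_dates = pd.to_datetime(scraped_dates)

window = pd.Timedelta(days=2)
post_donations = {}
for post in post_dates:
    # Subtracting a Timestamp from the column gives a Series of Timedeltas.
    delta = donations["Dates"] - post
    # Assumed meaning of "within two days": 0 <= delta <= 2 days.
    # The lower bound excludes donations made before the post.
    mask = (delta >= pd.Timedelta(0)) & (delta <= window)
    post_donations[post.date()] = donations.loc[mask, "Dates"].dt.date.tolist()

print(post_donations)
```

If donations shortly before a post should count as well, the mask could use delta.abs() <= window instead. Iterating over donations["Dates"] rather than over the dataframe itself also matters, since looping over a dataframe yields its column names, not its rows.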