I have two dataframes: one containing timestamps, and another containing tweets with their timestamps. They look like this: fig1 and fig2. I am trying to assign the tweets to the tweets column of the timestamps dataframe.
For a timestamp t, the row should collect all tweets posted in the interval [t-30, t+30), measured in minutes. I created a new column called tweets in the timestamps dataframe, initialized with empty lists, and tried to fill it using this logic:
    for i in range(len(timestamps)):
        t = pd.Timestamp(timestamps.date[i])
        for j in tweet_data.date:
            # Offset of the tweet from t, in (fractional) minutes
            diff = (pd.Timestamp(j) - t) / pd.Timedelta(minutes=1)
            if -30 <= diff < 30:  # tweet falls in [t-30, t+30)
                timestamps.iloc[i].tweets.append(tweet_data.tweet[getIndexes(tweet_data, j)])
Here getIndexes() returns the index of the timestamp of the tweet being assigned. Both dataframes are large and the for loops are nested, so this takes a very long time to run. How can I map the tweets faster?
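One possible direction, sketched here on made-up sample data rather than the real dataframes: sort the tweets by time once, then use numpy.searchsorted to find the slice of tweets that falls in each [t-30, t+30) window. The column names date, tweet and tweets follow the ones above; the sample values are hypothetical, and this has not been checked against the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the two dataframes
timestamps = pd.DataFrame(
    {"date": pd.to_datetime(["2021-01-01 10:00", "2021-01-01 12:00"])}
)
tweet_data = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-01-01 09:30", "2021-01-01 09:45",
        "2021-01-01 10:30", "2021-01-01 11:40",
    ]),
    "tweet": ["a", "b", "c", "d"],
})

# Sort tweets by time once so that each window is a contiguous slice
tweet_sorted = tweet_data.sort_values("date")
tweet_times = tweet_sorted["date"].to_numpy()
tweet_texts = tweet_sorted["tweet"].to_numpy()

window = np.timedelta64(30, "m")
t = timestamps["date"].to_numpy()

# side="left" at both ends yields the half-open interval [t-30, t+30)
start = np.searchsorted(tweet_times, t - window, side="left")
stop = np.searchsorted(tweet_times, t + window, side="left")

# One list of tweets per timestamp row
timestamps["tweets"] = pd.Series(
    [tweet_texts[a:b].tolist() for a, b in zip(start, stop)],
    index=timestamps.index,
)
print(timestamps)
```

This replaces the inner loop over every tweet with two binary searches per timestamp, so the work is roughly the cost of one sort plus a lookup per row, instead of comparing every timestamp with every tweet.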
Thanks in advance.