I have two dataframes:

df_new:

   time_start
0  1313575263
1  1313575263
2  1313575263
3  1313579775
4  1313579775

df_information:

     my_start value
0  1313575263   foo
1  1313579775   bar

I want to use the values in df_information to populate a new column in df_new. If my_start in df_information matches time_start in df_new, then use the corresponding value in df_information to populate df_new. Here is my desired output:

   time_start value
0  1313575263   foo
1  1313575263   foo
2  1313575263   foo
3  1313579775   bar
4  1313579775   bar

I figured out a way to do this using two nested loops, but since I am actually working with large dataframes, it takes a long time to run:

import pandas as pd
import numpy as np

dict1 = {'time_start': [1313575263, 1313575263, 1313575263, 1313579775, 1313579775]}
dict2 = {'my_start': [1313575263, 1313579775], 'value': ['foo', 'bar']}
df_new = pd.DataFrame.from_dict(data=dict1)
df_information = pd.DataFrame.from_dict(data=dict2)

# Placeholder column; object dtype so it can hold the string values later
df_new['value'] = pd.Series(np.nan, index=df_new.index, dtype=object)

# Compare every row of df_new against every row of df_information
for index_new, row_new in df_new.iterrows():
    for index_information, row_information in df_information.iterrows():
        if row_information['my_start'] == row_new['time_start']:
            # .loc writes to df_new directly instead of using chained indexing
            df_new.loc[index_new, 'value'] = row_information['value']

Is there a more efficient way to do this? Thanks in advance for your help!
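
For reference, here is a minimal sketch of one possible vectorized approach to the lookup described above. It uses Series.map with a lookup Series built from df_information, and it assumes that my_start contains no duplicate values (map raises an error if the lookup index is not unique). The names lookup and merged are only illustrative. A left merge is shown as an alternative that does not need that assumption.

```python
import pandas as pd

df_new = pd.DataFrame(
    {"time_start": [1313575263, 1313575263, 1313575263, 1313579775, 1313579775]}
)
df_information = pd.DataFrame(
    {"my_start": [1313575263, 1313579775], "value": ["foo", "bar"]}
)

# Build a lookup Series indexed by my_start
# (assumption: my_start has no duplicate entries).
lookup = df_information.set_index("my_start")["value"]

# Map each time_start to its value; timestamps without a match become NaN.
df_new["value"] = df_new["time_start"].map(lookup)

# Alternative: a left merge on the two key columns. It keeps every row of
# df_new and would repeat a row if my_start had duplicate matches.
merged = df_new[["time_start"]].merge(
    df_information, how="left", left_on="time_start", right_on="my_start"
).drop(columns="my_start")

print(df_new)
print(merged)
```

Both variants avoid Python-level loops over rows, so they should scale much better than nested iterrows calls on large dataframes. Which one fits better likely depends on whether the keys in df_information are guaranteed to be unique.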