I have two dataframes:

  • df_new:

       time_start
    0  1313575263
    1  1313575263
    2  1313575263
    3  1313579775
    4  1313579775

  • df_information:

         my_start value
    0  1313575263   foo
    1  1313579775   bar

I want to use the values in df_information to populate a new column in df_new: wherever time_start in df_new matches my_start in df_information, the new column should take the corresponding value from df_information. Here is my desired output:

       time_start value
    0  1313575263   foo
    1  1313575263   foo
    2  1313575263   foo
    3  1313579775   bar
    4  1313579775   bar

I figured out a way to do this using two nested loops, but since I am actually working with large dataframes, it takes a long time to run:

import pandas as pd
import numpy as np

dict1={'time_start':[1313575263,1313575263,1313575263,1313579775,1313579775]}
dict2={'my_start':[1313575263,1313579775],'value':['foo','bar']}

df_new=pd.DataFrame.from_dict(data=dict1)
df_information=pd.DataFrame.from_dict(data=dict2)

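# Start with an object-dtype column so that strings can be stored in it
df_new['value']=pd.Series(np.nan,index=df_new.index,dtype=object)

# Slow approach: compare every row of df_new with every row of df_information
for index_new, row_new in df_new.iterrows():
    for index_information, row_information in df_information.iterrows():
        if row_information['my_start']==row_new['time_start']:
            # Use .loc instead of chained indexing, which may write to a copy
            df_new.loc[index_new,'value']=row_information['value']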

Is there a more efficient way to do this? Thanks in advance for your help!
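
For reference, here is a minimal sketch of one vectorized alternative to the nested loops, not necessarily the fastest option for every dataset. It assumes that each my_start appears only once in df_information; the names lookup and merged are introduced here purely for illustration. It shows two routes to the same result: Series.map with a lookup Series, and a left join with DataFrame.merge.

```python
import pandas as pd

df_new = pd.DataFrame(
    {"time_start": [1313575263, 1313575263, 1313575263, 1313579775, 1313579775]}
)
df_information = pd.DataFrame(
    {"my_start": [1313575263, 1313579775], "value": ["foo", "bar"]}
)

# Option 1: build a lookup Series indexed by my_start
# (assumption: my_start values are unique), then map each time_start
# to its value. Rows without a match end up as NaN.
lookup = df_information.set_index("my_start")["value"]
df_new["value"] = df_new["time_start"].map(lookup)

# Option 2: an equivalent left join, which also carries over several
# columns at once if df_information has more than one.
merged = df_new[["time_start"]].merge(
    df_information, how="left", left_on="time_start", right_on="my_start"
).drop(columns="my_start")
```

Both options avoid Python-level row iteration, which is usually where the time goes with iterrows on large dataframes. If my_start could contain duplicates, the map route would fail and the merge route would duplicate rows, so deduplicating df_information first would be needed.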

Sheldon
