struggling to get to grips with update/merge/concat etc to achieve what I thought was a simple merge or update procedure. I have df (recorded values) and df2 (times series with 0s), and would like to update/replace the values in df2 with the corresponding recorded values of df, matching on date_time.
Example: df=
Date_Time Perc
03/03/2010 00:05 1.0
03/03/2010 00:15 2.0
df2 =
Date_Time Perc
03/03/2010 00:00 0.0
03/03/2010 00:05 0.0
03/03/2010 00:10 0.0
03/03/2010 00:15 0.0
03/03/2010 00:20 0.0
Where the result would return:
Date_Time Perc
03/03/2010 00:00 0.0
03/03/2010 00:05 1.0
03/03/2010 00:10 0.0
03/03/2010 00:15 2.0
03/03/2010 00:20 0.0
I'm finding this frustrating as there is good info and examples on http://pandas.pydata.org/pandas-docs/stable/merging.html and a great SO question with multiple solutions on How to update values in a specific row in a Python Pandas DataFrame? but having tried several approaches, none work.
Tried so far: Re-indexing to Date_time and df2.update(df) with variations, multiple merging/join/concat variations, an adapted definition (below) with apply...and now wondering if I need to use iterrows (see below?). Any suggestions on pointers in the right direction would be appreciated...maybe I'm missing something fundamental in my approach...
def update_vals(row, data=data):
if row.Date_Time == df.Date_Time:
row.Perc = df.Perc
return row
for index, row in df2['Date_Time'].iterrows():
x = df2['Date_Time']
for index, row in df['Date_Time'].iterrows():
x2 = df['Date_Time']
if x2 ==x:
df2['Perc'] = df['Perc']
Thought this would work as well, but it results in (ValueError: cannot reindex from a duplicate axis)
df.set_index('Date_Time', inplace=True)
df2.set_index('Date_Time', inplace=True)
df2.update(df[['Perc']])