I have two dataframes, big_df and small_df:

big_df
------
Typ  col1 col2 col3 ... 
A    None None None ...
B    None None None ...
A    None None None ...
C    None None None ...
B    None None None ...
D    None None None ...
E    None None None ...
F    None None None ...
.
.
.

small_df
------
Typ  col1 col3 col8 ... 
A    1.2  'a'  3
E    2.2  'z'  5
L    0.5  'w'  4
.
.
.

I need to efficiently update big_df fields using the values in small_df.

Typ is not unique in big_df.

Both dataframes are currently indexed numerically (0, 1, 2, 3, and so on).

Attempting to reindex both dataframes by Typ throws:

ValueError: cannot reindex from a duplicate axis

I would appreciate any suggestions or code examples showing the best way to do this.

jscriptor
  • `big_df.update(small_df)` and then `print(big_df)` ?? – anky Apr 30 '19 at 14:27
  • Possible duplicate of [Update a pandas dataframe with data from another dataframe](https://stackoverflow.com/questions/51394653/update-a-pandas-dataframe-with-data-from-another-dataframe) – anky Apr 30 '19 at 14:31
  • This is great ... 'key' is a column (not the index). How do I make sure that the keys are matched before the update? Thanks – jscriptor Apr 30 '19 at 14:33
  • use `set_index()` like `big_df=big_df.set_index('key')`, same for the other `df`, then `update()` – anky Apr 30 '19 at 14:35
  • Hi ... I followed your suggested solution, but I ran into an error that I was not expecting. I updated the question to explain. I appreciate your help in finding a workaround. – jscriptor Apr 30 '19 at 15:15

1 Answer

I figured out a way to solve the problem, inspired by the response I found in this post: Python Pandas update a dataframe value from another dataframe

I added a couple of for loops to handle the merge and delete steps for multiple columns, to fit my needs.
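
For anyone landing here, a minimal sketch of what a merge-and-delete approach like this could look like. This is not my exact code: the sample data is made up to mirror the tables in the question, the `_new` suffix and the `shared`/`merged` names are only illustrative, and it assumes Typ is unique in small_df (it does not need to be unique in big_df).

```python
import pandas as pd

# Made-up sample data shaped like the tables in the question.
big_df = pd.DataFrame({
    "Typ": ["A", "B", "A", "C", "B", "D", "E", "F"],
    "col1": [None] * 8,
    "col2": [None] * 8,
    "col3": [None] * 8,
})
small_df = pd.DataFrame({
    "Typ": ["A", "E", "L"],
    "col1": [1.2, 2.2, 0.5],
    "col3": ["a", "z", "w"],
    "col8": [3, 5, 4],
})

# Only columns present in both frames are updated (col8 is ignored here).
shared = [c for c in small_df.columns if c in big_df.columns and c != "Typ"]

# A left merge keeps every row of big_df, in order, including repeated Typ values.
# Columns coming from small_df get the (illustrative) "_new" suffix.
merged = big_df.merge(
    small_df[["Typ"] + shared], on="Typ", how="left", suffixes=("", "_new")
)

# First loop: copy the new values over wherever a matching Typ was found.
for col in shared:
    new_values = merged[col + "_new"]
    has_new = new_values.notna()
    merged.loc[has_new, col] = new_values[has_new]

# Second loop: delete the helper columns created by the merge.
for col in shared:
    del merged[col + "_new"]

print(merged)
```

Because the merge is on the Typ column itself, neither dataframe has to be reindexed, so the duplicate-axis error never comes up. Rows whose Typ is missing from small_df (B, C, D, F here) are left untouched.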

jscriptor