2

Here is the original post: Python Pandas update a dataframe value from another dataframe

df1 and df2 have the same data structure.

The original column order: assignee id issuetype key

The problem is that after runing the following code:

df1.set_index('key',inplace=True)
df1.update(df2.set_index('key'))
df1.reset_index() 

The column order of df1 changed to this:

key assignee id issuetype

How can I recover the initial structure after update one dataframe with another dataframe? thanks.

Ming
  • 379
  • 1
  • 6
  • 20

1 Answers1

2

Change order by original columns by DataFrame.reindex:

cols = df1.columns
df1.set_index('key',inplace=True)
df1.update(df2.set_index('key'))
df1 = df1.reset_index().reindex(cols, axis=1)
print (df1)
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 1
    I tired it, it didn't work. The "Key" is still the first column. – Ming Sep 10 '20 at 07:12
  • 1
    @Ming - Sorry, answer was edited. `cols` is necessary use for variable – jezrael Sep 10 '20 at 07:13
  • 1
    it didn't work on my side, here is my code: df1=pd.read_csv("update_1.csv") df2=pd.read_csv("update_2.csv") cols = df1.columns df1.set_index('key',inplace=True) df1.update(df2.set_index('key')) df1.reset_index().reindex(cols, axis=1) – Ming Sep 10 '20 at 07:20
  • 1
    @Ming - Is assigned back output? Like `df1 = df1.reset_index().reindex(cols, axis=1)` and then `print (df1)` ? Answer was edited. – jezrael Sep 10 '20 at 07:30
  • 1
    Yes, the printed out structure is not recovered. – Ming Sep 10 '20 at 07:34
  • 1
    @Ming - What is `print (df1.columns)` before `df1 = df1.reset_index().reindex(cols, axis=1)`? – jezrael Sep 10 '20 at 07:35
  • 1
    df1 = df1.reset_index().reindex(cols, axis=1) worked. but it added new blank column as the firts column. the new structure is this : "Blank Column" Unnamed:0 assignee id issuetype key – Ming Sep 10 '20 at 07:59
  • 1
    @Ming - So in original data is `Blank Column" Unnamed: 0` ? What is `print (cols)` ? – jezrael Sep 10 '20 at 08:01
  • 1
    the printed out content of original df1 and updated df1 are the same, but if you compare the CSV files, the output csv file (df1.to_csv(r'output.csv'))has a blank first column. – Ming Sep 10 '20 at 08:20
  • 1
    @Ming - Do you use `df1.to_csv(file, index=False)` ? – jezrael Sep 10 '20 at 08:21
  • 1
    index=False works, the index column is gone. Thank you very much, jezrael ! – Ming Sep 10 '20 at 08:26
  • 1
    another question if df1 and df2 have the same data structure, I want to update df1 with df2 when "key" values are matched and also add the record from df2 when its key value is not matched with df1, what kind of function should I use. Thanks. – Ming Sep 11 '20 at 02:54
  • 1
    @Ming - Can you create new question with sample data `df1, df2` and expected output? – jezrael Sep 11 '20 at 05:07