Is there a way to save highlighted part of a dataframe to an excel?

Question

I'm trying to compare two dataframes (with the same headers in both) and have highlighted the data that differs between the two frames.

Now I want to write the highlighted rows to an Excel sheet, keeping the headers, and I'm unable to do that.

Example output image

Select the rows you want via `.loc`, use the `.to_excel` method — Paul H, Jun 26 '19 at 05:57
How do I select these particular rows? Is there a code snippet or something for me to work on? Forgive me, I'm pretty new to this. @PaulH — Murtuza Akhtari, Jun 26 '19 at 06:03
Please copy and paste your dataframe here, and what you tried — U13-Forward, Jun 26 '19 at 06:06
I posted a picture for reference in the question @U9-Forward — Murtuza Akhtari, Jun 26 '19 at 06:09
@MurtuzaAkhtari [NO, pictures are useless](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-on-so-when-asking-a-question) — U13-Forward, Jun 26 '19 at 06:10
There are *dozens* of questions on stack overflow about selecting rows out of data frames. Do none of them answer that for you? — Paul H, Jun 26 '19 at 14:07

Wytamma Wirth · Accepted Answer · 2019-06-27T09:55:14.857

You can check for differences by comparing each element of each corresponding row (here I use the unique id column to find corresponding rows). If there is a difference, you can add the row to a new dataframe. Finally, save the new dataframe in Excel format.

import pandas as pd

df1 = pd.DataFrame([[1,2,3],[2,2,3],[3,2,3]], columns=['id','B','C'])
df2 = pd.DataFrame([[1,2,3],[2,"different",2],[3,2,3]], columns=['id','B','C'])

# Collect the rows of df2 that differ from the df1 row with the same id.
different_rows = []
for i, row in df1.iterrows():
  compare_row = df2.loc[df2['id'] == row['id']].iloc[0]
  if all(row == compare_row):
      continue
  different_rows.append(compare_row)

# DataFrame.append was removed in pandas 2.0, so build the frame from the list.
df_differnt_rows = pd.DataFrame(different_rows, columns=['id','B','C'])

This produces another df that has all the rows that are different between df1 and df2.

print(df_differnt_rows)

    id  B           C
1   2   different   2

Save using .to_excel() method:

df_differnt_rows.to_excel('df_differnt_rows.xlsx')
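The same comparison can also be written without an explicit loop. The following is a minimal sketch, not part of the original answer, and it assumes that the id values are unique in both frames. The names `left`, `right`, `changed` and `diff_rows` are illustrative. Note that an id missing from df2 shows up as a row of NaN values and is reported as changed, because NaN never compares equal.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [2, 2, 3], [3, 2, 3]], columns=['id', 'B', 'C'])
df2 = pd.DataFrame([[1, 2, 3], [2, "different", 2], [3, 2, 3]], columns=['id', 'B', 'C'])

# Align both frames on the id column so rows are compared by id, not by position.
left = df1.set_index('id')
right = df2.set_index('id').reindex(left.index)

# True for every id where at least one column differs between the frames.
changed = left.ne(right).any(axis=1)

# Keep the df2 version of the changed rows, with id restored as a column.
diff_rows = right.loc[changed].reset_index()
print(diff_rows)
```

The resulting frame can be written with `.to_excel()` in the same way as above; passing `index=False` leaves the row index out of the sheet.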

Check out openpyxl (e.g. `PatternFill`) if you want to highlight cells in the Excel file.
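As a rough sketch of that idea, the example below writes df2 to a workbook and then fills every cell that differs from df1 in yellow. It assumes openpyxl is installed and that both frames have the same shape, row order and column labels. The file name `highlighted_example.xlsx` and the colour are hypothetical choices made for illustration.

```python
import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import PatternFill

df1 = pd.DataFrame([[1, 2, 3], [2, 2, 3], [3, 2, 3]], columns=['id', 'B', 'C'])
df2 = pd.DataFrame([[1, 2, 3], [2, "different", 2], [3, 2, 3]], columns=['id', 'B', 'C'])

path = 'highlighted_example.xlsx'  # hypothetical output file
df2.to_excel(path, index=False)    # headers end up in worksheet row 1

# Boolean frame: True where a cell of df2 differs from the same cell of df1.
mask = df1.ne(df2)

yellow = PatternFill(start_color='FFFF00', end_color='FFFF00', fill_type='solid')

wb = load_workbook(path)
ws = wb.active
for r in range(mask.shape[0]):
    for c in range(mask.shape[1]):
        if mask.iat[r, c]:
            # Worksheet cells are 1-based and row 1 is the header row.
            ws.cell(row=r + 2, column=c + 1).fill = yellow
wb.save(path)
```

Only the cell styling is changed here; the values in the sheet are the ones pandas wrote.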

You (generally) don't need to loop with data frames. — Paul H, Jun 26 '19 at 14:06
@PaulH I updated the answer to remove loop :) — Wytamma Wirth, Jun 27 '19 at 01:16

Aman Relan · Answer 2 · 2019-06-26T06:11:23.393

Step 1 :- Select the rows you want and store them in a new frame, say df (selecting rows in Python can be done using this).

Step 2 :- Use this :-

df.to_excel (r'C:\Users\Desktop\selected_dataframe.xlsx')

#Don't forget to add '.xlsx' at the end of the path

edited Jun 26 '19 at 06:11

How do I select the row I want? – Murtuza Akhtari Jun 26 '19 at 06:11
Check the edited one – Aman Relan Jun 26 '19 at 06:11
I actually want to select the rows which I have highlighted, and I'm unable to find a way to access rows on that basis. – Murtuza Akhtari Jun 26 '19 at 06:16
Did you read the link I've attached? You need to do something similar to that, probably with a little tweak. – Aman Relan Jun 26 '19 at 06:18
Yes, I did. I will work on it and let you know about the result. – Murtuza Akhtari Jun 26 '19 at 06:19
