
I have two CSV files and I am trying to compare them. The output should go to a separate CSV file that holds the result of comparing the data in the two files, as True or False.

Could you please help me get the right code?

Ransaka Ravihara
  • Welcome to StackOverflow. Please consider revising your question using the guidance here: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – jsmart Aug 06 '20 at 16:21

2 Answers

0

Using Python's csv module and assuming that your CSV files use commas as delimiters:

import csv

with open("Book1.CSV") as book1, open("Book2.CSV") as book2, open("Result.csv", "w") as result:
    reader1 = csv.DictReader(book1)
    reader2 = csv.DictReader(book2)
    writer = csv.DictWriter(result, ["Instrument", "Price", "colour"])
    writer.writeheader()

    # Rows are paired by position, so both files must list the instruments in the same order.
    for row1, row2 in zip(reader1, reader2):
        writer.writerow({
            "Instrument": row1["Instrument"],
            # Compare with ==, not "is": "is" tests object identity, not equality.
            "Price": str(row1["Price"] == row2["Price"]).upper(),
            "colour": str(row1["colour"] == row2["colour"]).upper(),
        })
jasLogic
  • Thanks for this code, it worked exactly as I needed. But I am getting a blank row after every row, like below. How do I get rid of those blank rows? Instrument Price Colour (Blank Row) A TRUE FALSE (Blank Row) B TRUE FALSE (Blank Row) C FALSE FALSE – Subhijit Aug 07 '20 at 14:24
  • It could have something to do with Windows line breaks, which I can't test because I don't have a Windows machine. Maybe [this](https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row) can help. – jasLogic Aug 10 '20 at 08:08
  • Thank you so much, I got my answer in that link. Have a nice day. :) – Subhijit Aug 10 '20 at 12:35
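
For readers who hit the same blank-row symptom: the csv module documentation recommends opening the output file with newline="". The snippet below is only a sketch of that fix, not the code from the answer above. The sample rows and the temporary output path are made up for the demonstration.

```python
import csv
import os
import tempfile

# Sample rows, invented for this demonstration.
rows = [
    {"Instrument": "A", "Price": "TRUE", "colour": "FALSE"},
    {"Instrument": "B", "Price": "TRUE", "colour": "FALSE"},
]

# Hypothetical output location so the example does not overwrite a real file.
out_path = os.path.join(tempfile.mkdtemp(), "Result.csv")

# The csv writer ends each row with "\r\n". Without newline="", text mode on
# Windows turns that into "\r\r\n", which spreadsheet tools show as a blank row.
with open(out_path, "w", newline="") as result:
    writer = csv.DictWriter(result, ["Instrument", "Price", "colour"])
    writer.writeheader()
    writer.writerows(rows)

# Read the raw bytes back to confirm that each row ends with a single "\r\n".
with open(out_path, "rb") as check:
    raw = check.read()
```

The same newline="" argument can be added to the open("Result.csv", "w") call in the answer.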
0

If the column names (i.e. "Instrument", "Price", "colour") and row labels (i.e. "A", "B", "C") are identical in both files, you can do this in pandas with ==:

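import pandas as pd

# Use the first column ("Instrument") as the row index in both frames.
df1 = pd.read_csv('Book1.CSV', index_col=0)
df2 = pd.read_csv('Book2.CSV', index_col=0)

# Element-wise comparison; produces a DataFrame of True/False values.
compare_df = (df1 == df2)
compare_df.to_csv('Result.csv')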

It can be helpful to compare values in a type-aware fashion, for example to recognize that 1 (integer) and 1.0 (float) are equal. pandas handles that case well: read_csv infers numeric column types, so == compares the numbers rather than their text.
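
As a small illustration of the type-aware point, here is a sketch that uses in-memory data instead of Book1.CSV and Book2.CSV. The prices and colours are made up, and io.StringIO is used only so the example runs without files on disk.

```python
import io

import pandas as pd

# Made-up file contents: Price is written as an integer in one file
# and as a float in the other.
book1 = io.StringIO("Instrument,Price,colour\nA,1,red\nB,2,blue\n")
book2 = io.StringIO("Instrument,Price,colour\nA,1.0,red\nB,3.0,green\n")

df1 = pd.read_csv(book1, index_col=0)  # Price is inferred as an integer column
df2 = pd.read_csv(book2, index_col=0)  # Price is inferred as a float column

# Numeric comparison: 1 == 1.0 is True, 2 == 3.0 is False.
compare_df = (df1 == df2)

# A plain text comparison of the same cells would disagree for row A.
text_equal = ("1" == "1.0")  # False
```

Here compare_df marks the Price of instrument A as True even though the two files spell the number differently, while a string comparison such as the one in a csv-module loop would report a mismatch.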