How do find the sum of difference between two csv

Question

I have two csv files.

Csv 1
A 1
B 2
C 3

Csv 2
A 10
B 20
C 30

How do i find the sum difference of column 2

Like 10-1+20-2+30-3

So output csv would be

A 9
B 18
C 27

Total 54

I can achieve the difference but not total. Any help?

You should show what code you have. – Akaisteph7 Jul 30 '19 at 03:16 — Akaisteph7, Jul 30 '19 at 03:16
I trying sir but it is not allowing me to edit. Using phone – James Jul 30 '19 at 03:21 — James, Jul 30 '19 at 03:21

PeptideWitch · Answer 1 · 2019-07-30T03:27:25.077

1

Time to get familiar with pandas:

import pandas as pd
data = pd.read_csv(file_path, columns=['name', 'value'])
sum = data['value'].sum()

Pandas has heaps of well-documented functions that you can use to determine differences between columns/rows.

Or, for your example:

df = pd.DataFrame(columns=['name', 'value'])
for file in filepaths:
    new_df = pd.read_csv(file, columns=['name', 'value'])
    pd.append(df, new_df)
sum = data['value'].sum()

But this is quite basic and loses some information regarding where the original data came from, in which case you want to start thinking about how to properly construct your dataframe with indexes

edited Jul 30 '19 at 03:27

answered Jul 30 '19 at 03:20

PeptideWitch

2,239
14
30

You are only achieving the second part of the question an not addressing adding to csv or the differences. also `data` in your second code is incorrect. and what is that link you are pointing to supposed to help with? – Akaisteph7 Jul 30 '19 at 12:28

Rakesh · Accepted Answer · 2019-07-30T06:54:52.853

1

Using the inbuild csv module.

Ex:

import csv

with open("Csv 1.csv") as csvfile_1, open("Csv 2.csv") as csvfile_2, open('outfile.csv', "w", newline="") as outfile_csv:
    reader = csv.reader(csvfile_1, delimiter=" ")     #read file 1
    reader2 = csv.reader(csvfile_2, delimiter=" ")    #read file 2
    writer = csv.writer(outfile_csv, delimiter=" ")   #Create writer object
    c = 0
    for m, n in zip(reader2, reader):
        val = int(m[1])-int(n[1])
        writer.writerow([m[0], int(m[1])-int(n[1])])  #Write result.
        c += val
    writer.writerow(["Total", c])  #Write Total

edited Jul 30 '19 at 06:54

answered Jul 30 '19 at 04:49

Rakesh

81,458
17
76
113

But how can i achieve total at the end. ? – James Jul 30 '19 at 05:08
@James. Did not see the last bit. Updated snippet – Rakesh Jul 30 '19 at 06:55

SM Abu Taher Asif · Answer 3 · 2019-07-30T03:39:19.317

first read your csv using pandas(you have to use your fileName/ path instead of file1.csv, file2.csv):

import pandas as pd
df1 = pd.read_csv('file1.csv',header=None)
df2 = pd.read_csv('file2.csv',header=None)

then find difference into another dataframe:

df3 = df2
df3[1] = df2[1].sub(df1[1], axis = 0)

then write your final datafarame into csv:

df3.to_csv('file3.csv')

you can find sum:

print(df3[1].sum())

output: 54

print(df3)

output:

How do find the sum of difference between two csv

3 Answers3