Python (3.7) CSV Sort/Sum by Field Value

Question

I have a CSV file (of indefinite size) that I would like to read and process. Here is the structure of the file:

User, Value
CN,500.00
CN,-250.00
CN,360.00
PT,200.00
PT,230.00
...

I would like to read the file and sum the values of all rows that share the same first field. So far I have only been trying to identify a value in the first field:

with open("Data.csv", newline='') as data:
    reader = csv.reader(data)
    for row in reader:
        if row.startswith('CN'):
            print("heres one")

This fails because startswith does not work on a list object. I have also tried using readlines().

EDIT 1:

I can currently print the following dataframe object with the sorted sums:

         Value
User
CN    3587881.89
D        1000.00
KC    1767783.99
REC     12000.00
SB      25000.00
SC    1443039.12
SS          0.00
T     9966998.93
TH    2640009.32
ls        500.00

I get this output using this code:

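import pandas as pd

mydata = pd.read_csv('Data.csv')
# Group on the 'User' column (the name shown in the output above) and sum each group.
out = mydata.groupby(['User']).sum()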
print(out)

I'd now like to be able to write if statements against this object. Something like:

if out contains User 'CN'
    varX = Value for 'CN'

Because this is now a DataFrame, I am having trouble assigning the Value for a specific user to a variable.

@yatu I have not looked into that as an option yet, but I will now — Newb 4 You BB, Jun 17 '19 at 15:51
Check https://stackoverflow.com/questions/39922986/pandas-group-by-and-sum — yatu, Jun 17 '19 at 15:52

score 1 · Accepted Answer · answered Jun 17 '19 at 15:54

You can do the following:

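import pandas as pd

my_data = pd.read_csv('Data.csv')
# The method is groupby (no underscore); the column name matches the 'User' header.
totals = my_data.groupby('User').sum()
print(totals)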
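
To cover the follow-up in the question's EDIT 1 (reading one user's total into a variable), here is a minimal sketch. It assumes the header is exactly `User,Value` with no space after the comma, and it reads the question's sample rows from an in-memory string instead of Data.csv. The names `sample`, `totals`, `var_x`, and `missing` are illustrative only.

```python
import io

import pandas as pd

# Sample rows from the question, held in memory so the sketch runs without Data.csv.
# Assumption: the header has no space after the comma.
sample = io.StringIO(
    "User,Value\n"
    "CN,500.00\n"
    "CN,-250.00\n"
    "CN,360.00\n"
    "PT,200.00\n"
    "PT,230.00\n"
)

df = pd.read_csv(sample)

# Selecting the 'Value' column before summing gives a Series indexed by User.
totals = df.groupby("User")["Value"].sum()

# Membership is tested against the index, then the value is read by label.
if "CN" in totals.index:
    var_x = totals.loc["CN"]
    print(var_x)

# Series.get returns a default instead of raising KeyError for an unknown user.
missing = totals.get("ZZ", 0.0)
```

Selecting the single column first means `totals.loc["CN"]` yields a scalar rather than a one-row DataFrame, which is usually easier to use in an if statement.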

Naik


score 0 · Answer 2 · answered Jun 17 '19 at 15:54

csv.reader yields each row as a list of strings, so call startswith on the first element of the row:

import csv

with open("Data.csv", newline='') as data:
    reader = csv.reader(data)
    for row in reader:
        if row[0].startswith('CN'):
            print("heres one")
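
As a small illustration of why indexing is needed, the sketch below reads a few of the question's sample rows from an in-memory string (an assumption made so it runs without Data.csv) and collects the rows for one user. It uses an exact comparison instead of startswith, since startswith('CN') would also match a user such as 'CNX'. The names `sample`, `header`, and `cn_rows` are illustrative.

```python
import csv
import io

# A few sample rows from the question, held in memory instead of Data.csv.
sample = io.StringIO("User,Value\nCN,500.00\nCN,-250.00\nPT,200.00\n")

reader = csv.reader(sample)
header = next(reader)  # first row holds the column names

# Each row is a list of strings, so the user code is row[0].
cn_rows = [row for row in reader if row[0] == "CN"]

print(header)
print(cn_rows)
```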

cccnrc


score 0 · Answer 3 · answered Jun 17 '19 at 15:56

Using collections.defaultdict

Ex:

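import csv
from collections import defaultdict

# Running total per user; a missing key starts at 0.
result = defaultdict(int)
with open("Data.csv", newline='') as data:
    reader = csv.reader(data)
    next(reader)  # skip the header row
    for row in reader:
        # row[0] is the user, row[1] is the value as a string
        result[row[0]] += float(row[1])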

print(result)

Output

defaultdict(<class 'int'>, {'CN': 610.0, 'PT': 430.0})
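
Since the title also mentions sorting, here is a hedged sketch of one way to order the totals and read a single user's sum. It starts from the totals shown in the output above rather than re-reading the file, and the names `by_total`, `cn_total`, and `zz_total` are illustrative.

```python
from collections import defaultdict

# Totals for the question's sample rows, as in the output above.
result = defaultdict(int, {"CN": 610.0, "PT": 430.0})

# Sort users by their total, largest first.
by_total = sorted(result.items(), key=lambda item: item[1], reverse=True)
for user, total in by_total:
    print(f"{user},{total:.2f}")

# .get reads a value without inserting a new key;
# result["ZZ"] would add 'ZZ' with a value of 0 to the defaultdict.
cn_total = result.get("CN")
zz_total = result.get("ZZ", 0.0)
```

Sorting by `item[0]` instead would order the output by user code.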
