Reading rows from a CSV file in Python

Question

I have a CSV file; here is a sample of what it looks like:

Year:  Dec: Jan:
1      50   60
2      25   50
3      30   30
4      40   20
5      10   10

I know how to read the file in and print each column (e.g. ['Year', '1', '2', '3', ...]). But what I actually want to do is read the rows, which would look like ['Year', 'Dec', 'Jan'], then ['1', '50', '60'], and so on.

I would then like to store those numbers ['1', '50', '60'] in variables so I can total them later, for example:

Year_1 = ['50', '60']. Then I can do sum(Year_1) = 110.

How would I go about doing that in Python 3?

score 122 · Answer 1 · edited Jan 29 '19 at 18:58

Use the csv module:

import csv

with open("test.csv", "r") as f:
    reader = csv.reader(f, delimiter="\t")
    for i, line in enumerate(reader):
        print('line[{}] = {}'.format(i, line))

Output:

line[0] = ['Year:', 'Dec:', 'Jan:']
line[1] = ['1', '50', '60']
line[2] = ['2', '25', '50']
line[3] = ['3', '30', '30']
line[4] = ['4', '40', '20']
line[5] = ['5', '10', '10']
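
Inside the loop above, `line` is already a single row, so `line[0]` is that row's first field rather than the first row. To pick out whole rows by position, one option is to collect them into a list first. This is only a minimal sketch: the in-memory `sample` string stands in for `test.csv` and is assumed to be tab-separated, matching the `delimiter="\t"` used in the answer.

```python
import csv
import io

# Stand-in for test.csv (an assumption): part of the question's sample, tab-separated.
sample = (
    "Year:\tDec:\tJan:\n"
    "1\t50\t60\n"
    "2\t25\t50\n"
)

# csv.reader accepts any iterable of lines, so a StringIO behaves like an open file.
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

header = rows[0]       # the header row
first_year = rows[1]   # the first data row

print(header)
print(first_year)
print(first_year[1:])  # month values only, still strings
```

The values are still strings at this point, so they would need converting with `int()` before being summed.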

edited Jan 29 '19 at 18:58

Jemshit

answered Nov 17 '12 at 06:48

Joel Cornett

How would I make it so it prints the lines separately and not all together (ex. line 0 = ['Year:', 'Dec:', 'Jan:']), I tried print (line[0]) but it didn't work. – Goose Nov 17 '12 at 06:57
I was getting the following error in python3: `iterator should return strings, not bytes (did you open the file in text mode?)` and solved it by changing `rb` to `rt`. – J0ANMM Feb 19 '18 at 17:24
@J0ANMM Good callout. This answer was written at a time when Python 3 did not have as wide adoption and was thus implicitly targeted to Python 2. I will update the answer accordingly. – Joel Cornett Feb 19 '18 at 17:58

Ashwini Chaudhary · Accepted Answer · 2019-03-30T19:02:04.360

44

You could do something like this:

with open("data1.txt") as f:
    lis = [line.split() for line in f]        # create a list of lists
    for i, x in enumerate(lis):               # print the list items
        print("line{0} = {1}".format(i, x))

# output 
line0 = ['Year:', 'Dec:', 'Jan:']
line1 = ['1', '50', '60']
line2 = ['2', '25', '50']
line3 = ['3', '30', '30']
line4 = ['4', '40', '20']
line5 = ['5', '10', '10']

or :

with open("data1.txt") as f:
    for i, line in enumerate(f):
        print("line {0} = {1}".format(i, line.split()))

# output         
line 0 = ['Year:', 'Dec:', 'Jan:']
line 1 = ['1', '50', '60']
line 2 = ['2', '25', '50']
line 3 = ['3', '30', '30']
line 4 = ['4', '40', '20']
line 5 = ['5', '10', '10']

Edit:

with open('data1.txt') as f:
    print("{0}".format(f.readline().split()))
    for x in f:
        x = x.split()
        print("{0} = {1}".format(x[0], sum(map(int, x[1:]))))

# output          
['Year:', 'Dec:', 'Jan:']
1 = 110
2 = 75
3 = 60
4 = 60
5 = 20
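
If the totals need to be kept rather than printed, a dictionary keyed by year is usually easier to work with than separate variables such as `year1`, `year2`, and so on. The following is a sketch under assumptions: the `sample` object stands in for `data1.txt`, and the name `totals` is made up for illustration.

```python
import io

# Stand-in for data1.txt (an assumption): the question's whitespace-separated sample.
sample = io.StringIO(
    "Year:  Dec: Jan:\n"
    "1      50   60\n"
    "2      25   50\n"
    "3      30   30\n"
    "4      40   20\n"
    "5      10   10\n"
)

totals = {}                          # hypothetical name: year -> sum of its months
header = sample.readline().split()   # consume the header line first
for line in sample:
    fields = line.split()
    if not fields:                   # skip blank lines
        continue
    year, months = fields[0], fields[1:]
    totals[year] = sum(int(value) for value in months)

print(totals["1"])
print(totals)
```

Each total can then be looked up by its year, for example `totals["3"]`.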

edited Mar 30 '19 at 19:02

answered Nov 17 '12 at 06:59

Ashwini Chaudhary

See the comment I left for @Joel Cornett's answer – Goose Nov 17 '12 at 07:50
@Goose you can do `lis[0]` to get line 0, see my edited answer. – Ashwini Chaudhary Nov 17 '12 at 07:53
Ok got it, now how can I find the element in lis[0]? For example, I need to total the month numbers (50+60) so for year 1 it would be 110. lis[0][0] doesn't work for me. That was my main goal. – Goose Nov 17 '12 at 20:36
@Goose see my edited answer. You didn't mention this at all in the original question. – Ashwini Chaudhary Nov 17 '12 at 20:48
Sorry, I thought once I could read the columns I could figure it out myself. But your edited method isn't working for my "actual" file for some reason. See: http://i.imgur.com/EORK2.png. What I was trying to do is store each of the totals in a variable, so year1 = 110, etc. I'm not trying to just print it out; sorry for being so vague. I thought it would've been easier to do when I posted the question. – Goose Nov 17 '12 at 21:36
@Goose better post some content of the original file in the question body. – Ashwini Chaudhary Nov 17 '12 at 22:17
It's probably easier to give you the file since it is quite large: http://www.fileswap.com/dl/JYRBewhgvE/ – Goose Nov 17 '12 at 22:21

The Unfun Cat · Answer 3 · 2017-01-26T08:33:55.530

21

Reading it columnwise is harder?

Anyway this reads the line and stores the values in a list:

with open("csvfile.csv") as f:
    for line in f:
        csv_row = line.split()  # returns a list, e.g. ["1", "50", "60"]
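
One caveat with splitting lines by hand: a value that itself contains the separator gets broken apart, whereas `csv.reader` honours quoting. A small sketch of the difference, using a made-up comma-separated row (not taken from the question's file):

```python
import csv

# Hypothetical row with a comma inside a quoted field.
line = '1,"Smith, John",60'

naive = line.split(",")            # splits inside the quotes as well
parsed = next(csv.reader([line]))  # csv.reader takes any iterable of lines

print(naive)   # ['1', '"Smith', ' John"', '60']
print(parsed)  # ['1', 'Smith, John', '60']
```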

Modern solution:

# pip install pandas
import pandas as pd 
df = pd.read_table("csvfile.csv", sep=" ")

edited Jan 26 '17 at 08:33

answered Nov 17 '12 at 06:39

The Unfun Cat

When I implement this into my program I get an error: 'list' object has no attribute 'split' – Goose Nov 17 '12 at 06:55
Works like a charm here on 2.7 and 3.3 – The Unfun Cat Nov 17 '12 at 07:00
Perhaps it's my file; the text above is just a sample, the real file is much bigger. – Goose Nov 17 '12 at 07:11
Size hasn't got anything to do with it. It is with your program, which we would need to see to help you further :) – The Unfun Cat Nov 17 '12 at 07:13
And what if a value in the line contains the split character? – Alexandre Nucera Nov 10 '16 at 01:06
Then one of the other solutions is better. Of course, these days I would do it in <3 Pandas <3 – The Unfun Cat Nov 11 '16 at 07:47
in 2023, it should be `df = pd.read_csv('csvfile.csv', sep=' ', header=0)` – Raptor Mar 22 '23 at 08:13

score 8 · Answer 4 · answered May 23 '20 at 16:55

The easiest way is this:

from csv import reader

# open file in read mode
with open('file.csv', 'r') as read_obj:
    # pass the file object to reader() to get the reader object
    csv_reader = reader(read_obj)
    # Iterate over each row in the csv using reader object
    for row in csv_reader:
        # row variable is a list that represents a row in csv
        print(row)

output:
['Year:', 'Dec:', 'Jan:']
['1', '50', '60']
['2', '25', '50']
['3', '30', '30']
['4', '40', '20']
['5', '10', '10']
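
Two small refinements may be worth considering. The `csv` documentation recommends opening files with `newline=''`, and `next()` can consume the header so that the remaining rows can be converted to numbers. The sketch below writes its own temporary `file.csv` so that it is self-contained; the temporary directory and the variable names are assumptions made for illustration.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a small comma-separated file to read back (illustrative data only).
    path = os.path.join(tmp, "file.csv")
    with open(path, "w", newline="") as out:
        csv.writer(out).writerows([["Year:", "Dec:", "Jan:"], ["1", "50", "60"]])

    # newline='' lets the csv module handle line endings itself.
    with open(path, newline="") as read_obj:
        csv_reader = csv.reader(read_obj)
        header = next(csv_reader)  # consume the header row
        data = [[int(value) for value in row] for row in csv_reader]

print(header)
print(data)
```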

score 5 · Answer 5 · edited Dec 16 '18 at 02:34

import csv

with open('filepath/filename.csv', "rt", encoding='ascii') as infile:
    read = csv.reader(infile)
    for row in read:
        print(row)

This will solve your problem. Don't forget to give the encoding.
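
The encoding has to match how the file was actually written; `'ascii'` only works when the file contains no non-ASCII characters. A rough sketch of what can go wrong, using a temporary file with made-up contents:

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "names.csv")  # hypothetical file
    with open(path, "w", encoding="utf-8", newline="") as out:
        csv.writer(out).writerow(["caf\u00e9", "50"])  # contains a non-ASCII character

    # Reading UTF-8 data as ASCII fails on the first non-ASCII byte.
    try:
        with open(path, "rt", encoding="ascii", newline="") as infile:
            list(csv.reader(infile))
        ascii_ok = True
    except UnicodeDecodeError:
        ascii_ok = False

    # Reading with the encoding the file was written in works.
    with open(path, "rt", encoding="utf-8", newline="") as infile:
        rows = list(csv.reader(infile))

print(ascii_ok)
print(rows)
```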

edited Dec 16 '18 at 02:34

Blairg23

answered Aug 03 '16 at 11:20

prashasthbaliga

seems incorrect access mode. it should be `r+` , not `rt` – mootmoot Jan 20 '17 at 10:49

score 4 · Answer 6 · answered Jul 24 '16 at 10:33

# This program reads columns in a csv file
import csv
ifile = open('years.csv', "r")
reader = csv.reader(ifile)

# initialization and declaration of variables
rownum = 0
year = 0
dec = 0
jan = 0
total_years = 0

for row in reader:
    if rownum == 0:
        header = row  # work with header row if you like
    else:
        colnum = 0
        for col in row:
            if colnum == 0:
                year = float(col)
            if colnum == 1:
                dec = float(col)
            if colnum == 2:
                jan = float(col)
            colnum += 1
    # end of if structure

    # now we can process results
    if rownum != 0:
        print(year, dec, jan)
        total_years = total_years + year
        print(total_years)

    # time to go after the next row/bar
    rownum += 1

ifile.close()

A bit late, but nonetheless... You need to create a CSV file named "years.csv" with the following content:

Year Dec Jan
1 50 60
2 25 50
3 30 30
4 40 20
5 10 10

You forgot to indent the code block after `else`. But nice solution. — Joey, Apr 30 '19 at 16:12

score 4 · Answer 7 · edited Jun 20 '20 at 09:12

Example:

import pandas as pd

data = pd.read_csv('data.csv')

# read row line by line
for d in data.values:
  # read column by index
  print(d[2])

edited Jun 20 '20 at 09:12

Community

answered Jun 02 '19 at 06:14

bikram

score 2 · Answer 8 · answered Sep 02 '19 at 09:45

The csv module handles csv files by row. If you want to handle it by column, pandas is a good solution.

Besides that, there are two ways to get all (or specific) columns with plain Python code.

1. csv.DictReader

import csv

with open('demo.csv') as file:
    data = {}
    for row in csv.DictReader(file):
        for key, value in row.items():
            if key not in data:
                data[key] = []
            data[key].append(value)

It is easy to understand.
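
The `if key not in data` check can also be handled by `collections.defaultdict`, which creates the empty list on first access. This is just one possible variation; the in-memory `demo` object below is an assumption standing in for `demo.csv`.

```python
import csv
import io
from collections import defaultdict

# Stand-in for demo.csv (an assumption), comma-separated with a header row.
demo = io.StringIO("a,b,c\n1,2,3\n4,5,6\n7,8,9\n")

data = defaultdict(list)  # missing keys start out as an empty list
for row in csv.DictReader(demo):
    for key, value in row.items():
        data[key].append(value)

print(dict(data))
```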

2. csv.reader with zip

import csv

with open('demo.csv') as file:
    data = {values[0]: values[1:] for values in zip(*csv.reader(file))}

This is less clear, but efficient.

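zip(x, y, z) transposes (x, y, z), where x, y and z are lists. *csv.reader(file) unpacks the rows, header row included, into the arguments (x, y, z) for zip, so each resulting tuple is one column that starts with its column name.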
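
To make the transposition concrete, here is a small sketch with the rows written out by hand instead of read from a file. The names `rows`, `columns` and `by_name` are made up for illustration.

```python
# Rows as csv.reader would yield them, header first (hand-written for illustration).
rows = [
    ["a", "b", "c"],
    ["1", "2", "3"],
    ["4", "5", "6"],
]

# zip(*rows) pairs up the i-th item of every row, which yields the columns.
columns = list(zip(*rows))
print(columns)  # [('a', '1', '4'), ('b', '2', '5'), ('c', '3', '6')]

# The first item of each column is its name; the rest are its values.
by_name = {column[0]: column[1:] for column in columns}
print(by_name)  # {'a': ('1', '4'), 'b': ('2', '5'), 'c': ('3', '6')}
```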

Demo Result

The content of demo.csv:

a,b,c
1,2,3
4,5,6
7,8,9

The result of 1:

>>> print(data)
{'c': ['3', '6', '9'], 'b': ['2', '5', '8'], 'a': ['1', '4', '7']}

The result of 2:

>>> print(data)
{'c': ('3', '6', '9'), 'b': ('2', '5', '8'), 'a': ('1', '4', '7')}

score 0 · Answer 9 · edited Mar 30 '19 at 15:24

One can do it using pandas library.

Example:

import numpy as np
import pandas as pd

file = r"C:\Users\unknown\Documents\Example.csv"
df1 = pd.read_csv(file)
df1.head()

edited Mar 30 '19 at 15:24

Szymon Maszke

answered Mar 30 '19 at 13:31

srikanth reddy

score 0 · Answer 10 · answered Apr 22 '20 at 18:09

I'll just leave my solution here.

import csv
import numpy as np

with open(name, newline='') as f:
    reader = csv.reader(f, delimiter=",")
    # skip header
    next(reader)
    # convert csv to list, then to np.array, then to numbers (csv yields strings)
    data = np.array(list(reader))[:, 1:].astype(float)  # skip the first column

print(data.shape) # => (N, 2)

# sum each row
s = data.sum(axis=1)
print(s.shape) # => (N,)
