29

I am probably making a stupid mistake, but I can't find where it is. I want to count the number of lines in my CSV file. I wrote this, and it obviously isn't working: I get row_count = 0 when it should be 400. Cheers.

f = open(adresse,"r")
reader = csv.reader(f,delimiter = ",")
data = [l for l in reader]
row_count = sum(1 for row in reader)

print row_count
ballade4op52
Dirty_Fox
  • Possible duplicate of [Count how many lines are in a CSV Python?](http://stackoverflow.com/questions/16108526/count-how-many-lines-are-in-a-csv-python) – AjayKumarBasuthkar Mar 16 '16 at 14:26
  • Does this answer your question? [How to get line count of a large file cheaply in Python?](https://stackoverflow.com/questions/845058/how-to-get-line-count-of-a-large-file-cheaply-in-python) – Matthew Strawbridge Apr 08 '20 at 15:52
  • The reason this happens is because the reader has "emptied" itself by creating the `data` list. The reader object provides a one-time for-loop, once you've worked through it, it's gone. That's why the row_count is being read as 0: there's nothing left in the reader at that point. – Erdős-Bacon Jun 26 '20 at 01:19

7 Answers

43
import csv

with open(adresse,"r") as f:
    reader = csv.reader(f,delimiter = ",")
    data = list(reader)
    row_count = len(data)

You were trying to read the file twice: by the time the data list had been built, the file pointer had already reached the end of the file, so the second pass found nothing left to count.
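A minimal sketch of why the second pass comes up empty. It uses an in-memory `io.StringIO` with made-up sample data in place of the real file; both are assumptions for illustration only.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO("a,b\n1,2\n3,4\n")

reader = csv.reader(sample, delimiter=",")
data = [row for row in reader]        # first pass consumes every row
leftover = sum(1 for row in reader)   # second pass: nothing is left to read

print(len(data))   # 3
print(leftover)    # 0
```

Counting from `data` (as in the fix above) avoids the second pass entirely.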

jamylak
  • Just a note: if you do list on the reader you will lose its advantage of being generator. – MikeL Jun 21 '20 at 19:56
  • This reads potentially a lot of data into memory (although briefly) by creating that list. I think it's better to just do something like `entry_count = sum(1 for row in reader)` if we want entry count or `line_count = sum(1 for line in f)` if we want to count all rows in file (including header line). – Erdős-Bacon Jun 26 '20 at 01:16
  • @Erdős-Bacon I was just fixing the OP's code but that's correct – jamylak Jun 28 '20 at 23:01
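The two counts suggested in the comments can differ when a quoted field contains a line break. Here is a small sketch of both, assuming made-up in-memory data (the `text` value is hypothetical) rather than a real file:

```python
import csv
import io

# Hypothetical data: a header plus two records, one of which has a quoted
# field containing a newline, so it spans two physical lines.
text = 'name,comment\nalice,"line one\nline two"\nbob,ok\n'

# Count parsed CSV records without building a list.
entry_count = sum(1 for row in csv.reader(io.StringIO(text)))

# Count physical lines in the same data.
line_count = sum(1 for line in io.StringIO(text))

print(entry_count)  # 3 (header + 2 records)
print(line_count)   # 4
```

For files without embedded newlines the two numbers agree, and neither approach keeps the rows in memory.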
7

First you have to open the file with open:

input_file = open("nameOfFile.csv","r")

Then use csv.reader to read the CSV:

reader_file = csv.reader(input_file)

Finally, you can get the number of rows with len:

value = len(list(reader_file))

The full code is as follows:

import csv

input_file = open("nameOfFile.csv","r")
reader_file = csv.reader(input_file)
value = len(list(reader_file))

Remember that if you want to reuse the CSV file, you have to call input_file.seek(0) first, because turning reader_file into a list reads the whole file and leaves the file pointer at the end.
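A short sketch of rewinding the file before reading it again. The temporary file and its contents are assumptions made only so the example is self-contained:

```python
import csv
import os
import tempfile

# Write a small hypothetical CSV file so the example can run on its own.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("a,b\n1,2\n3,4\n")
    path = tmp.name

with open(path, "r", newline="") as input_file:
    first_count = len(list(csv.reader(input_file)))   # reads to the end of the file
    input_file.seek(0)                                 # rewind to the start
    second_count = len(list(csv.reader(input_file)))  # a new reader sees every row again

os.remove(path)  # clean up the temporary file

print(first_count, second_count)  # 3 3
```

Without the `seek(0)` call, the second count would be 0.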

Community
protti
2

If you are working with Python 3 and have the pandas library installed, you can use:

import pandas as pd

results = pd.read_csv('f.csv')
print(len(results))  # data rows only: read_csv treats the first line as the header by default
Hadi GhahremanNezhad
2

I would consider using a generator. It does the job without loading the whole file into memory, which keeps you safe from a MemoryError on large files.

def generator_count_file_rows(input_file):
    # Yield one line at a time; the with block closes the file afterwards.
    with open(input_file,'r') as f:
        for row in f:
            yield row

And then

count = 0  # must be initialised before the loop
for row in generator_count_file_rows('very_large_set.csv'):
    count += 1
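The same idea can be folded into one function. This is only a sketch: `count_file_rows` is a hypothetical helper name, the temporary file is made up for the demonstration, and the function counts physical lines rather than parsed CSV records.

```python
import os
import tempfile

def count_file_rows(path):
    """Count the lines in a file one at a time, without loading it all."""
    with open(path, "r") as f:
        return sum(1 for _ in f)

# Hypothetical small file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("h1,h2\n1,2\n3,4\n")
    path = tmp.name

count = count_file_rows(path)
os.remove(path)  # clean up the temporary file

print(count)  # 3 (header line included)
```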
hellbreak
1

The important part is hidden in the comments on the accepted answer.

I'm re-sharing Erdős-Bacon's solution here for better visibility.

Why? Because it saves a lot of memory by not creating a list.

So I think it is better to do it this way:


import csv

def read_raw_csv(file_name):
    with open(file_name, 'r') as file:
        csvreader = csv.reader(file)

        # count number of rows
        entry_count = sum(1 for row in csvreader)
        print(entry_count-1)  # -1 is for discarding header row.

Check out this link for more info

Shakeel
0
# with built-in libraries
from csv import reader

opened_file = open('f.csv')
read_file = reader(opened_file)
apps_data = list(read_file)
opened_file.close()

rowcount = len(apps_data)  # which includes the header row

print("Total rows including header: " + str(rowcount))
MuraliK
-6

Simply open the CSV file in Notepad++; it shows the total line count in a jiffy. :) Or, at the Windows command prompt, go to the file's folder and run find /c /v "some meaningless string" Filename.csv, which counts the lines that do not contain that string.