6

In Python, a CSV file can be read line by line (row by row). I want to read one specific line (line 24, for example) without reading the whole file and all of its lines.

Mazdak
user3967257
  • possible duplicate of [Start reading and writing on specific line on CSV with Python](http://stackoverflow.com/questions/11618207/start-reading-and-writing-on-specific-line-on-csv-with-python) – GhitaB Jun 21 '15 at 11:59

2 Answers

9

You can use linecache.getline:

linecache.getline(filename, lineno[, module_globals])

Get line lineno from file named filename. This function will never raise an exception — it will return '' on errors (the terminating newline character will be included for lines that are found).

import linecache


line = linecache.getline("foo.csv", 24)  # line numbers are 1-based
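Since the question is about a CSV file, the returned line can then be split into fields with the csv module. The following is a minimal sketch, assuming a comma-delimited file in which no record spans several physical lines; the helper name `get_csv_row` is hypothetical. Note that linecache reads and caches the whole file on first access, so it saves you from writing the loop but does not avoid reading the file.

```python
import csv
import linecache


def get_csv_row(filename, lineno):
    """Return line `lineno` (1-based) of `filename` parsed into a list of fields.

    Returns None if the line does not exist or the file cannot be read,
    because linecache.getline returns '' instead of raising.
    """
    line = linecache.getline(filename, lineno)
    if not line:
        return None
    # csv.reader accepts any iterable of strings, so wrap the single line in a list.
    return next(csv.reader([line]), None)
```

Calling `get_csv_row("foo.csv", 24)` would give the fields of line 24 as a list of strings, or None when the file has fewer than 24 lines.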

Or use the consume recipe from the itertools documentation to advance the file iterator:

import collections
from itertools import islice

def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

with open("foo.csv") as f:
    consume(f, 23)  # skip the first 23 lines
    line = next(f)  # line 24
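The same idea can be applied to a csv.reader so that the result is already split into fields. This is a minimal sketch, not part of the recipe above; the function name `read_csv_row` is hypothetical. It counts CSV records rather than physical lines, which differs when a quoted field contains a newline. The file is still read sequentially up to the requested row, but only that row is kept and reading stops there.

```python
import csv
from itertools import islice


def read_csv_row(filename, row_number):
    """Return record `row_number` (1-based) as a list of fields, or None if absent."""
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        # islice skips the first row_number - 1 records and yields the next one.
        return next(islice(reader, row_number - 1, row_number), None)
```

For example, `read_csv_row("foo.csv", 24)` would return the 24th record, or None if the file is shorter.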
Padraic Cunningham
  • @xtofl, a file object is its own iterator; when you write `for line in f: ...`, `next` is called repeatedly – Padraic Cunningham Jun 21 '15 at 12:12
  • 1
    And to start reading from a specific line rather than from the beginning? Does it work to simply call consume(f, X) and increment X each time (initializing X to the desired position)? Thanks for your useful answer :) – user3967257 Jun 21 '15 at 12:37
  • @user3967257, use the consume recipe if you want to start from a certain line. The second argument to consume is the number of lines to skip; then just use `for line in f: ...` to read the rest of the lines. – Padraic Cunningham Jun 21 '15 at 12:40
  • This is what I mean: `for i in range(X, limit): consume(f, i)` – user3967257 Jun 21 '15 at 12:47
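Reading a range of lines, as discussed in the comments above, can be sketched with a single islice call. This is an illustration under assumptions: the function name `read_lines` is hypothetical, and line numbers are taken to be 1-based and inclusive. Calling consume(f, i) inside a loop with a growing i would skip i additional lines on every iteration, so the skip is done once here.

```python
from itertools import islice


def read_lines(filename, start, stop):
    """Yield lines start..stop (1-based, inclusive) without the trailing newline."""
    with open(filename) as f:
        # islice uses 0-based, half-open bounds: [start - 1, stop).
        for line in islice(f, start - 1, stop):
            yield line.rstrip("\n")
```

For example, `list(read_lines("foo.csv", 24, 26))` would return lines 24, 25 and 26, or fewer if the file ends earlier.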
0

Alternatively, you can use the nrows and skiprows arguments of pandas.read_csv:

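import pandas as pd
line_number = 30
# header=None keeps line 30 as data; by default it would be consumed as the header row
pd.read_csv('big.csv.gz', sep="\t", nrows=1, skiprows=line_number - 1, header=None)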

Remember that skiprows can also be a list, so if you need to keep the header row, use:

pd.read_csv('big.csv.gz', sep = "\t", nrows = 1, skiprows = list(range(1, line_number - 1)))
Areza