How can I skip the header row and start reading a file from line 2?
8 Answers
with open(fname) as f:
    next(f)  # skip the header line
    for line in f:
        pass  # do something

- If you need the header later, instead of `next(f)` use `f.readline()` and store it as a variable. – damned Oct 08 '13 at 05:38
- Or use `header_line = next(f)`. – Samuel Jan 06 '15 at 23:41
- Mine complains `'file' object is not an iterator` if I use the `next(f)` approach. Instead, `f.readline()` worked. – soMuchToLearnAndShare Sep 27 '20 at 20:11
- @soMuchToLearnAndShare Which Python version did you use? It's better to provide information for reproducing the error. – Quan Hoang Oct 26 '21 at 17:22
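
A minimal sketch of the variant suggested in the comments above: keep the header by assigning the result of `next(f)`, and pass a default so that an empty file does not raise `StopIteration`. The helper name `read_with_header` and the sample text are hypothetical, and `io.StringIO` stands in for a real file opened with `open(fname)`.

```python
import io

def read_with_header(f):
    """Return (header, rows) from an open text file or file-like object."""
    # next() with a default returns None instead of raising StopIteration
    # when the file is empty.
    header = next(f, None)
    # The iterator continues from the second line onward.
    rows = [line.rstrip("\n") for line in f]
    return header, rows

# io.StringIO stands in for open(fname) so the sketch runs without a file on disk.
sample = io.StringIO("name,score\nalice,3\nbob,5\n")
header, rows = read_with_header(sample)
print(header.rstrip("\n"))
print(rows)
```

The header keeps its trailing newline, just as it would with `f.readline()`, so strip it before use if needed.
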
with open(fname, 'r') as f:
    lines = f.readlines()[1:]

- @LjubisaLivac is right - this answer generalises to any line, so this is a much more powerful solution. – Daniel Soutar Jan 25 '18 at 23:20
- This is fine UNTIL the file is too large to read. This is fine for small files. – CppLearner Feb 05 '18 at 02:05
- The slice also builds a *copy* of the contents. This is just unnecessarily inefficient. – chepner Dec 24 '19 at 13:43
- What about using `consume()` from `more-itertools` as stated in https://docs.python.org/3/library/itertools.html#itertools-recipes ? I heard about this on https://stackoverflow.com/questions/11113803 – AnotherParker Jun 05 '20 at 20:32
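
For reference, here is a sketch of the `consume` recipe from the itertools documentation mentioned in the last comment, applied to skipping header lines. Unlike `readlines()[1:]`, it never builds a list of the whole file. The sample text is made up, and `io.StringIO` stands in for a real file.

```python
import collections
import io
from itertools import islice

def consume(iterator, n=None):
    """Advance the iterator n steps ahead; if n is None, consume it entirely."""
    if n is None:
        # Feed the entire iterator into a zero-length deque.
        collections.deque(iterator, maxlen=0)
    else:
        # Advance to the empty slice starting at position n.
        next(islice(iterator, n, n), None)

# io.StringIO stands in for open(fname).
f = io.StringIO("header 1\nheader 2\nrow 1\nrow 2\n")
consume(f, 2)  # skip the two header lines without storing them
remaining = [line.strip() for line in f]
print(remaining)
```
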
If you need the first line and then want to process the rest of the file, this code will be helpful.
with open(filename, 'r') as f:
    first_line = f.readline()
    for line in f:
        pass  # Perform some operations

- It is not necessary to assign readline() to a variable if one does not need this line. I like this solution most, however. – Anna Apr 30 '19 at 20:37
- Mixing direct reads with using the file as an iterator isn't recommended (although in this specific case no harm is done). – chepner Dec 24 '19 at 13:44
If slicing could work on iterators...
from itertools import islice
with open(fname) as f:
    for line in islice(f, 1, None):
        pass

- This is a really nice and pythonic way of solving the problem and can be extended to an arbitrary number of header lines – Dai Feb 05 '18 at 17:30
- This solution is really good. This even works for in-memory uploaded file while iterating over file object. – haccks Nov 04 '20 at 12:40
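
As a rough illustration of both comments, the `islice` approach can take the number of header lines as a parameter and works on any iterable of lines, including an in-memory object. The function name `skip_lines` and the sample data are hypothetical.

```python
import io
from itertools import islice

def skip_lines(f, n_header_lines=1):
    """Lazily yield the lines of f after the first n_header_lines."""
    return islice(f, n_header_lines, None)

# An in-memory file-like object, standing in for an uploaded file.
uploaded = io.StringIO("title\nunits\n1,2\n3,4\n")
data = [line.strip() for line in skip_lines(uploaded, n_header_lines=2)]
print(data)
```

If the input has fewer lines than `n_header_lines`, the result is simply empty, with no exception raised.
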
with open(fname) as fh:
    f = fh.readlines()  # reads the whole file and closes it afterwards
firstLine = f.pop(0)  # removes the first line
for line in f:
    ...

- This will read the entire file into memory at once, so it's only practical if you're reading a fairly small file. – Hayden Schiff Dec 04 '18 at 03:56
To generalize the task of reading multiple header lines and to improve readability, I'd use method extraction. Suppose you wanted to tokenize the first two lines of `coordinates.txt` to use as header information.
Example
coordinates.txt
---------------
Name,Longitude,Latitude,Elevation, Comments
String, Decimal Deg., Decimal Deg., Meters, String
Euler's Town,7.58857,47.559537,0, "Blah"
Faneuil Hall,-71.054773,42.360217,0
Yellowstone National Park,-110.588455,44.427963,0
Then method extraction allows you to specify what you want to do with the header information (in this example we simply tokenize the header lines based on the comma and return it as a list but there's room to do much more).
def __readheader(filehandle, numberheaderlines=1):
    """Reads the specified number of lines and yields the comma-delimited
    strings on each line as a list"""
    for _ in range(numberheaderlines):
        # Build a real list so that printing shows the tokens in Python 3
        yield [token.strip() for token in filehandle.readline().strip().split(',')]

with open('coordinates.txt', 'r') as rh:
    # Single header line
    #print(next(__readheader(rh)))
    # Multiple header lines
    for headerline in __readheader(rh, numberheaderlines=2):
        print(headerline)  # Or do other stuff with headerline tokens
Output
['Name', 'Longitude', 'Latitude', 'Elevation', 'Comments']
['String', 'Decimal Deg.', 'Decimal Deg.', 'Meters', 'String']
If `coordinates.txt` contains another header line, simply change `numberheaderlines`. Best of all, it's clear what `__readheader(rh, numberheaderlines=2)` is doing, and we avoid the ambiguity of having to figure out or comment on why the author of the accepted answer uses `next()` in his code.
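
A small sketch of how the header helper combines with reading the data rows that follow. It assumes a Python 3 variant of the function under the hypothetical name `read_header`, and uses `io.StringIO` with an abbreviated copy of the example file in place of `open('coordinates.txt')`.

```python
import io

def read_header(filehandle, number_header_lines=1):
    """Yield each header line as a list of stripped, comma-separated tokens."""
    for _ in range(number_header_lines):
        yield [token.strip() for token in filehandle.readline().split(",")]

# Stand-in for open('coordinates.txt'); contents abbreviated from the example.
rh = io.StringIO(
    "Name,Longitude,Latitude,Elevation, Comments\n"
    "String, Decimal Deg., Decimal Deg., Meters, String\n"
    "Faneuil Hall,-71.054773,42.360217,0\n"
)

# list() drives the generator, so both header lines are read here.
headers = list(read_header(rh, number_header_lines=2))

# The handle is now positioned at the first data row.
first_row = rh.readline().strip().split(",")
print(headers)
print(first_row)
```

Because the helper is a generator, no line is read until it is iterated. Calling it without consuming the result leaves the file position unchanged, so the headers would then be read as data.
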

If you want to read multiple CSV files starting from line 2, this works like a charm:
import csv

for files in csv_file_list:
    with open(files, 'r') as r:
        next(r)  # skip headers
        rr = csv.reader(r)
        for row in rr:
            pass  # do something
(this is part of Parfait's answer to a different question)
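
One possible refinement, sketched under the assumption that the header may contain a quoted field spanning several physical lines: call `next()` on the `csv.reader` rather than on the file, so that one whole CSV record is skipped instead of one physical line. The helper name `read_rows` and the sample data are hypothetical.

```python
import csv
import io

def read_rows(f):
    """Return (header, rows) parsed by csv.reader from an open text file."""
    reader = csv.reader(f)
    # Skips one full CSV record; returns None if the file is empty.
    header = next(reader, None)
    return header, list(reader)

# The header's second field is quoted and contains a newline.
sample = io.StringIO('id,"note\nwith newline"\n1,first\n2,second\n')
header, rows = read_rows(sample)
print(header)
print(rows)
```

With `next(r)` on the raw file, only `id,"note` would have been dropped and the rest of the header would have been parsed as data.
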

# Open a connection to the file
with open('world_dev_ind.csv') as file:
    # Skip the column names
    file.readline()
    # Initialize an empty dictionary: counts_dict
    counts_dict = {}
    # Process only the first 1000 rows
    for j in range(0, 1000):
        # Split the current line into a list: line
        line = file.readline().split(',')
        # Get the value for the first column: first_col
        first_col = line[0]
        # If the column value is in the dict, increment its value
        if first_col in counts_dict.keys():
            counts_dict[first_col] += 1
        # Else, add to the dict and set value to 1
        else:
            counts_dict[first_col] = 1
# Print the resulting dictionary
print(counts_dict)
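
The same count could also be written with `collections.Counter` and `itertools.islice`, as in the sketch below. The function name `count_first_column` and the sample rows are hypothetical, and `io.StringIO` stands in for the CSV file.

```python
import io
from collections import Counter
from itertools import islice

def count_first_column(f, max_rows=1000):
    """Count first-column values in at most max_rows rows after the header."""
    next(f, None)  # skip the column names
    # islice stops at max_rows or at end of file, whichever comes first.
    return Counter(line.split(",")[0] for line in islice(f, max_rows))

sample = io.StringIO(
    "CountryName,Year\n"
    "Arab World,1960\n"
    "Arab World,1961\n"
    "Euro area,1960\n"
)
counts = count_first_column(sample)
print(counts)
```

One behavioral difference: when the file has fewer than 1000 data rows, the loop above keeps calling `readline()`, which returns an empty string at end of file, so an empty key gets counted. Iterating with `islice` stops at the last line instead.
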
