
I may be missing something obvious here, but I am new to Python and pandas. I am reading a large text file and only want to use the rows in range(61, 75496). I can skip the first 60 rows with

keywords = pd.read_csv('keywords.list', sep='\t', skiprows=60)

How can I include only the rows between these values? Unfortunately, there is no userows parameter.

Is there something like

range(start, stop, start, stop)?
PandaBearSoup

3 Answers


From the documentation, you can skip the first few rows using

skiprows = X

where X is an integer. If there's a header, for example, a few rows into your file, you can also skip straight to the header using

header = X

To skip rows counting upwards from the bottom of the file, use

skipfooter = X

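Putting it together, to skip the first 3 rows, use the next row as the header, and ignore the bottom 4 rows (skipfooter is only supported by the Python parser engine, hence engine='python'):

pd.read_csv('path/or/url/to/file.csv', skiprows=3, skipfooter=4, engine='python')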
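As a rough illustration of how skiprows and skipfooter trim both ends of a file, here is a minimal sketch on a made-up in-memory file. The file contents and column names are hypothetical, and engine="python" is passed because skipfooter is not supported by the default C engine.

```python
import io

import pandas as pd

# Hypothetical file: 3 junk lines, a header, 5 data rows, 4 footer lines.
text = "\n".join(
    ["junk 1", "junk 2", "junk 3", "name,value"]
    + [f"row{i},{i}" for i in range(5)]
    + ["footer 1", "footer 2", "footer 3", "footer 4"]
)

df = pd.read_csv(
    io.StringIO(text),
    skiprows=3,       # drop the 3 lines above the header
    skipfooter=4,     # drop the last 4 lines of the file
    engine="python",  # skipfooter needs the Python parser engine
)

print(df)
```

Only the header and the five data rows should survive; the junk lines at the top and bottom never reach the DataFrame.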

k3t0

You can use the nrows argument to set the number of rows to read.

From the documentation:

nrows : int, default None
Number of rows of file to read. Useful for reading pieces of large files.

Code:

keywords = pd.read_csv('keywords.list', sep='\t', skiprows=60, nrows=75436)  # 75436 is 75496 - 60
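One thing to watch when combining skiprows and nrows: by default pandas treats the first line left after skipping as the header, so that line is not counted in nrows. A minimal sketch on a made-up 10-line file (the contents and bounds are hypothetical), assuming the file has no header row:

```python
import io

import pandas as pd

# Hypothetical 10-line tab-separated file with no header row.
text = "\n".join(f"line{i}\t{i}" for i in range(1, 11))

# Keep lines 4..7 (1-based): skip the first 3 lines, then read 4 rows.
# header=None stops pandas from consuming line 4 as column names.
df = pd.read_csv(io.StringIO(text), sep="\t", skiprows=3, nrows=4, header=None)

print(df)
```

If the first kept line really is a header, drop header=None and remember that nrows then counts only the data rows below it.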
Anand S Kumar

You can use the nrows parameter:

keywords = pd.read_csv('keywords.list', sep='\t', skiprows=60, nrows=(75496 - 60))
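A possible alternative that reads closer to range(start, stop): skiprows also accepts a callable that receives each 0-based line index and returns True for lines to skip. The sketch below uses a made-up 10-line file, and the start and stop names are hypothetical. Since the callable is evaluated line by line, nrows may well finish sooner on a very large file.

```python
import io

import pandas as pd

# Hypothetical 10-line tab-separated file; lines are numbered from 0.
text = "\n".join(f"line{i}\t{i}" for i in range(10))

start, stop = 3, 7  # keep line indices in range(3, 7); stop is exclusive

df = pd.read_csv(
    io.StringIO(text),
    sep="\t",
    header=None,  # assume no header row in the kept slice
    skiprows=lambda i: i < start or i >= stop,  # True means "skip this line"
)

print(df)
```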
Jeremy Fisher