
I have a file (in GBs) and want to read out only (let's say) 500MB of it. Is there a way I can do this?

PS: I thought of reading in the first few lines of the dataset, seeing how much memory they use, and scaling the number of lines accordingly. I'm looking for a way to avoid this approach.


1 Answer


You can use a generator here to read lines from a file in a memory-efficient way; you can refer to Lazy Method for Reading Big File in Python?
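
For instance, a minimal sketch of that lazy, chunked approach (the chunk size and the 500 MB cut-off are illustrative values, and the file name is a placeholder):

def read_in_chunks(file_object, chunk_size=1024 * 1024):
    # Yield the file piece by piece instead of loading it all at once
    while True:
        chunk = file_object.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Stop once roughly 500 MB have been read
limit = 500 * 1024 * 1024
total = 0
with open('your file name', 'rb') as f:
    for chunk in read_in_chunks(f):
        total += len(chunk)
        # process the chunk here
        if total >= limit:
            break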

or you can use f.read(size), which reads at most size characters in text mode (or bytes in binary mode), not lines; for example, to read the first 100 characters of a file:

fname = 'your file name'
with open(fname) as f:
    size = 100  # number of characters to read, not a line count
    content = f.read(size)
    print(content)
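
Along the same lines, since the question asks for 500 MB specifically, you can open the file in binary mode and request that many bytes in a single call (a sketch; the file name is a placeholder):

# Read at most the first 500 MB of the file as raw bytes
with open('your file name', 'rb') as f:
    data = f.read(500 * 1024 * 1024)
print(len(data))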

or

by using the pandas nrows (number of rows) parameter:

import pandas as pd
myfile = pd.read_csv('your file name', nrows=1000)
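
If the file is a CSV and you want to work through more of it without loading everything, pandas can also iterate over it in pieces via the chunksize parameter; a minimal sketch with an illustrative chunk size:

import pandas as pd

# Process the file 100,000 rows at a time instead of all at once
for chunk in pd.read_csv('your file name', chunksize=100_000):
    print(chunk.shape)  # replace with your own per-chunk processing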