4

I am reading a text file with more than 10,000 lines.

results_file = open("Region_11_1_micron_o", 'r')

I would like to skip to the line in the file after a particular string "charts" which occurs at around line no. 7000 (different for different files). Is there a way to conveniently do that without having to read each single line of the file?

DPdl
  • 723
  • 7
  • 23
  • Possible duplicate of [Reading specific lines only (Python)](https://stackoverflow.com/questions/2081836/reading-specific-lines-only-python) – Van Peer Nov 06 '17 at 17:23

3 Answers

5

If you know the precise line number, you can use Python's linecache module to read a particular line. You don't need to open the file yourself, although linecache still reads the whole file into memory internally.

import linecache

line = linecache.getline("test.txt", 3)
print(line)

Output:

chart

If you want to start reading from that line, you can use islice.

from itertools import islice

with open('test.txt','r') as f:
    # islice counts from 0, so index 2 is line 3
    for line in islice(f, 2, None):
        print(line, end='')

Output:

chart
dang!
It
Works

If you don't know the precise line number and want to start after the line containing that particular string, use a nested for loop over the same file object. The inner loop continues from where the outer loop stopped.

with open('test.txt','r') as f:
    for line in f:
        if "chart" in line:
            # The inner loop resumes on the line after the match
            for line in f:
                # Do your job
                print(line, end='')

Output:

dang!
It
Works

test.txt contains:

hello
world!
chart
dang!
It
Works

I don't think you can skip directly to a particular line number without reading the file. To do that, you must already have gone through the file and stored the lines in some form. In any case, you need to traverse the file at least once.
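As a rough sketch of "traverse once, then jump": one pass over the file can record the byte offset at which each line starts, and later reads can seek straight to a line. The names build_line_offsets and read_from_line are hypothetical helpers, not part of any library, and the sketch assumes a UTF-8 encoded file.

```python
def build_line_offsets(path):
    """Return a list where entry i is the byte offset at which line i (0-based) starts."""
    offsets = []
    position = 0
    # Binary mode, so len(raw_line) is an exact byte count.
    with open(path, "rb") as f:
        for raw_line in f:
            offsets.append(position)
            position += len(raw_line)
    return offsets


def read_from_line(path, offsets, line_index):
    """Return the lines from line_index (0-based) to the end of the file."""
    with open(path, "rb") as f:
        f.seek(offsets[line_index])  # jump directly, no line-by-line reading
        return f.read().decode("utf-8").splitlines()
```

This only pays off if the same file is read more than once, since building the index is itself a full pass.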

Miraj50
  • 4,257
  • 1
  • 21
  • 34
  • 1
    linecache internally reads the whole file into memory, so it contradicts the OP's 'Is there a way to conveniently do that without having to read each single line of the file' requirement. – erhesto Nov 06 '17 at 16:20
  • @erhesto Yes, but I think if you want to go somewhere, you need to have the data somewhere, right? Take for example a list. How will you go to a particular line when you don't have the data stored somewhere. Correct me If I am wrong. – Miraj50 Nov 06 '17 at 16:24
  • Well, I totally agree with you! I'd just add this information to your answer that it might be problematic to find deterministic algorithm which might accomplish this task without reading the file at least once. Of course, it might be possible in some cases (for example - if we have predefined number of characters per line - in other words, we do know exact places of line breaks), but not in general. – erhesto Nov 06 '17 at 16:29
  • Thank you. The thing is, I do not always know the exact line number. I have to look for a certain string in the text file and start with the next line. – DPdl Nov 06 '17 at 17:13
  • @DPdl Then in that case you will have to go line by line. I shall update my answer. But if you have a rough idea of the line number then probably you can make it faster by skipping some of the lines as given in my answer. – Miraj50 Nov 06 '17 at 17:15
1

You can use itertools.dropwhile to consume the lines up to the point you want.

from itertools import dropwhile, islice

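# fname is the path to your file; replace 'Abstract' with your own marker word
with open(fname) as fin:
    # Drop lines until one contains the marker as a whitespace-separated word
    start_at = dropwhile(lambda L: 'Abstract' not in L.split(), fin)
    # Skip the marker line itself and print everything after it
    for line in islice(start_at, 1, None):
        print(line, end='')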
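To see how dropwhile and islice combine, here is a small self-contained sketch that runs on an in-memory sample instead of a real file. The helper name lines_after is hypothetical, and unlike the snippet above it matches the marker as a substring rather than as a whole word.

```python
import io
from itertools import dropwhile, islice


def lines_after(lines, marker):
    """Return an iterator over the lines that follow the first line containing marker."""
    # dropwhile discards lines until one contains the marker.
    from_marker = dropwhile(lambda line: marker not in line, lines)
    # Starting the slice at 1 skips the marker line itself.
    return islice(from_marker, 1, None)


sample = io.StringIO("hello\nworld!\nchart\ndang!\nIt\nWorks\n")
remaining = [line.rstrip("\n") for line in lines_after(sample, "chart")]
print(remaining)  # ['dang!', 'It', 'Works']
```

If the marker never appears, the iterator is simply empty. Every line before the marker is still read, only without an explicit loop.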
A.Bau
  • 82
  • 1
  • 10
1

If the line lengths are roughly evenly distributed across your file, you could try seeking into the file to a point shortly before where you expect the string:

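from os import stat

size = stat(your_file).st_size    # your_file: path to the text file
start = int(0.65 * size)          # guess: the marker lies past 65% of the file
with open(your_file, 'rb') as f:  # binary mode, so any byte offset is valid
    f.seek(start)
    buff = f.read()
marker = b'\nchart\n'             # assumes "\n" line endings
n = buff.index(marker)            # raises ValueError if the marker is before start
buff = buff[n + len(marker):].decode()  # text after the marker line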
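If guessing a starting offset feels too fragile, a related option is to memory-map the file and let mmap's find method search the raw bytes. This is only a sketch under some assumptions: the helper name text_after_marker is hypothetical, the file is non-empty, UTF-8 encoded, uses "\n" line endings, and the marker is not on the very first line.

```python
import mmap


def text_after_marker(path, marker=b"\nchart\n"):
    """Return the text following the first occurrence of marker, or None if absent."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(marker)
            if pos == -1:
                return None
            # Slicing copies the bytes out before the map is closed.
            return mm[pos + len(marker):].decode("utf-8")
```

The search still scans the bytes before the marker, but it does so without a Python-level loop over lines.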
gboffi
  • 22,939
  • 8
  • 54
  • 85