
I would like some help in figuring out how I can print out only a given number of lines in a .txt file.

I created a function file(x, y) with two parameters: 'x' is the name of the file, and 'y' decides how many lines it is going to print.

Example: let's say that the file's name is x.txt and its contents are:

>Sentence 1
I like playing games
>Sentence 2
I like jumping around
>Sentence 3
I like dancing
>Sentence 4
I like swimming
>Sentence 5
I like riding my bike

What I want is for the function to read the file and print out only the sentences when I call file("x.txt", 3), so it prints just the first 3 sentences, as in this sample output:

'I like playing games'
'I like jumping around'
'I like dancing'

Here is what I have done so far:

def file(x, y):
    file = open(x, 'r')
    g = list(range(y))
    h = [a for i, a in enumerate(file) if i in g]
    return " ' ".join(h)

I wasn't able to figure out how to make the program print the number of lines that the user asks for. So far, when I run the program, this is what I get:

>Sentence 1
 ' I like playing games
 ' >Sentence 2

I only want it to print the sentences, and I don't want it to print the ">Sentence #" part.

Will someone be able to help me figure this out? Thank You!

Chris_Rands
Joe
  • Where does the 'Sentence #' come from? – farbiondriven Oct 12 '17 at 15:47
  • So you only want to print the odd-numbered lines? – PM 2Ring Oct 12 '17 at 15:49
  • Consider `range(1, 2*y, 2)`. And there's no need to create a list from the range, you can perform `in` tests on it directly. – PM 2Ring Oct 12 '17 at 15:51
  • It's answered here: https://stackoverflow.com/questions/1767513/read-first-n-lines-of-a-file-in-python – Jake Steele Oct 12 '17 at 15:57
  • @Chris_Rands, yes it's in the FASTA format – Joe Oct 12 '17 at 16:00
  • @Gin i thought so, added a biopython solution too – Chris_Rands Oct 12 '17 at 16:05
  • Possible duplicate of [Read first N lines of a file in python](https://stackoverflow.com/questions/1767513/read-first-n-lines-of-a-file-in-python) – Davy M Oct 12 '17 at 16:08
  • Some advice: 1. Avoid using single-letter variable names like `g`, `a` or `h` (`i` is OK for something representing a counting integer). Use more informative names. 2. It is recommended to close open files after using them. You might be interested in the `with` context manager technique: http://www.pythonforbeginners.com/files/with-statement-in-python – bli Oct 16 '17 at 11:14
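
The `range(1, 2*y, 2)` idea from the comments can be sketched roughly as follows. This is only an illustration: `first_sentences` is a hypothetical name, and the approach assumes the file strictly alternates header and sentence lines, starting with a header, as in the sample x.txt.

```python
def first_sentences(path, count):
    """Return the first `count` sentence lines of a strictly alternating file."""
    # Hypothetical helper. Assumes 0-based line indices 1, 3, 5, ... hold the sentences.
    wanted = range(1, 2 * count, 2)
    with open(path) as handle:
        # A range supports `in` tests directly, so no list conversion is needed.
        return [line.rstrip("\n") for i, line in enumerate(handle) if i in wanted]
```

Note that this still reads the whole file and breaks if a record ever spans more than one line, so filtering on the leading '>' is the more robust choice.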

2 Answers


Here is a simple native Python solution. I'm assuming that lines which don't start with > are the 'sentence' lines:

from itertools import islice

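def extract_lines(in_file, num):
    with open(in_file) as in_f:
        # Skip header lines; drop each trailing newline so that the join
        # below does not produce blank lines between sentences
        gen = (line.rstrip('\n') for line in in_f if not line.startswith('>'))
        return '\n'.join(islice(gen, num))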
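
If you also want the single quotes from the sample output in the question, a variation along these lines might work. It is only a sketch: `quoted_sentences` is a name made up for this example, and it additionally skips blank lines, which the question does not mention.

```python
from itertools import islice


def quoted_sentences(in_file, num):
    """Return the first `num` non-header lines, each wrapped in single quotes."""
    with open(in_file) as in_f:
        # Drop blank lines and '>' header lines, and strip surrounding whitespace
        sentences = (
            line.strip()
            for line in in_f
            if line.strip() and not line.startswith(">")
        )
        # islice stops reading as soon as `num` sentences have been taken
        return "\n".join(f"'{sentence}'" for sentence in islice(sentences, num))
```

Calling `print(quoted_sentences("x.txt", 3))` then prints one quoted sentence per line.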

But if this is actually FASTA format (which has now been confirmed in the comments), then I suggest using BioPython instead:

from Bio import SeqIO
from itertools import islice

def extract_lines(in_file, num):
    with open(in_file) as in_f:
        gen = (record.seq for record in SeqIO.parse(in_f, 'fasta'))
        return list(islice(gen, num))
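
One reason to prefer a real parser: in FASTA files a sequence may be wrapped over several lines, and simply filtering out the '>' lines then returns fragments instead of whole records. If BioPython is not available, the record-joining step could be approximated as below. This is a rough sketch and not a replacement for `SeqIO`; `read_fasta_sequences` is a hypothetical helper, not part of any library.

```python
def read_fasta_sequences(in_file, num):
    """Yield up to `num` sequences, joining the lines that belong to one record."""
    if num <= 0:
        return
    yielded = 0
    chunks = None  # stays None until the first '>' header has been seen
    with open(in_file) as in_f:
        for line in in_f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one
                if chunks is not None:
                    yield "".join(chunks)
                    yielded += 1
                    if yielded >= num:
                        return
                chunks = []
            elif chunks is not None:
                chunks.append(line)
        # The last record in the file has no header after it to close it
        if chunks is not None:
            yield "".join(chunks)
```

For the one-line-per-record file in the question, `list(read_fasta_sequences("x.txt", 3))` gives the same three sentences as the simple filter.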
Chris_Rands

The answer given by @Chris_Rands is good, but since you ask for solutions without imports in a comment, here is one possibility:

def extract_lines(in_file, num):
    """This function generates the first *num* non-header lines
    from fasta-formatted file *in_file*."""
    nb_outputted_lines = 0
    with open(in_file, "r") as fasta:
        for line in fasta:
            if nb_outputted_lines >= num:
                break # This interrupts the for loop
            if line[0] != ">":
                yield line.strip() # strip the trailing '\n'
                nb_outputted_lines += 1

To use it:

for line in extract_lines("x.txt", 3):
    print(line)
    # If you want the quotes:
    #print("'%s'" % line)
    # Or (python 3.6+):
    #print(f"'{line}'")
bli