I am trying to write a program that will take an HTML file and output each line. I am doing something wrong because my code is outputting each letter. How can I get all the HTML lines into a list?

This is the code so far:

f = open("/home/tony/Downloads/page1/test.html", "r")
htmltext = f.read()
f.close()

for t in htmltext:
    print t + "\n"

2 Answers

You can use f.readlines() instead of f.read(). This method returns a list of all the lines in the file, with each line's trailing newline kept.

with open("/home/tony/Downloads/page1/test.html", "r") as f:
    for line in f.readlines():
        print(line)
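
Because readlines() keeps the newline on each line, print(line) will show a blank line between rows. Here is a minimal sketch of both forms, assuming a small temporary file as a stand-in for test.html (the sample content and the names raw_lines and clean_lines are made up for illustration):

```python
import os
import tempfile

# Hypothetical stand-in for test.html so the sketch runs anywhere.
sample = "<html>\n<body>\n</body>\n</html>\n"
with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

with open(path, "r") as f:
    raw_lines = f.readlines()            # each item still ends with "\n"

with open(path, "r") as f:
    clean_lines = f.read().splitlines()  # same lines, newlines removed

os.remove(path)  # clean up the temporary file

for line in clean_lines:
    print(line)
```

If you only need to print the lines, print(line, end="") on the raw list avoids the extra blank lines as well.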

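Alternatively, you could use list(f), since iterating over a file object yields its lines one at a time.

with open("/home/tony/Downloads/page1/test.html", "r") as f:
    f_lines = list(f)  # builds the list of lines; the file is closed on exit
for line in f_lines:
    print(line)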

Source: https://docs.python.org/3.5/tutorial/inputoutput.html

Roccy

f.read() reads the whole file until EOF and returns it as a single string. Looping over a string yields one character at a time, which is why your code prints each letter. What you want is the f.readlines() method:

with open("/home/tony/Downloads/page1/test.html", "r") as f:
    for line in f.readlines():
        print(line) # The newline is included in line
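
To see the difference side by side, here is a rough sketch that uses io.StringIO as a stand-in for the opened file (that substitution and the sample markup are assumptions for the demo, not part of the original code):

```python
import io

# io.StringIO behaves like an open text file for this demonstration.
fake_file = io.StringIO("<p>one</p>\n<p>two</p>\n")

text = fake_file.read()        # one str holding the whole content
chars = [c for c in text]      # looping over a str gives single characters

fake_file.seek(0)              # rewind so the content can be read again
# Looping over the file object itself gives whole lines.
lines = [line.rstrip("\n") for line in fake_file]

print(chars[:3])
print(lines)
```

The first loop is what the question's code does; the second is the line-by-line behaviour you are after.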
asadmshah