
I'm making a simple scraper to see if I can get the value of an input I need from an HTML file I made. It always returns None, so I'm checking with something simpler, the <title> in the HTML.

from bs4 import BeautifulSoup # parsing
r = open("C:/Python27/Pruebas/pruebahtml.html")
print(r.read())

soup = BeautifulSoup(r,"html.parser")
title = soup.title
print(title)
r.close()

But I'm still getting None as the answer. I have also tried findALL, find_all and find to do this, but I get some errors. Does anyone know where my mistake is?

  • Related: http://stackoverflow.com/questions/3906137/why-cant-i-call-read-twice-on-an-open-file. – alecxe Oct 02 '15 at 19:47
  • Thanks, I was forgetting that. Now my question is: if the user gives an input in the HTML and clicks a button, can I read the input he/she made using this? – Joshua Cazares Oct 02 '15 at 19:49

1 Answer


You are passing an empty string to bs4 because print(r.read()) has already moved the file pointer to the end of the file. Remove the print(r.read()) and pass the file object to BeautifulSoup, or call r.seek(0) before passing it. Once you call read() or readlines() on, or iterate over, a file object, it is consumed and there is nothing left to read.
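
For example, a minimal sketch of the r.seek(0) approach, reusing the file path from the question:

from bs4 import BeautifulSoup

r = open("C:/Python27/Pruebas/pruebahtml.html")
print(r.read())   # consumes the file; the pointer is now at the end
r.seek(0)         # rewind so BeautifulSoup sees the full content
soup = BeautifulSoup(r, "html.parser")
print(soup.title) # now prints the <title> tag instead of None
r.close()

Alternatively, just drop the print(r.read()) line and pass r straight to BeautifulSoup, which avoids the rewind entirely.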

Padraic Cunningham