I am reading URLs from a file that contains one URL per line. When I read each URL in a loop and open it in Python, every request comes back as a 400 Bad Request response:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii">   
</HEAD>
<BODY><h2>Bad Request - Invalid URL</h2>
<hr><p>HTTP Error 400. The request URL is invalid.</p>
</BODY></HTML>
#$#$#$#$#$#$#$#$#$#$#$#
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii">    
</HEAD>
<BODY><h2>Bad Request - Invalid URL</h2>
<hr><p>HTTP Error 400. The request URL is invalid.</p>
</BODY></HTML>
#$#$#$#$#$#$#$#$#$#$#$#

This is the output for 2 URLs.

But when I put only one URL in the file, it is read fine and the output is the actual HTML page (even in the loop).

Here is my Python code:

import time
import cfscrape
scraper = cfscrape.create_scraper()
f = open('links.txt')
f2 = open('pages.html','a')
for line in iter(f):
    line2 = line
    page = scraper.get(line2).content
    f2.write(page)
    f2.write("#$#$#$#$#$#$#$#$#$#$#$#")
    time.sleep(30)
f.close()
f2.close()

And here are the URLs that the links.txt file contains:

http://kissmanga.com/Manga/Mekakushi-no-Kuni
http://kissmanga.com/Manga/Gigi-Goegoe
Noman Ali

1 Answer

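Try changing line2 = line to line2 = line.strip(). Each line read from the file keeps its trailing newline, so the URL passed to scraper.get() ends with a newline character and the server rejects it as invalid. A file with a single URL presumably has no newline after it, which would explain why that case works.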
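To see the effect without touching the network, here is a minimal sketch. The helper name read_urls and the temporary file standing in for links.txt are assumptions made for illustration; they are not part of the original code.

```python
import os
import tempfile


def read_urls(path):
    """Return the non-empty lines of the file, stripped of surrounding whitespace."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


# Throwaway file standing in for links.txt (hypothetical, for illustration only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("http://kissmanga.com/Manga/Mekakushi-no-Kuni\n")
    tmp.write("http://kissmanga.com/Manga/Gigi-Goegoe")
    demo_path = tmp.name

# Iterating over the file directly keeps the line terminator on each line.
with open(demo_path) as f:
    raw_lines = list(f)

urls = read_urls(demo_path)
os.remove(demo_path)

print(repr(raw_lines[0]))  # the first raw line still ends with a newline
print(urls)                # the cleaned URLs are safe to pass to a request
```

In the original loop, the same idea would mean iterating over read_urls('links.txt') and calling scraper.get(url) on each entry. Skipping empty lines also avoids sending a request for a blank line at the end of the file.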

Sergey Gornostaev