HTML parsing forbidden error

Question

import re
import urllib.request

url='''https://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol='''
Stock = input('Enter the stock name: ').upper()
url = url + Stock
comp_info = urllib.request.urlopen(url).read()

I am getting a 403 Forbidden error and can't work out what is wrong with the code. The stock name I am entering is ITC.

Hm, weird. I also can't open this URL with `request.urlopen` because of response code 403, but it works well with `requests.get()`. — Mark Mishyn, Dec 31 '17 at 13:51
https://stackoverflow.com/questions/16627227/http-error-403-in-python-3-web-scraping — Muku, Dec 31 '17 at 14:00
Possible duplicate of [HTTP error 403 in Python 3 Web Scraping](https://stackoverflow.com/questions/16627227/http-error-403-in-python-3-web-scraping) — Jongware, Dec 31 '17 at 19:26

Mark Mishyn · Accepted Answer · 2017-12-31T18:24:20.267


Your code is correct. The site appears to block bots in the simplest way possible: it checks whether the request looks like it was sent by a browser.

You can work around this by sending a dummy User-Agent header with the request:

# Attach a User-Agent header so the server does not reject the request as a bot.
request = urllib.request.Request(url, headers={'User-Agent': 'Browser'})
comp_info = urllib.request.urlopen(request).read()
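
If the request still fails, it helps to see the actual status code rather than a bare traceback. Here is a minimal sketch, assuming a hypothetical helper named `fetch` (not part of the original code) that reuses the same dummy User-Agent value:

```python
import urllib.error
import urllib.request


def fetch(url, user_agent="Browser"):
    """Hypothetical helper: return the response body, or None on an HTTP error."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # err.code is the HTTP status code, e.g. 403 for Forbidden.
        print("HTTP error", err.code, err.reason)
        return None
```

Calling `fetch(url)` with the URL built in the question would then either return the page bytes or print the status code the server answered with.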

edited Dec 31 '17 at 18:24

answered Dec 31 '17 at 13:57

Mark Mishyn


add this link for reference https://docs.python.org/3.4/howto/urllib2.html#headers – Albin Paul Dec 31 '17 at 13:59
thanks it worked, but i dont get what this code [request = urllib.request.Request(url, headers={'User-Agent': 'Browser'})] does... can you explain it in simple words – Rajat Garg Dec 31 '17 at 14:42
@RajatGarg this code sets an HTTP header on the Request object. The name of the header is "User-Agent" and the value is "Browser", which is just an arbitrary string meant to imitate a browser. You should probably read about the User-Agent header and/or HTTP headers in general. – Mark Mishyn Dec 31 '17 at 18:33
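
The effect of that header can be inspected without touching the network. Here is a small sketch using only the standard library. The URL is a placeholder, and the assumption is that the site rejects the default identifier urllib sends when no User-Agent is given:

```python
import urllib.request

# Headers urllib adds by default when the caller supplies none.
default_headers = dict(urllib.request.build_opener().addheaders)
print(default_headers["User-agent"])  # something like Python-urllib/3.x

# The same kind of request with the dummy header from the answer attached.
request = urllib.request.Request(
    "https://example.com/",  # placeholder URL, not the one from the question
    headers={"User-Agent": "Browser"},
)
# urllib stores header names capitalized, so the key is "User-agent".
print(request.get_header("User-agent"))
```

A header set on the Request takes precedence over the opener's default, so the server sees "Browser" instead of the Python identifier.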
