I am very new to web scraping. I tried running the following code:
import requests
import json
headers={'Host': 'www.bloomberg.com',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0',
'Accept': '*/*',
'Accept-Language': 'de,en-US;q=0.7,en;q=0.3',
'Accept-Encoding': 'gzip, deflate, br',
'Referer': 'https://www.bloomberg.com/quote/AAPL:INDAAPL:IND',
'DNT': '1',
'Connection': 'keep-alive',
'TE': 'Trailers'}
url='https://www.bloomberg.com/markets2/api/datastrip/IBVC%3AIND?locale=en&customTickerList=true'
response = requests.get(url=url, headers=headers)
response.json()
It raises the following error:
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-5-543d39c3046b> in <module>
14 response = requests.get(url=url, headers=headers)
15
---> 16 response.json()
c:\programdata\anaconda3\lib\site-packages\requests\models.py in json(self, **kwargs)
896 # used.
897 pass
--> 898 return complexjson.loads(self.text, **kwargs)
899
900 @property
c:\programdata\anaconda3\lib\json\__init__.py in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
355 parse_int is None and parse_float is None and
356 parse_constant is None and object_pairs_hook is None and not kw):
--> 357 return _default_decoder.decode(s)
358 if cls is None:
359 cls = JSONDecoder
c:\programdata\anaconda3\lib\json\decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):
c:\programdata\anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
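For reference, the same message can be reproduced offline with the standard `json` module alone. This is only a minimal sketch: the `body` string is a made-up stand-in for a non-JSON response body (an assumption, not what the server actually sent), and it shows that the error is raised as soon as the text does not begin with a JSON value.

```python
import json

# Hypothetical stand-in for a response body that is not JSON.
body = "<!doctype html><html><head><title>not JSON</title></head></html>"

try:
    json.loads(body)
    message = None
except json.JSONDecodeError as err:
    # The decoder fails at the very first character, because "<" cannot start a JSON value.
    message = str(err)

print(message)  # Expecting value: line 1 column 1 (char 0)
```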
I searched the web and found a couple of answered questions here, but I was unable to identify the issue. In particular, I tried following the comment in this [link][1], but it did not help. That is, I changed the last line to
requests.get(url, headers=headers).json()
I also tried the following code, which decodes the response body as text, as if the URL returned an HTML file:
import requests
import json
headers={'Host': 'www.bloomberg.com',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0',
'Accept': '*/*',
'Accept-Language': 'de,en-US;q=0.7,en;q=0.3',
'Accept-Encoding': 'gzip, deflate, br',
'Referer': 'https://www.bloomberg.com/quote/AAPL:INDAAPL:IND',
'DNT': '1',
'Connection': 'keep-alive',
'TE': 'Trailers'}
url='https://www.bloomberg.com/markets2/api/datastrip/IBVC%3AIND?locale=en&customTickerList=true'
response = requests.get(url=url, headers=headers)
response.content.decode('utf-8')
This gives the following result:
'<!doctype html>\n<html>\n<head>\n <title>Bloomberg - Are you a robot?</title>\n <meta name="viewport" content="width=device-width, initial-scale=1">\n <link rel="stylesheet" type="text/css" href="https://assets.bwbx.io/font-service/css/BWHaasGrotesk-55Roman-Web,BWHaasGrotesk-75Bold-Web,BW%20Haas%20Text%20Mono%20A-55%20Roman/font-face.css">\n <style rel="stylesheet" type="text/css">\n html, body, div, span, applet, object, iframe,\n h1, h2, h3, h4, h5, h6, p, blockquote, pre,\n a, abbr, acronym, address, big, cite, code,\n del, dfn, em, img, ins, kbd, q, s, samp,\n small, strike, strong, sub, sup, tt, var,\n b, u, i, center,\n dl, dt, dd, ol, ul, li,\n fieldset, form, label, legend,\n table, caption, tbody, tfoot, thead, tr, th, td,\n article, aside, canvas, details, embed,\n figure, figcaption, footer, header, hgroup,\n menu, nav, output, ruby, section, summary,\n time, mark, audio, video {\n margin: 0;\n padding: 0;\n border: 0;\n font-size: 100%;\n font: inherit;\n vertical-align: baseline;\n }\n /* HTML5 display-role reset for older browsers */\n article, aside, details, figcaption, figure,\n footer, header, hgroup, menu, nav, section {\n display: block;\n }\n body {\n line-height: 1;\n }\n ol, ul {\n list-style: none;\n }\n blockquote, q {\n quotes: none;\n }\n blockquote:before, blockquote:after,\n q:before, q:after {\n content: \'\';\n content: none;\n }\n table {\n border-collapse: collapse;\n border-spacing: 0;\n }\n\n * {\n box-sizing: border-box;\n }\n\n body {\n background-color: #f2f2f2;\n font-family: "BWHaasGrotesk-55Roman-Web";\n line-height: 1.2;\n }\n\n .header {\n margin: 0;\n height: 60px;\n width: 100%;\n background-color: black;\n color: white;\n overflow-x: hidden;\n }\n\n .logo {\n float: left;\n margin: 0 20px;\n height: 60px;\n width: 140px;\n background-image: 
url(\'data:image/svg+xml;base64,PHN2ZyBpZD0iTGF5ZXJfMSIgZGF0YS1uYW1lPSJMYXllciAxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNTcuNzUgNDcuNjMiPjxkZWZzPjxzdHlsZT4uY2xzLTF7ZmlsbDojZmZmO308L3N0eWxlPjwvZGVmcz48dGl0bGU+Qmxvb21iZXJnX05IR193aHQ8L3RpdGxlPjxwYXRoIGNsYXNzPSJjbHMtMSIgZD0iTTgxLjczLDExMzhIMTAwLjZjMy41NywwLDYuMzIuODcsOC4yNiwyLjQ1YTkuNDUsOS40NSwwLDAsMSwzLjM3LDcuNmMwLDMuNjctMS40OCw2LTQuNTQsNy4zOXYwLjE1YzQsMS4zMyw2LjI3LDQuOSw2LjI3LDkuMjMsMCw0LjEzLTEuNTgsNy4zNC00LjE4LDkuMjgtMi4xOSwxLjU4LTUsMi4zNS04LjgyLDIuMzVIODEuNzNWMTEzOFptMTcsMTVjMiwwLDMuNTItMS4xMiwzLjUyLTMuMzdzLTEuNTMtMy4yNi0zLjU3LTMuMjZIOTIuMTlWMTE1M2g2LjUzWm0xLDE0Ljg5YTMuOTMsMy45MywwLDEsMC0uMDUtNy44NUg5Mi4xOXY3Ljg1aDcuNVoiIHRyYW5zZm9ybT0idHJhbnNsYXRlKC04MS43MyAtMTEzNy45OCkiLz48cGF0aCBjbGFzcz0iY2xzLTEiIGQ9Ik0xMTUuOCwxMTM4aDkuODl2MzguNDVIMTE1LjhWMTEzOFoiIHRyYW5zZm9ybT0idHJhbnNsYXRlKC04MS43MyAtMTEzNy45OCkiLz48cGF0aCBjbGFzcz0iY2xzLTEiIGQ9Ik0xMjcuNjksMTE2Mi43N2MwLTguNjcsNS42MS0xNC41NCwxNC4yOC0xNC41NHMxNC4xOCw1Ljg3LDE0
This is also not the output I expected from the URL above.
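Putting the two observations together, a check like the one below could run before parsing, so that an HTML page is reported clearly instead of surfacing as a `JSONDecodeError`. This is a minimal sketch, not a fix for the blocking itself. The helper name `parse_json_or_explain` is hypothetical, and it assumes that a successful reply from this endpoint would carry a `Content-Type` containing `json`, which I have not confirmed.

```python
import json


def parse_json_or_explain(status_code, content_type, text):
    """Parse text as JSON, or raise ValueError describing what was received.

    Hypothetical helper: assumes a JSON reply declares "json" in its Content-Type.
    """
    if "json" not in content_type.lower():
        # Show the start of the body so a page such as "Are you a robot?" is easy to spot.
        snippet = text[:80].replace("\n", " ")
        raise ValueError(
            f"HTTP {status_code}, Content-Type {content_type!r}, "
            f"body starts with: {snippet!r}"
        )
    return json.loads(text)


# Intended use with the response object from the code above:
# data = parse_json_or_explain(
#     response.status_code,
#     response.headers.get("Content-Type", ""),
#     response.text,
# )
```

The function takes plain values rather than the `response` object, so it can be tried without any network access.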
Thank you in advance.

[1]: JSONDecodeError: Expecting value: line 1 column 1 (char 0)