I want to scrape data from this URL: https://weibo.com/hebgqt?refer_flag=1001030103_&is_all=1. I can scrape the data if I pass the cookie in the headers manually, but I want to obtain the cookie automatically. Here is the code.

import requests

url = 'https://weibo.com/hebgqt?refer_flag=1001030103_&is_all=1'

headers = {
    'authority': 'weibo.com',
    'cache-control': 'max-age=0',
    'sec-ch-ua': '^\\^',
    'sec-ch-ua-mobile': '?0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-language': 'en-IN,en-GB;q=0.9,en-US;q=0.8,en;q=0.7',
    'cookie': 'SINAGLOBAL=764815322341.5566.1622097283265; SUB=_2AkMXj8zTf8NxqwJRmP0RzmrjaY1yyg3EieKh0z0IJRMxHRl-yT92qmgntRB6PA_iPI199P4zlRz9zonVc5W23plzUH7V; SUBP=0033WrSXqPxfM72-Ws9jqgMF55529P9D9W55o9Nf.NuDNjNQuIS8pJY_; _s_tentry=-; Apache=3847225399074.1636.1624690011593; ULV=1624690011604:5:4:4:3847225399074.1636.1624690011593:1624608998989',
}

response = requests.get(url, headers=headers).text
print(response)

I tried to get the cookies with the following code, but I get an empty dictionary.

import requests
url = 'https://weibo.com/hebgqt?refer_flag=1001030103_&is_all=1'
r = requests.get(url)
print(r.cookies.get_dict())

Note: the website is Chinese, so I am using NordVPN; without it I get a SysCallError. Please help me find a way to get the cookies, or any other way to fetch data from the above URL.

azro
Sachin Gupta

1 Answer

I think that, in order to read the cookies, you should use a requests Session (requests.Session) rather than a bare requests.get call, as shown here: https://stackoverflow.com/a/25092059/7426792
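
A minimal sketch of the Session approach follows. The helper name fetch_with_session and its parameters are hypothetical, and the user agent is simply the one from the question's headers. A Session keeps every cookie set along a redirect chain, whereas r.cookies on a single response only holds the cookies set by that final response, which may be one reason the dictionary came back empty. Whether Weibo sets its cookies in plain HTTP responses or through JavaScript is not confirmed here; in the latter case the result may still be empty.

```python
import requests

# User agent copied from the question's headers; whether the site
# requires a browser-like value is an assumption.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
)


def fetch_with_session(url, session=None, timeout=10):
    """Hypothetical helper: GET url through a Session, return (response, cookies)."""
    if session is None:
        session = requests.Session()
    session.headers.update({"user-agent": DEFAULT_USER_AGENT})
    response = session.get(url, timeout=timeout)
    # session.cookies accumulates cookies from every response in the
    # redirect chain; response.cookies only has those of the final response.
    return response, session.cookies.get_dict()
```

Calling fetch_with_session with the URL from the question returns the response and a dictionary of the cookies collected so far. Passing the same Session object into later calls sends those cookies back automatically, so the cookie header does not need to be built by hand.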

Moritz Wilksch