I tried to scrape Twitter data using the requests library and BeautifulSoup. My approach was to log in first and then scrape the page I need, but it is not working, and I can't figure out what I did wrong.

I am adding this code:

import requests
from bs4 import BeautifulSoup

session_rqst = requests.session()
url = "https://twitter.com/login"
r = requests.get(url)
c = r.content
soup = BeautifulSoup(c, "html.parser")
token = soup.find("input", {"name": "authenticity_token"})
payload = {"username": "test_user", "password": "test_password"}
result = session_rqst.post(url, data=payload, headers=dict(referer="https://twitter.com/"))
all = result.content
soup1 = BeautifulSoup(all, "html.parser")
page = requests.get("https://twitter.com/akhiltaker/following")
page.content
soup1 = BeautifulSoup(page.content, "html.parser")

How can I scrape the followers list from the web page?

James Z

1 Answer

Instead of scraping Twitter manually with requests and BeautifulSoup, use the Twitter API.

You can get the followers of an account directly; see the docs here: api-reference/get-followers-list. The endpoint returns the data as JSON.

There are various Python libraries for Twitter that you can use for this purpose.
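
As a rough illustration of the API route, here is a minimal sketch that walks the cursor-based pages of the v1.1 followers/list endpoint. It uses only the standard library so it stays self-contained; requests or a dedicated Twitter library would work just as well. The helper names fetch_page and collect_followers are made up for this example, the bearer token is something you would have to obtain from your own developer account, and whether the endpoint is available to you depends on your API access level.

```python
import json
import urllib.parse
import urllib.request

# v1.1 endpoint from the "get followers list" API reference.
FOLLOWERS_URL = "https://api.twitter.com/1.1/followers/list.json"


def fetch_page(screen_name, cursor, bearer_token):
    """Fetch one page of followers as a dict (performs a real HTTP request).

    bearer_token is assumed to be an app-only token you created yourself.
    """
    query = urllib.parse.urlencode(
        {"screen_name": screen_name, "cursor": cursor, "count": 200}
    )
    request = urllib.request.Request(
        f"{FOLLOWERS_URL}?{query}",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def collect_followers(screen_name, get_page):
    """Collect follower screen names by following next_cursor until it is 0.

    get_page is any callable taking (screen_name, cursor) and returning a
    dict with "users" and "next_cursor" keys, like the API response.
    """
    names = []
    cursor = -1  # -1 requests the first page
    while cursor != 0:
        page = get_page(screen_name, cursor)
        names.extend(user["screen_name"] for user in page["users"])
        cursor = page["next_cursor"]
    return names
```

With a valid token in a variable named token, a call might look like collect_followers("akhiltaker", lambda name, cursor: fetch_page(name, cursor, token)). Passing the page fetcher in as a parameter keeps the pagination loop easy to try out with canned data before spending rate-limited requests.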


Edit: Regarding your BeautifulSoup question: it looks like you cannot log in to Twitter that simply, so your response probably contains only a login or error page, not your follower list. Have a look at this answer on how to log in to Twitter via Python.
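
One way to check this suspicion is to test whether the HTML that came back still contains a password field. The sketch below is only a heuristic and uses the standard library parser to stay self-contained; looks_like_login_page is a hypothetical helper name, and with BeautifulSoup the same check would be roughly soup.find("input", {"type": "password"}).

```python
from html.parser import HTMLParser


class _PasswordFieldFinder(HTMLParser):
    """Records whether any <input type="password"> tag was seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        input_type = (dict(attrs).get("type") or "").lower()
        if tag == "input" and input_type == "password":
            self.found = True


def looks_like_login_page(html):
    """Heuristic: a page that contains a password input is probably a login form."""
    finder = _PasswordFieldFinder()
    finder.feed(html)
    return finder.found
```

Calling looks_like_login_page(page.text) on the response from the question would show whether the request was bounced to a login form. Note also that the question's code fetches the final page with requests.get rather than session_rqst.get, and never puts the extracted authenticity_token into the payload, so cookies from the login POST would not be sent in any case.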

bastelflp
    This should be a comment, not an answer (in fact, it's already a comment). Yes, the Twitter API is the way to go, but that has nothing to do with BeautifulSoup. – David Makogon Nov 12 '17 at 13:23