I am trying to scrape some scripts from a TV show. I can get the text I need using BeautifulSoup and Requests:
import requests
from bs4 import BeautifulSoup

# Download the page and parse the HTML
r = requests.get('http://www.example.com')
s = BeautifulSoup(r.text, 'html.parser')

# Print the text of every <p> element
for p in s.find_all('p'):
    print(p.text)
This works well so far, but I only want the paragraphs spoken by a certain character. Say the character's name is "stackoverflow". The text looks like this:
A: sdasd sd asda B: sdasds STACKOVERFLOW: Help?
So I only want what STACKOVERFLOW says, not the rest.
I have tried
s.find_all(text='STACKOVERFLOW')
but I get nothing back.
What would be the right way to do this? A hint in the right direction would be most appreciated.
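For reference, here is a rough sketch of one direction that might work. As far as I can tell, a plain string passed to text= has to equal an entire text node, which would explain why searching for just the name finds nothing. The sketch below sidesteps that by working on the plain text (for example p.text) and splitting it on speaker labels. It assumes a label is a run of capital letters followed by a colon, which matches the sample above but may not hold for the real page. The names SPEAKER_LABEL and lines_by_speaker are made up for illustration.

```python
import re

# Assumption: a speaker label is an all-caps word followed by a colon,
# e.g. "A:", "B:", "STACKOVERFLOW:". Adjust if the real scripts differ.
SPEAKER_LABEL = re.compile(r'\b([A-Z][A-Z0-9_]*):')


def lines_by_speaker(text, speaker):
    """Return each speech that `speaker` makes in `text` (hypothetical helper)."""
    # Splitting on a pattern with one capturing group yields
    # [preamble, name1, speech1, name2, speech2, ...].
    parts = SPEAKER_LABEL.split(text)
    names = parts[1::2]
    speeches = parts[2::2]
    return [speech.strip()
            for name, speech in zip(names, speeches)
            if name == speaker]


# Sample text from the question; in the scraper this would be p.text.
sample = "A: sdasd sd asda B: sdasds STACKOVERFLOW: Help?"
print(lines_by_speaker(sample, "STACKOVERFLOW"))  # ['Help?']
```

In the loop above, this could be called as lines_by_speaker(p.text, 'STACKOVERFLOW') for each paragraph. If every speech turns out to sit in its own paragraph, a simpler check such as p.text.startswith('STACKOVERFLOW:') might be enough.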