I need to be able to scrape the content of many articles of a certain category from the New York Times. For example, let's say we want to look at all of the articles related to "terrorism." I would go to this link to view all of the articles: https://www.nytimes.com/topic/subject/terrorism
From here, I can click on the individual links, which directs me to a URL that I can scrape. I am using Python with the BeautifulSoup package to help me retrieve the article text.
Here is the code that I have so far, which lets me scrape all of the text from one specific article:
import requests
from bs4 import BeautifulSoup

session = requests.Session()
url = "https://www.nytimes.com/2019/10/23/world/middleeast/what-is-going-to-happen-to-us-inside-isis-prison-children-ask-their-fate.html"
req = session.get(url)
soup = BeautifulSoup(req.text, 'html.parser')

# Print the text of every paragraph on the page
paragraphs = soup.find_all('p')
for p in paragraphs:
    print(p.get_text())
The problem is that I need to scrape all of the articles under the category, and I'm not sure how to do that. Since I can scrape one article as long as I have its URL, I assume my next step is to gather all of the article URLs under this category and then run my code above on each of them. How would I do this, given the format of the page? And what do I do about the fact that the only way to see more articles is to manually click the "SHOW MORE" button at the bottom of the list? Are these capabilities included in BeautifulSoup?
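To illustrate what I'm imagining for the first part (collecting the URLs), here is a rough sketch. The helper below parses a page's HTML and pulls out links that look like dated NYT article URLs; the `/YYYY/MM/DD/` pattern is an assumption based on the example article URL above, and the real topic page's markup would need to be checked. This doesn't address the "SHOW MORE" problem at all, since BeautifulSoup only parses the HTML it's given.

```python
import re
from bs4 import BeautifulSoup

def extract_article_links(html, base="https://www.nytimes.com"):
    """Return absolute URLs of dated article links found in a page's HTML.

    Assumes NYT article URLs contain a /YYYY/MM/DD/ date segment, as in
    the example article URL above. This is a guess, not a documented rule.
    """
    soup = BeautifulSoup(html, "html.parser")
    urls = set()
    for a in soup.find_all("a", href=True):
        href = a["href"]
        if re.search(r"/\d{4}/\d{2}/\d{2}/", href):
            # Relative links need the site prefix added back
            urls.add(href if href.startswith("http") else base + href)
    return sorted(urls)
```

I could then fetch https://www.nytimes.com/topic/subject/terrorism with `session.get(...)`, pass `req.text` to this function, and loop my scraping code over the returned URLs, but I don't know if the links loaded by "SHOW MORE" would ever appear in that HTML.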