I have a class that I use for scraping a site. It uses sessions from the requests library and looks something like this:
class Scraper:
    def scrape_page(self):
        """Scrapes the current page"""
        # do something
        self.request_next_page()

    def request_next_page(self):
        """Finds the 'next' link if available and requests the next page"""
I want to add a method to the class that takes a parameter and either scrapes n pages or scrapes every page until there is no next page. The methods above work fine. However, I don't know of a way to let the parameter be either an integer or just a simple True for "all". I'm trying to think of the best way to do this.
I want something similar to this:
def scrape_pages(self, num):
    """Scrape n number of pages"""
It could then be run like this:
>>> s = Scraper()
>>> s.scrape_pages(5) # scrape the first 5 pages.
or
>>> s = Scraper()
>>> s.scrape_pages(all) # where all can be True, or anything else that works. I'm not sure.
I know I could have two separate functions, or have an if statement that checks whether the argument is True or an integer and then runs a different loop depending on the situation (maybe a for loop if it is an integer, and a while loop if it is something else). I am just wondering whether there is a better way to do this.
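To make the if-statement option concrete, here is a minimal sketch of what I have in mind. It is not my real class: the page counter, the total_pages attribute and the scraped list are stand-ins so it runs without a network, and I am assuming request_next_page() returns False when there is no next page. Unlike my real scrape_page(), the one here does not request the next page itself. I used None rather than True to mean "all", which is just one possible choice.

```python
class Scraper:
    """Stand-in for the real class: a page counter instead of HTTP requests."""

    def __init__(self, total_pages=8):
        self.total_pages = total_pages  # hypothetical: pages the fake site has
        self.current = 1
        self.scraped = []

    def scrape_page(self):
        """Record the current page (placeholder for the real scraping)."""
        self.scraped.append(self.current)

    def request_next_page(self):
        """Move to the next page; return False when there is none (assumed)."""
        if self.current >= self.total_pages:
            return False
        self.current += 1
        return True

    def scrape_pages(self, num=None):
        """Scrape `num` pages, or every page when `num` is None."""
        if num is None:
            # No limit: keep going until there is no next page.
            while True:
                self.scrape_page()
                if not self.request_next_page():
                    break
        else:
            # Fixed limit: stop after `num` pages or when the pages run out.
            for _ in range(num):
                self.scrape_page()
                if not self.request_next_page():
                    break
```

This works, but the two branches repeat the same loop body, which is the part I would like to avoid.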
I noticed that the str.split() method does something similar: its maxsplit argument can either set a limit or not. However, I am not familiar enough with C to understand how that was accomplished.
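Seen from the Python side, the documented signature is str.split(sep=None, maxsplit=-1), so "no limit" is expressed with a sentinel default of -1 rather than a second type. The sketch below shows that behaviour and then a guess at how the same idea might carry over to my case. The page_indexes() helper is hypothetical, my own name for illustration, and it folds the for and while cases into a single iterable.

```python
import itertools

text = "a b c d"
limited = text.split(" ", 1)     # stop after one split
unlimited = text.split(" ", -1)  # -1 means "no limit"
default = text.split(" ")        # maxsplit defaults to -1

def page_indexes(limit=-1):
    """Return an iterator over 0, 1, 2, ...; a negative limit means endless.

    Hypothetical helper, not part of requests or of my class.
    """
    if limit < 0:
        return itertools.count()
    return iter(range(limit))

# One loop then covers both cases; break out when there is no next page.
first_three = list(page_indexes(3))
first_five_of_all = list(itertools.islice(page_indexes(), 5))
```

With something like this, scrape_pages(num=-1) could loop over page_indexes(num) once and break when request_next_page() finds nothing, although I do not know whether that counts as the idiomatic approach.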