I'm using BeautifulSoup to extract the start and end dates of events from a variety of domains, each of which puts those dates in different HTML tags. So far I've been writing the extraction manually for each domain, but this will take forever.

So I'm wondering if I can search for strings based on their datetime format, something like:

find_string = soup.body.findAll('%B %d, %Y')

Obviously this doesn't work, but I'm wondering if there is any code I could use to locate February 14, 2018, for example.

Example: https://www.marketfairmall.com/event/Athleta-Semi-Annual-Sale/2145510733/

How can I extract 7/30/18 by searching for %m/%d/%y?
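For illustration, here is a minimal sketch of one way the idea could work: translate the format string into a regular expression to find candidates, then let `datetime.strptime` confirm each one. The helper names `format_to_regex` and `find_dates` and the directive table are hypothetical, not part of BeautifulSoup or the standard library, and only the directives mentioned above are covered. It assumes the page text has already been pulled out of the parsed document, and that month names are in English.

```python
import re
from datetime import datetime

# Hypothetical mapping from strptime directives to regex fragments.
# Only the directives used in the question are covered here.
DIRECTIVE_PATTERNS = {
    "%B": r"[A-Za-z]+",  # full month name (validated later by strptime)
    "%d": r"\d{1,2}",
    "%m": r"\d{1,2}",
    "%Y": r"\d{4}",
    "%y": r"\d{2}",
}


def format_to_regex(fmt):
    """Translate a format such as '%m/%d/%y' into a compiled regex."""
    parts = re.split(r"(%[A-Za-z])", fmt)
    pattern = "".join(
        # An unsupported directive raises KeyError instead of matching silently.
        DIRECTIVE_PATTERNS[p] if p.startswith("%") else re.escape(p)
        for p in parts
    )
    # Lookarounds stop a match from starting or ending inside a longer
    # word, number, or slash-separated run (e.g. '7/30/2018' for '%y').
    return re.compile(r"(?<![\w/])" + pattern + r"(?![\w/])")


def find_dates(text, fmt):
    """Return datetime objects for every substring of text that parses as fmt."""
    found = []
    for match in format_to_regex(fmt).finditer(text):
        try:
            found.append(datetime.strptime(match.group(), fmt))
        except ValueError:
            # Looked like a date but is not one (e.g. '13/45/18').
            continue
    return found


# Assumption: with a parsed page in `soup`, the text could come from
#   text = soup.get_text(" ", strip=True)
text = "Starts February 14, 2018 and ends 7/30/18."
print(find_dates(text, "%B %d, %Y"))
print(find_dates(text, "%m/%d/%y"))
```

The regex only narrows down candidates; `strptime` does the real validation, so loose fragments like `[A-Za-z]+` for `%B` are acceptable. Each domain would still need its format (or a short list of formats to try), but not its tag structure.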

HenryAD
  • I'd use `re` and patterns from this answer: https://stackoverflow.com/questions/15491894/regex-to-validate-date-format-dd-mm-yyyy – mklucz Jul 28 '18 at 21:19
  • Yeah, you could use `re`, or `datetime` module. Do you have some sample html/url and desired output? – Andrej Kesely Jul 28 '18 at 21:21
  • I edited my post with an example and desired output, thanks for the response. How can I go about it using the datetime module? – HenryAD Jul 28 '18 at 21:26
  • @HenryDefner I cannot connect to the URL you posted; the connection timed out. Is the webpage up? – Andrej Kesely Jul 28 '18 at 22:45
