I'm trying to use a regex to match tags with class="calendar-days-list2" but not class="calendar-days-list2 prev-next-month". I loaded up a sample piece of HTML with tags containing both options.
When I search the sample HTML using re.findall(), the regex matches as I would like. When I use that same regex in BeautifulSoup's find_all(), it returns both the wanted and the unwanted class. I don't understand why this happens. Any thoughts? Thanks!
import re
from bs4 import BeautifulSoup

html = '''<td id="pagestructure_0_pagecontent_0_calendar1_2016_1_7_0" class="calendar-days-list2" width="14%">
<span class="date-number">7</span>
<p>
<img src="/wac/wacassets/images/icons/h1.gif" border="0">
<a href="http://www.woodruffcenter.org/Commerce/MuseumAdmissions?performanceId=86514">Special Exhibitions</a>
10:00 AM
</p>
<td id="pagestructure_0_pagecontent_0_calendar1_2015_11_29_1" class="calendar-days-list2 prev-next-month" width="14%"></td>
'''
soup = BeautifulSoup(html)
# WORKS
print re.findall(r"(calendar\-days\-list2)(?!\sprev\-next\-month)",html), "\n\n"
regex = re.compile(r"(calendar\-days\-list2)(?!\sprev\-next\-month)")
# DOESN'T WORK
tds = soup.find_all("td", {"class": regex})
print tds
Output:
# re.findall
['calendar-days-list2']
# soup.find_all
[<td class="calendar-days-list2" id="pagestructure_0_pagecontent_0_calendar1_2016_1_7_0" width="14%">
<span class="date-number">7</span>
<p>
<img border="0" src="/wac/wacassets/images/icons/h1.gif"/>
<a href="http://www.woodruffcenter.org/Commerce/MuseumAdmissions?performanceId=86514">Special Exhibitions</a>
10:00 AM
</p>
</td>, <td class="calendar-days-list2 prev-next-month" id="pagestructure_0_pagecontent_0_calendar1_2015_11_29_1" width="14%"></td>]
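One possible explanation, offered as a sketch rather than a confirmed diagnosis: Beautiful Soup treats `class` as a multi-valued attribute, and its documentation says a class filter is matched against each individual class of a tag. If so, the regex is also tested against the lone token `calendar-days-list2` from the second `<td>`, where nothing follows it for the negative lookahead to reject. The helper `matches_any_class_token` below is hypothetical; it only imitates that per-token behavior with the standard library and is not Beautiful Soup's actual code.

```python
import re

# Same pattern as in the question.
regex = re.compile(r"(calendar\-days\-list2)(?!\sprev\-next\-month)")


def matches_any_class_token(class_attr, pattern):
    """Hypothetical helper: test the pattern against each class token separately."""
    return any(pattern.search(token) for token in class_attr.split())


wanted = "calendar-days-list2"
unwanted = "calendar-days-list2 prev-next-month"

# Against the full attribute string, the lookahead rejects the unwanted value.
print(bool(regex.search(wanted)))    # True
print(bool(regex.search(unwanted)))  # False

# Token by token, the lookahead never sees " prev-next-month",
# so both attribute values count as a match.
print(matches_any_class_token(wanted, regex))    # True
print(matches_any_class_token(unwanted, regex))  # True
```

If this is indeed the cause, a filter that looks at the whole tag may behave as intended, for example a function passed to `find_all` that checks `tag.get("class") == ["calendar-days-list2"]`. I have not verified that against the sample above.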