I have a line of code in a Python script, shown below:

for summaries in soup.findAll('div',{'class':'cb-lv-scrs-col cb-font-12 cb-text-complete'}):
    # do something with summaries

However, I want summaries to also include div elements that have a different class attribute, cb-scag-mtch-status cb-text-inprogress.

I have tried the following, as suggested in "BeautifulSoup findAll() given multiple classes?":

for summaries in soup.findAll('div',{'class':['cb-lv-scrs-col cb-font-12 cb-text-complete','cb-scag-mtch-status cb-text-inprogress']}):
    # do something with summaries

But this does not work. What is the problem, and how do I fix it?

RaviTej310
  • The spaces between each class mean multiple classes, so in the first line you are searching for classes `cb-lv-scrs-col`, `cb-font-12`, and `cb-text-complete`. – OneCricketeer Dec 23 '15 at 13:34
  • I don't think so. The page source has class="cb-lv-scrs-col cb-font-12 cb-text-complete", so I think it means the whole class. – RaviTej310 Dec 23 '15 at 13:47
  • I tried the solution from that question, as described in my question above. It didn't work. – RaviTej310 Dec 23 '15 at 13:54
  • What about the regex solution? And it shouldn't matter if it didn't work, this is still a duplicate question and should be closed. – OneCricketeer Dec 23 '15 at 13:56
  • I prefer not to use regex, because the main aim of writing this program was to scrape a web page without regex. – RaviTej310 Dec 23 '15 at 13:59
  • I don't mean directly regex on the HTML. If you look at the second answer, it has `soup.findAll(True, {"class": re.compile("^(equal|up)$")})` which would find classes `equal` or `up` – OneCricketeer Dec 23 '15 at 14:02
  • I don't want to import any additional modules like re. So that solution would not work for me. – RaviTej310 Dec 23 '15 at 14:04

1 Answer

I would make a simple CSS selector:

soup.select('div[class="cb-lv-scrs-col cb-font-12 cb-text-complete"],div[class="cb-scag-mtch-status cb-text-inprogress"]')

However, I doubt you really need to, or should, check every class present on an element. Wouldn't this be sufficient:

soup.select('div.cb-text-complete,div.cb-text-inprogress')
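To see how the two selectors behave, here is a minimal sketch. The HTML snippet and its text content are made up for illustration (only the class names come from the question), and it assumes Beautiful Soup 4 with the built-in "html.parser":

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class names are taken from the question.
html = """
<div class="cb-lv-scrs-col cb-font-12 cb-text-complete">Team A won by 5 wickets</div>
<div class="cb-scag-mtch-status cb-text-inprogress">Day 2: Stumps</div>
<div class="cb-lv-scrs-col cb-font-12">unrelated</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Exact match on the whole class attribute string (order-sensitive).
exact = soup.select(
    'div[class="cb-lv-scrs-col cb-font-12 cb-text-complete"],'
    'div[class="cb-scag-mtch-status cb-text-inprogress"]'
)

# Match on one distinguishing class per element; other classes are ignored.
loose = soup.select("div.cb-text-complete,div.cb-text-inprogress")

# Collect the text of each matched div.
summaries = [div.get_text(strip=True) for div in loose]
print(summaries)
```

On this sample both selectors pick up the first two divs and skip the third. The shorter selector keeps matching if the page adds or reorders other classes, which is why I would prefer it.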
alecxe
  • This finds all the elements with the mentioned classes, just like `findAll`. I just checked. Thanks. – RaviTej310 Dec 23 '15 at 15:41