No results calling find_all for text in a specific class

Question

I'm trying to get all the text for a specific class, but it is returning an empty list:

>>> soup.find_all(' dataRow odd')
[]

html:

<tr class=" dataRow odd" onblur="if (window.hiOff){hiOff(this);}" 
onfocus="if (window.hiOn){hiOn(this);}" onmouseout="if (window.hiOff){hiOff(this);}" 
onmouseover="if (window.hiOn){hiOn(this);}"><td class='actionColumn'>&nbsp;</td><th scope="row" class=" dataCell  ">
<a href="/a0I9000000hHJIN?btdid=0019000001piFE9">textexttext</a></th><td class=" dataCell  ">Active</td><td class=" dataCell  ">
<a href="/a089000001nOvG8?btdid=0019000001piFE9">BIG TEXT</a></td>
<td class=" dataCell  ">TEXTTEXTTEXT</td><td class=" dataCell  ">TEXTTEXTTEXT</td>
<td class=" dataCell  "> </td><td class=" dataCell  ">&nbsp;</td><td class=" dataCell  DateElement">8/02/2019</td></tr>

I'm trying to grab ALL text within that code. But when I run my code it returns [] as if it didn't find anything.

import requests, bs4, re
html = open('2.html')
soup = bs4.BeautifulSoup(html, "lxml")
duh = soup.find_all(' dataRow odd')
print(duh)

Where am I going wrong? Ideally, the code would also print each piece of text on its own line.

I believe your `findAll()` is being given the wrong argument. You would need to `findAll('tr', {"class": ' dataRow odd'})`. As in [this](https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class) question. — ktb, Jun 11 '17 at 05:30
Thanks, Problem is it now spits out the entire code. I'm trying to isolate just the text and print it just from the text — Alex, Jun 11 '17 at 06:00
Python doesn't have `nil`. See [Ruby use case for nil, equivalent to Python None or JavaScript undefined](https://stackoverflow.com/questions/3884004/ruby-use-case-for-nil-equivalent-to-python-none-or-javascript-undefined) and [What is closer to python None: nil or NULL?](https://stackoverflow.com/questions/25498810/what-is-closer-to-python-none-nil-or-null) — Peter Wood, Jun 11 '17 at 06:14
Hi yeah have read the manual, still need help, thanks anyway. Yes, I meant none. it prints as [] — Alex, Jun 11 '17 at 10:51

score 0 · Answer 1 · answered Jun 14 '17 at 22:10

Querying for the dataRow odd class returns the enclosing <tr>, including everything nested inside it (<th>, <td>, <a> and so on). To get just the text, read the row's .text property. It returns all the text in the row as a single string, with no HTML:

for d in duh:
    print(d.text)
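
As a rough illustration of why the original call came back empty and what the text lookup returns, here is a minimal sketch. The `SAMPLE` markup is a trimmed-down stand-in for the row in the question (an assumption, not the real page), and the CSS selector via `select()` is one way to match the classes, not necessarily the only one.

```python
import bs4

# Trimmed-down stand-in for the row in the question (assumed, not the real page).
SAMPLE = (
    '<table><tr class=" dataRow odd">'
    '<th class=" dataCell  "><a href="#">textexttext</a></th>'
    '<td class=" dataCell  ">Active</td>'
    '<td class=" dataCell  DateElement">8/02/2019</td>'
    '</tr></table>'
)

soup = bs4.BeautifulSoup(SAMPLE, "html.parser")

# The first positional argument of find_all() is a tag name, so this searches
# for elements literally named " dataRow odd" and finds none.
by_name = soup.find_all(' dataRow odd')

# A CSS selector matches on the classes themselves.
rows = soup.select('tr.dataRow.odd')

# get_text() with a separator and strip=True puts each text fragment on its own line.
lines = rows[0].get_text(separator='\n', strip=True)

print(by_name)
print(lines)
```

Passing `separator='\n'` and `strip=True` to `get_text()` is a quick way to get one value per line without looping over the cells yourself.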

Alternatively, fetch the cells (<th> and <td>) within that <tr> and read .text from each one, which puts every value on its own line:

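import bs4

with open('test.html') as html:
    # Python's built-in HTML parser; "lxml" is also an HTML parser and works if installed
    soup = bs4.BeautifulSoup(html, "html.parser")

# Same idea as ktb's suggestion from the comments, written as a CSS selector so the
# match does not depend on the exact whitespace in the class attribute
duh = soup.select('tr.dataRow.odd')
for d in duh:
    tds = d.find_all(['th', 'td'])  # cells only; a bare find_all() would also return the nested <a> tags
    for td in tds:
        cleaned = td.text.strip()  # strip() removes spaces, newlines and &nbsp;
        if cleaned != '':
            print(cleaned)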
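
Which tags the inner lookup asks for matters. A `find_all()` call with no arguments returns every descendant tag, so text inside a link is visited twice, once through its cell and once through the `<a>` itself. The sketch below shows the difference on a cut-down copy of the row. `SAMPLE` and the helper `texts` are made up for illustration and are not part of the original answer.

```python
import bs4

# Cut-down copy of the row from the question (assumed, not the real page).
SAMPLE = (
    '<table><tr class=" dataRow odd">'
    '<td class="actionColumn">&nbsp;</td>'
    '<th scope="row" class=" dataCell  "><a href="#">textexttext</a></th>'
    '<td class=" dataCell  ">Active</td>'
    '<td class=" dataCell  "> </td>'
    '<td class=" dataCell  DateElement">8/02/2019</td>'
    '</tr></table>'
)


def texts(tags):
    """Return the non-empty, whitespace-stripped text of each tag (hypothetical helper)."""
    cleaned = (tag.get_text().strip() for tag in tags)
    return [text for text in cleaned if text]


soup = bs4.BeautifulSoup(SAMPLE, "html.parser")
row = soup.select_one('tr.dataRow.odd')

every_descendant = texts(row.find_all())        # cells plus the <a> nested in one of them
cells_only = texts(row.find_all(['th', 'td']))  # header and data cells only

print(every_descendant)
print(cells_only)
```

With this sample, the first list contains the link text twice and the second contains it once. Cells holding only a space or `&nbsp;` drop out of both.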
