How can I extract only string in this phrase? (Python BeautifulSoup)

Question

This is the HTML source:

<td style="padding-right: 10px;" valign="top">1.1</td>
<td valign="top">
         If applicable, do <a href="url"> link </a> to switch one to other mode.<br/>
</td>

From the HTML above, how can I extract only the text? I tried the code below. The first call works, but the second one doesn't.

print(soup.find_all("td")[0].string)
print(soup.find_all("td")[1].string)

1.1
None

Welcome to StackOverflow. What is the output that you would want from a working version of the code? — Caridorc, Feb 08 '23 at 00:51

score 0 · Answer 1 · answered Feb 08 '23 at 00:56

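You should use the .text attribute rather than the .string attribute. .string returns None when a tag has more than one child, as your second <td> does (text, an <a> tag, more text, and a <br/>), whereas .text concatenates all the text inside the tag. The difference is explained in more detail here: Difference between .string and .text BeautifulSoup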
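To make the difference concrete, here is a minimal sketch using two small, made-up <td> fragments (they are illustrations, not the HTML from the question). It assumes bs4 is installed and uses the built-in "html.parser":

```python
import bs4

# One child (a single text node): .string returns that text.
single = bs4.BeautifulSoup("<td>1.1</td>", "html.parser").td
print(single.string)        # 1.1

# Several children (text, <a>, text): .string cannot pick one, so it is None.
mixed = bs4.BeautifulSoup('<td>do <a href="url">link</a> now</td>', "html.parser").td
print(mixed.string)         # None
print(len(mixed.contents))  # 3
print(mixed.text)           # do link now
```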

Here is the working version of your code:

import bs4

# The HTML from the question
source = """
<td style="padding-right: 10px;" valign="top">1.1</td>
<td valign="top">
         If applicable, do <a href="url"> link </a> to switch one to other mode.<br/>
</td>
"""

soup = bs4.BeautifulSoup(source, "html.parser")
# .text joins every piece of text inside the tag, including nested tags
print(soup.find_all("td")[0].text)
print(soup.find_all("td")[1].text)

With output:

1.1

         If applicable, do  link  to switch one to other mode.

Feel free to call .strip() on the result to remove the unwanted whitespace at the start and at the end.
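As a rough sketch of that cleanup on the second cell: .strip() only trims the ends, so the double spaces left around the link text remain. The split/join step is an extra suggestion for collapsing them, not something the code above does. The variable names cell, stripped, and collapsed are just for illustration:

```python
import bs4

source = """
<td style="padding-right: 10px;" valign="top">1.1</td>
<td valign="top">
         If applicable, do <a href="url"> link </a> to switch one to other mode.<br/>
</td>
"""

soup = bs4.BeautifulSoup(source, "html.parser")
cell = soup.find_all("td")[1]

# Trim leading/trailing whitespace only; inner double spaces remain.
stripped = cell.text.strip()
print(stripped)

# Optionally collapse every run of whitespace into a single space.
collapsed = " ".join(cell.text.split())
print(collapsed)
```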

Thx a lot! I can extract text thanks to you and I used translate() also. — Shockyn, Feb 09 '23 at 04:34
