0

I am new to the community. I am working on a project that extracts addresses from an HTML file. The specific string I am trying to process is:

<address class="list-card-addr">1867 Central Ave, Augusta, GA 30904</address>

I have tried processing it using manual tools, but I'd like to use Python to process the entire HTML file. Can someone explain how to do this in Python? Thank you in advance.

2 Answers

1

You can extract the address using BeautifulSoup, which is very handy for accessing elements in HTML and XML documents.

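from bs4 import BeautifulSoup
import requests

# url must hold the address of the page to scrape; it is not defined in this snippet
r = requests.get(url)
html = r.text
soup = BeautifulSoup(html, "html.parser")
# find() returns the first matching element, or None if nothing matches
addr = soup.find("address", class_="list-card-addr")
if addr is not None:
    print(addr.text)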

If there are multiple addresses in the target HTML, use find_all() and a loop to access every address element.

for addr in soup.find_all("address", class_="list-card-addr"):
    print(addr.text)
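
Since the question is about an HTML file rather than a live URL, here is a minimal sketch of the same find_all() approach applied to markup that is already in memory. The sample string, the second (non-matching) address element, and the function name extract_addresses are assumptions made for illustration; only the first address comes from the question.

```python
from bs4 import BeautifulSoup

# Sample markup standing in for the contents of the HTML file (assumption).
html = """
<ul>
  <li><address class="list-card-addr">1867 Central Ave, Augusta, GA 30904</address></li>
  <li><address class="other">not a listing address</address></li>
</ul>
"""

def extract_addresses(markup):
    """Return the text of every <address class="list-card-addr"> element."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        tag.get_text(strip=True)
        for tag in soup.find_all("address", class_="list-card-addr")
    ]

addresses = extract_addresses(html)
print(addresses)
```

For a file on disk, the same function should work on the file's contents, e.g. open("listings.html", encoding="utf-8").read(), where listings.html is a hypothetical file name.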
CodeMonkey
0

Use a regular expression to find the addresses.

import re
# html is assumed to hold the page source as a string
r1 = re.findall(r"<address class=\"?list-card-addr\"?>([^<]+)", html)
print(r1)
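
The pattern above only matches when class is the sole attribute on the tag. A slightly more tolerant variant might look like the sketch below, which allows extra attributes and extra class names and decodes entities such as &amp;. The second and third sample elements, ADDRESS_RE, and find_addresses are made up for illustration, and a regex remains fragile on real-world HTML compared with a parser.

```python
import re
from html import unescape

# Matches <address ...> tags whose class attribute contains list-card-addr,
# regardless of other attributes, and captures the text up to </address>.
ADDRESS_RE = re.compile(
    r'<address\b[^>]*\bclass="[^"]*\blist-card-addr\b[^"]*"[^>]*>(.*?)</address>',
    re.IGNORECASE | re.DOTALL,
)

def find_addresses(markup):
    """Return the cleaned text of each matching address element."""
    return [unescape(text).strip() for text in ADDRESS_RE.findall(markup)]

# First element is from the question; the other two are invented test data.
sample = """
<address class="list-card-addr">1867 Central Ave, Augusta, GA 30904</address>
<address data-id="7" class="list-card-addr featured">
  10 Example Rd &amp; Suite 5, Springfield
</address>
<address class="footer">ignored</address>
"""

found = find_addresses(sample)
print(found)
```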