It might be easier to use pandas directly to read and extract the table, as described here.
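A minimal sketch of the pandas route, assuming the proxy list is the first table in the HTML and that it has a "Country" column. The helper name `us_proxies` and the inline sample HTML are made up for illustration so that the snippet runs offline; `pandas.read_html` also needs an HTML parser such as lxml installed.

```python
from io import StringIO

import pandas as pd

# Made-up stand-in for the page's HTML (assumed layout, not the real page).
SAMPLE_HTML = """
<table>
  <thead>
    <tr><th>IP</th><th>Port</th><th>Code</th><th>Country</th></tr>
  </thead>
  <tbody>
    <tr><td>168.8.172.2</td><td>80</td><td>US</td><td>United States</td></tr>
    <tr><td>192.0.2.10</td><td>8080</td><td>DE</td><td>Germany</td></tr>
  </tbody>
</table>
"""


def us_proxies(html: str) -> pd.DataFrame:
    """Return the rows of the first table whose Country is "United States"."""
    # read_html returns a list of DataFrames, one per <table> it finds.
    tables = pd.read_html(StringIO(html))
    df = tables[0]  # assumption: the proxy list is the first table
    return df[df["Country"] == "United States"].reset_index(drop=True)


print(us_proxies(SAMPLE_HTML))
```

For the live page, the same helper could presumably be called as `us_proxies(requests.get("https://sslproxies.org/").text)`, provided the page layout matches these assumptions.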
Otherwise, you can check whether any cell in a row is "United States" and, if so, keep that row's non-empty cells:
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    if "United States" in cols:
        data.append([ele for ele in cols if ele])
Complete code:
from bs4 import BeautifulSoup as bs
import requests

# loading web page
r = requests.get("https://sslproxies.org/")

# convert to a beautiful-soup object
webpage = bs(r.content, "html.parser")

data = []
rows = webpage.find('table').find_all('tr')
for row in rows:
    # text of every <td> cell in this row, stripped of whitespace
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    # keep the row only if one of its cells is exactly "United States"
    if "United States" in cols:
        data.append([ele for ele in cols if ele])

print(data)
The list data will contain all the information you need, one entry per matching row (shown here in tabular form):
    IP               Port  Code  Country        Anonymity    Google  Https  Last Checked
0   168.8.172.2      80    US    United States  elite proxy  no      yes    17 secs ago
1   20.47.108.204    8888  US    United States  anonymous    no      yes    10 mins ago
2   68.65.184.223    8888  US    United States  anonymous    yes     yes    3 hours 11 mins ago
3   3.82.203.47      3128  US    United States  anonymous    no      yes    4 hours 38 mins ago
4   20.84.106.205    8214  US    United States  elite proxy  yes     yes    5 hours 40 mins ago
5   150.136.139.194  3128  US    United States  elite proxy  yes     yes    6 hours 30 mins ago
6   172.104.24.22    3128  US    United States  anonymous    no      yes    7 hours 40 mins ago
7   159.65.69.186    9300  US    United States  anonymous    no      yes    7 hours 41 mins ago
8   47.252.4.64      8888  US    United States  anonymous    no      yes    9 hours 34 mins ago
9   12.144.254.185   9080  US    United States  anonymous    no      yes    9 hours 34 mins ago
10  66.94.116.111    3128  US    United States  anonymous    no      yes    9 hours 34 mins ago
11  35.170.197.3     8888  US    United States  anonymous    no      yes    9 hours 34 mins ago
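If you would rather have labelled rows like those in the table above than bare lists, one option is to read the header cells as well and pair them with each row. This is only a sketch: the function name `rows_for_country` and the sample HTML are made up for illustration, and it assumes the table's header cells are `<th>` elements, one of which is "Country".

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the page's HTML (assumed layout, not the real page).
SAMPLE_HTML = """
<table>
  <thead>
    <tr><th>IP</th><th>Port</th><th>Code</th><th>Country</th></tr>
  </thead>
  <tbody>
    <tr><td>168.8.172.2</td><td>80</td><td>US</td><td>United States</td></tr>
    <tr><td>192.0.2.10</td><td>8080</td><td>DE</td><td>Germany</td></tr>
  </tbody>
</table>
"""


def rows_for_country(html, country="United States"):
    """Return one dict per table row whose "Country" cell equals `country`."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []  # no table on the page

    # Column names come from the <th> cells (assumption about the markup).
    headers = [th.get_text(strip=True) for th in table.find_all("th")]

    records = []
    for row in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) != len(headers):
            continue  # header row (no <td> cells) or a malformed row
        record = dict(zip(headers, cells))
        # Compare the Country column only, not every cell in the row.
        if record.get("Country") == country:
            records.append(record)
    return records


print(rows_for_country(SAMPLE_HTML))
```

Matching on the "Country" column rather than on any cell avoids false positives if another column ever contains the same text, and keeping empty cells in place stops the columns from shifting.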