2

From a link, I am trying to create two lists: one for countries and the other for currencies. However, I'm stuck: my code gives me only the first country name instead of iterating over all the countries. Any help with fixing this would be appreciated. Thanks in advance.

Here is my try:

from bs4 import BeautifulSoup
import urllib.request

url = "http://www.worldatlas.com/aatlas/infopage/currency.htm"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36'}

req = urllib.request.Request(url, headers=headers)
resp = urllib.request.urlopen(req)
html = resp.read()

soup = BeautifulSoup(html, "html.parser")
attr = {"class" : "miscTxt"}

countries = soup.find_all("div", attrs=attr)
countries_list = [tr.td.string for tr in countries]

for country in countries_list:
    print(country)
SIM

2 Answers

1

You can also use a single list comprehension to build a list of (country, currency) tuples, then split the tuples into two lists with map and zip:

temp_list = [
    (t[0].text.strip(), t[1].text.strip()) 
    for t in (t.find_all('td') for t in countries[0].find_all('tr'))
    if t
]

countries_list, currency_list = map(list,zip(*temp_list))
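
To make that last line concrete, here is a minimal sketch of the map/zip step on its own. The three pairs are made-up sample data standing in for the scraped rows, not output from the page:

```python
# Hypothetical sample pairs standing in for the scraped (country, currency) rows.
temp_list = [
    ("Afghanistan", "afghani"),
    ("Algeria", "dinar"),
    ("Andorra", "euro"),
]

# zip(*temp_list) regroups the pairs by position: all first items, then all second items.
# map(list, ...) converts each resulting tuple into a list.
countries_list, currency_list = map(list, zip(*temp_list))

print(countries_list)
print(currency_list)
```

Note that the unpacking expects exactly two groups, so it raises a ValueError if temp_list is empty, for example when no rows matched.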

The full code:

from bs4 import BeautifulSoup
import urllib.request

req = urllib.request.Request("http://www.worldatlas.com/aatlas/infopage/currency.htm")

soup = BeautifulSoup(urllib.request.urlopen(req).read(), "html.parser")

countries = soup.find_all("div", attrs = {"class" : "miscTxt"})

temp_list = [
    (t[0].text.strip(), t[1].text.strip()) 
    for t in (t.find_all('td') for t in countries[0].find_all('tr'))
    if t
]

countries_list, currency_list = map(list,zip(*temp_list))

print(countries_list)
print(currency_list)
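
As for why the original attempt printed only one name: find_all("div", ...) returns the matching div elements, so the loop runs once per div, and .td on a div returns only its first td descendant. The sketch below reproduces this offline. The inline HTML is a hypothetical imitation of the page's structure (one div wrapping a table), not the real markup:

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the page: a single div wrapping a table of rows.
html = """
<div class="miscTxt">
  <table>
    <tr><th>Country</th><th>Currency</th></tr>
    <tr><td>Afghanistan</td><td>afghani</td></tr>
    <tr><td>Algeria</td><td>dinar</td></tr>
  </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
divs = soup.find_all("div", attrs={"class": "miscTxt"})

# Looping over the divs visits one element; .td yields only its first <td>.
first_only = [div.td.string for div in divs]

# Looping over the rows inside the div reaches every country instead.
rows = [tr.find_all("td") for tr in divs[0].find_all("tr")]
# Header rows contain <th> cells only, so they produce no <td> cells and are skipped.
pairs = [
    (cells[0].text.strip(), cells[1].text.strip())
    for cells in rows
    if len(cells) >= 2
]

print(first_only)
print(pairs)
```

The len(cells) >= 2 check is a slightly stricter filter than `if t`; it is a design choice made here so that a row with a single cell cannot raise an IndexError.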
Bertrand Martel
0

Try this script. It should give you the country names along with their corresponding currencies. You don't need to send custom headers for this site.

from bs4 import BeautifulSoup
import urllib.request

url = "http://www.worldatlas.com/aatlas/infopage/currency.htm"
resp = urllib.request.urlopen(urllib.request.Request(url)).read()
soup = BeautifulSoup(resp, "lxml")

for item in soup.select("table tr"):
    cells = item.select("td")
    # Header rows have no <td> cells, so fall back to empty strings.
    country = cells[0].text.strip() if cells else ""
    # The currency is the second cell; guard against rows with a single cell.
    currency = cells[1].text.strip() if len(cells) > 1 else ""
    print(country, currency)

Partial Output:

Afghanistan afghani
Algeria dinar
Andorra euro
Argentina peso
Australia dollar
SIM