
For my research, I want to save the number of articles for each country listed on the following site to a file, as pairs of country name and article count. I wrote the code below, but unfortunately it does not work.

http://corona.sid.ir/

!pip install bs4
from bs4 import BeautifulSoup  # parses the downloaded page
import requests  # downloads the web page

url = 'http://corona.sid.ir/'
data = requests.get(url).text
soup = BeautifulSoup(data, "lxml")  # create a soup object from 'data'
soup.find_all(attrs={"class": "value"})

Result: []

mo ta
  • Does this answer your question? [Web-scraping JavaScript page with Python](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python) – baduker May 24 '21 at 08:15
  • The question and answer in the linked post are general, whereas my problem is minor and specific to one site with its own structure. My friend @chitown88 helped, and I found out I had entered the site address incorrectly :) – mo ta May 24 '21 at 11:23
  • @mota, well, it's not that you entered the address incorrectly; the site gets that data from the other URL and then renders it in the page at the original URL. There are two ways to go about it: a) use the original URL, but let the page render the data before parsing it, or b) as the link baduker provided suggests, go to the URL the data is sourced from. We just went straight to the source. – chitown88 May 24 '21 at 12:23

1 Answer


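You are using the wrong URL. The site loads that data from a separate source URL and renders it in the page you requested, so request the source directly: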

from bs4 import BeautifulSoup  # parses the downloaded markup
import requests  # downloads the file
import pandas as pd

url = 'http://corona.sid.ir/world.svg'
data = requests.get(url).text
soup = BeautifulSoup(data, "lxml")  # create a soup object from 'data'

rows = []
for each in soup.find_all(attrs={"class": "value"}):
    # Split each element's text on ':' into the country name and the count.
    country, count = each.text.split(':')[:2]
    rows.append({'country': country, 'count': count.strip()})

df = pd.DataFrame(rows)

Output:

print(df)
                  country count
0                 Andorra    17
1    United Arab Emirates   987
2             Afghanistan    67
3                 Albania   143
4                 Armenia    49
..                    ...   ...
179                 Yemen    54
180               Mayotte     0
181          South Africa  1938
182                Zambia   127
183              Zimbabwe   120

[184 rows x 2 columns]
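
Since the question asks for the counts to be saved to a file, here is a minimal sketch of that last step using only the standard library. It assumes `rows` is the list of dicts with 'country' and 'count' keys built above, and that every count is a plain digit string; the function name `save_counts` and the file names are hypothetical.

```python
import csv


def save_counts(rows, path="article_counts.csv"):
    """Write rows like {'country': 'Yemen', 'count': '54'} to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["country", "count"])
        writer.writeheader()
        for row in rows:
            # Counts are scraped as strings; store them as integers.
            # Assumption: the string contains digits only.
            writer.writerow({"country": row["country"], "count": int(row["count"])})
    return path


# Example with two rows taken from the output above.
sample = [
    {"country": "Andorra", "count": "17"},
    {"country": "Yemen", "count": "54"},
]
save_counts(sample, "sample_counts.csv")
```

If you already have the DataFrame, `df.to_csv("article_counts.csv", index=False)` should write an equivalent file in one line.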
chitown88