
I have been struggling with something, and nothing I tried worked. I need to chain several replace calls, but Python seemed to let me use replace only once.

This is my CSV output: https://i.stack.imgur.com/HtBSn.png

First, some values show up as N/A. They should be 0 or something similar; in short, they should still be a string.

Second, there are spaces in some country names, such as North Macedonia, that shouldn't be there.


import csv
import requests
from bs4 import BeautifulSoup
from csv import QUOTE_NONE
from csv import writer


response = requests.get('https://www.worldometers.info/coronavirus/#news').content

soup = BeautifulSoup(response,'lxml')

tbody=soup.find('table', id='main_table_countries_today').find('tbody').find_all('tr')[100:110]

with open('corona1.csv','w', newline='') as csv_file:
    csv_writer = writer(csv_file, escapechar=' ', quoting=csv.QUOTE_NONE)
    csv_writer.writerow(['countries','total_cases','total_deaths','total_recovered','active_cases','total_cases_in_1m','deaths_in_1m','population'])



    for value in tbody:
        countries = value.find_all('td')[1].text.replace(",", "").strip()
        total_cases = value.find_all('td')[2].text.replace(",", "").strip()
        total_deaths = value.find_all('td')[4].text.replace(",", "").strip()
        total_recovered = value.find_all('td')[6].text.replace(",", "").strip()
        active_cases = value.find_all('td')[8].text.replace(",", "").strip()
        total_cases_in_1m = value.find_all('td')[10].text.replace(",", "").strip()
        deaths_in_1m = value.find_all('td')[11].text.replace(",", "").strip()
        population = value.find_all('td')[14].text.replace(",", "").strip()

        csv_writer.writerow([countries,total_cases,total_deaths,total_recovered,active_cases,total_cases_in_1m,deaths_in_1m,population])



This is my current Python code. What should I change?

I would like to have something like:

total_recovered=value.find_all('td')[6].text.replace(",", "").replace("N/A","0").replace(" ","").strip()

  • Are you running this code on Windows? If so, the extra line between rows could be the result of a behavior detailed here -> https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row – Tanner Oct 31 '22 at 19:48

1 Answer


Edit: This code works for me. I moved the repetitive work into a function and call it inside csv_writer.writerow.

import csv
import requests
from bs4 import BeautifulSoup
from csv import writer


response = requests.get('https://www.worldometers.info/coronavirus/#news').content

soup = BeautifulSoup(response,'lxml')

tbody=soup.find('table', id='main_table_countries_today').find('tbody').find_all('tr')[100:110]

replacement = {
    ",": "",
    "N/A": "0",
    "\n": "",
    " ": ""
}

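def cleanup(webcontent, indices):
    """Return the cleaned text of the <td> cells at the given indices."""
    cells = webcontent.find_all('td')  # look up the row's cells only once
    out = []
    for index in indices:
        content = cells[index].text
        # apply every replacement in turn
        for old, new in replacement.items():
            content = content.replace(old, new)
        out.append(content)
    return out
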
with open('corona1.csv','w', newline='') as csv_file:  # newline='' avoids blank rows on Windows
    csv_writer = writer(csv_file, escapechar=' ', quoting=csv.QUOTE_NONE)
    csv_writer.writerow(['countries','total_cases','total_deaths','total_recovered','active_cases','total_cases_in_1m','deaths_in_1m','population'])

    for value in tbody:
        csv_writer.writerow(cleanup(value, [1,2,4,6,8,10,11,14]))
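
To check the replacement step on its own, without fetching the page, here is a minimal sketch that works on plain strings. The helper name clean_cell and the sample strings are made up for illustration; only the replacement table comes from the code above. It also shows that chaining str.replace directly, as in the question, gives the same result, since each call returns a new string that the next call works on.

```python
# Same replacement table as in the answer above.
REPLACEMENTS = {",": "", "N/A": "0", "\n": "", " ": ""}


def clean_cell(text: str) -> str:
    """Apply every replacement in turn (hypothetical helper for illustration)."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text


# Illustrative inputs, not real scraped values.
number = clean_cell("1,234,567")
missing = clean_cell("N/A")
country = clean_cell("\nNorth Macedonia ")

# Chaining replace calls by hand, as attempted in the question.
chained = " N/A\n".replace(",", "").replace("N/A", "0").replace(" ", "").strip()
```

Note that replacing " " with "" also removes the space inside country names, which is what the question asks for; drop that entry and use strip() instead if you only want to trim the edges.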

Note: If you open the file in Excel, it may not be formatted correctly, although most other programs and APIs read it fine. You have to change the separator in Excel. Have a look at "Import or export text (.txt or .csv) files" in the Excel help.
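
An alternative to changing the setting in Excel is to write the file with a different delimiter. The sketch below assumes your Excel expects semicolons, which depends on the regional settings, so treat that as an assumption. The function name to_csv_text and the sample rows are made up for illustration; csv.writer and its delimiter parameter are standard library.

```python
import csv
import io

# Made-up rows, only to show the output format.
rows = [
    ["countries", "total_cases"],
    ["NorthMacedonia", "1000"],
]


def to_csv_text(rows, delimiter=";"):
    """Serialize rows to CSV text using the given delimiter."""
    buffer = io.StringIO(newline="")  # keep the writer's line endings untouched
    csv_writer = csv.writer(buffer, delimiter=delimiter)
    csv_writer.writerows(rows)
    return buffer.getvalue()


text = to_csv_text(rows)
```

When writing to a real file, pass the same delimiter to writer(csv_file, delimiter=';') and keep newline='' in open().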

JPudel