I'm trying to log COVID data from a website and update it each day with new cases. So far I have managed to scrape the case numbers into a file, but each day I have to enter the dates manually and rerun the script to get the updated statistics. How would I go about writing a script that updates the CSV each day with the new date and the new number of cases, while keeping the old ones for future use? I wrote this and ran it in Visual Studio Code.

import csv
import bs4
import urllib
from urllib.request import  urlopen as uReq
from urllib.request import Request, urlopen
from bs4 import BeautifulSoup as soup

#For sites that can't be opened due to Urllib blocker, use a Mozilla User agent to get access
pageRequest = Request('https://coronavirusbellcurve.com/', headers = {'User-Agent': 'Mozilla/5.0'})
htmlPage = urlopen(pageRequest).read()
page_soup = soup(htmlPage, 'html.parser')
specificDiv = page_soup.find("div", {"class": "table-responsive-xl"})

TbodyStats = specificDiv.table.tbody.tr.contents
TbodyDates = specificDiv.table.thead.tr.contents

def writeCSV():
    with open('CovidHTML.csv','w', newline= '') as file:
        theWriter = csv.writer(file)  

        theWriter.writerow(['5/8', ' 5/9', ' 5/10',' 5/11',' 5/12'])
        row = []
        for i in range(3,len(TbodyStats),2):
            row.append(TbodyStats[i].text)  # append the cell text itself, not a one-item list

        theWriter.writerow(row)


writeCSV()
  • You could write code that reads the .csv, adds the data for the day and then rewrites the entire file; or you could read the .csv, detect what the last date was and then only append the data for new dates. A better route is probably to look at a library like `pandas` which has standard functions for all that stuff (reading and writing csv files and manipulating the data). – Grismar May 13 '20 at 00:15
  • I can't tell exactly how the new info is being added. If new information is new rows in the file, you can open the file in append mode ('a' instead of 'w') and when you write to it, the new info is appended. If the new info is in extra columns, then you'll have to clobber the file and do a full rewrite every day. – bfris May 13 '20 at 00:51
  • What operating system are you running this script from? If it is a Unix-like operating system you can use `crontab` to automate the calling of this script daily like you want. Here is an example: [link](https://stackoverflow.com/questions/8727935/execute-python-script-via-crontab/8728014#8728014). – Nana Owusu May 13 '20 at 01:25

1 Answer


If you want to preserve the older contents of the CSV file, open the file in append mode (as correctly pointed out by @bfris):

    with open('CovidHTML.csv','a', newline= '') as file:
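Note that append mode only helps if each run adds a new row. Here is a minimal sketch of that, assuming you switch to one row per day (a `date` column and a `cases` column) instead of the dates-as-columns layout in the question. The function name `append_daily_row` and the column names are hypothetical, chosen just for illustration.

```python
import csv
import os
from datetime import date


def append_daily_row(csv_path, day, cases):
    """Append one (date, cases) row; write the header only for a new or empty file.

    Assumes a one-row-per-day layout, which differs from the question's layout.
    """
    needs_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    # 'a' keeps the existing rows and adds the new one at the end
    with open(csv_path, "a", newline="") as file:
        writer = csv.writer(file)
        if needs_header:
            writer.writerow(["date", "cases"])
        writer.writerow([day.isoformat(), cases])
```

You would call it as something like `append_daily_row('CovidHTML.csv', date.today(), latest_count)`, where `latest_count` stands for whatever value your scraper extracts that day.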

If you are using Linux, you can set up a cron job to invoke the Python script every day at a specific time. First, locate the path to the Python interpreter using the `which` command:

$ which python3 

This gave me

/usr/bin/python3

Then the cron job will look like:

10 14 * * * /usr/bin/python3 /path/to/python/file.py

Add this line to the crontab file. It will run the Python script every day at 2:10 PM.
You can take a look here for details.
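Two things tend to matter once the script runs unattended: cron typically starts jobs from your home directory rather than the script's folder, so a relative path like `'CovidHTML.csv'` may end up somewhere unexpected, and a repeated run on the same day would log the day twice. A rough sketch of guarding against both, again assuming a one-row-per-day file with the date in the first column (the helper names `csv_next_to` and `already_logged` are hypothetical):

```python
import csv
from pathlib import Path


def csv_next_to(script_file, name="CovidHTML.csv"):
    """Build an absolute path beside the script, independent of the working directory."""
    return Path(script_file).resolve().parent / name


def already_logged(csv_path, day_text):
    """Return True if the first column of any row already equals day_text."""
    path = Path(csv_path)
    if not path.exists():
        return False
    with path.open(newline="") as file:
        # 'row and ...' skips blank lines, which csv.reader yields as empty lists
        return any(row and row[0] == day_text for row in csv.reader(file))
```

Inside the script you could then use `csv_next_to(__file__)` as the file path and skip the write when `already_logged(path, date.today().isoformat())` is true.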

In case you are using Windows, you can take a look at this question.

Kashinath Patekar