The data here is scraped from a website. The initial data, fetched into the variable 'r', has three columns: 'Country', 'Date', '% vs 2019 (Daily)'. From these I was able to extract only the rows I wanted, with dates from "2021-01-01" to today. What I am trying to do (and have spent hours on) is organize the data so that there is one column with the dates that correspond to the percentage data, then four other columns named after the countries: Denmark, Finland, Norway, Sweden. The cells under those four countries should be populated with the percent data. I have tried [], loc, iloc, and various combinations of them to filter the pandas DataFrames to make this happen, but to no avail.
Here is the code I have so far:
import requests
import pandas as pd
import json
import math
import datetime
from jinja2 import Template, Environment
from datetime import date
r = requests.get('https://docs.google.com/spreadsheets/d/1GJ6CvZ_mgtjdrUyo3h2dU3YvWOahbYvPHpGLgovyhtI/gviz/tq?usp=sharing&tqx=reqId%3A0output=jspn')
data = r.content
data = json.loads(data.decode('utf-8').split("(", 1)[1].rsplit(")", 1)[0])
d = [[i['c'][0]['v'], i['c'][2]['f'], (i['c'][5]['v'])*100 ] for i in data['table']['rows']]
df = pd.DataFrame(d, columns=['Country', 'Date', '% vs 2019 (Daily)'])
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
# EXTRACTING BETWEEN TWO DATES
df['Date'] = pd.to_datetime(df['Date'])
startdate = pd.Timestamp('2021-01-01')
enddate = pd.Timestamp('today').floor('D')  # today at midnight
# Keep rows from the start date through today, both ends included
df = df[(df['Date'] >= startdate) & (df['Date'] <= enddate)]
Den = df.loc[df['Country'] == 'Denmark']
Fin = df.loc[df['Country'] == 'Finland']
Swe = df.loc[df['Country'] == 'Sweden']
Nor = df.loc[df['Country'] == 'Norway']
Den_data = Den.loc[: , "% vs 2019 (Daily)"]
Den_date = Den.loc[: , "Date"]
Nor_data = Nor.loc[: , "% vs 2019 (Daily)"]
Swe_data = Swe.loc[: , "% vs 2019 (Daily)"]
Fin_data = Fin.loc[: , "% vs 2019 (Daily)"]
Fin_date = Fin.loc[: , "Date"]
df2 = pd.DataFrame()
df2['DEN_DATE'] = Den_date
df2['DENMARK'] = Den_data
df3 = pd.DataFrame()
df3['FIN_DATE'] = Fin_date
df3['FINLAND'] = Fin_data
I want it organized like this so I can eventually export it to Excel:
Date | Denmark | Finland | Norway | Sweden
2020-01-01 | 1234 | 4321 | 5432 | 6574
...
Any help is greatly appreciated. Thank you.
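One possible approach to the layout described above is to reshape the long table into a wide one with a pivot, rather than building each country's columns separately. The sketch below is only an illustration under assumptions: the sample rows and their values are made up, `countries` and `wide` are names introduced here, and `pivot_table` with `aggfunc='first'` is used so that a repeated (Date, Country) pair would not raise an error. Building separate frames such as df2 and df3 tends not to line up because each filtered Series keeps its original row labels.

```python
import pandas as pd

# Made-up sample rows in the same long format as the question's df
# (the numbers are illustrative, not real data).
df = pd.DataFrame({
    'Country': ['Denmark', 'Finland', 'Norway', 'Sweden',
                'Denmark', 'Finland', 'Norway', 'Sweden'],
    'Date': pd.to_datetime(['2021-01-02'] * 4 + ['2021-01-03'] * 4),
    '% vs 2019 (Daily)': [-60.0, -55.0, -50.0, -45.0,
                          -61.0, -56.0, -51.0, -46.0],
})

countries = ['Denmark', 'Finland', 'Norway', 'Sweden']

# Keep only the four countries, then turn each country into its own column,
# with one row per date.
wide = (
    df[df['Country'].isin(countries)]
    .pivot_table(index='Date', columns='Country',
                 values='% vs 2019 (Daily)', aggfunc='first')
    .reindex(columns=countries)   # fix the column order
    .reset_index()                # make 'Date' a regular column again
)
wide.columns.name = None  # drop the leftover 'Country' axis label

print(wide)
```

From there, something like `wide.to_excel('output.xlsx', index=False)` should write the table to Excel, assuming an Excel writer such as openpyxl is installed; the file name is just a placeholder.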