I have 4 datasets with similar dates. Have extracted the data range need for the columns from the extracted datasource. Want to have the four countries: Denmark, Norway, Sweden, and Finland have a single Date column.
Am able to get the panda data frames to print out invidually with their corresponding Date column and Country, but I want a single unifying column that corresponds with all four countries.
Here is what prints out, when I run the panda df head method:
DATE DENMARK
1563 2021-01-02 -65.261383
1564 2021-01-03 -60.405405
1565 2021-01-04 -69.371429
1566 2021-01-05 -73.690932
1567 2021-01-06 -76.257310
DATE FINLAND
1883 2021-01-02 -75.416667
1884 2021-01-03 -77.408056
1885 2021-01-04 -75.598086
1886 2021-01-05 -79.251701
1887 2021-01-06 -80.445151
DATE NORWAY
4603 2021-01-02 -39.164087
4604 2021-01-03 -31.763620
4605 2021-01-04 -34.002770
4606 2021-01-05 -33.815689
4607 2021-01-06 -43.454791
DATE SWEDEN
5883 2021-01-02 -65.853659
5884 2021-01-03 -66.385669
5885 2021-01-04 -70.704467
5886 2021-01-05 -73.450509
5887 2021-01-06 -80.805484
And here is the code:
import requests
import pandas as pd
import json
import math
import datetime
from jinja2 import Template, Environment
from datetime import date
r = requests.get('https://docs.google.com/spreadsheets/d/1GJ6CvZ_mgtjdrUyo3h2dU3YvWOahbYvPHpGLgovyhtI/gviz/tq?usp=sharing&tqx=reqId%3A0output=jspn')
data = r.content
data = json.loads(data.decode('utf-8').split("(", 1)[1].rsplit(")", 1)[0])
d = [[i['c'][0]['v'], i['c'][2]['f'], (i['c'][5]['v'])*100 ] for i in data['table']['rows']]
df = pd.DataFrame(d, columns=['Country', 'Date', '% vs 2019 (Daily)'])
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
# EXTRACTING BETWEEN TWO DATES
df['Date'] = pd.to_datetime(df['Date'])
startdate = datetime.datetime.strptime('2021-01-01', "%Y-%m-%d").date()
enddate = datetime.datetime.strptime('2021-02-02', "%Y-%m-%d").date()
pd.Timestamp('today').floor('D')
df = df[(df['Date'] > pd.Timestamp(startdate).floor('D')) & (df['Date'] <= pd.Timestamp(enddate).floor('D'))]
Den = df.loc[df['Country'] == 'Denmark']
Fin = df.loc[df['Country'] == 'Finland']
Swe = df.loc[df['Country'] == 'Sweden']
Nor = df.loc[df['Country'] == 'Norway']
Den_data = Den.loc[: , "% vs 2019 (Daily)"]
Den_date = Den.loc[: , "Date"]
Nor_data = Nor.loc[: , "% vs 2019 (Daily)"]
Nor_date = Nor.loc[: , "Date"]
Fin_data = Fin.loc[: , "% vs 2019 (Daily)"]
Fin_date = Fin.loc[: , "Date"]
Swe_data = Swe.loc[: , "% vs 2019 (Daily)"]
Swe_date = Swe.loc[: , "Date"]
df1 = pd.DataFrame()
df1['DATE'] = Den_date
df1['DENMARK'] = Den_data
df2 = pd.DataFrame()
df2['DATE'] = Fin_date
df2['FINLAND'] = Fin_data
df3 = pd.DataFrame()
df3['DATE'] = Nor_date
df3['NORWAY'] = Nor_data
df4 = pd.DataFrame()
df4['DATE'] = Swe_date
df4['SWEDEN'] = Swe_data
print(df1.head())
print(df2.head())
print(df3.head())
print(df4.head())
Is it possible to unify the date to a single date column that corresponds to each country?
On the order of this:
DATE | DENMARK | FINLAND | SWEDEN | NORWAY
------------------------------------------------
2021-01-01 | 43.4 | 55.6 | 75.4 | 34.5