I have a 1st CSV panda dataframe like this:
Date_reported Country New_cases Cumulative_cases
0 23-Feb-20 Singapore 10 10
0 26-Mar-20 Singapore 100 110
0 27-Apr-20 Singapore 200 310
0 28-May-20 Singapore 400 710
0 30-Jun-20 Singapore 1000 1710
And I managed to churn out a grouped data of a monthly case in this form:
Date_reported
February 10
March 100
April 200
May 400
June 1000
Using this code:
df = pd.read_csv('Stackoverflow1.csv')
df['Date_reported'] = pd.to_datetime(df['Date_reported'])
df.groupby(df['Date_reported'].dt.strftime('%B'))['New_cases'].sum().sort_values()
I also have a 2nd panda dataframe like this:
Variables 2020 Feb 2020 Mar 2020 Apr 2020 May 2020 June
Hotel Revenue $100 $90 $80 $70 $60
Occupancy Rate 80% 70% 60% 50% 40%
My end result is to merge the data on the monthly reported cases from the first dataframe and variables from on the second dataframe (i.e. Monthly covid cases against Hotel Revenue) based on individual months and plot a simple graph.
i.e.
Date_reported Monthly_cases Hotel Revenue Occupancy Rate
February 10 $100 80%
March 100 $90 70%
April 200 $80 60%
May 400 $70 50%
June 1000 $60 40%
However, I am stuck at two problems:
I cannot convert the grouped data on the monthly cases itself as a list to plot against a variable from the second data frame although I managed to churn out the monthly summary.
Although I managed to transpose the 2nd dataframe using this code,
hotel = pd.read_csv('Stackoverflow2.csv') hotel2 = hotel.T headers = hotel2.iloc[0] hotel2 = pd.DataFrame(hotel2.values[1:], columns=headers)
I do not have the dates as an index shown below:
Variables Hotel Revenue Occupancy Rate
2020 Feb $100 80%
2020 Mar $90 70%
2020 Apr $80 60%
2020 May $70 50%
2020 Jun $60 40%
Your help would be greatly appreciated as I am a beginner struggling with pandas and python. Do let me know if you have a better way of presenting this. Thank you.