
I am trying to make a multi-line plot where one line shows daily counts from January to July 2019 and the other shows the same period in 2020. Before plotting, I use pd.concat to vertically combine the 2019 and 2020 dataframes so that the result can be passed as the input dataset to seaborn's lineplot function. However, the resulting plot is messy:

[image: the resulting plot]

plt.figure(figsize=(18,6))

ax5 = sns.lineplot(x='OBSERVATION_DATE_ONLY', y='OBSERVATION_COUNT', hue='OBSERVATION_YEAR', data=Bird_Concat1920['rinphe'])

#Reset x and y axis labels
ax5.set_xlabel('Observation Date')
ax5.set_ylabel('Observation Count')
ax5.xaxis.labelpad = 15
ax5.yaxis.labelpad = 15

Here is a snapshot of what the concatenated dataframe looks like:

[image: snapshot of the concatenated dataframe]

JohanC
  • Please share your code. – Pieter-Jan Aug 29 '20 at 10:27
  • Welcome to Stack Overflow! Please take a moment to read [How do I ask a good question?](https://stackoverflow.com/help/how-to-ask). You need to provide a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) that includes a toy dataset (refer to [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)) – Diziet Asahi Aug 29 '20 at 12:17

1 Answer


Two things need fixing. First, 2020 is a leap year, so February 29th is removed to make both years cover the same set of dates. Second, both years are given the same number of points on the x-axis, with the dates that have no data yet set to NaN.

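import pandas as pd
import numpy as np

# Toy data: a full year for 2019, and 2020 observed only for its first 241 days
date_2019 = pd.date_range('2019-01-01', '2019-12-31', freq='1D')
date_2020 = pd.date_range('2020-01-01', '2020-12-31', freq='1D')
val_2019 = np.random.randint(100, 300, (365,))
val_2020 = np.random.randint(100, 300, (241,))
# Pad the remaining 125 days with NaN so 2020 spans all 366 dates
# (np.nan rather than np.NaN, which was removed in NumPy 2.0)
val_2020 = np.insert(val_2020.astype(float), len(val_2020), [np.nan]*125)
df_2019 = pd.DataFrame({'Date':date_2019, 'OBSERVATION_COUNT': val_2019})
df_2020 = pd.DataFrame({'Date':date_2020, 'OBSERVATION_COUNT': val_2020})
# 2020 is a leap year: drop February 29th so both years share the same 365 dates
df_2020 = df_2020[df_2020['Date'] != '2020-02-29']
# ignore_index avoids duplicate index labels after stacking the two frames
Bird_Concat1920 = pd.concat([df_2019, df_2020], axis=0, ignore_index=True)
Bird_Concat1920['Date'] = pd.to_datetime(Bird_Concat1920['Date'])
# Year label for hue, and a month-day string as the x value shared by both years
Bird_Concat1920['OBSERVATION_YEAR'] = 'Year_' + Bird_Concat1920['Date'].dt.year.astype(str)
Bird_Concat1920['OBSERVATION_DATE_ONLY'] = Bird_Concat1920['Date'].dt.strftime('%m-%d')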

import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

fig = plt.figure(figsize=(18,6))

ax5 = sns.lineplot(x='OBSERVATION_DATE_ONLY', y='OBSERVATION_COUNT', hue='OBSERVATION_YEAR', data=Bird_Concat1920)

#Reset x and y axis labels
ax5.set_xlabel('Observation Date')
ax5.set_ylabel('Observation Count')
ax5.xaxis.labelpad = 15
ax5.yaxis.labelpad = 15

months = mdates.MonthLocator(interval=1)
months_fmt = mdates.DateFormatter('%b')
ax5.xaxis.set_major_locator(months)
ax5.xaxis.set_major_formatter(months_fmt)

[image: the resulting plot]
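As a possible variation on the same idea, both years can be mapped onto one common reference year so that the x-axis holds real datetimes instead of month-day strings. The following is only a sketch under assumptions: the helper name align_to_reference_year, the PLOT_DATE column, and the choice of 2019 as the reference year are hypothetical and not part of the original question or answer, and the toy data is random.

```python
import numpy as np
import pandas as pd

REFERENCE_YEAR = 2019  # hypothetical choice; any non-leap year would do


def align_to_reference_year(df, date_col="Date", ref_year=REFERENCE_YEAR):
    """Return a copy with Feb 29 dropped and a PLOT_DATE column in ref_year."""
    dates = pd.to_datetime(df[date_col])
    # Feb 29 has no counterpart in a non-leap reference year, so drop it
    is_leap_day = (dates.dt.month == 2) & (dates.dt.day == 29)
    out = df.loc[~is_leap_day].copy()
    dates = dates.loc[~is_leap_day]
    # Label used for hue, one value per original year
    out["OBSERVATION_YEAR"] = "Year_" + dates.dt.year.astype(str)
    # Same month and day as the original date, but in the reference year
    parts = pd.DataFrame(
        {"year": ref_year, "month": dates.dt.month, "day": dates.dt.day}
    )
    out["PLOT_DATE"] = pd.to_datetime(parts)
    return out


# Toy data: all of 2019, and 2020 up to August 28th (241 days)
rng = np.random.default_rng(0)
days_2019 = pd.date_range("2019-01-01", "2019-12-31", freq="D")
days_2020 = pd.date_range("2020-01-01", "2020-08-28", freq="D")
df_2019 = pd.DataFrame(
    {"Date": days_2019, "OBSERVATION_COUNT": rng.integers(100, 300, len(days_2019))}
)
df_2020 = pd.DataFrame(
    {"Date": days_2020, "OBSERVATION_COUNT": rng.integers(100, 300, len(days_2020))}
)
combined = pd.concat([df_2019, df_2020], ignore_index=True)
aligned = align_to_reference_year(combined)
```

The aligned frame could then be passed to sns.lineplot with x='PLOT_DATE', y='OBSERVATION_COUNT' and hue='OBSERVATION_YEAR'. Because the x values are real dates, mdates.MonthLocator and mdates.DateFormatter('%b') operate on actual datetimes, and the 2020 line simply ends at its last observed date without any NaN padding.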

r-beginners