I would like to plot a time series that starts in Oct-2015 and ends in Feb-2018 in a single graph, with each year drawn as its own line. The values are int64 and are stored in a Pandas DataFrame; the dates are datetime64[ns] in another column of the same DataFrame.

How would I create a graph running from Jan to Dec with one line per year (4 lines in total)?

The columns used are graph['share_price'] and graph['date']. I have tried Grouper, but that somehow takes the Oct-2015 values and mixes them with the January values from all the other years.

This groupby is close to what I want, but I lose the information about which year each element of the list belongs to.

graph.groupby('date').agg({'share_price':lambda x: list(x)})

I then created a DataFrame with 4 columns, one per year, but I still don't know how to group these 4 columns so that I can plot the graph the way I want.

ccasimiro9444

1 Answer

You can achieve this by:

  1. extracting the year from the date
  2. replacing the dates by the equivalent without the year
  3. setting both the year and the date as index
  4. unstacking the values by year

At this point, each year will be a column, and each date within the year a row, so you can just plot normally.

Here's an example.

Assuming that your DataFrame looks something like this:

>>> import pandas as pd
>>> import numpy as np
>>> index = pd.date_range('2015-10-01', '2018-02-28')
>>> values = np.random.randint(-3, 4, len(index)).cumsum()
>>> df = pd.DataFrame({
...    'date': index,
...    'share_price': values
... })
>>> df.head()
        date  share_price
0 2015-10-01            0
1 2015-10-02            3
2 2015-10-03            2
3 2015-10-04            5
4 2015-10-05            4
>>> df.set_index('date').plot()


You would transform the DataFrame as follows:

>>> df['year'] = df.date.dt.year
>>> df['date'] = df.date.dt.strftime('%m-%d')
>>> unstacked = df.set_index(['year', 'date']).share_price.unstack('year')
>>> unstacked.head()
year   2015  2016  2017  2018
date                         
01-01   NaN  28.0 -16.0  21.0
01-02   NaN  29.0 -14.0  22.0
01-03   NaN  29.0 -16.0  22.0
01-04   NaN  26.0 -15.0  23.0
01-05   NaN  25.0 -16.0  21.0
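The same reshape can also be written as a single pivot, which some readers may find easier to follow than set_index plus unstack. This is only a sketch: the helper name one_column_per_year, the intermediate column name day, and the seeded random data are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd


def one_column_per_year(df, date_col="date", value_col="share_price"):
    """Return a frame indexed by 'MM-DD' with one column per calendar year."""
    tidy = pd.DataFrame({
        "year": df[date_col].dt.year,               # step 1: extract the year
        "day": df[date_col].dt.strftime("%m-%d"),   # step 2: date without the year
        value_col: df[value_col].to_numpy(),
    })
    # steps 3 and 4: rows keyed by day, one column per year
    return tidy.pivot(index="day", columns="year", values=value_col)


# Illustrative data covering the same range as in the question.
rng = np.random.default_rng(0)
index = pd.date_range("2015-10-01", "2018-02-28")
df = pd.DataFrame({
    "date": index,
    "share_price": rng.integers(-3, 4, len(index)).cumsum(),
})

wide = one_column_per_year(df)
```

Unlike the transformation above, this version leaves the original date column of df untouched. The resulting wide frame has the same layout as unstacked: days without data for a given year (for example January to September 2015) are NaN, which is also why the values show up as floats.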

And just plot normally:

unstacked.plot()
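If you would rather have a real date axis than 'MM-DD' strings, one possible variation is to map every date onto a single placeholder year before pivoting. The sketch below assumes the leap year 2000 as that placeholder (so that 29 February stays valid); the names aligned and by_year and the seeded data are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative data covering the same range as in the question.
rng = np.random.default_rng(0)
index = pd.date_range("2015-10-01", "2018-02-28")
df = pd.DataFrame({
    "date": index,
    "share_price": rng.integers(-3, 4, len(index)).cumsum(),
})

# Move every date into the placeholder year 2000, keeping month and day.
aligned = pd.to_datetime(
    "2000-" + df["date"].dt.strftime("%m-%d"), format="%Y-%m-%d"
)

# One row per day of the placeholder year, one column per real year.
by_year = (
    df.assign(year=df["date"].dt.year, day=aligned)
      .pivot(index="day", columns="year", values="share_price")
)
```

Calling by_year.plot() should then draw one line per year on a datetime axis. Keep in mind that the axis labels will carry the placeholder year unless you reformat the ticks.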


Carles Sala