Plot dataframe in Python

Question

I'm new to Python. I hope you can help me. I have a dataframe with two columns. The first column is called dates and the second column is filled with numbers. The dataframe has 351 rows.

dates        numbers
01.03.2019   5
02.03.2019   8
...
20.02.2020   3
21.02.2020   2

I want the whole first column to be on the x axis. I tried to plot it like this:

graph = FinalDataframe.plot(figsize=(12, 8))

graph.legend(loc='upper center', bbox_to_anchor=(0.5, -0.075), ncol=4)

graph.set_xticklabels(FinalDataframe['dates'])

plt.show()

But the x axis shows only the first few values from the column instead of the whole column. Furthermore, the labels do not line up with the data from the second column.

Any suggestions?

Thank you in advance!

Cimbali · Accepted Answer · 2021-06-10T10:21:04.940

Your issue is that x ticks are generated automatically and spaced out to be readable. However, you then tell matplotlib to use all the labels, so the first few labels get assigned, in order, to the handful of ticks that exist. The simple fix is to tell it to use one tick per entry, but that’s going to make your x-axis unreadable:

graph.set_xticks(range(len(FinalDataframe['dates'])))

Now you could space them out manually:

graph.set_xticks(range(0, len(FinalDataframe['dates']), 61))
graph.set_xticklabels(FinalDataframe['dates'][::61])
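As a rough sketch of the manual spacing above, the step can be derived from a target number of ticks instead of being hard-coded. The stand-in data, the `max_ticks` value and the variable names below are assumptions for illustration, not part of the original question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data shaped like the question's frame (dates kept as strings).
dates = pd.date_range("2019-03-01", "2020-02-21", freq="D").strftime("%d.%m.%Y")
df = pd.DataFrame({"dates": dates, "numbers": np.arange(len(dates)) % 10})

max_ticks = 6                              # hypothetical target, pick what reads well
step = -(-len(df) // max_ticks)            # ceiling division
positions = list(range(0, len(df), step))  # row numbers that get a tick
labels = df["dates"].iloc[positions].tolist()

ax = df.plot(y="numbers", figsize=(12, 8))
ax.set_xticks(positions)                   # place the ticks first...
ax.set_xticklabels(labels, rotation=45, ha="right")  # ...then label exactly those ticks
plt.tight_layout()
```

The key point is that the tick positions and the labels are built from the same list of row numbers, so they cannot drift apart.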

However, the best way to plot dates on the x-axis is still to use pandas’ built-in date objects, which we can create with pd.to_datetime.

This also lets pandas know where to place each point on the x-axis, once you specify that the x-axis should be the dates. That way, if dates are unsorted or missing, the gaps are handled properly and each point sits at the x position of the correct date.
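To illustrate the point about unsorted or missing dates, here is a small sketch with made-up rows (the values and the names `raw`, `ordered` and `gaps` are assumptions). Once the column holds real dates, the spacing between rows is known, so a three-day hole in the data is a three-day distance on the axis:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows: out of order, and with several days missing.
raw = pd.DataFrame({
    "dates": ["05.03.2019", "01.03.2019", "02.03.2019", "09.03.2019"],
    "numbers": [3, 5, 8, 4],
})
raw["dates"] = pd.to_datetime(raw["dates"], format="%d.%m.%Y")

# Sorting explicitly keeps the line from zig-zagging, whatever the plotting layer does.
ordered = raw.sort_values("dates")

# Days between consecutive rows; values above 1 are the holes in the data.
gaps = ordered["dates"].diff().dt.days

ax = ordered.plot(x="dates", y="numbers", marker="o")
```

With string labels, these four points would be drawn at equal spacing in row order; with dates, each one lands at its own day.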

I’m first recreating a dataframe that looks like what you posted:

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.DataFrame({'dates': pd.date_range('20190301', '20200221', freq='D').strftime('%d.%m.%Y'), 'numbers': np.random.randint(0, 10, 358)})
>>> df
          dates  numbers
0    01.03.2019        2
1    02.03.2019        2
2    03.03.2019        5
3    04.03.2019        4
4    05.03.2019        3
..          ...      ...
353  17.02.2020        2
354  18.02.2020        1
355  19.02.2020        2
356  20.02.2020        3
357  21.02.2020        1

(This should be the same as FinalDataframe, or, if your dates are the index, the same as FinalDataframe.reset_index().)

Now I’m converting the dates:

>>> df['dates'] = pd.to_datetime(df['dates'], format='%d.%m.%Y')
>>> df
         dates  numbers
0   2019-03-01        2
1   2019-03-02        2
2   2019-03-03        5
3   2019-03-04        4
4   2019-03-05        3
..         ...      ...
353 2020-02-17        2
354 2020-02-18        1
355 2020-02-19        2
356 2020-02-20        3
357 2020-02-21        1

You can check that your columns contain dates, and not string representations of dates, by checking their dtypes:

>>> df.dtypes
dates      datetime64[ns]
numbers             int64
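The dtype check confirms that the column holds dates, but not that each date was read the way you meant. A short sketch of why the explicit `format` argument matters for day.month.year strings like the ones in the question (the variable names are illustrative):

```python
import pandas as pd

text = "01.03.2019"  # 1 March 2019 in the question's day.month.year layout

# Spelling out the layout removes any ambiguity.
explicit = pd.to_datetime(text, format="%d.%m.%Y")

# dayfirst=True is a looser hint that also reads the day first here.
day_first = pd.to_datetime(text, dayfirst=True)

# Without either hint, pandas treats ambiguous strings as month-first,
# so this one is typically read as 3 January 2019 instead.
default = pd.to_datetime(text)

print(explicit.date(), day_first.date(), default.date())
```

Passing `format` is usually the safer choice when every row follows the same layout, since a mismatching row then raises an error instead of being silently misread.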

Finally plotting:

>>> ax = df.plot(x='dates', y='numbers', figsize=(12, 8))
>>> ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.075), ncol=4)
<matplotlib.legend.Legend object at 0x7fc8c24fd4f0>
>>> plt.show()

Legends are taken care of automatically. This is what you get:

[Image: the resulting line plot of numbers against dates]
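If the automatic date ticks are not in the layout you want, matplotlib's date locators and formatters can be set by hand. The sketch below is one possible approach, not the method used above: it plots with matplotlib's `ax.plot` instead of `df.plot` so that the `matplotlib.dates` helpers apply predictably, and the two-month tick interval and the stand-in data are assumptions:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data with real datetime values in the dates column.
dates = pd.date_range("2019-03-01", "2020-02-21", freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame({"dates": dates, "numbers": rng.integers(0, 10, len(dates))})

fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(df["dates"], df["numbers"], label="numbers")

# One major tick every second month, labelled in the question's dd.mm.yyyy layout.
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=2))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d.%m.%Y"))
fig.autofmt_xdate()  # rotate the labels so they do not overlap
ax.legend()
```

Because the x values are dates, the locator decides where ticks go and the formatter decides how they read, so no label list has to be kept in sync by hand.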

You’re welcome @Lisa! Next time, try to post your dataframe (or part of it); [simply the output of `print(df)` is very helpful](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). If you have plotting issues, try to include a screenshot (or a link to one) of what you get, and describe what you would like to achieve instead. That will make it even easier for people to reply! — Cimbali, Jun 10 '21 at 12:56
