
I am unable to show a bar and line graph on the same plot. Example code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Df = pd.DataFrame(data=np.random.randn(10,4), index=pd.date_range(start='2005', freq='M', periods=10), columns=['A','B','C','D'])

fig = plt.figure()
ax = fig.add_subplot(111)

Df[['A','B']].plot(kind='bar', ax=ax)
Df[['C','D']].plot(ax=ax, color=['r', 'c'])
user2546580

4 Answers


You can also try this:

fig, ax = plt.subplots()
Df[['A','B']].plot(kind='bar', ax=ax, rot=0)
ax2 = ax.twinx()  # second y-axis that shares the same x-axis
ax2.plot(ax.get_xticks(), Df[['C','D']].values, marker='o')
Michal

I wanted to know this as well. However, the existing answers do not show the bar and line graphs on the same plot; they put them on different axes instead.

So I looked for the answer myself and found a working example: Plot Pandas DataFrame as Bar and Line on the same one chart. I can confirm that it works.

What baffled me is that almost the same code works there but does not work here. That is, I copied the OP's code and verified that it does not work as expected.

The only thing I could think of was to add the index column to Df[['A','B']] and Df[['C','D']], but I don't know how, since the index column has no name to refer to.

Today I realized that even if I could make it work, the real problem is that Df[['A','B']] gives a grouped (clustered) bar chart, whereas a grouped (clustered) line chart is not supported.

Community
xpt

The issue is that the pandas bar plot function treats the dates as a categorical variable where each date is considered to be a unique category, so the x-axis units are set to integers starting at 0 (like the default DataFrame index when none is assigned).

The pandas line plot uses x-axis units corresponding to the DatetimeIndex, for which 0 is located on January 1970 and the integers count the number of periods (months in this example) since then. So let's take a look at what happens in this particular case:

import numpy as np     # v 1.19.2
import pandas as pd    # v 1.1.3

# Create random data
rng = np.random.default_rng(seed=1) # random number generator
df = pd.DataFrame(data=rng.normal(size=(10,4)),
                  index=pd.date_range(start='2005', freq='M', periods=10),
                  columns=['A','B','C','D'])

# Create a pandas bar chart overlaid with a pandas line plot using the same
# Axes: note that seeing as I do not set any variable for x, df.index is used
# by default, which is usually what we want when dealing with a dataset
# containing a time series
ax = df.plot.bar(y=['A','B'], figsize=(9,5))
df.plot(y=['C','D'], color=['tab:green', 'tab:red'], ax=ax);

[Image: pandas_bar_line_wrongx]

The bars are nowhere to be seen. If you check which x ticks are being used, you'll see that the single major tick, placed on January, is at 420, followed by these minor ticks for the other months:

ax.get_xticks(minor=True)
# [421, 422, 423, 424, 425, 426, 427, 428, 429]
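These tick values look like monthly period ordinals. As a quick sanity check, and assuming the line plot converts the DatetimeIndex to monthly periods internally, the ordinal of a monthly pd.Period can be inspected directly (the variable names here are only illustrative):

```python
import pandas as pd

# Monthly periods are numbered from January 1970, which has ordinal 0
epoch = pd.Period('1970-01', freq='M')
jan_2005 = pd.Period('2005-01', freq='M')

print(epoch.ordinal)        # 0
print(jan_2005.ordinal)     # 420
print((2005 - 1970) * 12)   # 420, the same value computed by hand
```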

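This is because there are 35 years * 12 months = 420 months between January 1970 and January 2005, and the numbering starts at 0, so January 2005 lands on 420. The bars, meanwhile, are drawn at positions 0 to 9, far outside the visible x-range, which explains why we do not see them. If you change the x-axis limit to start from zero, here is what you get: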

ax = df.plot.bar(y=['A','B'], figsize=(9,5))
df.plot(y=['C','D'], color=['tab:green', 'tab:red'], ax=ax)
ax.set_xlim(0);

[Image: pandas_bar_line_setxlim]

The bars are squashed to the left, starting on January 1970. This problem can be solved by setting use_index=False in the line plot function so that the lines also start at 0:

ax = df.plot.bar(y=['A','B'], figsize=(9,5))
df.plot(y=['C','D'], color=['tab:green', 'tab:red'], ax=ax, use_index=False)
ax.set_xticklabels(df.index.strftime('%b'), rotation=0, ha='center');

# # Optional: move legend to new position
# import matplotlib.pyplot as plt    # v 3.3.2
# ax.legend().remove()
# plt.gcf().legend(loc=(0.08, 0.14));

[Image: pandas_bar_line]

In case you want more advanced tick label formatting, you can check out the answers to this question which are compatible with this example. If you need more flexible/automated tick label formatting as provided by the tick locators and formatters in the matplotlib.dates module, the easiest is to create the plot with matplotlib like in this answer.
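For reference, here is a minimal sketch of what a matplotlib-only version could look like. It is not the code from the linked answer. It assumes the bars and lines are both placed at real date coordinates (matplotlib's float day numbers) so that a matplotlib.dates formatter can label the ticks. The bar width of 8 days, the Agg backend and the use of pd.offsets.MonthEnd() for the frequency are my own choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
df = pd.DataFrame(data=rng.normal(size=(10, 4)),
                  index=pd.date_range(start='2005', periods=10,
                                      freq=pd.offsets.MonthEnd()),
                  columns=['A', 'B', 'C', 'D'])

# Convert the dates to matplotlib's numeric date representation (days as floats)
x = mdates.date2num(df.index.to_pydatetime())
width = 8  # bar width in days, an arbitrary choice for monthly data

fig, ax = plt.subplots(figsize=(9, 5))
# Two bars per date, shifted left and right of the date position
ax.bar(x - width / 2, df['A'], width=width, label='A')
ax.bar(x + width / 2, df['B'], width=width, label='B')
# Lines drawn at the same date positions, so bars and lines share one x scale
line_c, = ax.plot(x, df['C'], color='tab:green', label='C')
line_d, = ax.plot(x, df['D'], color='tab:red', label='D')

# One tick per date, labelled with the abbreviated month name
ax.set_xticks(x)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
ax.legend()
```

Because everything is positioned in date units here, there is no mismatch between bar positions and line positions, at the cost of having to choose the bar width and offsets by hand.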

Patrick FitzGerald

You can do something like this, with both plots on the same figure:

In [4]: Df = pd.DataFrame(data=np.random.randn(10,4), index=pd.date_range(start='2005', freq='M', periods=10), columns=['A','B','C','D'])

In [5]: fig, ax = plt.subplots(2, 1) # you can pass sharex=True, sharey=True if you want to share axes.

In [6]: Df[['A','B']].plot(kind='bar', ax=ax[0])
Out[6]: <matplotlib.axes.AxesSubplot at 0x10cf011d0>

In [7]: Df[['C','D']].plot(color=['r', 'c'], ax=ax[1])
Out[7]: <matplotlib.axes.AxesSubplot at 0x10a656ed0>
  • Is there any way to get them on the same figure? i.e. share x-axis. – user2546580 Nov 13 '13 at 14:16
  • @moenad: This doesn't work. The line plot overplots the bar plot, and you can't see the bar plot any more. This seems to be pandas related, because if you do `ax.plot(Df.index.values, Df[['C', 'D']], ...)` it works. – naught101 Nov 08 '15 at 23:25
  • @naught101, can you elaborate please? I tried `ax.plot(Df.index.values, Df[['C', 'D']], linestyle='--', marker='o')`, but it is [still not working for me](https://gist.github.com/suntong/4a689a846fbf60465540). – xpt Dec 31 '15 at 17:39
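A possible explanation for the last comment, offered as a sketch rather than a confirmed diagnosis: the pandas bar plot places its bars at integer positions 0, 1, 2, ..., so passing Df.index.values (dates) as x to ax.plot puts the lines on a different x scale. Assuming that is the cause, plotting the lines against the same integer positions keeps both on one Axes. The variable name positions, the seeded random generator and the Agg backend are illustrative choices only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
Df = pd.DataFrame(data=rng.normal(size=(10, 4)),
                  index=pd.date_range(start='2005', periods=10,
                                      freq=pd.offsets.MonthEnd()),
                  columns=['A', 'B', 'C', 'D'])

fig, ax = plt.subplots()
# The pandas bar plot draws the bar groups at integer positions 0..len(Df)-1
Df[['A', 'B']].plot(kind='bar', ax=ax)

# Draw the lines with matplotlib at those same integer positions
positions = np.arange(len(Df))
lines = [ax.plot(positions, Df[col].to_numpy(),
                 linestyle='--', marker='o', label=col)[0]
         for col in ['C', 'D']]
ax.legend()
```

The tick labels still come from the bar plot, so they show the full timestamps unless they are reformatted, for example with ax.set_xticklabels(Df.index.strftime('%b')).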