3

I have a pandas DataFrame that uses strings as its index. How can I set xlim for the x-axis when the index is of dtype object? I tried adding two extra years, one at the beginning and one at the end, with all values set to np.nan, but that didn't work.

Here is the dataframe

Dataframe

The datatype of index is object

df.index
Out[52]: Index(['2003', '2004', '2005', '2006', '2007', '2008', '2009', '2010', '2011', '2012'], dtype='object')

Here is the plot

Matplotlib Plot

So I would like some extra space on the x-axis so that the values for the first and last year are more visible. What could I do?

EDIT:

Here is a minimal example using objects and not date objects as index

ipython notebook

Christian Rapp
  • https://stackoverflow.com/questions/16897864/avoid-points-on-edges-of-plots-when-the-last-x-value-equals-the-tick/16898296#16898296 – tacaswell Jun 28 '14 at 19:11
  • https://stackoverflow.com/questions/15375791/how-to-autoscale-y-axis-in-matplotlib – tacaswell Jun 28 '14 at 19:11
  • https://stackoverflow.com/questions/14493334/add-margin-when-plots-run-against-the-edge-of-the-graph – tacaswell Jun 28 '14 at 19:12
  • Also, to be clear, you are not plotting against strings, you are plotting against the position array. `pandas` is just being (too) clever and re-labeling your tick markers. – tacaswell Jun 28 '14 at 19:13
  • @tcaswell the index is of type object. if it is a string or not is not relevant because I can't use set_xlim in this case. but margins looks great :) I tried this but seems like I did something wrong. Using ipython notebook not always updates the plot so maybe I should have just tried to redraw. – Christian Rapp Jun 28 '14 at 19:28
  • in either case, it is plotting your data against `np.arange(len(data))` so you can use `set_xlim`. – tacaswell Jun 28 '14 at 19:30
  • this is how it looks when using set_xlim on my ax object http://postimg.org/image/4rnex5u0z/d0936fee/. As you can see I have the effect that I want but the labels at the bottom are screwed up – Christian Rapp Jun 28 '14 at 19:35
  • ahh, right, because `pandas` is using a `FixedFormatter`: when you change the xlimit, it changes where the ticks are and how many of them there are, but the way `FixedFormatter` works is that the first tick gets the first string, the second tick the second, and so on. – tacaswell Jun 28 '14 at 19:41
  • The work-around is that a `FixedLocator` should also be used. Are you using the most recent version of pandas? If so, I think this is a bug on their part. – tacaswell Jun 28 '14 at 19:42
  • Can you put up a minimal example to demonstrate this problem? – tacaswell Jun 28 '14 at 19:52
  • @tcaswell added a minimal example. Please see also my comment in the answer of CT Zhu. Using a locator removed all labels on the x-axis – Christian Rapp Jun 29 '14 at 08:45
  • It is far better to put the example code directly in the question, links rot. – tacaswell Jun 29 '14 at 14:03

2 Answers

5

Use set_xlim: adding +1 to a limit moves it one unit to the right, and -1 moves it one unit to the left. In the following example I expanded the plot by 0.5 months on each side:

import numpy as np
import pandas as pd

df=pd.DataFrame({'A': range(10), 'B': range(1, 11), 'C': range(2,12)})
df.index=pd.date_range('2001/01/01', periods=10, freq='M')
ax=df.plot(kind='line')
# widen the current x-limits by 0.5 units on each side
ax.set_xlim(np.array([-0.5, 0.5])+ax.get_xlim())

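For the string (object) index from the question, the same idea can be sketched in position units, since the comments point out that the lines are drawn against `np.arange(len(data))`. This is only a sketch: the data and the `pad` value are made up for illustration, and on the pandas version discussed in this thread the tick labels may still end up misplaced afterwards, as the comments describe.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import pandas as pd

# Hypothetical data with a string index, mimicking the question
years = [str(y) for y in range(2003, 2013)]
df = pd.DataFrame({"A": range(10), "B": range(1, 11)}, index=years)

ax = df.plot(kind="line")

# The lines sit at positions 0 .. len(df) - 1, so pad in position units
pad = 0.5  # assumed padding, half a step on each side
ax.set_xlim(-pad, len(df) - 1 + pad)
```

Here `pad = 0.5` leaves half a step of empty space before the first year and after the last one.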

Edit: to get an xticklabel for every year, instead of the pandas default of one every two years:

import datetime

ax=df.plot(kind='line', xticks=df.index)
ax.set_xticklabels(df.index.map(lambda x: datetime.datetime.strftime(x, '%Y')))


CT Zhu
  • This is probably another case where `margins` would be useful. – tacaswell Jun 28 '14 at 19:11
  • Thanks for your suggestion. In fact it is likely a better idea to use real date objects and not "strings" as I did. Because with my index your solution does not work. Of course yours is the better solution. I will play with margins first – Christian Rapp Jun 28 '14 at 19:30
  • @ChristianRapp No, making your index `Date` objects is probably a better solution – tacaswell Jun 28 '14 at 19:47
  • If I use dates as Index how can I set which ticks are shown on the x axis? Now the plot shows only every second year as label but I would like to have each year displayed. – Christian Rapp Jun 28 '14 at 21:47
  • @ChristianRapp, `pandas` automatically does a lot of things; here it determines that a label for every year is too busy and labels every second year instead. See the edit for a work-around. It does mean more work, but using a `DatetimeIndex` for `time series` data is often a good idea in `pandas`. – CT Zhu Jun 29 '14 at 00:43
  • @tcaswell, in version `1.3.1`, `ax.margins(x=0.1, y=0.1)` raises a `ValueError: more than two arguments were supplied`. Shouldn't the source code be `if len(args) == 2:` and `elif len(args) == 3:` in lines 1835 and 1837 respectively? – CT Zhu Jun 29 '14 at 00:49
  • @CTZhu This has already been fixed on master (https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/axes/_base.py#L1787). In 1.3.1 just call it as `ax.margins(.1, .1)` – tacaswell Jun 29 '14 at 00:52
  • and see https://stackoverflow.com/questions/24412292/x-axis-ticks-will-not-update-when-zoom-or-pan/24416670#24416670 for a brief introduction to how to control date tickers. – tacaswell Jun 29 '14 at 00:53
  • @tcaswell, I didn't know that was already fixed. Actually the `locator` method doesn't currently work consistently for `pandas`, see http://stackoverflow.com/questions/24337753/skip-gcf-autofmt-xdate-at-pandas-plot-creation/24337921#24337921 – CT Zhu Jun 29 '14 at 00:57
  • The 'this' I meant was the arguments on `margins`. – tacaswell Jun 29 '14 at 01:02
  • @CTZhu I don't have the bandwidth to chase down datetime issues tonight, if you think there is still a bug in mpl please make an issue on github. – tacaswell Jun 29 '14 at 01:13
  • Thanks for the locator hint. I tried to do this, but now all labels at x-axis are gone. http://nbviewer.ipython.org/gist/anonymous/a56b841316d92cffcd78 – Christian Rapp Jun 29 '14 at 08:36
  • @CTZhu ah great, I just realized you edited your answer :) The trick setting the labels manually works. Nice, thank you! Still why does my locator not produce any labels? Crazy, I am having most of the problems with axes labeling – Christian Rapp Jun 29 '14 at 10:19
2
from __future__ import print_function
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

df = pd.DataFrame({'Foo': pd.Series([2,3,4], index=['2002', '2003', '2004'])})
fig, ax = plt.subplots()

df.plot(ax=ax)

This gets you the plot. To see how the x-ticks are handled, look at:

# note this is an AutoLocator
print(ax.xaxis.get_major_locator())
# note this is a FixedFormatter
print(ax.xaxis.get_major_formatter())
# these are the tick labels that are used
ff = ax.xaxis.get_major_formatter()
print(ff.seq)

This means that if you pan around, the tick labels stay the same but end up at arbitrary positions. Changing the xlim causes the same problem: the way pandas sets up the plot initially, the tick labels are completely decoupled from the data.
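To make the decoupling concrete, here is a small sketch that calls a `FixedFormatter` directly. The label list is just the index from the example above; the tick values passed in are arbitrary. The point is that the label depends only on the tick's ordinal `pos`, never on its data value `x`.

```python
import matplotlib.ticker as mticker

ff = mticker.FixedFormatter(["2002", "2003", "2004"])

# A formatter is called as ff(x, pos): x is the tick value, pos its ordinal
first = ff(0.0, 0)    # tick at x=0.0, first tick on the axis
moved = ff(37.5, 0)   # a very different x, but still the first tick
beyond = ff(1.0, 5)   # more ticks than labels

print(repr(first), repr(moved), repr(beyond))
```

Both `first` and `moved` come out as the same label, and ticks past the end of the list get an empty string, which is why the labels drift once an automatic locator picks new tick positions.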

One (verbose) way to fix this is:

ax.xaxis.set_major_locator(mticker.FixedLocator(np.arange(len(df))))
ax.xaxis.set_major_formatter(mticker.FixedFormatter(df.index))


# note this is a FixedLocator
print(ax.xaxis.get_major_locator())
# note this is a FixedFormatter
print(ax.xaxis.get_major_formatter())

This will work no matter what you set your index to (strings or dates).
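Putting the pieces together, a minimal end-to-end sketch might look like the following. It reuses the example frame from above; the 0.5 padding and the `positions` and `labels` names are assumptions made for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np
import pandas as pd

df = pd.DataFrame({"Foo": pd.Series([2, 3, 4], index=["2002", "2003", "2004"])})
fig, ax = plt.subplots()
df.plot(ax=ax)

# Pin one tick to each row position and one label to each tick
positions = np.arange(len(df))
ax.xaxis.set_major_locator(mticker.FixedLocator(positions))
ax.xaxis.set_major_formatter(mticker.FixedFormatter(list(df.index)))

# Widen the view; the ticks stay at the row positions
ax.set_xlim(positions[0] - 0.5, positions[-1] + 0.5)

# Force a draw so the tick label texts are populated
fig.canvas.draw()
labels = [t.get_text() for t in ax.get_xticklabels()]
```

Because the locator no longer moves the ticks when the limits change, `labels` should still match the index after `set_xlim`.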

I have created an issue with pandas https://github.com/pydata/pandas/issues/7612

tacaswell