
I have some dataframes that represent user activity in a front-end app. I'm trying to plot these activities on a single plot per user, with each kind of activity on a separate line. The goal is a nice longitudinal view of user activity. I'm creating a subplot per user and then calling plot_date once for each type of activity.

The issue I'm seeing is that there are extra/misplaced ticks and grid lines on the x-axis. It gets worse as I add more calls to plot_date (I have 5 different calls in my real code). I've tried with and without sharex, I've tried autofmt_xdate, and I've even tried explicitly setting the xlim in various places. Nothing gets rid of the extra ticks/grid lines. I'm sure I could fix this by manually overriding the ticks at the very end of my code, but that feels wrong. Is there a better way to handle this? It feels broken to me.

import numpy as np
import matplotlib.pyplot as plt

# expected_users, expected_actions and expected_paginations come from my
# dataset (see the gist linked below)
ncols = 2
len_list = 4
nrows = int(np.ceil(len_list / ncols))
fig, ax = plt.subplots(figsize=(16, 2 * nrows), nrows=nrows, ncols=ncols, sharex=True, sharey=True)
for i, user in enumerate(sorted(expected_users)[:len_list]):
    # Map the flat user index onto the subplot grid
    row, col = divmod(i, ncols)
    user_paginations = expected_paginations[expected_paginations['action_by'] == user]
    user_actions = expected_actions[expected_actions['action_by'] == user]
    if not user_actions.empty:
        print('actions', user_actions['date'].min(), user_actions['date'].max())
        # Random jitter in [0, 0.5) keeps overlapping points visible; actions sit at y = 0
        ax[row, col].plot_date(user_actions['date'], np.random.uniform(0, 0.5, user_actions.shape[0]) + 0, alpha=0.5, label='action')
    if not user_paginations.empty:
        print('pages', user_paginations['date'].min(), user_paginations['date'].max())
        # Paginations are offset to y = 1 so they get their own line
        ax[row, col].plot_date(user_paginations['date'], np.random.uniform(0, 0.5, user_paginations.shape[0]) + 1, alpha=0.5, label='paginate')
plt.tight_layout()
fig.autofmt_xdate()

[Image: broken_ticks]

For reference I added some print statements to the code which produced the following output:

actions 2019-12-20 07:24:39.362000 2020-01-16 11:14:11.776000
pages 2019-12-20 07:33:58.294000 2020-01-16 07:13:17.629000
actions 2020-01-03 11:20:05.271000 2020-01-16 09:25:21.311000
pages 2020-01-14 13:27:02.093000 2020-01-16 09:18:14.726000
actions 2020-01-08 06:55:40.045000 2020-01-08 06:55:40.775000
actions 2020-01-07 10:04:37.674000 2020-01-08 13:53:58.130000
pages 2020-01-07 09:59:29.376000 2020-01-08 13:34:48.712000

EDIT: The issue I'm trying to highlight here is that the ticks are unevenly spaced. This becomes more apparent as I add more data points. I've attached some additional examples to highlight this issue further.

With all 6 activity types: [image]

And with fewer examples, to show it's not the number of users (subplots) causing the issue: [image]

I also reran this with only one plot to verify it wasn't being caused by having multiple plots.

Looking a little closer, the issue always occurs on the first of the month. All ticks are exactly the same distance apart except for the one on the first.
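For anyone stuck on an affected version, the manual override mentioned above can be done by placing ticks at explicit, evenly spaced positions. This is only a minimal sketch with made-up timestamps: the date range mirrors the printed output above, while the 4-day step and all variable names are assumptions for illustration, not part of my real code.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical timestamps that span a month boundary (not my real data)
start = dt.datetime(2019, 12, 20)
end = dt.datetime(2020, 1, 16)
rng = np.random.default_rng(0)
offsets = np.sort(rng.uniform(0, (end - start).total_seconds(), 50))
dates = [start + dt.timedelta(seconds=float(s)) for s in offsets]

fig, ax = plt.subplots(figsize=(8, 2))
ax.plot(dates, rng.uniform(0, 0.5, len(dates)), "o", alpha=0.5, label="action")

# One tick every 4 days counted from the start date, ignoring month boundaries
# (the 4-day step is an arbitrary choice for this sketch)
tick_dates = [start + dt.timedelta(days=d)
              for d in range(0, (end - start).days + 1, 4)]
ax.set_xticks(mdates.date2num(tick_dates))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
ax.grid(True)
fig.autofmt_xdate()
```

Because the positions are fixed, no tick is forced onto the first of the month, but the ticks also stop adapting when the plot is zoomed or the date range changes.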

I've posted a "minimal" dataset and example code at https://gist.github.com/mdbecker/727a362ff573a459c5d7a66dfc46836e that you can use to reproduce this issue.

UPDATE 2: Updating matplotlib to 3.1.1 (from 3.0.2) fixed this bug.
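To check whether an installed matplotlib predates the version that fixed this for me, something like the following could be used. It is only a sketch: `version_tuple` and `FIXED_IN` are hypothetical names, and the 3.1.1 threshold is simply the version I upgraded to, not a confirmed minimum.

```python
import re

import matplotlib


def version_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. "3.9") is treated as 0
    return tuple(int(part) for part in match.groups(default="0"))


# Assumption: the version that resolved the issue in this question
FIXED_IN = (3, 1, 1)

installed = version_tuple(matplotlib.__version__)
if installed < FIXED_IN:
    print(f"matplotlib {matplotlib.__version__} is older than 3.1.1; "
          "consider upgrading, e.g. pip install --upgrade matplotlib")
else:
    print(f"matplotlib {matplotlib.__version__} is 3.1.1 or newer")
```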

mbecker
  • I do not understand what you think is wrong with the plot. – ImportanceOfBeingErnest Jan 16 '20 at 22:53
  • Here's what I think you're after: In this case, 4 subplots for 4 users, but you don't want the bottom left plot to display the grid in the same shape as the top left - you want it to scale down to just display the 2020-01-09 data and get rid of the "extra/misplaced ticks and grid lines". Can you clarify what you mean by removing ticks and gridlines? There are methods like `plt.gca().grid(False)` and `plt.gca().set_xticks([])` and `ax.grid(b=None)` – PeptideWitch Jan 16 '20 at 22:56
  • I've edited my question to clarify the issue. Let me know if it's still unclear. Thanks! With a little more searching it looks like this might be a duplicate of https://stackoverflow.com/questions/54031757/why-is-the-first-of-the-month-automatically-plotted-as-tick-in-matplotlib-plot-d I verified that I have matplotlib 3.0.2 so I guess I need to update matplotlib! – mbecker Jan 17 '20 at 04:29
