Formatting Matplotlib plots

Question

I've got a dataset that looks a bit like this.

df

   headline  some_url  time                 is_national
0  Holloway  url       2023-01-11 11:44:27  True
1  London    url       2023-01-11 11:25:10  False
2  Viral     url       2023-01-11 10:43:39  False
3  London    url       2023-01-11 09:41:18  True
4  Royal     url       2023-01-11 15:49:38  False

I've been able to create a categorical column for day of the week thus:

import pandas as pd

# Day names in calendar order, so the categorical sorts Monday..Sunday
cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df['day_of_week'] = df.time.dt.day_name()
df['day_of_week'] = pd.Categorical(df['day_of_week'], categories=cats, ordered=True)

I planned to create an hour column like this:

df['hour'] = df.time.dt.hour

But the hour column comes out as floating-point values rather than integers.

The result when plotted is:

How do I avoid the floating point?
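
One possible cause, offered as an assumption since the full data isn't shown: if the time column contains any missing values (NaT), pandas returns dt.hour as float64 so that it can hold NaN. A minimal sketch with made-up data (the NaT row is hypothetical), showing two ways to get integers back:

```python
import pandas as pd

# Toy data; the missing timestamp is an assumption used to reproduce the float dtype.
df = pd.DataFrame(
    {"time": pd.to_datetime(["2023-01-11 11:44:27", "2023-01-11 09:41:18", None])}
)

# With a NaT present, the hour comes back as float64 (11.0, 9.0, NaN).
hours_float = df["time"].dt.hour

# Option 1: pandas' nullable integer dtype keeps the missing row as <NA>.
hours_int = hours_float.astype("Int64")

# Option 2: drop the missing timestamps first; the result is a plain integer dtype.
hours_clean = df["time"].dropna().dt.hour
```

If the column is already an integer dtype and the floats only appear as tick labels on the plot, the cause is elsewhere and this sketch won't help.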

The second query is two-fold. I can produce a histogram of each using the .plot(kind='hist') function in pandas like so:

But I also tried a KDE plot with this code:

ax = df.hour.plot(kind='kde', title="Articles by hour")
ax.set_xlabel("Hour")
ax.set_ylabel("Number of articles")

The result looks like this:

Is there a simple way of cropping the plot to avoid negative hours or hours beyond 24?
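
One simple approach, sketched under assumptions: the object returned by .plot() is a Matplotlib Axes, so its visible range can be limited with set_xlim. The hour values below are hypothetical stand-ins for df.hour, and the KDE plot requires SciPy to be installed. Note that the y-axis of a KDE is a density, not a count, so the label is changed accordingly.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical hour values standing in for df.hour
hours = pd.Series([9, 10, 11, 11, 15, 18, 22, 23])

ax = hours.plot(kind="kde", title="Articles by hour")  # pandas uses SciPy here
ax.set_xlim(0, 24)              # show only the valid range of hours
ax.set_xticks(range(0, 25, 4))  # whole-hour ticks: 0, 4, ..., 24
ax.set_xlabel("Hour")
ax.set_ylabel("Density")
```

This only crops the view. The estimated curve still has tails outside 0 to 24; they are just not drawn.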

For the first part of the question, see https://stackoverflow.com/a/22028426/1008142 — Rory Yorke, Jan 22 '23 at 04:21
