I have a dataset with over 100k entries as per below:

            score  time
    0       19     18 days 02:55:00
    1       2949   1 day 01:20:11
    2       42211  5 days 00:00:00
    ...
    100000  22     100 days 01:11:03

I am trying to plot time on the x axis and score on the y axis as per below:

import matplotlib
matplotlib.use('Agg')
import pandas as pd
import matplotlib.pyplot as plt

k = pd.cut(df.score, bins)
plt.plot(time, score)
plt.show()

The problem is that I want to plot the scores grouped into bins, with time on the x-axis, but that many plots do not fit on a single chart. Can anyone help?

IronKirby
Matt-pow

1 Answer

Have you tried looking at the following question? "Histogram in matplotlib, time on x-Axis"

As indicated in the above link:

Matplotlib uses its own format for dates/times, but the dates module provides simple functions to convert to and from it. It also provides various Locators and Formatters that place the ticks on the axis and format the corresponding labels. Provided that you pass in your own date/time bins, you can plot the data and label the x-axis accordingly.
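To make the conversion step concrete, here is a minimal sketch of a round trip through Matplotlib's date format. The timestamp is an arbitrary value picked for illustration, and the variable names are my own, not something from the linked answer.

```python
import datetime as dt
import matplotlib.dates as mdates

# An arbitrary, timezone-aware example timestamp (illustrative only)
stamp = dt.datetime(2015, 4, 16, 12, 0, tzinfo=dt.timezone.utc)

# date2num turns a datetime into Matplotlib's float representation (in days)
as_number = mdates.date2num(stamp)

# num2date converts the float back into a timezone-aware datetime
round_trip = mdates.num2date(as_number)

# The same strftime-style pattern can be handed to mdates.DateFormatter
label = round_trip.strftime('%d.%m.%y')
print(as_number, round_trip, label)
```

The float is what actually ends up on the axis; Locators and Formatters only decide where the ticks go and how those floats are rendered as text.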

This should get you started:

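import datetime
import random
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# generate some random epoch timestamps (spanning roughly 5 years)
data = [float(random.randint(1271517521, 1429197513)) for _ in range(1000)]

# convert the epoch timestamps to Matplotlib's date format
# (mdates.epoch2num is not available in recent Matplotlib releases,
# so go through timezone-aware datetime objects instead)
utc = datetime.timezone.utc
mpl_data = mdates.date2num([datetime.datetime.fromtimestamp(t, tz=utc) for t in data])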

# plot it
fig, ax = plt.subplots(1,1)
ax.hist(mpl_data, bins=50, color='lightblue')
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
plt.show()

Result:

[Image: Python Hist Example]
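Since the question's time column holds durations rather than calendar dates, one possible adaptation is to convert the timedeltas to fractional days and draw one histogram per score bin on a shared axis. This is only a sketch under assumptions: the four-row DataFrame stands in for the real data, and the bin edges, the `days` and `score_bin` column names, and the number of histogram bins are all made up for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, as in the question
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample shaped like the question's data
df = pd.DataFrame({
    'score': [19, 2949, 42211, 22],
    'time': pd.to_timedelta(['18 days 02:55:00', '1 days 01:20:11',
                             '5 days 00:00:00', '100 days 01:11:03']),
})

# Express each timedelta as a fractional number of days so it can be plotted
df['days'] = df['time'].dt.total_seconds() / 86400

# Assumed score bin edges; adjust to whatever grouping is wanted
score_edges = [0, 100, 10_000, 100_000]
df['score_bin'] = pd.cut(df['score'], score_edges)

# One semi-transparent histogram per score bin, all on the same time axis
fig, ax = plt.subplots()
for interval, group in df.groupby('score_bin', observed=True):
    ax.hist(group['days'], bins=20, range=(0, df['days'].max()),
            alpha=0.5, label=str(interval))
ax.set_xlabel('time (days)')
ax.set_ylabel('number of entries')
ax.legend(title='score bin')
```

With the Agg backend nothing is displayed by `plt.show()`, so the figure would need to be written out with `fig.savefig(...)` instead. If the bins still overlap too much, `plt.subplots(nrows=...)` with one axis per bin is another option.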

IronKirby