
What I'm trying to achieve is a line graph of genres and their average score throughout history. X-axis = years, y-axis = score.

`genre_list` is an array of the genre names.

for genre in genre_list:
    random_color = [np.random.random_sample(), np.random.random_sample(), np.random.random_sample()]
    plt.plot('release_year', 'vote_average', 
             data=genre_df, marker='', 
             markerfacecolor=random_color, 
             markersize=1, 
             color=random_color, 
             linewidth=1, 
             label = genre)

plt.legend()
plt.figure(figsize=(5,5))

However, the result is quite ugly:

[image: the resulting plot]

Question 1) I've tried setting the figure size, but the plot keeps the same proportions. How do I configure this?

Question 2) How do I set the line color to match the legend?

Question 3) How do I configure the x and y axes so that the ticks are more finely spaced? (potentially the same question as #1)

I appreciate any sort of input, thank you.

averageUsername123
  • (1) You need to create the figure with the envisioned size *before* plotting. (2) Currently, you are plotting the same data several times. It seems you want to use different data in each loop step. (3) See e.g. [here](https://stackoverflow.com/a/19972993/4124317). – ImportanceOfBeingErnest Aug 13 '18 at 01:39
  • Please show how *genre_list* is generated. – Parfait Aug 13 '18 at 02:29
  • @Parfait `genre_list = np.unique(genre_df['genres'].tolist())` followed by `print(genre_list)` returns: `['Action' 'Adventure' 'Animation' 'Comedy' 'Crime' 'Documentary' 'Drama' 'Family' 'Fantasy' 'Foreign' 'History' 'Horror' 'Music' 'Mystery' 'Romance' 'Science Fiction' 'TV Movie' 'Thriller' 'War' 'Western']` – averageUsername123 Aug 13 '18 at 02:32
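
Point (1) in the first comment can be checked with a short sketch: calling `plt.figure(figsize=...)` after plotting appears to open a new, empty figure rather than resize the one already drawn. The data values below are made up for illustration, and `set_size_inches` is shown only as one possible way to resize an existing figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Set the size when the figure is created, before any plotting
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot([2000, 2001, 2002], [6.1, 6.4, 5.9], label="Drama")  # made-up values
ax.legend()

size_before = tuple(fig.get_size_inches())

# Calling plt.figure() afterwards creates a second, empty figure;
# the first figure keeps the size it was created with
other = plt.figure(figsize=(5, 5))
is_new_figure = other is not fig
size_after = tuple(fig.get_size_inches())

# An existing figure can still be resized explicitly
fig.set_size_inches(8, 4)
size_resized = tuple(fig.get_size_inches())
```

This is why the `plt.figure(figsize=(5,5))` call at the end of the question's code has no visible effect on the plot.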

1 Answer


Consider using `groupby` to split the dataframe by genre, then loop through the subsets and plot one line per genre. As @ImportanceOfBeingErnest notes in the comments above, the linked SO answer shows how to space the x-axis ticks at yearly intervals (rotate the tick labels as needed):

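import matplotlib.pyplot as plt
import matplotlib.ticker as plticker
import numpy as np
...

# Create the figure at the desired size before plotting
fig, ax = plt.subplots(figsize=(12,5))

# Group by the column name (a string, not a list) so each key is a plain genre label
for genre, sub_df in genre_df.groupby('genres'):
    random_color = [np.random.random_sample() for _ in range(3)]

    # Plot only this genre's rows; the legend entry takes the line's color
    ax.plot('release_year', 'vote_average', 
            data = sub_df, marker = '', 
            markerfacecolor = random_color, 
            markersize = 1, 
            color = random_color, 
            linewidth = 1, 
            label = genre)

# Place a major tick at every year and rotate the labels
loc = plticker.MultipleLocator(base=1.0)
ax.xaxis.set_major_locator(loc)
plt.xticks(rotation=45)

plt.legend()
plt.show()
plt.clf()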
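
The question asks for the average score per genre over time, so if `genre_df` holds one row per film, the scores may need to be averaged per year before plotting. Below is a minimal sketch on made-up data, assuming the column names from the question (`genres`, `release_year`, `vote_average`). The `tab20` colormap is swapped in for the random colors as one way to keep the lines distinguishable; it is not part of the original approach.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as plticker
import pandas as pd

# Made-up sample using the column names from the question
genre_df = pd.DataFrame({
    "genres": ["Action", "Action", "Action", "Drama", "Drama", "Drama"],
    "release_year": [2000, 2000, 2001, 2000, 2001, 2001],
    "vote_average": [6.0, 7.0, 5.0, 8.0, 7.0, 6.0],
})

# Mean score per (genre, year); years become the index, genres the columns
yearly = (genre_df
          .groupby(["genres", "release_year"])["vote_average"]
          .mean()
          .unstack("genres"))

fig, ax = plt.subplots(figsize=(12, 5))
cmap = plt.get_cmap("tab20")  # qualitative colormap with 20 colors

# One line per genre; each legend entry inherits its line's color via `label`
for i, genre in enumerate(yearly.columns):
    ax.plot(yearly.index, yearly[genre],
            color=cmap(i % cmap.N), linewidth=1, label=genre)

# Major tick at every year, with rotated labels
ax.xaxis.set_major_locator(plticker.MultipleLocator(base=1.0))
ax.tick_params(axis="x", labelrotation=45)
ax.set_xlabel("release_year")
ax.set_ylabel("vote_average")
legend = ax.legend()
```

With 20 genres the `tab20` palette gives each line its own color; with more than 20, the colors would repeat.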
Parfait