Here I am using the IMDB Kaggle data set. I am plotting adjusted budget against adjusted revenue, with popularity as the third variable that sets the bubble size. The budget_adj and revenue_adj columns contain values such as 0 and 214, so I have taken a subset here. I am now trying to get scientific notation along the axes, since the values are in the millions and billions. Here is my code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import datetime
from collections import defaultdict
%matplotlib inline
# A lot of preprocessing was done before the following step
df = movies.loc[(movies['budget_adj'] > 1e7) & (movies['revenue_adj'] > 1e7)]
x = df['budget_adj']
y = df['revenue_adj']
s = df['popularity']
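fig, ax = plt.subplots()
fig.set_size_inches(12, 10)
# Request scientific notation on the x-axis for all magnitudes
plt.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
# Bubble size scales with popularity
g = plt.scatter(x, y, s=s*3, alpha=0.5, c='blue')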
g.axes.set_title('Title', fontsize = 20)
g.axes.set_xlabel('Adjusted Budget', fontsize=20)
g.axes.set_ylabel('Revenue Adjusted', fontsize=20)
plt.show()
So why is the command plt.ticklabel_format not working? I see 1e9 and 1e8 along the axes, but that is not the scientific notation I expected.
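
For anyone who wants to reproduce this without the Kaggle data, here is a minimal self-contained sketch. The random arrays are made-up stand-ins for budget_adj, revenue_adj and popularity, not real values. It also passes useMathText=True for both axes, which, as far as I understand, makes matplotlib draw the multiplier as a power of ten instead of the plain 1e8 / 1e9 text; whether that is the notation wanted here is an assumption.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the real columns (assumed ranges, not Kaggle data)
rng = np.random.default_rng(0)
x = rng.uniform(1e7, 4e8, 200)   # plays the role of budget_adj
y = rng.uniform(1e7, 3e9, 200)   # plays the role of revenue_adj
s = rng.uniform(0.5, 30, 200)    # plays the role of popularity

fig, ax = plt.subplots(figsize=(12, 10))
ax.scatter(x, y, s=s * 3, alpha=0.5, c='blue')

# scilimits=(0, 0) applies scientific notation at every magnitude;
# useMathText=True renders the multiplier as a power of ten
ax.ticklabel_format(style='sci', axis='both', scilimits=(0, 0),
                    useMathText=True)

ax.set_title('Title', fontsize=20)
ax.set_xlabel('Adjusted Budget', fontsize=20)
ax.set_ylabel('Revenue Adjusted', fontsize=20)

# Force a render so the tick formatters compute their multiplier text
fig.canvas.draw()
# plt.show()  # uncomment when running as a script
```

In a notebook with %matplotlib inline, the figure is displayed at the end of the cell without calling plt.show().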