I have a dataset with 17 features and 14k observations. I would like to plot the price distribution to get a better understanding of it. The price feature has a float64 data type.
Plotting the price distribution gives me the following:
[Figure: histogram of the price distribution]
Why does the plot look like this? Is something wrong with my data? What's the proper way to fix it?
Code:
import matplotlib.pyplot as plt

# data is the pandas DataFrame holding the dataset
fig, ax = plt.subplots(1, 1, figsize=(9, 5))
data['sale_price'].hist(bins=50, ax=ax)
ax.set_xlabel('Price')
ax.set_title('Distribution of prices')
ax.set_ylabel('Number of houses')
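
One possible cause, which is only a guess without seeing the data, is that a few extreme prices stretch the equal-width bins so that nearly every observation lands in the first one. The sketch below illustrates that mechanism with made-up numbers. The helper equal_width_counts and the prices list are hypothetical and are not taken from the dataset above.

```python
import random


def equal_width_counts(values, bins):
    """Count values per equal-width bin spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if every value is identical
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


random.seed(0)
# Hypothetical prices: 1,000 ordinary values plus a single extreme entry
prices = [random.uniform(100_000, 500_000) for _ in range(1000)]
prices.append(50_000_000.0)

counts = equal_width_counts(prices, bins=50)
print(counts[0], counts[-1])  # first bin holds the 1,000 ordinary prices, last bin the outlier
```

If something like this is happening here, checking data['sale_price'].describe() for an unusually large maximum, and then plotting only the values below a high quantile or plotting the logarithm of the price, might give a more readable histogram.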