I have a DataFrame of Airbnb data where each row represents an Airbnb listing. I am trying to plot two columns as a bar plot using Matplotlib.

fig, ax = plt.subplots()
ax.bar(airbnb['neighbourhood_group'], airbnb['revenue'])
plt.show()

What I expect is that this plots every neighbourhood group on the x axis and the average revenue per neighbourhood group on the y axis (I assume a bar plot takes the mean value per category by default). This line of code keeps running without giving any error, as if it had entered an infinite while loop. Can someone please suggest what could be wrong?

  • Seems like it should work, but there's no [mre]. Try `airbnb.groupby('neighbourhood_group').revenue.agg(['mean']).plot(kind='bar', rot=0, legend=False, title='Mean Revenue per Neighbourhood')` – Trenton McKinney Sep 02 '21 at 02:45
  • Please provide enough code so others can better understand or reproduce the problem. – Community Sep 03 '21 at 10:54
  • I took the dataset from https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data?select=AB_NYC_2019.csv and added an additional revenue column, computed as price*number_of_reviews. Sorry, this is my first question on Stack Overflow, so I am not sure whether I have provided the dataset in the required manner. – Lokesh Varshney Sep 07 '21 at 16:56

1 Answer

In the following I use a sample DataFrame, since none was provided.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Create sample DataFrame
y = np.random.rand(10,2)
y[:,0]= np.arange(10)
df = pd.DataFrame(y, columns=["neighbourhood_group", "revenue"])

Note that np.random produces different values for the revenue column each time you run the program.

df:

[image: the sample DataFrame]

# bar plot: one bar per row of the sample DataFrame
ax = df.plot(x="neighbourhood_group", y="revenue", kind="bar")
plt.show()

[image: the resulting bar plot]

Regarding your statement that your code seems to run in an endless loop: it could be that the DataFrame contains so much data that drawing the bar chart simply takes a very long time. To say that for sure, however, you would have to provide the dataset.
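If the amount of data is indeed the problem, one thing to try is aggregating before plotting. As far as I know, `ax.bar` does not average anything: it draws one bar per row, so a large frame means a very large number of overlapping bars. Below is a minimal sketch of that idea. The data is a synthetic stand-in, not the real Airbnb file: the row count, value ranges and output file name are my own assumptions, and the revenue column follows the price*number_of_reviews definition from the comments.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere (assumption)
import matplotlib.pyplot as plt

# Synthetic stand-in for the Airbnb data (assumption: not the real CSV).
rng = np.random.default_rng(0)
groups = ["Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island"]
n_rows = 50_000  # hypothetical size, chosen only to mimic a large listing table
airbnb = pd.DataFrame({
    "neighbourhood_group": rng.choice(groups, size=n_rows),
    "price": rng.integers(30, 500, size=n_rows),
    "number_of_reviews": rng.integers(0, 100, size=n_rows),
})
airbnb["revenue"] = airbnb["price"] * airbnb["number_of_reviews"]

# Aggregate first: one mean value per group instead of one bar per row.
mean_revenue = airbnb.groupby("neighbourhood_group")["revenue"].mean()

# Plot only the aggregated values, so only a handful of bars are drawn.
fig, ax = plt.subplots()
bars = ax.bar(mean_revenue.index, mean_revenue.values)
ax.set_xlabel("neighbourhood_group")
ax.set_ylabel("mean revenue")
fig.savefig("mean_revenue.png")  # hypothetical output file name
```

To get an interactive window instead of a file, remove the `matplotlib.use("Agg")` line and call `plt.show()` in place of `fig.savefig(...)`.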

mika
  • I didn't understand how this works. The y array has a shape of (10, 3), but in the pd DataFrame we are providing only 2 column names. I get "ValueError: Shape of passed values is (10, 3), indices imply (10, 2)" if I continue with your code. Don't we have to specify three column names? – Lokesh Varshney Sep 07 '21 at 16:52
  • I edited my solution to show the DataFrame, which should make it easier to understand. – mika Sep 07 '21 at 17:01
  • My apologies, of course the array y must be (10, 2). I had only changed it on my side and not in the posted solution. – mika Sep 07 '21 at 17:05