I have a DataFrame (df) with several columns, and I want to plot histograms for only a few of them. I want to write a function that does this. Here is a sample of the data:

{'airport_dist': {0: 18863.0, 1: 12817.0, 2: 21741.0},
 'balconies': {0: 0, 1: 2, 2: 0},
 'bedrooms': {0: 3, 1: 1, 2: 2},
 'bike_parking': {0: False, 1: False, 2: False},
 'ceiling_height': {0: 2.7, 1: 2.65, 2: 2.65},
 'city_center_dist': {0: 16028.0, 1: 18603.0, 2: 13933.0},
 'date_posted': {0: Timestamp('2019-03-07 00:00:00'),
  1: Timestamp('2018-12-04 00:00:00'),
  2: Timestamp('2015-08-20 00:00:00')},
 'days_listed': {0: 95, 1: 81, 2: 558},
 'floor': {0: 8, 1: 1, 2: 4},
 'floors_total': {0: 16, 1: 11, 2: 5},
 'is_open_plan': {0: False, 1: False, 2: False},
 'is_studio': {0: False, 1: False, 2: False},
 'kit_to_total': {0: 0.23, 1: 0.27, 2: 0.15},
 'kitchen_area': {0: 25.0, 1: 11.0, 2: 8.3},
 'last_price': {0: 260000.0, 1: 67000.0, 2: 103920.0},
 'liv_to_total': {0: 0.47, 1: 0.46, 2: 0.61},
 'living_area': {0: 51.0, 1: 18.6, 2: 34.3},
 'locality_name': {0: 'Saint Petersburg',
  1: 'Shushary village',
  2: 'Saint Petersburg'},
 'park_dist': {0: 482.0, 1: 455.0, 2: 90.0},
 'parks_within_3000': {0: 1.0, 1: 0.0, 2: 1.0},
 'pond_dist': {0: 755.0, 1: 502.0, 2: 574.0},
 'ponds_within_3000': {0: 2.0, 1: 0.0, 2: 2.0},
 'price_per_sqm': {0: 2407.0, 1: 1658.0, 2: 1856.0},
 'total_area': {0: 108.0, 1: 40.4, 2: 56.0},
 'total_images': {0: 20, 1: 7, 2: 10}}

def plotting(data):
  colsnew = ["total_area","living_area"]
  for i in colsnew:
    plt.figure()
    plt.hist(i)
  plt.show()

I then apply it to the DataFrame df:

df.apply(plotting,axis=1)

I get several histograms, but without any titles, and the loop does not stop plotting. The histograms are also completely different from the ones I get when plotting without the loop.

Neekunj
  • Please [create a reproducible copy of the DataFrame with `df.head(10).to_clipboard(sep=',')`](https://stackoverflow.com/questions/52413246/how-to-provide-a-copy-of-your-dataframe-with-to-clipboard), [edit] the question, and paste the clipboard into a code block. – Trenton McKinney Jun 20 '20 at 17:20
  • What does `'total_area': {0: 108.0, 1: 40.4, 2: 56.0}` mean? In your histogram, what will be on the x axis and on the y axis? Do you need a separate histogram for each column listed in `colsnew`? – arshovon Jun 20 '20 at 17:53

1 Answer

You can select the wanted columns from the original df and call hist() on the result:

colsnew = ["total_area","living_area"]
df[colsnew].hist()

  • If I need them as separate figures, what should I do? The above code gives the histograms side by side. – Neekunj Jun 20 '20 at 17:30
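
For separate figures, one option is to loop over the columns and open a new figure for each one. Two things in the question's code likely cause the odd output: plt.hist(i) receives the column name (a string) rather than the column's values, and df.apply(plotting, axis=1) calls the function once per row, so new figures keep appearing. The following is a minimal sketch, not a definitive implementation. The function name plot_histograms and its columns parameter are hypothetical, and the small DataFrame is built from the three sample rows in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Small sample built from the values shown in the question
df = pd.DataFrame({
    "total_area": [108.0, 40.4, 56.0],
    "living_area": [51.0, 18.6, 34.3],
    "kitchen_area": [25.0, 11.0, 8.3],
})


def plot_histograms(data, columns):
    """Draw one histogram per column, each in its own figure (hypothetical helper)."""
    figures = []
    for col in columns:
        fig, ax = plt.subplots()      # a new, separate figure for this column
        ax.hist(data[col].dropna())   # pass the column's values, not its name
        ax.set_title(col)             # title the figure with the column name
        ax.set_xlabel(col)
        ax.set_ylabel("count")
        figures.append(fig)
    return figures


# Call the function once on the whole DataFrame instead of using df.apply
figs = plot_histograms(df, ["total_area", "living_area"])
plt.show()
```

Calling the function directly, rather than through df.apply, means the loop runs once per listed column, so this produces exactly two titled figures.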