I have the following data. How can I plot the three columns (dist, price and count) together for each neighbourhood?

t1 = pd.DataFrame({'neighbourhood': ['Allston-Brighton', 'Back Bay', 'Beacon Hill', 'Brookline', 'Cambridge'], 
                   'dist': [5.318724014750601, 0.3170049667872781, 1.2481192434918875, 4.122402023894361, 2.975557190604119],
                   'price':[130.39048767089844, 276.3820495605469, 231.87042236328125, 127.90569305419922, 195.94696044921875],
                  'count':[238, 239, 135, 7, 7]})


    neighbourhood       dist        price       count
0   Allston-Brighton    5.318724    130.390488  238
1   Back Bay            0.317005    276.382050  239
2   Beacon Hill         1.248119    231.870422  135
3   Brookline           4.122402    127.905693  7
4   Cambridge           2.975557    195.946960  7

Do you have any suggestions using matplotlib or seaborn? Thank you!

Pedro Henrique
  • Does this answer your question? [multiple plot in one figure in Python](https://stackoverflow.com/questions/21254472/multiple-plot-in-one-figure-in-python) – Rajat Mishra Apr 26 '20 at 00:06
  • You could use a scatter plot between dist and price where the marker size is the count – Juan C Apr 26 '20 at 00:08
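The scatter-plot idea from the comment above could be sketched roughly as follows. This is only an illustration, not code from the commenter: the scaling factor for the marker size and the label offsets are arbitrary assumptions chosen for readability.

```python
import matplotlib.pyplot as plt
import pandas as pd

t1 = pd.DataFrame({
    'neighbourhood': ['Allston-Brighton', 'Back Bay', 'Beacon Hill', 'Brookline', 'Cambridge'],
    'dist': [5.318724014750601, 0.3170049667872781, 1.2481192434918875, 4.122402023894361, 2.975557190604119],
    'price': [130.39048767089844, 276.3820495605469, 231.87042236328125, 127.90569305419922, 195.94696044921875],
    'count': [238, 239, 135, 7, 7],
})

fig, ax = plt.subplots()

# Marker area scales with count; the factor 3 is an arbitrary choice (assumption).
sizes = t1['count'] * 3
points = ax.scatter(t1['dist'], t1['price'], s=sizes, alpha=0.6)

# Write the neighbourhood name next to each point (offset chosen by eye).
for _, row in t1.iterrows():
    ax.annotate(row['neighbourhood'], (row['dist'], row['price']),
                xytext=(5, 5), textcoords='offset points')

ax.set_xlabel('dist')
ax.set_ylabel('price')
```

Call plt.show() afterwards to display the figure. Each column gets its own visual channel (x position, y position, marker size), so the different value ranges do not compete on one axis.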

1 Answer

What you want is called a grouped bar plot. You can melt the dataframe, which reshapes it into long format, i.e. one row per neighbourhood and variable. Then apply seaborn's barplot to the melted dataframe.

Update: since the y-values are far apart, we can add some labels. This workaround loops over the rows and writes a text label above the corresponding bar (here only the dist bars are labelled). The offsets added to x and y were found by trial and error.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

t1 = pd.DataFrame({'neighbourhood': ['Allston-Brighton', 'Back Bay', 'Beacon Hill', 'Brookline', 'Cambridge'], 
                   'dist': [5.318724014750601, 0.3170049667872781, 1.2481192434918875, 4.122402023894361, 2.975557190604119],
                   'price':[130.39048767089844, 276.3820495605469, 231.87042236328125, 127.90569305419922, 195.94696044921875],
                  'count':[238, 239, 135, 7, 7]})

t1_melted = pd.melt(t1, id_vars="neighbourhood", var_name="source", value_name="value_numbers")
g = sns.barplot(x="neighbourhood", y="value_numbers", hue="source", data=t1_melted)

for index, row in t1.iterrows():
    g.text(row.name - 0.35, row.dist + 0.1, round(row.dist, 2), color='black')

plt.show()
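As a possible alternative to placing the text by hand, newer matplotlib versions (3.4 and later) provide Axes.bar_label, which can label every bar in a container. The sketch below assumes such a version is installed and that seaborn creates one bar container per hue level; the number format and padding are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

t1 = pd.DataFrame({
    'neighbourhood': ['Allston-Brighton', 'Back Bay', 'Beacon Hill', 'Brookline', 'Cambridge'],
    'dist': [5.318724014750601, 0.3170049667872781, 1.2481192434918875, 4.122402023894361, 2.975557190604119],
    'price': [130.39048767089844, 276.3820495605469, 231.87042236328125, 127.90569305419922, 195.94696044921875],
    'count': [238, 239, 135, 7, 7],
})

# Long format: one row per (neighbourhood, variable) pair.
t1_melted = pd.melt(t1, id_vars="neighbourhood", var_name="source", value_name="value_numbers")

fig, ax = plt.subplots(figsize=(10, 5))
sns.barplot(x="neighbourhood", y="value_numbers", hue="source", data=t1_melted, ax=ax)

# One container per hue level (dist, price, count); label all of its bars.
# Requires matplotlib >= 3.4 (assumption about the installed version).
for container in ax.containers:
    ax.bar_label(container, fmt="%.1f", padding=2, fontsize=8)
```

Call plt.show() afterwards to display the figure. This labels the price and count bars as well as the dist bars, without hand-tuned x and y offsets.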


Derek O
  • It's a good solution, but the columns have very different ranges, which makes some of the bars hard to see. Is it possible to improve the visualization, for example by adding the value on top of each bar? – Pedro Henrique Apr 26 '20 at 01:10
  • Yeah, I'll have another look but for seaborn there is no way to directly annotate and you'll need a workaround. A good place to get started is here: https://stackoverflow.com/questions/39519609/annotate-bars-with-values-on-pandas-on-seaborn-factorplot-bar-plot – Derek O Apr 26 '20 at 01:28
  • Updated. Feel free to experiment with the parameters to best fit your needs. – Derek O Apr 26 '20 at 01:38
  • The standard way to do this is to use different y-axes, like [this](https://matplotlib.org/3.2.1/gallery/ticks_and_spines/multiple_yaxis_with_spines.html). Nonetheless, it's confusing to plot values with different meanings along the same axis, and other solutions should usually be used. – tom10 Apr 26 '20 at 02:59
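A minimal sketch of the multiple-y-axes approach mentioned in the last comment, using matplotlib's twinx. It is not taken from the linked gallery example: the bar width, colours and the 1.15 spine offset are assumptions picked for this data.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

t1 = pd.DataFrame({
    'neighbourhood': ['Allston-Brighton', 'Back Bay', 'Beacon Hill', 'Brookline', 'Cambridge'],
    'dist': [5.318724014750601, 0.3170049667872781, 1.2481192434918875, 4.122402023894361, 2.975557190604119],
    'price': [130.39048767089844, 276.3820495605469, 231.87042236328125, 127.90569305419922, 195.94696044921875],
    'count': [238, 239, 135, 7, 7],
})

x = np.arange(len(t1))   # one group of bars per neighbourhood
width = 0.25             # bar width (arbitrary choice)

fig, ax_dist = plt.subplots(figsize=(10, 5))
ax_price = ax_dist.twinx()   # second y-axis sharing the same x-axis
ax_count = ax_dist.twinx()   # third y-axis
# Push the third axis outwards so it does not overlap the second one.
ax_count.spines["right"].set_position(("axes", 1.15))

columns = ["dist", "price", "count"]
axes = [ax_dist, ax_price, ax_count]
colors = ["tab:blue", "tab:orange", "tab:green"]

# Draw each column on its own axis, shifted sideways within the group.
for i, (ax, col, color) in enumerate(zip(axes, columns, colors)):
    ax.bar(x + (i - 1) * width, t1[col], width, color=color, label=col)
    ax.set_ylabel(col, color=color)
    ax.tick_params(axis="y", colors=color)

ax_dist.set_xticks(x)
ax_dist.set_xticklabels(t1["neighbourhood"], rotation=20)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. Colouring each axis like its bars helps readers match a bar to its scale, though as the comment notes, bar heights on different axes are not directly comparable.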