    import matplotlib.pyplot as plt
    import numpy as np
    import seaborn as sns

    # data
    x = ["IEEE", "Elsevier", "Others"]
    y = [7, 6, 2]

    # create the figure before plotting so the scatter is drawn on it
    plt.figure(figsize=(10, 4))
    plt.scatter(x, y, s=300, c="blue", alpha=0.4, linewidth=3)
    plt.ylabel("No. of Papers")

I want to make a graph like the one shown in the image. I am not sure how to provide data for both the journal and conference categories (currently, I include just one). I am also not sure how to use a different color for each category.

(image: bubble chart)

user3582228
  • Does this answer your question? [pyplot scatter plot marker size](https://stackoverflow.com/questions/14827650/pyplot-scatter-plot-marker-size) – busybear Jan 25 '21 at 05:17

2 Answers

You can try this code snippet for your problem.

- I modified your data format; I suggest using pandas for data visualization.

- I added one more field, category, to visualize the data more effectively.

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd

# data
x=["IEEE", "Elsevier", "Others", "IEEE", "Elsevier", "Others"]
y=[7, 6, 2, 5, 4, 3]
z=["conference", "journal", "conference", "journal", "conference", "journal"]

# create pandas dataframe
data_list = pd.DataFrame(
    {'x_axis': x,
     'y_axis': y,
     'category': z
    })

# change size of data points
minsize = min(data_list['y_axis'])
maxsize = max(data_list['y_axis'])

# scatter plot
sns.catplot(x="x_axis", y="y_axis", kind="swarm", hue="category",sizes=(minsize*100, maxsize*100), data=data_list)
plt.grid()

OUTPUT:

Harsh gupta
  • Thank you for your help. I wonder if it's possible that overlapped bubbles can also be shown in the graph area. – user3582228 Jan 25 '21 at 09:13
  • yes it's possible, you can do this by changing the size of the bubble, in this code I hardcoded the value to be 100 as "sizes=(minsize*100, maxsize*100)", you can manipulate this according to your needs. – Harsh gupta Jan 25 '21 at 09:20
  • Thank you. I hope you understand my question. What I want to say is if for some value both categories have value 1 then in that case circles will overlap so one way is to use opacity to make the circles visible (alpha=0.4). Any other solution? – user3582228 Jan 26 '21 at 06:52
  • Also, to map data properly z should be z=["conference", "conference", "conference", "journal", "journal", "journal"] – user3582228 Jan 26 '21 at 07:55
  • In that case, you can use these examples https://seaborn.pydata.org/generated/seaborn.scatterplot.html – Harsh gupta Jan 27 '21 at 07:54

How to create the graph with correct bubble sizes and with no overlap

Seaborn's stripplot and swarmplot (or sns.catplot with kind='strip' or kind='swarm') provide the handy dodge argument, which prevents the bubbles from overlapping. The only downside is that the size argument applies a single size to all bubbles, and the sizes argument (as used in the other answer) is of no use here: they do not work like the s and size arguments of scatterplot. Therefore, the size of each bubble must be edited after generating the plot:

import numpy as np     # v 1.19.2
import pandas as pd    # v 1.1.3
import seaborn as sns  # v 0.11.0

# Create sample data
x = ['IEEE', 'Elsevier', 'Others', 'IEEE', 'Elsevier', 'Others']
y = np.array([7, 6, 3, 7, 1, 3])
z = ['conference', 'conference', 'conference', 'journal', 'journal', 'journal']
df = pd.DataFrame(dict(organisation=x, count=y, category=z))

# Create seaborn stripplot (swarmplot can be used the same way)
ax = sns.stripplot(data=df, x='organisation', y='count', hue='category', dodge=True)

# Adjust the size of the bubbles
for coll in ax.collections[:-2]:
    y = coll.get_offsets()[0][1]
    coll.set_sizes([100*y])

# Format figure size, spines and grid
ax.figure.set_size_inches(7, 5)
ax.grid(axis='y', color='black', alpha=0.2)
ax.grid(axis='x', which='minor', color='black', alpha=0.2)
ax.spines['bottom'].set(position='zero', color='black', alpha=0.2)
sns.despine(left=True)

# Format ticks
ax.tick_params(axis='both', length=0, pad=10, labelsize=12)
ax.tick_params(axis='x', which='minor', length=25, width=0.8, color=[0, 0, 0, 0.2])
minor_xticks = [tick+0.5 for tick in ax.get_xticks() if tick != ax.get_xticks()[-1]]
ax.set_xticks(minor_xticks, minor=True)
ax.set_yticks(range(0, df['count'].max()+2))

# Edit labels and legend
ax.set_xlabel('Organisation', labelpad=15, size=12)
ax.set_ylabel('No. of Papers', labelpad=15, size=12)
ax.legend(bbox_to_anchor=(1.0, 0.5), loc='center left', frameon=False);

(image: stripplot)


Alternatively, you can use scatterplot with the convenient s argument (or size) and then edit the space between the bubbles to reproduce the effect of the missing dodge argument (note that the x_jitter argument seems to have no effect). Here is an example using the same data as before and without all the extra formatting:

# Create seaborn scatterplot with size argument
ax = sns.scatterplot(data=df, x='organisation', y='count',
                     hue='category', s=100*df['count'])
ax.figure.set_size_inches(7, 5)
ax.margins(0.2)

# Dodge bubbles
bubbles = ax.collections[0].get_offsets()
signs = np.repeat([-1, 1], df['organisation'].nunique())
for bubble, sign in zip(bubbles, signs):
    bubble[0] += sign*0.15

(image: scatterplot)
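
The same dodge idea can also be sketched in plain matplotlib, which may be closer to the code in the question. This is only a rough sketch, not part of the seaborn approach above: it assumes one scatter call per category, numeric x positions shifted by a hand-picked offset of 0.15, and arbitrary colors. The names counts, offsets, colors and positions are made up for illustration. Note that the s argument of scatter is the marker area in points squared, so the bubble area (not the diameter) is proportional to the count here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

organisations = ["IEEE", "Elsevier", "Others"]

# Same sample data as above, split into one array per category
counts = {
    "conference": np.array([7, 6, 3]),
    "journal": np.array([7, 1, 3]),
}
offsets = {"conference": -0.15, "journal": 0.15}  # assumed horizontal dodge
colors = {"conference": "tab:blue", "journal": "tab:orange"}  # assumed colors

positions = np.arange(len(organisations))  # numeric x positions 0, 1, 2

fig, ax = plt.subplots(figsize=(7, 5))
for category, values in counts.items():
    # One scatter call per category gives each its own color and legend entry
    ax.scatter(positions + offsets[category], values, s=100 * values,
               color=colors[category], alpha=0.6, label=category)

# Put the organisation names back on the numeric x positions
ax.set_xticks(positions)
ax.set_xticklabels(organisations)
ax.set_ylabel("No. of Papers")
ax.margins(0.2)
ax.legend(frameon=False)
```

Call fig.savefig(...) or plt.show() afterwards to see the result. Because the two categories are shifted in opposite directions, bubbles with equal counts (IEEE and Others in this data) sit side by side instead of on top of each other; the alpha value is kept only as a fallback for large bubbles that still touch.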




As a side note, I recommend that you consider other types of plots for this data. A grouped bar chart:

df.pivot(index='organisation', columns='category').plot.bar()

Or a balloon plot (aka categorical bubble plot):

sns.scatterplot(data=df, x='organisation', y='category', s=100*df['count']).margins(0.4)

Why? In the bubble graph, the counts are displayed using 2 visual attributes, i) the y-coordinate location and ii) the bubble size. Only one of them is really necessary.
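
For the grouped bar chart suggested above, it may help to look at the table that pivot builds before plotting it. The sketch below rebuilds the same sample data and adds values='count', which is not in the one-liner above; it is included here only so that the columns are plain category names rather than a two-level index. The name wide is made up for illustration.

```python
import pandas as pd

# Same sample data as in the answer above
x = ['IEEE', 'Elsevier', 'Others', 'IEEE', 'Elsevier', 'Others']
y = [7, 6, 3, 7, 1, 3]
z = ['conference'] * 3 + ['journal'] * 3
df = pd.DataFrame(dict(organisation=x, count=y, category=z))

# One row per organisation, one column per category, counts as cell values
wide = df.pivot(index='organisation', columns='category', values='count')
print(wide)
```

Calling wide.plot.bar() on this table should then draw one group of bars per organisation with one bar per category, so each count is encoded by bar height alone.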

Patrick FitzGerald
  • Hi In your first graph why did the major axis disappear? Can you kindly tell me how to bring them – user3582228 Nov 03 '21 at 09:01
  • @user3582228 Hi, the x-axis is still there but it has been formatted to look like the grid lines, you can set it back to its default format by removing the line `ax.spines['bottom'].set(...)`. The y-axis and the top and right spines will appear again if you remove the line [sns.despine(left=True)](https://seaborn.pydata.org/generated/seaborn.despine.html#seaborn.despine). – Patrick FitzGerald Nov 04 '21 at 02:05