I have a short table with three columns: two text columns (column 1 and column 2) and one numerical column (column 3). I would like a matrix / scatter plot with column 1 and column 2 as x and y, and column 3 as the size or color of the marker.

I first used a MultiIndex to group by column 1 and column 2, since these columns contain repeated values. After applying it I have a new dataframe with a two-level index. However, I can only produce a separate plot for each combination of the index (I used the following link as help: Pandas Plotting with Multi-Index). What I want is one single plot with, say, level=0 on the x axis, level=1 on the y axis, and marker size = column 3.

Table of data

    import pandas as pd
    data=pd.read_excel(path)
    new_frame=data.set_index(["Col 1", "Col 2"])
    new_frame.xs("High Humidity").plot(kind="bar")
    new_frame.xs("Low Humidity").plot(kind="bar")

With my code I can only create a separate plot for each combination. But as mentioned, I would like a single plot where the x axis is, say, Col 1, the y axis is Col 2, and marker size = Col 3.

Any tips for me? :)

Plasma
SMS

2 Answers

Here is a simple example of how to do this:

    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.DataFrame({'Col1':['HH','HH','LH','LH'],'Col2':['P','P','P','HT2'],'Col3':[15,20,4,5]})

    # get data
    x = df['Col1']
    y = df['Col2']
    marker_sizes = df['Col3']

    # plot data
    fig, ax = plt.subplots()
    ax.scatter(x, y, marker='o', s=marker_sizes)
    plt.show()

Output: [scatter plot of Col1 against Col2, marker sizes taken from Col3]
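Since the question mentions repeated (Col 1, Col 2) pairs and also asks about color, here is a rough variant that sums the third column per pair first and then maps the sum to both marker size and color. This is only a sketch: the toy data is the same as above, and the `* 10` size factor and the `viridis` colormap are my own arbitrary choices, not something from the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same toy data as above; ('HH', 'P') appears twice on purpose
df = pd.DataFrame({
    "Col1": ["HH", "HH", "LH", "LH"],
    "Col2": ["P", "P", "P", "HT2"],
    "Col3": [15, 20, 4, 5],
})

# Sum Col3 over repeated (Col1, Col2) pairs and keep the keys as ordinary columns
summed = df.groupby(["Col1", "Col2"], as_index=False)["Col3"].sum()

fig, ax = plt.subplots()
points = ax.scatter(
    summed["Col1"],
    summed["Col2"],
    s=summed["Col3"] * 10,  # arbitrary scale factor so small values stay visible
    c=summed["Col3"],       # the same value also drives the color
    cmap="viridis",
)
fig.colorbar(points, ax=ax, label="Col3 (summed)")
ax.set_xlabel("Col1")
ax.set_ylabel("Col2")
```

Call `plt.show()` afterwards to display the figure. Without the `groupby` step, duplicate pairs are drawn as overlapping markers at the same position instead of one marker with the combined value.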

Zaraki Kenpachi

@Zaraki,

I think I found a workaround that at least satisfies my needs. I added two additional columns, data["numerical Col 1"]=np.nan and data["numerical Col 2"]=np.nan.

Then I looped through the frame and used if conditions to give each row a random number from a range that depends on its category:

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np

    data = pd.read_excel(r"C:\Users\116225\Desktop\test_table.xlsx")
    data["numerical Col 1"] = np.nan
    data["numerical Col 2"] = np.nan

    # Give each row a random position inside the half of the axis that belongs to its category
    for i in range(len(data["Col 1"])):
        if data.at[i, "Col 1"] == "Low Humidity":
            data.at[i, "numerical Col 1"] = np.random.randint(0, 20)
        else:
            data.at[i, "numerical Col 1"] = np.random.randint(21, 41)

        if data.at[i, "Col 2"] == "Pulsmax":
            data.at[i, "numerical Col 2"] = np.random.randint(0, 20)
        else:
            data.at[i, "numerical Col 2"] = np.random.randint(21, 41)

    new_frame = data.copy()

    # Red lines at x=20 and y=20 mark the borders between the categories
    x1, y1 = [20, 20], [0, 45]
    x2, y2 = [-1, 45], [20, 20]
    plt.plot(x1, y1, x2, y2, c="red")
    plt.scatter(x=new_frame["numerical Col 1"], y=new_frame["numerical Col 2"], s=new_frame["Col 3"] * 1e-3)
    # Hide ticks and tick labels, since the numeric positions are arbitrary
    plt.tick_params(axis='both', left=False, top=False, right=False, bottom=False, labelleft=False, labeltop=False, labelright=False, labelbottom=False)
    plt.show()

In the screenshot you can see the scatter plot with two lines indicating the borders :)
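For what it's worth, the same random-position idea can probably be written without the row loop. The sketch below is only an illustration under a few assumptions: the small DataFrame stands in for the Excel table, the category "Other" and the Col 3 values are made up, `jittered_position` is a helper name I invented, and the fixed seed is there only to make the jitter reproducible. It also labels the two halves of each axis instead of hiding the ticks.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up stand-in for the Excel table
data = pd.DataFrame({
    "Col 1": ["Low Humidity", "High Humidity", "Low Humidity", "High Humidity"],
    "Col 2": ["Pulsmax", "Pulsmax", "Other", "Other"],
    "Col 3": [12000, 30000, 8000, 21000],
})

rng = np.random.default_rng(0)  # fixed seed, purely for reproducibility


def jittered_position(labels, first_label):
    """Random integers in [0, 20) where labels == first_label, else in [21, 41)."""
    offset = np.where(labels == first_label, 0, 21)
    return offset + rng.integers(0, 20, size=len(labels))


data["numerical Col 1"] = jittered_position(data["Col 1"], "Low Humidity")
data["numerical Col 2"] = jittered_position(data["Col 2"], "Pulsmax")

fig, ax = plt.subplots()
ax.axvline(20, color="red")  # border between the two Col 1 categories
ax.axhline(20, color="red")  # border between the two Col 2 categories
ax.scatter(data["numerical Col 1"], data["numerical Col 2"], s=data["Col 3"] * 1e-3)

# Label the middle of each half with its category name
ax.set_xticks([10, 30])
ax.set_xticklabels(["Low Humidity", "High Humidity"])
ax.set_yticks([10, 30])
ax.set_yticklabels(["Pulsmax", "Other"])
```

Call `plt.show()` afterwards to display the figure. Like the loop version, this only handles two categories per column; with more categories each one would need its own offset.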

SMS